Abstract
As urbanization deepens and science and technology advance, people reshape the natural environment on an ever larger scale, most visibly through the building structures they erect. Over time, structures exposed to years of wind and sun begin to show signs of “illness”; if these are not treated promptly, the stability and safety of the structure can be seriously affected. Based on this, and in view of the characteristics of crack identification on the surface of concrete structures, this paper selects a background subtraction algorithm for image noise reduction. Through three steps, namely digital image noise reduction, crack extraction and crack parameter identification, cracks are recognized quantitatively and a complete crack parameter identification system is formed. The experimental results show that the machine learning model for building structure health monitoring and damage recognition proposed in this paper performs well statistically, and the relative error of recognition can be controlled within 10%.
Keywords
Introduction
Different building structures have different service lives. In design, for example, a design service life is chosen according to the importance of the specific building, and within this period the structure, like a person, may become “sick”. Loads acting on the structure during use, or sudden natural disasters such as typhoons and earthquakes, cause a certain amount of damage. Over time this damage accumulates slowly and is compounded by the continuous aging of structural materials, such as the corrosion of steel bars and the carbonation of concrete. The strength of the structural materials keeps decreasing until local or even overall damage occurs and the structure can no longer bear the load; at this point the structure has reached the end of its life. With economic development, the construction of large-span, high-rise and complex structural projects increases the risk of loss caused by structural failure [1–4].
The field of health monitoring and damage recognition of building structures using image processing involves a variety of algorithms and techniques. Below are some commonly used approaches. Algorithms for detecting cracks in building structures often involve image processing techniques such as edge detection, thresholding, and morphological operations. These algorithms aim to identify and quantify the presence of cracks in images captured through various means, including cameras and drones [1–4]. In addition, texture analysis algorithms assess the surface characteristics of building materials. Changes in texture patterns can indicate potential damage. Techniques such as co-occurrence matrices and Gabor filters are used to extract texture features from images [5, 6]. In terms of change detection, the algorithms compare images taken at different times to identify any structural changes. This can include changes in shape, size, or appearance. Image differencing, image registration, and feature matching are common techniques used for change detection [7, 8]. When it comes to Computer Vision and Deep Learning (CVDL), Convolutional Neural Networks (CNNs) and other deep learning models are increasingly employed for structural damage recognition. These models can learn complex features and patterns from images, making them effective for tasks such as crack detection, deformation analysis, and overall damage assessment. What is more, infrared thermography involves capturing thermal images of building structures. Anomalies in temperature distribution can indicate structural issues. Image processing algorithms are then applied to analyze these thermal images for potential damage or weaknesses [9, 10]. In the case of image segmentation, such techniques divide an image into meaningful regions, facilitating the identification of specific structural elements. Segmenting images allows for a more detailed analysis of individual components, aiding in the recognition of damage [11].
Furthermore, 3D reconstruction algorithms create three-dimensional models of building structures based on images. This can provide a comprehensive view of the structure and help identify deformations or irregularities. In terms of feature extraction and classification, such methods identify relevant features from images, and classification algorithms categorize these features into classes such as “healthy” or “damaged.” Support Vector Machines (SVM), Random Forests, and other classifiers are commonly used for this purpose. Additionally, remote sensing technologies, including LiDAR (Light Detection and Ranging), are used to capture detailed information about the topography and structure of buildings. Image processing algorithms analyze these data to identify structural issues. Ultimately, data fusion techniques integrate information from various sources, such as images, sensor data, and historical records. This holistic approach enhances the accuracy and reliability of damage recognition algorithms [12–15].
In recent years, many engineering accidents have been caused by structural damage. Typical examples include the sudden collapse of the glass roof of the Transvaal water park in Moscow in 2004, which directly caused 40 deaths and 110 injuries. In 2013, the Rana Plaza collapse in the Bangladeshi capital killed 1,127 people and injured about 2,500 others, making it one of the deadliest industrial accidents in Bangladesh. In 2020, the Xinjia Hotel in Quanzhou, Fujian Province, China, collapsed, killing 29 people and injuring 42. These tragic accidents show that detecting structural damage promptly during the service life and correctly assessing its severity are of great significance for preventing accidents and protecting people’s lives and property.
Researchers have studied this problem extensively. In reference [1], Yeum (2018) proposed a powerful post-disaster assessment method based on the automatic processing and analysis of large volumes of visual data, which automatically extracts visual content of interest from the collected images. In reference [2], Patterson (2018) developed a GUI wrapper that classifies structural damage using a deep learning algorithm. In this interface, any number of classes can be selected, and the network to be trained can be chosen according to the type of task. In research on building structure damage based on image processing, in reference [3], Tanaka (2012) introduced morphological operations into crack recognition for the first time. The advantage is that some linear noise is no longer misjudged as crack objects during image noise reduction. In reference [4], Lee (2014) extracted the central axis and edge lines of each decomposed individual crack after de-noising the image, drew a perpendicular line from the central axis, and defined the distance between the intersections of the perpendicular line with the two edge lines as the local width of the crack. However, this method easily introduces errors when searching for the direction of the perpendicular line. In reference [5], Hernik (2019) Stang established the relationship between crack width and stiffness degradation of reinforced concrete beams for stress-induced cracks, and verified it with short concrete beams. In reference [6], Lin (2017) proposed a new method for structural damage detection using a deep CNN, which automatically obtains information from low-level waveform signals instead of relying on manual labeling, showing its superiority.
The literature reviewed above shows that, although existing studies have proposed detailed methods for damage identification of building structures, problems such as large errors and inaccurate image noise reduction remain. In addition, few studies have focused specifically on damage to concrete building structures. Based on this, an adaptive image noise reduction algorithm is proposed in this paper, and the damage recognition of building structures is completed through feature extraction and parameter recognition of building structure cracks. Compared with the studies above, the main feature of this study is that it proposes an adaptive image background noise reduction algorithm and presents a new crack width search strategy, so that crack width and distribution can be located and displayed precisely. The key contributions of this paper are as follows: (1) For color RGB digital images, the image background is subtracted using the median filter, the gray image is binarized by the local threshold method, and an adaptive image noise reduction algorithm is proposed. (2) To separate the crack object from the background in the image, morphological operations are used to eliminate crack holes in the binary image, and the process of decomposing complex tree-like or network cracks into individual cracks and numbering them is given. (3) The calculation method of crack length and width is given, together with a method for adjusting the crack parameters based on the thin lens imaging model. (4) Finally, three concrete building cracks are taken as examples to verify the effectiveness of the proposed building structure damage identification method.
Image processing
For color RGB digital images, going from the initial crack image to crack extraction requires gray-scale conversion, noise reduction and binarization. Gray-scale conversion is a mature process and is not discussed further. This section focuses on the noise reduction and binarization steps that follow gray-scale conversion.
Background subtraction algorithm
In the background subtraction operation, the smoothing filter must be determined first; the median filter is used in this paper. Median filters are often used to remove point-like “salt (white) and pepper (black)” noise. Because the filter parameters affect the result of image processing, the variable parameters of the median filter, namely the window type and the window size, must be chosen. In practice, a square window is usually selected, and the window size is expressed as the side length of the square. This practice is followed in this paper.
The noise reduction filtering operation is carried out after gray-scale conversion. The pixel data of the gray-level image can be represented directly by a two-dimensional matrix [7]. Let I(x, y) denote the gray-level value of the pixel at pixel coordinates (x, y). After the gray-level image is obtained, the background subtraction operation takes the difference between I(x, y) and its median-filtered (smoothed) version, which serves as the estimate of the background.
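The background subtraction step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the border handling (clamping coordinates) and the sign convention (background estimate minus original, clipped at zero, so that dark cracks become large positive values) are all assumptions made for illustration.

```python
from statistics import median

def median_filter(gray, size):
    """Square-window median filter (size assumed odd); borders are clamped."""
    h, w = len(gray), len(gray[0])
    r = size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                gray[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            out[y][x] = median(window)
    return out

def subtract_background(gray, size):
    """Assumed convention: smoothed background minus original, clipped at 0."""
    background = median_filter(gray, size)
    return [
        [max(background[y][x] - gray[y][x], 0) for x in range(len(gray[0]))]
        for y in range(len(gray))
    ]

# Bright 5x5 surface with a 1-pixel-wide dark vertical crack in column 2
gray = [[200, 200, 40, 200, 200] for _ in range(5)]
residual = subtract_background(gray, size=5)
```

Because the crack is much narrower than the window, the median filter removes it from the background estimate, and only the crack column survives the subtraction.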
Broadly, there are two methods for binarizing gray-level images: the global threshold method and the local threshold method [8]. The global threshold method is relatively poor at recognizing fine features of the image, whereas the local threshold method handles fine objects better. For this reason, Niblack’s local threshold method is selected in this paper to binarize the gray-level image.
Niblack’s algorithm calculates the threshold of a pixel from the points in the neighborhood centered on that pixel. The formula for calculating the pixel threshold is as follows:
$$T = m + k \cdot s$$
In the formula, m is the mean value, s is the standard deviation and k is the correction coefficient.
Niblack’s algorithm generates a different binarization threshold for each pixel. Specifically, a window is first defined, and when each pixel is binarized, the threshold is determined by all the local pixels within the window surrounding that pixel [9]. Let (x, y) be the pixel coordinates of a point in the image; the Niblack binarization threshold of that point is calculated by the following formula:
$$T(x, y) = m(x, y) + k \cdot s(x, y)$$
In the formula, T(x, y) is the threshold of the pixel, m(x, y) is the mean gray level of the pixels in the window, s(x, y) is the standard deviation of the gray level of the pixels in the window, and k is the correction coefficient, whose value range is [–0.2, –0.1].
In Niblack’s algorithm, both the window size and the correction coefficient k affect the threshold. The window size determines how many neighboring pixels are considered when a point is binarized, and the correction coefficient k determines how much weight is given to the gray-level spread of the pixels in the window. When k is close to –0.2, the noise is almost entirely eliminated, but the edges of the crack object are also partly removed; when k is close to –0.1, the crack object remains intact, but more noise points remain. From this point of view, the binarization of the gray-level image is a continuation of image noise reduction [10]. The effect of different values of the correction coefficient k in Niblack’s algorithm is shown in Fig. 1 below.

Effect of different correction coefficients k in Niblack’s algorithm.
Fig. 1 shows, for an approximately one-dimensional profile, how the binarization threshold changes with the correction coefficient k. The solid line represents the gray value of the gray-level image, the dashed line represents the mean gray value within the window, and the two remaining lines represent the binarization thresholds for k = –0.1 and k = –0.2, respectively. When the gray value is greater than the threshold, the binarized pixel is taken as background; when it is less than the threshold, the binarized pixel is taken as the image object [11]. The threshold for k = –0.2 eliminates more noise than the threshold for k = –0.1. In this paper, k = –0.2 is chosen to filter the complex noise of the concrete surface.
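As a rough illustration of Niblack's rule T(x, y) = m(x, y) + k·s(x, y), the following pure-Python sketch binarizes a small gray-level image. The function name, the clipping of the window at the image border and the treatment of the tie case (gray value equal to the threshold counts as background) are assumptions; the output convention, 0 for crack object and 1 for background, follows the one used later in the text.

```python
from statistics import mean, pstdev

def niblack_binarize(gray, window, k=-0.2):
    """Return a binary image: 0 = object (crack), 1 = background."""
    h, w = len(gray), len(gray[0])
    r = window // 2
    out = [[1] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Neighbourhood centred on (y, x), clipped at the image border
            values = [
                gray[yy][xx]
                for yy in range(max(y - r, 0), min(y + r, h - 1) + 1)
                for xx in range(max(x - r, 0), min(x + r, w - 1) + 1)
            ]
            threshold = mean(values) + k * pstdev(values)
            if gray[y][x] < threshold:   # darker than the local threshold
                out[y][x] = 0
    return out

# Bright surface with a dark vertical crack in column 2
gray = [[200, 200, 40, 200, 200] for _ in range(5)]
binary = niblack_binarize(gray, window=5, k=-0.2)
```

With a negative k, the threshold sits below the local mean, so only pixels clearly darker than their surroundings are kept as crack pixels.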
Research shows that when Niblack’s binarization method is used to process crack images, the best Niblack window size is 2–3 times the crack width. This empirical value provides a reference for selecting a coordination window shared by the median filter and Niblack’s binarization method. However, the crack width cannot be determined in advance during crack image processing, so this paper adopts a square coordination window to set the window size of both the median filter and Niblack’s binarization operation [12]. The adaptive method used to determine the window size is as follows.
Since the crack width cannot be obtained during image noise reduction and binarization, the window size must be selected by other means. In this paper, two factors are considered in determining the optimal size: (1) As the coordination window size changes, a stable number of object pixels in the denoised and binarized image indicates that the coordination window is appropriate. When the pixel count changes rapidly with the window size, the noise points or object pixels are still very sensitive to the window size and the coordination window has not reached its optimal value; when the rate of change of the pixel count stabilizes, the noise points or object pixels have reached a stable processing effect and the coordination window has reached its optimal value. (2) The coordination window should not be too large. Once the object pixel count of the binary image has stabilized, further increasing the window size makes the smoothing effect of the median filter too strong and degrades the noise reduction of the background subtraction operation [13]. In summary, the optimal coordination window size is the one at which the pixel count of the binary image has just become stable.
In this paper, the OFSI (Optimal Filter Size Index) is defined to describe the change in the number of object pixels of the binary image, and N denotes the total number of object pixels of the binary image. For example, if there are 50 crack object pixels in the binary image (black, expressed as 0) and all the remaining pixels (white, expressed as 1) are background pixels, then N = 50. The OFSI index is calculated as follows:
In the formula, N (i) is the total number of pixels of the object when the coordinate window size is i; w is the width of the image and h is the height of the image.
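The quantities N(i), w and h can be put together in a short Python sketch. The exact OFSI expression is not reproduced here, so the form used below, the absolute change in object pixel count between consecutive window sizes divided by the image area w·h, is only an assumption that is consistent with the definitions above. The function names and the sample counts are hypothetical.

```python
def count_object_pixels(binary):
    """N: number of object pixels (value 0) in a binary image."""
    return sum(row.count(0) for row in binary)

def ofsi_series(object_counts, width, height):
    """Map each window size i to an assumed OFSI value.

    object_counts maps window size i -> N(i). The assumed index is
    |N(i) - N(previous size)| / (width * height).
    """
    sizes = sorted(object_counts)
    area = width * height
    return {
        cur: abs(object_counts[cur] - object_counts[prev]) / area
        for prev, cur in zip(sizes, sizes[1:])
    }

binary = [[1, 0, 1], [1, 0, 1], [1, 1, 1]]
n = count_object_pixels(binary)

# Hypothetical object pixel counts for a 100 x 100 image
counts = {3: 120, 5: 80, 7: 52, 9: 50, 11: 50}
index = ofsi_series(counts, width=100, height=100)
```

Under this reading, the index drops towards zero once enlarging the window no longer changes the set of object pixels.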
To verify the validity of the OFSI index, this paper takes a single-crack image and a multi-crack image as examples, performs background subtraction noise reduction and Niblack’s binarization, and at the same time calculates the crack width within the boxed region of each figure. The results for the single-crack image are shown in Fig. 2 below, and the results for the multi-crack image are shown in Fig. 3 below.

Convergence judgment of single crack image.

Convergence judgment of multi-crack image.
As can be seen from Figs. 2 and 3, for images with either a single crack or multiple cracks, the coordination window size at which the OFSI index finally stabilizes ensures that the calculated widths of all cracks reach accurate values [14]. The optimal coordination window size is indeed 2–3 times the crack width.
In this section, an adaptive noise reduction algorithm combining the background subtraction operation and Niblack’s binarization method is proposed. The window parameters of the two filters in the algorithm are determined automatically from the crack image itself. The calculation of the OFSI index is summarized in the flow chart shown in Fig. 4 below:

Convergence judgment process.
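The convergence judgment can be sketched as a simple selection rule: scan the window sizes in increasing order and take the smallest one from which the index stays small. This is only one possible reading of "just stable"; the function name and the tolerance value are hypothetical and would need tuning on real images.

```python
def select_window(ofsi_by_size, tolerance=1e-3):
    """Return the smallest window size from which the index stays below tolerance.

    Requiring every later size to be stable as well skips over early dips
    that are followed by another peak, as can happen with several cracks.
    """
    sizes = sorted(ofsi_by_size)
    for pos, size in enumerate(sizes):
        if all(ofsi_by_size[s] < tolerance for s in sizes[pos:]):
            return size
    return None  # no stable window found in the scanned range

# Hypothetical index values per coordination window size
ofsi_by_size = {5: 0.0040, 7: 0.0028, 9: 0.0002, 11: 0.0000, 13: 0.0001}
best = select_window(ofsi_by_size)
```

Choosing the smallest stable size reflects the second factor discussed above: a window larger than necessary over-smooths the image.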
After image noise reduction and binarization, the crack object in the image can be separated from the background [15]. Crack extraction refers to the processing of the crack objects after noise reduction and binarization, so that they meet the requirements of the crack parameter calculation in the next step. Based on the characteristics of cracks in concrete building structures, this paper analyzes the elimination of holes inside cracks and the decomposition and numbering of complex cracks.
Elimination of holes in cracks
Wider or shallower concrete cracks appear with inconsistent color depth in the image, so cracks with uneven gray level appear in the gray-level image. After background subtraction noise reduction and binarization, the lighter parts of the crack object may be converted into background and appear inside the crack body as holes. In other words, the non-uniform gray level inside the crack in the original gray-level image evolves into holes inside the crack after binarization [16]. If the holes are not filled before the crack parameters are calculated, the width calculation will be wrong. For this reason, this paper adopts morphological operations to eliminate crack holes in the binary image. The specific process is as follows: (1) Extract the edges of the objects in the binarized crack image. This paper uses the Canny operator for edge extraction. The image gradient is calculated as follows:
$$G_x(x, y) = \frac{\partial I(x, y)}{\partial x}, \quad G_y(x, y) = \frac{\partial I(x, y)}{\partial y}$$
The image amplitude is calculated as:
$$M(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2}$$
Here G_x and G_y are the gradient components of the image I(x, y) in the x and y directions, and M is the gradient amplitude.
(2) Close the edges. For open edges, use the morphological operation of dilation followed by erosion to close them as far as possible. Whether an edge is closed is judged by its Euler number; before judging, all individual edges are numbered in advance. (3) Fill all closed edges. After this operation, the region inside the outer edge of the crack, including the inner holes, is filled, which eliminates the holes inside the crack.
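The effect of the filling step can be illustrated with a small Python sketch. Note that it swaps in a different, simpler technique than the edge-based procedure described above: it flood-fills the background from the image border and treats every background pixel that cannot be reached as a hole. The function name is hypothetical, the binary convention (0 = crack object, 1 = background) follows the text, and the edge-closing step (dilation followed by erosion) is not shown.

```python
from collections import deque

def fill_holes(binary):
    """binary: 0 = crack object, 1 = background. Returns a copy with holes filled."""
    h, w = len(binary), len(binary[0])
    outside = [[False] * w for _ in range(h)]
    # Seed the search with every background pixel on the image border
    queue = deque(
        (y, x)
        for y in range(h)
        for x in range(w)
        if (y in (0, h - 1) or x in (0, w - 1)) and binary[y][x] == 1
    )
    for y, x in queue:
        outside[y][x] = True
    # 4-connected flood fill, so it cannot leak through 8-connected crack pixels
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] == 1 and not outside[ny][nx]:
                outside[ny][nx] = True
                queue.append((ny, nx))
    # Anything not reachable from the border is crack body or an enclosed hole
    return [[1 if outside[y][x] else 0 for x in range(w)] for y in range(h)]

binary = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],   # the centre pixel is a hole inside the crack body
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
filled = fill_holes(binary)
```

This variant only fills holes that are already fully enclosed, which is why the text closes open edges first.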
Cracks on the surface of concrete building structures generally appear in tree-like form and can be quite complex. After the crack object is obtained by image processing, the complex crack body must be preprocessed to facilitate the calculation of crack parameters. It is usually most convenient to decompose complex tree-like or network cracks into individual cracks, because each individual crack can then be processed by repeated batch operations. The specific process is: (1) After the crack body is obtained, extract the crack skeleton line, as shown in Fig. 5 below. The crack skeleton line represents the central axis of the crack; it is 8-connected and only 1 pixel wide. Operating on the skeleton line is more convenient and accurate than operating on the crack body directly [17]. The skeleton line may contain small single-pixel “burrs”, which need to be eliminated by morphological operation. (2) Number all the crack bodies in the image, and store each skeleton line under the number of its corresponding crack body. (3) Look for all bifurcation points in the skeleton line. Bifurcation points are points where multiple (three or more) individual cracks meet. If no bifurcation point is found in a crack body, the crack body is considered to consist of only one crack. (4) Decompose all crack bodies at their bifurcation points.

Crack skeleton line.
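The search for bifurcation points can be sketched by counting the 8-connected neighbors of each skeleton pixel: a pixel with three or more skeleton neighbors is treated as a point where several cracks meet. This is a simplified illustration with a hypothetical function name. It assumes a fully thinned, 1-pixel-wide skeleton with burrs already removed; on a skeleton with redundant pixels, this simple count may also flag pixels next to the true junction.

```python
def find_bifurcations(skeleton_rows):
    """Return sorted (row, col) skeleton pixels with 3 or more 8-connected neighbours."""
    pixels = {
        (y, x)
        for y, row in enumerate(skeleton_rows)
        for x, ch in enumerate(row)
        if ch == "#"
    }
    offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]
    return sorted(
        (y, x)
        for (y, x) in pixels
        if sum((y + dy, x + dx) in pixels for dy, dx in offsets) >= 3
    )

# Y-shaped skeleton: two branches merge into a single vertical crack
skeleton = [
    "#...#",
    ".#.#.",
    "..#..",
    "..#..",
    "..#..",
]
junctions = find_bifurcations(skeleton)
```

Removing the junction pixels from the skeleton would leave three separate branches, which corresponds to decomposing the crack body into individual cracks.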
Crack parameter identification is the process of calculating the width, length and other information of the crack and outputting the result. It mainly includes the calculation of the crack length and width, the scale adjustment of the crack parameters and the analysis of the accuracy of crack parameter identification.
Calculation of crack length and width
The calculation of the crack length is directly carried out by summing the number of pixels of the skeleton line [18]. The specific steps are as follows: after the skeleton line is obtained and the burrs are removed, the skeleton line is a standard 1-pixel line, and the pixel sum represents the total length of the skeleton line.
Calculating the width of the crack requires the two crack edge lines and the local perpendicular line of the crack. In this paper, the Canny edge operator is used to extract the edge lines. Note that the resulting edge lines are 8-connected. The crack width is calculated as follows: (1) determine the tangent direction of the skeleton line at the width point; (2) determine the perpendicular direction of the skeleton line at the width point; (3) search for the intersection points of the perpendicular line with the two edge lines; (4) calculate the distance between the two intersections, which is the width of the crack at that width point.
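The length and width calculations can be sketched in a few lines of Python. The sketch is a simplification rather than the procedure above: instead of intersecting the perpendicular with Canny edge lines, it walks along the local normal in unit steps until it leaves the crack body, which gives the width in whole pixels. It assumes the skeleton is available as an ordered list of at least two (row, col) points; all names are hypothetical.

```python
import math

def local_width(crack_pixels, skeleton, i, max_steps=1000):
    """Width in pixels at skeleton[i], measured along the local normal."""
    y0, x0 = skeleton[max(i - 1, 0)]
    y1, x1 = skeleton[min(i + 1, len(skeleton) - 1)]
    ty, tx = y1 - y0, x1 - x0            # tangent from neighbouring skeleton points
    norm = math.hypot(ty, tx)
    ny, nx = -tx / norm, ty / norm        # unit normal: tangent rotated by 90 degrees
    cy, cx = skeleton[i]

    def reach(sign):
        """Count unit steps along the normal that stay inside the crack body."""
        steps = 0
        while steps < max_steps:
            y = round(cy + sign * (steps + 1) * ny)
            x = round(cx + sign * (steps + 1) * nx)
            if (y, x) not in crack_pixels:
                break
            steps += 1
        return steps

    return reach(+1) + reach(-1) + 1      # +1 counts the skeleton pixel itself

# Vertical crack occupying columns 3..7 (5 pixels wide), skeleton in column 5
crack_pixels = {(y, x) for y in range(10) for x in range(3, 8)}
skeleton = [(y, 5) for y in range(10)]

length_px = len(skeleton)                 # length as the number of skeleton pixels
width_px = local_width(crack_pixels, skeleton, 4)
```

The tangent is estimated from the two neighboring skeleton points, so any error in that direction carries over into the width, which is the search angle deviation the text refers to.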
Proportion adjustment of crack parameters
The crack length and width are obtained in pixels, so the scale must be adjusted before the parameters are output, converting them into a standard length unit such as millimeters or centimeters [19]. When the crack image is taken, the parameters of the camera affect the scale adjustment of the image, but imaging equipment of all kinds, including the combination of an SLR camera and a telephoto lens, can generally be simplified to a thin lens imaging model. In this paper, the thin lens imaging model is adopted to adjust the crack parameters, as shown in Fig. 6 below.

Thin lens imaging model.
Let the image distance be v, the object distance be u and the focal length be f. To scale the image back to the original size of the object, the ratio of image distance to object distance is needed. The imaging formula for thin lenses is:
$$\frac{1}{f} = \frac{1}{u} + \frac{1}{v}$$
The imaging ratio adjustment formula is:
In the formula, W_s is the crack width after adjustment, in mm; W_p is the crack width in pixels; PIS is the camera imaging element parameter.
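The scale adjustment can be illustrated with a short Python sketch built on the thin lens equation 1/f = 1/u + 1/v. The adjustment formula itself is assumed here to be W_s = W_p · PIS · u / v, with PIS interpreted as the physical size of one pixel on the sensor in mm per pixel; that interpretation, the function names and the working distance in the example are assumptions. The sensor width, resolution and focal length are the values given in the experiment description.

```python
def image_distance(u_mm, f_mm):
    """Solve the thin lens equation 1/f = 1/u + 1/v for the image distance v."""
    if u_mm <= f_mm:
        raise ValueError("object distance must exceed the focal length")
    return u_mm * f_mm / (u_mm - f_mm)

def pixels_to_mm(width_px, u_mm, f_mm, pixel_pitch_mm):
    """Assumed form: W_s = W_p * PIS * u / v, with PIS taken as mm per pixel."""
    v_mm = image_distance(u_mm, f_mm)
    return width_px * pixel_pitch_mm * u_mm / v_mm

# Pixel pitch from a 23.5 mm wide sensor with 6000 pixels across
pitch = 23.5 / 6000

# Hypothetical case: a 10-pixel-wide crack at a working distance of 1155 mm, f = 55 mm
width_mm = pixels_to_mm(width_px=10, u_mm=1155.0, f_mm=55.0, pixel_pitch_mm=pitch)
```

In this example the image distance is 57.75 mm, so the object is 20 times larger than its image and the 10-pixel crack corresponds to roughly 0.78 mm.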
The algorithm presented in this paper can identify small cracks on the concrete surface at a long working distance. Before practical application, it is generally necessary to know the narrowest identifiable crack under a given imaging model, or the imaging requirements that must be met to identify a specific crack. This requires determining how many pixels wide a crack must be for the width recognition method in this paper to identify it. In practice, at least 3 feature lines of the crack must be obtained during identification, namely 1 skeleton line and 2 edge lines, so the crack should be at least 3 pixels wide. Considering the search angle deviation, the search error is significantly smaller when the crack width is greater than or equal to 5 pixels than when it is 3 pixels. Therefore, in this paper the narrowest crack should be at least 5 pixels wide in the image.
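The 5-pixel requirement can be turned into two planning questions: how narrow a crack can be resolved at a given working distance, and how far away the camera may stand for a given crack width. The Python sketch below answers both under the thin lens model, assuming that one pixel on the sensor corresponds to pixel_pitch · (u − f) / f on the object, with the pixel pitch in mm per pixel. This relation, the function names and the example distance are assumptions for illustration.

```python
def narrowest_crack_mm(u_mm, f_mm, pixel_pitch_mm, min_pixels=5):
    """Smallest crack width (mm) that still spans min_pixels at object distance u."""
    return min_pixels * pixel_pitch_mm * (u_mm - f_mm) / f_mm

def max_working_distance_mm(target_width_mm, f_mm, pixel_pitch_mm, min_pixels=5):
    """Largest object distance at which a crack of target width spans min_pixels."""
    return f_mm * (1 + target_width_mm / (min_pixels * pixel_pitch_mm))

pitch = 23.5 / 6000                        # mm per pixel, from the sensor data in the text
limit_mm = narrowest_crack_mm(u_mm=1155.0, f_mm=55.0, pixel_pitch_mm=pitch)
distance_mm = max_working_distance_mm(limit_mm, f_mm=55.0, pixel_pitch_mm=pitch)
```

The two functions are inverses of each other, so feeding the limit back in recovers the original working distance; with these example numbers the narrowest identifiable crack is about 0.39 mm.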
To verify the effectiveness of the building structure health monitoring and damage recognition method based on digital image processing established in this paper, 3 images of cracked indoor reinforced concrete were selected and discussed. The experiment was carried out at a university in China, and the test object was the reinforced concrete reaction wall in the laboratory. Because of concrete shrinkage during pouring of the wall and repeated loading over years of testing, there are many small cracks on the wall. These cracks are the objects of observation and identification in this test. The camera used in this experiment is a SONY Alpha 6400M with a 23.5×15.6 mm sensor and a resolution of 6000×4000 pixels, and the lens focal length is fixed at 55 mm. The working distance was measured using a Bosch DLR130 laser rangefinder, and the true width of the crack was measured using a crack ruler for reference comparison.
Experimental images
The crack images of the concrete building used for experimental identification are shown in Fig. 7 below, where A is a vertical crack, B is a transverse crack, and C contains multiple cracks. The blue circles in the crack images mark the measurement points.

Crack image of concrete building structure.
The adaptive image noise reduction algorithm proposed in this paper is used to calculate the OFSI index of each image separately. The results show that the OFSI index of the two single-crack images in Figs. A and B has only one peak, while the OFSI index of the multi-crack image in Fig. C has two peaks. According to the analysis results, the coordination window sizes selected in this paper are 20, 36 and 74.
Calculation of crack width
The crack width calculation method proposed in section 4.1 is used to calculate the crack width of the image, and the specific calculation results are shown in Table 1 below.
Pixel size of crack recognition width
The calculated and actual measured values of crack widths at the three marked points in Fig. 7 and the statistical results of object distance data are shown in Table 2 below, and the experimental calculation results are shown in Table 3 below.
Experimental result data
Experimental error data
As can be seen from the data in Tables 2 and 3, the relative errors for Fig. A, Fig. B and Fig. C are all within 10%, and Fig. C has the lowest relative error. The reason is that Fig. C was taken at a smaller object distance, so the cracks in the image span more pixels across their width, which reduces the error in the width calculation. On the whole, the concrete building damage identification algorithm based on image processing established in this paper is effective and accurate, and the more pixels the crack width spans, the more accurate the calculated result.
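For reference, the relative error used in this comparison can be computed as the absolute difference between the calculated and measured widths divided by the measured width. The Python sketch below shows the arithmetic; the function name is hypothetical and the numbers are illustrative only, not values taken from Tables 2 and 3.

```python
def relative_error(calculated_mm, measured_mm):
    """Relative error of a calculated width against the crack-ruler measurement."""
    return abs(calculated_mm - measured_mm) / measured_mm

# Hypothetical values for illustration only; they are not the paper's data
err = relative_error(calculated_mm=0.46, measured_mm=0.50)
within_tolerance = err <= 0.10
```

A fixed absolute error of a fraction of a pixel weighs less when the crack spans more pixels, which matches the observation that the image taken at the smallest object distance has the lowest relative error.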
In summary, this paper proposes a concrete surface crack recognition method based on digital image processing, which performs adaptive noise reduction and binarization of the crack image, crack body extraction and parameter recognition, and displays crack morphology and width information. The advantage of the algorithm is that the background subtraction algorithm effectively removes strong noise, such as dark spots and stains, that may exist on the concrete surface. At the same time, the width search avoids missing the edge lines and gives a definite error range for the search angle deviation. The experimental analysis shows that the damage identification algorithm for concrete building structures constructed in this paper performs well, and the more pixels the crack width spans, the higher the computational accuracy. On the whole, the research on image damage recognition using a machine learning model in this paper has reached the expected level, but problems remain, such as high requirements on image resolution and the limitations of imaging technology, and the depth of cracks cannot be recognized. Future research will explore these issues further, adopting ultrasonic, infrared and other methods in combination with new recognition technologies.
