Abstract
In recent years, the lack of thermal images and the difficulty of thermal feature extraction have led to low accuracy and efficiency in the fault diagnosis of circuit boards using thermal images. To address the problem, this paper presents a simple and efficient intelligent fault diagnosis method combined with computer vision, namely the bag-of-SURF-features support vector machine (BOSF-SVM). Firstly, an improved BOF feature extraction based on SURF is proposed. The preliminary fault features of the abnormally hot components are extracted by the speeded-up robust features algorithm (SURF). In order to extract the ultimate fault features, the preliminary fault features are clustered into K clusters by K-means and substituted into the bag-of-features model (BOF) to generate a bag-of-SURF-feature vector (BOSF) for each image. Then, all of the BOSF vectors are fed into SVM to train the fault classification model. Finally, extensive experiments are conducted on two homemade thermal image datasets of circuit board faults. Experimental results show that the proposed method is effective in extracting the thermal fault features of components and reducing misdiagnosis and underdiagnosis. Also, it is economical and fast, facilitating savings in labour costs and computing resources in industrial production.
Introduction
Circuit abnormalities caused by component failures can make the entire system fail or even cause catastrophic accidents [1, 2]. Rapid and accurate identification of circuit board faults is beneficial to improving maintenance efficiency and restoring equipment operation as early as possible. However, it is difficult to obtain fault data in practical engineering applications due to the relative reliability of circuit boards and the complexity and variability of the operating environment, which poses a serious challenge to the fault diagnosis of circuit boards [3].
In recent years, infrared thermography has emerged as a non-destructive testing method that is of great interest to researchers in the field of fault diagnosis owing to its advantages of no contact, no downtime, simple operation, and temperature visualization [4–6]. The earliest methods of circuit board fault diagnosis using infrared thermography were the “difference method” and the “sequence method”. These methods only require comparison with standard boards and focus on image processing, including alignment, fusion, and segmentation of visible and thermal images, with the aim of visualizing the location of components [7–9]. In the “difference method”, the difference at a given moment between the thermal images of the standard board and the board under test is compared with a set temperature threshold to identify obvious thermal faults; in the “sequence method”, faults are diagnosed by comparing the temperature profiles of the components in the thermal sequence images of the standard board and the board under test [10]. Nevertheless, the “difference method” is prone to serious misdiagnosis and underdiagnosis because its thresholds are usually set from subjective human experience, while the “sequence method” is complex to operate and data-intensive because it requires the temperature profiles of all components.
With the development of artificial intelligence, machine learning has become the dominant method in the field of infrared thermography-based circuit board fault diagnosis. The diagnostic process is divided into image preprocessing, temperature feature extraction, and pattern recognition. Wang et al. and Hao et al. carried out extensive research on infrared thermography-based fault diagnosis of airborne circuit boards. They used preprocessed thermal sequence images to obtain the temperature feature matrix of the components, extracted multiple temperature features reflecting the faults, and finally established a multi-level intelligent fault diagnosis model by combining an SVM classifier with D-S evidence theory, which effectively improved the efficiency and accuracy of identifying different fault states of circuit boards [11, 12]. Mehra used the two-dimensional discrete wavelet transform (2D-DWT) and principal component analysis (PCA) for feature extraction and dimensionality reduction, respectively, of circuit board thermal images and then fed the feature vectors into an SVM to distinguish the “health” and “fault” states of circuit boards [13]. Although machine learning-based diagnostic methods have achieved good results, they currently suffer from the following problems. Firstly, the fault diagnosis process is rather cumbersome, involving techniques such as alignment, fusion, and segmentation of thermal and visible images, which makes it relatively inefficient. Secondly, as the temperature sensitivity of a thermal imaging camera affects the accuracy of fault diagnosis, an accurate temperature matrix needs to be obtained before extracting temperature features, which greatly increases the hardware cost of the infrared equipment.
Fortunately, with the continuous optimization of image feature extraction algorithms, computer vision-based classification methods are now widely used in many fields and have achieved good classification results. Ahmed et al. utilized the SURF algorithm to extract features and the K-means clustering algorithm to build a visual feature bag, and then combined it with a multi-class shallow classifier to classify three health states from infrared images of PV panels with an accuracy of 97% [14]. Bakour et al. proposed a new generic image classification method based on local and global image features, a bag of visual words, and machine learning techniques for Android malware classification [15]. Li et al. proposed a noisy partial discharge (PD) pattern recognition method based on the SURF algorithm and an improved SVM, which effectively improved the accuracy of PD recognition [16]. Arora et al. proposed a BOF and SVM-based method for the early diagnosis of skin cancer, resulting in a 3% improvement in diagnostic performance [17]. Therefore, to address the shortcomings of current machine learning-based circuit board fault diagnosis methods, this paper introduces a computer vision approach into the machine learning-based diagnostic method. It integrates the bag-of-features model (BOF) with the speeded-up robust features algorithm (SURF) to extract the BOSF fault features from thermal images of circuit boards and trains a lightweight multi-classification SVM model to diagnose faults.
The goal of this study is to design an optimal model for circuit board fault classification that satisfies fast diagnosis while maximizing all evaluation indicators, especially accuracy. The objectives can be formulated as follows:
The main contributions of this paper are summarized as follows. (1) A computer vision approach is introduced to simplify the diagnostic process. (2) The BOSF feature extraction provides a more accurate and robust representation of thermal faults, thus avoiding complex image alignment, image segmentation, and component temperature extraction operations. (3) The multi-classification SVM model, which generalizes well, is well suited to the small-sample, high-dimensional data found in circuit board thermal images. The advantage of this method is that, by replacing manual diagnosis with an intelligent method, it greatly saves computational resources in industrial production, improves the accuracy and efficiency of fault diagnosis, and offers strong robustness and real-time performance.
SURF algorithm
SURF, originally proposed by Bay et al. [19] in 2006, is a robust and fast algorithm for local feature point detection and description and is an accelerated variant of the scale-invariant feature transform (SIFT). It is widely used in object recognition, image retrieval and image registration due to its fast computation and its excellent invariance to scale, rotation and noise [20–22]. SURF consists of two main phases: feature point detection and feature point descriptor generation.
The first phase uses the Hessian matrix: feature points are located where its determinant takes a local maximum. The second phase uses the Haar wavelet to determine the principal direction of each feature point and to generate a 64-dimensional descriptor.
BOF model
BOF is derived from the Bag-of-Words model in the field of text retrieval and was transplanted to image processing by Csurka et al. [23] in 2004. The BOF model is composed of three steps. (1) Detect interest points in the image and construct a descriptor for each point using a local feature algorithm. (2) Use a clustering algorithm, such as K-means, to cluster the descriptor data set into multiple classes. (3) Construct a frequency histogram for each image based on how many of its descriptors belong to each cluster.
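Step (3) can be illustrated with a minimal, dependency-free sketch. It assumes the cluster centres from step (2) are already available; the function name `bof_histogram`, the toy 2-D vectors, and the normalisation of the histogram to unit sum are illustrative assumptions and are not taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bof_histogram(descriptors, centers):
    """Return the normalised visual-word histogram of one image.

    descriptors: local feature vectors extracted from the image.
    centers: cluster centres, i.e. the visual vocabulary.
    """
    counts = [0] * len(centers)
    for d in descriptors:
        # Nearest-neighbour assignment: pick the closest visual word.
        nearest = min(range(len(centers)), key=lambda j: sq_dist(d, centers[j]))
        counts[nearest] += 1
    total = sum(counts)
    # Normalising (an assumption here) makes images with different
    # numbers of keypoints comparable.
    return [c / total for c in counts] if total else [0.0] * len(centers)


# Toy 2-D example; real SURF descriptors are 64-dimensional.
centers = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
hist = bof_histogram(descriptors, centers)
print(hist)  # [0.25, 0.5, 0.25]
```

Whatever the number of keypoints in an image, the resulting vector has one entry per visual word, which is what allows a fixed-length input to a classifier.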
SVM classifier
SVM is a machine learning algorithm based on statistical learning theory and structural risk minimization, mainly used to solve binary classification problems. The core idea is to find a maximum-margin decision boundary that separates two classes of samples [24].
Like neural networks, it is also widely used for multi-classification because of its strong performance. A multi-classification problem is converted into several binary classification problems by the one-versus-one (ovo) or one-versus-rest (ovr) strategy [25, 26].
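The difference between the two strategies can be made concrete with a short sketch. The helper names and the class labels in the usage lines are hypothetical; taking the class with the largest decision value is the usual ovr decision rule and is assumed here rather than quoted from the paper.

```python
def n_binary_classifiers(n_classes, strategy):
    """Number of binary SVMs needed for a multi-class problem."""
    if strategy == "ovr":
        return n_classes                         # one classifier per class
    if strategy == "ovo":
        return n_classes * (n_classes - 1) // 2  # one per pair of classes
    raise ValueError(f"unknown strategy: {strategy}")


def ovr_predict(scores):
    """Pick the class whose 'class versus rest' SVM is most confident.

    scores: dict mapping class label -> decision value of its binary SVM.
    """
    return max(scores, key=scores.get)


print(n_binary_classifiers(6, "ovr"))  # 6
print(n_binary_classifiers(6, "ovo"))  # 15
print(ovr_predict({"normal": -0.4, "fault_1": 1.3, "fault_2": -0.9}))  # fault_1
```

For a six-class problem, ovr therefore trains 6 binary classifiers whereas ovo trains 15.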
A thermal images-based fault diagnosis method: BOSF-SVM
When a circuit board is energized, the energy-consuming components heat up and produce thermal radiation, which is detected by the thermal imaging camera and converted into thermal images that show the temperature field of the board. The brightness of the thermal image corresponds to the intensity of the thermal radiation, which differs between normal and faulty conditions [18]. When a component is short-circuited, the current through it increases, resulting in more thermal radiation, a brighter thermal image, an expanded heated area, and a more visible texture; when a component suffers from an open circuit or a dummy connection, the current through it decreases, resulting in less thermal radiation, a darker thermal image and a less distinct texture.
Consequently, due to the above characteristics of circuit board thermal images, this paper correlates the texture features of thermal images with different fault states of circuit boards and proposes the fusion of BOF and SURF as a local feature descriptor to extract texture features of the faulty components from thermal images according to Section 3.3, and then applies a multi-classification SVM model as a classifier to realize circuit board fault diagnosis as noted in section 3.4. In the end, a circuit board thermal image fault diagnosis model with good generalization capability is constructed. Figure 1 gives the flowchart for the method proposed in this paper. The specific fault diagnosis algorithm in this paper is given in Table 1.

Flowchart of the proposed method.
BOSF-SVM modelling algorithm
Given that no circuit board thermal image datasets are publicly available, this paper uses two homemade datasets for the fault identification experiments. Experiments were conducted on the human reaction speed test board and the simulated remote control fan board, abbreviated as speed board and fan board respectively, as shown in Fig. 1; faults were introduced into the normal boards artificially. The speed board includes 5 artificially set fault states and 1 normal state; the fan board includes 2 artificially set fault states and 1 normal state. The experimental setup, shown in Fig. 1, consists of a thermal imaging camera, circuit boards, a programmable power supply, a PC, a storage hard disk and communication buses. The InfiRay Xtherm II T2L thermal imaging camera was used to capture thermal images of both boards at full capacity after a period of power-up. For each state of each board, 100 saturated thermal images were taken at the same ambient temperature, giving 600 images for the speed board and 300 for the fan board.
Image preprocessing
The stored images have a size of 1420*1080, since they were taken with the thermal imaging camera connected to a mobile phone and later transferred to the PC. To reduce the computational complexity, the thermal images were first downscaled to 355*270 and then converted to grayscale. Finally, contrast-limited adaptive histogram equalization was applied to enhance contrast, thus highlighting the features of the region of interest in the thermal images while also suppressing noise.
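The first two preprocessing steps can be sketched without external libraries. Going from 1420*1080 to 355*270 is a reduction by a factor of 4 in each direction; the sketch below assumes block averaging for the downscaling and the common BT.601 luma weights for the grayscale conversion, neither of which is specified in the paper. Contrast-limited adaptive histogram equalization is deliberately left out; it would normally be delegated to an image processing library such as OpenCV.

```python
def to_gray(pixel):
    """Luma conversion with the common BT.601 weights (an assumption)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def downscale_gray(image, factor=4):
    """Convert an RGB image (rows of (r, g, b) tuples) to grayscale and
    shrink it by averaging non-overlapping factor x factor blocks."""
    height, width = len(image), len(image[0])
    out = []
    for top in range(0, height - height % factor, factor):
        row = []
        for left in range(0, width - width % factor, factor):
            block = [to_gray(image[y][x])
                     for y in range(top, top + factor)
                     for x in range(left, left + factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Tiny synthetic 8-row x 12-column image standing in for a thermal image.
image = [[(x * 20, y * 30, 100) for x in range(12)] for y in range(8)]
small = downscale_gray(image)
print(len(small), len(small[0]))  # 2 3
```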
BOSF feature extraction
BOSF is constructed by the fusion of the BOF model and the SURF algorithm. Firstly, it uses SURF to extract the preliminary fault features of the circuit board thermal images. It represents an image as a collection of local SURF feature descriptors. Then, a visual vocabulary is generated using K-means and the descriptors are assigned to the visual words according to the nearest neighbour principle for thermal image vectorization and dimensional reduction.
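The vocabulary-building step can be sketched as follows, using the repeated K-means runs with random initial centres that step 3) of this section describes. This is a plain-Python illustration under stated assumptions, not the authors' implementation: the function names, the iteration cap `n_iter`, the fixed seed, and the use of the within-cluster sum of squares as the criterion for the "best" run are all assumptions.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_once(points, k, rng, n_iter=50):
    """One K-means run started from k randomly chosen data points."""
    centers = rng.sample(points, k)
    for _ in range(n_iter):
        # Assignment step: group points by nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[j].append(p)
        # Update step: move each centre to the mean of its cluster
        # (an empty cluster keeps its previous centre).
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    inertia = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, inertia


def build_vocabulary(points, k, n_restarts=10, seed=0):
    """Run K-means n_restarts times and keep the lowest-inertia result."""
    rng = random.Random(seed)
    runs = [kmeans_once(points, k, rng) for _ in range(n_restarts)]
    return min(runs, key=lambda run: run[1])


# Synthetic 2-D "descriptors" around three centres, for illustration only.
data_rng = random.Random(1)
blobs = [(cx + data_rng.gauss(0, 0.1), cy + data_rng.gauss(0, 0.1))
         for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(30)]
vocabulary, inertia = build_vocabulary(blobs, k=3)
print(len(vocabulary))  # 3
```

Restarting from several random initialisations reduces the chance that the vocabulary reflects a poor local optimum of a single run. Library implementations of K-means commonly expose the same idea as a restart-count parameter.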
Given thermal images IR = {x_i : i = 1, 2, 3, ..., N}, for each thermal image x_i, the calculation of its BOSF vector BOSF_i can be expressed by 3 sub-calculations:
1) It selects the points of interest of a circuit board thermal image using the Hessian matrix:
When the determinant of the Hessian matrix reaches a local maximum, the current point is determined to be a key point, that is, a point brighter or darker than the other points in its neighbourhood. Thus p_i, a collection of SURF feature points, is obtained.
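The equation for step 1) is not reproduced in this text. For orientation, the standard SURF formulation of Bay et al. [19] is given below. The symbols (the point x, the scale σ, the second-order Gaussian derivative responses L, their box-filter approximations D, and the weight 0.9) follow that formulation and are assumed to correspond to the omitted equation.

```latex
H(\mathbf{x}, \sigma) =
\begin{bmatrix}
L_{xx}(\mathbf{x}, \sigma) & L_{xy}(\mathbf{x}, \sigma) \\
L_{xy}(\mathbf{x}, \sigma) & L_{yy}(\mathbf{x}, \sigma)
\end{bmatrix},
\qquad
\det\left(H_{\text{approx}}\right) = D_{xx} D_{yy} - \left(0.9\, D_{xy}\right)^{2}
```

A point is kept as a feature point when this approximated determinant is a local maximum over its spatial and scale neighbourhood.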
2) To make the feature point rotationally invariant, SURF determines the principal direction vector by calculating the Haar wavelet response values on the x and y axes for all points in the neighbourhood of the feature point. After the principal direction of the feature point is obtained, a square area with a side length of 20σ is determined with the feature point as the center and the principal direction as the x-axis. The square area is divided into 4*4 sub-regions, and the Haar wavelet response values d_x and d_y in the horizontal and vertical directions are calculated for each sub-region. Then, we represent the sub-region by a four-dimensional vector as Eq. (6).
Finally, the vectors of the 16 sub-regions are concatenated to form a 64-dimensional descriptor. A preliminary feature matrix m_i is formed by all the descriptors from a thermal image.
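The construction in step 2) can be sketched in a few lines. The sketch assumes the standard SURF sub-region vector (Σd_x, Σd_y, Σ|d_x|, Σ|d_y|), 5*5 sample points per sub-region, and a final unit-length normalisation; the Gaussian weighting of the responses used in the original algorithm is omitted for brevity, and the function name and synthetic inputs are hypothetical.

```python
import math


def surf_like_descriptor(dx, dy):
    """Build a 64-D descriptor from two 20x20 grids of Haar responses.

    dx, dy: 20x20 nested lists of horizontal / vertical responses sampled
    in the oriented square window (5x5 samples per sub-region is assumed).
    """
    descriptor = []
    for row in range(0, 20, 5):        # 4 sub-regions vertically
        for col in range(0, 20, 5):    # 4 sub-regions horizontally
            sx = sy = sax = say = 0.0
            for i in range(row, row + 5):
                for j in range(col, col + 5):
                    sx += dx[i][j]
                    sy += dy[i][j]
                    sax += abs(dx[i][j])
                    say += abs(dy[i][j])
            # Four values per sub-region: sums and sums of absolute values.
            descriptor.extend([sx, sy, sax, say])
    # Unit-length normalisation reduces sensitivity to contrast changes.
    norm = math.sqrt(sum(v * v for v in descriptor))
    return [v / norm for v in descriptor] if norm else descriptor


# Synthetic responses, only to exercise the function.
dx = [[math.sin(0.3 * i + 0.1 * j) for j in range(20)] for i in range(20)]
dy = [[math.cos(0.2 * i - 0.4 * j) for j in range(20)] for i in range(20)]
desc = surf_like_descriptor(dx, dy)
print(len(desc))  # 64
```

The 16 sub-regions with four values each account for the descriptor length of 64.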
3) K-means is run 10 times on all SURF descriptors of the training set, m_train, each time with randomly selected initial cluster centers, and the best clustering result is taken as the final result. This yields k cluster centers, which form a fault visual vocabulary of size k.
The lightweight and fast multi-classification SVM model is adopted to meet the requirement of real-time processing. The BOSF vectors obtained after feature extraction are used as the final valid input to the multi-classification SVM model. The “ovr” strategy is employed to build as many SVM classifiers as there are fault types, where the goal of each SVM is to maximize the margin of the decision boundary that separates one class from the remaining samples. Each binary classifier is calculated as follows.
Given N training samples where
The Lagrange multiplier method is introduced to solve the optimization problem. When the Lagrangian function satisfies the KKT conditions, the problem is converted into maximizing the Lagrangian dual function, as shown in Eq. (11).
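The equations of this derivation, including Eq. (11), are not reproduced in this text. For orientation, the standard soft-margin SVM formulation for a single binary classifier is sketched below. The symbols w, b, ξ_i, α_i, φ and the kernel K are the conventional ones and are assumptions here; C and γ are the penalty and RBF kernel parameters tuned later in the experiments. The numbering and exact form of the paper's own equations may differ.

```latex
% Primal problem for one binary (one-versus-rest) classifier
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \;
  \frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{N} \xi_i
\quad \text{s.t.} \quad
  y_i\bigl(\mathbf{w}^{\top}\phi(\mathbf{x}_i) + b\bigr) \ge 1 - \xi_i,
  \;\; \xi_i \ge 0, \;\; i = 1, \dots, N

% Dual problem obtained with Lagrange multipliers \alpha_i
\max_{\boldsymbol{\alpha}} \;
  \sum_{i=1}^{N} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N}
    \alpha_i \alpha_j y_i y_j K(\mathbf{x}_i, \mathbf{x}_j)
\quad \text{s.t.} \quad
  \sum_{i=1}^{N} \alpha_i y_i = 0, \;\; 0 \le \alpha_i \le C

% RBF kernel
K(\mathbf{x}_i, \mathbf{x}_j) =
  \exp\bigl(-\gamma \lVert \mathbf{x}_i - \mathbf{x}_j \rVert^{2}\bigr)
```

Only the samples with non-zero α_i (the support vectors) contribute to the resulting decision function, which is one reason the model remains lightweight at prediction time.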
The experiments were run in a Python 3.7 environment on a machine with an AMD Ryzen 5 5600H CPU, 16 GB of RAM and an NVIDIA GeForce RTX 3050 Ti GPU, mainly using the scikit-learn and OpenCV libraries.
Introduction of data sources
To verify the effectiveness and superiority of the proposed method, this paper uses two homemade thermal image datasets for the circuit board fault identification experiments. Each dataset is split into training and testing sets at a ratio of 30:70, as shown in Tables 2 and 3.
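A 30:70 division of the two datasets can be sketched as follows. The sketch assumes a random split performed separately within each class (a stratified split) and a fixed seed; the paper does not state how the images were assigned to the two sets, and the integer state labels are placeholders.

```python
import random


def stratified_split(labels, train_fraction=0.3, seed=0):
    """Split sample indices per class so every class keeps the same ratio."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * train_fraction)
        train.extend(indices[:n_train])
        test.extend(indices[n_train:])
    return sorted(train), sorted(test)


# Speed board: 6 states x 100 images; fan board: 3 states x 100 images.
speed_labels = [state for state in range(6) for _ in range(100)]
fan_labels = [state for state in range(3) for _ in range(100)]
for name, labels in [("speed", speed_labels), ("fan", fan_labels)]:
    train_idx, test_idx = stratified_split(labels)
    print(name, len(train_idx), len(test_idx))
# speed 180 420
# fan 90 210
```

With this ratio, each state contributes 30 training images and 70 test images.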
Fan dataset division
Speed dataset division
Considering that data acquired in real-world industrial production may be unevenly distributed across classes, this paper uses a weighted average over the m classes of a multi-classification problem to evaluate the classification performance of the fault diagnosis model on both datasets. The range of all four evaluation indicators is [0,1]. A higher value means better performance.
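The weighted-average calculation can be sketched as below. The four indicators are not named in this text; the sketch assumes they are accuracy, precision, recall and F1-score, with each class weighted by its share of the samples. The function name and the toy labels are illustrative only.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1."""
    n = len(y_true)
    support = Counter(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for cls, n_cls in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        predicted = sum(p == cls for p in y_pred)
        p_cls = tp / predicted if predicted else 0.0
        r_cls = tp / n_cls
        f_cls = 2 * p_cls * r_cls / (p_cls + r_cls) if p_cls + r_cls else 0.0
        weight = n_cls / n  # share of samples belonging to this class
        precision += weight * p_cls
        recall += weight * r_cls
        f1 += weight * f_cls
    return accuracy, precision, recall, f1


# Toy example with three unevenly sized classes.
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2, 2, 0]
acc, prec, rec, f1 = weighted_scores(y_true, y_pred)
print(round(acc, 4), round(prec, 4), round(rec, 4), round(f1, 4))
# 0.8 0.8333 0.8 0.8029
```

Under this weighting, the weighted recall always equals the overall accuracy, since each class's recall is multiplied by its own sample share.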
The BOSF-SVM method extracts the fault features of each thermal image through the BOSF model to obtain a BOSF vector of length k, which is used as the actual input to the final multi-classification SVM model to achieve fault classification. An important parameter of BOSF-SVM is the vocabulary size k, whose value directly affects the generalization of the classifier. Hence, in this paper, k was varied in steps of 50 within the range [50,300] and its effect on the average classification accuracy was observed on the test set.
As can be seen from Fig. 2, the speed dataset reaches its highest fault diagnosis accuracy of 99.76% at k = 200 and the fan dataset reaches its highest accuracy of 99.05% at k = 100. This is due to the limited number of valid feature points in the faulty component region of the thermal images: the accuracy of the algorithm no longer increases once k exceeds a certain value and even decreases because of classification errors caused by redundant descriptions.

The average accuracy of fault classification for different k values (a) The speed dataset (b) The fan dataset.
To verify the superiority of the feature extraction method in this paper, it was compared with four image feature extraction methods, namely gray-level histogram statistical features (GHSF) [27], the gray-level co-occurrence matrix (GLCM) [28], Hu invariant moments (huMoment) [29] and the histogram of oriented gradients (HOG) [30].
The illustration and parameters of the different feature extraction methods are given in Table 4. Here the ovr SVM classifier with the RBF kernel function was established for the experiment. To avoid overfitting, a grid search with 5-fold cross-validation was performed, taking 10 values each for C ∈ [1, 100] and γ ∈ [10^-5, 10], to obtain the best classification results.
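The size of this search can be illustrated by enumerating the parameter grid. The spacing of the 10 values within each range is not stated in the paper; the sketch below assumes logarithmic spacing, which is a common choice for C and γ, and the helper name `log_grid` is hypothetical.

```python
import math
from itertools import product


def log_grid(low, high, n):
    """n values spaced evenly on a log10 scale between low and high."""
    lo, hi = math.log10(low), math.log10(high)
    return [10 ** (lo + (hi - lo) * i / (n - 1)) for i in range(n)]


C_values = log_grid(1, 100, 10)        # penalty parameter C in [1, 100]
gamma_values = log_grid(1e-5, 10, 10)  # RBF kernel width in [1e-5, 10]

# Every (C, gamma) pair is evaluated with 5-fold cross-validation.
candidates = list(product(C_values, gamma_values))
n_folds = 5
n_fits = len(candidates) * n_folds
print(len(candidates), n_fits)  # 100 500
```

Each of the 100 candidate pairs is scored on held-out folds of the training set only, so the test set plays no part in selecting C and γ.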
Illustration & parameters of different feature extraction methods
In Fig. 3, it can be seen that HOG and BOSF perform well, with scores above 0.95 on all four evaluation indicators, as they utilize the local gray-level gradient information of the thermal images, while the other three methods perform poorly, with scores below 0.9 on all four evaluation indicators. GHSF performs the worst as it only uses the global gray-level information of the thermal images. It is worth noting that all BOSF scores are close to 1. This is because the proposed method further extracts useful information from the preliminary fault feature matrix of each thermal image while reducing the feature dimensionality, and thus better describes the thermal fault information of the components in the thermal image of the board, resulting in almost no misdiagnosis or underdiagnosis.

Comparison of the scores of different feature extraction methods (a) The speed dataset (b) The fan dataset.
To show that the multi-classification SVM model has stronger generalization capability, experiments were performed in conjunction with the BOSF fault feature extraction method. The model was compared with two other classifiers, the error back propagation neural network (BP) [31] and naive Bayes (NB) [32].
The illustration and parameters of different classification methods are shown in Table 5. Here, the BOSF method was chosen for feature extraction, using the optimal parameters from Section 4.3.
Illustration & parameters of different classifiers
In Fig. 4, it can be seen that when BOSF is applied for feature extraction, the three classifiers score above 0.91 for the four evaluation indicators, which indicates that the component fault features extracted by BOSF have strong robustness. It is worth noting that the multi-classification SVM model in this paper has the best performance with all the four evaluation indicators higher than 0.995, while the other two have poor performance with all the four evaluation indicators lower than 0.95, and the BP classifier is the worst. It can be seen that the multi-classification SVM model used in this paper is more generalizable and more accurate than the other two classifiers.

Comparison of the scores of different classifiers (a) The speed dataset (b) The fan dataset.
In order to verify the advantages of the proposed method over the traditional deep learning method under small sample datasets, BOSF-SVM and Resnet18 [33] were compared in experiments under the same conditions of dataset division, as shown in Table 6 and Fig. 5. Also, the potential of the proposed method is shown by verifying its performance relative to the state-of-the-art computer vision-based diagnostic methods for circuit boards. The competitors include SVM+D-S [12], 2D-DWT+PCA+SVM [13] and a CNN-based DL model [34], as summarized in Table 7.
Comparison of the results of different diagnostic methods

The confusion matrix of different methods (a) Resnet18 on the speed dataset (b) BOSF-SVM on the speed dataset (c) Resnet18 on the fan dataset (d) BOSF-SVM on the fan dataset.
Comparison of the results of computer vision-based diagnostic methods
The parameters of the BOSF-SVM method were the best parameters obtained in section 4.3 and section 4.5. The Resnet18 method was implemented in the tensorflow-gpu 2.6 deep learning framework and trained for 70 epochs with a batch size of 8, using the Adadelta optimizer and the cross-entropy loss function.
The best results are marked in bold in Table 6. On these small-sample datasets, both methods score highly on all four evaluation indicators, but the proposed method is better than Resnet18 on both datasets. In terms of diagnostic accuracy, the method is at least 2% higher than Resnet18 on both datasets. In terms of diagnostic speed, the single-image prediction time of the method is on the order of milliseconds on both datasets, much faster than Resnet18. Besides, it can be seen from the confusion matrices in Fig. 5 that the proposed method makes few classification errors on both datasets. As can be seen in Table 7, the proposed method performs well on both circuit board thermal image datasets, outperforming the recent methods from the literature under optimal parameters, with accuracy above 99% and a diagnostic time under 0.03 s.
In this work, a thermal imaging-based intelligent fault diagnosis approach, BOSF-SVM, has been presented. Firstly, it employs secondary manual extraction of features to provide a more precise description of the faulty components in the thermal images of the circuit boards. Secondly, the multi-classification SVM model has the advantages of suitability for small sample sizes, high generalization capability, fast speed, and adjustable parameters, and is well suited to the situation where it is difficult to obtain a large number of fault thermal image samples from the board. Combining these two aspects, the proposed method effectively reduces misdiagnosis and underdiagnosis, making the fault diagnosis process simple and efficient. By automating circuit board fault diagnosis with an intelligent method, it is of practical significance for improving maintenance efficiency and supporting the early recovery of equipment operation in actual industrial production.
As the circuit boards used in this paper are relatively simple, their fault features are obvious, so the experimental results are relatively good. However, actual circuit boards are complex and diverse, their fault features are not easy to extract, and the samples of the various classes may be unbalanced. The fault diagnosis model will therefore be further optimized in the future to make it more suitable for practical application scenarios.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 52175379), the Liaoning Provincial Science and Technology Department (No. 2022JH2/101300268), and the Postgraduate Education Reform Project of Liaoning Province (2-4).
