Motion blur image restoration algorithm for wind power equipment based on Fourier convolution and HBN mechanism

Abstract

To address the motion blur problem in inspection images of wind power equipment, this paper proposes a fast motion deblurring method based on a Multi-Input Multi-Output (MIMO) framework. First, considering the presence of both linear and nonlinear motion blur in wind power equipment images, we construct a real-world motion-blur dataset for wind turbine inspection. Second, to capture the frequency characteristics of blurred inspection images, a Fourier-domain convolution module is designed, enabling the network to better capture global differences between blurred and sharp images. This improves performance while reducing model size. Third, a Half-Batch Normalization (HBN) module is proposed to retain more original feature information during normalization, further improving the algorithm's effectiveness. Additionally, an Efficient Channel Attention (ECA) mechanism is integrated into the network to expand the receptive field of convolution operations and enhance the performance of Fourier convolution, thereby improving deblurring quality. Experimental results demonstrate that the proposed method outperforms existing algorithms in terms of Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) on wind power inspection images. Furthermore, the model size is compressed to 39.5 MB, and the restoration time is reduced to 0.52 s per image. The code is available at: https://github.com/lingzhiy/Motion.

Keywords

Wind turbine inspection; image restoration; Fourier domain; batch normalization

1 Introduction

Wind energy is a sustainable energy source with minimal environmental pollution. Wind turbines are the primary devices used to convert natural wind energy into electrical power through the rotation of their blades. Routine inspection of wind turbines primarily relies on identifying surface damage,^1,2 such as leading-edge erosion, surface cracks, damaged lightning arresters, and broken vortex generators. Wind turbines are typically constructed on a large scale to ensure sufficient power output. For example, mainstream wind turbine models in China have a capacity of 1500 kW, with rotor diameters of approximately 77 meters and tower heights reaching 70 meters. Due to their large size and height, manual inspection is difficult and often impractical. Therefore, wind turbines are commonly monitored using drone-captured images.^3–5

When aerial vehicles capture images of wind power equipment, motion blur often occurs. The presence of blur in the collected images degrades the performance of damage detection, thereby hindering timely maintenance and potentially causing severe equipment damage. Motion blur is caused by relative movement between the camera and the target object. Linear motion produces linear blur, while nonlinear motion results in rotational blur.⁶ During flight at varying speeds, influenced by high-altitude airflow or operator errors, the aerial vehicle's speed fluctuates, causing linear motion blur in the captured images. Conversely, during stationary shooting, due to the large rotor diameter of wind turbines and a constant angular velocity, the blades closer to the edge experience greater rotational displacement than those near the center. This results in different degrees of blur across various parts of the blades, constituting nonlinear rotational motion blur.^7,8 For wind power equipment, rapid and effective image processing is required, making it crucial to develop methods capable of quickly restoring both linear and nonlinear rotational motion blur.

Current research on motion blur image restoration primarily focuses on two categories: prior knowledge-based algorithms and deep learning-based methods. Prior knowledge-based approaches typically assume a predefined blur kernel. Researchers such as Jia,⁹ Levin,¹⁰ Perrone,¹¹ and Fei Wen¹² have proposed various methods to optimize these priors; however, the restored images often suffer from poor quality, with unclear details and edge information. These methods estimate the blur kernel by imposing multiple constraints while simultaneously solving for the latent sharp image. For instance, Chen et al.¹³ proposed a method that heavily relies on regularization terms. Consequently, prior-based restoration algorithms commonly exhibit significant artifacts, fixed multi-scale iteration counts, and unsatisfactory restoration performance, limiting their applicability in practical scenarios.

With the continuous advancement of deep learning technologies, deep learning-based deblurring methods have emerged that do not rely on prior knowledge of blur kernels, directly restoring blurred images through deep networks and thereby overcoming the limitations of prior-based approaches. These algorithms are primarily built upon frameworks such as CNNs, GANs, and Transformers. Nah¹⁴ first proposed a kernel-free end-to-end convolutional neural network (CNN) framework that utilizes an improved residual structure¹⁵ to learn the difference between blurred and sharp images. Subsequently, Kupyn et al.^16,17 and Zhang¹⁸ introduced GAN-based methods to further enhance image restoration quality. Following this, Zamir,¹⁹ Suin,²⁰ Mao,²¹ and others incorporated various modules within network architectures to improve learning capacity, though this resulted in larger models and longer training times. Gao et al.²² applied diffusion models to address wind turbine blade image restoration; however, the introduced diffusion process significantly reduces restoration speed, which adversely affects real-time processing. Li et al.²³ conducted a systematic review of image restoration methods based on diffusion models, and pointed out that diffusion models have already surpassed traditional GAN methods in deblurring tasks, while also noting that frequency domain analysis provides a new perspective for image restoration. Prameeladevi Chillakuru et al.²⁴ employed an improved version of GoogleNet to conduct restoration experiments on a custom dataset, achieving superior signal-to-noise ratio, structural similarity, and speed compared to traditional methods. Zheng et al.²⁵ used a multi-scale attention fusion module to dynamically weight cross-layer features with global attention, achieving efficient hierarchical feature aggregation and performing well in image restoration for general dynamic scenes.

With the advent of Transformers, Wan,²⁶ Lin,²⁷ and Zamir²⁸ proposed various Transformer-based networks for motion blur restoration. Although Transformers offer strong global context modeling capabilities that mitigate CNN drawbacks such as limited receptive fields and a lack of adaptability to input content, their computational complexity grows quadratically with spatial resolution, rendering them impractical for motion blur restoration tasks in wind power equipment images. Despite the promising restoration performance of these large-scale models, their long training times and slow inference speeds cannot meet the demands for fast and efficient processing required in wind power inspection scenarios. Jiang et al.²⁹ reviewed various types of degradation phenomena from the perspective of frequency and proposed the SFHformer framework, integrating the fast Fourier transform mechanism into the Transformer architecture. Through dual-domain hybrid modeling in the spatial and frequency domains, they achieved excellent performance in multiple restoration tasks. This work validates the universal value of frequency-domain information in image restoration. Chen et al.^30,31 proposed the HINet and AdaRevD algorithms, while Chu et al.³² introduced the Test-time Local Converter (TLC) method. These studies have effectively reduced network model sizes, improved restoration speed, and enhanced algorithm practicality. However, since all these methods perform convolution operations in the spatial domain, they still require stacking multiple convolutional layers to enlarge the receptive field. For motion-blurred images in wind turbine inspection, issues of relatively large model size and slow restoration speed persist.

In terms of agent-based methods, Jiang et al.³³ proposed the MAIR multi-agent image restoration system, which divides real-world degradation into three categories: scene degradation, image degradation, and compression degradation, improving restoration quality while reducing inference costs. Sun et al.³⁴ proposed AdaPrompt-IR, which decomposes degradation representation codebooks and captures various types of degradation, combined with degradation semantic mining and prompt learning, achieving unified processing across multiple restoration tasks. This method verified the feasibility of cross-task learning of degradation commonalities.

In addition to the aforementioned deep learning-based motion deblurring methods, another category of related research explores the intrinsic structure of data through weakly supervised or semi-supervised learning methods to reduce reliance on large-scale paired datasets. In the field of unsupervised graph learning, Wang et al.³⁵ proposed a discrete multi-view graph clustering framework that learns the similarity between samples by directly optimizing discrete clustering indicators. Although such methods are primarily used for clustering tasks, their ability to capture the global distribution and local similarity of data provides a new perspective for handling non-uniform blur distributions in images. In semi-supervised image restoration, Su et al.³⁶ addressed the image dehazing task by proposing training with both synthetic hazy images and real hazy images: for real hazy images, multiple prior-based dehazed images are used as pseudo-clear images, and a supervision signal is formed through an image quality-guided adaptive weighting scheme; in weakly supervised image restoration, Wang et al.³⁷ captured lighting information and haze distribution by estimating atmospheric light, scattering coefficients, and scene depth, and designed a discrete wavelet discriminator to enhance the model's generalization ability to real scenes from both spatial and frequency dimensions. The semi-supervised/weakly supervised learning strategies, domain adaptation mechanisms, and physics model-guided ideas adopted in the above studies provide important methodological references for the deblurring task of wind power inspection images in this paper, particularly offering insights into reducing reliance on paired data and improving generalization capability in real-world scenarios.

Research indicates that high-frequency information covers the entire receptive field of an image; in other words, in the frequency domain, there is no need to stack convolutional layers to expand the receptive field. Therefore, this paper proposes a frequency-domain-based motion blur restoration algorithm tailored for wind power equipment images, involving the following steps. First, a convolutional block that exclusively learns high-frequency information is designed, which reduces network depth, compresses model size, and enhances algorithm performance. Second, to address the suboptimal restoration of linear motion blur and poor contour recovery, we introduce a Half-Batch Normalization (HBN) module, which enables the network to capture more global information and improves restoration quality. Finally, an Efficient Channel Attention (ECA) mechanism³⁸ is incorporated into the network to strengthen local convolutional feature interaction, thereby enhancing deblurring quality. The proposed algorithm, targeting wind power equipment images, effectively addresses both linear motion blur and nonlinear rotational motion blur problems exhibiting similar characteristics.

2 Algorithm design

Inspection images of wind power equipment are primarily acquired using aerial vehicles, and the causes of motion blur in these images fall into two main types. First, during high-altitude flights, vibrations of the aerial vehicle body caused by airflow disturbances lead to camera shake, resulting in motion blur; since the airflow direction is generally stable, this blur is characterized as linear motion blur. Second, when the aerial vehicle hovers during imaging, the normal operation of the wind turbine causes the blades to rotate, generating motion blur in the captured images due to blade movement; this blur manifests as nonlinear rotational motion blur. The proposed algorithm is specifically designed to address and restore these two types of motion blur encountered in wind power equipment inspection images.

2.1 Algorithm overall framework

This paper adopts the overall framework of the MIMO³⁹ algorithm and, considering the specific characteristics of motion-blurred wind power equipment images and the need for rapid and efficient processing, redesigns the encoder part of the U-shaped structure within the MIMO algorithm. The encoder's residual structure consists of conventional 3 × 3 convolutional blocks and FT-Blocks, where the FT-Blocks are constructed using FT-Conv, Efficient Channel Attention (ECA), and Half-Batch Normalization (HBN). FT-Conv directly compensates for the high-frequency information loss caused by motion blur through a global receptive field in the frequency domain, fundamentally addressing the issue of insufficient receptive fields in spatial-domain CNNs while also reducing the number of parameters in the encoder. HBN, through a structured design of ‘half regularization, half identity mapping,’ retains the original feature information while ensuring training stability, providing a stable propagation path for the frequency-domain features extracted by FT-Conv. ECA, with an almost negligible increase in parameters, further amplifies the effective channel response of the features constructed by FT-Conv and HBN. This collaborative design enables the model to achieve high-quality restoration of both linear and nonlinear motion blur in wind power inspection images while remaining lightweight (39.5 MB) and fast at inference (0.52 s per frame). The detailed structure is illustrated in Figure 1.

Figure 1.

Overall architecture of the algorithm.

In order to better understand how the input image is ultimately transformed into the restored image through each module, Table 1 clearly describes the overall process of the model.

Table 1.

Algorithm flowchart.

Description
Input: blurred image $I_{blur} \in \mathbb{R}^{H \times W \times 3}$ and the modules of the model that process images; initialize parameters $α = δ = λ = 1$.
Shallow feature extraction: the image passes through a 3 × 3 convolution to extract the initial feature map. $F_{0} = \mathrm{Conv}_{3 \times 3}(I_{blur})$
Fourier convolution: the feature map $F_{0}$ from the previous stage is transformed into the frequency domain using the discrete Fourier transform, the convolution operation is performed, and the result is transformed back to the spatial domain through the inverse Fourier transform. $F_{1} = \mathrm{IFFT}(\mathrm{Conv}_{freq}(\mathrm{FFT}(F_{0})))$
HBN module: the feature map $F_{1}$ is divided along the channel dimension into two parts, $F_{1}^{a}$ and $F_{1}^{b}$; one part passes through batch normalization (BN), while the other directly retains the original feature information, and the two are finally concatenated to obtain a new feature map. $F_{2} = \mathrm{Concat}(\mathrm{BN}(F_{1}^{a}), F_{1}^{b})$
ECA module: channel weights are calculated adaptively through shift convolution and applied to the feature map $F_{2}$. $F_{3} = \mathrm{ECA}(F_{2})$
Encoder: multi-scale features are extracted through progressive downsampling. $F_{4} = \mathrm{Encoder}(F_{3})$
Decoder: spatial resolution is restored through gradual upsampling, and low-level details are fused through skip connections to output the restored image features. $F_{5} = \mathrm{Decoder}(F_{4})$
Reconstruction: the feature map is mapped back to RGB space through a 3 × 3 convolutional layer to obtain the restored image. $I_{restored} = \mathrm{Conv}_{3 \times 3}(F_{5})$
Output: the restored image $I_{restored}$.

2.2 Fourier convolution (FT-Conv) design

Existing deep learning-based motion deblurring methods primarily perform convolution operations in the spatial domain, which corresponds to low-frequency information. To enlarge the receptive field in the spatial domain, these methods require stacking an increasing number of convolutional layers, resulting in larger network models that fail to meet the demands for fast and efficient processing in wind power equipment image applications. Since frequency-domain representations inherently capture global information, the network can achieve a strong receptive field from the early stages. Moreover, a comparison between the frequency spectra of sharp and blurred images (as shown in Figure 2) reveals that blur information is predominantly distributed in the high-frequency regions. Therefore, this study introduces the Fourier transform (FT) into the design of deep learning convolutional blocks, transforming image data from the spatial domain to the frequency domain and performing convolution operations specifically on the high-frequency components.

Figure 2.

Frequency spectra of blurred and sharp images.

The proposed Fourier Convolution (FT-Conv) in this paper applies the Discrete Fourier Transform (DFT) to the spatial domain information of the image, as shown in Equation (1).

X [k] = \sum_{n = 0}^{N - 1} x [n] e^{- j \frac{2 π}{N} k n}

(1)

In this equation, $x[n]$ represents the pixel values of the image, $X[k]$ denotes the spectrum at frequency $ω_{k} = 2πk/N$, and $j$ is the imaginary unit. It is important to note that the DFT of a real signal $x[n]$ is conjugate symmetric, as shown in Equation (2), so roughly half of the spectral coefficients are redundant.

X[N - k] = \sum_{n = 0}^{N - 1} x[n] e^{- j \frac{2 π}{N} (N - k) n} = X^{*}[k]

(2)
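
The conjugate symmetry in Equation (2) is easy to confirm numerically. The following minimal Python sketch evaluates Equation (1) directly on a short real-valued sequence; the sample values and the helper name `dft` are illustrative assumptions and are not taken from the proposed implementation.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform, written to mirror Equation (1)."""
    n_samples = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / n_samples) for n in range(n_samples))
        for k in range(n_samples)
    ]


# Hypothetical real-valued samples, e.g. one row of pixel intensities.
signal = [0.2, 0.9, 0.4, 0.7, 0.1, 0.5, 0.3, 0.8]
spectrum = dft(signal)
N = len(signal)

# Equation (2): X[N - k] should equal the complex conjugate of X[k] for k = 1..N-1.
max_deviation = max(abs(spectrum[N - k] - spectrum[k].conjugate()) for k in range(1, N))
print(f"largest deviation from conjugate symmetry: {max_deviation:.2e}")
```

In practice a fast Fourier transform would replace the double loop; the sketch only mirrors the formula.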

As Equations (1) and (2) show, the Fourier transform decomposes a signal into its constituent frequency components, and every spectral coefficient depends on all samples of the signal. Fourier convolution can therefore consider all frequency components simultaneously, thereby capturing global information. Based on this principle, this paper designs the Fourier Convolution (FT-Conv) as follows: first, the spatial feature maps are transformed into the frequency domain using the Discrete Fourier Transform (DFT); then, conventional convolution is applied in the frequency domain, which avoids the high computational cost associated with direct Fourier-domain convolution and enables effective learning of high-frequency information; finally, the result is transformed back to the spatial domain through the inverse Fourier transform. Owing to these properties of FT-Conv, the algorithm captures global image information from the early stages, and thus better captures the global differences between blurred and sharp image pairs.

The performance advantages of FT-Conv can be analyzed from the perspective of signal processing. According to the convolution theorem, convolution in the spatial domain is equivalent to pointwise multiplication in the frequency domain: $F(f * g) = F(f) \cdot F(g)$. Moreover, each point in the frequency domain contains information about the entire image, so FT-Conv naturally has a global receptive field, fundamentally solving the problem that spatial-domain CNNs need to stack multiple layers to expand the receptive field. From the perspective of the frequency characteristics of motion blur, the blurring process essentially attenuates high-frequency components: assuming the spectrum of a sharp image is $F_{sharp}(u, v)$ and the frequency response of the blur kernel is $H(u, v)$, the spectrum of the blurred image is $F_{blur} = F_{sharp} \cdot H$. The goal of FT-Conv is to learn a frequency-domain filter $G(u, v)$ such that $F_{sharp} \approx F_{blur} \cdot G$, directly compensating for the attenuated high-frequency components.
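
The relation between the blurred and sharp spectra can be illustrated on a one-dimensional signal. The sketch below is an illustration under stated assumptions, not the FT-Conv layer itself: it uses circular convolution, a hypothetical three-tap blur kernel whose frequency response has no zeros, and the ideal inverse filter $G = 1 / H$, whereas FT-Conv learns its filter from data.

```python
import cmath


def dft(x, inverse=False):
    """Naive DFT (or inverse DFT) of a sequence of real or complex numbers."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def circular_convolve(x, h):
    """Circular convolution in the spatial domain."""
    n = len(x)
    return [sum(x[(i - m) % n] * h[m] for m in range(n)) for i in range(n)]


# Hypothetical sharp signal (a step edge) and a short blur kernel, zero-padded to length 8.
sharp = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
kernel = [0.6, 0.3, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0]

# Blur computed in the spatial domain.
blurred_spatial = circular_convolve(sharp, kernel)

# The same blur computed as a pointwise product of spectra (convolution theorem).
sharp_spectrum, kernel_response = dft(sharp), dft(kernel)
blur_spectrum = [s * h for s, h in zip(sharp_spectrum, kernel_response)]
blurred_freq = [v.real for v in dft(blur_spectrum, inverse=True)]

# Ideal compensation filter G = 1 / H; valid here only because H has no zeros.
compensation = [1 / h for h in kernel_response]
restored = [v.real for v in dft([b * g for b, g in zip(blur_spectrum, compensation)], inverse=True)]

print("max |spatial - frequency| :", max(abs(a - b) for a, b in zip(blurred_spatial, blurred_freq)))
print("max |restored - sharp|    :", max(abs(a - b) for a, b in zip(restored, sharp)))
```

With noise, or with a kernel whose response vanishes at some frequencies, the exact inverse becomes unstable, which is one motivation for learning the filter instead.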

2.3 Half-batch normalization (HBN) mechanism

In the feature extraction network of the proposed algorithm, a large amount of feature information is extracted. Applying standard Batch Normalization⁴⁰ to these features can easily lead to the loss of original information, resulting in degraded restoration quality of linear motion-blurred images and poor recovery of contour edges in wind turbine images. To address this issue, a novel normalization module called Half-Batch Normalization (HBN) is designed, as illustrated in Figure 3.

Figure 3.

Schematic diagram of the half-batch normalization mechanism.

The core design idea of HBN is to perform selective regularization of features along the channel dimension, with half of the channels undergoing BN to ensure training stability, and the other half maintaining an identity mapping to preserve the original feature information. Specifically, this module first splits the input feature map into two parts along the channel dimension. The data in one half of the channels is processed by standard batch normalization, as shown in Equation (3).

y_{BN} = \frac{x - μ}{\sqrt{δ^{2} + ε}} \cdot γ + β

(3)

In this equation, $x$ and $y_{BN}$ denote the input and output data, $μ$ and $δ^{2}$ represent the mean and variance of the mini-batch data, $ε$ is a small constant that prevents division by zero, and $γ$ and $β$ are learnable parameters that adjust the data distribution, accelerate network convergence, and enhance the network's expressive capabilities.

The other half of the channels remains unprocessed, and the original channel information is output directly. The feature information from the processed and unprocessed channels is then combined along the channel dimension to produce the final output $y_{HBN}$, as shown in Equation (4).

y_{HBN} = \mathrm{Concat}(y_{BN}, y)

(4)

Here, $y$ denotes the output of the channels that receive no processing, $y_{BN}$ represents the output obtained after Batch Normalization (BN) processing, and $\mathrm{Concat}$ joins the two halves along the channel dimension, as in the HBN step of Table 1.
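
A minimal sketch of the split, normalize, and recombine steps is given here in plain Python. It assumes that the first half of the channels is normalized and the second half is passed through, that the two halves are concatenated as in Table 1, and that $γ = 1$ and $β = 0$ unless supplied; the function name `half_batch_norm` and the data layout are hypothetical.

```python
def half_batch_norm(channels, gamma=None, beta=None, eps=1e-5):
    """Apply batch normalization to the first half of the channels only.

    channels: list of C lists; each inner list holds every value of one channel
              across the mini-batch and all spatial positions.
    Returns a list of C lists: normalized first half followed by the untouched second half.
    """
    half = len(channels) // 2
    gamma = gamma or [1.0] * half  # learnable scale (assumed initial value)
    beta = beta or [0.0] * half    # learnable shift (assumed initial value)

    normalized = []
    for c in range(half):
        values = channels[c]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = gamma[c] / (var + eps) ** 0.5
        # Equation (3), applied per channel.
        normalized.append([(v - mean) * scale + beta[c] for v in values])

    # Identity branch: the remaining channels are copied through unchanged.
    identity = [list(ch) for ch in channels[half:]]
    return normalized + identity


# Hypothetical mini-batch with four channels and four values per channel.
features = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
    [5.0, 5.0, 6.0, 6.0],
    [0.0, 1.0, 0.0, 1.0],
]
output = half_batch_norm(features)
print("normalized channel 0:", [round(v, 3) for v in output[0]])
print("identity channel 2  :", output[2])
```

Only the first two channels carry $γ$ and $β$ in this sketch, which is where the reduction in normalization parameters comes from.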

The output information entropy of standard BN is mainly limited by the estimation accuracy of batch statistics, whereas HBN, by retaining the original distribution of half of the channels, ensures that the lower bound of the overall output entropy is higher than that of standard BN. In other words, HBN can preserve more of the original feature information while performing regularization. Let the information entropy of the original features be $H(X)$, and let $X_{id}$ and $X_{bn}$ denote the identity-mapped and batch-normalized halves of the channels. Under the assumption that the two halves are independent, the output entropy of HBN satisfies $H(\mathrm{HBN}(X)) = H(X_{id}) + H(\mathrm{BN}(X_{bn})) \geq H(\mathrm{BN}(X_{bn}))$, which indicates that its information retention capability is superior to that of standard BN. Furthermore, HBN only needs to learn the $γ$ and $β$ parameters for half of the channels, reducing the number of normalization parameters by 50%. The gradient of the identity mapping branch is backpropagated directly, independent of batch statistics, providing a stable gradient path for the network and enhancing training robustness.

Therefore, this module retains the benefits of the original BN module, namely a more stable data distribution and improved model performance and generalization ability. In addition, by preserving the original feature information in half of the channels, it enhances the fusion of high-level and low-level information in blurry images, allowing the algorithm to acquire more global information. Because the $γ$ and $β$ of half of the channels do not need to be learned, the number of parameters is also reduced to some extent.

2.4 Efficient channel attention (ECA) mechanism

To further enhance the algorithm's performance while minimizing model size and parameters,⁴¹ an Efficient Channel Attention (ECA) module is incorporated after the convolutional operations. This module dynamically adjusts the weights of different channels based on their inter-channel dependencies, thereby improving the restoration capability of the algorithm. The ECA mechanism effectively enlarges the receptive field of convolutional operations, enabling better capture of contextual information within images, which in turn boosts the performance of the CNN and improves image deblurring quality. The structure of the module is illustrated in Figure 4.

Figure 4.

ECA schematic diagram.
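
As a rough illustration of channel attention of this kind, the sketch below follows the commonly described ECA recipe: global average pooling per channel, a one-dimensional convolution across neighboring channels, a sigmoid, and channel-wise rescaling. The kernel weights, the zero padding, and the function name `eca` are assumptions for illustration; in a trained network the kernel is learned, and the exact variant used in this paper may differ.

```python
import math


def eca(feature_maps, conv_weights=(0.25, 0.5, 0.25)):
    """Channel attention sketch.

    feature_maps: list of C channels, each a 2D list (H x W).
    conv_weights: hypothetical 1D kernel applied across the channel axis.
    Returns (rescaled feature maps, per-channel attention weights).
    """
    # 1) Squeeze: global average pooling gives one descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_maps]

    # 2) Local cross-channel interaction: 1D convolution with zero padding.
    k = len(conv_weights)
    pad = k // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [sum(conv_weights[t] * padded[c + t] for t in range(k)) for c in range(len(pooled))]

    # 3) Sigmoid turns each mixed descriptor into a weight in (0, 1).
    weights = [1.0 / (1.0 + math.exp(-m)) for m in mixed]

    # 4) Excite: every channel is multiplied by its own weight.
    rescaled = [[[w * v for v in row] for row in ch] for ch, w in zip(feature_maps, weights)]
    return rescaled, weights


# Hypothetical feature map with three constant 2 x 2 channels.
features = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]], [[3.0, 3.0], [3.0, 3.0]]]
rescaled, weights = eca(features)
print("channel weights:", [round(w, 4) for w in weights])
```

The only parameters in this sketch are the few kernel taps, which is why such a module adds almost nothing to the parameter count.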

2.5 Loss function

This paper uses a composite loss function to train the network, which mainly consists of three parts: MSE Loss, Frequency Loss, and Edge Loss. The total loss function is defined in Equation (5), where $λ_{mse} = 0.9$, $λ_{freq} = 0.05$, and $λ_{edge} = 0.05$ are the weighting coefficients of the respective losses.

L_{total} = λ_{mse} \cdot L_{mse} + λ_{freq} \cdot L_{freq} + λ_{edge} \cdot L_{edge}

(5)

MSE Loss measures the pixel-wise difference between the restored image and the true clear image, ensuring that the restored image is consistent with the real image at the pixel level. Let the predicted image be $Y$, the true label image be $y$, the image size be $H \times W$, and each pixel position be $(i, j)$, as in Equation (6):

\mathrm{MSELoss}(Y, y) = \frac{1}{H \times W} \sum_{i = 1}^{H} \sum_{j = 1}^{W} (Y_{i, j} - y_{i, j})^{2}

(6)

The Frequency Loss constrains the consistency between the restored image and the real image in the frequency domain. First, a discrete Fourier transform is applied to the two images separately, and then the squared Euclidean distance between their amplitude spectra is calculated to guide the network in better restoring high-frequency detail information. Here, let the real image be $X$, the predicted image be $x$, and $N = H \times W$ be the total number of pixels; $F(\cdot)$ denotes the two-dimensional discrete Fourier transform, and $|\cdot|$ denotes the magnitude of a complex number, as shown in Equation (7):

\mathrm{FrequencyLoss} = \frac{1}{N} \sum (|F(X)| - |F(x)|)^{2}

(7)

Edge Loss enhances the clarity of edge contours in restored images, which is beneficial for the restoration of the edge contours of wind turbine blades. Here, $W$ is the real image, $w$ is the predicted image, and $ε(\cdot)$ represents the edge extraction operator; in the normalization factor, $H$ and $W$ denote the image height and width as in Equation (6). The loss is given in Equation (8):

\mathrm{EdgeLoss} = \frac{1}{H \times W} \sum_{i, j} (ε(W)_{i, j} - ε(w)_{i, j})^{2}

(8)

The above composite loss function constrains the network training process from the three dimensions of pixel level, frequency domain, and edges, enabling the model to restore clear high-frequency details and edge information while maintaining overall structural consistency.
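
The three terms and their weights can be sketched on small grayscale images as follows. This is a plain-Python illustration rather than the training code: the two-dimensional DFT is evaluated naively, and because the edge extraction operator is not specified in the text, a forward-difference gradient magnitude is used as a stand-in (an assumption, as are all function names).

```python
import cmath


def mse_loss(pred, target):
    """Mean squared difference over all positions of two equally sized 2D lists."""
    h, w = len(pred), len(pred[0])
    return sum((pred[i][j] - target[i][j]) ** 2 for i in range(h) for j in range(w)) / (h * w)


def dft2_magnitude(img):
    """Magnitude of the naive 2D DFT; fine for tiny images, far too slow for real ones."""
    h, w = len(img), len(img[0])
    mags = []
    for u in range(h):
        row = []
        for v in range(w):
            coeff = sum(
                img[i][j] * cmath.exp(-2j * cmath.pi * (u * i / h + v * j / w))
                for i in range(h) for j in range(w)
            )
            row.append(abs(coeff))
        mags.append(row)
    return mags


def edge_map(img):
    """Stand-in edge operator (assumption): |horizontal difference| + |vertical difference|."""
    h, w = len(img), len(img[0])
    return [
        [abs(img[i][min(j + 1, w - 1)] - img[i][j]) + abs(img[min(i + 1, h - 1)][j] - img[i][j])
         for j in range(w)]
        for i in range(h)
    ]


def total_loss(pred, target, w_mse=0.9, w_freq=0.05, w_edge=0.05):
    """Weighted sum of the pixel, frequency, and edge terms with the weights quoted in the text."""
    l_mse = mse_loss(pred, target)
    l_freq = mse_loss(dft2_magnitude(pred), dft2_magnitude(target))
    l_edge = mse_loss(edge_map(pred), edge_map(target))
    return w_mse * l_mse + w_freq * l_freq + w_edge * l_edge


# Hypothetical 4 x 4 target with a vertical edge, and a prediction with a softened edge.
target = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
pred = [[0.0, 0.3, 0.7, 1.0] for _ in range(4)]
print("loss(pred, target)  :", total_loss(pred, target))
print("loss(target, target):", total_loss(target, target))
```

Because the frequency term is not divided by the spectrum's scale, its raw magnitude can be much larger than the pixel term, which is consistent with it receiving a small weight.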

3 Experiments

3.1 Dataset preparation

Since this algorithm is supervised learning-based, it requires paired sharp and blurred images during training.^42,43 Aiming to better adapt the algorithm to wind power inspection scenarios, where the images primarily contain linear motion blur and nonlinear rotational motion blur, this study constructs two datasets: Subset A and Subset B.

Subset A is a linear motion blur dataset generated by applying convolution operations using linear blur kernels of varying sizes with the same motion blur intensity, as well as kernels of the same size with varying motion blur intensities, on sharp images to produce blurred counterparts. The dataset comprises 2000 training pairs and 1000 testing pairs. As illustrated in Figure 5, the kernel sizes and motion blur intensities ranging from 0 to 1 effectively simulate a wide variety of linear motion blur scenarios encountered in wind power equipment images.
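
A minimal sketch of how a linear blur kernel of a given size produces a blurred counterpart is shown below. It covers only a horizontal kernel and the kernel size; the definition of the motion blur intensity between 0 and 1 is not given in the text, so it is deliberately left out. The function names and the edge-replication border handling are assumptions.

```python
def horizontal_blur_kernel(size):
    """size x size kernel (odd size) whose middle row averages `size` horizontal neighbors."""
    kernel = [[0.0] * size for _ in range(size)]
    kernel[size // 2] = [1.0 / size] * size
    return kernel


def filter2d(image, kernel):
    """Same-size 2D filtering with edge replication at the borders.

    The kernel is applied without flipping; for the symmetric kernels used here
    this equals convolution.
    """
    h, w, k = len(image), len(image[0]), len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    ii = min(max(i + a - r, 0), h - 1)
                    jj = min(max(j + b - r, 0), w - 1)
                    acc += kernel[a][b] * image[ii][jj]
            out[i][j] = acc
    return out


# Hypothetical sharp image: a vertical edge between columns 2 and 3.
sharp = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0] for _ in range(5)]
blurred = filter2d(sharp, horizontal_blur_kernel(3))
print([round(v, 3) for v in blurred[2]])
```

Larger kernel sizes spread the edge over more pixels, which is how varying the size yields different blur extents from the same sharp image.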

Subset B is a nonlinear rotational motion blur dataset created by capturing wind power equipment videos at 60 frames per second (fps). The videos were frame-extracted and interpolated to 120 fps. Every 7 consecutive frames were averaged to generate blurred images, effectively simulating nonlinear rotational motion blur. As shown in Figure 6, the synthesized images realistically exhibit increasing blur intensity toward the blade edges and pronounced fast motion blur at the outermost tips of the blades. This dataset consists of 2400 training pairs and 600 testing pairs.
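
The frame-averaging step can be sketched as follows. The window of 7 frames comes from the description above; using non-overlapping windows and taking the middle frame of each window as the sharp counterpart are assumptions made for this illustration, as are the function names.

```python
def average_frames(frames):
    """Pixel-wise mean of a list of equally sized grayscale frames (2D lists)."""
    n, h, w = len(frames), len(frames[0]), len(frames[0][0])
    return [[sum(f[i][j] for f in frames) / n for j in range(w)] for i in range(h)]


def synthesize_blur_pairs(frames, window=7):
    """Return (blurred, sharp) pairs, one per non-overlapping window of frames.

    blurred: average of the window; sharp: middle frame of the window (assumption).
    """
    pairs = []
    for start in range(0, len(frames) - window + 1, window):
        group = frames[start:start + window]
        pairs.append((average_frames(group), group[window // 2]))
    return pairs


# Hypothetical 1 x 8 video: a bright pixel that moves one position per frame.
frames = [[[1.0 if j == t % 8 else 0.0 for j in range(8)]] for t in range(14)]
pairs = synthesize_blur_pairs(frames, window=7)
blurred, sharp = pairs[0]
print("blurred:", [round(v, 3) for v in blurred[0]])
print("sharp  :", sharp[0])
```

In the toy video the moving pixel is smeared across the seven positions it visits, which mimics how faster-moving blade tips receive a longer blur trail than regions near the hub.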

Figure 5.

Blurred images of subset A.

Figure 6.

Subset B synthetic image.

3.2 Data preprocessing and experimental setup

To enhance the generalization capability of the network, various data augmentation techniques were applied to the training dataset, including horizontal flipping, random affine transformations, transposition, non-rigid deformations, random adjustments to hue, saturation, brightness, and contrast, and random erasing. Experiments were conducted on a system with an Intel i7-10700 CPU and an NVIDIA RTX 3080 GPU, running the Windows 10 operating system and using the PyTorch deep learning framework. During training, the batch size was set to 8 and the initial learning rate to 0.001. An adaptive learning rate adjustment strategy was used: when the validation loss did not decrease for 5 consecutive epochs, the learning rate was multiplied by 0.1. All models were trained for a total of 300 epochs, by which point the loss function had fully converged.
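
The adaptive learning rate rule can be written down in a few lines. The sketch below is a framework-free illustration with a hypothetical class name `PlateauDecay`; PyTorch provides `torch.optim.lr_scheduler.ReduceLROnPlateau` with `factor` and `patience` arguments for the same purpose, although the text does not state which implementation was used, and the exact counting of stale epochs may differ.

```python
class PlateauDecay:
    """Multiply the learning rate by `factor` after `patience` epochs without improvement."""

    def __init__(self, lr=0.001, factor=0.1, patience=5):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.stale_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss and return the learning rate to use next."""
        if val_loss < self.best:
            self.best = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
            if self.stale_epochs >= self.patience:
                self.lr *= self.factor
                self.stale_epochs = 0
        return self.lr


# Hypothetical validation losses: two improving epochs followed by a plateau.
scheduler = PlateauDecay()
losses = [1.0, 0.9, 0.95, 0.95, 0.95, 0.95, 0.95]
rates = [scheduler.step(loss) for loss in losses]
print(rates)
```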

In the evaluation on public datasets (GOPRO, NFS, DVD), this paper uses exactly the same hyperparameter settings, namely batch size = 8, initial learning rate = 0.001, adaptive learning rate decay strategy, and epoch = 300, to ensure a fair comparison between the method in this paper and existing methods.

3.3 Experimental results and analysis

To validate the effectiveness of the proposed method, ablation studies and subjective visual comparison experiments were conducted on the wind power equipment test dataset developed in this study. Additionally, objective evaluation metrics were compared on both the proposed wind power inspection test set and publicly available datasets, including GOPRO, NFS, and DVD.
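
For reference, PSNR follows the standard definition $10 \log_{10}(\mathrm{MAX}^{2} / \mathrm{MSE})$. The sketch below assumes grayscale images stored as 2D lists with intensities scaled to $[0, 1]$, so that the peak value is 1; the function name and the example values are illustrative.

```python
import math


def psnr(restored, reference, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    h, w = len(reference), len(reference[0])
    mse = sum(
        (restored[i][j] - reference[i][j]) ** 2 for i in range(h) for j in range(w)
    ) / (h * w)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical example: every pixel is off by 0.1, so MSE = 0.01.
reference = [[0.5, 0.5], [0.5, 0.5]]
restored = [[0.6, 0.6], [0.4, 0.4]]
print(f"PSNR = {psnr(restored, reference):.2f} dB")
```

Because the scale is logarithmic, a gain of 1 dB corresponds to roughly a 21% reduction in mean squared error.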

3.3.1 Ablation study

Based on the MIMO network framework, this study introduces a customized design tailored to the characteristics of motion-blurred wind power equipment images. To evaluate the effectiveness of each proposed module, ablation experiments were conducted on Subset A and Subset B. The results in terms of PSNR, SSIM, parameter count, and restoration speed for different configurations are presented in Tables 2 and 3.

Table 2.

Subset A ablation experiments.

Configuration	MIMO	MIMO + FT	MIMO + HBN	MIMO + ECA	MIMO + FT + HBN	Ours
Dataset	A	A	A	A	A	A
PSNR (dB)	27.0837	29.0758	27.2941	27.1859	29.8731	30.4316
SSIM	0.8791	0.9012	0.8954	0.9001	0.9086	0.9186
Parameter	6.8 MB	3.2 MB	3.2 MB	3.2 MB	3.2 MB	3.2 MB
Speed	0.55 s	0.53 s	0.55 s	0.55 s	0.52 s	0.52 s

Table 3.

Subset B ablation experiments.

Configuration	MIMO	MIMO + FT	MIMO + HBN	MIMO + ECA	MIMO + FT + HBN	Ours
Dataset	B	B	B	B	B	B
PSNR (dB)	49.7443	50.6577	49.9018	49.6891	51.0564	51.7212
SSIM	0.9941	0.9936	0.9936	0.9934	0.9940	0.9959
Parameter	6.8 MB	3.2 MB	3.2 MB	3.2 MB	3.2 MB	3.2 MB
Speed	0.55 s	0.53 s	0.55 s	0.55 s	0.52 s	0.52 s

The test results of different image restoration algorithms on Subset A and Subset B are presented in Tables 2 and 3. The reported metrics (PSNR, SSIM, model size, and restoration speed) are averaged over all images in the respective test sets. To evaluate the independent contribution of each module, we added the FT-Conv, HBN, and ECA modules to the MIMO baseline one at a time.

Adding FT-Conv alone (MIMO FT) increased PSNR by 1.9921 and SSIM by 0.0221 on Subset A and increased PSNR by 0.9134 on Subset B, while reducing the model size by 3.6 MB and shortening the restoration time by 0.02 s. This was the largest single-module improvement, indicating that the global receptive field in the frequency domain is crucial for motion blur restoration. Adding HBN alone (MIMO HBN) increased PSNR by 0.2104 and SSIM by 0.0163 on Subset A and increased PSNR by 0.16 on Subset B. The gain is relatively limited, suggesting that HBN's advantage of preserving original features is hard to exploit fully when it is used alone. Adding ECA alone (MIMO ECA) increased PSNR by 0.1022 and SSIM by 0.021 on Subset A but decreased PSNR by 0.06 on Subset B. We attribute this weak or even slightly negative effect to ECA requiring sufficient global context information to allocate attention weights effectively.

Adding HBN on top of FT-Conv (MIMO FT HBN) further increased PSNR by about 0.80 and SSIM by 0.0076 on Subset A relative to FT-Conv alone, increased PSNR by about 0.40 on Subset B, and shortened the restoration time by 0.01 s. This validates the complementarity of the two modules: FT-Conv provides global frequency-domain features, while HBN preserves these features during normalization. Adding ECA on top of FT-Conv and HBN (the complete model) further increased PSNR by about 0.56 and SSIM by 0.01 on Subset A relative to MIMO FT HBN, and increased PSNR by about 0.40 and SSIM by 0.0049 on Subset B, while leaving model size and restoration speed unchanged. This indicates that the channel attention of ECA is effective only when it operates on the well-constructed features produced by FT-Conv and HBN.
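For readers unfamiliar with the mechanism discussed in this ablation, the channel attention of ECA-Net³⁸ can be sketched as global average pooling, a small 1-D convolution across the channel axis, and a sigmoid gate. The NumPy code below is only an illustration under that reading; the function name `eca`, the kernel size of 3, and the kernel weights are assumptions and do not describe how ECA is wired into the proposed network.

```python
import numpy as np

def eca(x: np.ndarray, conv_weight: np.ndarray) -> np.ndarray:
    """Channel attention in the style of ECA-Net for x of shape (N, C, H, W).

    conv_weight is a 1-D kernel of odd length k that is shared by all channels.
    """
    k = conv_weight.shape[0]
    pooled = x.mean(axis=(2, 3))                          # (N, C) global average pool
    padded = np.pad(pooled, ((0, 0), (k // 2, k // 2)))   # zero-pad the channel axis
    # Sliding windows over neighbouring channels, shape (N, C, k).
    windows = np.lib.stride_tricks.sliding_window_view(padded, k, axis=1)
    logits = windows @ conv_weight                        # (N, C)
    gate = 1.0 / (1.0 + np.exp(-logits))                  # sigmoid, values in (0, 1)
    return x * gate[:, :, None, None]                     # rescale each channel

rng = np.random.default_rng(0)
features = rng.standard_normal((2, 6, 4, 4))              # hypothetical feature map
out = eca(features, conv_weight=np.array([0.2, 0.5, 0.2]))
print(out.shape)
```

Because the gate is computed from spatially pooled statistics, its usefulness depends on how informative the incoming features are, which is consistent with the observation that ECA helps most after FT-Conv and HBN.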

3.3.2 Subjective comparison of restored images

Images captured by an industrial camera were processed using the proposed method, as well as the AdaRevD,⁴⁴ EVSSM,⁴⁵ and MIMO algorithms for deblurring. The restored images were then compared by magnifying defect regions and edge contours to evaluate subjective visual quality. To further demonstrate the superiority of the proposed algorithm, images with varying backgrounds and lighting conditions were selected for comparison. Figures 7 and 8 show examples of linear motion-blurred images under different lighting conditions, while Figures 9 and 10 present nonlinear motion-blurred images resulting from wind turbine blades rotating at different speeds.

Figure 7.

Linear motion blur images (bright).

Figure 8.

Linear motion blur images (dark).

Figure 9.

Non-linear motion blur images (slow blade rotation).

Figure 10.

Non-linear motion blur images (fast blade rotation).

By comparing the experimental results shown in Figures 7 and 8, it can be observed that for linear motion blur, where the degree of blur is relatively severe, both the AdaRevD and EVSSM methods fail to restore the images effectively. The images restored by AdaRevD show limited visual improvement and exhibit noticeable color imbalance. The EVSSM method, on the other hand, produces severe artifacts, including prominent checkerboard patterns. In contrast, both the MIMO method and the proposed approach achieve better restoration quality, with visible cracks on the wind turbine blades. Although the MIMO-restored images appear visually appealing, they suffer from over-smoothing and noticeable deformation along the blade edges. The proposed method demonstrates a higher degree of restoration fidelity, effectively preserving the blade contours while clearly recovering the crack details.

By comparing the experimental results in Figures 9 and 10, where Figure 9 corresponds to a blurred image captured during slow blade rotation and Figure 10 corresponds to one captured during fast rotation, it can be observed that the AdaRevD method continues to exhibit severe color imbalance. While the EVSSM, MIMO, and the proposed method are all capable of restoring the blurred images to some extent, in cases of faster blade rotation, both EVSSM and MIMO struggle to accurately recover the blade edges in regions where the color contrast between the blades and the background is low, resulting in noticeable artifacts and incomplete restoration. In contrast, the proposed method demonstrates superior performance in preserving blade edge contours, even under challenging conditions involving high-speed rotational motion blur.

To further verify the effectiveness of the proposed method in practical engineering environments, we collected on-site inspection images from a real wind farm for testing. As shown in Figure 11, the test set covers blurred wind turbine images taken from different angles and heights. The results show that the proposed model still restores images well in these challenging real-world scenarios and outperforms the comparison models, confirming its good engineering generalization capability.

Figure 11.

Restored image of a real blurred wind turbine blade.

3.3.3 Objective evaluation metric comparison

The proposed method, along with the AdaRevD, EVSSM, and MIMO algorithms, was evaluated using objective metrics including Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). In addition, the restoration speed per image and the size of the trained model were recorded for comprehensive performance comparison. This evaluation provides a quantitative assessment of each method's effectiveness in terms of restoration quality, computational efficiency, and model compactness.
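As a point of reference, PSNR follows from the mean squared error between the restored image and the ground-truth sharp image. The NumPy sketch below assumes 8-bit images with a peak value of 255; the function name `psnr` and the toy arrays are illustrative and are not taken from the released code. SSIM is omitted because it requires windowed local statistics.

```python
import numpy as np

def psnr(reference: np.ndarray, restored: np.ndarray, peak: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between two images of equal shape."""
    ref = reference.astype(np.float64)
    out = restored.astype(np.float64)
    mse = np.mean((ref - out) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))

# Toy example: a flat gray image and a copy in which every pixel is off by 5 levels.
sharp = np.full((4, 4), 100, dtype=np.uint8)
restored = np.full((4, 4), 105, dtype=np.uint8)
value = psnr(sharp, restored)
print(f"PSNR = {value:.2f} dB")  # prints PSNR = 34.15 dB
```

The values reported in the tables are averages of this per-image quantity over a test set.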

(1)
On the proposed blurred-image test dataset, all methods were trained for the same number of epochs on Subsets A and B. The PSNR, SSIM, model size, and restoration speed of each approach are summarized in Table 4. Compared with AdaRevD, EVSSM, MIMO, and WTE-DDPM,²² the proposed algorithm is significantly more lightweight: its trained model is only 39.5 MB, and among the compared methods only HINT⁴⁶ (24.87 MB) is smaller. Its restoration speed is also the fastest. The proposed algorithm improves substantially on the MIMO baseline in PSNR and SSIM and attains the highest values on Subset B, which is attributable to the incorporation of the FT-Conv, HBN, and ECA modules.

Table 4.
Subset A and subset B comparison experiments.

Dataset	Metric	AdaRevD	EVSSM	HINT	WTE-DDPM	MIMO	Ours
A	PSNR	30.7905	30.640	31.2679	31.4957	28.0837	31.4316
A	SSIM	0.8761	0.8236	0.8834	0.9297	0.8790	0.9186
B	PSNR	49.7902	49.4923	50.6891	51.5231	49.7443	51.7212
B	SSIM	0.9940	0.9939	0.9932	0.9944	0.9941	0.9959
Model size		306.8 MB	65.7 MB	24.87 MB	246.6 MB	80.0 MB	39.5 MB
Speed		0.81 s	0.66 s	0.82 s	3 min 12 s	0.55 s	0.52 s

As the inference times in Table 4 show, the proposed method restores a single image in 0.52 s on average, faster than AdaRevD (0.81 s), EVSSM (0.66 s), HINT (0.82 s), WTE-DDPM (192 s), and MIMO (0.55 s). Compared with AdaRevD, which has the largest model (306.8 MB), the proposed method reduces the model size by about 87% and the inference time by approximately 36%. Compared with EVSSM and MIMO, whose models are of a similar order of size (65.7 MB and 80.0 MB), the proposed method has clear advantages in both model size and inference speed. Notably, although WTE-DDPM uses a diffusion model architecture and achieves a slightly higher PSNR on Subset A, its inference time is about 3 min 12 s and its model size reaches 246.6 MB, making it difficult to meet the demand for fast processing in practical wind power inspection. HINT, a lightweight model (24.87 MB), achieves a PSNR of 31.27 dB on Subset A, slightly lower than the 31.43 dB of the proposed method, and its inference time of 0.82 s is longer. These comparisons indicate that the proposed method achieves the best balance between restoration quality, inference speed, and model size: relative to the diffusion model WTE-DDPM, it trades a minimal PSNR difference for nearly 370 times faster inference, and relative to the lightweight model HINT, it achieves higher PSNR with faster inference. This result supports the computational efficiency and practicality of the proposed method for real-world wind power inspection.
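The relative figures quoted in this comparison follow directly from the Table 4 entries. The short check below is an illustrative sketch with variable names of our choosing; it uses model size in MB as the measure of model scale and interprets the roughly 36% figure as the reduction in per-image time relative to AdaRevD.

```python
# Values copied from Table 4 (model size in MB, time per image in seconds).
size_adarevd, size_ours = 306.8, 39.5
time_adarevd, time_ours = 0.81, 0.52
time_wte_ddpm = 3 * 60 + 12  # 3 min 12 s = 192 s

size_reduction = (size_adarevd - size_ours) / size_adarevd  # relative to AdaRevD
time_reduction = (time_adarevd - time_ours) / time_adarevd  # relative to AdaRevD
speedup_vs_ddpm = time_wte_ddpm / time_ours                 # relative to WTE-DDPM

print(f"model size reduction: {size_reduction:.1%}")      # 87.1%
print(f"time reduction:       {time_reduction:.1%}")      # 35.8%
print(f"speed-up vs WTE-DDPM: {speedup_vs_ddpm:.0f}x")    # 369x
```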
The efficiency advantage of the proposed method comes mainly from two sources: 1) the FT-Conv module reduces network depth through a global receptive field in the frequency domain, reducing computational redundancy; 2) the HBN module only needs to learn BN parameters for half of the channels, further reducing computational cost.

(2)
To evaluate the capability of the proposed algorithm in handling other types of blurred images, training and testing were conducted on the publicly available GOPRO, NFS, and DVD datasets. The PSNR, SSIM, model size, and restoration speed of the different methods are summarized in Table 5. On the GOPRO and DVD datasets, the proposed algorithm achieves higher PSNR values than AdaRevD, EVSSM, and MIMO, while its SSIM is the lowest on GOPRO and lower than those of EVSSM and MIMO on DVD. On the NFS dataset, according to Table 5, the proposed method attains a PSNR higher than that of MIMO but lower than those of AdaRevD and EVSSM, and its SSIM surpasses all competing algorithms. Notably, the proposed model remains more lightweight and achieves faster restoration across all datasets. The comparative experiments on GOPRO, NFS, and DVD demonstrate that the proposed approach is effective for general blurred image restoration tasks, with performance close to mainstream algorithms, while maintaining a smaller model size and faster inference speed.

Table 5.
GOPRO, DVD, NFS public datasets comparison experiments.

Dataset	Metric	AdaRevD	EVSSM	MIMO	Ours
GOPRO	PSNR	34.5370	34.7772	30.3547	34.8500
GOPRO	SSIM	0.9368	0.9378	0.9350	0.9342
DVD	PSNR	34.7036	35.5605	33.6442	35.9205
DVD	SSIM	0.9056	0.9449	0.9300	0.9125
NFS	PSNR	35.2233	34.6162	34.5147	34.6110
NFS	SSIM	0.9497	0.9634	0.9607	0.9645
Model size		306.8 MB	65.7 MB	80.0 MB	39.5 MB
Speed		0.81 s	0.66 s	0.55 s	0.52 s

4 Conclusions

This paper proposes a frequency-domain transformation and HBN-based deblurring algorithm for wind power equipment. It analyzes the causes of motion blur in wind power equipment images and the limitations of existing deblurring methods. Building upon the MIMO network framework, an FT-Conv convolutional approach is designed to enable the algorithm to cover the entire image receptive field from the initial stages, thereby better capturing the global differences between blurred and sharp image pairs. This design reduces network depth, improving restoration speed and quality.
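The statement that a frequency-domain operation covers the whole image from the first layer can be illustrated with the convolution theorem: an element-wise product of 2-D spectra equals a circular convolution whose kernel spans the full image. The NumPy sketch below demonstrates only that principle and is not the FT-Conv module itself; the function name `fourier_filter`, the random weights, and the 8x8 image size are assumptions.

```python
import numpy as np

def fourier_filter(image: np.ndarray, spectral_weight: np.ndarray) -> np.ndarray:
    """Multiply the 2-D spectrum of `image` by `spectral_weight` and invert."""
    spectrum = np.fft.fft2(image)
    return np.real(np.fft.ifft2(spectrum * spectral_weight))

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((8, 8))   # spatial kernel as large as the image
weight = np.fft.fft2(kernel)           # its frequency-domain counterpart

filtered = fourier_filter(image, weight)

# Direct circular convolution for comparison: every output pixel sums over
# all 64 input pixels, i.e. the receptive field is global.
direct = np.zeros_like(image)
for u in range(8):
    for v in range(8):
        direct += kernel[u, v] * np.roll(image, shift=(u, v), axis=(0, 1))

print(np.allclose(filtered, direct))   # prints True
```

A stack of small spatial kernels would need many layers to reach the same support, which is the intuition behind the reduced network depth.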

To address the unique characteristics of motion-blurred wind power equipment images, this study constructs two specialized datasets representing linear motion blur and nonlinear rotational motion blur. Additionally, a novel Half-Batch Normalization (HBN) block is introduced to preserve more original feature information during normalization, enhancing the fusion of high-level and low-level features to capture more global context and improve performance. The network also incorporates the Efficient Channel Attention (ECA) mechanism to expand the convolutional receptive field, thereby boosting the effectiveness of FT-Conv and improving image deblurring quality.
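One plausible reading of the HBN idea, by analogy with the half instance normalization of HINet,³⁰ is to batch-normalize only half of the channels and pass the other half through unchanged. The NumPy sketch below follows that assumption; the position of the split, the training-mode statistics, and the function name `half_batch_norm` are illustrative and may differ from the authors' implementation.

```python
import numpy as np

def half_batch_norm(x, gamma, beta, eps=1e-5):
    """Batch-normalize the first half of the channels of x, shape (N, C, H, W).

    The second half of the channels is returned untouched, so the original
    feature values survive the normalization step. gamma and beta hold one
    scale and one shift per normalized channel (C // 2 values each).
    """
    half = x.shape[1] // 2
    normed, kept = x[:, :half], x[:, half:]
    # Per-channel statistics over the batch and spatial axes (training-mode BN).
    mean = normed.mean(axis=(0, 2, 3), keepdims=True)
    var = normed.var(axis=(0, 2, 3), keepdims=True)
    normed = (normed - mean) / np.sqrt(var + eps)
    normed = normed * gamma.reshape(1, half, 1, 1) + beta.reshape(1, half, 1, 1)
    return np.concatenate([normed, kept], axis=1)

rng = np.random.default_rng(0)
features = rng.normal(loc=3.0, scale=2.0, size=(4, 8, 16, 16))  # hypothetical feature map
out = half_batch_norm(features, gamma=np.ones(4), beta=np.zeros(4))

print(out.shape)                                      # (4, 8, 16, 16)
print(np.array_equal(out[:, 4:], features[:, 4:]))    # True: second half unchanged
```

Under this reading, only the normalized half carries learnable scale and shift parameters, which matches the statement that BN parameters are learned for half of the channels.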

Experimental validation on the proposed datasets demonstrates that the proposed method achieves higher PSNR and SSIM scores compared to other deep learning methods. The network model size is compressed to 39.5 MB, and the restoration speed per image is improved to 0.52 s, enabling fast and effective processing. Further evaluation on public datasets GOPRO, NFS, and DVD shows that the proposed method attains PSNR and SSIM values comparable to state-of-the-art approaches, while significantly reducing model size and hardware requirements. The accelerated restoration speed makes the method well-suited for efficiently handling large volumes of blurred wind power equipment images, meeting the practical demands of wind power equipment inspection.

Although the proposed method outperforms other methods in restoring motion-blurred images of wind power equipment, it has inherent limitations. Three scenarios in which it may perform poorly are discussed here. First, it is limited in restoring extremely nonlinear motion. The method performed well on Subset B (nonlinear motion blur), but in actual wind farms, extreme situations such as typhoon conditions or severe vibration of the inspection drone can cause motion blur that severely weakens or even completely removes high-frequency information, making it difficult to extract global difference information effectively. Second, it is sensitive to low signal-to-noise ratio (SNR) or high-noise environments: significant noise components may cause the model to inadvertently amplify noise during restoration, severely degrading the visual quality of the restored image. Third, it adapts poorly to scenarios without fully matched paired data. The proposed method is fully supervised, and training relies on a large number of paired blurred and sharp images. In some scenarios it is impossible to obtain sharp images from the same perspective and at the same moment, which can lead to a significant decline in restoration performance.

Therefore, while the proposed method shows excellent performance in the task of wind power equipment motion blur image restoration, it has certain limitations in extreme data scenarios and parameter generalization. When the application scenario is close to the experimental conditions of this paper, the proposed method can achieve optimal performance.

Footnotes

Acknowledgments

We sincerely thank every author for their contributions to this article, and we extend our respect to the editors and reviewers for their efforts and dedication. We also thank the author who provided the reference code for this article.

ORCID iDs

Haiquan Jin

Ethical approval and informed consent statements

Consent was sought from each author and the subjects who provided data for the dataset in this paper, and no ethics were involved in the rest of the paper.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability statement

Due to the nature of this research, participants of this study did not agree for their data to be shared publicly, so supporting data is not available.

References

1. Yang, Yanfeng, Wei, et al. Image recognition of wind turbine blade damage based on a deep learning model with transfer learning and an ensemble learning classifier[J]. Renew Energy 2021; 163: 386–397.
2. Zou, Jiangwei, et al. Damage detection in wind turbine blades based on an improved broad learning system model[J]. Appl Sci 2022; 12: 5164.
3. Ruiz, Mujica, Alférez, et al. Wind turbine fault detection and classification by means of image texture analysis[J]. Mech Syst Signal Process 2018; 107: 149–167.
4. Songyue, LIU, Qiusheng, Bin, et al. Prediction of offshore wind turbine wake and output power using large eddy simulation and convolutional neural network[J]. Energy Convers Manag 2025; 324: 119326.
5. Omidvarnia, Sarhadi. Nature-inspired designs in wind energy: a review[J]. Biomimetics 2024; 9: 90.
6. Chillakuru, Madiajagan, Prashanth, et al. Enhancing wind power monitoring through motion deblurring with modified GoogleNet algorithm[J]. Soft Comput 2024; 28: 13965–13975.
7. Gao, Wang. Motion deblurring algorithm for wind power inspection images based on GhostNet and SE attention mechanism[J]. IET Image Process 2023; 17: 291–300.
8. Gao, Zhou. Inspection of wind turbine blades using image deblurring and deep learning segmentation[C]. In: Health Monitoring of Structural and Biological Systems XVIII. SPIE, 2024, vol. 12951, pp. 109–120.
9. Jia. Single image motion deblurring using transparency[A]. IEEE, 2007, pp. 1–8.
10. Levin, Yair, Fredo, et al. Efficient marginal likelihood optimization in blind deconvolution[A]. IEEE, 2011, pp. 2657–2664.
11. Perrone, Paolo. A clearer picture of total variation blind deconvolution[J]. IEEE Trans Pattern Anal Mach Intell 2015; 38: 1041–1055.
12. Wen, Rendong, Yipeng, et al. A simple local minimal intensity prior and an improved algorithm for blind image deblurring[J]. IEEE Trans Circuits Syst Video Technol 2020; 31: 2923–2937.
13. Chen, Yang, Guo, et al. Hyper-Laplacian regularized non-local lowrank prior for blind image deblurring[J]. IEEE Access 2020; 8: 136917–136929.
14. Nah, Tae, Kyoung. Deep multi-scale convolutional neural network for dynamic scene deblurring[A]. 2017, pp. 3883–3891.
15. Zhang, Ren, et al. Deep residual learning for image recognition[C]. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
16. Kupyn, Budzan, Mykhailych, et al. DeblurGAN: blind motion deblurring using conditional adversarial networks[C]. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 8183–8192.
17. Kupyn, Martyniuk, et al. DeblurGAN-v2: deblurring (orders-of-magnitude) faster and better[C]. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 8878–8887.
18. Zhang, Wenhan, Yiran, et al. Deblurring by realistic blurring[A]. 2020, pp. 2737–2746.
19. Zamir S-W, Aditya, Salman, et al. Multi-stage progressive image restoration[A]. 2021, pp. 14821–14831.
20. Suin, Kuldeep, Rajagopalan A-N. Spatially-attentive patch-hierarchical network for adaptive motion deblurring[A]. 2020, pp. 3606–3615.
21. Mao, Yiming, Wei, et al. Deep residual Fourier transformation for single image deblurring[J]. arXiv preprint arXiv:2111.11745, 2021.
22. Gao, Jin, Wang, et al. Wind turbine equipment motion blur image restoration algorithm optimized by diffusion model denoising diffusion probabilistic model strategy[J]. J Electron Imaging 2025; 34: 023002–023002.
23. Ren, Jin, et al. Diffusion models for image restoration and enhancement: a comprehensive survey[J]. Int J Comput Vis 2025; 133: 8078–8108.
24. Chillakuru, Madiajagan, Prashanth, et al. Enhancing wind power monitoring through motion deblurring with modified GoogleNet algorithm[J]. Soft Comput 2024; 28: 13965–13975.
25. Zheng, Zhu, et al. A lightweight adaptive image deblurring framework using dynamic convolutional neural networks[J]. Sci Rep 2025; 15: 33284.
26. Wang, Xiaodong, Jianmin, et al. Uformer: A general u-shaped transformer for image restoration[A]. 2022, pp. 17683–17693.
27. Lin, Yuanhao, Xiaowan, et al. Flow-guided sparse transformer for video deblurring[J]. arXiv preprint arXiv:2201.01893, 2022.
28. Zamir S-W, Aditya, Salman, et al. Restormer: Efficient transformer for high-resolution image restoration[A]. 2022, pp. 5728–5739.
29. Jiang, Zhang, Gao, et al. When fast Fourier transform meets transformer for image restoration[C]. In: European Conference on Computer Vision, 2024, pp. 381–402. Springer Nature Switzerland.
30. Chen, Xin, Jie, et al. HINet: Half instance normalization network for image restoration[A]. 2021, pp. 182–192.
31. Chen, Xiaojie, Xiangyu, et al. Simple baselines for image restoration[J]. arXiv preprint arXiv:2204.04676, 2022.
32. Chu, Liangyu, Chengpeng, et al. Improving Image Restoration by Revisiting Global Information Aggregation (Supplementary Material)[J].
33. Jiang, Chen, et al. Multi-agent image restoration[J]. Int J Comput Vis 2026; 134: 205.
34. Sun, Wang, et al. AdaPrompt-IR: adaptive learning to perceive degradation semantic and prompting for all-in-one image restoration[J]. Pattern Recognit 2026; 169: 111875.
35. Wang, Cui, Zhang, et al. Discrete multi-view graph-based clustering via hierarchical initialization and supercluster similarity minimization[J]. Neurocomputing 2026; 664: 132099.
36. Wang, Cui, et al. Real scene single image dehazing network with multi-prior guidance and domain transfer[J]. IEEE Trans Multimed 2025; 27: 5492–5506.
37. Wang, Cui, et al. Weakly supervised image dehazing via physics-based decomposition[J]. IEEE Trans Circuits Syst Video Technol 2025; 36: 637–652.
38. Wang, Zhu, et al. ECA-Net: efficient channel attention for deep convolutional neural networks[C]. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 11534–11542.
39. Liu, Zongru, Nico. Global attention mechanism: Retain information to enhance channel-spatial interactions[J]. arXiv preprint arXiv:2112.05561, 2021.
40. Ran, Qing, Boya, et al. Lightweight oriented object detection using multiscale context and enhanced channel attention in remote sensing images[J]. IEEE J Sel Top Appl Earth Obs Remote Sens 2021; 14: 5786–5795.
41. Cho S-J, Seo-Won, Jun-Pyo, et al. Rethinking coarse-to-fine approach in single image deblurring[A]. 2021, pp. 4641–4650.
42. Zhang, Shen, Lin, et al. Learning to understand image blur[C]. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 6586–6595.
43. Köhler, Hirsch, Mohler, et al. Recording and playback of camera shake: benchmarking blind deconvolution with a real-world database[C]. In: European Conference on Computer Vision, 2012, pp. 27–40. Springer.
44. Mao, Wang. AdaRevD: adaptive patch exiting reversible decoder pushes the limit of image deblurring[C]. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 25681–25690.
45. Kong, Dong, Tang, et al. Efficient visual state space model for image deblurring[C]. In: Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 12710–12719.
46. Zhou, Pan, et al. Devil is in the uniformity: exploring diverse learners within transformer for image restoration[C]. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025, pp. 12307–12317.