Abstract
The accuracy and stability of relative pose estimation between an autonomous underwater vehicle (AUV) and a target depend on whether the features of the underwater image can be extracted accurately and quickly. In this paper, a whale optimization algorithm (WOA) based on lateral inhibition (LI) is proposed to solve the image matching and vision-guided AUV docking problem. The proposed method is named the LI-WOA. The WOA is motivated by the behavior of humpback whales, and it mainly imitates encircling prey, bubble-net attacking and searching for prey to obtain the globally optimal solution in the search space. The WOA not only balances exploration and exploitation but also has a faster convergence speed, higher calculation accuracy and stronger robustness than other approaches. The lateral inhibition mechanism can effectively perform image enhancement and image edge extraction to improve the accuracy and stability of image matching. The LI-WOA combines the optimization efficiency of the WOA and the matching accuracy of the LI mechanism to improve convergence accuracy and the correct matching rate. To verify its effectiveness and feasibility, the LI-WOA is compared with other algorithms by maximizing the similarity between the original image and the template image. The experimental results show that the LI-WOA has a better average value, a higher correct rate, less execution time and stronger robustness than other algorithms. The LI-WOA is an effective and stable method for solving the image matching and vision-guided AUV docking problem.
Introduction
Image matching plays an important role in pattern recognition, computer vision and image navigation. The purpose of image matching is to accurately locate the template image in the original image and maximize the similarity between the two images. Image matching is a technique to determine whether the selected template image matches a part of the original image. Image matching methods include two main types: intensity-based methods and feature-based methods [1, 2]. Intensity-based methods are effective methods for finding the maximum similarity between the template image and the original image. Feature-based methods match the position of the image according to the image characteristics, such as the edges, contours, textures, entropy, energy, color and corners. Computer vision and underwater image matching technology are closely related. Vision robots can not only collect useful information from underwater images but also provide great help for image data processing and analysis. Underwater image matching lays a solid foundation for image processing, feature extraction, recognition and target tracking. The purpose of underwater image matching is to find the exact position that must be matched in the original image according to the template image. At the same time, the matching correct rate, convergence speed, calculation accuracy and execution time are used as important criteria to verify the overall matching effect of underwater images. It is very important that the feature extraction and image matching of the captured image are performed quickly and accurately. Recently, meta-heuristic algorithms have been proposed to solve the image matching problem, such as the bat algorithm (BA) [3], biogeography-based optimization (BBO) [4], imperialist competitive algorithm (ICA) [5], particle swarm algorithm (PSO) [6], and sine cosine algorithm (SCA) [7].
Image matching plays an important role in vision-guided AUV docking [8–12]. The accuracy of the entire system depends mainly on the matching accuracy, and the matching operation accounts for a considerable part of the total operation time. Yan et al. proposed a novel image matching algorithm based on quantum particle swarm optimization to solve the vision-guided AUV docking problem; the proposed algorithm obtains the best matching position with high precision and fast operation speed [13]. Yan et al. presented a visual positioning algorithm based on an L-shaped light array to achieve AUV docking, and the effectiveness and feasibility of the proposed algorithm were verified [14]. Yang et al. demonstrated sonar image matching based on a convolutional neural network to guide AUV navigation, and the image matching accuracy was greatly improved [15]. Dou et al. combined the wavelet transform and the scale-invariant feature transform to achieve robust image matching, and the algorithm improved the matching accuracy and reduced the computational load [16]. Chen et al. exploited dominant edge orientations to achieve robust visible-infrared image matching [17]. Bürgmann et al. applied a deep learning approach to the automated matching of SAR-derived GCPs to optical image elements, and the proposed algorithm obtained the global best solution [18]. Xu et al. proposed a stall detection algorithm based on symmetrized dot pattern analysis and image matching, and the algorithm could detect stall in a centrifugal fan in a timely and accurate manner [19]. Wu et al. developed local central-tendency similarity to eliminate the effects of noise and intensity changes; this method achieved higher computational efficiency and better matching performance [20]. Sun et al. used an effective method for matching underwater images in glass-flume experiments, and the method led to significantly higher accuracies [21]. Jung designed the k-center algorithm to solve hierarchical binary template matching [22]. Xu et al. proposed a simple and effective underwater target recognition algorithm to enhance the operational ability of unmanned underwater vehicles with an optical vision system [23]. Lou et al. presented the spotted hyena optimizer based on a lateral inhibition mechanism to solve the image matching problem; their experiments indicated that the proposed algorithm is more effective and reliable than other algorithms [24].
These previous studies adopted different matching methods to solve the image matching problem effectively and obtained faster convergence, higher calculation accuracy and a better matching effect. However, these algorithms have a certain complexity and require more search time to complete the image matching. The whale optimization algorithm (WOA) mainly simulates encircling prey, the bubble-net strategy and the search for prey to perform global optimization [25]. The WOA has the advantages of fast convergence, high calculation accuracy, strong robustness, a simple algorithm framework and easy implementation. The lateral inhibition (LI) mechanism is used to preprocess the original image and the template image, which is effective for image enhancement and edge extraction [26–28]. Therefore, in this paper, we use the whale optimization algorithm based on lateral inhibition to solve the image matching and vision-guided AUV docking problem; the combination not only achieves complementary advantages that improve the underwater image matching effect but also balances exploration and exploitation to find the globally optimal solution.
The remainder of this article is organized as follows. Section 2 introduces the basic principles of the whale optimization algorithm. Section 3 describes the lateral inhibition mechanism. Section 4 details the implementation procedures of our proposed LI-WOA for image matching. Section 5 presents the experimental results and analysis. Finally, conclusions and future research are given in Section 6.
The basic principle of the whale optimization algorithm
The whale optimization algorithm [25], which is based on the bubble-net hunting strategy, is a novel swarm intelligence optimization algorithm. In the WOA, each humpback whale is a candidate solution of the optimization problem and is called a search agent. The WOA applies its update rules to the search agents until the termination condition is met and then returns the globally optimal solution found. The WOA is divided into three main stages: encircling prey, the bubble-net attacking strategy and the search for prey. The model of the bubble-net feeding behavior is shown in Fig. 1.

Bubble-net feeding behavior of humpback whales.
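Humpback whales can sense the position of prey and encircle them. Since the location of the optimal solution in the search space is usually not known in advance, the WOA assumes that the current best candidate solution is the target prey. After the best search agent is determined, the other search agents update their locations toward it. Following [25], the location update is given as follows:
$$\vec{D} = \left| \vec{C} \cdot \vec{X}^{*}(t) - \vec{X}(t) \right|$$
$$\vec{X}(t+1) = \vec{X}^{*}(t) - \vec{A} \cdot \vec{D}$$
where $t$ is the current iteration, $\vec{X}^{*}$ is the position of the best solution obtained so far, $\vec{X}$ is the position of the current search agent, and $\vec{A}$ and $\vec{C}$ are coefficient vectors:
$$\vec{A} = 2\vec{a} \cdot \vec{r} - \vec{a}, \qquad \vec{C} = 2\vec{r}$$
where $\vec{a}$ decreases linearly from 2 to 0 over the iterations and $\vec{r}$ is a random vector in [0, 1].
Humpback whales not only move along a conical logarithmic spiral but also spit out a bubble-net to trap prey. In the model, the shrinking encircling mechanism is implemented by reducing the value of $\vec{a}$, and the spiral position update is given as follows:
$$\vec{X}(t+1) = \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{X}^{*}(t)$$
where $\vec{D}' = \left| \vec{X}^{*}(t) - \vec{X}(t) \right|$ is the distance of the ith whale to the prey, l is a random value in [-1, 1], and b is a constant for defining the shape of the logarithmic spiral.
The humpback whales shrink the encirclement and swim along a spiral-shaped path toward the prey. The encircling prey strategy and the bubble-net attacking strategy are each selected with a probability of 50%. The location update is given as follows:
$$\vec{X}(t+1) = \begin{cases} \vec{X}^{*}(t) - \vec{A} \cdot \vec{D}, & p < 0.5 \\ \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{X}^{*}(t), & p \geq 0.5 \end{cases}$$
where p is a random number in [0, 1].
To ensure exploration and convergence, when $|\vec{A}| \geq 1$, a randomly selected search agent, rather than the best one, is used to update the positions of the other whales [25]:
$$\vec{D} = \left| \vec{C} \cdot \vec{X}_{rand} - \vec{X}(t) \right|$$
$$\vec{X}(t+1) = \vec{X}_{rand} - \vec{A} \cdot \vec{D}$$
where $\vec{X}_{rand}$ is the position of a whale selected at random from the current population.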
The WOA effectively balances exploration and exploitation to determine a globally optimal solution. The pseudo code of the WOA is shown in Algorithm 1.
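The three stages described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Algorithm 1: the function name `woa_minimize`, the scalar draws of A and C per agent, the bound clipping, the immediate update of the best agent and the sphere test function are all choices made for the sketch. It also minimizes an objective, whereas the image matching task in this paper maximizes a similarity.

```python
import math
import random


def woa_minimize(objective, dim, lower, upper, n_agents=30, max_iter=200, b=1.0, seed=0):
    """Minimal WOA sketch (hypothetical helper); returns (best_position, best_value)."""
    rng = random.Random(seed)
    clip = lambda v: min(max(v, lower), upper)
    agents = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=objective)[:]
    best_val = objective(best)

    for t in range(max_iter):
        a = 2.0 - 2.0 * t / max_iter           # a decreases linearly from 2 to 0
        for i, x in enumerate(agents):
            A = 2.0 * a * rng.random() - a      # assumption: one scalar A per agent
            C = 2.0 * rng.random()              # assumption: one scalar C per agent
            p = rng.random()                    # chooses encircling vs. spiral move
            l = rng.uniform(-1.0, 1.0)          # spiral parameter in [-1, 1]
            if p < 0.5:
                # |A| < 1: encircle the best agent; |A| >= 1: explore around a random agent
                ref = best if abs(A) < 1.0 else rng.choice(agents)
                new = [ref[d] - A * abs(C * ref[d] - x[d]) for d in range(dim)]
            else:
                # spiral (bubble-net) move toward the best agent
                new = [abs(best[d] - x[d]) * math.exp(b * l) * math.cos(2.0 * math.pi * l) + best[d]
                       for d in range(dim)]
            agents[i] = [clip(v) for v in new]  # keep the agent inside the search space
            val = objective(agents[i])
            if val < best_val:                  # simplification: best is updated immediately
                best, best_val = agents[i][:], val
    return best, best_val


sphere = lambda x: sum(v * v for v in x)        # illustrative test function
best, best_val = woa_minimize(sphere, dim=2, lower=-10.0, upper=10.0)
```

For image matching, the two coordinates of an agent would play the role of the candidate position (m, n), rounded to integers, and the objective would be the negated similarity.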
The lateral inhibition mechanism
The lateral inhibition mechanism [26–28] was first discovered and verified by Hartline. In electrophysiological experiments on limulus vision, each ommatidium of the limulus compound eye is regarded as a receptor. Each receptor has a certain inhibitory effect on the surrounding receptors, and the inhibitory effect exhibits spatial summation. The closer a receptor is to a given receptor, the stronger its inhibitory effect will be.
In a retinal image, the intensively excited receptors in brightly illuminated areas inhibit those in dimly illuminated areas more strongly than the latter inhibit the former. Therefore, the lateral inhibition mechanism can effectively improve the contrast and enhance the distortion of sensory information, so that the important characteristics of visual scenes and the intensity gradients in retinal images are strengthened. This mechanism is used to preprocess the template and the original image, which can increase the spatial resolution and accuracy of the image matching.
The classical lateral inhibition model is shown as follows:
$$r_p = e_p + \sum_{j=1}^{n} k_{p,j} \left( r_j - r_{p,j} \right), \qquad p = 1, 2, \ldots, n$$
where $r_p$ is the pulse frequency of the pth receptor unit after it is inhibited, $e_p$ is the pulse frequency of the pth receptor unit when it is individually illuminated, $k_{p,j}$ represents the lateral inhibition coefficient of the jth unit on the pth unit (negative under this sign convention), $r_j$ is the pulse frequency of the jth unit, and $r_{p,j}$ is the threshold for generating lateral inhibition.
To introduce the lateral inhibition mechanism to image processing, the model is modified into a two-dimensional gray-level form, and the gray value of the pixel (m, n) in the image is given as follows:
$$R(m, n) = I_0(m, n) + \sum_{i=-M}^{M} \sum_{j=-N}^{N} \alpha_{i,j} \, I_0(m+i, n+j)$$
where $\alpha_{i,j}$ is the lateral inhibition coefficient of the pixel (i, j) to the central pixel, $I_0(m, n)$ is the original gray value of the pixel (m, n), $R(m, n)$ is the gray value of the pixel (m, n) as processed by lateral inhibition, and M × N is the scale of the receptive field. The schematic diagram of the lateral inhibition model under the condition that M = N = 2 is shown in Fig. 2.

Schematic diagram of lateral inhibition model M = N = 2.
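In this paper, a receptive field of size 5 × 5 is chosen. With the competing coefficients of the lateral inhibition, the processed gray value is the following:
$$R(m, n) = \alpha_0 I_0(m, n) + \alpha_1 \left[ \sum_{i=-1}^{1} \sum_{j=-1}^{1} I_0(m+i, n+j) - I_0(m, n) \right] + \alpha_2 \left[ \sum_{i=-2}^{2} \sum_{j=-2}^{2} I_0(m+i, n+j) - \sum_{i=-1}^{1} \sum_{j=-1}^{1} I_0(m+i, n+j) \right]$$
Because the visual nerve cells are situated in the same input plane, the competing coefficients sum to zero, so the lateral inhibition modulus satisfies
$$\alpha_0 + 8\alpha_1 + 16\alpha_2 = 0$$
In this paper, we choose α0 = 1, α1 = -0.075, α2 = -0.025 to form the following matrix as the modulus:
$$U = \begin{bmatrix} -0.025 & -0.025 & -0.025 & -0.025 & -0.025 \\ -0.025 & -0.075 & -0.075 & -0.075 & -0.025 \\ -0.025 & -0.075 & 1 & -0.075 & -0.025 \\ -0.025 & -0.075 & -0.075 & -0.075 & -0.025 \\ -0.025 & -0.025 & -0.025 & -0.025 & -0.025 \end{bmatrix}$$
The modulus template U is applied to the image to obtain R (m, n), which gives a new grayscale version of the image:
$$R(m, n) = \sum_{i=-2}^{2} \sum_{j=-2}^{2} U(i+3, j+3) \, I_0(m+i, n+j)$$
Finally, the image edge is extracted by thresholding:
$$I(m, n) = \begin{cases} 1, & R(m, n) \geq T \\ 0, & R(m, n) < T \end{cases}$$
where T is a user-defined threshold.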
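As an illustration of this preprocessing step, the sketch below builds the 5 × 5 modulus from α0 = 1, α1 = -0.075 and α2 = -0.025, filters a small image stored as nested lists and thresholds the result into a binary edge map. The function names, the clamped border handling, the 0/1 edge labels and the tiny test images are assumptions made for the example; the paper does not specify them.

```python
def lateral_inhibition_kernel(a0=1.0, a1=-0.075, a2=-0.025):
    """5 x 5 modulus: a0 at the centre, a1 on the inner ring, a2 on the outer ring."""
    return [[(a0, a1, a2)[max(abs(i), abs(j))] for j in range(-2, 3)]
            for i in range(-2, 3)]


def apply_lateral_inhibition(image, kernel, threshold):
    """Return (R, edges) for a 2-D list of gray values.

    Assumption: pixels outside the image are replaced by the nearest border pixel.
    """
    h, w = len(image), len(image[0])
    R = [[0.0] * w for _ in range(h)]
    for m in range(h):
        for n in range(w):
            total = 0.0
            for i in range(-2, 3):
                for j in range(-2, 3):
                    y = min(max(m + i, 0), h - 1)   # clamp row index
                    x = min(max(n + j, 0), w - 1)   # clamp column index
                    total += kernel[i + 2][j + 2] * image[y][x]
            R[m][n] = total
    # binary edge map: 1 where the inhibited response reaches the threshold
    edges = [[1 if R[m][n] >= threshold else 0 for n in range(w)] for m in range(h)]
    return R, edges


U = lateral_inhibition_kernel()

# A uniform image has no edges, so the response is (numerically) zero everywhere.
flat = [[100] * 8 for _ in range(8)]
R_flat, _ = apply_lateral_inhibition(flat, U, threshold=110)

# A vertical step edge: the response is positive on the bright side of the step.
step = [[50] * 4 + [200] * 4 for _ in range(8)]
R_step, edges_step = apply_lateral_inhibition(step, U, threshold=50)
```

Because the coefficients sum to zero, uniform regions are suppressed and only intensity transitions survive, which is why the filtered image is suited to edge extraction. The threshold of 50 is chosen only to suit the toy step image.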
The fitness function of LI-WOA
The searching strategy and the similarity measurement are two major aspects of image matching. The fitness function is applied to calculate the fitness value of each whale individual in different situations and actual practices. If the size of the given image is relatively large, this operation will take more time to calculate the fitness function. The schematic diagram of the template process is shown in Fig. 3.

Schematic diagram of template process.
To overcome this shortcoming, the fitness function for the images processed by lateral inhibition is shown as follows:
where P × Q is the size of the template image, (m, n) is the coordinate of the pixel in the original image, and I (m + i, n + j) is the gray value of the pixel (m + i, n + j) as processed by Equation (15). If P × Q is the size of the template image and A × B is the size of the original image, the range of coordinates in the original image for matching is 1 ⩽ m ⩽ A - P + 1 and 1 ⩽ n ⩽ B - Q + 1. The maximum value of f (m, n) corresponds to the globally optimal solution of the image matching.
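The coordinate range above can be made concrete with a short sketch. The similarity used here, a count of pixels at which the binary edge template and the image window agree, is only an assumed stand-in for f (m, n) and is not claimed to be the paper's fitness function; the helper names and the toy image are likewise hypothetical. Coordinates are 1-based, and P and A are treated as row counts.

```python
def match_score(edge_image, edge_template, m, n):
    """Assumed similarity: number of agreeing pixels; (m, n) is the 1-based top-left corner."""
    P, Q = len(edge_template), len(edge_template[0])
    score = 0
    for i in range(P):
        for j in range(Q):
            if edge_image[m - 1 + i][n - 1 + j] == edge_template[i][j]:
                score += 1
    return score


def search_bounds(image, template):
    """Return ((m_min, m_max), (n_min, n_max)) with 1 <= m <= A-P+1 and 1 <= n <= B-Q+1."""
    A, B = len(image), len(image[0])
    P, Q = len(template), len(template[0])
    return (1, A - P + 1), (1, B - Q + 1)


# Toy 5 x 6 edge image containing one copy of a 2 x 2 template.
image = [[0] * 6 for _ in range(5)]
image[2][3], image[3][3], image[3][4] = 1, 1, 1
template = [[1, 0], [1, 1]]

(m_lo, m_hi), (n_lo, n_hi) = search_bounds(image, template)
candidates = [(m, n) for m in range(m_lo, m_hi + 1) for n in range(n_lo, n_hi + 1)]
best_pos = max(candidates, key=lambda mn: match_score(image, template, *mn))
```

The exhaustive scan over `candidates` is what makes large images expensive: the number of positions grows with (A - P + 1)(B - Q + 1), which is the motivation for searching the same space with a meta-heuristic instead.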
The LI-WOA procedure combines the search efficiency of the WOA with the matching accuracy of the lateral inhibition mechanism, which gives it good optimization performance on the image matching problem. The main procedures of the LI-WOA are shown in Algorithm 2. A detailed flow chart of the LI-WOA for image matching is shown in Fig. 4.

Flow chart of LI-WOA for image matching.
Experimental setup
The numerical experiments were run on a computer with an Intel Core i7-8750H 2.2 GHz CPU, a GTX 1060 GPU and 8 GB of memory, running Windows 10.
Parameter setting
To verify the effectiveness and feasibility of the LI-WOA for the image matching and vision-guided AUV docking problem, the proposed algorithm is compared with other meta-heuristic swarm intelligence optimization algorithms, namely LI-BA, LI-BBO, LI-ICA, LI-PSO and LI-SCA. For a fair comparison, the population size of each algorithm is 100, the maximum number of iterations is 300, and the number of independent runs is 30. The control parameters of each algorithm are representative empirical values taken from the original articles, as shown in Table 1.
Initial parameters of each algorithm
The threshold for image edge extraction is defined as T = 110. The purpose of image matching is to successfully find the position that needs to be matched in the original image based on the template image. The average value (Ave), correct rate (CR) and CPU time (sec) are used as important indicators to evaluate the effect of the image matching. The comparative optimization results for the LI-WOA and other algorithms are given in Table 2, and the optimal values are shown in bold.
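A small sketch shows how the first two indicators could be computed from a set of independent runs. It assumes that Ave is the mean of the best fitness values over the runs and that CR is the fraction of runs whose reported position coincides with the known template position (optionally within a pixel tolerance); the paper does not spell out these definitions, and the run data below are invented purely for illustration.

```python
from statistics import mean


def summarize_runs(runs, true_position, tolerance=0):
    """runs: list of (best_fitness, (m, n)) pairs, one per independent run.

    Returns (average best fitness, correct rate) under the assumed definitions.
    """
    ave = mean(fitness for fitness, _ in runs)
    hits = sum(
        1
        for _, (m, n) in runs
        if abs(m - true_position[0]) <= tolerance and abs(n - true_position[1]) <= tolerance
    )
    return ave, hits / len(runs)


# Illustrative data only (not results from the paper): three runs find the
# template at (120, 340) and one run is trapped at a wrong position.
runs = [(2850, (120, 340)), (2850, (120, 340)), (2100, (15, 800)), (2850, (120, 340))]
ave, correct_rate = summarize_runs(runs, true_position=(120, 340))
```

With 30 independent runs per algorithm, the same helper would yield one (Ave, CR) pair per algorithm and image.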
Comparative optimization results by LI-WOA and other algorithms
The WOA has a strong global search ability to avoid falling into a local optimum, and it can also effectively balance the global search ability and the local search ability to obtain the optimal solution. The lateral inhibition mechanism is a feasible method to preprocess the original image and the template image and to improve the matching accuracy and enhance the image features. The LI-WOA is used to solve the image matching and vision-guided AUV docking problem, and the purpose of the optimization is to obtain a better fitness value and higher matching accuracy. Comparative optimization results for the LI-WOA and other algorithms are given in Table 2. For images 1, 2 and 3, the average value and the correct rate of the LI-WOA are significantly better than those of the other algorithms, which indicates that the LI-WOA has a higher optimization performance to obtain the optimal solution and that the LI-WOA can find the position that needs to match in the original image according to the template image. The execution time of the LI-WOA is superior to those of other algorithms, which indicates that the LI-WOA has higher optimization efficiency to complete the image matching problem. To summarize, the LI-WOA has a faster convergence speed, higher calculation accuracy, higher matching correct rate and less execution time, and the LI-WOA is effective and feasible in solving the image matching and vision-guided AUV docking problem.
The Wilcoxon rank-sum test is used to verify whether there is a significant difference between the LI-WOA and the other algorithms [29]. The resulting p-values are given in Table 3. A value of p < 0.05 indicates a significant difference between the LI-WOA and the compared algorithm; a value of p ⩾ 0.05 indicates no significant difference, and such values are shown in bold. The p-value is less than 0.05 in most cases, which indicates that the advantage of the LI-WOA over the other algorithms is statistically significant rather than obtained by chance.
Results of the p-value Wilcoxon rank-sum test
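For readers who want to reproduce this kind of comparison, the following sketch implements a two-sided rank-sum test with the normal approximation, using only the standard library. The function name, the use of average ranks for ties without a tie correction in the variance, and the synthetic samples are assumptions of the sketch; statistical packages may apply additional corrections.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation); returns (z, p)."""
    # Pool both samples, remembering which group (0 = x, 1 = y) each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0            # mean of ranks i+1 .. j for tied values
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    expected = n1 * (n1 + n2 + 1) / 2.0         # mean of the rank sum under H0
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - expected) / std
    p = math.erfc(abs(z) / math.sqrt(2.0))      # two-sided tail probability
    return z, p


# Synthetic samples standing in for the best fitness of 30 runs of two algorithms.
runs_a = [10 + k for k in range(30)]
runs_b = [100 + k for k in range(30)]
z_diff, p_diff = rank_sum_test(runs_a, runs_b)   # clearly separated samples
z_same, p_same = rank_sum_test(runs_a, runs_a)   # identical samples
```

In this toy case `p_diff` falls far below 0.05 because every value of one sample lies below every value of the other, while identical samples give a p-value of 1.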
The purpose of image matching is to obtain the maximum similarity between the template image and the original image and to locate the exact position of the template in the original image. Underwater image matching has important guiding significance for feature extraction, target recognition and tracking, pattern recognition and image analysis. Figures 5(a)–7(a) and Figs. 5(b)–7(b) are the original images and the template images, respectively. Figures 5(c)–7(c) and Figs. 5(d)–7(d) are the original images and the template images processed by the lateral inhibition mechanism, respectively. The lateral inhibition mechanism effectively preprocesses the original image and the template image, which improves the efficiency of edge extraction and the accuracy of image matching. Figures 5(e)–7(e) are the final image matching results obtained by the LI-WOA. The WOA has strong robustness and global search ability, and combining it with the lateral inhibition mechanism achieves complementary advantages that improve the convergence speed and calculation accuracy. The population size of each algorithm is 100, the maximum number of iterations is 300, and the number of independent runs is 30. The LI-WOA can not only find the position that needs to be matched in the original image but also complete the AUV docking task, which indicates that the LI-WOA has a higher convergence accuracy and matching correct rate. Figures 5(f)–7(f) are the evolutionary curves of the different algorithms based on 30 independent runs. Compared with the other algorithms, the LI-WOA has a higher calculation accuracy and a better convergence effect, which indicates that the LI-WOA has a strong ability to jump out of local optima and find the globally optimal solution. Figures 5(g)–7(g) are the histograms of the execution time for all of the algorithms. The LI-WOA consumes less time to obtain better matching results. To summarize, the LI-WOA can switch between global search and local search when solving the image matching and vision-guided AUV docking problem.

Comparative results for Image 1. (a) Original image (1332 × 889). (b) Template image (57 × 50). (c) Original image processed by lateral inhibition. (d) Template image processed by lateral inhibition. (e) The final template matching result obtained by the LI-WOA. (f) Evolution curves of the comparative result for Image 1. (g) Execution time versus different algorithms.

Comparative results for Image 2. (a) Original image (1332 × 889). (b) Template image (57 × 64). (c) Original image processed by lateral inhibition. (d) Template image processed by lateral inhibition. (e) The final template matching result obtained by the LI-WOA. (f) Evolution curves of the comparative result for Image 2. (g) Execution time versus different algorithms.

Comparative results for Image 3. (a) Original image (1332 × 889). (b) Template image (106 × 122). (c) Original image processed by lateral inhibition. (d) Template image processed by lateral inhibition. (e) The final template matching result obtained by the LI-WOA. (f) Evolution curves of the comparative result for Image 3. (g) Execution time versus different algorithms.
Statistically, the LI-WOA is used to solve the image matching and vision-guided AUV docking problem, and it can accurately match the template image to the original image and obtain a better matching effect. The reasons are as follows: First, the WOA has a simple algorithm structure and few control parameters, and it is easy to implement. Second, the WOA has not only a strong global search ability to avoid falling into a local optimum but also a strong local search ability to improve the calculation accuracy. The position updating mechanism of the humpback whale effectively keeps the other whales following the best whale position while continuously updating their own positions, which is beneficial in improving the overall optimization performance of the algorithm. Third, it is possible to adjust the global search ability and the local search ability of the WOA by the control parameter
Image matching has laid a solid foundation for pattern recognition, feature tracking, stereo matching, image analysis and computer vision. The purpose of image matching is to maximize the similarity between the original image and the template image in order to improve the resolution of the matching background as well as the accuracy and effectiveness of image matching. This paper proposes a WOA that is based on a lateral inhibition mechanism to solve the image matching and vision-guided AUV docking problem. The LI-WOA achieves complementary advantages that allow it to avoid falling into a local optimum and enhances overall optimization performance. Compared with those of other algorithms, the average value, correct rate and execution time of the LI-WOA are better. The experimental results indicate that the LI-WOA has faster convergence and a higher calculation accuracy, which not only can effectively balance the global search ability and the local search ability to find the optimal solution but also has strong robustness and stability in solving the image matching and vision-guided AUV docking problem.
In future work, the LI-WOA will be used for solving more complicated patterns, real-world images, stereo matching and feature tracking in complex noisy marine environments. The proposed algorithm will be implemented in a parallel embedded processor. Furthermore, we will further study the convergence and applications of the LI-WOA for image matching with a theoretical approach.
Acknowledgments
This work was partially funded by the National Natural Science Foundation of China under Grant No. 51679057, and partly supported by the Province Science Fund for Distinguished Young Scholars under Grant No. J2016JQ0052.
