Abstract
Image stitching is a widely utilized computer vision technique with applications in panorama generation, virtual reality, 3D reconstruction, and structural inspection. Accurate estimation of the homography matrix is critical for stitching quality but remains sensitive to hyperparameter selection and lacks systematic optimization frameworks. This study proposes a regression analysis-based hyperparameter optimization algorithm to ensure stable control over stitching errors. Quantitative variables characterizing the number and spatial distribution of inliers were defined, and a regression model was developed to predict stitching errors based on these characteristics. Bayesian optimization was then employed to determine optimal hyperparameters. Experimental validation using 100 high-resolution drone-captured image pairs demonstrated significant improvements. The proposed algorithm reduced average stitching error by 21.4% and maximum error by 61.7%, effectively eliminating critical failures exceeding 1.5%. Visual comparisons confirmed consistent improvements in alignment quality across diverse cases. This research introduces an innovative “error prediction-optimization” framework for hyperparameter tuning, providing a robust foundation for reliable image stitching in applications such as drone-based inspections and virtual reality mapping. Future work will extend validation to challenging imaging conditions and explore lightweight optimization methods for real-time processing.
Keywords
Introduction
The field of computer vision (CV) has advanced rapidly across various application domains. Among its numerous techniques, image stitching is widely utilized in industrial and commercial applications, including panorama creation, virtual reality, and 3D reconstruction (Fu et al., 2023). Image stitching combines multiple images into a single, coherent visual representation by aligning locally captured images, effectively generating high-resolution images of large structures that are difficult to capture in a single shot. The image stitching process generally involves three key steps: extracting corresponding feature points from image pairs, removing outliers, and estimating a homography matrix to transform images into a common coordinate system. This process ensures proper alignment; however, errors in feature extraction and matching can lead to inaccurate homography estimation, significantly degrading the quality of the final stitched image. Prior research on improving image stitching performance has primarily focused on two main directions. The first involves advancements in feature extraction and matching algorithms. Techniques such as Scale-Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF) have improved matching accuracy under varying lighting conditions and scales. Karami et al. (2017) conducted a comparative analysis of feature extraction algorithms in the context of image stitching. The second research direction focuses on enhancing outlier removal techniques. The widely used Random Sample Consensus (RANSAC) algorithm effectively filters inconsistent feature correspondences, thereby improving homography reliability. More recently, the Parallel Sample Consensus (PARSAC) algorithm has been introduced, leveraging parallel processing with neural networks to enhance efficiency, building upon the Conditional Sample Consensus (CONSAC) framework (Kluger and Rosenhahn, 2024). Despite these advancements, several challenges remain. 
First, the performance of feature extraction algorithms is highly sensitive to hyperparameter selection, which currently lacks a standardized optimization method. While some efforts have been made toward automated parameter selection in image stitching (Prokop and Połap, 2024), a comprehensive and robust optimization framework remains an open challenge. This sensitivity affects image stitching accuracy and limits the generalization of models across diverse environments. Additionally, applying uniform hyperparameter settings to multiple images can introduce errors in certain image pairs, leading to cumulative distortions that degrade stitching quality. Second, deep learning-based optimization methods, which have recently gained popularity, require extensive training datasets and high-performance GPU resources, limiting their applicability in resource-constrained or field environments. Moreover, the black-box nature of deep learning models often hinders interpretability, making it difficult for engineers to diagnose and address the root causes of stitching errors in practical applications. Third, adaptive RANSAC variants such as ARRSAC (Raguram et al., 2008) dynamically adjust parameters based on the statistics of a single image pair, but they lack a data-driven or generalizable framework for understanding and controlling the root causes of stitching errors across diverse environments. As a result, both deep learning-based and adaptive approaches may degrade in performance when applied to new or complex scenarios, and their results are often difficult to interpret or troubleshoot in field applications.
In civil engineering, CV techniques are increasingly employed to enhance the accuracy and objectivity of structural inspections. These applications are part of a broader trend of integrating artificial intelligence for structural health monitoring (Spencer et al., 2025). CV-based applications include damage detection (Kim and Cho, 2019), deflection measurement (Hong et al., 2024), and stress analysis (Lee et al., 2019). Furthermore, drones are increasingly utilized for safety inspections of large civil structures (Lee et al., 2022). Due to the large scale of civil structures, these methods often incorporate image stitching techniques to generate comprehensive visual representations. Several studies have sought to improve image stitching accuracy for structural analysis. Cui and Zhang (2024) applied SURF-based stitching for crack images captured from different angles, preserving dimensional information and improving measurement accuracy on curved surfaces. Kao et al. (2024) employed deep learning models, such as Mask R-CNN, to identify cracks as regions of interest, reducing manual parameter tuning and enhancing robustness under challenging conditions. Additionally, Wang et al. (2024) proposed a depth-assisted panoramic stitching method to correct image misalignment and optimize seamline placement in 360-degree panoramas. However, these approaches exhibit limitations: deep learning methods require extensive training datasets and computational resources, depth-assisted techniques introduce errors when depth data is inaccurate, and significant parallax or distortion remains problematic on highly irregular surfaces. Furthermore, the absence of standardized stitching quality metrics complicates performance comparisons across different algorithms, underscoring the need for more adaptable solutions suited to complex geometries and environmental conditions. 
In contrast, the regression-based approach proposed in this study leverages a diverse set of real-world images to quantitatively model the relationship between inlier characteristics and stitching errors. This enables robust error prediction and transparent parameter optimization, making the framework highly suitable for civil engineering applications where data may be limited, computational resources are constrained, and explainability is essential. By providing interpretable error prediction based on inlier characteristics, the framework supports engineers in identifying and mitigating the root causes of stitching errors. This interpretability stands in contrast to the black-box nature of deep learning models and provides a robust foundation for reliable and explainable image stitching in civil engineering contexts.
In this study, a novel image stitching algorithm is proposed that predicts the maximum possible stitching error for each image pair based on inlier characteristics, and automatically optimizes hyperparameters to ensure that this predicted upper-bound error remains below a specified quality threshold. The algorithm focuses on controlling the worst-case stitching error, thereby providing robust quality assurance for the final stitched output. Four critical parameters influencing stitching quality—derived from feature extraction and outlier removal processes—were identified and systematically analyzed. To establish the relationship between the predicted upper-bound error and these parameters, real-world images captured from civil structures were used, and eight corresponding point pairs were manually selected as reference points in each image pair. By varying the hyperparameters, the four variables were computed and the maximum stitching error was evaluated for each configuration. A regression model was developed to predict the maximum possible stitching error based on these variables. This model, combined with Bayesian optimization, enables automated hyperparameter selection that guarantees the predicted maximum error does not exceed the desired threshold. The effectiveness of the proposed algorithm was validated using high-resolution images captured from a port structure in South Korea. In summary, this research provides a practical, interpretable, and resource-efficient framework for quality-assured image stitching, enabling systematic control of maximum stitching errors and reliable performance in civil engineering applications. The remainder of this paper is organized as follows: Section 2 provides the theoretical background on feature extraction techniques, outlier removal methods, and image stitching approaches. Section 3 details the proposed algorithm. 
Section 4 presents experimental validation of its performance, and Section 5 concludes with a discussion of findings and potential directions for future research.
Background
Feature extraction techniques in images
Feature extraction is a fundamental step in image processing and computer vision tasks, particularly for establishing correspondences between image pairs. It involves detecting salient points or regions within images, which are subsequently used in feature matching and geometric transformation processes. Prominent feature extraction algorithms include SURF (Speeded-Up Robust Features), SIFT (Scale-Invariant Feature Transform), and ORB (Oriented FAST and Rotated BRIEF). Each algorithm identifies feature points based on distinctive intensity variations within local image regions. Specifically, SURF employs box-filter approximations of the Hessian determinant for detection and Haar wavelet responses for description, efficiently producing features that are robust to scale and rotation (Bay et al., 2006). SIFT uses Difference-of-Gaussian (DoG) operations to achieve stable performance across diverse scales and orientations (Lowe, 2004). ORB combines the FAST corner detector with BRIEF descriptors to maximize computational efficiency while maintaining robust matching performance (Juan and Gwun, 2009).
Outlier removal techniques
Accurate removal of outliers is essential for reliable feature correspondence estimation during image matching. Outliers typically originate from noise, illumination changes, or incorrect matches, significantly deteriorating homography estimation accuracy if not properly addressed. Widely adopted algorithms for outlier detection and removal include RANSAC (Random Sample Consensus) and MSAC (M-estimator SAmple Consensus). RANSAC iteratively estimates model parameters by randomly sampling subsets of correspondences, distinguishing inliers that fit the estimated model from outliers that deviate significantly (Fischler and Bolles, 1981). MSAC, an enhanced variant of RANSAC, provides improved robustness against outliers by employing a more refined scoring function for model fitting (Torr and Zisserman, 2000). These algorithms ensure accurate homography estimation by selecting optimal subsets of inliers, thus exhibiting robust performance even under noisy or challenging conditions.
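The difference between the two scoring rules can be illustrated with a minimal sketch. It assumes the commonly described formulations: RANSAC counts correspondences whose residual falls within the threshold, whereas MSAC sums squared residuals truncated at the squared threshold. The function names and the residual values are hypothetical and serve only as an illustration.

```python
def ransac_score(residuals, threshold):
    """RANSAC-style score: number of residuals within the threshold (higher is better)."""
    return sum(1 for r in residuals if r <= threshold)


def msac_cost(residuals, threshold):
    """MSAC-style cost: squared residuals truncated at threshold^2 (lower is better)."""
    t2 = threshold * threshold
    return sum(min(r * r, t2) for r in residuals)


# Hypothetical residuals (pixels) for two candidate models.
# Both have three inliers at a 1.5-pixel threshold, but model A fits them more tightly.
model_a = [0.2, 0.3, 0.1, 5.0]
model_b = [1.4, 1.3, 1.4, 5.0]
threshold = 1.5

print(ransac_score(model_a, threshold), ransac_score(model_b, threshold))
print(msac_cost(model_a, threshold), msac_cost(model_b, threshold))
```

In this toy case the inlier counts are tied, so the count alone cannot rank the two models, while the truncated cost prefers the tighter fit.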
Image stitching methods
The core task in image stitching is accurately estimating the homography matrix that describes geometric transformations between overlapping images. Homography is represented as a 3 × 3 matrix capable of mapping corresponding planar points from one image onto another with high precision. Accurate homography estimation heavily relies on correct feature correspondences obtained through effective feature extraction, matching, and outlier removal processes. To achieve precise stitching results across extensive overlapping areas, matched feature points must be accurately localized and evenly distributed throughout the images. Consequently, the overall success of image stitching critically depends on preceding stages such as robust feature extraction algorithms, reliable matching strategies, and sophisticated outlier removal techniques. Therefore, advanced algorithmic approaches are essential for accurate homography computation and optimized matching procedures in practical image stitching scenarios (Zhang et al., 2012).
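As a brief illustration of how a 3 × 3 homography maps a point, the sketch below applies the matrix in homogeneous coordinates and divides by the third component. The function name and the example matrices are hypothetical; they are not taken from the experiments in this study.

```python
def apply_homography(H, point):
    """Map a 2D point with a 3x3 homography given as a nested list."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (xh / w, yh / w)


# Hypothetical examples: a pure translation and a mild projective distortion.
H_translate = [[1.0, 0.0, 100.0], [0.0, 1.0, -20.0], [0.0, 0.0, 1.0]]
H_projective = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.001, 0.0, 1.0]]

print(apply_homography(H_translate, (10.0, 10.0)))
print(apply_homography(H_projective, (1000.0, 500.0)))
```

The projective example shows why evenly distributed correspondences matter: the last row of the matrix changes the scale of the mapping with position, so an error in those entries grows with distance from the matched points.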
Proposed algorithm
Overall workflow of the proposed framework
The accuracy of the homography matrix, which is essential for successful image stitching, depends directly on the quantity and spatial distribution of inliers (feature correspondences after outlier removal). Building upon this dependency, we propose an algorithm that leverages regression analysis to predict stitching errors based solely on inlier characteristics obtained through various hyperparameter configurations for feature extraction and outlier removal. This predictive framework not only enhances stitching accuracy but also ensures robustness across diverse imaging conditions, introducing a novel paradigm for hyperparameter optimization in image stitching (Figure 1).
Figure 1. Flowchart of proposed algorithm.
Initially, we extract features using SURF due to its computational efficiency and robustness. Subsequently, MSAC is applied to select reliable inliers from matched features and compute the corresponding homography matrix. The proposed algorithm operates as follows:
(1) Specify Image Pairs and Initial Hyperparameters: Select target image pairs for stitching and define initial hyperparameters.
(2) Extract Features Using SURF: Detect distinctive feature points from each image.
(3) Select Inliers Using MSAC: Identify reliable correspondences among extracted features using MSAC; subsequently estimate the homography matrix.
(4) Quantify Inlier Characteristics: Evaluate quantitative metrics regarding the number and spatial distribution of inliers.
(5) Predict Stitching Error via Regression Model: Employ a pre-trained regression model correlating quantified inlier characteristics with expected stitching errors.
(6) Evaluate Predicted Error Against Threshold: If the predicted error exceeds the predefined criterion (set at 0.5% of the diagonal length; approximately 22 pixels for images of resolution 3840 × 2160), automatically adjust hyperparameters through Bayesian optimization until acceptable error levels are achieved.
Hyperparameters optimized include:
• Feature Detection Threshold (hereafter, Feature Threshold): Controls the sensitivity of the feature extraction algorithm (SURF). Higher values result in fewer but more distinctive feature points being detected.
• Inlier Confidence Level (hereafter, Confidence Level): Sets the probability threshold for accepting a geometric model during outlier removal (MSAC). Excessively high values may lead to overfitting by accepting only highly confident models.
• Maximum Iterations (hereafter, Max Iterations): Specifies the maximum number of iterations allowed for the outlier removal process. If set too low, the algorithm may fail to find an optimal model.
• Inlier Distance Threshold (hereafter, Distance Threshold): Determines the maximum allowable distance between matched points after transformation. A stricter threshold removes more potential outliers, while a looser threshold may admit more mismatches.
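The control flow of steps (1) to (6) can be sketched as follows. This is a minimal outline under stated assumptions, not the implementation used in this study: `predict_error` stands in for steps (2) to (5) (SURF, MSAC, inlier quantification, and the regression model), `propose_next` stands in for the Bayesian optimization update, and both are hypothetical callables supplied by the caller. The toy stand-ins at the end exist only to make the sketch runnable.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Hyperparameters:
    feature_threshold: float
    confidence_level: float
    max_iterations: int
    distance_threshold: float


def error_threshold_px(width, height, fraction=0.005):
    """Error criterion in pixels: a fraction (0.5%) of the image diagonal."""
    return fraction * math.hypot(width, height)


def tune_pair(initial, predict_error, propose_next, threshold, max_trials=30):
    """Evaluate candidates until the predicted error meets the threshold.

    Returns (hyperparameters, predicted_error, threshold_met).
    """
    history = []
    candidate = initial
    for _ in range(max_trials):
        predicted = predict_error(candidate)
        history.append((candidate, predicted))
        if predicted <= threshold:
            return candidate, predicted, True
        candidate = propose_next(history)
    best, best_error = min(history, key=lambda item: item[1])
    return best, best_error, False


# Toy stand-ins (hypothetical): error grows with the distance threshold,
# and each new proposal halves the distance threshold of the last candidate.
def toy_predict(hp):
    return 0.4 * hp.distance_threshold  # percent of diagonal


def toy_propose(history):
    last = history[-1][0]
    return replace(last, distance_threshold=last.distance_threshold / 2)


baseline = Hyperparameters(1000.0, 99.0, 1000, 1.5)
result = tune_pair(baseline, toy_predict, toy_propose, threshold=0.5)
print(round(error_threshold_px(3840, 2160), 2), result)
```

Passing the predictor and the proposal rule as arguments keeps the loop independent of any particular feature detector or optimizer.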
Development of regression analysis model
Dataset construction
Table 1. Hyperparameters for dataset construction.
Quantitative assessment of stitching accuracy
Quantitative assessment of the estimated homography in image stitching requires a numerical evaluation method. In this study, stitching accuracy was evaluated by manually selecting eight corresponding reference points for each pair of images, as illustrated in Figure 2. To ensure spatial uniformity in accuracy assessment, the overlapping region between each image pair was divided into eight equal sections, and one reference point was selected from each section. For each image pair j, the coordinates of these reference points were defined in both images, and the root mean square error (RMSE) between the transformed and actual reference point positions was computed.
Figure 2. Example of selected reference points to calculate stitching error.

Stitching errors were standardized across images with varying dimensions by normalizing each RMSE by the corresponding image diagonal length ($D_j$). The final stitching error percentage was calculated by averaging these normalized RMSE values across all image pairs (equation (2)).
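The error metric described above can be sketched as follows. The sketch assumes that the RMSE is taken over the Euclidean distances between the transformed reference points and their counterparts in the other image; the function names and the sample coordinates are hypothetical.

```python
import math


def apply_homography(H, point):
    """Map a 2D point with a 3x3 homography given as a nested list."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def pair_error_percent(H, ref_src, ref_dst, width, height):
    """RMSE over reference points, as a percentage of the image diagonal D_j."""
    squared = 0.0
    for p, q in zip(ref_src, ref_dst):
        x, y = apply_homography(H, p)
        squared += (x - q[0]) ** 2 + (y - q[1]) ** 2
    rmse = math.sqrt(squared / len(ref_src))
    return 100.0 * rmse / math.hypot(width, height)


def mean_error_percent(per_pair_errors):
    """Average of the normalized errors across all image pairs."""
    return sum(per_pair_errors) / len(per_pair_errors)


# Hypothetical check: an identity homography while every reference point
# is actually displaced by (3, 4) pixels, i.e. a 5-pixel RMSE.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
src = [(100.0 * i, 50.0 * i) for i in range(1, 9)]
dst = [(x + 3.0, y + 4.0) for x, y in src]
error = pair_error_percent(identity, src, dst, 3840, 2160)
print(round(error, 4))
```

Dividing by the diagonal makes errors comparable across image sizes, which is what allows a single percentage threshold to be used for the whole dataset.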
Definition of variables
To enable multidimensional and reliable prediction of stitching errors, we selected four variables. These variables were chosen to comprehensively reflect the quantitative abundance, spatial uniformity, spatial coverage, and density-dispersion balance of inlier distributions, respectively. By integrating these complementary aspects, the proposed approach overcomes the limitations of conventional methods that rely solely on single metrics, such as the inlier count, and allows the regression-based model to capture the diverse mechanisms underlying stitching errors. This strategy ensures reliable error prediction and hyperparameter optimization across a wide range of image conditions and environments. These variables were also chosen for their intuitive physical meaning, enabling field engineers to diagnose potential sources of stitching errors and make informed decisions during inspection or data acquisition (Figure 3).
Figure 3. Example of input images used for stitching.
(1) N: Logarithmic Number of Inliers

Defined as the logarithm (base 10) of the total number of inliers, that is, $N = \log_{10}(\text{number of inliers})$. Figure 4 visually represents the identified inliers within the image.
Figure 4. Visualization of selected inliers.
(2) S: Standard Deviation of Inlier Grid Distribution

This variable quantifies the uniformity of inlier distribution across an image. The image is divided into grids (e.g., 16 × 9), and S measures how evenly inliers are distributed within these grids. A lower S indicates uniform distribution, whereas a higher S suggests clustering of inliers, potentially increasing stitching errors. Figure 5 illustrates the grid-based analysis, showing how inliers are distributed across various regions of the image and supporting the computation of S.
Figure 5. Grid-based scoring of inliers for spatial distribution analysis.
(3) P: Inlier Coverage Area Ratio

This variable measures spatial coverage by computing the ratio between the union area of circles (radius = 300 pixels) centered at each inlier point and the total area of the image. To intuitively illustrate this concept, the spatial coverage can be visualized as “paint balls” covering the image area, where a higher coverage ratio implies better spatial distribution and potentially lower stitching errors. Figure 6 provides a visualization of this concept, where overlapping circles centered at inlier points depict the spatial coverage, effectively demonstrating how P is derived.
Figure 6. Spatial coverage visualization using paint-ball representation.
(4) ND: Number-Distance Product

ND is defined as the product of two variables: the logarithmic number of inliers (N), defined in item (1) above, and their average mutual distance (D). It quantifies both quantity and spatial dispersion characteristics simultaneously: $ND = N \times D$.
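The four variables can be computed from a list of inlier coordinates as in the sketch below. Several details are assumptions rather than specifications from this study: S is taken as the population standard deviation of per-cell inlier counts, P is approximated by sampling a regular lattice of pixels (so the circle union is clipped to the image), and D is taken as the mean pairwise Euclidean distance in pixels. All function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import pstdev


def log_inlier_count(points):
    """N: base-10 logarithm of the number of inliers."""
    return math.log10(len(points))


def grid_std(points, width, height, cols=16, rows=9):
    """S: standard deviation of inlier counts over a cols x rows grid."""
    counts = [0] * (cols * rows)
    for x, y in points:
        c = min(int(x * cols / width), cols - 1)
        r = min(int(y * rows / height), rows - 1)
        counts[r * cols + c] += 1
    return pstdev(counts)


def coverage_ratio(points, width, height, radius=300, step=20):
    """P: fraction of sampled pixels lying within `radius` of any inlier."""
    r2 = radius * radius
    covered = total = 0
    for y in range(step // 2, height, step):
        for x in range(step // 2, width, step):
            total += 1
            if any((x - px) ** 2 + (y - py) ** 2 <= r2 for px, py in points):
                covered += 1
    return covered / total


def number_distance(points):
    """ND: N multiplied by the mean pairwise distance between inliers."""
    if len(points) < 2:
        raise ValueError("at least two inliers are needed for a mutual distance")
    pairs = list(combinations(points, 2))
    mean_distance = sum(math.dist(p, q) for p, q in pairs) / len(pairs)
    return log_inlier_count(points) * mean_distance


# Hypothetical inliers: one point at the centre of every 16 x 9 grid cell.
uniform = [(120 + 240 * c, 120 + 240 * r) for r in range(9) for c in range(16)]
print(log_inlier_count(uniform), grid_std(uniform, 3840, 2160))
```

The `step` parameter trades accuracy for speed in the coverage estimate; an exact union area would require a rasterized mask or a geometric union of discs.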
Development of regression model
The four independent variables (N, S, P, and ND) were defined to quantitatively characterize the spatial distribution of inliers across 100 pairs of images, resulting in a dataset comprising 25,600 data points. For each image pair, stitching error was calculated by comparing the transformed positions of eight manually selected reference points with their actual positions on the images. This error metric served as the dependent variable in the regression analysis. To ensure the robustness of the regression model and improve its predictive accuracy for maximum stitching errors, extreme error values were excluded during model training. Specifically, errors within the top 0.1% range across the dataset were removed to mitigate potential biases caused by outliers. Following this preprocessing step, each independent variable was divided into 50 equally spaced intervals. Within each interval, only the top 1% of errors were retained for constructing the regression model. This approach ensured that the model focused on predicting upper-bound errors effectively. Figure 7 visually illustrates the values of each independent variable after preprocessing, with data points corresponding to top-percentile errors highlighted in distinct colors. Analyzing these selected data points reveals a consistent trend: stitching error values decrease rapidly at lower ranges of each independent variable and gradually stabilize as the variable increases. Based on this observed pattern, a regression model utilizing the hyperbolic tangent (tanh) function was chosen as the architecture for training. The trained regression model closely aligns with the dashed curve depicted in Figure 7. Given that the primary objective of this study is to control stitching errors through hyperparameter optimization rather than achieving precise predictive accuracy, detailed descriptions of model training procedures and accuracy metrics are intentionally excluded from this section. 
Figure 7. Regression models to predict maximum stitching error based on inlier characteristics.
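The preprocessing that isolates upper-bound errors can be sketched as follows. The sketch assumes that the top 0.1% of errors are dropped globally, that each variable range is split into 50 equal-width intervals, and that roughly the top 1% of errors (at least one sample) is kept per interval. The tanh expression is only one plausible decreasing form with hypothetical parameters a, b, c, and d; the exact model used in this study is not reproduced here, and no fitting routine is shown.

```python
import math


def select_upper_envelope(xs, errors, n_bins=50, top_fraction=0.01,
                          drop_fraction=0.001):
    """Return the (x, error) samples with the highest errors in each interval."""
    samples = sorted(zip(xs, errors), key=lambda s: s[1])
    n_drop = int(len(samples) * drop_fraction)
    if n_drop:
        samples = samples[:-n_drop]  # discard the most extreme errors
    lo = min(x for x, _ in samples)
    hi = max(x for x, _ in samples)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0
    bins = [[] for _ in range(n_bins)]
    for x, e in samples:
        index = min(int((x - lo) / width), n_bins - 1)
        bins[index].append((x, e))
    selected = []
    for members in bins:
        if not members:
            continue
        keep = max(1, round(len(members) * top_fraction))
        members.sort(key=lambda s: s[1], reverse=True)
        selected.extend(members[:keep])
    return selected


def tanh_envelope(x, a, b, c, d):
    """Assumed form: falls from about a + b to about a - b as x grows (b, c > 0)."""
    return a - b * math.tanh(c * (x - d))


# Hypothetical data: 50 variable levels, each with errors 0..99.
xs = [level + 0.5 for level in range(50) for _ in range(100)]
errors = [float(j) for _ in range(50) for j in range(100)]
envelope = select_upper_envelope(xs, errors)
print(len(envelope), tanh_envelope(0.0, 1.0, 0.5, 2.0, 0.0))
```

Fitting the curve to the retained points rather than to all samples is what turns the regression into an approximate upper bound instead of a mean predictor.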
Bayesian optimization
Bayesian optimization was employed to efficiently identify hyperparameter configurations that control predicted stitching errors below the target threshold of 0.5%. This approach is particularly suitable for problems with high computational costs where repeated evaluations are impractical and the objective function is non-convex or lacks an explicit mathematical form (Frazier, 2018; Snoek et al., 2012).
The optimization process begins with several initial random samples of the four hyperparameters (Feature Threshold, Confidence Level, Max Iterations, Distance Threshold). For each sampled configuration, the regression model developed in Section 3.2.4 predicts the expected stitching error. These results are used to construct a Gaussian process-based surrogate model, which serves as a probabilistic representation estimating the distribution of stitching errors across all possible hyperparameter combinations. The surrogate model provides both the mean prediction and the associated uncertainty for any point in the hyperparameter space, enabling the algorithm to balance exploration and exploitation when selecting the next candidate configuration through an acquisition function such as Expected Improvement. At each iteration, the surrogate model is updated with new observations, and the acquisition function guides the search toward hyperparameter sets that are most likely to further reduce the predicted error. The objective function for Bayesian optimization is defined as the predicted stitching error from the regression model, and a penalty term is applied if the error exceeds the 0.5% threshold. This ensures that the optimization process prioritizes configurations that yield robust, quality-assured stitching results. Figure 8 presents a representative example of the optimization trajectory for a single image pair, visualized as a parallel coordinates plot. In this plot, each polyline corresponds to a candidate hyperparameter configuration evaluated at each iteration. Early iterations are depicted as lighter blue lines, and as the optimization progresses, the lines become darker, indicating the transition toward later iterations. The final optimal solution is distinctly marked in red, making it easy to identify the converged parameter set. 
By analyzing this plot, it is evident that the initial search explores a wide range of parameter values, while later iterations progressively concentrate around specific regions, demonstrating effective convergence toward an optimal configuration. By integrating regression-based error prediction and Bayesian optimization, the proposed framework enables automated, data-driven hyperparameter tuning that is both computationally efficient and interpretable. This is particularly advantageous for image stitching in civil engineering, where minimizing manual intervention and ensuring consistent quality are critical.
Figure 8. Parallel coordinates plot visualizing the optimization trajectory of four hyperparameters during Bayesian optimization. Blue lines represent each iteration, with darker shades indicating later iterations; the final optimal solution is highlighted in red.
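The penalized objective and the early-stopping rule can be illustrated with the sketch below. Two points are assumptions: the penalty is written as a linear term on the excess over the threshold with a hypothetical weight, and uniform random sampling is used as a plain stand-in for the Gaussian-process surrogate and acquisition function, so this is not Bayesian optimization itself. The bounds and the toy predictor are likewise hypothetical.

```python
import random

THRESHOLD = 0.5  # target error, percent of the image diagonal


def penalized_objective(predicted_error, threshold=THRESHOLD, penalty_weight=10.0):
    """Predicted error plus a penalty on the amount exceeding the threshold."""
    excess = max(0.0, predicted_error - threshold)
    return predicted_error + penalty_weight * excess


def search(predict_error, bounds, max_evals=30, seed=0):
    """Random-search stand-in for the proposal step, with early stopping.

    Returns (candidate, predicted_error, objective_value) for the best candidate.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(max_evals):
        candidate = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        predicted = predict_error(candidate)
        score = penalized_objective(predicted)
        if best is None or score < best[2]:
            best = (candidate, predicted, score)
        if predicted <= THRESHOLD:
            break  # stop as soon as the predicted error meets the target
    return best


# Hypothetical predictor: error is smallest near a distance threshold of 1.0.
def toy_predict(candidate):
    return abs(candidate["distance_threshold"] - 1.0)


bounds = {"distance_threshold": (0.5, 3.0)}
best = search(toy_predict, bounds)
print(best)
```

A surrogate-based optimizer would replace the random draw with the maximizer of an acquisition function such as Expected Improvement, leaving the objective and the stopping rule unchanged.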
Experimental validation
Quantitative validation and algorithm comparison on the dataset
Comparative experiments were conducted to evaluate the effectiveness and efficiency of the proposed algorithm, using a dataset comprising 100 pairs of high-resolution images (3840 × 2160) captured by a drone flying parallel to a large civil engineering structure, referred to as ‘the wharf’. All experiments were performed on a workstation equipped with an Intel i9-11900 CPU and 96 GB RAM, using MATLAB R2023b. The stitching error calculation method described in Section 3.2.2 was consistently applied across all experiments. For the control group, a standard baseline hyperparameter configuration was used: Feature Threshold = 1,000, Confidence Level = 99%, Max Iterations = 1,000, and Distance Threshold = 1.5. Under these baseline settings, stitching errors ranged from 0.04% to 1.54% relative to the image diagonal length, with an average error of 0.14% and a standard deviation of 0.21%. Notably, critical stitching failures occurred in cases where errors exceeded 1.5%, compromising reliability and rendering some results unusable. Applying the proposed regression analysis-based hyperparameter optimization algorithm resulted in significant improvements. The stitching error was reduced to a range of 0.03% to 0.59%, with an average error of 0.11% and a standard deviation of 0.09%. These results are illustrated in Figure 9, where (a) shows the stitching error distribution under baseline hyperparameter settings, (b) presents the results after applying the proposed algorithm, and (c) depicts the difference between the two distributions, highlighting the improvements achieved. Key achievements include: (1) Elimination of Critical Stitching Failures: The maximum error was reduced by 61.7% (from 1.54% to 0.59%), effectively preventing cases with errors above 0.6%. This improvement is attributed to the algorithm’s ability to predict extreme errors and perform optimizations. 
(2) Improved Error Distribution: The average stitching error decreased by 21.4% (from 0.14% to 0.11%), while the standard deviation was reduced by 57.1% (from 0.21% to 0.09%). These reductions demonstrate enhanced consistency and reliability in stitching outcomes across diverse image pairs.
Figure 9. Comparison of stitching error distribution before and after hyperparameter optimization: (a) stitching error distribution under baseline hyperparameter settings, (b) results after applying the proposed algorithm, and (c) difference in stitching error between the two distributions (baseline vs. proposed algorithm).

These findings confirm that the proposed algorithm successfully achieved its objective of ensuring robust stitching performance across diverse image pairs. Although not all errors fell below the target threshold of 0.5%, the elimination of extreme failure cases and improved error distribution underscore the algorithm’s stability and practical applicability in real-world applications.
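The reported relative reductions follow directly from the before and after values. The short check below recomputes them from the figures stated in this section; the function name is hypothetical.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Values reported in this section (percent of the image diagonal).
reported = {
    "maximum error": (1.54, 0.59),
    "average error": (0.14, 0.11),
    "standard deviation": (0.21, 0.09),
}
for name, (before, after) in reported.items():
    print(f"{name}: {percent_reduction(before, after):.1f}% reduction")
```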
To further assess optimization efficiency and reliability, a direct comparison was performed between Bayesian optimization (BO) and a genetic algorithm (GA)—both capable of global parameter search—under identical experimental conditions. For BO, the number of iterations per image pair was limited to a maximum of 30. For GA, the population size was set to 20 and the maximum number of generations was set to 30. Importantly, for both BO and GA, the optimization was immediately terminated if the predicted maximum stitching error for a given image pair fell below the threshold of 0.5% of the image diagonal length. This ensured that both algorithms operated under the same early stopping criterion, allowing for a fair and direct comparison. The average processing time for the entire optimization process using BO was approximately 12.3 min per 100 image pairs, while GA required 52.7 min. BO achieved a mean actual error of 0.131%, with 98% of the dataset satisfying the threshold of 0.5% or less. In comparison, GA yielded a mean actual error of 0.153%, with 96% of the dataset meeting the 0.5% threshold. While genetic algorithms are also capable of global optimization and can be applied to this class of parameter search problems, Bayesian optimization proved more suitable for the proposed error prediction-optimization framework owing to its higher sample efficiency, faster convergence, and adaptive search capability. These characteristics enabled BO to achieve superior accuracy and computational efficiency under the same stopping conditions, making it particularly advantageous for practical deployment in civil engineering applications.
Ablation study on variable selection
The primary objective of variable selection in the proposed hyperparameter optimization algorithm is to accurately predict the maximum possible stitching error for each image pair. To quantitatively evaluate the effect of each variable and their combinations, we conducted ablation experiments using all possible subsets of the four variables (N, S, P, ND) as inputs to the regression model.
Three key performance metrics were used for quantitative evaluation. First, the containment rate (hereafter, CR) measures the proportion of cases in which the predicted maximum stitching error is greater than or equal to the actual stitching error.
Second, the over-prediction rate (hereafter, OPR) quantifies the average extent to which the predicted error exceeds the actual error.
Third, the mean actual error is the average actual stitching error (%) across all image pairs.
The goal of the algorithm is to maximize CR so that the predicted error reliably serves as an upper bound for quality assurance, while avoiding excessive OPR, which would reduce the practical utility of the optimization. If CR is low, quality assurance becomes unreliable, but if OPR is too high, the optimization loses effectiveness.
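The two rate metrics can be sketched as follows. CR is computed as the share of image pairs whose predicted error is at least the actual error. For OPR, the exact normalization is an assumption: the sketch uses the mean relative excess of the prediction over the actual error, and an absolute difference would be an equally plausible reading. The function names and the sample values are hypothetical.

```python
def containment_rate(predicted, actual):
    """CR: fraction of pairs where the prediction bounds the actual error."""
    hits = sum(1 for p, a in zip(predicted, actual) if p >= a)
    return hits / len(actual)


def over_prediction_rate(predicted, actual):
    """OPR (assumed form): mean of (predicted - actual) / actual."""
    ratios = [(p - a) / a for p, a in zip(predicted, actual)]
    return sum(ratios) / len(ratios)


# Hypothetical errors for four image pairs (percent of the diagonal).
predicted = [0.5, 0.4, 0.2, 0.6]
actual = [0.25, 0.5, 0.1, 0.3]
print(containment_rate(predicted, actual), over_prediction_rate(predicted, actual))
```

In this toy case three of four predictions bound the actual error, so the CR is 0.75, while the predictions overshoot by about 70% on average.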
Table 2. Performance comparison across different combinations of variables.
Visual comparison of stitching results
Additional experiments were conducted on the image pair exhibiting the highest stitching error within the dataset (Figure 10) to further validate the algorithm’s effectiveness. This image pair is the original input used for the visual comparison experiments and represents a challenging case for stitching accuracy. The experiments were repeated three times under the same conditions as described in Section 4.1. Table 3 provides a quantitative comparison of stitching errors before and after applying the proposed optimization algorithm, including both pixel deviations and percentages relative to the image diagonal length. Using baseline hyperparameter settings, stitching errors for this image pair ranged from 1.40% to 1.63% of the image diagonal length, corresponding to pixel deviations between 61.79 and 71.68 pixels. In contrast, applying the proposed optimization algorithm reduced errors significantly, yielding a range of 0.56%–0.57%, or pixel deviations between 24.86 and 25.14 pixels, which corresponds to a reduction of approximately 59%–65% relative to the baseline trials. Figures 11 and 12 provide visual comparisons of stitching results across three trials conducted under baseline hyperparameter settings and the proposed algorithm, respectively. Figure 11 illustrates the stitched images obtained without applying the optimization algorithm, allowing visual assessment of stitching errors across all trials. These results demonstrate how baseline settings fail to address critical stitching challenges effectively. In contrast, Figure 12 presents the results after applying the proposed algorithm, where superior alignment quality and reduced stitching errors are consistently observed across all three trials. The optimized results exhibit minimal artifacts and improved consistency, even in this challenging case.
These visual comparisons corroborate the quantitative findings presented in Table 3, highlighting the algorithm’s ability to address critical failure cases effectively while maintaining high visual fidelity. By eliminating significant misalignments and achieving a more reliable error distribution, the proposed algorithm demonstrates robustness in handling high-error scenarios.
Figure 10. Image pair used for visual comparison.
Table 3. Quantitative comparison of stitching errors: Baseline versus proposed algorithm.
Figure 11. Visual results of stitched images using baseline hyperparameter settings: (a–c) trial 1–3.
Figure 12. Visual results of stitched images after applying the proposed algorithm: (a–c) trial 1–3.
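The relationship between the pixel deviations and the diagonal-relative percentages reported for this image pair can be cross-checked with a short Python sketch. It assumes that the 4K images are 3840 × 2160 pixels; this frame size is not stated in this section, and the helper name `error_percent` is hypothetical.

```python
import math

# Assumed 4K UHD frame size; the exact resolution is an assumption.
WIDTH, HEIGHT = 3840, 2160
DIAGONAL = math.hypot(WIDTH, HEIGHT)  # image diagonal length in pixels


def error_percent(pixel_deviation, diagonal=DIAGONAL):
    """Express a pixel deviation as a percentage of the image diagonal."""
    return 100.0 * pixel_deviation / diagonal


# Pixel deviations quoted in the text for the highest-error image pair.
for label, px in [("baseline min", 61.79), ("baseline max", 71.68),
                  ("optimized min", 24.86), ("optimized max", 25.14)]:
    print(f"{label}: {px:.2f} px = {error_percent(px):.2f}% of the diagonal")
```

Under this assumed frame size, the four conversions round to 1.40%, 1.63%, 0.56%, and 0.57%, which matches the percentages quoted for this image pair.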


Conclusions
This study proposed a novel regression analysis-based hyperparameter optimization framework for image stitching, addressing the limitations of conventional feature matching and black-box optimization methods. By introducing an interpretable error prediction-optimization process, the algorithm enables systematic, data-driven control over homography accuracy and stitching quality. The framework integrates feature extraction, inlier-based variable quantification, regression-based error prediction, and Bayesian optimization, providing both interpretability and robust error control across diverse imaging conditions.
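The overall structure of the framework can be outlined with a minimal Python sketch, offered as an illustration under stated assumptions rather than the authors’ implementation. It assumes two inlier variables (count and the fraction of a 4 × 4 grid occupied by inliers) as stand-ins for the study’s quantitative variables, fits an ordinary least-squares model on synthetic data, and swaps in an exhaustive search over a few candidates for Bayesian optimization. All function names, the grid size, the RANSAC-threshold candidates, and the numbers are hypothetical.

```python
import numpy as np


def inlier_features(inliers, width, height, grid=4):
    """Return [inlier count, fraction of grid cells that contain an inlier]."""
    inliers = np.asarray(inliers, dtype=float).reshape(-1, 2)
    cols = np.clip((inliers[:, 0] / width * grid).astype(int), 0, grid - 1)
    rows = np.clip((inliers[:, 1] / height * grid).astype(int), 0, grid - 1)
    occupied = len(set(zip(rows.tolist(), cols.tolist())))
    return np.array([len(inliers), occupied / grid**2])


def fit_error_model(features, errors):
    """Ordinary least squares with an intercept: error ~ b0 + b1*count + b2*coverage."""
    X = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(X, errors, rcond=None)
    return coef


def predict_error(coef, feature_vec):
    """Predicted stitching error for one feature vector."""
    return float(coef[0] + coef[1:] @ np.asarray(feature_vec, dtype=float))


def select_hyperparameter(candidates, features_for, coef):
    """Exhaustive search (stand-in for Bayesian optimization) over candidates."""
    return min(candidates, key=lambda c: predict_error(coef, features_for(c)))


# Synthetic training data (illustrative only, not results from the study).
train_features = np.array([[100, 0.50], [200, 0.75], [300, 1.00], [150, 0.25]])
train_errors = 1.0 - 0.001 * train_features[:, 0] - 0.5 * train_features[:, 1]
coef = fit_error_model(train_features, train_errors)

# Hypothetical lookup: candidate RANSAC threshold -> inlier features it yields.
# In a real pipeline these would come from running matching and RANSAC.
candidate_features = {
    1.0: np.array([100, 0.50]),
    3.0: np.array([300, 1.00]),
    5.0: np.array([150, 0.25]),
}
best = select_hyperparameter(candidate_features, candidate_features.get, coef)
print("selected threshold:", best)
```

The point of the design is that each candidate setting is scored by a predicted error rather than by ground-truth measurement, which is what allows the search to run without manual quality verification.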
Experimental validation using 100 high-resolution image pairs of a structural environment demonstrated the algorithm’s efficacy. The results showed a 21.4% reduction in average stitching error (from 0.14% to 0.11%) and a 61.7% reduction in maximum error (from 1.54% to 0.59%) relative to baseline hyperparameter settings. Notably, the algorithm eliminated critical stitching failures in which errors exceeded 1.5%, so that every result stayed below this failure threshold. This outcome underscores the algorithm’s ability to predict extreme errors and incorporate these predictions into an effective optimization process.
A key practical strength of the proposed approach is its computational feasibility for field applications. The entire optimization process for the 100 image pairs was completed in approximately 12.3 min, corresponding to about 7.4 s per image pair. Although this is roughly 3.5 times slower than the baseline method (2.1 s per pair), it remains practical for efficient batch processing, striking a balance between accuracy and speed. This demonstrates that robust, data-driven quality assurance can be achieved without excessive computational cost or reliance on large-scale training data. Furthermore, the framework’s interpretable structure and its ability to achieve robust results with a limited number of optimization iterations distinguish it from deep learning-based or heuristic methods, allowing for consistent and reliable stitching quality without extensive manual intervention.
From an academic standpoint, this study addresses key limitations in existing feature matching-focused research by introducing a systematic “error prediction-optimization” framework for hyperparameter tuning that is more interpretable and generalizable than prior black-box approaches. This methodological advancement broadens the scope of hyperparameter optimization research by treating error prediction as a critical component of the optimization process. Practically, the proposed algorithm provides a robust foundation for achieving consistent stitching results without reliance on manual quality verification, making it particularly valuable in applications such as drone-based structural inspections and virtual reality mapping. Furthermore, the observed 57.1% reduction in standard deviation (from 0.21% to 0.09%) indicates more consistent and reproducible results across the tested image pairs.
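The relative reductions and the per-pair runtime quoted in these conclusions follow directly from the reported before and after values, as the short Python check below shows. The function name `relative_reduction` is hypothetical; the numbers are those stated in the text.

```python
def relative_reduction(before, after):
    """Percentage reduction from `before` to `after`."""
    return 100.0 * (before - after) / before


# (baseline, proposed) values as reported, in percent of the image diagonal.
reported = {
    "average error": (0.14, 0.11),
    "maximum error": (1.54, 0.59),
    "standard deviation": (0.21, 0.09),
}
for name, (before, after) in reported.items():
    print(f"{name}: {relative_reduction(before, after):.1f}% reduction")

# Runtime: 12.3 minutes for 100 image pairs, against 2.1 s per pair at baseline.
seconds_per_pair = 12.3 * 60 / 100
slowdown = seconds_per_pair / 2.1
print(f"{seconds_per_pair:.1f} s per pair, {slowdown:.1f}x the baseline runtime")
```

The check reproduces the stated 21.4%, 61.7%, and 57.1% reductions and the figure of about 7.4 s per image pair.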
Despite these advancements, certain limitations remain that warrant further investigation. The current validation was conducted exclusively under high-resolution (4K) conditions, with limited exploration of challenging scenarios such as low-light environments or high-noise datasets. Future research should extend performance evaluations to multi-resolution datasets and adverse imaging conditions to ensure broader applicability. Additionally, developing lightweight optimization algorithms capable of real-time processing and exploring deep reinforcement learning-based automation frameworks represent promising directions for future work.
In conclusion, the regression analysis-based hyperparameter optimization algorithm presented in this study establishes a novel paradigm for quality-assured image stitching by combining predictive accuracy with practical utility. By addressing both academic challenges and real-world requirements, this approach contributes significantly to advancing image stitching technology while paving the way for its standardization and industrial adoption.
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2025-00522517).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
