Abstract
Non-rigidly coupled LiDAR-camera systems are increasingly adopted for 3D reconstruction and autonomous robots. However, point cloud colorization remains challenging for such systems due to the absence of fixed extrinsic parameters between sensors. This paper proposes a novel colorization framework for self-built non-rigid systems, which decomposes the colorization task into localized 3D-2D projective transformation estimation, mitigating the lack of fixed extrinsics. Each image then colors the local 3D scene within its view frustum. When merging these scenes, overlapping regions yield multiple coloring candidates for individual 3D points. To resolve this, we introduce a coloring reliability assessment method that selects the optimal coloring source per point by evaluating projection geometry and reprojection errors. We conduct comprehensive experiments on self-collected outdoor real-world datasets, reporting the quantitative accuracy of 3D-2D transformation estimation and a qualitative analysis of the colorized point clouds. We also perform an ablation study on the key module and a cross-system validation on rigidly coupled systems. Together, these results demonstrate the effectiveness and extensibility of the proposed method.
Introduction
LiDAR-camera systems have been widely adopted in 3D reconstruction and autonomous robotic systems to produce colorized point clouds.1 In a typical LiDAR-camera system, the components are rigidly connected, so a single pre-calibration step suffices to determine the fixed extrinsic parameters between the LiDAR and camera that define their relative spatial relationship.2,3 These fixed parameters map each LiDAR point to a corresponding image pixel, thereby coloring the LiDAR points. Subsequently, techniques such as simultaneous localization and mapping (SLAM) are employed to accumulate these points into a global colorized point cloud.4
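To make the rigid-case mapping concrete, the following is a minimal sketch of coloring LiDAR points with one fixed extrinsic. It assumes an ideal pinhole camera without lens distortion; the function name `colorize_fixed_extrinsics` and the symbols `K`, `R_cl`, and `t_cl` are illustrative choices, not notation taken from the paper.

```python
import numpy as np


def colorize_fixed_extrinsics(points_lidar, image, K, R_cl, t_cl):
    """Color LiDAR points with one fixed LiDAR-to-camera extrinsic.

    points_lidar: (N, 3) points in the LiDAR frame.
    image:        (H, W, 3) color image.
    K:            (3, 3) pinhole intrinsic matrix (distortion ignored, assumption).
    R_cl, t_cl:   rotation (3, 3) and translation (3,) from LiDAR to camera frame.
    Returns (colors, valid): per-point colors and a mask of colored points.
    """
    n = points_lidar.shape[0]
    colors = np.zeros((n, 3), dtype=image.dtype)
    valid = np.zeros(n, dtype=bool)

    # Rigid transform into the camera frame: p_c = R_cl p_l + t_cl.
    points_cam = points_lidar @ R_cl.T + t_cl

    # Only points in front of the camera can be projected.
    idx = np.flatnonzero(points_cam[:, 2] > 1e-6)
    uvw = points_cam[idx] @ K.T
    u = uvw[:, 0] / uvw[:, 2]
    v = uvw[:, 1] / uvw[:, 2]

    # Keep projections that land inside the image bounds.
    h, w = image.shape[:2]
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    idx = idx[inside]
    cols = u[inside].astype(int)
    rows = v[inside].astype(int)

    # Nearest-pixel color lookup (row index is v, column index is u).
    colors[idx] = image[rows, cols]
    valid[idx] = True
    return colors, valid
```

The whole mapping hinges on `R_cl` and `t_cl` being constant over time, which is exactly the property a non-rigidly coupled system lacks.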
Non-rigidly coupled LiDAR-camera systems are also frequently encountered in practice. They arise from a variety of real-world setups, such as mechanical jitter in the mounting of originally rigidly coupled systems, integrated pan-tilt camera-LiDAR combinations, or the more common design illustrated in Figure 1, which employs a servo motor to drive LiDAR rotation. Compared with rigidly mounted LiDAR sensors, this servo-driven rotating setup flexibly extends the LiDAR’s perceptual coverage, bringing notable perception gains for handheld devices and robotic systems.5,6 However, because the servo-driven rotation of the LiDAR makes the coupling non-rigid, the extrinsic parameters between the LiDAR and the camera become time-varying, and fixed extrinsic parameters can no longer be obtained via a single pre-calibration step as in rigidly coupled systems. This not only presents a significant challenge for point cloud colorization,5 but also raises difficulties for many other key robotic tasks, including SLAM,7 multi-sensor fusion, and multi-modal perception.8
Figure 1. The top image shows a self-built non-rigidly coupled LiDAR-camera system. The bottom image shows the colorized point cloud of a street-level commercial building facade produced by the proposed framework.
To enable continuous estimation of system extrinsic parameters over time, instead of relying on a one-time pre-calibration, numerous studies have investigated online calibration approaches.9–11 Such methods treat the LiDAR-camera extrinsic parameters as optimizable variables for continuous iterative estimation and integrate them as a submodule into algorithms such as multi-sensor fusion SLAM, thereby improving the accuracy of both the extrinsic estimation and the entire system.12 However, existing online calibration methods are still designed for rigidly coupled sensors. They can only handle minor extrinsic variations caused by factors such as long-term loosening of mechanical mounts, and cannot adapt to the rapid extrinsic variations induced by servo-driven LiDAR rotation, which is the scenario addressed in this paper.
For systems such as servo-driven LiDAR setups, several studies use motor encoder data to measure the real-time rotation angle of the LiDAR, and thus treat such systems as rigidly coupled.13,14 They first transform the LiDAR point cloud into the motor’s coordinate frame, which establishes fixed extrinsic parameters between the motor frame and the camera. Colors are then mapped to the point cloud in the motor frame and subsequently accumulated using SLAM. However, these approaches require high-precision synchronization among the LiDAR, the motor encoder, and the camera, which is challenging for self-built systems,13,15 and they are inapplicable to non-rigidly coupled systems without access to servo angle information.
To address these challenges, we propose a novel point cloud colorization framework specifically designed for non-rigidly coupled LiDAR-camera systems. Crucially, it enables colorization on any self-built system without requiring hardware calibration or data synchronization, greatly facilitating the design and implementation of autonomous robotic systems. It first accumulates LiDAR points into a global map using SLAM and segments this map into local scenes. For each scene and its corresponding image, a 3D-to-2D projective transformation is estimated to enable colorization within the local scene. When merging scenes, LiDAR points that can be mapped to multiple images undergo a coloring reliability evaluation to select the optimal coloring source, maximizing overall colorization quality. The contributions of this work can be summarized as follows: (1) A novel point cloud colorization framework is designed explicitly for non-rigidly coupled LiDAR-camera systems, enabling its application on any self-built hardware without requiring extrinsic calibration or data synchronization. (2) A coloring reliability evaluation method is introduced to robustly resolve ambiguous color mappings by selecting the optimal image source for each LiDAR point, thereby ensuring high-quality and consistent colorization.
Methodology
As shown in Figure 2, our framework comprises three main modules. The preprocessing module ingests raw LiDAR scans and camera images, and accumulates a global point cloud via LiDAR SLAM while estimating a coarse camera pose for each image frame. It then partitions the global cloud into local point cloud scenes, each roughly aligned with the view frustum of the corresponding image, forming image-point cloud pairs. Subsequently, the target-free 3D-to-2D transformation estimation module autonomously computes the transformation matrix for each pair using a state-of-the-art target-less extrinsic calibration method.3 Finally, the multi-view colored points merging module evaluates coloring reliability across all potential mappings for each LiDAR point and selects the optimal image source to produce a high-fidelity colorized point cloud.
Figure 2. The point cloud colorization framework for non-rigidly coupled LiDAR-camera systems.
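The partitioning step of the preprocessing module can be illustrated with a simple frustum test. The sketch below is only one plausible way to do it, not the paper's implementation: it assumes a pinhole camera and a coarse camera-to-world pose `T_wc`, and the parameters `max_range` and `margin_px` are hypothetical knobs that enlarge the frustum so that errors in the coarse pose do not cut off relevant points.

```python
import numpy as np


def select_local_scene(points_world, T_wc, K, image_size,
                       max_range=50.0, margin_px=50):
    """Return indices of global points roughly inside one camera's view frustum.

    points_world: (N, 3) global point cloud in the world frame.
    T_wc:         (4, 4) coarse camera-to-world pose (assumed input).
    K:            (3, 3) pinhole intrinsic matrix.
    image_size:   (width, height) in pixels.
    max_range:    far clipping distance in meters (hypothetical parameter).
    margin_px:    pixel margin tolerating coarse-pose error (hypothetical).
    """
    R_wc, t_wc = T_wc[:3, :3], T_wc[:3, 3]

    # World -> camera: p_c = R_wc^T (p_w - t_wc), written for row vectors.
    points_cam = (points_world - t_wc) @ R_wc

    # Depth test: in front of the camera and closer than the far plane.
    z = points_cam[:, 2]
    idx = np.flatnonzero((z > 0.1) & (z < max_range))

    # Project the remaining points and test them against the padded image.
    uvw = points_cam[idx] @ K.T
    u = uvw[:, 0] / uvw[:, 2]
    v = uvw[:, 1] / uvw[:, 2]
    w, h = image_size
    inside = ((u >= -margin_px) & (u < w + margin_px) &
              (v >= -margin_px) & (v < h + margin_px))
    return idx[inside]
```

Each returned index set, together with its image, would form one image-point cloud pair for the subsequent transformation estimation.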
Notations
We denote the world, LiDAR, and camera coordinate frames as {W}, {L}, and {C}, respectively. Given the global point cloud set
Preprocessing
This module resolves the absence of fixed extrinsic parameters by decomposing global point cloud colorization into independent local scene tasks. The global point cloud can be generated by any LiDAR SLAM algorithm (e.g., FAST-LIO2 16 in our work). Critically, our framework bypasses gimbal coordinate transformations (infeasible for custom setups like Figure 1) by having SLAM directly compute the LiDAR-to-world transformation
For each image
Target-free 3D-2D transformation estimation
For each pair (
While the target-free calibration approach estimates
Multi-view colored points merging
While individual coloring is performed per image-point cloud pair, significant overlaps between scenes may cause multiple images to color each LiDAR point. Selecting optimal coloring sources during point cloud merging thus becomes critical. We find that larger incidence angles and longer ranges between camera and object surfaces correlate with higher coloring errors. We therefore propose a quality assessment metric evaluating coloring reliability per point-image pair using
Experiments
Experimental setup
Due to the absence of public datasets for non-rigidly coupled LiDAR-camera systems, we collected ten outdoor data sequences (trajectory lengths: 20-100 m) across the Dalian Maritime University campus using the custom-built system shown in Figure 1. The hardware comprises a servo motor rotating an Ouster OS1-32 LiDAR, a rigidly mounted FLIR CM3-U3-13Y3C color camera capturing 1280×1024 images at 25 Hz, and an onboard computer (Intel i7-10510U CPU, 16 GB RAM) for data logging and offline processing.
For the decay coefficients in Eq. 3, we select representative data sequences and calculate the value distribution of the three metric terms. Each coefficient is set as the reciprocal of the 95th percentile of its corresponding term to normalize different quantities into a consistent range [0,1]. The final coefficients are set as
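The percentile-based rule for the decay coefficients can be sketched as follows. This is an illustrative reading of the description above, not the authors' code: the term names used in the usage note (incidence angle, range, reprojection error) are inferred from the surrounding text, and the helper name `decay_coefficients` is hypothetical.

```python
import numpy as np


def decay_coefficients(terms, q=95.0):
    """Set each coefficient to the reciprocal of the q-th percentile of its term.

    terms: dict mapping a term name to a 1-D array of values observed on
           representative sequences (term names are assumptions).
    q:     percentile used for normalization (95 in the text).
    """
    coeffs = {}
    for name, values in terms.items():
        p = float(np.percentile(np.asarray(values, dtype=float), q))
        if p <= 0.0:
            raise ValueError(f"percentile of term '{name}' must be positive")
        # coefficient * value is about 1 at the q-th percentile of the data.
        coeffs[name] = 1.0 / p
    return coeffs
```

For example, `decay_coefficients({"incidence_angle": angles, "range": ranges, "reproj_error": errors})` returns one coefficient per term. Note that values above the chosen percentile map to products greater than 1 unless they are clipped.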
Results of 3D-to-2D transformation estimation
Table 1. Transformation estimation errors.
Results of point cloud colorization
Since acquiring ground-truth point-pixel correspondences remains challenging, quantitative evaluation of colorization accuracy relies on the indirect metrics presented in the previous section. Here we qualitatively validate the results through rendered colorized point clouds. Figures 1 and 3 present the colorization results for several structured architectural scenes and for less-structured lawn and garden environments. The colored point clouds are visually consistent with the actual environments.
Figure 3. Point cloud colorization results. (a)-(c) are scene photographs and (d)-(f) are the corresponding colored point clouds.
Ablation study on multi-view merging
We conduct an ablation study to validate the significance of the multi-view colored points merging module. For comparison, we implement a baseline framework without this module, where each point’s color is assigned solely by the image corresponding to its first projection instance during scene merging. Figure 4 contrasts representative local details from both methods, showing that the baseline suffers from obvious misalignment and ghosting artifacts (e.g., text boundaries), while our method presents clearer edges and a more faithful representation by selecting optimal coloring sources based on perspective, distance, and projection quality.
Figure 4. Comparison of local details of the proposed framework with (left column) and without (right column) multi-view merging.
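The difference between the two variants compared above can be sketched in a few lines. The exponential score below is an assumed form chosen only for illustration, since the exact expression of the reliability metric is not reproduced here; the function names and the candidate fields (`color`, `angle`, `dist`, `err`) are likewise hypothetical.

```python
import math


def reliability(angle, dist, err, k_angle, k_dist, k_err):
    """Assumed score: decays with incidence angle, range, and reprojection error."""
    return math.exp(-(k_angle * angle + k_dist * dist + k_err * err))


def pick_color(candidates, k_angle, k_dist, k_err, use_merging=True):
    """Choose a color for one LiDAR point from its candidate images.

    candidates: list of dicts with keys 'color', 'angle', 'dist', 'err',
                ordered by when each image first projected onto the point.
    use_merging=False reproduces the baseline: the first projection wins.
    """
    if not candidates:
        return None
    if not use_merging:
        return candidates[0]["color"]
    # With merging, the candidate with the highest reliability wins.
    best = max(
        candidates,
        key=lambda c: reliability(c["angle"], c["dist"], c["err"],
                                  k_angle, k_dist, k_err),
    )
    return best["color"]
```

With this structure, the ablation baseline and the full method differ only in the `use_merging` flag, which makes the comparison easy to reproduce on the same set of candidates.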
Comparison on rigidly coupled systems
We compare the proposed framework with a coloring scheme based on pre-calibration. Since traditional pre-calibration methods are not applicable to the non-rigidly coupled LiDAR-camera systems that are the focus of this paper, we conduct this comparison on a rigidly coupled system, which also verifies the extensibility of the proposed method. Specifically, the servo motor of the device shown in Figure 1 is locked to form a conventional rigidly coupled system. Two outdoor data sequences, each approximately 100 m long, are collected in the campus environment as test data. The same LiDAR SLAM method (i.e., FAST-LIO2) is adopted for map construction. For the baseline method, after performing manual extrinsic calibration using the state-of-the-art calibration method,3 the raw LiDAR points of each frame are directly colored online with the calibrated extrinsics, and the global colored point cloud is accumulated incrementally. Note that our experimental setup is not equipped with hardware synchronization; thus, the platform moves slowly during data acquisition to minimize the impact of time synchronization errors. In contrast, the proposed framework operates offline: the global map is first constructed, and the colorization is then completed following the pipeline shown in Figure 2.
Partial colorization results for the two data sequences are compared in Figure 5. Both methods are capable of accomplishing the global colorization task. However, the baseline method yields inferior color details with blurry local regions, most likely because the rigidly coupled system constructed in this work lacks precise hardware synchronization. This leads to fusion errors between LiDAR and image frames, and such errors accumulate at the same physical location during multi-frame fusion. In contrast, the proposed method achieves sharper local color details, thanks to its one-shot offline coloring based on image-scene pairs and its independence from precise hardware temporal synchronization. We also present quantitative results using the same statistical approach as in Table 1: Table 2 summarizes the errors of the estimated 3D-2D transformations over 15 randomly selected frames. The average error is close to that in Table 1, which further verifies the applicability of the proposed method to different types of LiDAR-camera systems.
Figure 5. Comparison results on the rigidly coupled system.
Table 2. Transformation estimation errors on the rigidly coupled system.
Conclusion
We propose a point cloud colorization framework for non-rigidly coupled LiDAR-camera systems. By decomposing the global coloring problem into localized 3D-to-2D transformation estimation tasks, our approach overcomes the absence of fixed extrinsic parameters. A novel coloring quality assessment method is introduced to select the optimal image source for each point during multi-view colored points merging. Future work will focus on improving the framework’s robustness across diverse scenarios and validating it on more types of non-rigid systems. We also intend to develop quantitative evaluation benchmarks with ground truth for point cloud coloring and to conduct a more systematic analysis of parameter effects, so as to further improve the practicality and objectivity of the method.
Footnotes
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (grant number 62303085).
Declaration of conflicting interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
All data relevant to this study are available upon request. Please contact the corresponding author to access the data.
