Abstract
With the development and maturation of virtual reality (VR) and artificial intelligence technology, panoramic VR roaming is being applied in more and more industries. In the construction engineering industry, some key building nodes have complex structures, and modeling them with panoramic VR roaming technology can significantly reduce both the workload and the difficulty of understanding for design, construction, and management personnel. This study therefore improves the parallax map generation method of the belief propagation (BP) algorithm used in panoramic VR roaming by adopting epipolar line matching, and optimizes its energy function and matching primitives, yielding a spatial model construction method for panoramic VR roaming based on the improved BP algorithm. The experimental results show that the proposed method can significantly improve the quality of spatial modeling of building node images: at 200 iterations, the normalized structural similarity values of the VR roaming space models built with the improved belief propagation (IBP), graph cut (GC), BP, and dynamic programming (DP) algorithms are 3.32, 3.23, 2.96, and 2.84, respectively. The calculation time of the IBP scheme, however, is also the highest among the compared methods; for example, when the number of samples is 400, the calculation time of the IBP scheme is 14.73 min.
Introduction
The rise of VR-based applications in recent years has driven remarkable development of immersive video that can meet users' needs for high-quality image viewing [1]. In this context, panoramic VR roaming based on depth information is gradually being recognized by various industries because of its good roaming experience and strong scene construction effect [2]. In building construction management, the complex structure of some large buildings makes construction, management, and design difficult. Applying panoramic VR roaming space modeling technology here gives practitioners an intuitive, multi-perspective tool for viewing key nodes. Compared with traditional engineering drawing software, such a tool makes the building structure easier for practitioners to understand, which speeds up engineering design and construction and reduces the full-cycle cost of the project [3]. However, traditional panoramic VR roaming space modeling suffers from limited modeling quality and loses much of the original image information, so the modeling method must be improved. This study attempts to optimize the parallax map generation method, the energy function, and the matching primitives of the BP algorithm used for panoramic VR roaming space modeling in order to improve its modeling quality.
Related works
Many experts and scholars have studied virtual space modeling and VR roaming technology for building nodes. Mohamed et al. [4] designed an adaptive safety monitoring model for building safety problems caused by human factors. The test results showed that the frequency of item loss inside the building decreased significantly after the safety monitoring model was deployed. Nishida et al. [5] found that creating large numbers of 3D virtual models is time-consuming and proposed an interactive tool that automatically generates the corresponding virtual space model from individual images of a building and the user's own needs; test results showed that the tool significantly increased the speed of building virtual space models of buildings. Ramdan et al. [6] found that insufficient attention to building modeling design in modern architecture teaching is likely to leave students with weak building design ability, so they proposed an improved teaching scheme that incorporates software-based building modeling design. Experiments showed that the scheme is more effective at cultivating architectural talent. Wang et al. [7] proposed an image space distance feature processing algorithm combining the k-means and k-nearest neighbors (KNN) algorithms to improve the quality of virtual modeling of indoor objects in large spaces; simulation results showed that the modeling quality of a system built on this algorithm was significantly higher than that of the traditional method. Wang et al. [8] constructed a hybrid parameter identification algorithm consisting of least squares and optimal search methods to describe the heat transfer between buildings and indoor heating systems more accurately, and incorporated it into a virtual model of indoor heating. The experimental results showed that the virtual model built with this hybrid algorithm describes the indoor heat propagation of buildings more accurately.
Elbamby et al. [9] noted that VR is expected to be one of the killer applications of 5G networks. However, many technical bottlenecks and challenges must be overcome before VR can be adopted at scale; in particular, its requirements for high throughput, low latency, and reliable communication call for innovative solutions across multiple disciplines. With this in mind, they discussed the challenges of, and implementation approaches for, reliable and low-latency VR applications. Their case study of an interactive VR game arcade further showed that intelligent network design using millimeter wave communication, edge computing, and proactive caching can enable the future vision of wireless VR. Thies et al. [10] proposed an improved FaceVR algorithm. The facial tracking approaches that currently dominate the market for VR environments achieve good tracking performance, but the virtual space models they generate are less realistic. Compared with these traditional VR modeling methods, the FaceVR algorithm can generate VR models with a near photo-like degree of realism. The researchers also designed a wearable device based on the algorithm, whose key component is a head-mounted unit with optical and acoustic sensors; the device can capture, in real time, the facial motion of a person wearing a head-mounted display (HMD) and track the eyes from monocular video. Test results showed that the device re-renders realistically in real time, so that the geometry and shape of the face and eyes can be manually modified to improve the realism of the created model. Du et al. [11] studied a system that coordinates data communication between building information modeling (BIM) and VR devices, in order to address the communication difficulties of VR applications and the inefficient data transfer between VR and other engineering modeling software in the building construction industry. Performance verification experiments showed that the system significantly improves the ease of use of the user's device and the data transmission speed. Meißner et al. [12] identified the drawback of cumbersome data encoding in eye tracking methods and reviewed the key advantages of different eye tracking techniques applicable to artificial, natural, and virtual environments. They then explained how combining VR settings with eye tracking provides a unique perspective for shopper research. Huang and Liaw [13] investigated the possibilities, advantages, and disadvantages of applying VR to compulsory education in order to narrow the gap between what students learn and their real-life experiences. The results showed that perceived self-efficacy and perceived interaction are two key factors that influence the effectiveness of VR applications in teaching and learning. In addition, perceived ease of use, perceived usefulness, and motivation to learn are three important factors that influence learners' willingness to use VR learning environments.
An analysis of the research above shows that, in recent years, experts and scholars have applied VR technology widely in traditional industries such as education, engineering construction, and industrial design, with a broad range of academic results. However, advanced VR roaming technology is still rarely applied to the virtual modeling and display of architectural nodes, even though this application would significantly reduce the difficulty practitioners face in understanding engineering construction and design. This gap is one motivation for the present study.
Improved BP algorithm design for constructing panoramic VR roaming in building nodes
Panoramic VR roaming polar matching method design
An omnidirectional VR roaming scene of key node structures, built with VR technology, can give construction managers, designers, supervisors, and owners a more intuitive and immersive way to observe the project, thus improving the quality of their work [14]. The difficulty in generating the roaming space lies in extracting depth information from the scene images of two adjacent viewpoints, which is crucial for virtual viewpoint synthesis [15]. However, when images are acquired along an omnidirectional linear path, the image size differs considerably between adjacent viewpoints, so the accuracy of traditional parallax map generation algorithms is often unsatisfactory. Therefore, an improved BP algorithm based on polar line (epipolar line) matching is proposed.
The BP algorithm is a typical global matching algorithm: it updates the belief (confidence) of each pixel by exchanging messages among the pixels in its neighborhood until it outputs a stable optimal solution [16]. To address the choice of image matching method for the omnidirectional linear path acquisition module in the traditional BP algorithm, and to handle offsets in both the vertical and horizontal dimensions, this study proposes an improved BP algorithm, whose core computational flow is shown in Fig. 1.
Flow chart of the core steps of the improved BP algorithm.
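The message-passing idea behind the traditional BP algorithm can be illustrated with a minimal sketch. It is not the improved algorithm of this study: it assumes the min-sum form of loopy BP on a four-neighborhood pixel grid with a truncated-linear smoothness cost, and the function name `min_sum_bp` and the parameters `lam`, `trunc`, and `iters` are hypothetical names chosen for illustration.

```python
import numpy as np

def min_sum_bp(data_cost, lam=1.0, trunc=1, iters=5):
    """Loopy min-sum BP on a four-neighborhood grid (illustrative sketch).

    data_cost: array (H, W, L); cost of giving label l to pixel (y, x).
    Returns an (H, W) array holding the label that minimizes each
    pixel's final belief.
    """
    h, w, n_labels = data_cost.shape
    lab = np.arange(n_labels)
    # Assumed truncated-linear smoothness cost between neighboring labels.
    pair = lam * np.minimum(np.abs(lab[:, None] - lab[None, :]), trunc)
    # msgs[k, y, x, :] is the message pixel (y, x) receives from its
    # neighbor above (k=0), below (k=1), left (k=2), or right (k=3).
    msgs = np.zeros((4, h, w, n_labels))

    def send(exclude):
        # Add the data cost and all incoming messages except the one that
        # came from the receiver, then minimize over the sender's label.
        partial = data_cost + msgs.sum(axis=0) - msgs[exclude]
        out = (partial[..., :, None] + pair).min(axis=2)
        return out - out.min(axis=-1, keepdims=True)  # keep values bounded

    for _ in range(iters):
        new = np.zeros_like(msgs)
        new[0, 1:, :] = send(exclude=1)[:-1, :]   # sent down, received from above
        new[1, :-1, :] = send(exclude=0)[1:, :]   # sent up, received from below
        new[2, :, 1:] = send(exclude=3)[:, :-1]   # sent right, received from left
        new[3, :, :-1] = send(exclude=2)[:, 1:]   # sent left, received from right
        msgs = new

    belief = data_cost + msgs.sum(axis=0)
    return belief.argmin(axis=-1)

# Toy example: every pixel prefers label 1 except one noisy center pixel.
cost = np.ones((5, 5, 3))
cost[:, :, 1] = 0.0
cost[2, 2] = [1.0, 1.0, 0.0]
labels = min_sum_bp(cost)
```

In this toy case the messages from the four neighbors outweigh the data cost of the noisy center pixel, so it is smoothed to the same label as its surroundings.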
The algorithm is designed in detail as follows. To obtain the parallax map, the algorithm must first find the pair of epipolar lines on which an image point lies, based on the correspondence between viewpoint images. For front-back adjacent viewpoints, the traditional binocular way of matching epipolar lines is no longer reasonable. Therefore, starting from the binocular lateral epipolar line pair, a method for computing the epipolar line pair of adjacent viewpoint images is designed, and the image matching path is then optimized. The left-right imaging of a standard stereo image pair is illustrated in Fig. 2.
Schematic diagram of polar matching under binocular vision.
As shown in Fig. 2, the left and right imaging are located on the same plane, and both are parallel to the baseline
Imaging diagram of adjacent viewpoints before and after.
In Fig. 3,
Matching path of traditional binocular optimization algorithm.
The subfigures (a) and (b) in Fig. 4 represent the matching path process of binocular vertical and binocular horizontal local optimization algorithms, respectively. Considering the offset problem existing in both directions, the pair of polar lines is now chosen as the image matching path, and the window size of the local matching algorithm is set to
In Eq. (1),
In Eq. (3.1),
The sum of absolute differences (SAD) and the sum of squared differences (SSD) measure the difference between the primitive to be matched and the reference primitive, whereas normalized cross-correlation (NCC) measures the similarity between the two; the most appropriate measurement function should be chosen according to the conditions of use.
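The three measures can be sketched in their textbook forms for a pair of equally sized matching windows. The function names are illustrative, and the NCC shown is the zero-mean variant, which is an assumption since the exact definition used in this study is not reproduced here.

```python
import numpy as np

def sad(ref, cand):
    """Sum of absolute differences: 0 for identical windows, larger = more different."""
    ref, cand = np.asarray(ref, float), np.asarray(cand, float)
    return float(np.abs(ref - cand).sum())

def ssd(ref, cand):
    """Sum of squared differences: penalizes large deviations more than SAD."""
    ref, cand = np.asarray(ref, float), np.asarray(cand, float)
    return float(((ref - cand) ** 2).sum())

def ncc(ref, cand, eps=1e-12):
    """Zero-mean normalized cross-correlation in [-1, 1]; 1 = most similar."""
    ref, cand = np.asarray(ref, float), np.asarray(cand, float)
    r0, c0 = ref - ref.mean(), cand - cand.mean()
    denom = np.sqrt((r0 ** 2).sum() * (c0 ** 2).sum())
    if denom < eps:          # flat window: correlation is undefined, return 0
        return 0.0
    return float((r0 * c0).sum() / denom)

# A candidate window that is a brightened, contrast-stretched copy of the reference.
ref = np.array([[10, 20], [30, 40]])
cand = 2 * ref + 5
print(sad(ref, cand), ssd(ref, cand), ncc(ref, cand))
```

In this example SAD and SSD report a large difference because the intensities changed, while NCC still reports near-perfect similarity, which is one reason the choice of measure depends on the imaging conditions.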
The BP algorithm models the matching space for the physical images of building nodes as a Markov random field (MRF), so the label state of each node can be updated by propagating information between nodes [18]. Stereo matching in panoramic VR roaming suffers from noise interference and unsmooth edges of parallax objects, and a model based on Bayes' theorem is well suited to such problems [19]. An MRF is a probabilistic graphical model that supports this kind of Bayesian inference, and its two-dimensional grid structure makes it well suited to processing two-dimensional image data. The common types of MRF are shown in Fig. 5.
Common MRF models.
As Fig. 5 shows, the eight-neighborhood MRF extends the original four-neighborhood MRF with four additional directions (top-right, bottom-right, top-left, and bottom-left), and message propagation in the former is more efficient. When the four-neighborhood MRF is applied to image processing, each node represents a pixel: in Fig. 5, the dots represent pixels, each edge represents the relationship between the pixels at its two endpoints, and the black and white dots represent the hidden unknown variables and the observed known variables, respectively. To briefly illustrate the MRF calculation, assume that the known viewpoint image and the hidden scene information to be inferred are
After the known variable nodes are removed, the full probability model of the MRF is used to obtain the marginal probability of a selected node; the formula for the node marginal probability is shown in Eq. (7).
The relationship between viewpoint images and parallax information is expressed in MRF by the likelihood function
The global stereo matching algorithm finds the optimal solution by minimizing an energy function. The key steps of the computation are establishing the energy function and optimizing it iteratively, and the minimizing numerical solution of the energy function must be obtained in order to generate the optimal image parallax information [20]. The energy function of the global matching algorithm consists of two parts, a smoothness term and a data term: the former represents the cost of parallax discontinuities in the viewpoint image, and the latter represents the penalty cost of assigning a given parallax label to the viewpoint image. The global energy function can be expressed as in Eq. (9).
In Eq. (10),
The data term is calculated from the absolute difference cost function, as given in Eq. (11).
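The structure of such an energy can be sketched as a data term plus a weighted smoothness term. This is a generic illustration rather than the exact energy of this study: it assumes a horizontal shift between the two images, a truncated-linear smoothness penalty over four-neighborhood pixel pairs, and hypothetical parameter names `lam` (smoothness weight) and `trunc` (truncation threshold).

```python
import numpy as np

def data_cost(ref, tgt, disp):
    """Absolute intensity difference between ref(y, x) and tgt(y, x - d)."""
    h, w = ref.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xt = np.clip(xs - disp, 0, w - 1)          # clamp at the image border
    return np.abs(ref.astype(float) - tgt[ys, xt].astype(float))

def smooth_cost(disp, trunc=2):
    """Truncated-linear penalty summed over 4-neighborhood pixel pairs."""
    d = disp.astype(np.int64)
    horiz = np.minimum(np.abs(d[:, 1:] - d[:, :-1]), trunc).sum()
    vert = np.minimum(np.abs(d[1:, :] - d[:-1, :]), trunc).sum()
    return float(horiz + vert)

def energy(ref, tgt, disp, lam=1.0, trunc=2):
    """Global energy: data term plus lam times the smoothness term."""
    return float(data_cost(ref, tgt, disp).sum()) + lam * smooth_cost(disp, trunc)

# Toy pair: tgt is ref shifted by one pixel, so the true parallax is 1.
ref = np.arange(12).reshape(3, 4)
tgt = np.roll(ref, -1, axis=1)
good = np.ones((3, 4), dtype=int)    # correct parallax labels
bad = np.zeros((3, 4), dtype=int)    # wrong parallax labels
print(energy(ref, tgt, good), energy(ref, tgt, bad))
```

The correct parallax map yields the lower energy, which is the property the iterative optimization relies on.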
In Eq. (11),
where
Confidence information dissemination model.
In Fig. 6,
In Eq. (13),
The optimal parallax for each pixel in the image can be obtained by minimizing the confidence level, from which the worst parallax of
After iterative optimization and message passing through the above steps, a parallax map that is optimal in the global sense can be generated. If the binocular matching algorithm is applied directly to front-back image matching, mismatches occur very easily, so the matching primitives need to be optimized. Assuming the corresponding epipolar lines of the forward image
In Eq. (16),
After the improved BP algorithm (hereinafter IBP) was constructed, 400 groups of domestic building nodes from different sites, of different types, and at different construction stages were modeled using 3ds Max 2016. In the modeling, the task of generating the optimal parallax from the front and back images was carried out by the IBP algorithm designed in this study, and the data set was collected by photographing the nodes and retaining the high-definition (HD) images. With the HD images of the building nodes as labels, the quality of the output VR models was evaluated using structural similarity, mutual information, mean square error, and training time, and each metric was standardized to unify the order of magnitude. VR roaming space construction methods based on the traditional belief propagation (BP), dynamic programming (DP), and graph cut (GC) stereo matching algorithms were used for comparison. The structural similarity between the building node models and the photos is shown in Fig. 7.
Structural similarity statistics between building node model and photos.
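The three image-quality metrics used in the evaluation can be sketched in their common textbook forms. These are assumptions rather than the exact implementation of this study: the structural similarity below is a simplified single-window (global) version instead of the usual locally windowed average, the mutual information is estimated from a joint histogram with a hypothetical `bins` parameter, and the standardization step is omitted because its form is not specified.

```python
import numpy as np

def mse(a, b):
    """Mean square error between two images of equal shape."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(np.mean((a - b) ** 2))

def global_ssim(a, b, data_range=255.0):
    """Single-window structural similarity (simplified; 1 = identical)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (a.var() + b.var() + c2)
    return float(num / den)

def mutual_information(a, b, bins=32):
    """Mutual information (in nats) estimated from a joint intensity histogram."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)      # marginal of the first image
    py = p.sum(axis=0, keepdims=True)      # marginal of the second image
    nz = p > 0                             # skip empty cells to avoid log(0)
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

# Compare a photo with a mildly noisy rendering of it (synthetic stand-ins).
rng = np.random.default_rng(0)
photo = rng.integers(0, 256, size=(64, 64)).astype(float)
model = np.clip(photo + rng.normal(0, 5, size=photo.shape), 0, 255)
print(mse(photo, model), global_ssim(photo, model), mutual_information(photo, model))
```

Higher structural similarity and mutual information, and lower mean square error, indicate that the model retains more of the source image.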
In Fig. 7, the horizontal axis represents the number of training iterations of each algorithm, the vertical axis represents the normalized structural similarity, and the data points in different colors represent the VR roaming space model construction methods based on different matching algorithms. Fig. 7 shows that, as the number of iterations grows, the structural similarity of the models built with every matching algorithm first increases and then stabilizes. After each method has converged, the normalized structural similarity between the source images and the VR roaming space models based on the IBP, GC, BP, and DP algorithms is 3.32, 3.23, 2.96, and 2.84, respectively; the spatial model built with the IBP algorithm designed in this study has the highest similarity to the original image and the best restoration effect. The similarity between each scheme's output spatial model and the original image was then further verified in terms of mutual information, with the statistical results shown in Fig. 8.
Mutual information statistics of building node model and photos.
In Fig. 8, the horizontal axis represents the number of training iterations of each algorithm, the vertical axis represents the normalized mutual information, and the data lines in different colors represent the VR roaming space model construction methods based on different matching algorithms. Fig. 8 shows that, as the number of iterations grows, the mutual information of the models built with every matching algorithm changes in a pattern generally consistent with Fig. 7. After each method has converged, the normalized mutual information between the source images and the VR roaming space models based on the IBP, GC, BP, and DP algorithms is 3.26, 3.22, 3.02, and 3.24, respectively; the VR spatial modeling system based on the IBP algorithm designed in this study retains the most information from the original images. The mean square error between each scheme's output spatial model and the original image is analyzed next and shown in Fig. 9.
Mean square error statistics of building node models and photos.
In Fig. 9, the horizontal axis again represents the number of training iterations of each algorithm, the vertical axis represents the normalized mean square error, and the data lines in different colors represent the VR roaming space model construction methods based on different matching algorithms. Fig. 9 shows that the standardized mean square error of every method fluctuates but decreases overall as the number of iterations grows. The spatial modeling system based on the IBP algorithm converges the slowest, but its standardized mean square error after convergence is the smallest.
Time consumed (min) by each scheme to process different numbers of samples
Table 1 shows that, for every sample size tested, the VR spatial modeling system based on the IBP algorithm designed in this study has a significantly higher average sample processing time. This is attributed to the attention mechanism incorporated into the belief propagation model of the system, which significantly increases the computational complexity of the algorithm. In contrast, the system based on the traditional BP algorithm has the lowest computation time at every sample size, while the modeling times of the systems based on the GC and DP methods are approximately equal and lie between those of the other two.
To address the information loss in VR models of building nodes, this study designed a construction method for panoramic VR roaming space models based on an improved BP algorithm, and compared it experimentally with VR roaming space construction methods based on the traditional belief propagation, dynamic programming, and graph cut stereo matching algorithms. The experimental results show that, as the number of iterations increases, the structural similarity and mutual information of the VR roaming space models based on every matching algorithm first increase and then stabilize. Once the performance of all methods is stable, the standardized structural similarity values of the models based on the IBP, GC, BP, and DP algorithms are 3.32, 3.23, 2.96, and 2.84, and the standardized mutual information values are 3.26, 3.22, 3.02, and 3.24, respectively. The mean square error fluctuates but declines overall as the number of iterations increases. In addition, at every sample size tested, the average sample processing time of the IBP-based VR spatial modeling system is significantly higher than that of the other systems. Applying the improved belief propagation algorithm designed in this study to the construction of VR roaming spaces for building nodes therefore reduces the information loss of the resulting model, but it also prolongs the modeling time of building node objects in virtual space to some extent, which needs to be improved in subsequent research.
