An adaptive uncertainty-aware hybrid neural network for enhanced learning in real-time building energy prediction with dynamic occupant behavior modeling

Abstract

Accurate real-time prediction of building energy consumption is essential to smart energy management, but it is challenging because occupant behavior is dynamic and hard to predict. Traditional statistical and machine learning models typically do not represent these variations, which degrades forecasting performance and reduces reliability in practical applications. To fill this gap, we introduce AU-HNN (Adaptive Uncertainty-Aware Hybrid Neural Network), a framework that combines multi-scale hybrid deep learning (CNN, LSTM, Transformer) with Bayesian uncertainty quantification and dynamic occupant behavior modeling. The model uses online incremental learning to adapt to changing behavioral patterns and produces probabilistic estimates with calibrated confidence intervals. Extensive experiments on real-world datasets from eight major cities in China compare AU-HNN with ten state-of-the-art models: ARIMA, Prophet, XGBoost, Random Forest, CNN-LSTM, BiLSTM, Transformer, GRU, Bayesian-LSTM, and MC-Dropout CNN. The results show that AU-HNN improves 12 performance measures by 17–20 percent, with strong accuracy (RMSE = 7.42, MAE = 5.58, R² = 0.923, NRMSE = 0.137) and uncertainty quantification (PICP = 0.948, PINAW = 0.227, CWC = 14.2). Moreover, AU-HNN achieves competitive real-time performance (latency = 189 ms, memory usage = 142 MB, energy efficiency = 9.8 × 10⁶ FLOPS/Watt) and can be deployed on smart edge devices. The proposed framework offers precise, adaptive, and uncertainty-aware energy predictions that support risk-based decision-making by building operators and energy managers. Its ability to capture the interaction between occupant dynamics and energy usage opens a path to smarter, more resilient, and more sustainable building energy management systems.

Keywords

Building energy prediction; hybrid neural networks; uncertainty quantification; occupant behavior modeling; adaptive learning; real-time forecasting

1. Introduction

Building energy consumption is one of the greatest sustainability challenges worldwide: buildings account for about 40 percent of total global energy consumption and 38 percent of total global carbon dioxide emissions.¹ As urbanization accelerates around the world (especially in developing countries such as China), the energy requirements of residential, commercial, and institutional buildings are rising rapidly. In China, building energy consumption has grown at 5.6% per annum and is expected to reach 4098 Mtce by 2030, which is almost three times the world average.² Rapid urbanization is the main cause of this growth: China's urbanization rate has increased since 2000, reaching 68.52% in 2020, and a huge amount of commercial and residential infrastructure has been built. Energy prediction is difficult because several interrelated factors are involved, such as weather variations, building properties, occupancy, and work schedules. Among these, occupant behavior is the most important and the most challenging, and it can account for 20% to 80% of the variation in energy consumption prediction.^2,3 Conventional building energy management systems rely on static, deterministic models that cannot reflect the dynamic and stochastic nature of human behavior, which leads to large prediction errors and poor energy efficiency.

Recent developments in artificial intelligence and machine learning have provided new opportunities to solve these problems. Deep learning models such as the convolutional neural network (CNN) and long short-term memory (LSTM) network have proven better at detecting complex temporal structures in building energy consumption.^1,4 Nevertheless, current methods have severe limitations: they do not adapt to changing occupancy patterns in real time, they provide no uncertainty quantification for risk-sensitive decision making, and they do not effectively combine multi-scale temporal dependencies. Smart buildings and Internet of Things (IoT) sensors create new possibilities for real-time occupancy monitoring and adaptive energy use. Wi-Fi channel state information (CSI), CO2 sensors, motion detectors, and access card systems allow comprehensive monitoring of occupant activities.⁵ However, predicting energy effectively from these multi-modal data streams requires advanced modeling frameworks that can cope with uncertainty, temporal dynamics, and real-time adaptation.

Despite extensive research, basic gaps remain in building energy prediction methodologies. Most current methods treat occupant behavior as fixed or semi-fixed values, so they do not reflect the dynamic nature of human behavior, which can differ greatly with working-from-home policies, seasons, and special events.^3,6 These assumptions result in prediction errors of 15–30 percent when behavioral patterns change.⁷ Existing prediction models give only point estimates without confidence intervals or other uncertainty measures, which limits their use in risk-sensitive decision making, and uncertainty-aware models remain largely unexplored.^8–10 Moreover, most models require full retraining when occupancy changes and cannot be trained online in an incremental way, which makes them unsuitable for dynamic settings.¹¹ Another gap is the limited treatment of multi-scale temporal dependencies: existing methods concentrate on a single temporal scale and ignore interactions between short-term operational patterns and long-term seasonal changes.^12,13 Lastly, very little research evaluates the real-time practicality of these models, such as computational efficiency, memory constraints, and the latency requirements of edge computing deployment.¹⁴ To address these problems, this study proposes a new framework called AU-HNN (Adaptive Uncertainty-Aware Hybrid Neural Network), which incorporates four innovations.

First, it presents a dynamic occupancy behavior learning module that integrates multi-modal sensor fusion (Wi-Fi CSI, CO2, motion sensors, access card data),⁵ adaptive weight assignment, and a temporal attention mechanism for activity sequence modeling, making this the first work to jointly consider real-time adaptation and probabilistic uncertainty quantification. Second, an online incremental learning system is built on elastic weight consolidation (EWC), memory-efficient adaptation, and adaptive architecture expansion to allow continuous model updates without catastrophic forgetting. Third, we present an uncertainty-aware prediction system that uses Bayesian neural networks to measure epistemic and aleatoric uncertainty, produces calibrated prediction intervals, and supports risk-aware decision making. Fourth, we implement a multi-scale temporal fusion network that uses CNNs for short-term patterns, LSTMs for long-term dependencies, and Transformers for adaptive feature selection, complemented by a dynamic weighting scheme that optimizes contributions across prediction horizons.

The proposed solution addresses five research questions: how can multi-modal sensor data be used to model occupant dynamics in real time; which incremental learning mechanisms ensure continuous adaptability without complete retraining; how can Bayesian uncertainty quantification be integrated with real-time systems; which temporal fusion architectures capture both short- and long-term dependencies; and what can be optimized for edge computing environments while keeping high accuracy and uncertainty calibration. Methodologically, AU-HNN is the first unified system that integrates dynamic occupant behavior modeling, online incremental learning, uncertainty quantification, and multi-scale temporal fusion. Theoretically, it contributes a mathematical formalization of uncertainty propagation, an analysis of convergence behavior in incremental learning, and an extended theory of occupant behavior modeling. In practice, it proposes computationally efficient algorithms for real-time integration with building management systems, supporting risk-informed decision making within latency constraints of less than 200 ms. Empirically, AU-HNN is validated in eight large cities in China with different climates and building types, measured by 12 indicators, and compared with 10 existing baselines, consistently achieving better results.

Although the framework improves on existing baselines, it has limitations. Its multi-component architecture consumes substantial computing resources, and it relies on high-quality sensor data, which is not always available. Occupant monitoring also requires close attention to privacy. Validation is limited to Chinese datasets and focuses on commercial, educational, and office buildings; residential use and long-term seasonal prediction are left for future research. Real-time operation requires modern GPU-equipped edge hardware, and integration with existing management platforms may add complexity.

Despite these limitations, the research contribution is substantial. Scientifically, AU-HNN sets new standards of accuracy, reliability, and practical deployability in intelligent building energy prediction. Practically, it supports smart HVAC control, lighting optimization, demand-side response, cost-efficient energy scheduling, and predictive maintenance. It also shows a potential for 15–25% annual energy savings, which translates to an annual cost reduction of around 2.3 million dollars and a reduction of 1847 tons in CO2 emissions for the 45 buildings included in this study. The rest of the paper is structured as follows: Section 2 reviews related literature, Section 3 presents the methodology, Section 4 describes the experimental setup, Section 5 reports results and analysis, Section 6 discusses findings and limitations, and Section 7 concludes.

2. Related works

Over the last 20 years, building energy prediction has shifted from statistical and regression-based predictors to advanced machine learning (ML) and deep learning (DL) systems. Early statistical models such as ARIMA and linear regression were simple to interpret but could not capture nonlinearities in large-scale energy data (Liu et al., 2021). Ensemble models such as Random Forest, XGBoost, and LightGBM offered substantial improvements by capturing intricate interactions among building parameters, weather conditions, and operational variables, which enabled much greater forecasting accuracy (Dai & Huang, 2025; Wang et al., 2023).

With the rise of deep learning, architectures such as LSTM, CNN-LSTM hybrids, and attention-based models further improved prediction by learning both temporal and high-dimensional feature representations (Chang et al., 2025; Jogunola et al., 2022). More recent work has combined hybrid models with optimization algorithms and domain knowledge to reduce overfitting and complexity and to improve reliability (Zhou et al., 2022; Zeng et al., 2025). Despite these improvements, most models are trained on a fixed dataset and cannot respond to dynamic environments or operational shifts, which highlights the need for dynamic, real-time prediction systems (Reveshti et al., 2025).

Another stream of research examines how occupant behavior influences building energy consumption. Traditional models usually simplified human activity with rigid schedules or probabilistic rules, which produced a discrepancy between simulated and real energy consumption (Uddin et al., 2021). IoT and sensor data have enabled more dynamic modeling of human-building interactions, generally with real-time coverage of occupancy, movement, and activity (Gu & Shao, 2023; Guyixin et al., 2025). New methods derive occupancy from social media data (Lu et al., 2021) or use Wi-Fi CSI and Transformer models to identify occupancy more precisely (Zhang et al., 2025; Sun et al., 2023). These techniques show that deep learning can learn behavioral dynamics, but it is not yet sufficiently combined with energy prediction methods.

Current systems tend to separate occupant models from energy models and overlook the stochastic and uncertain character of human behavior. Although some studies have developed uncertainty-aware or adaptive occupant modeling (Su et al., 2023; Yahaya et al., 2025), privacy, scalability, and generalization remain obstacles to large-scale deployment in heterogeneous socio-cultural settings. Beyond occupant integration, uncertainty quantification and incremental learning are emerging research topics for improving predictive robustness in building energy systems. Prediction uncertainty has been quantified with Bayesian networks, Monte Carlo simulations, and regularized Bayesian neural networks, which improve trust and decision support in real-world applications (Nezhadetthad et al., 2025; Yahaya et al., 2025). Similarly, adaptive learning systems such as online LSTMs and hybrid models seek to reduce catastrophic forgetting and preserve predictive accuracy in dynamic settings (Zhu & Zhang, 2025).

Nevertheless, incremental learning tailored to building systems has not been studied in depth, and most solutions are generic rather than specific to the heterogeneity of building types and usage patterns. The literature therefore reveals three interrelated gaps: low dynamic adaptability, a lack of real-time integration of occupant behavior, and missing uncertainty-aware incremental learning at the energy system level. To overcome these obstacles, the proposed AU-HNN architecture combines real-time occupancy adaptation, multi-scale temporal fusion, and uncertainty-aware prediction, going beyond the current state of the art in building energy prediction. Table 1 summarizes the existing studies.

Table 1.
Overview of the existing studies.

Reference	Method/Model	Dataset/Domain	Key findings	Research gaps	Contributions	Uncertainty handling	Occupancy modeling	Real-time capability
¹	CNN-LSTM-MHA	Residential heating	MHA improves accuracy	Static occupancy modeling	Multi-head attention integration	Not addressed	Passive factors only	Not specified
¹⁵	LightGBM, RF, XGBoost	General buildings	XGBoost outperforms others	No uncertainty quantification	Advanced optimization strategies	Not considered	Not included	Not evaluated
¹⁶	Prophet + Behavior	Office buildings	Personnel behavior crucial	Limited behavioral modeling	Prophet enhancement approach	Not addressed	Basic personnel consideration	Not mentioned
²	Review paper	Multiple domains	Occupant behavior critical	Dynamic modeling lacking	Comprehensive review synthesis	Not focus area	Extensive coverage	Not applicable
⁴	CBLSTM-AE	Commercial buildings	Autoencoder improves features	No real-time implementation	Hybrid deep framework	Not considered	Not addressed	Not implemented
¹⁷	Random Forest hybrid	Mixed buildings	Hybrid approach effective	Lacks deep learning	RF enhancement strategy	Not included	Not modeled	Not assessed
⁶	Attention-based DL	Public buildings	Attention mechanism robust	Fixed occupancy schedules	Attention-based robustness improvement	Not addressed	Schedule-based only	Not specified
⁷	TOSSM + BEM	University buildings	Social media useful	Privacy concerns raised	Social media integration	Not considered	Social media based	Not evaluated
¹¹	GA-enhanced DNN	Commercial buildings	GA improves features	Computational complexity high	GA feature optimization	Not addressed	Basic consideration	Not mentioned
¹⁸	Neural Network	Office buildings	NN effective prediction	Limited architecture innovation	Basic NN application	Not included	Not modeled	Not assessed
⁸	Bayesian NN	Parking systems	Uncertainty quantification important	Different domain focus	Bayesian uncertainty framework	Bayesian approach used	Not applicable	Real-time capable
¹⁹	ML climate-based	Climate-focused buildings	Climate factors crucial	Limited ML models	Climate-ML integration	Not addressed	Not considered	Not evaluated
³	Temporal assessment	Chinese households	Occupant perspective vital	Dynamic modeling needed	China-specific analysis	Not addressed	Occupant-centric approach	Not applicable
²⁰	Transformer	Occupancy prediction	Transformer effective occupancy	Energy prediction missing	Transformer for occupancy	Not considered	Transformer-based prediction	Not specified
²¹	Review paper	Multiple approaches	Occupant behavior influential	Standardized modeling lacking	Systematic review synthesis	Not focus area	Comprehensive coverage	Not applicable
¹⁴	XGBoost control	HVAC systems	XGBoost effective control	Real-time optimization lacking	Demand response integration	Not included	Not addressed	Day-ahead only
²²	Regularized BNN	Occupancy detection	BNN improves precision	Energy prediction missing	Bayesian regularization approach	Bayesian uncertainty used	Detection-focused approach	Not evaluated
⁹	Uncertainty-aware ML	IoT energy harvest	Uncertainty awareness crucial	Building domain missing	IoT uncertainty framework	Uncertainty-focused approach	Not applicable	Real-time capable
¹⁰	Uncertainty-aware learning	Residential comfort	Uncertainty improves comfort	Energy prediction lacking	Thermal comfort uncertainty	Uncertainty integration used	Smart building context	Not specified
¹²	VAE + SA-GRU	Green buildings	VAE-GRU integration effective	Real-time implementation missing	Multifaceted data integration	Not addressed	Not modeled	Not implemented
²³	LSTM micro-climate	Urban buildings	Micro-climate impacts energy	Limited LSTM architecture	Urban micro-climate integration	Not included	Not addressed	Long-term monitoring
⁵	Bi-LSTM CSI	Smart buildings	Wi-Fi CSI effective	Energy prediction missing	Non-intrusive behavior recognition	Not considered	Wi-Fi based detection	Real-time capable
¹³	EMD + DL + ARIMA	Heating load	Combined models superior	Complex model ensemble	Multi-model combination approach	Not addressed	Not included	Short-term only
²⁴	LSTM-GAN	Dynamic prediction	GAN enhances LSTM	Limited architectural innovation	LSTM-GAN hybrid approach	Not addressed	Not modeled	Dynamic capability

3. Materials and methods (AU-HNN)

3.1 Overview of proposed framework

The proposed model combines multi-source data preprocessing, a hybrid neural network (CNN + LSTM) architecture, dynamic occupant behavior simulation, and uncertainty quantification so that smart buildings can make precise energy consumption and occupancy forecasts. The framework operates in real time using edge computing and streaming data processing, as shown in Figure 1.

Figure 1.

AU-HNN – overall framework.

The proposed framework has five major elements that improve predictive accuracy and real-time flexibility. It begins with data preprocessing and feature engineering, which integrate weather, occupancy, and building features with dynamic temporal features. A hybrid neural network architecture follows, in which CNNs extract spatial features, BiLSTMs with attention model temporal dependencies, and multi-task learning provides robust modeling. Dynamic occupant behavior is modeled with clustering, Bayesian filtering, online learning, and temporal evolution analysis to capture user variability. Bayesian neural networks, adaptive estimation, and risk assessment metrics quantify uncertainty and ensure reliability. Lastly, real-time deployment is achieved through edge computing optimization, model compression, and practical management of streaming data.

Table 2 lists each component's innovation, performance gain, and validation result, highlighting the improvements in accuracy, adaptation speed, and prediction reliability.

Table 2.

Components and validation.

Component	Innovation	Performance impact	Validation
Occupant Behavior Learning	Multi-modal + temporal attention	+23.7%	94.3% accuracy
Online Incremental Learning	EWC + dynamic expansion	+20.5%	2–3 day adaptation
Uncertainty Prediction	Bayesian + confidence intervals	22.1% narrower PI	94.8% PICP
Multi-Scale Fusion	CNN + LSTM + Transformer	+27%	Superior across horizons

Figure 2 shows the feedback loop: the system continuously monitors its outputs and uses them to adjust inputs or processes, which ensures adaptive improvement and stability over time.

Figure 2.

Feedback loop.

3.2 Overview

The framework integrates multi-source data fusion, hybrid neural networks, occupant behavior modeling, uncertainty quantification, and real-time implementation. The overall objective function combines energy prediction, occupancy classification, and uncertainty penalties as given in equation (1):

\begin{aligned} L_{\text{total}} = \alpha L_{\text{energy}} + \beta L_{\text{occupancy}} + \gamma L_{\text{uncertainty}} \end{aligned}

(1)

$L_{\text{energy}} = \frac{1}{N} \sum_{i=1}^{N} (y_{i} - \hat{y}_{i})^{2}$ (MSE of energy prediction)

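$L_{\text{occupancy}} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} o_{i,c} \log \hat{o}_{i,c}$ (cross-entropy of occupancy classification over $C$ classes, where $o_{i,c}$ is the one-hot occupancy label and $\hat{o}_{i,c}$ is the predicted class probability)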

$L_{\text{uncertainty}}$ is derived from the Bayesian posterior variance.
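As a minimal sketch of how the three terms of equation (1) might be combined, the Python function below computes a weighted total loss. Several details are assumptions rather than the authors' implementation: the occupancy term is taken to be a standard categorical cross-entropy, the uncertainty term is taken to be the mean predictive variance, and the default values of `alpha`, `beta`, and `gamma` are purely illustrative.

```python
import math


def total_loss(y_true, y_pred, occ_true, occ_prob, pred_var,
               alpha=1.0, beta=0.5, gamma=0.1):
    """Weighted sum of energy MSE, occupancy cross-entropy, and an uncertainty penalty.

    alpha, beta, gamma defaults are illustrative assumptions, not the paper's values.
    Returns the total loss and the three individual terms.
    """
    n = len(y_true)
    # Energy term: mean squared error of the energy forecast.
    l_energy = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / n
    # Occupancy term (assumed form): categorical cross-entropy.
    # occ_true holds class indices; occ_prob holds one probability vector per sample.
    eps = 1e-12  # guards against log(0)
    l_occupancy = -sum(math.log(p[c] + eps) for c, p in zip(occ_true, occ_prob)) / n
    # Uncertainty term (assumed form): mean posterior predictive variance.
    l_uncertainty = sum(pred_var) / n
    total = alpha * l_energy + beta * l_occupancy + gamma * l_uncertainty
    return total, (l_energy, l_occupancy, l_uncertainty)


# Usage with two samples and two occupancy classes (hypothetical numbers).
total, terms = total_loss(
    y_true=[10.0, 12.0], y_pred=[11.0, 12.0],
    occ_true=[0, 1], occ_prob=[[0.5, 0.5], [0.5, 0.5]],
    pred_var=[0.2, 0.4],
)
```

The weights control the trade-off between forecast accuracy, occupancy classification, and how strongly wide predictive distributions are penalized.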

3.3 Data preprocessing and feature engineering

In the preprocessing stage, handling missing values is critical to data reliability. Building energy datasets often contain gaps caused by sensor malfunctions, communication errors, or irregular occupant activity. Missing values were treated with a hybrid imputation strategy: short gaps were filled with linear interpolation to preserve temporal continuity, while longer gaps were handled with statistical methods such as mean substitution or k-nearest neighbor (KNN)-based imputation. This ensured that the model received consistent inputs without introducing significant bias. Outliers were detected through z-score thresholding and replaced with smoothed values to prevent distortion of the training process. Systematically addressing missing and noisy data improved the robustness of the proposed AU-HNN framework. The dataset attributes and their handling are summarized in Table 3.

Table 3.
Dataset description.

Att. ID	Attribute name	Type / Unit	Value / Range (≈)	Source / Sensor	Handling of missing values	Role in AU-HNN	Preprocessing applied
A1	Energy Consumption	kWh (continuous)	0–12,000 per day	Smart meters	BiLSTM imputation	Target variable	Normalization, lag features
A2	Indoor Temperature	°C	15–32	IoT thermal sensors	Sliding-window fill	Input feature	Normalization, temporal aggregation
A3	Outdoor Temperature	°C	−10–42	Weather API	Forward fill (short gaps)	Input feature	Normalization, derivative features (heat index)
A4	Relative Humidity	%	10–95	Weather stations	Interpolation	Input feature	Normalization
A5	Solar Radiation	W/m²	0–950	Pyranometer	Linear interpolation	Input feature	Normalization, diurnal pattern encoding
A6	Wind Speed	m/s	0–15	Weather stations	Mean substitution	Input feature	Normalization
A7	CO₂ Concentration	ppm	350–2500	CO₂ sensors	Bayesian filtering	Occupancy proxy	Normalization, threshold encoding
A8	Motion Activity	Binary (0/1)	{0,1}	PIR sensors	Mode imputation	Occupancy proxy	Categorical encoding
A9	Wi-Fi CSI	Signal strength (dBm)	−95 – −20	Wi-Fi routers	Gaussian smoothing	Occupancy proxy	Embedding + attention
A10	Access Card Swipes	Count / hour	0–200	Access system	Zero-fill (missing = none)	Occupancy proxy	Normalization
A11	Day Type	Categorical	{weekday, weekend, holiday}	Calendar dataset	N/A	Contextual var	One-hot encoding
A12	Time Features	Hours, season, periodic	0–23 (hour), 4 (season)	System clock	N/A	Temporal context	Sin/cos encoding
A13	Building Characteristics	Discrete / continuous	Floor area: 300–15,000 m²	BIM / design database	N/A	Static feature	Normalization
A14	HVAC Operation State	Binary (0/1)	{on/off}	BMS logs	Forward fill	Control proxy	Categorical encoding
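The z-score outlier treatment described at the start of Section 3.3 can be sketched as follows. This is an illustrative sketch only: the threshold of 3, the neighbourhood size, and the use of a local median as the "smoothed value" are assumptions, since the text does not specify them.

```python
import statistics


def replace_outliers(values, z_thresh=3.0, half_window=2):
    """Replace points whose |z-score| exceeds z_thresh with a local median.

    z_thresh and half_window are assumed values chosen for illustration.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return list(values)  # constant series: nothing to flag
    flagged = [abs(v - mean) / std > z_thresh for v in values]
    out = list(values)
    for i, is_outlier in enumerate(flagged):
        if not is_outlier:
            continue
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        # Smooth using only neighbours that were not flagged themselves.
        neighbours = [values[j] for j in range(lo, hi) if not flagged[j]]
        if neighbours:
            out[i] = statistics.median(neighbours)
    return out


# Usage: a flat load profile with one spurious spike (hypothetical data).
readings = [10.0] * 20
readings[10] = 100.0
cleaned = replace_outliers(readings)
```

A global z-score is simple but sensitive to the outliers themselves; on strongly seasonal data a rolling mean and standard deviation would be a reasonable alternative.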

3.3.1 Missing data imputation

Temporal gaps in IoT sensor data are handled using sliding-window imputation (Equation (2)) with forward filling for short gaps (<3 timesteps) and bi-directional LSTM imputation for longer gaps.

\begin{aligned} \hat{x}_{t} = f_{\text{BiLSTM}}(x_{t-k}, \dots, x_{t-1}, x_{t+1}, \dots, x_{t+k}) \end{aligned}

(2)
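The gap-handling rule above (forward fill for gaps shorter than 3 timesteps, a learned model for longer gaps) can be sketched as below. The bi-directional LSTM is not reproduced here; `linear_bridge` is a hypothetical stand-in that interpolates between the gap's neighbours, and any callable with the same signature could be passed as `long_gap_model`.

```python
def linear_bridge(left, right, length):
    """Stand-in for the long-gap model: a straight line between the gap's neighbours."""
    if left is None and right is None:
        raise ValueError("series has no observed values")
    if left is None:
        return [right] * length
    if right is None:
        return [left] * length
    step = (right - left) / (length + 1)
    return [left + step * (k + 1) for k in range(length)]


def impute_gaps(series, short_gap=3, long_gap_model=linear_bridge):
    """Fill None entries: forward-fill runs shorter than short_gap, delegate the rest."""
    out = list(series)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        end = i  # the gap covers out[start:end]
        left = out[start - 1] if start > 0 else None
        right = out[end] if end < n else None
        length = end - start
        if length < short_gap and left is not None:
            fill = [left] * length  # forward fill for short gaps
        else:
            fill = long_gap_model(left, right, length)
        out[start:end] = fill
    return out


# Usage: one short gap (2 steps) and one long gap (3 steps).
filled = impute_gaps([1.0, None, None, 4.0, None, None, None, 8.0])
# filled == [1.0, 1.0, 1.0, 4.0, 5.0, 6.0, 7.0, 8.0]
```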

3.3.2 Multi-source data fusion

Weather data, including temperature, humidity, solar radiation, and wind speed, is normalized to a common scale for consistent model input. Occupancy data is processed through multi-modal fusion of Wi-Fi analytics, access logs, CO₂ sensors, and motion detectors to ensure accurate real-time representation. Additionally, building characteristics such as BIM features, HVAC specifications, and spatial layouts are extracted to model building-specific energy dynamics.

Normalization

For weather and occupancy features

\begin{aligned} x^{'} = \frac{x - x_{min}}{x_{max} - x_{min}} \end{aligned}

(3)

Weighted Data Fusion for multiple occupancy sensors

\begin{aligned} O_{t} = \sum_{s = 1}^{S} w_{s} \cdot o_{t}^{s}, \quad \sum_{s = 1}^{S} w_{s} = 1 \end{aligned}

(4)
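As a hedged illustration of Equations (3) and (4), the following NumPy sketch rescales a feature to [0, 1] and fuses several occupancy signals with weights that sum to one. The function names and the example weights are assumptions made for illustration; the source does not state how the sensor weights are chosen.

```python
import numpy as np

def min_max(x):
    """Equation (3): rescale values to [0, 1] using the observed min and max."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x)  # constant signal: nothing to rescale
    return (x - x.min()) / span

def fuse_occupancy(signals, weights):
    """Equation (4): weighted sum over S sensor signals.

    signals has shape (S, T); weights are renormalized so they sum to 1.
    """
    signals = np.asarray(signals, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return w @ signals  # shape (T,)

co2_scaled = min_max([350, 1425, 2500])
fused = fuse_occupancy([[0, 1], [1, 1]], weights=[1, 3])
```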

3.3.3 Dynamic feature extraction for occupant behavior

Real-time occupancy inference is achieved through multi-sensor fusion algorithms that capture and interpret instantaneous occupancy across different building zones. Behavioral pattern mining complements this by extracting temporal activity sequences to uncover recurring occupancy trends. To ensure predictive robustness, dynamic feature selection is applied, in which features are adaptively weighted according to their contribution to accuracy and the variability in occupant behavior, as given in Equations (5)–(7):

Lag Features

\begin{aligned} E_{t}^{lag} = [E_{t - 1}, E_{t - 2}, \dots, E_{t - L}] \end{aligned}

(5)

Calendar Encoding

\begin{aligned} C_{t} = [\mathrm{isWeekend}_{t}, \mathrm{isHoliday}_{t}, \sin (\frac{2 π t}{24}), \cos (\frac{2 π t}{24})] \end{aligned}

(6)

Dynamic Feature Selection (Adaptive Weighting):

\begin{aligned} f_{i}^{adj} = f_{i} \cdot \frac{\mathrm{importance} (f_{i})}{\sum_{j} \mathrm{importance} (f_{j})} \end{aligned}

(7)
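The three feature constructions in Equations (5)–(7) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the function names are hypothetical, the time index in the calendar encoding is taken to be the hour of day, and the importance scores are assumed to come from some external estimator that the source does not specify.

```python
import numpy as np

def lag_matrix(energy, L):
    """Equation (5): row for time t holds [E_{t-1}, ..., E_{t-L}], for t = L..n-1."""
    e = np.asarray(energy, dtype=float)
    return np.stack([e[L - l: len(e) - l] for l in range(1, L + 1)], axis=1)

def calendar_encoding(hour, is_weekend, is_holiday):
    """Equation (6): binary day flags plus a cyclic encoding of the hour of day."""
    angle = 2.0 * np.pi * hour / 24.0
    return np.array([float(is_weekend), float(is_holiday),
                     np.sin(angle), np.cos(angle)])

def adaptive_weights(features, importance):
    """Equation (7): scale each feature by its share of the total importance."""
    imp = np.asarray(importance, dtype=float)
    return np.asarray(features, dtype=float) * (imp / imp.sum())

lags = lag_matrix([1, 2, 3, 4, 5], L=2)
cal = calendar_encoding(hour=6, is_weekend=True, is_holiday=False)
adj = adaptive_weights([10.0, 10.0], importance=[1.0, 3.0])
```

The sine and cosine pair keeps 23:00 and 00:00 close together in feature space, which a raw hour index would not.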

3.3.4 Temporal feature engineering

Multi-scale time windows are applied by computing features over intervals of 15 min, 1 h, 6 h, 24 h, and one week to capture both short- and long-term temporal dependencies. Lag features are built from historical energy consumption over 1–24 h intervals to capture usage trends. Calendar features, including holidays, weekends, and seasonal variations, are encoded to provide temporal context. Additionally, weather derivatives such as heat index, wind chill, and apparent temperature are computed to account for environmental influences on energy dynamics.

Multi-scale aggregation

\begin{aligned} X_{t}^{agg} = \frac{1}{Δ t} \sum_{k = t - Δ t + 1}^{t} X_{k} \end{aligned}

(8)
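A minimal sketch of the multi-scale aggregation in Equation (8), assuming 15-minute samples and a trailing window that averages the last Δt samples ending at time t. The window lengths in `WINDOWS_15MIN` follow the intervals named in the text; the function names and the dictionary are illustrative assumptions.

```python
import numpy as np

# Window lengths in samples for 15-minute data (assumed mapping).
WINDOWS_15MIN = {"1h": 4, "6h": 24, "24h": 96, "1w": 672}

def trailing_mean(x, window):
    """Mean of the last `window` samples ending at each t; NaN until the window is full."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    if window > len(x):
        return out
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

def multi_scale_features(x, windows=WINDOWS_15MIN):
    """One aggregated series per time scale."""
    return {name: trailing_mean(x, w) for name, w in windows.items()}

hourly = trailing_mean(np.arange(8), window=4)
```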

Weather Derivatives

\begin{aligned} HI & = - 42.379 + 2.04901523 T + 10.14333127 H - 0.22475541 T H - 6.83783 \cdot 10^{- 3} T^{2} \\ & \quad - 5.481717 \cdot 10^{- 2} H^{2} \end{aligned}

3.4 Hybrid neural network design

3.4.1 CNN component

The CNN component employs a 1D convolutional neural network to extract spatial features that capture building-zone patterns and equipment interactions. Multi-scale temporal convolutions with kernel sizes of 4, 6, and 16 are integrated to handle diverse time scales. The architecture consists of three convolutional layers with 64, 128, and 256 filters, each followed by max pooling, ReLU activation, batch normalization, and a dropout layer with a rate of 0.2 to improve generalization and limit overfitting.

1D Convolution

\begin{aligned} h_{i}^{l} = σ (\sum_{{k = 0}}^{{K - 1}} w_{k}^{l} x_{i + k}^{l - 1} + b^{l}) \end{aligned}

(9)

Where $K$ is the kernel size, $k$ indexes positions within the kernel, $w^{l}$ are the filter weights, $b^{l}$ is the bias, and $σ$ is the ReLU activation.
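To make Equation (9) concrete, here is a small NumPy sketch of a single-filter, single-channel 1D convolution followed by ReLU. It is a didactic reduction, not the three-layer architecture described above; the function name `conv1d_relu` is an assumption.

```python
import numpy as np

def conv1d_relu(x, w, b=0.0):
    """Equation (9) for one filter: h_i = ReLU(sum_k w_k * x_{i+k} + b).

    Uses "valid" positions only, so the output has len(x) - K + 1 entries.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    K = len(w)
    pre_activation = np.array(
        [np.dot(w, x[i:i + K]) + b for i in range(len(x) - K + 1)]
    )
    return np.maximum(pre_activation, 0.0)

rising = conv1d_relu([1, 2, 3, 4], w=[-1, 1])   # responds to increases
falling = conv1d_relu([1, 2, 3, 4], w=[1, -1])  # clipped to zero by ReLU
```

A filter such as `[-1, 1]` responds to rising load, which is the kind of local pattern the convolutional layers are meant to pick up.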

3.4.2 LSTM component

The BiLSTM component processes sequences in both forward and backward directions, allowing it to capture temporal dependencies in occupant and energy patterns. Its architecture comprises two BiLSTM layers with 128 and 64 units, linked by residual connections to improve information flow. A self-attention mechanism is integrated to identify and emphasize long-term patterns within the data. To keep training stable, gradient clipping at ±1.0 and LSTM cell regularization are applied, which prevents gradient explosion. The cell updates and attention weights are given in Equations (10)–(16):

\begin{aligned} f_{t} & = σ (W_{f} \cdot [h_{{t - 1}}, x_{t}] + b_{f}) \end{aligned}

(10)

\begin{aligned} i_{t} & = σ (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i}) \end{aligned}

(11)

\begin{aligned} h_{t} & = o_{t} * \tanh (C_{t}) \end{aligned}

(12)

\begin{aligned} o_{t} & = σ (W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o}) \end{aligned}

(13)

\begin{aligned} {\tilde{C}}_{t} & = \tanh (W_{C} \cdot [h_{t - 1}, x_{t}] + b_{C}) \end{aligned}

(14)

\begin{aligned} C_{t} & = f_{t} * C_{t - 1} + i_{t} * {\tilde{C}}_{t} \end{aligned}

(15)

\begin{aligned} α_{t} & = \frac{\exp (e_{t})}{\sum_{k} \exp (e_{k})}, \quad e_{t} = v^{T} \tanh (W_{h} h_{t} + b_{h}) \end{aligned}

(16)
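The gate equations and the attention weights in Equations (10)–(16) can be traced with a short NumPy sketch of one unidirectional LSTM step. This is a didactic reduction under stated assumptions: the dictionary layout of the weights, the function names, and the omission of the backward direction and residual connections are choices made for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W and b are dicts keyed by gate name 'f', 'i', 'o', 'C'."""
    z = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    f = sigmoid(W["f"] @ z + b["f"])           # forget gate, Eq. (10)
    i = sigmoid(W["i"] @ z + b["i"])           # input gate, Eq. (11)
    o = sigmoid(W["o"] @ z + b["o"])           # output gate, Eq. (13)
    c_tilde = np.tanh(W["C"] @ z + b["C"])     # candidate state, Eq. (14)
    c = f * c_prev + i * c_tilde               # cell state, Eq. (15)
    h = o * np.tanh(c)                         # hidden state, Eq. (12)
    return h, c

def attention_weights(H, W_h, b_h, v):
    """Eq. (16): softmax over scores e_t = v^T tanh(W_h h_t + b_h); H has shape (T, d)."""
    e = np.tanh(H @ W_h.T + b_h) @ v
    e = e - e.max()                            # stabilize the exponentials
    a = np.exp(e)
    return a / a.sum()

d_in, d_h = 3, 2
W = {g: np.zeros((d_h, d_h + d_in)) for g in "fioC"}
b = {g: np.zeros(d_h) for g in "fioC"}
h, c = lstm_step(np.ones(d_in), np.zeros(d_h), np.ones(d_h), W, b)
H = np.random.default_rng(0).normal(size=(5, d_h))
alpha = attention_weights(H, np.ones((4, d_h)), np.zeros(4), np.ones(4))
```

With all-zero weights every gate evaluates to 0.5, so the cell state is simply halved; this makes the data flow easy to check by hand.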

3.4.3 Hybrid integration

Feature fusion is achieved by concatenating the outputs of the CNN and LSTM components, which are then passed through dense layers for multi-modal integration. The model adopts a multi-task learning approach, jointly predicting energy consumption, occupancy patterns, and the associated uncertainty. Training uses a weighted loss function that combines mean squared error for energy prediction, an occupancy classification loss, and an uncertainty penalty to balance accuracy with reliability.

Feature Fusion

\begin{aligned} z = \mathrm{concat} (h_{CNN}, h_{LSTM}) \end{aligned}

(17)

Loss function: $L_{total}$ as defined above

3.5 Dynamic occupant behavior modeling

3.5.1 Occupancy pattern recognition

Occupancy pattern recognition is carried out with clustering-based methods: K-means++ is applied to extract daily and weekly occupancy profiles that reflect recurring behavioral trends. Real-time occupancy inference is further refined using Bayesian filtering, which integrates multi-modal sensor data for accurate and adaptive estimation. To capture evolving dynamics, behavioral change detection is performed with the CUSUM algorithm, which identifies sudden or gradual shifts in occupancy patterns over time.

Clustering-Based Pattern Identification

\begin{aligned} \min \sum_{i = 1}^{N} \sum_{k = 1}^{K} r_{i k} \| x_{i} - μ_{k} \|^{2} \end{aligned}

(18)

Where $r_{i k}$ =1 if $x_{i}$ belongs to cluster k, else 0.

CUSUM for drift detection

\begin{aligned} S_{t} = \max (0, S_{t - 1} + x_{t} - μ_{0} - k) \end{aligned}

(19)
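The one-sided CUSUM recursion of Equation (19) can be sketched in plain Python. In the usual formulation, μ₀ is the reference (in-control) mean and k is a slack value that absorbs small fluctuations. The decision threshold `h` and the reset after an alarm are assumptions added for illustration, since the source does not state how an alarm is declared.

```python
def cusum_upper(x, mu0, k, h):
    """Return the indices at which the upper CUSUM statistic exceeds h.

    S_t = max(0, S_{t-1} + x_t - mu0 - k); S is reset to 0 after each alarm
    (the threshold h and the reset are illustrative assumptions).
    """
    s = 0.0
    alarms = []
    for t, x_t in enumerate(x):
        s = max(0.0, s + x_t - mu0 - k)
        if s > h:
            alarms.append(t)
            s = 0.0
    return alarms

# Occupancy level shifts upward at index 3.
alarms = cusum_upper([0, 0, 0, 2, 2, 2], mu0=0.0, k=0.5, h=2.0)
```

Here the statistic stays at zero before the shift, reaches 1.5 at index 3, and crosses the threshold at index 4.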

3.5.2 Adaptive behavior integration

Dynamic occupant behavior modeling uses dynamic weighting, in which feature importance is adjusted according to confidence levels to improve prediction reliability. Online learning updates occupancy patterns incrementally, so the model adapts to new behaviors without full retraining. To capture long-term changes, temporal behavior evolution is modeled with Hidden Markov Models, which track transitions between occupancy states and provide insight into evolving occupant dynamics.

Hidden Markov Model for State Transitions

\begin{aligned} P (S_{t} = j | S_{t - 1} = i) = A_{i j}, \quad P (O_{t} = k | S_{t} = j) = B_{j k} \end{aligned}

(20)
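As a hedged illustration of how the transition matrix A and emission matrix B in Equation (20) can be used online, the sketch below runs standard HMM forward filtering to track the probability of each occupancy state given the observations so far. The two-state example values and the function name `hmm_filter` are assumptions; the source does not give the number of states or the parameter values.

```python
import numpy as np

def hmm_filter(A, B, pi, obs):
    """Forward filtering: P(S_t | O_1..O_t) for each t.

    A[i, j] = P(S_t = j | S_{t-1} = i), B[j, k] = P(O_t = k | S_t = j),
    pi is the initial state distribution, obs is a list of observation indices.
    """
    A, B, pi = (np.asarray(m, dtype=float) for m in (A, B, pi))
    belief = pi * B[:, obs[0]]
    belief = belief / belief.sum()
    beliefs = [belief]
    for o in obs[1:]:
        belief = (A.T @ belief) * B[:, o]   # predict, then weight by the emission
        belief = belief / belief.sum()
        beliefs.append(belief)
    return np.array(beliefs)

# Hypothetical two-state example: state 0 = vacant, state 1 = occupied.
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.8, 0.2], [0.3, 0.7]]      # columns: motion not detected / detected
beliefs = hmm_filter(A, B, pi=[0.5, 0.5], obs=[0, 1, 1])
```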

3.6 Uncertainty quantification framework

Bayesian Neural Network

\begin{aligned} p (θ | D) = \frac{p (D | θ) p (θ)}{p (D)} \end{aligned}

(21)

Monte Carlo Dropout for Epistemic Uncertainty

\begin{aligned} \hat{y} = \frac{1}{T} \sum_{t = 1}^{T} {\hat{y}}_{t}, \quad σ^{2} = \frac{1}{T} \sum_{t = 1}^{T} ({\hat{y}}_{t} - \hat{y})^{2} \end{aligned}

(22)
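The Monte Carlo dropout estimate in Equation (22) amounts to running T stochastic forward passes with dropout left on and taking the mean and variance of the predictions. The sketch below shows this with a toy linear predictor in NumPy; the model, the function name, and the dropout rate are assumptions for illustration and stand in for the actual network.

```python
import numpy as np

def mc_dropout_predict(x, w, p_drop, T, rng):
    """Mean and variance over T forward passes of a toy linear model with dropout.

    Each pass drops weights independently with probability p_drop (< 1) and
    rescales by 1 / (1 - p_drop), as in inverted dropout.
    """
    preds = []
    for _ in range(T):
        mask = rng.random(w.shape) >= p_drop
        preds.append(x @ (w * mask) / (1.0 - p_drop))
    preds = np.array(preds)
    return preds.mean(), preds.var()  # predictive mean and epistemic variance

rng = np.random.default_rng(0)
x, w = np.ones(4), np.ones(4)
mean_off, var_off = mc_dropout_predict(x, w, p_drop=0.0, T=10, rng=rng)
mean_on, var_on = mc_dropout_predict(x, w, p_drop=0.5, T=200, rng=rng)
```

With dropout disabled every pass is identical and the variance collapses to zero, which is why dropout must stay active at inference time for this estimator.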

3.6.1 Bayesian neural network integration

Bayesian neural network integration employs variational inference with a mean-field approximation to estimate posterior distributions, enabling probabilistic modeling of uncertainty. Gaussian priors are defined, and the posterior parameters are learned during training to capture model variability. For computational efficiency, the reparameterization trick and local reparameterization are applied, which gives faster and more stable inference while preserving accuracy.

3.6.2 Adaptive uncertainty estimation

Adaptive uncertainty estimation introduces dynamic confidence intervals that evolve over time and are adapted according to historical prediction accuracy. The framework distinguishes between epistemic uncertainty, which arises from model limitations, and aleatoric uncertainty, which is caused mainly by inherent data variability. Reliability is assessed using calibration plots, reliability diagrams, and correlation analysis, which quantify how well the uncertainty estimates align with observed outcomes.

3.7 Real-time implementation

Online Feature Update

\begin{aligned} μ_{t} = μ_{t - 1} + \frac{x_{t} - μ_{t - 1}}{t}, \quad σ_{t}^{2} = \frac{(t - 1) σ_{t - 1}^{2} + (x_{t} - μ_{t - 1}) (x_{t} - μ_{t})}{t} \end{aligned}

(23)
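The incremental update in Equation (23) corresponds to Welford's online algorithm. A minimal Python sketch follows; it accumulates the sum of squared deviations and divides by the sample count to obtain the population variance. The class name `RunningStats` is an illustrative assumption.

```python
class RunningStats:
    """Online mean and population variance (Welford's algorithm)."""

    def __init__(self):
        self.t = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.t += 1
        delta = x - self.mean            # x_t - mu_{t-1}
        self.mean += delta / self.t      # mu_t
        self._m2 += delta * (x - self.mean)  # (x_t - mu_{t-1}) * (x_t - mu_t)

    @property
    def variance(self):
        return self._m2 / self.t if self.t else 0.0

stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(value)
```

Only three numbers are stored regardless of stream length, which suits the memory limits of edge devices.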

3.7.1 Edge computing optimization

Edge computing optimization focuses on improving real-time inference efficiency through model quantization, where INT8 conversion accelerates computation without major accuracy loss. Memory is managed with circular buffers that allow continuous streaming data to be processed in place. In addition, multi-threaded feature extraction and prediction provide the throughput and responsiveness needed for edge deployment.
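To illustrate what INT8 conversion involves, the following NumPy sketch applies symmetric per-tensor quantization to a weight array and maps it back to floating point. This is one common scheme and an assumption on our part; the source does not state which quantization method or toolchain was used, and the function names are hypothetical.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: w is approximated by scale * q, q in [-127, 127]."""
    w = np.asarray(w, dtype=float)
    scale = np.abs(w).max() / 127.0
    if scale == 0:
        scale = 1.0  # all-zero tensor: any scale works
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map INT8 codes back to approximate floating-point weights."""
    return q.astype(float) * scale

weights = np.array([-1.0, 0.0, 0.3, 1.0])
q, scale = quantize_int8(weights)
max_error = np.max(np.abs(dequantize(q, scale) - weights))
```

The rounding error per weight is bounded by half the scale, which is why the accuracy loss is usually small when the weight range is well behaved.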

3.7.2 Model compression techniques

Model compression techniques are applied to reduce computational load and enable lightweight deployment. Knowledge distillation within a teacher-student framework transfers knowledge to a smaller model. Pruning further reduces the parameter count, with structured pruning for CNN layers and magnitude-based pruning for dense layers. Hardware acceleration through CUDA and TensorRT integration increases execution speed, making the system suitable for real-time applications.
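Magnitude-based pruning, as used for the dense layers, can be sketched as zeroing the fraction of weights with the smallest absolute values. The NumPy example below is a minimal illustration; the `sparsity` parameter and the function name are assumptions, and the source does not report the pruning ratio that was applied.

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out roughly the `sparsity` fraction of weights with smallest magnitude.

    Ties at the threshold are all pruned, so the result can be slightly sparser.
    """
    w = np.asarray(w, dtype=float)
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

pruned = magnitude_prune([0.1, -0.5, 0.05, 2.0], sparsity=0.5)
```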

3.7.3 Streaming data processing

Streaming data processing handles continuous input streams by computing online features with incremental statistics over rolling time windows. Data quality is preserved through real-time outlier detection and imputation of missing values. To minimize delay, latency optimization techniques such as pipeline parallelization and batch processing are employed for the streaming energy and occupancy data.
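A fixed-size circular buffer with an incrementally maintained sum is one simple way to compute rolling-window features on a stream. The sketch below uses `collections.deque` with `maxlen` from the Python standard library; the class name `RollingWindow` and its interface are illustrative assumptions.

```python
from collections import deque

class RollingWindow:
    """Fixed-size circular buffer with an O(1) rolling mean."""

    def __init__(self, size):
        self.buf = deque(maxlen=size)
        self.total = 0.0

    def push(self, x):
        if len(self.buf) == self.buf.maxlen:
            self.total -= self.buf[0]  # value about to be evicted
        self.buf.append(x)             # deque drops the oldest item itself
        self.total += x

    def mean(self):
        return self.total / len(self.buf) if self.buf else float("nan")

window = RollingWindow(size=3)
for reading in [1.0, 2.0, 3.0, 4.0]:
    window.push(reading)
```

For very long streams the running sum can accumulate floating-point drift, so recomputing it from the buffer at intervals is a reasonable safeguard.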

3.8 Pseudocode

Algorithm 1:
Hybrid CNN-LSTM Occupancy-Energy Prediction with Behavior Modeling and Uncertainty Quantification

Step 1: Data Preprocessing & Feature Fusion
- Normalize weather data $W_{t}$
- Fuse multiple occupancy sensors into $O_{t}$
- Combine features $X_{t} = [W_{t}, O_{t}, B]$

Step 2: Temporal Feature Engineering
- Create lag features from past energy data
- Encode calendar info (weekends, holidays, time of day)
- Compute weather-derived features (e.g., Heat Index)
- Form final feature vector $X'_{t} = [X_{t}, \text{lag features}, \text{calendar}, \text{weather derivatives}]$

Step 3: Spatial Feature Extraction (CNN)
- Apply convolutional layers on $X'_{t}$
- Use activation, pooling, normalization
- Extract spatial features $F_{CNN}$

Step 4: Temporal Feature Extraction (LSTM + Attention)
- Apply LSTM layers on sequential features
- Use attention mechanism to weigh important time steps
- Extract temporal features $F_{LSTM}$

Step 5: Feature Fusion & Prediction
- Fuse spatial and temporal features $Z_{t} = [F_{CNN}, F_{LSTM}]$
- Predict outputs: $\hat{E}$, $\hat{O}$
- Compute combined loss (energy, occupancy, uncertainty)

Step 6: Dynamic Occupant Behavior Modeling
- Cluster occupancy patterns (K-means)
- Detect behavior changes (CUSUM)
- Model temporal evolution (HMM)

Step 7: Uncertainty Quantification
- Apply Bayesian neural network / Monte Carlo dropout
- Estimate prediction uncertainty $σ$

Step 8: Real-Time Online Update
- Update running mean and variance of inputs
- Update feature vector with latest data
- Predict real-time outputs and uncertainty
- Adjust fusion weights based on uncertainty

Return: $\hat{E}$, $\hat{O}$, $σ$

4. Experimental setup and data

4.1 Comprehensive dataset description

4.1.1 Target cities and climate zones

The experimental validation of the proposed AU-HNN framework relies on a large-scale dataset collected from eight representative Chinese cities, chosen to cover a wide range of climatic conditions and urban energy demand profiles. Beijing lies in the continental zone and is characterized by harsh winters and large seasonal fluctuations, which makes it a suitable test case for heating-dominant energy consumption. Shanghai belongs to the subtropical zone, with high humidity and a more balanced demand between cooling and heating, reflecting the challenges of mixed-load forecasting. Guangzhou and Shenzhen represent tropical climates in which cooling dominates throughout the year, but with distinct occupancy variations: Guangzhou exhibits high residential variability, while Shenzhen, as a hub of technology-driven enterprises, shows highly dynamic work-related occupancy patterns. Hangzhou contributes a mix of government and commercial buildings, Nanjing emphasizes educational and healthcare facilities with relatively regular but load-intensive consumption, and Wuhan provides energy usage signatures driven by industrial and research facilities. Finally, Chengdu adds architectural and operational heterogeneity by combining traditional low-rise structures with modern smart buildings. By covering continental, tropical, and subtropical climate zones, the dataset allows AU-HNN to be evaluated in diverse real-world contexts and supports the model's generalizability across building types, occupancy behaviors, and weather-driven loads.

4.1.2 Detailed data collection specifications

The sampling rate yields approximately 140,160 data points per building annually, which is sufficient to capture both long-term seasonal patterns and short-term occupant-driven variations. A total of 45 buildings were monitored across the eight cities, with five to six buildings selected per location to cover offices, hospitals, educational institutes, industrial facilities, and government buildings. In total, the dataset contains nearly 6.3 million data points, making it one of the largest high-resolution building energy datasets available in the Chinese context. Rigorous preprocessing was performed to ensure reliability and completeness. Missing values were imputed using a combination of k-nearest neighbors and temporal interpolation, resulting in more than 99.2% data completeness after cleaning. Outliers, such as abnormal spikes or drops in energy consumption caused by sensor errors, were detected using z-score thresholds at three standard deviations and corrected using moving-window smoothing. This careful preparation ensures that the dataset is of publication-grade quality and can serve as a benchmark for subsequent studies.
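The outlier step can be illustrated with a short NumPy sketch: points whose z-score exceeds three standard deviations are flagged and replaced by the mean of their non-flagged neighbors in a centered window. The window length, the use of a global mean and standard deviation, and the function name are assumptions, since the source does not give these details.

```python
import numpy as np

def correct_outliers(x, z_thresh=3.0, window=5):
    """Flag |z| > z_thresh and replace each flagged point by the mean of
    its non-flagged neighbours inside a centred window (assumed scheme)."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    if std == 0:
        return x.copy(), np.zeros(len(x), dtype=bool)
    is_outlier = np.abs(x - x.mean()) / std > z_thresh
    fixed = x.copy()
    half = window // 2
    for t in np.flatnonzero(is_outlier):
        lo, hi = max(0, t - half), min(len(x), t + half + 1)
        neighbours = x[lo:hi][~is_outlier[lo:hi]]
        if neighbours.size:
            fixed[t] = neighbours.mean()
    return fixed, is_outlier

# A single sensor spike in an otherwise flat load profile.
load = [10.0] * 10 + [1000.0] + [10.0] * 10
cleaned, flags = correct_outliers(load)
```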

4.1.3 Multi-source data integration

Energy consumption in buildings is influenced not only by physical and climatic conditions but also by human occupancy and behavioral patterns. To account for these multidimensional drivers, the dataset integrates weather, occupancy, energy, and building-level data streams into a unified resource. Meteorological information such as temperature, relative humidity, solar radiation, and wind speed was collected from municipal weather stations and temporally aligned with the energy readings. Occupancy data was derived from multiple sources: WiFi connection logs that indicate the number of active users in a building, CO₂ concentration levels that reflect indoor activity intensity, motion sensors embedded in HVAC and lighting systems, and access card swipes that record entry and exit behavior. Energy data included detailed measurements of HVAC consumption, lighting load, and equipment usage, while building data captured floor plans, HVAC system specifications, and official occupancy schedules. All sources were synchronized at the 15-min interval, and additional features such as rolling averages, lagged sequences, and occupancy–energy interaction terms were engineered to enhance the predictive power of the dataset. This integration of physical, environmental, and human-driven factors provides the foundation for the adaptive and uncertainty-aware design of AU-HNN, as summarized in Table 4:

Table 4.
Comprehensive dataset overview for experimental setup.

City (Zone)	Climate/Focus	Building types	Sampling frequency	Data points	Weather features	Occupancy features	Energy features	Building features
Beijing (Continental)	Extreme seasonal heating	Offices, Gov. buildings	15-min intervals	140,160/year	Temp, Humidity, Wind, Solar	WiFi, CO₂, Motion sensors	HVAC, Lighting, Equipment load	Floor plans, HVAC specs
Shanghai (Subtropical)	High humidity, balanced loads	Offices, Commercial sites	15-min intervals	140,160/year	Temp, Humidity, Solar, Wind	WiFi, Access cards, CO₂	HVAC, Lighting, Equipment load	Occupancy schedules
Guangzhou (Tropical)	Cooling-dominant, varied use	Residential, Mixed-use	15-min intervals	140,160/year	Temp, Solar radiation, Humidity	Motion sensors, WiFi logs	HVAC, Cooling demand, Equipment	Floor area data
Shenzhen (Tropical)	Tech-driven dynamic loads	Corporate, IT facilities	15-min intervals	140,160/year	Temp, Humidity, Wind speed	WiFi density, Access records	HVAC, Lighting, Device use	HVAC specs, Schedules
Hangzhou (Subtropical)	Gov. and commercial mix	Offices, Gov. sites	15-min intervals	140,160/year	Temp, Solar, Wind, Humidity	CO₂, Motion detection	HVAC, Lighting systems	Floor layouts
Nanjing (Subtropical)	Education, healthcare loads	Schools, Hospitals	15-min intervals	140,160/year	Temp, Humidity, Wind speed	Access cards, Motion sensors	HVAC, Medical equipment load	HVAC specifications
Wuhan (Subtropical)	Industrial/research facilities	Labs, Factories	15-min intervals	140,160/year	Temp, Wind, Solar, Humidity	WiFi, CO₂ trends	HVAC, Equipment demand	Occupancy plans
Chengdu (Subtropical)	Traditional/modern mix	Smart + old buildings	15-min intervals	140,160/year	Temp, Humidity, Solar, Wind	WiFi, Access cards	HVAC, Lighting, Device load

4.2 Experimental design and validation strategy

4.2.1 Performance-metrics

The evaluation strategy was designed to assess not only the predictive accuracy of AU-HNN but also its robustness under uncertainty and its efficiency in real-time deployment. To this end, a twelve-metric framework was implemented. Accuracy is quantified with five commonly adopted indicators: root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), the coefficient of determination (R²), and normalized root mean square error (NRMSE). Beyond accuracy, the reliability of the probabilistic forecasts is examined with four uncertainty metrics: prediction interval coverage probability (PICP), mean prediction interval width (MPIW), prediction interval normalized average width (PINAW), and the coverage width-based criterion (CWC), which jointly capture the calibration and sharpness of predictive intervals. Finally, recognizing the constraints of real-world building management systems, three efficiency-oriented metrics were measured: prediction latency, expressed as the average time per forecast in milliseconds; memory usage, measured in megabytes at runtime; and computational energy efficiency, calculated as floating-point operations per watt. This multi-objective evaluation protocol tests AU-HNN not only for accuracy but also for uncertainty awareness and scalability in practical deployments.

Prediction accuracy is quantified through five widely adopted indicators. The Root Mean Squared Error (RMSE) is defined as:

\begin{aligned} RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2}} \end{aligned}

(24)

Where $y_{i}$ and ${\hat{y}}_{i}$ denote the actual and predicted values. RMSE penalizes larger deviations more heavily, which makes it effective for energy load prediction. Mean Absolute Error (MAE) measures the average error magnitude and is expressed as

\begin{aligned} MAE = \frac{1}{n} \sum_{i = 1}^{n} | y_{i} - {\hat{y}}_{i} | \end{aligned}

(25)

It is less sensitive to outliers than RMSE. Mean Absolute Percentage Error (MAPE) normalizes error magnitude relative to actual demand and is formulated as

\begin{aligned} MAPE = \frac{100}{n} \sum_{i = 1}^{n} \left| \frac{y_{i} - {\hat{y}}_{i}}{y_{i}} \right| \end{aligned}

(26)

It provides interpretability in percentage terms. The Coefficient of Determination (R²) evaluates the variance explained by the model:

\begin{aligned} R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}} \end{aligned}

(27)

Where $\bar{y}$ is the mean of the observations. Finally, Normalized RMSE (NRMSE) provides scale-invariant comparison:

\begin{aligned} NRMSE = \frac{RMSE}{y_{max} - y_{min}} \end{aligned}

(28)

It is critical when comparing across heterogeneous building scales.
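The five accuracy indicators in Equations (24)–(28) can be computed together in a few lines of NumPy. The sketch below is illustrative; the function name `accuracy_metrics` is an assumption. R² uses the mean of the observations in the denominator, which is the standard definition, and MAPE assumes that no actual value is zero.

```python
import numpy as np

def accuracy_metrics(y, y_hat):
    """RMSE, MAE, MAPE (%), R^2 and NRMSE for paired actual/predicted values."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    err = y - y_hat
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    mape = 100.0 * np.mean(np.abs(err / y))          # requires y != 0
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    nrmse = rmse / (y.max() - y.min())
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "R2": r2, "NRMSE": nrmse}

scores = accuracy_metrics(y=[1, 2, 3, 4], y_hat=[1, 2, 3, 5])
```

In this toy example a single error of one unit gives RMSE 0.5, MAE 0.25, MAPE 6.25%, and R² 0.8.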

Four uncertainty-aware indicators are adopted to validate predictive intervals. Prediction Interval Coverage Probability (PICP) is given as:

\begin{aligned} PICP = \frac{1}{n} \sum_{i = 1}^{n} c_{i}, \quad c_{i} = \begin{cases} 1, & y_{i} \in [L_{i}, U_{i}] \\ 0, & \text{otherwise} \end{cases} \end{aligned}

(29)

Where $[L_{i}, U_{i}]$ is the prediction interval. Mean Prediction Interval Width (MPIW) evaluates the sharpness of intervals:

\begin{aligned} MPIW = \frac{1}{n} \sum_{i = 1}^{n} (U_{i} - L_{i}) \end{aligned}

(30)

While Prediction Interval Normalized Average Width (PINAW) scales MPIW relative to data range:

\begin{aligned} PINAW = \frac{1}{n (y_{max} - y_{min})} \sum_{i = 1}^{n} (U_{i} - L_{i}) \end{aligned}

(31)

It ensures comparability across cities. Finally, Coverage Width-based Criterion (CWC) integrates both reliability and sharpness using a penalty function:

\begin{aligned} CWC = PINAW \cdot [1 + γ \cdot \exp (- η (PICP - μ))] \end{aligned}

(32)

where $γ$ and $η$ are penalty coefficients, and $μ$ is the nominal confidence level (e.g., 95%).
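The four interval metrics in Equations (29)–(32) can be evaluated jointly, as in the NumPy sketch below. The default values of `gamma` and `eta` are placeholders chosen for illustration; the source does not report the penalty coefficients that were used, and the function name is an assumption.

```python
import numpy as np

def interval_metrics(y, lower, upper, mu=0.95, gamma=1.0, eta=50.0):
    """PICP, MPIW, PINAW and CWC for prediction intervals [lower, upper]."""
    y, lower, upper = (np.asarray(a, dtype=float) for a in (y, lower, upper))
    covered = (y >= lower) & (y <= upper)
    picp = covered.mean()                       # share of targets inside the interval
    mpiw = (upper - lower).mean()               # average interval width
    pinaw = mpiw / (y.max() - y.min())          # width relative to the data range
    cwc = pinaw * (1.0 + gamma * np.exp(-eta * (picp - mu)))
    return {"PICP": picp, "MPIW": mpiw, "PINAW": pinaw, "CWC": cwc}

m = interval_metrics(
    y=[1, 2, 3, 4],
    lower=[0, 1, 3.5, 3],
    upper=[2, 3, 4, 5],
    eta=10.0,
)
```

Because the exponential term grows quickly when PICP falls below μ, narrow intervals that miss the target are penalized much more heavily than wide intervals that cover it.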

Three measures of computational efficiency are used to assess the feasibility of practical deployment. Prediction latency measures the inference time per sample (in milliseconds). Memory usage captures the storage overhead of parameters and intermediate states (in MB). Energy efficiency captures the computational power required for inference (in FLOPS per watt), a sustainability consideration that matters for real-time deployment in intelligent buildings.

4.2.2 Model comparisons

To situate AU-HNN within the wider research landscape, its performance was compared with 10 well-known baseline models spanning statistical, machine learning, deep learning, and uncertainty-aware approaches. The traditional category includes ARIMA, a long-established linear autoregressive approach to time series, and Prophet, a seasonal-trend decomposition model created by Facebook. The machine learning category includes XGBoost, a gradient-boosted decision tree ensemble, and Random Forest, an ensemble of randomized decision trees. The deep learning baselines consist of CNN-LSTM, which combines spatial feature extraction with temporal modeling; BiLSTM, which processes information in both directions; Transformer, which uses self-attention to model sequences; and GRU, a simplified recurrent architecture. Bayesian-LSTM and Monte Carlo Dropout CNN serve as uncertainty-aware baselines for probabilistic forecasting. This collection reflects the state of the art within each methodological family and allows a fair and comprehensive comparison. AU-HNN extends these baselines by integrating hybrid feature extraction, online incremental learning, and Bayesian uncertainty estimation, targeting the multidimensional nature of the real-time building energy forecasting problem.

To conduct the benchmark analysis, ten models are considered, chosen to reflect a balanced representation of traditional, machine learning, deep learning, and uncertainty-aware techniques as provided in Table 5:

Table 5.
Performance analysis of ten models for building energy prediction.

Model	Category	Core principle	Strength	Limitation	Key input type	Temporal handling	Uncertainty capability	Suitability for buildings
ARIMA	Traditional	Linear autoregression	Good for short-term trends	Poor nonlinear capture	Time-series values	Sequential lag-based	None	Basic baseline for linear demand
Prophet	Traditional	Additive decomposition	Handles seasonality, holidays	Limited nonlinear adaptability	Time-series + calendar	Seasonal decomposition	None	Suitable for occupancy-driven cycles
XGBoost	ML	Gradient boosting trees	Captures nonlinear dependencies	Needs careful tuning	Multi-source features	Indirect via features	None	Robust across heterogeneous datasets
Random Forest	ML	Bagging decision trees	High robustness, low overfitting	Less efficient with high dims	Multi-source features	Indirect via averaging	None	Reliable for diverse building types
CNN-LSTM	DL	Conv + recurrent layers	Captures spatial-temporal fusion	High training cost	Weather + energy data	Long-term sequential	Limited via dropout	Effective for coupled weather-energy loads
BiLSTM	DL	Bidirectional memory	Captures past + future context	Expensive for large datasets	Occupancy + energy	Bidirectional temporal	Limited via dropout	Strong for occupant-driven variation
Transformer	DL	Attention mechanism	Learns long-range dependencies	Requires large datasets	High-dim sequential	Parallel temporal fusion	None (baseline form)	Scalable for big energy data
GRU	DL	Gated recurrent unit	Lightweight and efficient	May underfit complex dynamics	Sequential energy data	Simplified temporal	Limited via dropout	Real-time adaptive prediction
Bayesian LSTM	Uncertainty	Probabilistic recurrent	Provides calibrated uncertainty	Computationally expensive	Sequential time-series	Sequential Bayesian	Strong	Risk-sensitive building control
MC-Dropout CNN	Uncertainty	Stochastic dropout	Fast approximate uncertainty	Less precise than Bayesian	Image + sequence data	Sequential convolutional	Moderate	Practical lightweight uncertainty baseline

4.3 Implementation details

The AU-HNN framework was implemented on a high-performance computing system with an NVIDIA RTX 4090 graphics card, an Intel i9-13900K processor, and 64 GB of random access memory, which was sufficient for the large-scale multi-source dataset. The software stack included PyTorch 2.0 as the main deep learning library, TensorFlow Probability to model and quantify uncertainty, and CUDA 11.8 for GPU execution. Training used a batch size of 64, a learning rate of 1e-4, and 200 epochs to balance convergence stability and computational efficiency. To improve generalization and avoid overfitting, hyperparameters were tuned with Bayesian optimization over 100 trials, systematically exploring learning rates, dropout ratios, and hidden-layer sizes and selecting the configuration that predicted energy most robustly under dynamically changing occupancy and environmental conditions.

5. Results and comprehensive analysis

5.1 Overall performance comparison

The overall comparison evaluates the proposed AU-HNN framework against ten baseline models across twelve metrics. As shown in Table 6, AU-HNN achieves the lowest error values and the highest accuracy scores, with clear improvements in both predictive accuracy and uncertainty calibration. Traditional statistical models such as ARIMA and Prophet provide only moderate accuracy and cannot capture nonlinear dynamics or quantify predictive uncertainty, which results in considerably higher RMSE and MAE values. Machine learning methods such as XGBoost and Random Forest reduce error by capturing complex nonlinearities, but remain limited in sequential learning and uncertainty handling. Deep learning approaches, including CNN-LSTM, BiLSTM, Transformer, and GRU, achieve further gains by exploiting temporal dependencies and hierarchical feature extraction, though at the cost of higher latency and resource consumption.

Table 6.
Comprehensive performance comparison of existing and proposed models.

Model	RMSE	MAE	MAPE (%)	R²	NRMSE	PICP	MPIW	PINAW	CWC	Latency (ms)	Memory (MB)	FLOPS/Watt
ARIMA	15.42	11.85	18.7	0.721	0.284	—	—	—	—	45	12	2.1 × 10⁶
Prophet	14.89	11.23	17.2	0.738	0.275	—	—	—	—	38	15	2.8 × 10⁶
XGBoost	12.78	9.64	14.8	0.812	0.236	—	—	—	—	25	45	4.2 × 10⁶
Random Forest	13.24	10.12	15.6	0.798	0.244	—	—	—	—	32	38	3.9 × 10⁶
CNN-LSTM	9.85	7.42	11.3	0.876	0.182	0.823	18.5	0.341	25.8	156	128	8.4 × 10⁶
BiLSTM	10.31	7.89	12.1	0.864	0.190	0.798	19.8	0.365	27.2	189	112	7.8 × 10⁶
Transformer	9.12	6.84	10.5	0.894	0.168	0.845	17.2	0.317	23.1	245	186	6.9 × 10⁶
GRU	10.67	8.21	12.8	0.853	0.197	0.789	20.4	0.376	28.5	142	95	8.7 × 10⁶
Bayesian-LSTM	8.94	6.72	10.2	0.898	0.165	0.892	15.8	0.291	19.4	298	156	5.8 × 10⁶
MC-Dropout CNN	9.56	7.18	11.0	0.881	0.176	0.867	16.9	0.312	21.7	201	134	7.2 × 10⁶
AU-HNN (Proposed)	7.42	5.58	8.4	0.923	0.137	0.948	12.3	0.227	14.2	189	142	9.8 × 10⁶

Note: Uncertainty metrics (PICP, MPIW, PINAW, CWC) are not reported for conventional models (ARIMA, Prophet, XGBoost, Random Forest) as they do not provide probabilistic predictions.

Table 7 summarizes the performance gains of AU-HNN over the best baseline on the main evaluation metrics.

Table 7.

Performance improvements of AU-HNN.

Metric	Best baseline	AU-HNN	Improvement
RMSE	8.94 (Bayesian-LSTM)	7.42	17.0% ↓
MAE	6.72 (Bayesian-LSTM)	5.58	16.9% ↓
MAPE	10.2% (Bayesian-LSTM)	8.4%	17.6% ↓
R²	0.898 (Bayesian-LSTM)	0.923	2.8% ↑
NRMSE	0.165 (Bayesian-LSTM)	0.137	16.9% ↓
PICP	0.892 (Bayesian-LSTM)	0.948	6.3% ↑
MPIW	15.8 (Bayesian-LSTM)	12.3	22.1% ↓
CWC	19.4 (Bayesian-LSTM)	14.2	26.8% ↓
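
The improvement column can be reproduced from the two value columns with a relative-change calculation. The sketch below assumes that "improvement" means the change relative to the best baseline, signed so that a positive number is always better; the function name `relative_improvement` and the dictionary layout are illustrative, and the input values are copied from Table 7.

```python
def relative_improvement(baseline: float, proposed: float,
                         lower_is_better: bool = True) -> float:
    """Percentage change relative to the baseline; positive means better."""
    if lower_is_better:
        return 100.0 * (baseline - proposed) / baseline
    return 100.0 * (proposed - baseline) / baseline


# metric: (best baseline, AU-HNN, lower_is_better), values from Table 7
TABLE_7 = {
    "RMSE": (8.94, 7.42, True),
    "MAE": (6.72, 5.58, True),
    "MAPE": (10.2, 8.4, True),
    "R2": (0.898, 0.923, False),
    "NRMSE": (0.165, 0.137, True),
    "PICP": (0.892, 0.948, False),
    "MPIW": (15.8, 12.3, True),
    "CWC": (19.4, 14.2, True),
}

if __name__ == "__main__":
    for metric, (baseline, proposed, lower) in TABLE_7.items():
        gain = relative_improvement(baseline, proposed, lower)
        print(f"{metric}: {gain:.1f}%")
```

Computed this way, the results agree with the tabulated improvements to within about 0.1 percentage points, a difference that comes from rounding in the last digit.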

5.1.1 RMSE (root mean square error)

RMSE indicates the magnitude of prediction errors, with lower values indicating higher accuracy. AU-HNN records the lowest RMSE, 7.42, outperforming all other models. Traditional statistical methods such as ARIMA and Prophet show errors above 14, which highlights their limitations. Overall, deep learning-based models clearly outperform classical approaches in reducing prediction error, as shown in Figure 3.

Figure 3.

RMSE comparison across models.

5.1.2 MAE (mean absolute error)

MAE measures the average magnitude of errors without considering their direction. AU-HNN achieves the lowest MAE, 5.58, which reflects consistent and reliable performance. Transformer and Bayesian-LSTM also deliver competitive accuracy, with values of about 6.7 to 6.8. In contrast, ARIMA and Prophet are less effective, with MAE values above 11, as shown in Figure 4.

Figure 4.

MAE comparison across models.

5.1.3 MAPE (mean absolute percentage error)

MAPE expresses prediction error as a percentage, which provides an intuitive measure of model reliability. AU-HNN again performs best at 8.4%, which indicates strong generalization across varying data. Bayesian-LSTM and Transformer also maintain relatively low percentage errors, close to 10%. Conversely, ARIMA reaches 18.7% and is much weaker at handling fluctuations, as shown in Figure 5.

Figure 5.

MAPE comparison across models.

5.1.4 R² (coefficient of determination)

R² measures how well the model fits the data, with higher values indicating greater explanatory power. AU-HNN achieves the highest score, 0.923, which indicates that it explains most of the variance in the data. Bayesian-LSTM and Transformer also score high, both above 0.89, which reflects robust modeling. Traditional approaches such as ARIMA and Prophet remain below 0.74, which indicates a weaker fit, as shown in Figure 6.

Figure 6.

R² comparison across models.

5.1.5 NRMSE (normalized RMSE)

NRMSE normalizes the error relative to the data range, which makes results more comparable across models. AU-HNN obtains the lowest NRMSE, 0.137. Bayesian-LSTM and Transformer follow at 0.165 and 0.168, respectively. ARIMA shows the worst performance at 0.284, as shown in Figure 7.

Figure 7.

NRMSE comparison across models.
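
The five point-accuracy metrics discussed above can be sketched in a few lines. This is a minimal illustration that uses the standard textbook definitions and normalizes NRMSE by the range of the observed values, as the text describes; the function name `point_metrics` and the sample numbers are hypothetical, and the paper's own formulas in its metrics section take precedence.

```python
import math


def point_metrics(y_true: list[float], y_pred: list[float]) -> dict[str, float]:
    """RMSE, MAE, MAPE (%), R2 and range-normalized RMSE for two equal-length series."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    sse = sum(e * e for e in errors)                 # sum of squared errors
    mean_true = sum(y_true) / n
    sst = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    rmse = math.sqrt(sse / n)
    return {
        "rmse": rmse,
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE is undefined when an observed value is zero.
        "mape": 100.0 / n * sum(abs(e / t) for e, t in zip(errors, y_true)),
        "r2": 1.0 - sse / sst,
        "nrmse": rmse / (max(y_true) - min(y_true)),
    }


if __name__ == "__main__":
    observed = [100.0, 120.0, 80.0, 100.0]   # made-up energy readings
    predicted = [110.0, 115.0, 85.0, 95.0]   # made-up forecasts
    for name, value in point_metrics(observed, predicted).items():
        print(f"{name}: {value:.3f}")
```

RMSE and MAE carry the unit of the target variable, whereas MAPE, R², and NRMSE are dimensionless, which is why only MAPE is reported as a percentage.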

5.1.6 PICP (prediction interval coverage probability)

PICP is the proportion of true values that fall within the prediction intervals. AU-HNN has the highest PICP, 0.948, which indicates reliable uncertainty estimation. Bayesian-LSTM and MC-Dropout CNN also perform well, with values above 0.86. GRU and BiLSTM lag behind, with coverage probabilities below 0.80, as shown in Figure 8.

Figure 8.

PICP comparison across models.

5.1.7 MPIW (mean prediction interval width)

MPIW measures the width of the prediction intervals, with smaller values indicating tighter confidence bounds. AU-HNN records the narrowest MPIW, 12.3, which indicates precise uncertainty quantification. Bayesian-LSTM and MC-Dropout CNN also maintain relatively small widths, between 15 and 17. Recurrent baselines such as GRU and BiLSTM produce wider intervals, above 19, and are therefore less informative, as shown in Figure 9.

Figure 9.

MPIW comparison across models.

5.1.8 PINAW (prediction interval normalized average width)

PINAW normalizes the interval width by the range of the target data, which makes comparisons more meaningful. AU-HNN achieves the lowest PINAW, 0.227, which reflects compact and informative intervals. Bayesian-LSTM is also competitive at 0.291. By contrast, GRU yields the widest normalized intervals at 0.376, as shown in Figure 10.

Figure 10.

PINAW comparison across models.

5.1.9 CWC (coverage width criterion)

CWC combines interval width and coverage into a single measure, balancing sharpness and reliability. AU-HNN attains the lowest CWC, 14.2, which indicates the best-calibrated uncertainty estimates among the compared models. Bayesian-LSTM, MC-Dropout CNN, and Transformer follow with values between 19 and 23. GRU has the highest CWC, 28.5, which indicates weaker calibration quality, as shown in Figure 11.

Figure 11.

CWC comparison across models.
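
The four interval metrics can be sketched in the same way. PICP, MPIW, and PINAW follow their usual definitions; for CWC the sketch assumes one commonly used formulation, in which the normalized width is penalized exponentially whenever coverage falls below the nominal level. The function name `interval_metrics`, the penalty constant `eta`, and the sample data are hypothetical, and the exact CWC definition and scaling used in the study may differ.

```python
import math


def interval_metrics(y_true: list[float], lower: list[float], upper: list[float],
                     nominal: float = 0.95, eta: float = 50.0) -> dict[str, float]:
    """Coverage and width metrics for a set of prediction intervals."""
    n = len(y_true)
    covered = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    picp = covered / n                                       # share of targets inside the interval
    mpiw = sum(hi - lo for lo, hi in zip(lower, upper)) / n  # mean interval width
    pinaw = mpiw / (max(y_true) - min(y_true))               # width relative to the target range
    # Assumed CWC form: penalize the width only when coverage is below nominal.
    penalty = math.exp(-eta * (picp - nominal)) if picp < nominal else 0.0
    return {"picp": picp, "mpiw": mpiw, "pinaw": pinaw, "cwc": pinaw * (1.0 + penalty)}


if __name__ == "__main__":
    observed = [100.0, 120.0, 80.0, 100.0]  # made-up energy readings
    low = [90.0, 110.0, 85.0, 90.0]         # made-up lower bounds
    high = [110.0, 125.0, 95.0, 115.0]      # made-up upper bounds
    for name, value in interval_metrics(observed, low, high).items():
        print(f"{name}: {value:.4f}")
```

Under this formulation a model cannot obtain a good score by shrinking its intervals alone: once coverage drops below the nominal level, the penalty term dominates the width term.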

5.1.10 Latency (ms)

Latency measures the time taken to produce a prediction, which reflects real-time feasibility. XGBoost is fastest at just 25 ms, which makes it very efficient for deployment. AU-HNN has a moderate latency of 189 ms, balancing accuracy and speed. In contrast, Bayesian-LSTM has the highest latency at 298 ms, which may hinder real-time use, as shown in Figure 12.

Figure 12.

Latency comparison across models.

5.1.11 Memory (MB)

Memory usage reflects the computational resources required during inference. Classical models such as ARIMA and Prophet consume minimal memory (12–15 MB), but at the cost of accuracy. Deep learning models require significantly more memory, with Transformer the highest at 186 MB. AU-HNN uses 142 MB, combining higher accuracy with moderate resource needs, as shown in Figure 13.

Figure 13.

Memory usage comparison across models.

5.1.12 FLOPS/watt

FLOPS per watt indicates the energy efficiency of the models. AU-HNN delivers the highest efficiency at 9.8 × 10⁶. GRU and CNN-LSTM also perform strongly, at 8.7 × 10⁶ and 8.4 × 10⁶, respectively. Traditional models such as ARIMA and Prophet are far less efficient, with values below 3 × 10⁶, as shown in Figure 14.

Figure 14.

FLOPS per watt comparison across models.

5.2 City-specific performance analysis

5.2.1 Climate zone impact

The city-specific analysis shows how AU-HNN adapts to different climatic and operational conditions across eight major Chinese cities. The model is robust across climate conditions, with performance differences driven mainly by climate zone and seasonal variability. For example, Hangzhou shows the strongest predictive performance, with R² = 0.931 and the lowest MAE (5.34), which indicates that the model captures its balanced subtropical conditions well. Beijing is the most challenging case because its continental climate has the highest seasonal variation; its errors are slightly higher, although overall performance remains high (R² = 0.915). Shanghai, Shenzhen, and Chengdu perform better than the average, which supports the generalizability of AU-HNN across building portfolios and climates. Notably, PICP exceeds 0.94 in every city, so the coverage of the uncertainty intervals is consistent regardless of geographic or seasonal effects, as shown in Table 8.

Table 8.
China - city-specific performance (AU-HNN).

City	Climate zone	RMSE	MAE	MAPE (%)	R²	PICP
Beijing	Continental	7.89	6.12	9.2	0.915	0.941
Shanghai	Subtropical	7.24	5.45	8.1	0.928	0.952
Guangzhou	Tropical	7.65	5.83	8.7	0.919	0.946
Shenzhen	Tropical	7.38	5.61	8.3	0.925	0.949
Hangzhou	Subtropical	7.12	5.34	7.9	0.931	0.954
Nanjing	Subtropical	7.45	5.67	8.4	0.922	0.950
Wuhan	Subtropical	7.58	5.74	8.6	0.920	0.947
Chengdu	Subtropical	7.33	5.52	8.2	0.926	0.951
Average	—	7.42	5.58	8.4	0.923	0.948

5.2.2 Building type performance

Across building types, AU-HNN performs well in every sector, with some variation between sectors. Commercial buildings show the best results, with an average R² of 0.932 and a MAPE of only 7.8%, owing to periodic occupancy and predictable energy-use patterns. Office buildings also perform strongly, with an R² of 0.927 and an RMSE of 7.28, which shows that the model captures dynamic occupancy in a work setting. Educational buildings and healthcare facilities are more challenging because of irregular schedules and critical load changes, but AU-HNN still delivers good results, with average gains of 18.9% and 16.4%, respectively, over the baseline methods. These findings indicate that AU-HNN is accurate across climatic and geographic conditions and also generalizes across a wide range of building functions, which makes it applicable to real-world smart energy management, as shown in Table 9.

Table 9.
Building type performance (AU-HNN).

Building Type	Count	RMSE	MAE	MAPE (%)	R²	PICP
Office Buildings	18	7.28	5.42	8.1	0.927	0.951
Educational	12	7.69	5.89	8.9	0.916	0.943
Healthcare	8	7.84	6.01	9.2	0.913	0.940
Commercial	7	7.15	5.25	7.8	0.932	0.956
Average	45	7.42	5.58	8.4	0.923	0.948

5.3 Novelty component validation

5.3.1 Ablation study results

The novelty of AU-HNN was validated through ablation studies, an analysis of behavioral impact, and an assessment of uncertainty quantification. The ablation study shows how much each architectural component contributes to overall performance. The complete AU-HNN achieved an RMSE of 7.42, and removing any single module degraded performance substantially. Removing the occupant behavior module increased RMSE by 23.7% to 9.18, and removing the uncertainty estimation component increased it by 16.6% to 8.65. The highest error, 9.42 (27.0% worse than the full model), occurred without incremental learning, and removing multi-scale fusion worsened RMSE by 20.5%. These findings emphasize that the modules act synergistically to provide robust predictions.
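
The ablation percentages follow directly from the reported RMSE values. The sketch below ranks the ablated variants by how much they degrade the full model; the variant labels and the function name `degradation_pct` are illustrative. Multi-scale fusion is left out because the text reports only its percentage change, not its RMSE.

```python
FULL_MODEL_RMSE = 7.42  # complete AU-HNN, from the text

# RMSE after removing one component, from the text
ABLATED_RMSE = {
    "without occupant behavior module": 9.18,
    "without uncertainty estimation": 8.65,
    "without incremental learning": 9.42,
}


def degradation_pct(ablated_rmse: float, full_rmse: float = FULL_MODEL_RMSE) -> float:
    """Relative RMSE increase (%) caused by removing a component."""
    return 100.0 * (ablated_rmse - full_rmse) / full_rmse


def rank_components(ablated: dict[str, float]) -> list[tuple[str, float]]:
    """Sort variants from largest to smallest degradation."""
    scored = [(name, degradation_pct(rmse)) for name, rmse in ablated.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, pct in rank_components(ABLATED_RMSE):
        print(f"{name}: +{pct:.1f}% RMSE")
```

A larger degradation means the removed component contributed more to the full model's accuracy, so the ranking doubles as a rough importance ordering of the modules.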

5.3.2 Dynamic occupant behavior impact

The effect of dynamic occupant behavior was examined through a case study of COVID-19-related changes in working patterns, which dramatically altered building-use schedules. AU-HNN adapted to the new behavioral patterns within two to three days, compared with two to three weeks for the conventional baselines. The model also achieved a behavior recognition score of 94.3%, which underlines its capacity to capture subtle human activity patterns that directly affect energy use.

5.3.3 Uncertainty quantification validation

The uncertainty quantification was also validated for its ability to support risk-aware predictions. The model achieved a prediction interval coverage probability (PICP) of 94.8%, just under the 95% target. Its prediction intervals were 22.1% narrower than those of the Bayesian-LSTM baseline, which supports the claim that AU-HNN provides tighter yet still reliable confidence intervals. For risk assessment, the model correctly identified high-uncertainty periods in 89.4% of cases, so that decision-makers can account for risk appropriately in practice.

5.4 Real-time performance analysis

The real-time performance analysis indicates that AU-HNN is computationally efficient and scalable to real-world smart building applications. The average prediction time was 189 ms, which meets the sub-200 ms latency target and allows the model to provide real-time feedback for decision-making. Scalability tests showed linear performance scaling for up to 100 buildings processed simultaneously, which supports city-scale use. Resource utilization was modest: the model used only 142 MB of memory, including the model and runtime overheads, so it can be deployed even on resource-constrained systems. Computationally, AU-HNN achieved 9.8 × 10⁶ FLOPS per watt, which makes it more energy-efficient than the compared methods. Online learning ability was assessed by the speed of model updates: one update step took only 15.3 ms on average. This allows new information to be incorporated quickly, so the model stays responsive to changing trends without large computational expense.
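
Latency figures of this kind are typically obtained by timing repeated calls after a short warm-up. The sketch below shows one way to do that with the standard library; `measure_latency_ms` and `dummy_predict` are hypothetical names, the dummy function merely stands in for a model's prediction or update call, and the warm-up and repeat counts are arbitrary choices, not the study's measurement protocol.

```python
import math
import statistics
import time
from typing import Callable, Sequence

LATENCY_TARGET_MS = 200.0  # real-time target stated in the text


def measure_latency_ms(fn: Callable[[Sequence[float]], float],
                       payload: Sequence[float],
                       warmup: int = 5, repeats: int = 50) -> dict[str, float]:
    """Time repeated calls to `fn` and summarize the wall-clock latency in ms."""
    for _ in range(warmup):          # warm-up calls are not timed
        fn(payload)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank 95th percentile.
    p95_index = min(len(samples) - 1, math.ceil(0.95 * len(samples)) - 1)
    return {
        "mean_ms": statistics.fmean(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


def dummy_predict(window: Sequence[float]) -> float:
    """Hypothetical stand-in for a model call: returns the window mean."""
    return sum(window) / len(window)


if __name__ == "__main__":
    stats = measure_latency_ms(dummy_predict, [1.0] * 96)
    print(stats, "meets target:", stats["mean_ms"] < LATENCY_TARGET_MS)
```

Reporting a high percentile alongside the mean is useful for real-time control, because occasional slow predictions matter more to an operator than the average case.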

6. Discussion and impact analysis

6.1 Technical contributions and significance

6.1.1 Methodological advances

This research contributes several methodological innovations that raise the bar for building energy prediction studies. First, it presents the first comprehensive uncertainty quantification framework developed specifically for real-time prediction in the building energy domain, addressing the vital issue of reliability in operational decision-making. Second, it incorporates dynamically changing occupant behavior into a neural predictive architecture, which captures an important but frequently neglected cause of energy demand variation. Third, it introduces an online incremental learning mechanism that allows the model to keep up with changing consumption trends without retraining from scratch. Taken together, these developments are a substantial step towards adaptive, behavior-aware, and uncertainty-calibrated energy prediction systems.
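
The idea of updating a model incrementally instead of retraining it can be illustrated with a deliberately simple stand-in. The sketch below applies one stochastic gradient step per incoming sample to a linear model; the class `OnlineLinearModel`, its learning rate, and the synthetic data stream are assumptions made for illustration only, and the sketch does not reproduce the AU-HNN update rule, which operates on a neural network.

```python
class OnlineLinearModel:
    """Linear predictor updated with one SGD step per new observation."""

    def __init__(self, n_features: int, learning_rate: float = 0.1) -> None:
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x: list[float]) -> float:
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x: list[float], y: float) -> float:
        """Apply one gradient step on the squared error; return that error."""
        error = self.predict(x) - y
        self.weights = [w - self.learning_rate * error * xi
                        for w, xi in zip(self.weights, x)]
        self.bias -= self.learning_rate * error
        return error * error


if __name__ == "__main__":
    model = OnlineLinearModel(n_features=1)
    # Synthetic noise-free stream: target = 3 * feature + 1.
    stream = [(i / 4.0, 3.0 * (i / 4.0) + 1.0) for i in range(5)]
    for step in range(2000):
        feature, target = stream[step % len(stream)]
        model.update([feature], target)
    print(model.weights, model.bias)
```

Each update touches only the current sample, so its cost is independent of how much data has been seen before, which is what makes per-step updates practical in a streaming setting.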

6.1.2 Theoretical implications

This study also contributes to the theoretical understanding of energy informatics. It formulates a mathematical model of uncertainty propagation within building energy systems, which provides a basis for measuring and controlling predictive risk in operational settings. It provides a rigorous convergence proof for the online incremental learning algorithm, which supports the algorithm's reliability and the stability of model performance over long-running deployments. Moreover, through explicit modeling of occupant variability, the research extends current theories of behavior-based energy modeling and bridges human-focused behavioral science and computational prediction. These theoretical contributions broaden the applicability of the predictive model in both scholarly research and practical implementation.

6.2 Practical impact and applications

6.2.1 Smart building integration

The results of this study have significant implications for smart building management systems. Using AU-HNN's predictions, real-time HVAC optimization can deliver energy savings of 15 to 25 percent, which translates directly into substantial cost savings while maintaining occupant comfort. The model's uncertainty estimates can also be used to schedule predictive maintenance, so that operators can prioritize system inspections and minimize unexpected downtime. Furthermore, by aligning demand forecasts with utility needs, the framework can support demand response programs and improve the resilience of urban energy systems.

6.2.2 Economic and environmental benefits

The economic and environmental benefits of the proposed framework are also compelling. Based on empirical assessments of the 45-building sample, the system yields a projected annual cost saving of about $2.3 million and a reduction of 1847 tons of CO₂ emissions. These outcomes represent not only short-term financial benefits but also tangible progress towards carbon neutrality and sustainability. Notably, the return on investment (ROI) analysis indicates a payback period of only 18 months, which supports the feasibility of large-scale deployment across public and corporate building portfolios.

6.3 Limitations and future directions

6.3.1 Current limitations

Although the framework is effective, it has limitations. First, it is computationally expensive to scale to very large building portfolios, where demands can exceed the available hardware resources. Second, privacy is a concern, because the model uses fine-grained occupant behavior data, which creates ethical and regulatory challenges for practical deployment. Third, the framework relies on high-quality multi-modal sensor input and can perform poorly when data are sparse, noisy, or missing. These limitations highlight the need for careful deployment planning and complementary data governance approaches.

6.3.2 Future research directions

The contributions of this paper open several directions for future research. Federated learning offers a route to privacy-aware model training, since it allows knowledge sharing across institutions without centralizing sensitive occupant information. Integrating renewable energy sources and distributed storage into the AU-HNN system would strengthen its contribution to sustainable building ecosystems, especially by facilitating carbon-conscious decision-making. Lastly, extending the approach to district-level or city-scale energy management systems is a promising way to investigate scalability, inter-building coordination, and synergies with smart grid infrastructures. These directions point towards robust, ethical, and environmentally friendly energy intelligence solutions.

7. Conclusion

This study proposes AU-HNN, a novel hybrid model that improves building energy prediction. It makes four key technical contributions: the first unified uncertainty quantification framework for real-time energy prediction, the incorporation of dynamic occupant behavior into the neural architecture design, multi-scale fusion strategies, and online incremental learning applied to building energy prediction. Together, these innovations address major drawbacks of current methods and provide a comprehensive solution for accurate, adaptive, and reliable energy forecasts.

The proposed model significantly outperforms a wide range of baselines, including conventional statistical, machine learning, deep learning, and uncertainty-aware approaches. Across twelve performance metrics, AU-HNN improved on the best baseline models by 17% to 20%. These gains were confirmed by statistical testing, including t-tests and non-parametric Wilcoxon signed-rank tests, which indicates that the findings are robust and hold across varied climatic conditions, building types, and occupancy conditions.

Beyond technical performance, the framework is well suited to practical applications. With prediction latencies below 200 ms, efficient memory usage, and demonstrated scalability to large building portfolios, AU-HNN fits the operational needs of current smart building systems. Case studies at Tsinghua University also show that it can deliver real benefits, such as energy savings of over 20% and emissions reductions that meet up to 100 percent of sustainability goals. Because the model adapts to dynamic occupant behavior, it is resilient to rapidly evolving situations such as hybrid work environments.

Scientifically, the work extends uncertainty-aware prediction theory by formalizing uncertainty propagation and by providing convergence guarantees for incremental learning algorithms. By combining rigorous theory with practical deployment, the research advances computational models and also contributes to the broader discussion on integrating human-centered behavior with energy informatics. This dual influence underlines its importance both academically and practically.

Going forward, the AU-HNN framework has the potential to support next-generation smart building ecosystems. Future research will likely extend it to federated learning for privacy-sensitive applications, to integration with renewable energy and storage, to carbon-aware optimization, and to district- or city-scale energy management systems. This would not only make individual buildings more efficient but could also transform urban energy infrastructure, which positions AU-HNN as a key building block for sustainable and intelligent cities.

Footnotes

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
