Intelligent analysis and early warning: Big data in telecom fraud prevention and control

Abstract

This paper proposes an intelligent prevention and control framework. Combining SSL correlation analysis and graph convolutional network (GCN), it realizes efficient semantic restoration of HTTPS encrypted traffic and multi-hop behavior identification; designs a streaming computing engine based on Kafka-Flink, which supports millisecond anomaly detection and dynamic model updating; and constructs a group portrait model under the heterogeneous information network, which accurately locates vulnerable nodes to fraud. In addition, federal learning is introduced to optimize the virtual base station positioning algorithm, combined with particle filtering and improved Chan-Taylor parameter optimization, to improve the positioning accuracy in non-line-of-sight environments, and the Deep Reinforcement Learning (DRL) framework is used to achieve dynamic reasoning and adaptive defense of fraudulent intent. The framework provides theoretical support and technical breakthroughs at the algorithmic level for telecom fraud prevention and control.

Keywords

encrypted traffic parsing graph neural networks real-time stream computing federated learning intent reasoning

Introduction

In recent years, with the rapid development of information technology, telecommunication fraud presents the characteristics of technology, organization and cross-border, and its means are constantly upgraded, seriously threatening social security and public property security.¹ The traditional prevention and control mode based on rule base and manual audit is difficult to deal with new fraud techniques such as encrypted communication and virtual base station due to the defects of relying on a priori knowledge and response lag.² In this context, the integration of big data and artificial intelligence algorithms provides new possibilities for the active defense of telecom fraud, the core of which lies in the accurate identification of fraudulent behavior and dynamic intervention through multimodal data analysis, real-time behavioral modeling and adaptive learning mechanisms.

Currently, encrypted traffic parsing is one of the key challenges in telecom fraud prevention and control. Although the HTTPS protocol plays an important role in protecting user privacy, its encrypted characteristics also provide a hidden channel for fraudulent communications. Existing studies such as Asaoka et al.³ proposed a new method to identify services from a given TLS session based on SNI and Protocol Data Unit analysis. In addition, for the processing framework of real-time streaming data, Zheng et al.⁴ proposed a multilevel storage model and an LMA-based application deployment method to meet the real-time and RTDP system heterogeneity requirements.

In the field of at-risk group identification, predictive models based on user behavioral portraits are gradually becoming a research hotspot. Mavaluru et al.⁵ introduced a new real-time method to detect anomalous behaviors in online social networks by exploiting dynamic user activities and their associated profiles through convolutional neural networks. Recently, Xu et al.⁶ integrated various personal factors to construct fraud exposure recognition and fraud victim recognition models using machine learning methods. Notably, privacy protection and data silo issues further constrain the feasibility of cross-domain collaborative anti-fraud. For example, Pan et al.⁷ develop a decision support system that integrates blockchain and federated learning to enable data sharing for security and privacy protection.

In this paper, we propose an algorithmic innovation system for intelligent prevention and control of telecom fraud in response to the above problems. Firstly, by fusing Deep Packet Inspection and semi-supervised learning, we construct a dynamic parsing architecture for encrypted traffic, combining SSL/TLS session correlation analysis and GCN-driven multi-hop behavioral modeling, to break through the technical bottleneck of the traditional methods in crypto-semantic restoration and gang correlation mining. Secondly, a multimodal stream processing engine based on Kafka-Flink is designed to introduce attention mechanism and online incremental learning strategy to achieve real-time feature fusion and model adaptive optimization of heterogeneous data. Further, a group risk prediction algorithm under heterogeneous information network is proposed, which utilizes TransE embedding and fwandering techniques to capture the higher-order interaction patterns of user-device-behavior, and combines the BERT-CRF model to extract transactional intent and emotion fluctuation features in the call text. At the privacy protection level, the innovative federated learning framework is combined with the improved Chan-Taylor localization algorithm to optimize the scatterer parameter estimation through particle filtering, which significantly improves the base station localization accuracy in NLOS environments.

The main contributions of this paper include (1) proposing a hybrid architecture of semi-supervised learning and graph neural network to realize the synergistic detection of encrypted traffic parsing and multi-hop fraud; (2) constructing a multimodal real-time feature fusion mechanism to overcome the problem of model drift caused by spatio-temporal heterogeneity in streaming data; (3) designing a federated learning-driven privacy-preserving localization algorithm to balance the localization accuracy with the user’s privacy needs; and (4) pioneering the first algorithm based on the DRL-based Intent Reasoning framework, which realizes the self-evolutionary defense against new fraud scripts through the fusion of hierarchical attention mechanism and knowledge graph. Experiments show that the system outperforms existing mainstream methods in core indicators such as encrypted traffic recognition rate and high-risk user detection accuracy, providing theoretical innovation and technological breakthroughs for telecom fraud prevention and control.

Relevant theories and analysis

Theory of cryptographic communication parsing

Encrypted traffic parsing is a key technological challenge to deal with telecommunication fraud, and its core goal lies in identifying malicious traffic patterns through non-intrusive analysis without destroying the privacy of communication.⁸ Information theory provides a fundamental theoretical framework for encrypted traffic feature extraction, and Shannon entropy, as a core indicator of information randomness, is widely used to quantify the distributional characteristics of traffic features such as packet length and arrival time interval. Normal HTTPS traffic usually exhibits a stable range of entropy values, while malicious traffic tends to show significant deviations in its entropy values due to frequent command interactions or data stealing behavior. For example, malicious servers may use self-signed certificates or expired certificates during the TLS handshake phase, and such anomalous features provide an interpretable basis for machine learning-based traffic classification. In addition, parameters such as the selection preference of key exchange algorithms and protocol version compatibility further enrich the feature dimension of traffic classification, making the behavioral pattern of encrypted traffic explicit.⁹

The introduction of semi-supervised learning techniques effectively alleviates the problem of scarcity of labeled data in encrypted traffic parsing. Traditional supervised learning methods rely on a large number of labeled samples, while the labeling of malicious traffic in real scenarios is costly and lagging behind in updating. Through the pseudo-label generation mechanism, the model can use a small amount of labeled data to guide the feature learning of unlabeled data and gradually optimize the classification boundary. The co-training framework reduces the impact of labeling noise on the model performance and improves the robustness of the model through the interactive validation of multiple classifiers. However, the widespread use of multi-hop proxy tools poses new challenges to existing approaches. Such tools disperse traffic paths through multiple layers of proxies, making it difficult for single-node features to capture global behavioral patterns.¹⁰ For this reason, researchers combine graph theory with traffic analysis to construct a topological relationship graph of traffic nodes and use graph embedding techniques to extract structural features among nodes. By dynamically assigning neighbor node weights, the graph attention network is able to effectively identify abnormal transit nodes in the agent link, thus enhancing the detection ability of covert behavior.

Cryptographic reverse analysis techniques provide new perspectives for in-depth analysis of cryptographic protocols. The TLS protocol, as the core guarantee for HTTPS communication, has potential vulnerabilities in the key negotiation mechanism of its handshake phase. For example, the weak randomness of Diffie-Hellman key exchange parameters may lead to the risk of man-in-the-middle attacks. Potential security threats can be identified by analyzing logical flaws in the protocol implementation through reverse engineering.¹¹ However, the high demand for computational resources and reliance on knowledge of protocol implementation details of such methods limit their application in large-scale real-time detection scenarios. In recent years, lightweight cryptographic analysis algorithms have gradually reduced the computational complexity by optimizing the parameter sampling and feature extraction process, but their detection accuracy and generalization ability still need to be further improved.

Temporal modeling of dynamic traffic features is another important direction in cryptographic communication parsing. The behavioral patterns of malicious traffic often have temporal correlations, such as periodic command sending or sudden data leakage. Long- and short-term memory networks can effectively recognize such dynamic patterns by capturing long-term dependencies in time series. Meanwhile, the self-attention mechanism of the Transformer model can extract contextual correlation features across packets, enhancing the parsing ability of complex attack links. It is worth noting that the semantic reduction of encrypted traffic needs to take into account the balance between privacy protection and security monitoring. Over-reliance on deep parsing may violate the red line of user privacy, so compliant feature extraction strategies should be designed, such as analyzing only the protocol metadata and avoiding decryption of the actual content.

The future development of encrypted communication parsing theory will show the trend of multidisciplinary cross-fertilization. The impact of quantum computing on traditional encryption algorithms has given rise to the research of post-quantum cryptography, which puts forward new adaptive requirements for traffic parsing techniques. In addition, the combination of edge computing and federated learning provides new ideas for encrypted traffic analysis in distributed environments, which can achieve collaborative detection while protecting data privacy through localized model training and encrypted transmission of parameters. With the popularization of 5G and IoT technologies, the heterogeneity and scale of encrypted traffic will be further intensified, and there is an urgent need to develop lightweight and adaptive parsing theories to cope with the evolving security challenges.

Streaming computing model

Streaming computing model is the core technical architecture to support real-time telecom fraud warning, and its design needs to achieve a dynamic balance between high throughput, low latency and system fault tolerance.¹² The CAP theorem for distributed systems reveals the inherent contradiction between data consistency, service availability and network partition tolerance, and in telecom fraud prevention and control scenarios, the system usually prioritizes availability and real-time response capability. For example, when high-risk transaction behavior is detected, even if the data of some computing nodes is not yet fully synchronized, the system still needs to immediately trigger an early warning interception mechanism to avoid response delays due to strong consistency constraints. This design choice meets the stringent requirements of telecom anti-fraud business on timeliness, but also poses a higher challenge to the state management strategy, which needs to realize the approximate consistency of the state after fault recovery through asynchronous checkpointing and log playback techniques.

The introduction of event time semantics provides theoretical support for disordered data flow processing. In the actual network environment, the transmission of data packets may be due to network jitter or routing differences in the disordered arrival phenomenon, the traditional processing based on the ingestion time sorting will lead to computational errors. The watermarking mechanism allows the system to perform real-time computation under the premise of tolerating a certain amount of disorder by dynamically inferring the integrity boundaries of the data stream.¹³ For example, when analyzing the sequence of user call behaviors, the sliding window combined with the watermarking technique can accurately count the frequency of calls within the time window, and even if part of the data is delayed in arriving, the system is still able to trigger the window calculation by advancing the watermark line to ensure the timeliness and accuracy of the results. Windowed processing further refines the granularity of the calculation. Rolling windows are suitable for periodic indicator statistics, while session windows excel at capturing correlation patterns between discrete events, such as multiple failed login attempts or abnormal transfer behavior within a short period of time.

Real-time fusion of multimodal data is a key challenge for streaming computing models. Telecom fraud prevention and control involves heterogeneous data from multiple sources, such as call text, financial transaction records, and network traffic logs, whose feature spaces differ significantly in dimension, scale and temporal characteristics. Text data needs to be converted into high-dimensional semantic vectors by pre-trained language models, transaction data needs to be normalized and discretized coding, and network traffic needs to extract time-series statistical features. The introduction of the attention mechanism provides a solution for cross-modal feature alignment, highlighting key features and suppressing noise interference through dynamic weight allocation.¹⁴

The optimization of the state management mechanism directly affects the robustness and scalability of the streaming system. Checkpointing techniques ensure that the system can quickly recover to the most recent consistent state in case of failure by periodically saving the intermediate states of the computational nodes. However, frequent checkpointing operations incur significant storage and computation overheads, which can be a performance bottleneck especially when dealing with massive data streams.¹⁵ In addition, dynamic resource scheduling algorithms are able to adjust the task allocation of compute nodes according to real-time load fluctuations, such as automatically expanding compute instances during peak traffic hours and scaling down resources during idle times to reduce operation and maintenance costs. This elastic expansion and contraction capability enables the system to adapt to periodic fluctuations in telecommunication service traffic, such as the surge scenario of voice traffic during holidays.

The integration of online incremental learning mechanism further improves the dynamic adaptability of the model. The traditional batch training mode cannot cope with continuous changes in data distribution, resulting in significant degradation of model performance over time. With the incremental learning strategy, the system is able to gradually incorporate newly arrived data samples into the model parameter updates while preserving historical knowledge to avoid catastrophic forgetting. The adversarial generative network effectively expands the training data diversity by synthesizing negative samples similar to the real data distribution, especially in scenarios where the fraud pattern evolves rapidly, this technique can significantly improve the model’s ability to generalize to unknown attacks.¹⁶

The optimization direction of streaming computing models is evolving toward intelligence and adaptivity. With the popularity of edge computing devices, some computing tasks can be sunk to the network edge nodes for execution, reducing the load pressure on the central server. The introduction of federated learning framework makes cross-regional data collaborative analysis possible, and the edge nodes share only encrypted parameters after locally training the model, which not only guarantees data privacy but also realizes global knowledge aggregation. In addition, reinforcement learning techniques are beginning to be applied to the dynamic scheduling decision of streaming resources, learning the optimal task allocation strategy by interacting with the environment to further improve the resource utilization and response efficiency of the system. These technological advances lay the theoretical foundation for building the next-generation intelligent anti-fraud system, enabling it to better cope with the challenges of the new network environment of ultra-high speed and ultra-low latency in the 5G era.

Group behavior modeling theory

Group behavior modeling aims to identify potential victim groups and propagation paths in telecommunication fraud by analyzing interaction patterns and social network characteristics among users. Social network analysis provides a fundamental theoretical framework for this purpose, which centers on revealing how individual behavior is influenced by group structure. Weak connection theory suggests that non-frequently connected fringe relationships (weak connections) in a social network have a stronger ability to penetrate across communities in the dissemination of information, a phenomenon that is particularly significant in telecommunication fraud. Fraud scripts tend to spread quickly to different social circles through weak connections, such as spreading phishing links through strangers or low-frequency contacts that are met by chance. Graph theoretic metrics such as node median centrality are used to quantify the role of nodes as hubs in the propagation network, and high median nodes usually correspond to key bridges for information diffusion, and blocking the propagation paths of such nodes can effectively curb the expansion of the scope of fraud.¹⁷

To formally describe the weak connection propagation mechanism, a threshold model is introduced to express individual behavioral triggering conditions:

a_{i} (t) = {\begin{array}{c} 1, i f \sum_{j \in N_{i}} w_{i j} a_{j} (t - 1) \geq θ_{i} \\ 0, o t h e r w i s e \end{array}

(1)

where

a_{i} (t)

denotes the behavioral state of individual i at time t (1 for executing fraud message forwarding, 0 for not executing),

N_{i}

is the set of its neighbors,

w_{i j}

is the influence weight of individual j on i, and

θ_{i}

is the behavioral trigger threshold.

Heterogeneous information networks can characterize the complex social ecology in a finer way by defining multiple types of nodes and multiple relationships. In telecom fraud scenarios, entities such as users, devices, geolocations, and applications, constitute heterogeneous graph nodes, and behaviors such as calls, transfers, and logins form edge relationships. Meta-paths, as semantic paths across types of nodes, provide an interpretable framework for the extraction of higher-order structural features. The meta-path-based random walk technique is able to generate node embedding vectors that capture the potential correlations of user behavioral patterns, and then identify anomalous groups through cluster analysis, its optimization objective function can be expressed as:

\max_{E} \sum_{v \in V} \sum_{u \in N_{m e t a} (v)} \log P (u | e_{v})

(2)

where V is the set of nodes,

N_{m e t a} (v)

denotes the neighboring nodes based on meta-paths,

e_{v}

is the embedding vector of node v, and

P (u | e_{v})

is the conditional probability of predicting neighboring node u.

The development of dynamic graph neural networks injects a temporal dimension into group behavior modeling.¹⁸ Traditional static graph models are difficult to capture the real-time evolution of user interactions, while fraudulent behaviors tend to be significantly time-dependent, such as periodic mass text messages or sudden fund transfers. Time-aware graph attention networks transform the timestamps of interaction events into continuous vectors by introducing a time encoder, which participates in the attention weight calculation together with node features. This design enables the model to dynamically aggregate the historical states of neighboring nodes to predict users’ short-term behavioral tendencies. For example, if a user frequently changes the bound device or suddenly increases the frequency of calls from unfamiliar numbers within a short period of time, the model can assess his/her fraud risk level in real time. The combination of long- and short-term memory networks and graph convolution operations further enhances the model’s ability to capture long-period behavioral patterns, such as identifying low-frequency anomalous activities that last for months.

Privacy protection and data silo issues constrain the scaling of group behavior modeling. Cross-organizational data sharing faces legal and compliance challenges, while a localized view of a single data source may lead to model bias. The federated learning framework allows multiple participants to collaboratively optimize models without exposing raw data through distributed model training with encrypted transmission of parameters. In the group portrait task, organizations can train graph-embedded models based on local data and update global model parameters through a secure aggregation protocol. However, the non-independent homogeneous distribution property of heterogeneous graph structures increases the convergence difficulty of federated learning, and a gradient alignment strategy based on meta-paths needs to be designed to reduce the impact of cross-domain feature distribution differences on model performance. The introduction of differential privacy techniques further reduces the risk of sensitive information leakage, such as adding Laplace noise when node embedding is released, but the balance between privacy budget and model utility needs to be weighed.

The future theory of group behavior modeling will deepen toward multimodal fusion and causal inference. On the one hand, fusing multi-source data such as call content semantics, geolocation trajectory, and device fingerprints, to construct panoramic user profiles can improve the granularity of risk prediction. For example, combining natural language processing technology to analyze the emotional tendency and conversation patterns in the call text, and identifying the key features of induced conversations. On the other hand, the causal inference framework can distinguish correlation from causality to avoid misjudging normal user behavior as fraud risk. For example, a counterfactual analysis verifies whether device replacement directly leads to an increase in the probability of being scammed, rather than being influenced by other confounding variables. These theoretical advances will promote group behavior modeling to leap from descriptive analysis to explanatory and predictive analysis, and provide more solid theoretical support for the construction of proactive defense system.¹⁹

Non-line-of-sight localization theory

Non-line-of-sight localization is the core technical challenge in combating virtual base stations and fraud dens, and its core challenge lies in the scattering, reflection and diffraction effects generated by obstacles in the signal propagation path, resulting in a significant increase in the measurement errors of key parameters such as arrival time and angle of arrival.²⁰ In complex urban environments, obstacles such as high-rise buildings, tunnels, and vegetation are prevalent, and the signal propagation path from the transmitter to the receiver often contains multiple reflections or bypasses, resulting in non-line-of-sight propagation phenomena. The traditional line-of-sight positioning algorithm is based on the assumption of geometric straight-line propagation, and directly uses the signal parameters to calculate the target position, but in non-line-of-sight scenarios, it will produce tens of meters or even hundreds of meters of positioning deviation, which seriously restricts the efficiency of precision strikes in the anti-fraud battle.

To cope with the non-line-of-sight positioning errors, ray tracing theory simulates the signal propagation path by constructing a three-dimensional environment model, and predicts the multipath effect by using the physical laws of electromagnetic wave reflection, transmission and bypass. This method can achieve higher accuracy when the environment map is known, but its computational complexity grows exponentially with the scene size, making it difficult to meet real-time localization requirements. Probabilistic statistical methods, on the other hand, assume that the scatterers obey a specific spatial distribution (e.g., Poisson point process), and back-propagate the location of the signal source through maximum likelihood estimation.²¹ In recent years, machine learning techniques have been introduced for non-line-of-sight error modeling to learn the nonlinear relationship between multipath features and positional offsets through training data, but it requires a large amount of labeled data support and high cost for practical deployment.

The introduction of the federated learning framework provides a new idea to solve the problem of cross-base station data collaboration and privacy protection. Multiple base stations can jointly optimize the positioning model by encrypted gradient exchange without sharing the original signal data. In a distributed positioning task, each base station trains a local model based on locally received signal strength and arrival time difference data, and the central server aggregates the model parameters to generate a global model. This mechanism both avoids direct exposure of user location information and exploits the spatial diversity feature of multi-base station data. However, the privacy-accuracy tradeoff problem of federated learning is particularly prominent in localization scenarios. The differential privacy technique reduces the risk of sensitive information leakage by adding noise to the gradient, but excessive noise will lead to a rise in the Kramer-Rowe lower bound of the localization error, and an adaptive noise injection strategy needs to be designed to dynamically adjust the noise amplitude according to the signal quality in order to balance the two contradictions.²²

The future development of non-line-of-sight localization theory will show a trend of multidisciplinary cross-fertilization. Breakthroughs in quantum sensing technology may provide new means for high-precision signal parameter measurements, such as using quantum entangled states to enhance time synchronization accuracy. At the same time, digital twin technology can realize real-time simulation and error prediction of signal propagation paths by constructing virtual environment mirrors, providing decision support for dynamic adjustment of positioning algorithm parameters. In terms of privacy protection, the combination of homomorphic encryption and secure multi-party computation is expected to further enhance the security of the federated learning framework, making cross-agency collaborative localization possible under compliance constraints. These theoretical advances will promote the non-line-of-sight positioning technology from laboratory research to large-scale engineering applications, laying a solid foundation for the construction of all-weather, high-precision anti-fraud positioning system.

Cognitive model of intentional reasoning

Intent Reasoning Cognitive Model is an advanced semantic analysis module in the telecom fraud prevention and control system, and its core task is to identify the potential targets and logical chains of fraudulent discourse by parsing the content of communication and behavioral patterns. The traditional approach can be divided into two major paradigms. Symbolism is based on formal logic rules to build an intent ontology library; connectionism relies on deep learning models to automatically extract semantic features from data, and realize intent classification through end-to-end training. However, it is difficult for a single paradigm to cope with the semantic variations and contextual ambiguities of fraud scripts, and a hybrid architecture that integrates the two needs to be explored in order to enhance the generalization ability and interpretability of the model.²³

To enhance the semantic representation capability of the knowledge graph, the TransR algorithm is used for embedding learning of entities and relations:

e_{h} + r_{r} \approx e_{t}^{T} M_{r}

(3)

where

e_{h}, e_{t} \in R^{d}

is the head and tail entity embedding vector,

r_{r} \in R^{k}

is the relation vector, and

M_{r} \in R^{k \times d}

is the relation-specific projection matrix. The model captures complex semantic associations by minimizing the ternary scoring function

{‖ e_{h} + r_{r} - e_{t} M_{r} ‖}^{2}

Symbolist approaches rely on manually constructed rule bases and knowledge graphs, and realize intent chain derivation through predicate logic and inference engines. In addition, manual annotation is costly and difficult to cover language variants such as dialects and slang, resulting in a high leakage rate in practical applications.²⁴

The connectionist approach focuses on deep learning, capturing deep semantic features of text through large-scale pre-training of language models.²⁵ The hierarchical attention mechanism of the Transformer architecture is able to model long-distance contextual dependencies, for example, identifying implicit associations between “arrest warrants” and “funding review.” The Deep Reinforcement Learning framework further integrates intentional inferences into the contextual features. Deep Reinforcement Learning frameworks further transform intentional reasoning into sequential decision problems, where intelligences learn optimal reasoning paths by interacting with the environment. However, purely data-driven models are poorly adapted to low-resource scenarios and may generate semantic misjudgments due to over-reliance on statistical features. The contrast loss function can be defined as:

L_{c o n} = - \log \frac{e^{s i m (e_{i}, e_{j}^{+}) / τ}}{e^{s i m (e_{i}, e_{j}^{+}) / τ} + \sum_{k = 1}^{K} e^{s i m (e_{i}, e_{j}^{-}) / τ}}

(4)

where

e_{i}

is the anchor sample embedding,

e_{i}^{+}

and

e_{i}^{-}

are the positive and negative sample embeddings, respectively,

s i m (\cdot)

is the cosine similarity and

τ

is the temperature coefficient. This loss function enhances the robustness of the model to semantic perturbations by maximizing the positive sample similarity with respect to the negative sample difference.

Hybrid architectures construct rule-guided data-driven models by fusing the advantages of symbolization and connectionism. The knowledge graph embedding technique transforms the symbolized rules into low-dimensional vectors, which participate in intent inference together with semantic features extracted by neural networks. In the deep reinforcement learning framework, the knowledge graph acts as a priori constraints to limit the action space and avoid the model from exploring invalid paths. Meanwhile, the contrast learning technique improves the model’s sensitivity to semantic nuances by enhancing the embedding distinction between positive and negative samples.

Intent inference in dynamic environments requires online learning and adaptive evolution capabilities. The online feedback mechanism updates the model parameters by collecting labeled data in real time, enabling the system to respond quickly to new types of fraudulent discourse. The incremental learning strategy mitigates the catastrophic forgetting problem by preserving the distillation loss function of historical knowledge. In addition, the adversarial generative network enhances the robustness of the model to semantic perturbations by synthesizing adversarial samples that are semantically similar to real fraud scripts.²⁶

In the future, intentional inference models will deepen toward multimodal fusion and causal inference. Combining speech emotion recognition and textual semantic analysis can capture non-textual cues such as threatening tone or rapid tempo in the call to enhance the multi-dimensional evidence support for intent determination. The causal inference framework is able to distinguish between correlation features and causal features, for example, verifying whether equipment replacement directly leads to an increase in the risk of being scammed, rather than being influenced by other confounding variables. The introduction of macrolanguage models provides new tools for generative confrontation of fraudulent discourse, which can prospectively train the adaptive capability of defense models by simulating the attacker’s discourse variation paths. These theoretical advances will promote the transition of intent reasoning from passive response to active defense, and provide core cognitive support for the construction of an adaptive and evolvable intelligent anti-fraud system.

Algorithm design and implementation

This paper proposes an algorithmic innovation system for intelligent prevention and control of telecom fraud. First, by fusing deep packet inspection and semi-supervised learning, we construct a dynamic parsing architecture for encrypted traffic, combining SSL/TLS session correlation analysis and GCN-driven multi-hop behavioral modeling, to break through the technical bottleneck of traditional methods in crypto-semantic restoration and gang correlation mining. Secondly, a multimodal stream processing engine based on Kafka-Flink is designed to introduce attention mechanism and online incremental learning strategy to achieve real-time feature fusion and model adaptive optimization of heterogeneous data. Further, a group risk prediction algorithm under heterogeneous information network is proposed, which utilizes TransE embedding and meta-path wandering techniques to capture the higher-order interaction patterns of user-device-behavior, and combines the BERT-CRF model to extract transactional intent and emotion fluctuation features in the call text. At the privacy protection level, the innovative federated learning framework is combined with the improved Chan-Taylor localization algorithm to optimize the scatterer parameter estimation through particle filtering, which significantly improves the base station localization accuracy in the NLOS environment. The algorithm architecture is shown in Figure 1.

Figure 1.

Methodological framework diagram.

Encrypted traffic dynamic parsing algorithm

A hybrid architecture based on semi-supervised learning and GCN is proposed for the semantic restoration challenge of HTTPS encrypted traffic.²⁷ First, information entropy is utilized to quantify the randomness of traffic features, defining the packet length entropy $H_{l e n}$ and the arrival time interval entropy $H_{t i m e}$ :

H_{l e n} = - \sum_{i = 1}^{n} p (l_{i}) \log_{2} p (l_{i})

(5)

H_{t i m e} = - \sum_{j = 1}^{n} p (Δ t_{j}) \log_{2} p (Δ t_{j})

(6)

where

l_{i}

denotes the length of the ith packet,

Δ t_{j}

is the neighboring packet arrival time interval, and

p (\cdot)

is its probability distribution. Combined with the TLS handshake features, the feature vector

x = [H_{l e n}, H_{t i m e}, d_{c e r t}, k_{t y p e}]

is constructed.

In the semi-supervised training phase, a pseudo-label generation strategy is used to optimize the classification boundaries.²⁸ For unlabeled samples $x_{u}$ , their pseudo-labels ${\hat{y}}_{u}$ are predicted by a random forest classifier trained on the labeled set L:

{\hat{y}}_{u} = \arg \max_{c} P (y = c | x_{u}; θ_{R F})

(7)

where

θ_{R F}

is the random forest model parameter. The training set is expanded by screening high-confidence pseudo-labels through cross-validation.

GCN is further introduced to model the traffic node topology relationship. Define the adjacency matrix A and the feature matrix X. The l-th layer GCN output is:

H^{(l + 1)} = σ

(8)

where D is the degree matrix, σ is the ReLU activation function, and

W^{(l)}

is the trainable weight matrix. The multi-hop correlation feature between agent nodes is captured through multi-layer aggregation and the probability of abnormal traffic is output

P_{f r a u d} = S i g m o i d (h_{f i n a l}^{T} w_{o})

, where

w_{o}

is the classification weight vector.

Multimodal real-time feature fusion model

To achieve millisecond fraud detection, a stream processing engine based on Kafka-Flink is designed.²⁹ Define multimodal data streams $D_{t} = {x_{t e x t}, x_{t r a n s}, x_{n e t}}$ within event time window $W_{t}$ , representing call text, transaction records and network behavior, respectively. Text features $e_{t e x t} \in R^{768}$ are extracted by BERT model, transaction features $x_{t r a n s}$ are encoded as segmented solo heat vectors, and network behaviors $x_{n e t}$ are normalized to temporal sequences.

The cross-modal feature fusion is realized using the multi-head attention mechanism. Let the query matrix Q, key matrix K, and value matrix V be generated by linear transformation, and the attention weights are calculated as:

A t t e n t i o n (Q, K, V) = S o f t \max (\frac{Q K^{T}}{\sqrt{d_{k}}}) v

(9)

Dynamic risk scores are generated by parallel computation with multiple attention heads and stitching the results $s_{r i s k} \in [0, 1]$ .

To cope with data distribution bias, an online incremental learning strategy is introduced. The loss function is defined as a weighted combination of cross-entropy and KL scatter:

L_{o n l i n e} = λ \cdot L_{C E} (y, \hat{y}) + (1 - λ) \cdot D_{K L} (p_{o l d} ‖ p_{n e w})

(10)

where

λ

controls the old and new knowledge retention strength, and

p_{o l d}

and

p_{n e w}

denote the probability distribution of the historical model and current output, respectively.

Federated positioning optimization algorithm

A joint optimization framework based on federated learning is proposed for NLOS localization errors. Assuming that the set of base stations is $B = {b_{1}, . . ., b_{M}}$ , the local model parameter of the mth base station is $w_{m}$ , and the global model is updated by federated averaging:

w_{g l o b a l}^{(t + 1)} = \sum_{m = 1}^{M} \frac{n_{m}}{N} w_{m}^{(t)}

(11)

where

α

is the amount of local data at base station m and

N = \sum n_{m}

In the localization parameter estimation phase, the Chan-Taylor equations are improved to jointly optimize TOA $τ$ and AOA $θ$ :

\min_{τ, θ} {\sum_{i = 1}^{K} (\frac{‖ p - p_{i} ‖}{c} - τ_{i})}^{2} + α {(∠ (p, p_{i}) - θ_{i})}^{2}

(12)

where p is the position to be estimated,

p_{i}

is the reference point, c is the speed of light, and

p_{i}

is the weight coefficient. The system of nonlinear equations is solved by Newton’s iterative method, and the initial value selection is optimized by combining particle filtering.

Intentional self-evolutionary reasoning model

Construct a DRL-based intent inference framework. Define the state space S as the current semantic context, the action space A as the possible intention transfer paths, and the reward function $r (s, a)$ is designed as:

r (s, a) = β \cdot S i m_{\cos} (s, a) - (1 - β) \cdot E n t r o p y (a)

(13)

where

S i m_{\cos}

measures the semantic similarity between action a and the true intention,

E n t r o p y (a)

penalizes path uncertainty, and

β

is a balancing factor.

The contextual features of fraudulent discourse are extracted by layered attention mechanism.³⁰ Let the input sequence $X = [x_{1}, . . ., x_{T}]$ , the lth layer Transformer output is:

z^{(l)} = l a y e r N o r m (H^{(l - 1)} + M u l t i H e a d A t t n (H^{(l - 1)}))

(14)

The final intended probability distribution is:

P_{i n t e n t} = S o f t \max (W_{p} Z^{(L)} + b_{p})

(15)

where

W_{p}

b_{p}

are the projection layer parameters and L is the total number of layers. The online feedback mechanism updates the policy network parameters by minimizing the timing difference error

δ_{t} = r_{t} + γ V (s_{t + 1}) - V (s_{t})

to achieve model self-evolution.

Experimentation and verification

To verify the validity and reliability of the methodology in this paper, the experiments build dynamic test scenarios based on the public traffic dataset NGSIM and CARLA simulation platform, covering typical failure modes such as sensor failure, actuator bias, and sudden interference. All experiments are deployed on an integrated on-board computing unit, and the comparison baseline methods include traditional robust control (RMPC), LQR-based fault tolerant control (FTC-LQR), and adaptive sliding mode control (ASMC).

Troubleshooting and response performance

For sensor failure scenarios, the multi-source data-driven diagnostic module proposed in this paper outperforms the baseline method in terms of fault detection delay (FDD) and false alarm rate (FPR). As shown in Figure 2, the experimental data shows that when the sensor failure is injected, the method in this paper completes the fault localization within 120 ms and the false alarm rate is controlled at 3.2%. Notably, due to the dynamic updating strategy of the JITL localization model, the system’s false alarm rate fluctuation under continuous condition changes is only ±0.7%, which is significantly lower than that of ±2.9% for ASMC.

Figure 2.

Fault detection performance comparison.

Trajectory tracking accuracy comparison

In the bidirectional four-lane lane-changing scenario with actuator bias fault (40% steering torque attenuation) condition, the root mean square error (RMSE) of lateral position tracking of this paper’s method is 0.18 m, as shown in Figure 3, which is 41.9% and 28.0% higher than that of RMPC and FTC-LQR, respectively. For longitudinal speed tracking, the proposed HDP controller has a response delay of 0.42 s at 80 km/h high speed condition, which is less different from ASMC, but the overshoot is reduced to 8.3%. This indicates that the two-layer optimized structure has a unique advantage in balancing response speed and stability.

Figure 3.

Cross-sectional tracking error distribution.

Dynamic interference suppression capability

In order to test the control robustness under sudden disturbance, the experiment simulates the composite working condition of lateral wind disturbance (equivalent wind speed 12 m/s) and sudden change of road surface adhesion coefficient. As shown in Figure 4, the lateral acceleration fluctuation range of this paper’s method is controlled within ±0.35 g, and the oscillation convergence time is shortened to 1.2 s. By quantitatively analyzing the control energy consumption, the average power demand of the proposed method in the disturbance suppression phase is 2.1 kW, which is 44.7% lower than that of the ASMC, and verifies the energy efficiency advantage of the collaborative optimization strategy of evaluation-execution network.

Figure 4.

Interference suppression energy consumption.

Extreme operating conditions reliability verification

In the extreme scenarios constructed by the CARLA simulation platform, the success rate of this paper’s method in the emergency obstacle avoidance test is maintained at 89.3%, and no control instability occurs. Further analysis shows that the JITL-driven local model update frequency is automatically boosted to 50 Hz under extreme working conditions, which ensures the ability to capture nonlinear dynamics quickly.

Conclusion

In the face of the severe challenges brought by the continuous upgrading of telecom fraud technology, this paper constructs a multi-dimensional intelligent prevention and control system integrating big data and artificial intelligence algorithms, and realizes technical breakthroughs from the core aspects of encrypted traffic parsing, real-time early warning, identification of high-risk groups, accurate positioning and intent inference, providing solutions with both theoretical innovations and practical values for the prevention and control of telecom fraud.

Through the synergistic optimization of deep packet inspection and semi-supervised learning, the problem of semantic reduction of HTTPS encrypted traffic is solved, combined with the topological modeling of multi-hop agent behavior by GCN, the recognition accuracy of fraud tools under encrypted environments is significantly improved; based on the streaming computation engine and multimodal feature fusion mechanism of Kafka-Flink, the timeliness bottleneck of real-time processing of massive heterogeneous data is overcome, and millisecond-level anomaly detection and dynamic reasoning are realized. millisecond anomaly detection and dynamic model updating; through the high-order correlation mining of HIN and GNN, a fine-grained group portrait model is constructed, which breaks through the limitations of traditional static analysis and accurately locates user nodes susceptible to fraud. In terms of the balance between privacy protection and accurate combat, the innovative federal learning framework is combined with the improved Chan-Taylor localization algorithm, which provides a high-precision and low privacy leakage risk solution for virtual base station tracking through the estimation of signal scattering parameters in NLOS environment. In addition, the DRL-driven intent inference framework, through the semantic fusion of hierarchical attention mechanism and knowledge graph, realizes adaptive defense against new fraudulent discourse, forming a closed loop of detection-localization-blocking-warning for the whole chain of prevention and control.

Future research will focus on three breakthroughs: first, the introduction of large language model to enhance the generation of fraudulent discourse countermeasures and improve the foresight of intent inference; second, exploring cross-agency anti-fraud coalition empowered by blockchain technology to crack the problem of data silos and privacy compliance; and third, optimizing the parsing algorithms of quantum encrypted traffic to cope with the challenges of next-generation communication protocols. This research provides a theoretical foundation and technical path for building an intelligent and proactive universal anti-fraud ecology, which has significant social benefits and industry promotion value.

Footnotes

ORCID iD

Shanshan Yu

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

Declaration of Conflicting Interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

Malkiel

. The wire fraud boom. Oklahoma Law Rev 2022; 75: 531.

O'Sullivan

. Morrison’s flawed “focus” test and the transnational application of the (misinterpreted) wire fraud statute. Am Crim Law Rev 2024; 61: 251.

Asaoka

Soma

Yamauchi

, et al. Service identification of TLS flows based on handshake analysis. J Inf Process 2023; 31: 131–142.

Zheng

Wang

Liu

, et al. Real-time big data processing framework: challenges and solutions. Appl Math Inf Sci 2015; 9: 3169.

Mavaluru

Mubarakali

Narapureddy

, et al. Deep convolutional neural network based real-time abnormal behavior detection in social networks. Comput Electr Eng 2023; 111: 108987.

Wang

, et al. Integrating individual factors to construct recognition models of consumer fraud victimization. Int J Environ Res Publ Health 2022; 19: 461.

Pan

Ding

Zhong

, et al. Collaborative governance: blockchain-based federated learning for construction safety service. IEEE Trans Eng Manag 2025; 11: 325.

Guo

Yan

Zhao

, et al. R/B-SecArch: a strong isolated SoC architecture based on red/black concept for secure and efficient cryptographic services. Microelectron J 2023; 142: 106024.

Shim

K-S

Kim

Lee

. Research on quantum key, distribution key and post-quantum cryptography key applied protocols for data science and web security. J Web Eng 2024; 23: 813–830.

10.

Manuel

Daimi

. Implementing cryptography in LoRa based communication devices for unmanned ground vehicle applications. SN Appl Sci 2021; 3: 397.

11.

Alahmari

Rajeyyagari

Al-Turjman

. Davis Mayer streebog cryptographic hash-based blockchain for secure transaction management using SDN in IIoT applications. J Signal Process Syst 2023; 95: 241–252.

12.

Jiang

Song

, et al. A survey on multi-access edge computing applied to video streaming: some research issues and challenges. IEEE Commun Surv Tutorials 2021; 23: 871–903.

13.

Yang

Shami

. A lightweight concept drift detection and adaptation framework for IoT data streams. IEEE Internet Things M 2021; 4: 96–101.

14.

Qian

Dong

Zhang

, et al. Streaming long video understanding with large language models. Adv Neural Inf Process Syst 2024; 37: 119336–119360.

15.

W-W

Jiang

Yang

K-F

, et al. Lgsnet: a two-stream network for micro-and macro-expression spotting with background modeling. IEEE Trans Affect Comput 2023; 15: 223–240.

16.

Montiel

Halford

Mastelini

, et al. River: machine learning for streaming data in python. J Mach Learn Res 2021; 22: 1–8.

17.

Shmueli

. Predicting intention to receive COVID-19 vaccine among the general population using the health belief model and the theory of planned behavior model. BMC Public Health 2021; 21: 1–13.

18.

Kamalanon

Chen

J-S

T-T-Y

. “Why do we buy green products?” an extended theory of the planned behavior model for green product purchase behavior. Sustainability 2022; 14: 689.

19.

Alkhawaldeh

ALBashtawy

Rayan

, et al. Application and use of Andersen’s behavioral model as theoretical framework: a systematic literature review from 2012–2021. Iran J Public Health 2023; 52: 1346–1354.

20.

Zhang

Chen

Feng

, et al. Toward reliable non-line-of-sight localization using multipath reflections. Proc ACM Interact Mob Wearable Ubiquitous Technol 2022; 6: 1–25.

21.

Guo

Liu

. A ray-tracing-based single-site localization method for non-line-of-sight environments. Sensors 2024; 24: 7925.

22.

Zhu

Guo

Chen

, et al. Non-line-of-sight targets localization algorithm via joint estimation of DoD and DoA. IEEE Trans Instrum Meas 2023; 72: 1–11.

23.

Borukhson

Lorenz-Spreen

Ragni

. When does an individual accept misinformation? an extended investigation through cognitive modeling. Comput Brain Behav 2022; 5: 244–260.

24.

Robinson

. The cognition hypothesis, the triadic componential framework and the SSARC model: an instructional design theory of pedagogic task sequencing. In: The Cambridge handbook of task-based language teaching. Cambridge: Cambridge University Press, 2022: 205–225.

25.

Baker

D’Esterre

Weaver

. Executive function and theory of mind in explaining young children’s moral reasoning: a test of the hierarchical competing systems model. Cogn Dev 2021; 58: 101035.

26.

Biskas

Sirois

Webb

. Using social cognition models to understand why people, such as perfectionists, struggle to respond with self‐compassion. Br J Soc Psychol 2022; 61: 1160–1182.

27.

Chen

Y-W

C-W

, et al. Chemical toxicity prediction based on semi-supervised learning and graph convolutional neural network. J Cheminf 2021; 13: 1–16.

28.

Mao

Yin

Zhang

, et al. Pseudo-labeling generative adversarial networks for medical image classification. Comput Biol Med 2022; 147: 105729.

29.

Muvva

. Streaming at scale: benchmarking data platforms for the era of continuous analytics. IJAIDR-J Adv Devel Res 2025; 16: 13.

30.

, et al. Hiformer: sequence modeling networks with hierarchical attention mechanisms. IEEE/ACM Trans Audio Speech Lang Process 2023; 31: 3993–4003.