Combining GCN with attribute mathematical theory to improve attribute recognition of recycled coarse aggregate

Abstract

Recycled coarse aggregate exhibits significant heterogeneity in its physical and chemical properties, along with complex interdependencies among various feature attributes. These characteristics often lead to challenges in achieving high accuracy in attribute recognition, highlighting the urgent need for a structured and intelligent analytical framework. To address this issue, this study proposes a novel hybrid approach that integrates graph convolutional networks (GCN) with attribute mathematics theory, aiming to enhance feature representation and improve recognition performance. The method begins by constructing a multi-dimensional attribute graph based on the physicochemical properties of recycled coarse aggregate, capturing the intrinsic correlations among different features. A multilayer GCN is then employed to extract deep-level, globally coupled feature representations. Subsequently, attribute mathematics theory is applied to simplify and logically abstract the output features through membership functions and covering operators, enabling effective feature selection and dimensionality reduction. The refined feature set is finally fed into a discriminant classifier to achieve accurate attribute recognition. Experimental results demonstrate the superiority of the proposed fusion model over traditional machine learning methods such as SVM, random forest, and MLP. The model achieves average recognition accuracies and recall rates of 0.89 and 0.89 across eight material categories, and 0.90 and 0.89 across seven particle size ranges, respectively. Five-fold cross-validation yields an average accuracy between 0.889 and 0.918, with a low standard deviation of 0.012, indicating strong stability and generalization performance. Moreover, the feature simplification strategy achieves an average feature reduction rate of 0.67 while retaining 0.92 of the original information. 
These results confirm that the proposed GCN–attribute mathematics framework significantly enhances the attribute recognition capability of recycled coarse aggregate, offering a robust and efficient solution for intelligent identification in sustainable construction materials research.

Keywords

recycled coarse aggregate; graph convolutional network; attribute mathematics; feature reduction; attribute recognition

Introduction

The resource utilization of recycled coarse aggregate plays an important role in the green building materials industry chain. Its quality and performance are directly related to the durability and frost resistance of concrete engineering.^1,2 The error in the identification of recycled coarse aggregate properties directly affects the material classification and engineering adaptation, and increases the cost of testing and screening. The fusion method helps to improve the identification efficiency and application reliability. The physical properties of recycled coarse aggregate include density, porosity, and water content, while the chemical composition includes indicators such as carbonate content, silicate content, and alkali metal content.^3,4 There is a complex coupling between these attributes, and nonlinear dependence affects the subsequent mechanical property judgment and environmental risk assessment. Faced with the increasing requirements for aggregate performance in reinforced concrete structures, traditional feature extraction strategies make it difficult to accurately distinguish different quality levels in a high-dimensional redundant information environment, and classifiers are prone to misjudgment or missed detection in practical applications.^5,6 In the green building materials development strategy, the quality control and application requirements of recycled coarse aggregate are becoming increasingly stringent. The current specifications set clear boundaries for particle size distribution and calcium carbonate content.^7,8 In the face of the dual goals of resource recycling and emission reduction, it is necessary to accurately identify the properties of recycled coarse aggregate to achieve safe and reliable material performance assurance.^9,10 The current evaluation methods rely on manual sampling and traditional testing methods for physical and chemical properties, which are inefficient and cannot reveal the deep correlation between properties. 
Innovative solutions based on big data and intelligent algorithms are urgently needed.

Figure 1 shows several types of recycled coarse aggregate. Aggregates with a particle size between 5 mm and 40 mm are classed as coarse aggregates, commonly known as stones; crushed stone and pebbles are the most common types. Together with sand, coarse aggregate forms the skeleton of concrete.

Figure 1.

Recycled coarse aggregate.

At present, machine learning algorithms such as multi-dependent support vector machine (SVM),^11,12 random forest (RF)^13,14 and multilayer perceptron (MLP)^15,16 are widely used in the identification of recycled coarse aggregate properties. Salimbahrami SR proposed producing green concrete from recycled concrete waste. The mechanical properties of natural concrete, recycled concrete and recycled fiber concrete were compared experimentally, and the SVM method was used to predict their compressive strength. The SVM performed more consistently in 124 groups of tests, verifying the feasibility of recycled concrete in terms of mechanical properties and environmental benefits.^17 SVM realizes nonlinear mapping through a kernel function and is suitable for small-sample classification, but its training complexity is high and its ability to scale to large data sets is limited; several studies on the identification of recycled coarse aggregate properties report the same problems.^18,19 Yang found that RF showed significant advantages in predicting the compressive strength of recycled aggregate self-compacting concrete. Its error index was low, and its tree-based structure gave the model a stronger parameter interpretation ability, so it offered better interpretability while maintaining high accuracy and was particularly suitable for modeling the nonlinear behavior of composite materials.^20 Random forests improve noise resistance by integrating decision trees, but they struggle to express the complex coupling between attributes explicitly, which limits the deep mining of associated features.^21,22 As a classic neural network structure, the multilayer perceptron can capture nonlinear relationships, but it lacks an effective screening mechanism for high-dimensional redundant features, which easily leads to model overfitting.
When dealing with multi-dimensional attribute coupling, the above methods often ignore the structured dependency between attributes, which limits the improvement of recognition performance. The uneven distribution of samples and high-dimensional redundant information also increases the difficulty of model recognition, affecting its promotion and application in practical engineering.^23,24 Existing research still has obvious deficiencies in accurately revealing the coupling relationship between attributes and improving recognition efficiency, and it is difficult to meet the dual requirements of high accuracy and real-time performance in the application of recycled coarse aggregate.

In order to deal with the problem of multi-dimensional attribute coupling and redundant features, some studies have applied principal component analysis (PCA)^25,26 and maximum relevance and minimum redundancy (mRMR)^27,28 algorithms for dimensionality reduction, but their linear assumptions limit the revelation of nonlinear dependencies. Attribute mathematical theory is based on lattice theory and membership function, which can achieve logical simplification and rule extraction of attributes, and has shown strong explanatory power in system reasoning and classification tasks.^29,30 Graph convolutional networks (GCNs) have been widely used in many fields due to their efficient integration of node and neighbor information. They are suitable for modeling complex correlation structures between attributes.^31,32 However, the use of GCN alone lacks support for logical simplification of deep features, making it difficult to eliminate redundant information and improve the interpretability of results. Combining attribute mathematical theory with GCN can capture global coupling features, complete the screening and abstraction of key attributes through membership and covering operators, and enhance the model’s discrimination ability and logical expression. Based on this fusion idea, this paper aims to improve the accuracy and stability of recycled coarse aggregate attribute recognition and overcome the limitations of traditional methods in feature redundancy and nonlinear association processing.^33,34

This study addresses the challenge of attribute recognition for recycled coarse aggregate, where conventional methods face limitations in handling complex coupling relationships among multi-dimensional features. To enhance both accuracy and efficiency, we propose a novel hybrid framework that integrates graph convolutional networks (GCN) with attribute mathematics theory, enabling deep modeling and logical simplification of attribute association structures. The proposed approach leverages the topological learning capability of GCNs to model the global interdependencies among various physicochemical attributes of recycled coarse aggregate. This is achieved by constructing a multi-dimensional attribute graph that captures the intrinsic correlations among heterogeneous features. Subsequently, attribute mathematics theory is applied to refine the extracted features through membership functions and covering operators. This step effectively eliminates redundancy, highlights key discriminative attributes, and enhances both the interpretability and generalization performance of the model. Unlike traditional recognition models that are sensitive to high-dimensional and redundant data, the proposed method combines data-driven feature fusion with logical reasoning mechanisms, offering a more robust and explainable solution. Through graph-based attribute mapping, coupled with mathematical feature simplification, the framework significantly improves the classification performance of downstream discriminant models.

Experimental results demonstrate the superiority of the GCN–attribute mathematics fusion scheme over conventional machine learning approaches such as SVM, random forest, and MLP. The model achieves high recognition accuracy and recall across multiple material categories and particle size ranges, along with strong stability and low variance in cross-validation tests. Furthermore, the feature simplification strategy achieves substantial dimensionality reduction while preserving most of the original information, validating its effectiveness in enhancing recognition performance. The innovation of this work lies in the development of a hybrid model that seamlessly integrates deep learning with mathematical logic, providing a new paradigm for intelligent detection in the field of recycled construction materials. This approach not only advances theoretical methodologies but also offers significant potential for practical engineering applications, contributing to the development of sustainable infrastructure technologies.

Method

Figure 2 systematically presents the architecture of recycled coarse aggregate attribute recognition based on GCN and attribute mathematical theory. A multi-level layout design is adopted. The data input layer collects physical and chemical attributes and generates an attribute matrix, correlation matrix, and co-occurrence frequency matrix. The processing area displays the core modules of graph structure modeling (Z-score standardization and adjacency matrix construction), GCN double-layer feature extraction, and attribute mathematical simplification (membership calculation → covering operator → logical simplification). The decision output layer realizes quality grade classification and contribution analysis.

Figure 2.

Recycled coarse aggregate attribute recognition architecture based on GCN and attribute mathematical theory.

Building a multi-dimensional attribute graph structure model

The physical and chemical properties of recycled coarse aggregate are represented as nodes, and a graph adjacency matrix is constructed from statistical correlation and attribute co-occurrence. This graph represents the spatial coupling and logical association between the attributes within a sample and serves as the input graph for GCN processing.

Attribute node definition and graph structure construction

In the identification of recycled coarse aggregate properties, physical and chemical properties are regarded as nodes in the graph, and each type of property data is systematically integrated. For the attribute data set of the sample, the correlation index between the attributes is calculated based on the statistical analysis method. Specifically, the Pearson Correlation Coefficient^35,36 is used as a tool to quantify the degree of linear correlation between the attributes. Its calculation formula is

r_{i j} = \frac{\sum_{k = 1}^{n} (x_{i k} - \bar{x_{i}}) (x_{j k} - \bar{x_{j}})}{\sqrt{\sum_{k = 1}^{n} {(x_{i k} - \bar{x_{i}})}^{2}} \sqrt{\sum_{k = 1}^{n} {(x_{j k} - \bar{x_{j}})}^{2}}}

(1)

where $r_{i j}$ represents the correlation coefficient between attributes $i$ and $j$; $x_{i k}$ is the observed value of attribute $i$ in the $k$-th sample; $\bar{x_{i}}$ is the mean of attribute $i$; and $n$ represents the total number of samples. The calculated correlation coefficient matrix provides the basis for the strength of the correlation between the attributes.
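As a minimal sketch of equation (1), the correlation matrix can be computed with NumPy. The array layout (rows are samples, columns are attributes) and the toy values are assumptions made for illustration and do not come from the study's data set.

```python
import numpy as np

def pearson_matrix(X):
    """Return the matrix of r_ij from Eq. (1).

    X has shape (n_samples, n_attributes): column i holds x_ik for all k.
    """
    Xc = X - X.mean(axis=0)              # x_ik minus the attribute mean
    cross = Xc.T @ Xc                    # numerator: sum over k of products
    norms = np.sqrt(np.diag(cross))      # square roots of the squared-deviation sums
    return cross / np.outer(norms, norms)

# Hypothetical toy data: 5 samples, 3 attributes (illustrative values only)
X = np.array([[2.4, 5.1, 3.0],
              [2.5, 4.8, 2.7],
              [2.3, 6.0, 3.9],
              [2.6, 4.2, 2.1],
              [2.2, 6.5, 4.4]])
R = pearson_matrix(X)
```

The result is symmetric with a unit diagonal and should agree with `np.corrcoef(X, rowvar=False)`.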

Combined with the co-occurrence frequency analysis of attributes, statistically significant attribute pairs are selected as the basis for the edges between nodes in the graph. The edge weight is obtained by weighted fusion of the correlation coefficient and the co-occurrence frequency and expresses the spatial coupling strength between attributes. The element $a_{i j}$ of the adjacency matrix represents the connection strength between nodes $i$ and $j$. The calculation formula is as follows:

a_{i j} = α \cdot r_{i j} + (1 - α) \cdot f_{i j}

(2)

where $f_{i j}$ is the co-occurrence frequency of attributes, and $α$ is the weight adjustment factor, ranging from 0 to 1. This adjacency matrix accurately depicts the complex coupling relationship between the attributes of recycled coarse aggregate through weighted fusion, forming a complete multi-dimensional attribute graph structure. This structure not only covers the statistical correlation of attributes but also reflects the coexistence of attributes in actual samples, becoming the input basis for subsequent graph convolution operations.

The parameters listed in Table 1 are the key statistics and weight settings for constructing the multi-dimensional attribute graph structure of recycled coarse aggregate. The total number of samples reflects the data scale and ensures the stability and representativeness of statistical analysis. The correlation coefficient threshold screens out weakly correlated attribute pairs to avoid the interference of invalid connections on the graph structure and improve the accuracy of graph expression. The weight adjustment factor achieves a reasonable balance between correlation and attribute co-occurrence frequency, highlighting the statistically linear relationship between attributes. The co-occurrence frequency threshold is used to eliminate low-frequency noise connections and ensure the sparsity and effectiveness of the adjacency matrix.

Table 1.

Attribute correlation statistics and weight configuration parameters.

Parameters	Description	Value range/setting	Remarks
Total sample size	Number of data samples used	150	Reflects actual experimental scale
Correlation coefficient threshold	Minimum correlation to retain	0.35	Attribute pairs below this threshold are excluded
Weight adjustment factor	Ratio for weighting correlation and co-occurrence	0.7	Emphasizes statistical correlation importance
Co-occurrence frequency threshold	Minimum frequency for co-occurrence	0.2	Ensures meaningful edge connections
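The fusion in equation (2) with the Table 1 settings could be sketched as follows. The source does not state how the two thresholds interact, how negative correlations are treated, or whether self-loops are kept, so three choices here are assumptions: each component is thresholded separately before fusion, the correlation threshold is applied to $r_{i j}$ itself, and the diagonal is set to zero. The function name and toy matrices are hypothetical.

```python
import numpy as np

def build_adjacency(R, F, alpha=0.7, r_min=0.35, f_min=0.2):
    """Fuse correlation R and co-occurrence F as in Eq. (2)."""
    R = np.asarray(R, dtype=float)
    F = np.asarray(F, dtype=float)
    # Assumption: weak components are zeroed separately before fusion.
    R_kept = np.where(R >= r_min, R, 0.0)
    F_kept = np.where(F >= f_min, F, 0.0)
    A = alpha * R_kept + (1.0 - alpha) * F_kept
    np.fill_diagonal(A, 0.0)             # assumption: no self-loops
    return A

# Hypothetical 3-attribute example
R = [[1.0, 0.8, 0.1],
     [0.8, 1.0, 0.4],
     [0.1, 0.4, 1.0]]
F = [[1.0, 0.5, 0.3],
     [0.5, 1.0, 0.1],
     [0.3, 0.1, 1.0]]
A = build_adjacency(R, F)
```

Under these assumptions, a pair with weak correlation but frequent co-occurrence still keeps a small edge weight, which matches the role the text assigns to the co-occurrence term.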

Property graph data preprocessing and standardization

The constructed attribute node feature matrix is then standardized. The original attribute data have inconsistent dimensions and large differences in numerical range, so feeding them directly into the graph convolutional network may introduce training bias. The Z-score standardization method is used to adjust the attribute value distribution. The calculation formula is as follows:

z_{i k} = \frac{x_{i k} - μ_{i}}{σ_{i}}

(3)

where $z_{i k}$ is the standardized value of attribute $i$ in the $k$-th sample, and $μ_{i}$ and $σ_{i}$ represent the mean and standard deviation of attribute $i$, respectively. This process eliminates the dimensional differences between the attributes, ensuring that the input features are within a unified numerical scale, which is conducive to the graph convolutional network capturing the deep coupling features between attributes.
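Equation (3) can be illustrated with a short NumPy sketch. The population standard deviation (`ddof=0`) and the handling of constant columns are assumptions, since the source does not specify either; the toy values are illustrative.

```python
import numpy as np

def zscore(X):
    """Column-wise Z-score of Eq. (3); X has shape (n_samples, n_attributes)."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)                        # assumption: population std (ddof=0)
    sigma = np.where(sigma > 0, sigma, 1.0)      # constant columns map to zero
    return (X - mu) / sigma

# Hypothetical attributes on very different scales
X = np.array([[2400.0, 0.051, 7.9],
              [2500.0, 0.048, 8.4],
              [2300.0, 0.060, 9.1],
              [2600.0, 0.042, 8.0]])
Z = zscore(X)
```

After this step every attribute column has zero mean and unit variance, regardless of its original unit.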

The adjacency matrix of the attribute graph is symmetrically normalized to improve the stability and convergence speed of graph convolution. The normalization operation of the adjacency matrix is expressed as

\hat{A} = D^{- \frac{1}{2}} A D^{- \frac{1}{2}}

(4)

where $D$ is the degree matrix and $\hat{A}$ is the normalized adjacency matrix. This normalization step ensures the balance of information propagation during multilayer convolution, avoids the adverse effects of node degree differences on feature updates, and promotes the effective fusion of information between attributes. The multi-dimensional attribute graph structure after the above processing has clear topological relationships and high-quality node feature representations, which provides a solid foundation for the input of the graph convolutional network and promotes the accurate implementation of feature extraction and recognition tasks.
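A minimal sketch of the normalization in equation (4) is given below. It assumes non-negative edge weights and takes the degree of a node as its row sum; leaving isolated nodes with zero rows is also an assumption.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D^{-1/2} A D^{-1/2} as in Eq. (4)."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)                      # diagonal of the degree matrix D
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    # Scaling rows and columns is equivalent to the two diagonal products.
    return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

# Hypothetical path graph on three nodes
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
A_hat = normalize_adjacency(A)
```

For this toy graph the degrees are 1, 2 and 1, so each edge weight becomes $1 / \sqrt{2}$.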

Apply GCN to extract related attribute features

The constructed attribute graph is input into the graph convolutional network, and the hidden feature expression under the global coupling attribute is extracted through adjacency propagation and convolution operations. The extraction process uses a multilayer GCN structure to transmit attribute information layer by layer to obtain a comprehensive representation with context dependency.

Multilayer graph convolution information propagation mechanism

After the constructed multi-dimensional attribute graph is input into the graph convolutional network, the node connections described by the adjacency matrix drive the diffusion and aggregation of attribute information. The features of each node are propagated along its edges, weighted by the adjacency values, to form a high-dimensional implicit feature representation. Unlike traditional two-dimensional convolution, the convolution is performed on the graph structure: it draws on spectral-domain filtering theory and uses the normalized adjacency matrix to transmit information. The calculation form of a single-layer graph convolution is defined as

H^{(l + 1)} = σ (\hat{A} H^{(l)} W^{(l)})

(5)

where $H^{(l)}$ represents the node feature matrix of the $l$-th layer; $W^{(l)}$ is the weight parameter matrix of the $l$-th layer; $\hat{A}$ is the symmetric normalized adjacency matrix; and $σ$ is the nonlinear activation function. The input layer is the standardized node feature matrix. This formula reflects the fusion process of node features and neighbor node information, realizes feature space mapping with the help of a weight matrix, and enhances expression ability with nonlinear activation.

The hierarchical structure allows attribute information to diffuse in multiple steps in the graph, capturing the deep coupling relationship between attributes. With multiple layers of stacking, node features not only reflect their own attributes but also integrate multi-order neighbor information to enhance the contextual semantics of node features. This mechanism effectively solves the bottleneck that nonlinear and complex associations between attributes are difficult to express with traditional feature engineering, laying a solid foundation for subsequent feature extraction.
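The layer-wise propagation of equation (5) can be sketched as a forward pass in NumPy. This is an untrained illustration, not the authors' PyTorch implementation: the node count, input width, random weights and placeholder adjacency are hypothetical, while the ReLU activation and the hidden sizes 64 and 128 follow the experimental settings reported in the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def gcn_layer(A_hat, H, W, activation=relu):
    """One propagation step of Eq. (5): activation(A_hat @ H @ W)."""
    return activation(A_hat @ H @ W)

rng = np.random.default_rng(0)
n_nodes, in_dim = 6, 4                                   # hypothetical sizes
A_hat = np.full((n_nodes, n_nodes), 1.0 / n_nodes)       # placeholder normalized adjacency
H0 = rng.normal(size=(n_nodes, in_dim))                  # stands in for the standardized features
W0 = rng.normal(scale=0.1, size=(in_dim, 64))
W1 = rng.normal(scale=0.1, size=(64, 128))

H1 = gcn_layer(A_hat, H0, W0)    # first-order neighbor information
H2 = gcn_layer(A_hat, H1, W1)    # second-order neighbor information
```

Stacking the two calls is what lets each node's representation absorb information from neighbors two hops away.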

Feature representation optimization and global coupling capture

The node feature matrix output by multilayer convolution is updated layer by layer through iteration, and a comprehensive representation of the fusion graph structure information is obtained. In order to improve the discriminability of features, a convolution kernel design with weight sharing is adopted to ensure the consistency of the feature extraction process and the control of parameter quantity. The node feature expression not only considers the local neighbor relationship but also integrates the context of the entire graph to reflect the global coupling characteristics of the attributes. This process can be described in the following mapping function form:

Z = f (H^{(L)}) = Softmax (H^{(L)} W^{(L)})

(6)

where $H^{(L)}$ is the output feature matrix of the last layer of graph convolution; $W^{(L)}$ is the corresponding weight matrix; and the function $f$ obtains the final node feature representation $Z$ through activation function and weight mapping. This output implies multilayer neighborhood information and has context dependency, which is beneficial to pattern recognition and classification in downstream tasks.

During the training process, the cross entropy loss function is used to optimize the model, which enables the network to learn high-quality feature expressions. This training strategy strengthens the ability to express complex relationships between attributes and improves the generalization performance of the model. The multilayer GCN architecture and its parameter tuning effectively solve the problem of insufficient attribute coupling information extraction, so that the associated attribute features are reflected in a structured form, supporting subsequent in-depth analysis and refined applications.
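The output mapping of equation (6) and the cross-entropy objective can be sketched as below. The batch size, random inputs and labels are hypothetical; the 128-dimensional input and the eight output classes follow the hidden size and the number of material categories reported in the paper. Treating each row as one item to classify is an assumption.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(Z, labels):
    """Mean negative log-probability of the true class."""
    n = Z.shape[0]
    return -np.mean(np.log(Z[np.arange(n), labels] + 1e-12))

rng = np.random.default_rng(0)
H_L = rng.normal(size=(5, 128))                # hypothetical last-layer features
W_L = rng.normal(scale=0.1, size=(128, 8))     # 8 material categories
Z = softmax(H_L @ W_L)                         # Eq. (6)
labels = np.array([0, 3, 7, 1, 2])             # hypothetical ground truth
loss = cross_entropy(Z, labels)
```

Each row of `Z` sums to one, so it can be read as a class probability vector; a uniform prediction over four classes gives a loss of $\ln 4$.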

Implement attribute mathematical feature simplification mechanism

In the feature representation of GCN output, the membership function and covering operator in attribute mathematics are applied to construct the attribute dependency matrix. Through the logical simplification of redundant attributes, the core attribute set that maintains the classification discriminative power is screened out.

Constructing the attribute dependency matrix

The attribute features are extracted from the node feature matrix output by the graph convolutional network, and the membership function is used to evaluate the fuzzy attribution of each attribute to quantify the correlation and dependence between the attributes. The specific operation uses the membership function $μ_{A} (x)$ , which is defined as

μ_{A} (x) = \frac{1}{1 + e^{- α (x - β)}}

(7)

where $x$ represents the characteristic value of the attribute to be evaluated; $α$ controls the slope of the function and determines the rate of change of the membership (it is distinct from the weight adjustment factor in equation (2)); and $β$ is the threshold adjustment parameter of the membership. This function can map continuous features to the interval [0,1], reflecting the strength of attribute attribution. After calculating the membership of all attribute features, the covering operator is used to construct the attribute coverage relationship, that is, to determine whether a certain attribute set can cover the target attribute, thereby forming an attribute dependency matrix. The matrix element $r_{i j}$ (here a dependency degree, not the Pearson coefficient of equation (1)) represents the degree of dependence of attribute $i$ on attribute $j$, which is defined as

r_{i j} = \max_{a \in A_{i}} \min (μ_{a} (x_{i}), μ_{a} (x_{j}))

(8)

where $A_{i}$ is the attribute set, and $μ_{a} (x_{i})$ and $μ_{a} (x_{j})$ represent the values of attributes i and j on the membership function, respectively. The covering operator accurately describes the fuzzy dependencies between attributes in the form of a mathematical matrix, providing a clear criterion and operational basis for subsequent logical simplification, ensuring the systematic and scientific nature of the attribute screening process.
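Equations (7) and (8) can be read as a sigmoid membership followed by a max-min composition. The sketch below takes that reading under two assumptions that the source does not confirm: every membership function $μ_{a}$ is a sigmoid with its own hypothetical $(α, β)$ pair, and the same family of functions is used for every attribute $i$. The feature values are illustrative.

```python
import numpy as np

def membership(x, alpha, beta):
    """Sigmoid membership of Eq. (7)."""
    return 1.0 / (1.0 + np.exp(-alpha * (x - beta)))

def dependency_matrix(x, params):
    """Max-min composition of Eq. (8).

    x      : one feature value per attribute, shape (m,)
    params : list of (alpha, beta) pairs, one per membership function a
    """
    M = np.array([membership(x, a, b) for a, b in params])    # M[a, i] = mu_a(x_i)
    pairwise_min = np.minimum(M[:, :, None], M[:, None, :])   # min(mu_a(x_i), mu_a(x_j))
    return pairwise_min.max(axis=0)                           # max over a

x = np.array([0.2, 1.0, -0.5])            # hypothetical GCN output features
params = [(1.0, 0.0), (4.0, 0.5)]         # hypothetical (alpha, beta) pairs
D = dependency_matrix(x, params)
```

With this construction the matrix is symmetric by definition, and all entries stay between 0 and 1.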

The three-dimensional heat map shown in Figure 3 reflects the spatial distribution characteristics of the attribute dependency matrix and reveals the complex changes in the coupling strength between different attributes. The two horizontal axes represent the attribute indices, and the vertical axis shows the dependency strength between two attributes. The overall matrix shows a regular symmetry, reflecting the mutuality of the attribute dependency relationship. The local peak areas in the heat map show that some attributes have a strong influence on multiple related attributes. This phenomenon usually originates from the inherent logical or physicochemical connection between attributes, reflecting the heterogeneity of attribute coupling within the sample. In contrast, some regions show low-intensity dependence, indicating that the association between attributes is weak or even relatively independent, which helps to identify redundant information and provides theoretical support for attribute simplification. The diversity of dependency strength reflects the complexity of sample attribute performance, emphasizing the necessity of accurately screening key attributes in the mathematical feature reduction mechanism. The overall spatial coupling pattern provides an intuitive and rigorous basis for a deep understanding of the intrinsic correlation of recycled coarse aggregate properties.

Figure 3.

3D heat map of attribute dependency matrix.

Logical simplification core attribute screening

Using the constructed attribute dependency matrix, logical simplification is performed on the attribute set to eliminate the influence of redundant attributes on classification performance. The simplification process verifies the coverage ability of attribute subsets and confirms that the minimized attribute set still maintains the overall discrimination power. The simplified set $S \subseteq A$ is defined to satisfy:

R (S) = R (A), \quad | S | = \min_{S' \subseteq A, \, R (S') = R (A)} | S' |

(9)

Here $R (S)$ and $R (A)$ represent the dependency matrices of the attribute subset and the complete attribute set, respectively, and $| S |$ represents the number of attributes in the reduced set. This equation ensures that the reduced set minimizes the number of attributes while maintaining the integrity of the attribute relationship. The algorithm iteratively deletes the attributes that contribute the least to the discrimination and monitors the changes in the dependency matrix until no more attributes can be deleted without losing the integrity of the matrix. The reduction process ensures the compactness and expression efficiency of the feature space and improves the computational performance and discrimination accuracy of the model. The core attribute set selected by the logical reduction result has shown strong discrimination ability after further verification, laying a solid foundation for subsequent model construction and analysis.
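The iterative deletion described above could look like the greedy sketch below. The source does not give an operational test for $R (S) = R (A)$ or define the contribution score, so both are stand-ins: an attribute counts as safely removable if some retained attribute depends on it with strength at least `tau`, and `contribution` is a hypothetical per-attribute score supplied by the caller. A greedy pass of this kind is a heuristic and does not guarantee the smallest possible set.

```python
import numpy as np

def greedy_reduce(R, contribution, tau=0.8):
    """Drop low-contribution attributes while every dropped one stays covered.

    R            : attribute dependency matrix, shape (m, m)
    contribution : hypothetical discrimination score per attribute, shape (m,)
    tau          : assumed coverage threshold standing in for R(S) = R(A)
    """
    m = R.shape[0]
    kept = list(range(m))
    for j in np.argsort(contribution):           # least useful attribute first
        trial = [i for i in kept if i != j]
        if not trial:
            break
        removed = [i for i in range(m) if i not in trial]
        # Every removed attribute must be covered by some retained attribute.
        if all(R[trial, r].max() >= tau for r in removed):
            kept = trial
    return kept

# Hypothetical example: attribute 1 is strongly tied to attribute 0
R = np.array([[1.0, 0.9, 0.2],
              [0.9, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
contribution = np.array([0.5, 0.1, 0.4])
kept = greedy_reduce(R, contribution)
```

In the toy example, attribute 1 is removed because attribute 0 covers it, while attribute 2 is retained because no remaining attribute covers it.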

Complete attribute recognition logic modeling and classification prediction

The attribute features after simplification and graph learning are input into the discriminant classifier model for training. The classifier uses logical rule generation and feature matching mechanisms to make the final prediction output of the sample attribute type.

Logical rule generation and feature matching mechanism construction

The core attribute features obtained after simplification are integrated into a unified input vector and input into the discriminant classifier for training. The discriminant model is based on the generation of logical rules and feature matching mechanisms. Through hierarchical analysis of input features, it automatically extracts the implicit discriminant relationship between attributes. The construction process sets the corresponding logical expression for each dimension in the attribute vector to describe its degree of association with the classification label. The logical rule function form is as follows:

L (f_{i}) = σ (\sum_{j = 1}^{m} w_{i j} \cdot f_{j} + b_{i})

(10)

where $f_{i}$ represents the feature combination of the i-th logical rule; $w_{i j}$ is the feature weight parameter, reflecting the importance of the feature to the rule; $b_{i}$ is the bias term, which adjusts the sensitivity of the model output; and $σ$ is the activation function, which is used to apply nonlinear mapping and improve the ability of discriminant expression. Model training optimizes the adaptability of logical rules to attribute features by continuously adjusting weights and biases, and enhances the generalization ability of the model in sample diversity.

Classification prediction and output result optimization

The trained discriminant model predicts the attribute feature vector of the new input, and determines the category label based on the activation degree of the logical rule and the matching confidence. The prediction process is defined as

\hat{y} = \arg \max_{k} (\sum_{i = 1}^{n} L_{k} (f_{i}))

(11)

where $\hat{y}$ is the final predicted category; $k$ represents the category index; $L_{k}$ is the activation output of the $k$-th category corresponding to the i-th logical rule; and $n$ is the total number of logical rules. The classifier sums the activation values of each logical rule to comprehensively measure the degree of discrimination of different categories, ensuring that the final output has a high degree of discrimination confidence. This prediction mechanism effectively reduces the risk of misjudgment of a single feature dimension and achieves comprehensive discrimination from multiple angles and multiple rules.
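Equations (10) and (11) can be combined into a short scoring routine. The sketch assumes a sigmoid for $σ$ and one weight block per category, neither of which the source specifies; the array shapes and toy weights are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rule_scores(f, W, b):
    """Summed rule activations per category.

    f : simplified feature vector, shape (m,)
    W : rule weights w_ij per category, shape (K, n, m)
    b : rule biases b_i per category, shape (K, n)
    """
    activations = sigmoid(W @ f + b)     # Eq. (10): one value per (category, rule)
    return activations.sum(axis=1)       # inner sum of Eq. (11)

def predict(f, W, b):
    """Eq. (11): the category with the largest summed activation."""
    return int(np.argmax(rule_scores(f, W, b)))

# Hypothetical setting: 2 categories, 2 rules each, 2 features
W = np.zeros((2, 2, 2))
W[1] = np.array([[5.0, 0.0],
                 [0.0, 5.0]])
b = np.zeros((2, 2))
f = np.array([1.0, 1.0])
scores = rule_scores(f, W, b)
y_hat = predict(f, W, b)
```

Because the score is a sum over rules, a single weakly activated rule cannot decide the category on its own.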

The output stage incorporates post-processing optimization, using threshold adjustment and confidence correction to reduce misjudgments caused by noise interference. Specifically, a dynamic threshold is set, and samples whose prediction confidence falls below it are marked as uncertain, which triggers manual review or a model retraining strategy. This ensures the stability and reliability of the classifier in practical applications. The overall process design completes the closed loop from attribute feature extraction and logical expression construction to classification prediction output, meeting the accuracy and efficiency requirements of recycled coarse aggregate attribute identification.
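The confidence-based rejection step can be sketched as follows. The source describes a dynamic threshold but does not say how it is adapted, so a fixed value is used here as a simplification; the threshold value, the `UNCERTAIN` marker and the toy probabilities are hypothetical.

```python
import numpy as np

UNCERTAIN = -1   # hypothetical marker for samples sent to manual review

def predict_with_rejection(probs, threshold=0.6):
    """Return class labels, or UNCERTAIN where the top probability is too low."""
    probs = np.asarray(probs, dtype=float)
    labels = probs.argmax(axis=1)
    confidence = probs.max(axis=1)
    return np.where(confidence >= threshold, labels, UNCERTAIN)

probs = [[0.90, 0.05, 0.05],    # confident prediction
         [0.40, 0.35, 0.25]]    # ambiguous prediction
decisions = predict_with_rejection(probs)
```

The first sample is assigned to class 0, while the second is flagged as uncertain because its highest probability is below the threshold.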

Method effectiveness evaluation

Experimental data and environment construction

Experimental data

The study uses a data set of recycled coarse aggregate samples from a building materials laboratory. It includes 150 groups of samples covering 8 different material categories and 7 particle size ranges. The physical and chemical properties of each sample are measured through 6 key indicators: density, porosity, water absorption, compressive strength, pH value, and chloride ion content. Standard experimental methods are used during data collection to ensure the accuracy of the measurement results.

Environment setup

Hardware configuration

The experiments run on an NVIDIA RTX 3090 GPU (24 GB video memory), an Intel Xeon Gold 6248R CPU (3.0 GHz, 48 cores), and 128 GB of memory, which ensures efficient processing of large-scale graph data.

Software framework

Based on Python 3.8, PyTorch 1.10.0 is used to implement the GCN model, and Scikit-learn 1.0.2 is used to complete the traditional machine learning comparison experiments (such as SVM and random forest). The core algorithms of attribute mathematical theory (membership calculation and covering operator) are implemented through NumPy and SciPy.

Experimental settings

The GCN uses a two-layer structure with hidden layer dimensions of 64 and 128, respectively; the activation function is ReLU (rectified linear unit), and the optimizer is Adam (learning rate 0.001). Five-fold cross-validation is used during training, and the number of epochs per fold is set to 200. An early stopping strategy (patience = 20) prevents overfitting. The parameters of the comparison methods (SVM, RF, MLP) are all tuned through grid search to ensure fairness.
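The early stopping rule with patience 20 and a 200-epoch budget can be illustrated with a small helper. This is a sketch under assumptions: the text does not say which quantity is monitored, so validation loss is assumed here, and the names `EarlyStopping`, `run_fold`, and `min_delta` are hypothetical.

```python
from typing import Sequence


class EarlyStopping:
    """Stop when the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 20, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.stale_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


def run_fold(val_losses: Sequence[float], max_epochs: int = 200,
             patience: int = 20) -> int:
    """Return the number of epochs actually run for one fold."""
    stopper = EarlyStopping(patience=patience)
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if stopper.step(loss):
            return epoch
    return min(len(val_losses), max_epochs)
```

In a real training loop, `val_losses` would be replaced by the loss computed after each epoch, and the model weights from the best epoch would typically be restored after stopping.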

Internal correlations among key properties of recycled coarse aggregate

Six properties were measured experimentally on 30 groups of recycled coarse aggregate samples: density, porosity, water absorption, compressive strength, pH value, and chloride ion content. The data for each attribute are standardized to remove the effect of differing units, and the Pearson correlation coefficient between each pair of attributes is then calculated to give a symmetric correlation matrix. To weight the graph structure, the probability that the high-frequency value intervals of two attributes occur together in a sample is also counted, which yields an asymmetric co-occurrence weight matrix that is fused with the correlation matrix. The weighted adjacency matrix is then obtained through threshold screening and structural normalization.
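The construction steps above can be sketched in a few lines. Several choices here are assumptions rather than details given in the text: the fusion is taken to be a convex combination with a hypothetical weight `lam`, the absolute value of the correlation is used as the edge strength, the screening threshold `tau` is hypothetical, and structural normalization is taken to mean row normalization. Because the Pearson coefficient is invariant to scaling, the standardization step does not change its value and is omitted.

```python
import math
from typing import List

Matrix = List[List[float]]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def weighted_adjacency(
    columns: List[List[float]],  # one list of sample values per attribute
    cooccurrence: Matrix,        # asymmetric co-occurrence weights in [0, 1]
    lam: float = 0.5,            # hypothetical fusion weight
    tau: float = 0.2,            # hypothetical screening threshold
) -> Matrix:
    """Fuse |correlation| with co-occurrence, screen, and row-normalize."""
    k = len(columns)
    adj = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                continue
            r = abs(pearson(columns[i], columns[j]))
            w = lam * r + (1.0 - lam) * cooccurrence[i][j]
            adj[i][j] = w if w >= tau else 0.0  # threshold screening
    # Row normalization: each node's outgoing weights sum to 1.
    for row in adj:
        s = sum(row)
        if s > 0:
            row[:] = [v / s for v in row]
    return adj
```

With a co-occurrence term in the fusion, an edge whose correlation alone would fall below `tau` can still survive screening, which is the effect the weighted adjacency matrix is meant to provide.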

Figure 4 shows the internal correlation structure among the key attributes of recycled coarse aggregate, visualized as a correlation matrix and a weighted adjacency matrix. In the correlation heat map of Figure 4a, where color represents the correlation coefficient, the physical properties (density, porosity, water absorption, and compressive strength) are highly synergistic and form a tightly coupled substructure. This correlation stems from the consistency of the material microstructure and indicates that pore distribution dominates the overall physical properties. The physical properties are negatively correlated to some degree with the chemical indicators (pH value and chloride content), suggesting that compositional differences arising during production may impose dual constraints of physical stability and chemical corrosiveness. Figure 4b applies co-occurrence frequency as a structural weight, with color representing the joint frequency weight. This weighting retains weakly connected nodes that show no significant correlation, which improves the connectivity and information integrity of the graph. The connection strengths therefore preserve the significant relationships while also reflecting the logical coupling pattern of attribute co-occurrence. The edge weights of some weakly correlated nodes increase under the co-occurrence weights, showing that statistical co-occurrence complements correlation in revealing latent structure. The weighted structure gives the graph neural network a more reasonable basis for its connections, helps it model the complex, nonlinear interactions between attributes, and improves the adaptability and stability of the overall structural representation.

Figure 4.

Internal correlations among key attributes of recycled coarse aggregate.

Hidden feature extraction effect

Table 2 shows the feature extraction performance of attribute nodes for different numbers of GCN layers. As layers are added, the node feature dimension expands from 64 to 256 and the representation ability increases accordingly. The node representation density rises from 18.7% to 41.2%, reflecting the layer-by-layer fusion of latent associations between attributes. Parameter growth stays within a reasonable range (3.25 × 10⁴ to 3.65 × 10⁴), with no significant overfitting risk. The training error decreases steadily and reaches 0.112 for the three-layer structure, indicating that the discriminative ability of the coupled features is markedly enhanced. The complexity of feature expression, measured by information entropy, increases steadily from 2.87 to 4.12, indicating richer node representations that help reveal the spatial relationships and logical dependencies hidden in the attribute graph.

Table 2.

Contribution of GCN hierarchical structure to attribute feature extraction.

GCN layer	Node feature dimension	Node representation density (%)	Model parameters (×10⁴)	Training error (cross entropy)	Feature expression complexity (entropy)
1	64	18.7	3.25	0.213	2.87
2	128	29.3	3.47	0.158	3.56
3	256	41.2	3.65	0.112	4.12

The topological structure fed to the graph convolutional network (GCN) is built from the recycled coarse aggregate sample data. The GCN propagates and aggregates node information layer by layer through successive convolution operations to extract globally coupled hidden features. The first layer captures local neighbor relationships, the second layer fuses multi-order neighbor information, and the third layer abstracts global contextual dependencies and outputs feature representations with contextual semantics. The process reveals the coupling between attributes from shallow to deep and provides a high-quality feature foundation for logical reduction and classification prediction.
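The layer-by-layer aggregation can be illustrated with the standard GCN propagation rule, in which each layer computes ReLU(Â H W) from a normalized adjacency matrix Â, node features H, and a weight matrix W. The sketch below uses plain Python lists so that it is self-contained; the function names are hypothetical, and applying ReLU after every layer (including the last) is an assumption. It is meant only to show why stacking layers widens each node's receptive field, not to reproduce the PyTorch model used in the experiments.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x m) and b (m x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(a_hat: Matrix, h: Matrix, w: Matrix) -> Matrix:
    """One propagation step: ReLU(a_hat @ h @ w)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


def gcn_forward(a_hat: Matrix, h: Matrix, weights: List[Matrix]) -> Matrix:
    """Stack layers; after l layers a node has seen its l-hop neighbors."""
    for w in weights:
        h = gcn_layer(a_hat, h, w)
    return h
```

On a three-node path graph with a signal placed only on the first node, one layer leaves the third node at zero, while two layers deliver a nonzero value to it through the middle node. This is the sense in which deeper layers fuse multi-order neighbor information.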

Figure 5 shows the hidden feature representations extracted by the multilayer graph convolutional network, namely the feature distributions output by the first, second, and third GCN layers. The horizontal axis is the sample index (1 to 150), the vertical axis is the node index, and color represents the feature value (normalized to between −2 and 2). The data are normalized to remove dimensional differences and place the input features on a uniform numerical scale. As the number of GCN layers increases, the range of feature values across nodes gradually narrows, indicating that the model moves from local to global features. The first layer mainly captures information from local neighbors, and its wide range of feature values reflects the initial modeling of direct associations between attributes. The second layer begins to integrate multi-order neighbor information; its feature values vary less and show more contextual dependence. The third layer captures global coupling features, and its features are the most concentrated, indicating that the model effectively integrates multi-order neighbor information and strengthens the contextual semantics of the node features. These changes show that, by passing attribute information layer by layer, the GCN progressively uncovers the deep coupling between the attributes of recycled coarse aggregate and thereby improves attribute recognition. With this hierarchical mechanism, node features reflect not only a node's own attributes but also those of its multi-order neighbors, which yields a higher-quality feature representation.

Figure 5.

Hidden feature representation extracted by GCN.

Changes in attribute membership and mathematical feature simplification effects

The membership function curves are obtained by computing the membership over the feature value range for different settings of the parameter α, which regulates attribute sensitivity. The high-dimensional features output by the GCN are first mapped to the membership space, and α is then varied to generate a family of curves that show how attribute membership changes. The coverage and attribute compression ratio data come from the records of redundant attributes eliminated during attribute reduction. An attribute dependency matrix is constructed from the impact of the remaining attributes on the overall discrimination ability, and it is used to evaluate how the coverage of the attribute set changes at each stage. The attribute compression ratio is expressed as the ratio of the number of remaining attributes to the number of initial attributes and indicates the reduction effect.

Figure 6 shows the change in attribute membership and the effect of mathematical feature simplification.

Figure 6.

Changes in attribute membership and mathematical feature simplification effects.

Figure 6a shows how the membership function responds as the parameter α changes. The horizontal axis is the attribute feature value, and the vertical axis is the corresponding membership. As α increases from 1 to 5, the transition interval of the curve narrows markedly and the curve becomes steeper. This reflects the increased sensitivity of the membership to the attribute value: a high α strengthens the influence of the attribute on discrimination, so that key attributes stand out more clearly in the feature space. This behavior helps capture core attributes more accurately during mathematical feature reduction and avoids the ambiguity or confusion that low sensitivity would cause. Figure 6b shows the relationship between coverage and attribute compression ratio during attribute reduction. The horizontal axis is the number of attributes removed, the left vertical axis is coverage, and the right vertical axis is the attribute compression ratio. As attributes are removed, coverage declines only slowly, indicating that the discriminative ability of the feature set remains stable and that eliminating redundant information causes no obvious coverage loss. The screening strategy therefore retains the discriminative information while shrinking the feature set, so model performance is not weakened. At the same time, the compression ratio increases steadily, indicating that the feature space is effectively simplified and that the complexity and computational burden of model training are reduced. These data show that the mathematical feature reduction mechanism balances discriminant performance against redundancy reduction. By tuning the sensitivity of the membership function and analyzing the covering operator, the constructed attribute dependency relationship achieves efficient and accurate screening of core attributes.
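The narrowing of the transition interval with increasing α can be reproduced with a simple stand-in for the membership function. The functional form is not given in this section, so the sketch assumes a logistic curve with steepness α; `membership`, `transition_width`, and the `center` parameter are hypothetical names, and the 0.1 to 0.9 band used to define the transition interval is an arbitrary choice.

```python
import math


def membership(x: float, alpha: float, center: float = 0.0) -> float:
    """Assumed logistic membership in [0, 1]; alpha controls the steepness."""
    return 1.0 / (1.0 + math.exp(-alpha * (x - center)))


def transition_width(alpha: float, lo: float = 0.1, hi: float = 0.9) -> float:
    """Width of the x-interval over which membership rises from lo to hi."""
    def logit(p: float) -> float:
        return math.log(p / (1.0 - p))

    return (logit(hi) - logit(lo)) / alpha
```

Under this assumption the transition width is inversely proportional to α, so raising α from 1 to 5 shrinks the interval to one fifth of its original width, in line with the steeper curves described for the figure.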

Evaluation of attribute recognition accuracy and recall

Accuracy and recall are the main quantitative indicators. Accuracy reflects how correctly the model discriminates the various types of recycled coarse aggregate samples overall, while recall measures the model's ability to detect samples of a target category, which matters most in multi-class, fine-grained classification. Every method is run on the same test set so that the sample distribution and feature dimension stay consistent. The classification results are tallied along two dimensions, material type and particle size, which represent different attribute compositions. Accuracy and recall curves are extracted per category for four methods: the GCN and attribute mathematics fusion method, SVM, random forest, and MLP. The classification performance of the methods is then compared side by side to verify the advantage of the fusion mechanism when features are highly coupled.
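The two indicators can be computed from true and predicted labels as sketched below. This is a generic illustration using the usual definitions; the function names are hypothetical, and averaging recall with equal weight per category (macro averaging) is an assumption, since the text does not state how the per-category values are averaged.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_per_class(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> Dict[Hashable, float]:
    """For each true class, the fraction of its samples that were recovered."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / n for c, n in support.items()}


def macro_recall(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class recalls."""
    per_class = recall_per_class(y_true, y_pred)
    return sum(per_class.values()) / len(per_class)
```

Applied once with material categories as labels and once with particle size ranges, the same functions would give the two sets of per-category values that the comparison relies on.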

Figure 7 shows the accuracy and recall of four methods on the material category and particle size recognition tasks: the GCN-attribute mathematical fusion method, support vector machine (SVM), random forest (RF), and multilayer perceptron (MLP). Panels a and b show the recognition accuracy and recall for the eight material categories, and panels c and d show the recognition accuracy and recall for the seven particle size ranges.

Figure 7.

Attribute recognition accuracy and recall.

Panels a and b show that the GCN-attribute mathematics fusion scheme achieves higher overall accuracy and recall than the other three methods. Its average recognition accuracy and recall over the eight material categories are both 0.89, reflecting its ability to mine the multi-dimensional properties of the materials and capture associated features. Its performance varies across material categories, as expected given the heterogeneity of material properties, and it performs particularly well in key categories such as demolished concrete aggregate and waste mortar aggregate. The support vector machine and random forest reach similar accuracy on some materials, and they differ mainly in the stability of recall. The multilayer perceptron is slightly weaker overall, and its recall on complex material categories is insufficient. Panels c and d show how well each method identifies particle size. The GCN fusion method again leads, with an average recognition accuracy and recall of 0.90 and 0.89 over the seven particle size ranges, respectively, and it is robust in every range. The other models score lower across the particle size ranges, indicating limited generalization ability. The support vector machine and random forest achieve reasonable particle size recognition but trail the GCN-attribute mathematical fusion scheme, and the multilayer perceptron has the worst overall performance. These data confirm the advantage of the GCN-attribute mathematical fusion method when attributes are coupled in complex, multi-dimensional ways and demonstrate that it improves recognition accuracy and recall across material categories and particle sizes.

Evaluation of model stability and generalization ability

To verify the robustness of the model under different sample partitions, a 5-fold cross-validation strategy is adopted. Training and testing are carried out independently on the five partitioned subsets, and the accuracy of each method in each fold is recorded. The mean and standard deviation of the five fold accuracies are then computed to measure how consistently each model performs across partitions. The mean accuracy describes the overall prediction ability, and the standard deviation reflects how much the model's response fluctuates from one partition to another. Together they characterize the generalization performance of the algorithm under complex attribute disturbances. All comparison models use the same sample splitting strategy and parameter configuration so that the results are fair and comparable.

Table 3 presents the accuracy of the four classification models in each fold of the five-fold cross-validation, together with the mean and standard deviation computed from those values. The accuracy of the GCN fusion attribute mathematical method fluctuates little across folds, ranging from 0.889 to 0.918 with a standard deviation of 0.012, which reflects strong robustness and structural adaptability to changes in the data partition. In contrast, the SVM and MLP models vary more between folds, with standard deviations of 0.019 and 0.015, respectively, indicating a stronger dependence on the training split and less stable generalization. The random forest has the lowest standard deviation, 0.010, but its mean accuracy (0.867) is lower than that of the GCN fusion model (0.904). These results provide numerical support for the claim that combining attribute graph structure modeling with graph learning improves model robustness. The standard deviation also quantifies the spread of each model's performance and offers a reference for model optimization and ensemble design, which helps in judging the feasibility and practicality of a recognition scheme from the perspective of stability.

Table 3.

Comparison of mean and standard deviation of accuracy under cross-validation.

Fold/Metric	GCN + attribute math	Support vector machine (SVM)	Random forest (RF)	Multilayer perceptron (MLP)
Fold 1 accuracy	0.895	0.844	0.872	0.822
Fold 2 accuracy	0.914	0.829	0.861	0.807
Fold 3 accuracy	0.902	0.862	0.880	0.835
Fold 4 accuracy	0.889	0.812	0.854	0.799
Fold 5 accuracy	0.918	0.847	0.869	0.828
Mean accuracy	0.904	0.839	0.867	0.818
Standard deviation	0.012	0.019	0.010	0.015
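The summary rows of Table 3 can be recomputed from the per-fold accuracies as a consistency check. The sketch below uses only the values listed in the table; the variable names are hypothetical. The tabulated standard deviations are reproduced by the sample standard deviation (divisor n − 1), which is what `statistics.stdev` computes.

```python
from statistics import mean, stdev

# Per-fold accuracies copied from Table 3.
fold_accuracy = {
    "GCN + attribute math": [0.895, 0.914, 0.902, 0.889, 0.918],
    "SVM": [0.844, 0.829, 0.862, 0.812, 0.847],
    "RF": [0.872, 0.861, 0.880, 0.854, 0.869],
    "MLP": [0.822, 0.807, 0.835, 0.799, 0.828],
}

# (mean, sample standard deviation), each rounded to three decimals.
summary = {
    name: (round(mean(acc), 3), round(stdev(acc), 3))
    for name, acc in fold_accuracy.items()
}
```

Using the population standard deviation (`statistics.pstdev`) instead would give slightly smaller values, for example 0.011 rather than 0.012 for the fusion model.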

Comparison of feature dimension compression effects

To evaluate feature dimension compression, multi-fold data subsets are constructed to test how well different feature selection methods trade off redundancy elimination against information retention. The feature reduction rate and information retention rate of the attribute mathematical reduction strategy (Attr-Reduction), PCA, and mRMR are calculated in each fold to measure their effectiveness and information integrity when compressing dimensions. The feature reduction rate is the proportion of dimensions in the initial feature set that are removed, and the information retention rate describes how much the key attributes retained after compression contribute to the original classification task. The attribute mathematical reduction strategy prunes attributes based on their internal logical consistency, PCA focuses on variance coverage, and mRMR emphasizes minimum redundancy between features and maximum difference between classes.
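The two rates can be written down as simple ratios. The reduction rate follows directly from the definition above. The retention rate is only described qualitatively, so the sketch assumes a proxy: the ratio of a task score (for example classification accuracy) obtained with the reduced features to the score obtained with all features. Both function names are hypothetical.

```python
def feature_reduction_rate(n_initial: int, n_retained: int) -> float:
    """Share of the initial feature dimensions removed by compression."""
    if n_initial <= 0 or not 0 <= n_retained <= n_initial:
        raise ValueError("need 0 <= n_retained <= n_initial and n_initial > 0")
    return (n_initial - n_retained) / n_initial


def information_retention_rate(score_full: float, score_reduced: float) -> float:
    """Assumed proxy: task score kept after compression, capped at 1."""
    if score_full <= 0:
        raise ValueError("score_full must be positive")
    return min(1.0, score_reduced / score_full)
```

As a purely hypothetical example, keeping 4 of 12 dimensions gives a reduction rate of about 0.67; these dimension counts are illustrative and are not taken from the experiments.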

Figure 8 compares the feature reduction rate and information retention rate of the three feature dimension compression methods on the 50% data subset. The attribute mathematical simplification strategy reaches a high feature reduction rate, 0.67 on average, indicating that it removes redundant features more efficiently and cuts irrelevant or repeated information. PCA also maintains a relatively stable dimensionality reduction effect, because it ranks features by their variance contribution and compresses the data dimension accordingly, but its overall feature reduction rate is lower than that of the attribute mathematical method. The mRMR method has the lowest feature reduction rate: it focuses on maximizing the difference between features and the distinction between categories, and so keeps more dimensions to strengthen discrimination. In terms of information retention rate, the attribute mathematical simplification strategy also performs well, with an average of 0.92, indicating that it uses the dependency matrix between attributes to select core features and retain the key information needed for classification while still compressing efficiently. PCA is driven by explained variance and may discard low-variance features that contribute subtly to classification. mRMR balances redundancy minimization against relevance in feature selection, so some of the information retained in each fold is sacrificed. By building an attribute dependency network, the attribute mathematical simplification strategy identifies redundant features accurately and strikes a better balance between dimensionality reduction and information retention across diverse data.

Figure 8.

Comparison of feature compression efficiency and information preservation.

Analysis of recognition time consumption and computational efficiency

The model efficiency is quantified by the average training time and single sample prediction time. The computational resource consumption of the GCN-attribute mathematical method is compared with that of the support vector machine (SVM), random forest (RF), and multilayer perceptron (MLP) methods when processing the same test set.

Table 4 summarizes the training time and single-sample prediction time of the four models under 5-fold cross-validation, which are used to compare the computational efficiency of the recognition strategies. The GCN-attribute mathematics fusion method has the highest training time in all five folds, with an average of 127.4 s, roughly 22% to 42% longer than the other models. This is mainly due to the coupling of its graph convolution operations with the attribute logic calculations. The random forest model trains fastest, owing to its stable structure and strong parallelism, with an average of only 89.9 s. The support vector machine and multilayer perceptron average 97.4 s and 104.1 s, respectively, maintaining a good computational balance. In the prediction phase, the random forest and support vector machine respond fastest, with average single-sample prediction times of 1.7 ms and 2.1 ms, respectively, followed by the multilayer perceptron at 2.8 ms and the GCN fusion strategy at 3.9 ms. The GCN-attribute mathematical fusion method nevertheless remains robust overall, and a difference of about 2 ms per sample does not affect the overall performance of the model. These figures make the differences in training efficiency and inference latency between the models explicit and provide a reference for model selection when computing resources are limited in actual deployment.

Table 4.

Analysis of recognition time and computational efficiency.

Model	Training time, fold 1 (s)	Training time, fold 2 (s)	Training time, fold 3 (s)	Training time, fold 4 (s)	Training time, fold 5 (s)	Average training time (s)	Prediction time per sample (ms)
GCN-attribute mathematical	124.3	130.1	126.5	128.7	127.4	127.4	3.9
SVM	95.7	98.2	97.4	99.0	96.8	97.4	2.1
RF	88.5	91.3	90.2	89.7	90.0	89.9	1.7
MLP	102.9	105.5	104.0	103.7	104.3	104.1	2.8
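The averages in Table 4, and the relative training overhead they imply, can be checked with a few lines of arithmetic. The sketch uses only the per-fold times from the table; the variable names are hypothetical.

```python
from statistics import mean

# Per-fold training times in seconds, copied from Table 4.
training_time_s = {
    "GCN-attribute mathematical": [124.3, 130.1, 126.5, 128.7, 127.4],
    "SVM": [95.7, 98.2, 97.4, 99.0, 96.8],
    "RF": [88.5, 91.3, 90.2, 89.7, 90.0],
    "MLP": [102.9, 105.5, 104.0, 103.7, 104.3],
}

average_s = {name: round(mean(t), 1) for name, t in training_time_s.items()}
fastest = min(average_s, key=average_s.get)

# Training-time overhead of each model relative to the fastest one, in percent.
overhead_pct = {
    name: round(100.0 * (avg / average_s[fastest] - 1.0), 1)
    for name, avg in average_s.items()
}
```

The recomputed averages match the table, and the fusion model's average training time comes out about 41.7% above that of the random forest, the fastest model.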

Conclusion

To address the challenge of complex feature coupling in the identification of recycled coarse aggregate properties, this study proposes a novel framework that integrates graph learning with attribute mathematical simplification for feature modeling and classification. By constructing an attribute graph structure, the intrinsic relationships among various physicochemical indicators are effectively captured, while graph convolution operations enable deep extraction of coupled information, thereby overcoming the limitations of traditional methods in modeling global correlations. To tackle the issue of high-dimensional feature complexity, a logical relationship compression mechanism based on attribute mathematics is introduced. This mechanism accurately identifies the core attribute set that most significantly influences the classification task, achieving effective feature dimensionality reduction while preserving the integrity of decision boundaries. In the classification stage, the integration of attribute logic modeling and matching rules enables precise recognition of both material categories and particle size classifications of recycled aggregates. Comprehensive performance evaluation demonstrates that the proposed GCN–attribute mathematics fusion model outperforms conventional machine learning methods—including SVM, random forest, and MLP—in terms of recognition accuracy and recall across multiple material types and particle size ranges. The attribute mathematical simplification strategy achieves an impressive average feature reduction rate of 67%, while retaining 92% of the original information, confirming its effectiveness in enhancing recognition performance without sacrificing classification fidelity. The proposed fusion strategy provides a promising approach for the intelligent identification of samples characterized by multi-source attributes and complex structural relationships, exhibiting strong generalization capability and engineering applicability. 
Future work will explore the application of time-series graph modeling in dynamic multi-source attribute scenarios, aiming to further enhance the adaptability and robustness of the recognition model under time-varying feature structures.

Footnotes

ORCID iD

Xiang Chen

Funding

The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Science and Research Project of the Hunan Provincial Department of Education (Project No.: 23C0481).

Declaration of conflicting interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
