The application of RF algorithm integrated with F-measure in the quality evaluation of agricultural personnel training

Abstract

The quality of the talent training provided by colleges and universities underpins social progress and development. In the context of agricultural modernization, the quality of personnel training in agricultural colleges and universities is a key guarantee of that modernization. Existing models for evaluating the quality of personnel education in agricultural universities suffer from inefficient data processing and long running times. To better judge the training quality of agricultural graduates, the study builds an evaluation method for the quality of agricultural talent training on the random forest (RF) algorithm and improves it with the trees reduction random forest (TRRF) algorithm. In addition, the F-measure is applied to weight the data, which improves the effectiveness of the evaluation model. The experimental results show that the F-TRRF model proposed in the study reaches 99.17% accuracy when evaluating the quality of agricultural personnel training. The F-measure-weighted random forest evaluation model therefore evaluates the training quality of agricultural talents accurately, meets the practical needs of talent evaluation, and provides a basis and reference for cultivating more and higher-quality agricultural talents.

Keywords

quality of personnel training; agricultural colleges; random forest; trees reduction random forest; F-measure

Introduction

As education continues to evolve, the reform of higher education keeps deepening. Raising the standard of personnel training is the top priority for optimizing the structure of higher education and for improving the school-running level and social service capabilities of colleges and universities.¹ “Sannong” (agriculture, rural areas, and farmers) occupies an important position in China’s socialist modernization drive, and cultivating more high-quality agricultural talents is the key to solving the “three rural” issues.^2,3 At present, the number of agricultural professionals trained in China cannot meet the actual needs of social progress, and the existing methods for assessing the level of agricultural personnel education have many defects.^4,5 Specialized talent evaluation methods are relatively scarce, especially in the agricultural field, where the evaluation content is complex and the evaluation process is difficult to measure. In addition, existing evaluation indicators cannot objectively reflect the problems that universities face in talent cultivation. Therefore, to raise the standard of agricultural student training at universities, the study assesses how well agricultural students are trained and uses the evaluation results to analyze the problems in the training process. As one of the ensemble learning algorithms, random forest (RF) is often used for data classification and regression tasks, and it performs well in generalization ability, classification performance, and noise tolerance. The study uses RF to construct a model for evaluating the quality of agricultural talent training and improves the model with the TRRF algorithm to address these problems. Meanwhile, the data features are weighted with the F-measure.
The F-measure, a metric that combines recall and precision, has been widely used in binary classification, multi-label classification, and even structured output prediction. The larger the F-measure, the better the performance of the classifier. The main approaches to classifying imbalanced data are to design algorithms optimized for the F-measure criterion or to design algorithms that operate on processed sample data. With the F-measure weighting the data features, an improved RF model for evaluating agricultural talent quality (F-TRRF) is constructed. It is hoped that the model enables accurate evaluation of the quality of agricultural personnel training and thereby helps improve the quality and level of agricultural personnel training in universities.

The innovation of this article is as follows. The study uses the RF algorithm to construct a quality evaluation method for agricultural talents, and then introduces the TRRF algorithm to improve the RF and optimize that evaluation method.

Compared with traditional methods, the proposed method first designs a relatively objective and scientific talent quality evaluation index system. Secondly, in response to the shortcomings of traditional RF models in evaluation, the TRRF algorithm is introduced to optimize it, and F-measure is used to weight the data. The evaluation model obtained can effectively address the shortcomings of traditional methods in data feature extraction and classification during the application process.

Related works

The random forest technique, one of the ensemble learning algorithms, is popular in many disciplines because of its high flexibility and good fitting performance. Biici and Zeybek established a landslide susceptibility evaluation model based on a self-organizing feature map network and the RF model to explore how the evaluation unit and the method of selecting non-landslide samples affect landslide susceptibility prediction. The results show that the prediction accuracy of the RF-based self-organizing feature map network reaches 94.94%, which significantly increases the assessment accuracy of the RF model and can be used to choose non-landslide samples.⁶ Cai et al. used the RF algorithm to determine the texture parameters of a pavement friction prediction model. The experimental results show that introducing the RF algorithm significantly improves the efficacy and accuracy of the prediction model and gives clearer judgments on low-speed and high-speed friction.⁷ Chen et al. used the RF algorithm to build a regional bridge earthquake damage state assessment model to monitor the state of regional bridge structures. The findings indicate that the model’s assessment accuracy is greater than 90%, which provides guidance for the seismic design, disaster prevention, and structural renovation of bridges.⁸ Uhlenkott et al. used the RF algorithm to classify dengue fever patients by severity, so as to help doctors diagnose and treat patients according to their condition. The experiments show that the method performs very well in predicting the clinical severity of dengue fever.⁹ Yuchi et al. used the RF algorithm to model the indoor PM_2.5 concentration to better analyze the indoor environment. The experiments show that the algorithm predicts indoor air quality well.¹⁰

Talents play a crucial role in the development of society, so talent cultivation is particularly important. As the cradle of talent cultivation, universities have the core task of cultivating high-quality talents for social development. Luo and Stoeger pointed out that high-quality skilled talents are the core of increasingly fierce competition. In view of current market needs, the reform of secondary vocational education should be actively promoted to produce more highly qualified professionals and meet the real requirements of social advancement.¹¹ Wang and Wei pointed out that, for Chinese universities, it is vital to develop the scientific research system to raise the caliber of talents, strengthen the teaching staff, and establish a more reasonable talent quality evaluation plan.¹² Hu et al., in light of the pandemic’s effects on talent development in universities, developed an assessment model for the cultivation of creative talents in colleges and universities after the epidemic. Four first-level indicators (environment, teaching linkages, instructors, and students) make up an assessment index system for the quality of creative and entrepreneurial talent training in colleges and universities. The empirical findings show that this strategy can increase the model’s accuracy in evaluating talent training and has reference value for talent training in universities.¹³ Zhang and Zhao used talent evaluation indicators to construct a skier talent evaluation model for selecting and evaluating skiers. The results showed that the assessment model is valid and reliable for evaluating athletes.¹⁴ Peng and Dai used a fuzzy decision tree model of classroom teaching effect to assess the quality of talent training. The model helps to understand the effect of classroom teaching on talent quality and to raise the standard of talent development.¹⁵

To sum up, the RF algorithm has relatively mature applications in fields such as mathematics, engineering, and medicine, and it has unique advantages in data classification. Studies on talent quality assessment mainly focus on improving talent quality, and few address how to evaluate talent quality effectively and reasonably. Therefore, the study builds an RF-based agricultural talent quality evaluation model, hoping to improve the effect of agricultural talent training and continuously improve the talent training program.

Construction of evaluation model for agricultural talent quality based on the RF algorithm of F-measure

Construction of agricultural talent quality evaluation indicators

Talent quality evaluation is to evaluate the various aspects of the ability and quality of talents in the professional field, so as to reflect the effect of talent training. Evaluating the quality of agricultural talents can better understand the talent training and talent reserve in this field. Talent cultivation is a complex dynamic system with multiple levels, types, and factors. Therefore, its quality evaluation is a complex process that should follow systematic principles and evaluate the quality of graduates based on certain standards. This can enable timely and effective monitoring and feedback of the talent cultivation process and methods, further improving the quality of talent cultivation. When evaluating the quality of agricultural talents, building a reliable and efficient mechanism for assessing agricultural talent is essential. Based on the subject objects in teaching activities, the assessment system uses nine first-level index data and the second-level indicators under each first-level index.^1,16 Table 1 displays the makeup of the individual indicators.

Table 1.

Evaluation index system of agricultural talent quality.

Evaluation subject	One-level indicators	Two-level indicators
Employer	Knowledge	Basic knowledge; professional knowledge; extracurricular knowledge
	Quality	Moral quality; professional quality; psychological quality
	Ability	Learning ability; execution ability; communication ability; knowledge application ability
	Employment competitiveness	Working ability; social adaptability; innovation ability
	Employment expectation	Salary; career development
Other aspects	Employment quality	Employment rate; employment satisfaction
	Graduation level	Entrepreneurship; further education and employment
	Social reputation	School reputation; employer satisfaction
	Teaching and training	Core course learning; sense of professional achievement; campus activities

The evaluation subjects mentioned in Table 1 include employers and other evaluation subjects, among which the other subjects mainly include some social institutions and universities. The first-level indicators include the knowledge, quality, ability, employment competitiveness, employment expectation, employment quality, graduate level, social reputation, and teaching and training of the evaluation object.

In practical work, a solid and professional level of theoretical knowledge is the foundation for completing work. Good extracurricular knowledge, such as proficiency in quantitative calculation and operation and good language communication skills, is also indispensable in practical work. Therefore, knowledge level evaluation includes basic knowledge, professional knowledge, and extracurricular knowledge. Personal qualities, including moral, professional, and psychological qualities, reflect a graduate’s comprehensive abilities and show in aspects such as interpersonal communication, daily communication, and work execution. The professional skills of graduates are the basic skills they need to adapt to future job positions, including learning ability, execution ability, communication ability, and knowledge application ability. Employment competitiveness refers to the potential and ability of individuals to obtain job opportunities in the job market, including intrinsic vocational and professional skills, as well as the influence of external factors such as society, family, and learning. Employment expectations are a comprehensive reflection of the job positions, employment areas, and salary standards that graduates hope to obtain, as well as whether they can be respected by colleagues and valued by leaders, and whether their abilities and strengths can be put to use. The quality of employment mainly refers to the satisfaction of graduates with their current job, including work environment, benefits, and promotion, which effectively reflects the quality of talent cultivation in universities. The higher the employment satisfaction of graduates, the higher the social recognition of the university’s talent training model. The graduate level mainly refers to graduates’ future choices, including further education, employment, and entrepreneurship.
Social reputation is reflected in the credibility of the graduating school and the evaluation of its employees by employers. Teaching and training abilities are reflected in the core course learning, sense of professional achievement, campus activities, etc.

Both the first- and second-level indicators selected above were determined after a comprehensive assessment of agricultural talents, and they reflect the quality of agricultural talents in a relatively comprehensive, intuitive, and scientific way. The above index system will be used to build the agricultural talent quality evaluation model.

Construction of evaluation model for agricultural talent quality based on RF algorithm

The RF algorithm is a simple, flexible, and easy-to-operate machine learning method. It produces good results in most cases and is widely used in data classification and regression tasks. Given its advantages in data classification, this study adopts it as the base algorithm. RF is a typical bagging-type ensemble learning algorithm and is a combined classifier whose base classifier is the decision tree (DT). A DT is a tree model that contains three kinds of nodes: the root node, intermediate nodes, and leaf nodes. The root and intermediate nodes represent attributes of the object, the branches leaving a node represent the possible values of that attribute, and the leaf nodes represent the resulting outputs.^17,18 An RF combines multiple independent CART DTs. CART is an algorithm that splits the data at a node into two parts according to the value of a split attribute and then classifies each part in turn. Figure 1 depicts the specific process.

Figure 1.

Schematic diagram of decision tree.

In Figure 1, N represents the total number of samples. N₁ and N₂ represent the two actual categories outputted from N. D₁ and D₂ represent the two actual categories output from N₁. V₁ and V₂ represent two categories classified from N₂. S₁ and S₂ represent two categories classified from V₁. The current training set is divided in half using the binary recursion approach based on the Gini index calculated from the training set, thereby generating subtrees with two branches on the left and right. The Gini metric is utilized by the algorithm to measure data partitioning when a node splits.^19,20 Its calculation process is shown in Formula (1).

Gini(D) = 1 - \sum_{i = 1}^{m} P_{i}^{2}

(1)

In Formula (1), $P_{i}$ is the probability that category $C_{i}$ appears in sample set $D$, and the sum runs over the $m$ categories. If the sample set $D$ is divided into two subsets, the obtained Gini coefficient is shown in Formula (2).

Gini_{split}(D) = \frac{|D_{1}|}{|D|} Gini(D_{1}) + \frac{|D_{2}|}{|D|} Gini(D_{2})

(2)
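As a minimal sketch of Formulas (1) and (2), the following Python snippet computes the Gini index of a label set and the size-weighted Gini index of a binary split. The function names and the toy labels are illustrative assumptions and do not come from the study.

```python
from collections import Counter


def gini(labels):
    """Gini index of a list of class labels: 1 - sum_i P_i^2 (Formula 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(left, right):
    """Size-weighted Gini index of a binary split (Formula 2)."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical labels: "H" = high-quality graduate, "L" = low-quality graduate.
parent = ["H", "H", "H", "L", "L", "L"]
pure_split = gini_split(["H", "H", "H"], ["L", "L", "L"])
mixed_split = gini_split(["H", "L", "H"], ["L", "H", "L"])
print(gini(parent), pure_split, mixed_split)
```

A split that separates the classes perfectly drives the weighted Gini index to zero, which is why CART prefers the candidate split with the smallest value.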

In Formula (2), $D_{1}$ and $D_{2}$ indicate the two subsets. During feature selection, some features are drawn from all the features as candidate features, and the subtree generation principle of the classification and regression tree (CART) is adopted. Among the candidate features, the one that minimizes the sum of squared prediction errors is chosen to grow the DT. The randomness of the RF algorithm lies in the random selection of samples and the random selection of features, which allows the RF algorithm to effectively avoid overfitting.^21,22 The RF extracts sample data with the bagging method and bootstrap sampling, and then builds the forest by fitting a DT to each bootstrap data set using the random subspace approach. Let the data set be $D = \{(x_{1}, y_{1}), (x_{2}, y_{2}), \ldots, (x_{n}, y_{n})\}$, with $m$ sample features. Using the bootstrap sampling method, $K$ sample sets of size $n$ are drawn with replacement from the data set $D$ (the total sample size is $N$). The probability that a given training sample is never drawn is given in Formula (3).

p = \left(1 - \frac{1}{N}\right)^{N}

(3)
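The value of Formula (3) can be checked numerically. The sketch below evaluates the formula for a few sample sizes and compares it with one simulated bootstrap draw; the function names, the seed, and the chosen sizes are assumptions made for illustration only.

```python
import math
import random


def prob_never_drawn(n):
    """Formula (3): chance that a given sample is missed in n draws with replacement."""
    return (1.0 - 1.0 / n) ** n


def simulated_oob_fraction(n, seed=0):
    """Fraction of the n samples that one bootstrap sample of size n leaves out."""
    rng = random.Random(seed)
    drawn = {rng.randrange(n) for _ in range(n)}
    return 1.0 - len(drawn) / n


for n in (10, 100, 2000):
    print(n, round(prob_never_drawn(n), 4))
print("limit 1/e:", round(math.exp(-1), 4))
print("simulated, n = 2000:", round(simulated_oob_fraction(2000), 4))
```

The samples left out of a bootstrap draw are the ones that can later serve as an internal test set for the corresponding DT.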

In Formula (3), as the sample size $N$ tends to infinity, $p$ approaches $1/e \approx 0.368$. That is, about 36.8% of the samples in data set $D$ are not drawn. After the sample data are extracted, the classifier model is constructed with the CART method, and $K$ DTs are trained separately. When generating a DT, $mtry$ features ($mtry = \log_{2} m$) are selected at each node division, and the most important of them is then chosen for the division according to the feature evaluation method. Given the $M$ input features, $m$ features ($m \leq M$) are randomly selected as the feature set of a DT node, and the optimal split feature and split point are chosen from them. The DTs trained on the $K$ bootstrap sample sets with the CART method are combined into an RF model $\{g_{i}, i = 1, 2, \ldots, K\}$, and the model is tested with the samples $q$. The test results are shown in Formula (4).

Q = \{g_{1}(q), g_{2}(q), \ldots, g_{K}(q)\}

(4)

After the classification results are obtained, the results of all the DTs are aggregated. For regression, the final result is the average of the predictions of all the DTs, as shown in Formula (5).

T (q) = \frac{1}{K} \sum_{i = 1}^{K} t_{i} (q)

(5)
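To illustrate how Formulas (4) and (5) combine the individual trees, the sketch below collects the outputs of a few stand-in trees and aggregates them by averaging and by majority vote. The three threshold rules and the indicator names are hypothetical placeholders for trained CART trees, not the study's model.

```python
from collections import Counter
from statistics import mean


def forest_outputs(trees, q):
    """Formula (4): collect the K individual tree outputs for sample q."""
    return [tree(q) for tree in trees]


def forest_average(trees, q):
    """Formula (5): mean of the K tree outputs (regression-style result)."""
    return mean(forest_outputs(trees, q))


def forest_majority(trees, q):
    """Classification-style result: the most frequent tree output."""
    return Counter(forest_outputs(trees, q)).most_common(1)[0][0]


# Hypothetical stand-ins for trained trees: each thresholds one indicator score.
trees = [
    lambda q: 1 if q["knowledge"] >= 0.6 else 0,
    lambda q: 1 if q["ability"] >= 0.7 else 0,
    lambda q: 1 if q["quality"] >= 0.5 else 0,
]
graduate = {"knowledge": 0.8, "ability": 0.65, "quality": 0.9}
print(forest_outputs(trees, graduate))
print(forest_average(trees, graduate), forest_majority(trees, graduate))
```

Both aggregation rules give every tree the same influence on the result.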

In Formula (5), $t_{i}(q)$ represents the predicted output of the $i$-th DT. The implementation process of the RF algorithm is shown in Figure 2.

Figure 2.

Flow diagram of random forest algorithm.

In RF, $RF = \{h(X, θ_{k}) \mid k = 1, 2, \ldots, K\}$, where $K$ is the number of DTs in the forest. The margin function of the RF is given in Formula (6).

mg(X, Y) = av_{k} I(h(X, θ_{k}) = Y) - \max_{j \neq Y} av_{k} I(h(X, θ_{k}) = j)

(6)

In Formula (6), $X$ is the set of $p$-dimensional feature vectors; $Y$ is the category vector; $I(\cdot)$ is the indicator function; and $av_{k}(\cdot)$ is the average over the DTs. The larger the margin function of the RF, the more reliably the model identifies samples correctly and the better the model performs. From Formula (6), the generalization error of the DTs in the RF algorithm converges as shown in Formula (7).

\lim_{υ \to \infty} PE^{*} = P_{X, Y} \left( P_{Θ}(h(X, Θ) = Y) - \max_{j \neq Y} P_{Θ}(h(X, Θ) = j) < 0 \right)

(7)

In Formula (7), $υ$ represents the number of DTs. The generalization error of the RF algorithm is given in Formula (8).

PE = P_{X, Y}(mg(X, Y) < 0)

(8)

In Formula (8), $PE$ represents the generalization error of the RF algorithm, and the subscripts $X, Y$ indicate that the probability is taken over the $(X, Y)$ space. The generalization error of the RF algorithm is influenced by the correlation between the DTs and by their strength. It has an upper limit, as shown in Formula (9).

PE^{*} \leq \frac{\bar{ρ}(1 - s^{2})}{s^{2}}

(9)

In Formula (9), the right-hand side is the upper limit of the generalization error $PE^{*}$ of the RF algorithm; $\bar{ρ}$ is the average correlation between DTs; and $s$ is the average strength of the DTs. When the RF is constructed, each DT corresponds to one bootstrap data set and to the data that were not drawn into it (the out-of-bag data). If the undrawn data set of the $k$-th DT is defined as $W_{k}(x)$, then $W_{k}(x, y_{j})$ is the proportion of votes that assign the input vector $x$ to category $y_{j}$ among the DTs for which $x$ belongs to $W_{k}(x)$. Formula (10) shows the calculation.

W_{k}(x, y_{j}) = \frac{\sum_{k} I(h_{k}(x) = y_{j}, (x, y) \in W_{k}(x))}{\sum_{k} I(h_{k}(x), (x, y) \in W_{k}(x))}

(10)

In Formula (10), the numerator counts, over the DTs whose undrawn data set contains the sample, the DTs that classify $x$ as $y_{j}$, and the denominator is the total number of DTs whose undrawn data set contains the sample.
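The quantities in Formulas (6) and (9) can be made concrete with a short sketch. The function in Formula (6) is the one that Breiman's formulation of random forests calls the margin; the code below evaluates it for one sample from a list of tree votes and also evaluates the bound of Formula (9). The vote list, the correlation value, and the strength value are made-up numbers used only for illustration.

```python
from collections import Counter


def margin(votes, true_label):
    """Formula (6): share of votes for the true class minus the largest
    share of votes for any other class."""
    k = len(votes)
    counts = Counter(votes)
    best_other = max(
        (c for label, c in counts.items() if label != true_label), default=0
    )
    return (counts[true_label] - best_other) / k


def generalization_bound(mean_correlation, strength):
    """Formula (9): upper bound rho_bar * (1 - s^2) / s^2."""
    return mean_correlation * (1.0 - strength ** 2) / strength ** 2


# Hypothetical votes of K = 5 trees for one graduate whose true class is "H".
votes = ["H", "H", "L", "H", "L"]
print(margin(votes, "H"))
print(generalization_bound(mean_correlation=0.3, strength=0.6))
```

A positive margin means the forest classifies the sample correctly, and the bound shrinks when the trees are stronger or less correlated, which is the motivation for pruning weak or redundant trees.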

Construction of agricultural talent quality evaluation model based on improved RF algorithm

Although RF has significant advantages in data processing, it is prone to overfitting during training on some noisy classification data. Different data attributes can also affect the classification performance of RF. Therefore, the trees reduction random forest (TRRF) algorithm is used to improve the RF algorithm. The number of DTs in the RF is reduced according to the classification accuracy of each DT and the similarity between DTs, and the higher-quality DTs are extracted to form a new random sub-forest, so as to enhance the model’s performance.^23,24 Figure 3 depicts the modified RF algorithm process.

Figure 3.

Flow chart of random forest based on TRRF algorithm.

Calculate the AUC value of every DT in the RF, sort the results from high to low, and remove the DTs with lower AUC values. In this process, because of factors such as the feature dimension of the data, the data set, and noise, a DT with a higher AUC value may still be of lower quality.^20,25 Therefore, when screening the DTs, the research first determines the average classification accuracy $V$ of a single DT in the initial RF, $RF = \{g_{i}, i = 1, 2, \ldots, K\}$, and then collects the DTs whose AUC value is greater than or equal to $V$ into $SubRF$. The calculation of $SubRF$ is shown in Formula (11).

SubRF = \{g_{i} : Auc_{i} \geq V\}, \quad V = \frac{1}{|K|} \sum_{j = 1}^{|K|} Auc_{j}

(11)

In Formula (11), $Auc_{i}$ represents the AUC value of the $i$-th DT. If the number of DTs in $SubRF$ is greater than 2/3 of the number of DTs in the initial RF, first calculate the standard deviation $σ$ of the AUC values of all DTs in the initial RF, and then select the DTs whose AUC value is greater than or equal to the average classification accuracy $A$ minus $σ$, as shown in Formula (12).

SubRF = \{g_{i} : Auc_{i} \geq A - σ\}

(12)
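The two-step screening of Formulas (11) and (12) can be sketched as follows. This is one reading of the rule, under two assumptions that the text does not settle: $A$ in Formula (12) is taken to be the same average as $V$ in Formula (11), and $σ$ is computed as a population standard deviation. The function name and the AUC values are hypothetical.

```python
from statistics import mean, pstdev


def select_sub_forest(aucs):
    """Return the indices of the trees kept by the rule of Formulas (11)-(12)."""
    v = mean(aucs)  # average AUC of the initial forest, V
    kept = [i for i, a in enumerate(aucs) if a >= v]  # Formula (11)
    if len(kept) > 2 * len(aucs) / 3:
        # Assumption: A equals V, and sigma is the population standard deviation.
        sigma = pstdev(aucs)
        kept = [i for i, a in enumerate(aucs) if a >= v - sigma]  # Formula (12)
    return kept


# Hypothetical per-tree AUC values measured on a validation set.
first = select_sub_forest([0.91, 0.88, 0.62, 0.95, 0.70, 0.86])
second = select_sub_forest([0.9, 0.9, 0.9, 0.5])
print(first, second)
```

In the first call only Formula (11) applies, because exactly two thirds of the trees pass the threshold; in the second call more than two thirds pass, so the threshold of Formula (12) is used.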

According to the selection method of Formula (12), the DTs with higher precision are selected from the initial RF to form the sub-forest to be clustered. When making classification decisions, the RF algorithm uses an unweighted majority voting mechanism. However, this gives every DT the same voting weight during classification, which directly affects the quality of the classification results.^26,27 For this reason, the research uses the F-measure to construct a DT-based weighting method. The F-measure combines the precision rate and the recall rate. Originally an evaluation standard for information retrieval, it has been widely used in prediction problems such as binary classification and multi-label classification and is a common evaluation standard. In traditional binary classification learning, the commonly used evaluation indicators are accuracy, precision, and recall. The calculation method of the precision rate is displayed in Formula (13).

Precision = \frac{TP}{TP + FP}

(13)

In Formula (13), $TP$ indicates that the evaluation result is high-quality agricultural talents and the samples are actually high-quality agricultural talents. $FP$ indicates that the samples are actually low-quality agricultural talents, but the evaluation result is high-quality agricultural talents. The accuracy, by contrast, is $(TP + TN) / (TP + FP + FN + TN)$, where $TN$ means that the samples are actually low-quality agricultural talents and the evaluation result is also low-quality agricultural talents, and $FN$ indicates that the samples are actually high-quality agricultural talents, but the evaluation result is low-quality agricultural talents. The calculation method of the recall rate is shown in Formula (14).

R e c a l l = \frac{T P}{T P + F N}

(14)

The larger the precision and recall rates, the better the performance of the classifier. However, these indicators work best on balanced data sets; for unbalanced data sets, they cannot accurately measure the performance of classifiers. In this case, the F-measure, a weighted harmonic mean of recall and precision, is used to assess the classifier’s performance. The F-measure is an indicator for comprehensively evaluating the performance of classifiers: on an imbalanced data set, a larger value indicates better classification performance.²⁸ After receiving the data from the verification set, each DT generates a category prediction for each record in it, and the predictions of the DTs are then compared with the actual outcomes. The computation is shown in Formula (15).

F\text{-}measure = \frac{(λ^{2} + 1) \times Recall \times Precision}{Recall + λ^{2} \times Precision}

(15)

In Formula (15), Recall is the recall rate; Precision is the precision rate; and $λ^{2}$ is a non-negative parameter. When $λ^{2} = 0$, the F-measure equals Precision; as $λ$ tends to infinity, the F-measure tends to Recall. In practice, $λ$ is set to 1, which gives the F1-measure (F1) and treats the recall rate and the precision rate as equally important. When $F_{1} = 1$, the classifier’s performance is the best; when $F_{1} = 0$, it is the worst. Weighting by the F-measure reduces the negative impact of the average voting method on the classification results.
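The following sketch shows one way in which the F-measure of Formula (15) could serve as a voting weight. The text does not spell out the exact weighting scheme, so the rule used here (each DT adds its validation F1 to the class it predicts) is an assumption, as are the function names and the validation counts.

```python
def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Share of true positives that are recovered (Formula 14)."""
    return tp / (tp + fn) if tp + fn else 0.0


def f_measure(p, r, lam=1.0):
    """Formula (15) with parameter lambda; lam = 1 gives F1."""
    denom = r + lam ** 2 * p
    return (lam ** 2 + 1) * r * p / denom if denom else 0.0


def weighted_vote(predictions, weights):
    """Add each tree's weight to the class it predicts; return the top class."""
    totals = {}
    for label, w in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + w
    return max(totals, key=totals.get)


# Hypothetical validation counts (tp, fp, fn) for three trees.
counts = [(45, 5, 5), (20, 30, 30), (18, 32, 32)]
weights = [f_measure(precision(tp, fp), recall(tp, fn)) for tp, fp, fn in counts]
predictions = ["H", "L", "L"]
print([round(w, 2) for w in weights])
print(weighted_vote(predictions, weights))
```

In this toy case an unweighted majority vote would return "L", while the weighted vote follows the single tree with the much higher validation F1.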

Performance analysis of agricultural talent quality evaluation model based on RF algorithm

Performance analysis of improved RF algorithm

To evaluate the effectiveness of the proposed agricultural talent quality evaluation model, the study collected data on 2350 graduates of an agricultural university to train the model. After screening the data and deleting redundant, invalid, and repeated records, 2000 records were left for experimentation. The RF, TRRF, and F-TRRF models were trained separately on the same sample data. Figure 4 shows the number of iterations of the three models throughout the training phase.

Figure 4.

Comparison of iteration times of three evaluation models.

From Figure 4, the iteration numbers of the three models differ significantly. The number of iterations of the RF model fluctuates greatly, the training process is unstable, and there are multiple maximum and minimum values. The maximum number of iterations, 77, appears at round 58; the minimum, 14, appears at round 25; the range is 63 iterations, and the average number of iterations of the RF evaluation model is 42. The overall fluctuation of the TRRF model is somewhat less pronounced than that of the RF model. Its maximum number of iterations is 74 at round 66, its minimum is 42 at round 35, its range is 32 iterations, and its average is 58. The training process of the F-TRRF model changes little, with no obvious ups and downs; its average number of iterations is 38, and its convergence is good. The average number of iterations of the F-TRRF model is 4 and 20 lower than that of the RF and TRRF models, respectively, and its convergence is significantly better than that of the other two agricultural talent quality evaluation models.

Figure 5 shows the running time during the training period.

Figure 5.

Running time of three models in training process.

From Figure 5, the running times of the three agricultural talent quality evaluation models differ considerably. Overall, the running times of all three models lengthen as the training data increase. The RF evaluation model changes the most: its running time is 0.28 seconds at the beginning and 1.2 seconds when the training samples reach 2000. The running time of the TRRF model varies slightly less, from 0.2 seconds in the initial stage to 0.78 seconds at 2000 training samples. The F-TRRF model consumes the least time and is the most efficient: its running time is 0.1 seconds in the initial stage and 0.41 seconds at 2000 training samples. At 2000 training samples, F-TRRF takes 0.79 seconds and 0.37 seconds less than the RF model and the TRRF model, respectively, so it is clearly more efficient than both.

Figure 6 displays the accuracy of the three algorithms throughout training.

Figure 6.

Comparison of the training accuracy of three evaluation models.

From Figure 6, the accuracy rates of the three models grow with the quantity of training data, because the models can extract more detailed data characteristics from more samples. After the training samples reached 1200, the accuracy rate of the RF model stabilized at 96.21% and that of the TRRF model at 97.64%; the training accuracy rate of the F-TRRF model was 98.86%, which is 2.65 and 1.22 percentage points higher than the RF and TRRF models, respectively. The F-TRRF model is therefore more accurate than the RF model and the TRRF model.

The relative errors of the three models during training are shown in Figure 7.

Figure 7.

Comparison of relative errors of three evaluation models.

As illustrated in Figure 7(a), the relative error of the RF evaluation model fluctuates widely, between 0.10 and 0.45, with multiple maxima and minima over the whole process. The greatest relative error is 0.45, while the smallest is 0.09. In Figure 7(b), the relative error of the TRRF evaluation model fluctuates substantially less than that of the RF evaluation model and ranges from 0.20 to 0.45, again with multiple maxima and minima. The maximum relative error is 0.41, and the minimum is 0.20. In Figure 7(c), the relative error of the F-TRRF evaluation model is significantly smaller than that of the RF and TRRF evaluation models and lies between 0.15 and 0.35, with multiple maxima and minima. The maximum relative error is 0.31, and the minimum is 0.18. The above analysis shows that the F-TRRF model proposed in the study has a small error and an ideal accuracy rate in evaluating the quality of agricultural talents.

Application analysis of improved RF algorithm

The trained evaluation models were applied to evaluating the talent quality of agricultural university graduates; the resulting accuracy rates are shown in Figure 8.

Figure 8.

Comparison of accuracy of three evaluation models in application.

As illustrated in Figure 8, the accuracy rates of the three models differ considerably in application. The accuracy of both the RF model and the F-TRRF model rises as the number of test samples grows, whereas the accuracy of the TRRF model declines slightly after a small increase. The RF model's accuracy is 92.68% and the TRRF model's accuracy is 93.12%. The F-TRRF model's accuracy varies greatly before reaching a stable value of 99.17% at a sample size of 5000. By comparison, the accuracy of the F-TRRF model is 6.49 and 6.05 percentage points higher than that of the RF and TRRF models, respectively. It can be concluded that the F-TRRF model performs well in application and can meet the needs of universities and social organizations in evaluating and estimating the quality of agricultural talents.

Figure 9 illustrates the precision rate, recall rate, and F1 value of the three models.

Figure 9.

Accuracy, recall, and F-measure value of the three models.

Figure 9 illustrates the outcomes of experiments comparing the effectiveness of the three models during the application process. The RF model's precision rate, recall rate, and F1 value are 88.18%, 85.15%, and 85.21%, respectively. The TRRF model's precision rate, recall rate, and F1 value are 88.82%, 87.54%, and 87.56%, respectively. The precision rate, recall rate, and F1 value of the F-TRRF model are 89.49%, 89.97%, and 90.46%, respectively. The precision rate of the F-TRRF model is thus 1.31 and 0.67 percentage points higher than that of the RF model and the TRRF model, respectively. Its recall rate is 4.82 and 2.43 percentage points higher, and its F1 value is 5.25 and 2.90 percentage points higher. On the whole, all three models achieve high precision, recall, and F1 values and can evaluate the quality of agricultural talents well. However, the F-TRRF model performs better than the other two models. Therefore, the F-TRRF model proposed by the research has the best evaluation effect on the quality of agricultural talents.
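For reference, the three metrics compared above are conventionally derived from confusion-matrix counts. The sketch below uses the standard binary definitions with hypothetical counts. How the study averages over quality classes is not stated in this section, so the sketch illustrates the definitions only and does not reconstruct the reported figures.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard binary precision, recall, and F1 from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts: 90 true positives, 10 false positives, 30 false negatives.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

Because F1 is a harmonic mean, it is pulled toward the lower of precision and recall, which is why a model must do well on both to score a high F1.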

To better analyze the effectiveness of the proposed agricultural talent quality evaluation model, the study compared it with commonly used talent quality evaluation methods, including genetic algorithm optimized BP neural network evaluation method (GA-BP), Analytic Hierarchy Process (AHP), and Naive Bayes. The results are shown in Figure 10.

Figure 10.

Prediction performance of various technologies on academic performance.

Figure 10 shows the evaluation results of each algorithm. Overall, the predicted student performance index gradually increases across the evaluation, which is consistent with the trend of the true value. Among the algorithms compared, the F-TRRF proposed in the study is closest to the true value, with an average difference of 0.213, followed by Naive Bayes, AHP, and GA-BP.
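A ranking by closeness to the true value can be sketched as below. This assumes that the "average difference" is a mean absolute difference between predicted and true index values, which the text does not state explicitly. The method names and all numbers are hypothetical placeholders, not the study's data.

```python
def mean_abs_difference(predicted, actual):
    """Mean absolute difference between predictions and true values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical true index values and predictions from two placeholder methods.
true_values = [3.0, 3.5, 4.0, 4.5]
predictions = {
    "method_a": [3.2, 3.3, 4.2, 4.7],
    "method_b": [3.5, 4.0, 3.4, 5.0],
}

# Smaller average difference means the method tracks the true value more closely.
scores = {name: mean_abs_difference(p, true_values) for name, p in predictions.items()}
ranking = sorted(scores, key=scores.get)

for name in ranking:
    print(f"{name}: average difference = {scores[name]:.3f}")
```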

Based on the above content, the evaluation method for agricultural talent quality proposed in the study can reasonably predict the quality of agricultural college graduates. Compared with traditional evaluation methods, the proposed method has better evaluation performance. Due to various factors, there are many differences in the quality of agricultural talent cultivation in higher education institutions. Therefore, the evaluation results of agricultural talent quality should consider its influencing factors from multiple aspects in order to improve the overall quality of agricultural talents and better serve agricultural development.

Conclusion

Cultivating high-quality talents is a requirement of social development, and cultivating more high-quality agricultural talents is a fundamental measure for realizing agricultural modernization. Evaluating the quality of agricultural talents can provide suggestions and methods for talent cultivation in agricultural universities, and it helps the country and society better grasp the reserve of relevant talents. The research first selects appropriate indices to form an index system for agricultural talent evaluation. It then builds an agricultural talent evaluation method based on the RF algorithm, improves the evaluation model with the TRRF algorithm, and uses the F-measure evaluation index to weight the model's data to enhance the effectiveness of the assessment. According to the experimental findings, the proposed model's running time in the training process is 0.41 seconds, and its precision rate, recall rate, and F1 value are 89.49%, 89.97%, and 90.46%, respectively. Therefore, the F-TRRF agricultural talent quality evaluation model proposed in the study offers higher accuracy and operational efficiency, and it also performs better in practical application, which can meet actual needs. However, owing to limitations in data collection, the research uses only a small sample of data on agricultural college graduates. In future research, more data should be collected to continuously improve the evaluation model.

Statements and declarations

Footnotes

Author contributions

Shubing Qiu contributed to the motivation, the interpretation of the methods, the data analysis, and the results. Shirong Hu prepared the draft and revised versions and the references. Yong Liu provided the related concepts and minor recommendations, and extracted the conclusion and discussion.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Projects of Major Humanities and Social Sciences of Anhui University (Grant: SK2021ZD0044), the Project of National Innovation and Entrepreneurship for College Students (Grant: 202210363117X), and the Humanities and Social Sciences of the Ministry of Education (Grant: 18YJCZH228).

References

1. Ramesh, Krishnan. Professional competence of teachers in Indian higher agricultural education. Curr Sci 2020; 118(3): 356–361.

2. Gyapong. Land grabs, farmworkers, and rural livelihoods in West Africa: some silences in the food sovereignty discourse. Globalizations 2021; 18(3): 339–354.

3. Townsend. Universities and their cities: urban higher education in America. Isis 2018; 109(2): 440–441.

4. Wang, Bian. Application of the big data analysis model in higher education talent training quality evaluation. Complexity 2021; 2021: 8321030.

5. Han, Wang, et al. What to expect from dynamical modelling of cluster haloes II. Investigating dynamical state indicators with Random Forest. Mon Not R Astron Soc 2022; 514(4): 5890–5904.

6. Biçici, Zeybek. Effectiveness of training sample and features for random forest on road extraction from unmanned aerial vehicle-based point cloud. Transp Res Rec 2021; 2675(12): 401–418.

7. Cai, Zhu, et al. Prediction and analysis of net ecosystem carbon exchange based on gradient boosting regression and random forest. Appl Energ 2020; 262: 114566.

8. Chen, De Hoogh, Gulliver, et al. Development of Europe-wide models for particle elemental composition using supervised linear regression and random forest. Environ Sci Technol 2020; 54(24): 15698–15709.

9. Uhlenkott, Vink, Kuhn, et al. Predicting meiofauna abundance to define preservation and impact zones in a deep-sea mining context using random forest modelling. J Appl Ecol 2020; 57(7): 1210–1221.

10. Yuchi, Gombojav, Boldbaatar, et al. Evaluation of random forest regression and multiple linear regression for predicting indoor fine particulate matter concentrations in a highly polluted city. Environ Pollut 2019; 245: 746–753.

11. Luo, Stoeger. Developing eminence in STEMM: an interview study with talent development and STEMM experts. Ann NY Acad Sci 2023; 1521(1): 112–131.

12. Wang, Wei. Talent demand and training strategy of oceangoing cruise company based on customized talent development model. J Coast Res 2020; 106(1): 233–236.

13. Luo. Quality evaluation of practical training of innovative and entrepreneurial talents in universities based on statistical learning theory after COVID-19 epidemic. J Intell Fuzzy Syst 2020; 39(6): 9045–9051.

14. Zhang, Zhao. Design and operational validity of the English language application ability evaluation system based on the international talent assessment concept. Basic Clin Pharmacol 2020; 127: 217–218.

15. Peng, Dai. Research on the assessment of classroom teaching quality with q-rung orthopair fuzzy information based on multiparametric similarity measure and combinatorial distance-based assessment. Int J Intell Syst 2019; 34(7): 1588–1630.

16. Raman. Internationalization of Indian higher-education. Curr Sci 2018; 115(5): 809–810.

17. Ishida EEO, Beck, González-Gaitán, et al. Optimizing spectroscopic follow-up strategies for supernova photometric classification with active learning. Mon Not R Astron Soc 2019; 483(1): 2–18.

18. Liu, Ding, Rong, et al. Prediction of cell penetrating peptides and their uptake efficiency using random forest-based feature selections. AIChE J 2022; 68(9): e17781.

19. Greaves, Kelestyn, Blackburn RAR, et al. The black student experience: comparing STEM undergraduate student experiences at higher education institutions of varying student demographic. J Chem Educ 2022; 99(1): 56–70.

20. Hand, Christen, Kirielle. F*: an interpretable transformation of the F-measure. Mach Learn 2021; 110(3): 451–456.

21. Zhang. Study on the reform path of higher education management based on big data and information mode. Basic Clin Pharmacol 2019; 125: 275–276.

22. Cormas, Gregg, Louise, et al. A professional development framework for higher education science faculty that improves student learning. Bioscience 2021; 71(9): 942–952.

23. Wadud, Royston, Selby. Modelling energy demand from higher education institutions: a case study of the UK. Appl Energ 2019; 233: 816–826.

24. Torrecilla, Gutiérrez-de-Rozas, Cancilla. Thinking-based learning at higher education levels: implementation and outcomes within a chemical engineering class. J Chem Educ 2021; 98(3): 774–781.

25. Van de Heyde, Siebrits. The ecosystem of e-learning model for higher education. S Afr J Sci 2019; 115(5-6): 78–83.

26. Overberg, Broens, Guenther, et al. Internal quality management in competence-based higher education-an interdisciplinary pilot study conducted in a postgraduate programme in renewable energy. Sol Energy 2019; 177: 337–346.

27. Michalopoulou, Shallcross, Atkins, et al. The end of simple problems: repositioning chemistry in higher education and society using a systems thinking approach and the United Nations' sustainable development goals as a framework. J Chem Educ 2019; 96(12): 2825–2835.

28. Davis. Student mental health: a guide for psychiatrists, psychologists, and leaders serving in higher education. Am J Psychiat 2018; 175(10): 1025–1026.