Abstract
This study investigated differences in physicochemical properties among twelve strawberry cultivars using pattern recognition tools, to provide a theoretical basis for explaining quality variation among samples. Data for 14 indicators were subjected to principal component analysis (PCA), descriptive statistical analysis, correlation analysis, and hierarchical cluster analysis to identify core evaluation factors. Index weights and comprehensive quality scores were determined by the analytic hierarchy process (AHP). Six of the 14 indicators were selected as indices for evaluating fresh strawberry quality: a* value, sugar-acid ratio, firmness, vitamin C, TAC, and TPC. These data were used to build a multivariate evaluation model with AHP, and the results were compared with the sensory scores given by sensory assessors. The correlation coefficient between the sensory scores and the AHP comprehensive scores was 0.9239. This high correlation indicates that the mathematical model is feasible for strawberry quality evaluation. The information herein provides a practical strategy for evaluating strawberry quality.
Keywords
Introduction
Strawberry (Fragaria
Currently, most studies on strawberry focus on fruit processing and storage [5, 6, 7]. Very few studies have performed comprehensive quality evaluations of the raw material. Fruit quality is the most important parameter to consider when evaluating fruit and vegetable cultivars [8, 9]. Evaluating strawberry quality is important for the selective breeding of better cultivars, as well as for the selection of preferred fruits. During quality evaluation, a large number of variables or factors are analysed by statistical methods to reduce dimensionality and, ultimately, to present the results in graphical format [10]. Moreno et al. [11] performed quality evaluation using principal component analysis (PCA) to select high-quality materials for tomato processing. Nogales-Delgado et al. [12] performed quality evaluation of different nectarine cultivars using PCA to select the optimal material for fresh-cut processing.
To comprehensively describe the qualitative characteristics of fruit, physicochemical and sensory analyses are needed. The chemical composition of strawberries varies significantly with genotype [13, 14, 15, 16, 17]. Statistical processing of such a large amount of heterogeneous chemical data according to classical methods provides a rich source of information on all tested variables [18]. However, this approach does not provide a global knowledge of the relationships between these different chemicals, nor does it allow the grouping of samples with homogeneous characteristics. In such situations, multivariate statistical methods are often employed, which allow for variable reduction and facilitate the presentation of results in a clear, graphical manner [19]. One such example is PCA, which permits the identification of the most important directions of variability in a multivariate data matrix and allows graphical representation of the results.
PCA is a mathematical tool that reduces the dimensionality of data and allows the visualization of underlying structures in experimental data and of relationships between variables and samples. It has been applied to the characterization of textural properties of selected ready-to-eat meat products [20]. Similarly, Keenan et al. [21] used multivariate analyses of physicochemical properties to select apple cultivars for use in ready-to-eat desserts. Furthermore, PCA has been widely used in the property evaluation of wheat, corn, peanuts, and other crops, as well as in the comprehensive evaluation of germplasm resources [22, 23].
The present study used chemometric tools to analyse differences in fruit quality based on the physical and chemical properties of several strawberry varieties. Fruit weight, firmness, vitamin C, total anthocyanin content, total phenolic content (TPC), pH, crude protein, water, L value, a*, b*, total sugar/total acid, and total soluble solids/total acids were all considered. These variables were used to evaluate the quality of strawberry cultivars by principal component analysis (PCA), descriptive statistical analysis, correlation analysis, and hierarchical cluster analysis, and the results were compared with those of sensory analysis. This approach provides a solid theoretical basis that can be applied to the evaluation of strawberry quality and to the eventual selection of cultivars for cultivation, harvesting, and consumption, in order to find the best hybrid strawberry for production.
Materials and methods
Strawberry samples
Strawberry fruits are susceptible to the effects of environment, soil, water, and other factors [24]. Red-fleshed strawberries (Fragaria
Details of the physical and chemical properties are given in the supporting file.
Sensory analysis
Sensory evaluation is a comprehensive and objective assessment of fruit colour, size, appearance, texture, flavour, and odour through multiple senses, such as vision, smell, touch, taste, and hearing. The resulting data are processed statistically to give a comprehensive evaluation. The quantitative descriptive analysis (QDA
Sensory evaluation was conducted on five properties of strawberry fruit: colour, size, texture, flavour, and odour. To better reflect sensory differences between cultivars, each property was graded on five levels (5, 4, 3, 2, and 1) from the best to the worst quality. Before evaluation, a score form was developed, on which the assessors based their sensory evaluation. The development of the form followed these principles:
(1) All selected sensory properties reflect the unique characteristics, or a specific feature, of strawberry raw material.
(2) The sensory properties should correlate with the measurements determined by instruments.
(3) The sensory score form should include a description of the quality grading and the corresponding scores.
The analytic hierarchy process (AHP) is built on an evaluation system related to decision-making. Through expert consultation, the elements at each level are compared pairwise for relative importance. These comparisons form a judgment matrix, and the components of the eigenvector associated with the largest eigenvalue of the judgment matrix are used as the weight coefficients. Finally, the weight of the scheme layer relative to the target layer is determined. AHP is thus a decision-making method that combines qualitative and quantitative analysis. In quality evaluation, each evaluation index should be normalized to a common scale before AHP is applied. The relative importance of each index is then determined from expert opinion and the relevant literature, the comparison matrix of the evaluation indices is established, and the weight of each index is derived from it. Finally, the normalized indices are weighted and summed to obtain the comprehensive scores and ranking of the products being evaluated.
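The AHP steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 3 x 3 judgment matrix, and the standardized scores are hypothetical, and the principal eigenvector is approximated here by power iteration. The random consistency index values are those commonly tabulated for AHP.

```python
# Commonly tabulated random consistency index (RI) values for matrix sizes 1..9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector (weights summing to 1) and the
    largest eigenvalue of a pairwise comparison matrix by power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lambda_max


def consistency_ratio(lambda_max, n):
    """CR = CI / RI, with CI = (lambda_max - n) / (n - 1)."""
    if n < 3:
        return 0.0
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


def composite_scores(weights, standardized_rows):
    """Weighted sum of standardized index values, one score per sample."""
    return [sum(w * x for w, x in zip(weights, row)) for row in standardized_rows]


# Hypothetical, perfectly consistent judgment matrix (not the study's matrix).
judgment = [[1, 2, 4],
            [1 / 2, 1, 2],
            [1 / 4, 1 / 2, 1]]
weights, lambda_max = ahp_weights(judgment)
cr = consistency_ratio(lambda_max, len(judgment))

# Hypothetical standardized scores of two samples on the three indices.
scores = composite_scores(weights, [[1.0, 0.5, 0.0], [0.2, 0.9, 0.7]])
```

For this consistent toy matrix the weights are proportional to 4 : 2 : 1 and the consistency ratio is zero. With real expert judgments the ratio is generally positive, and a value below 0.10 is conventionally taken as acceptable.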
Data analysis
All measurements were performed three times. Pattern recognition methods were applied to the collected data as previously reported [26, 27]. These included PCA and HCA, both of which are unsupervised methods. PCA and HCA were performed using the SPSS 21.0 software package. PCA transforms the original, measured variables into new uncorrelated variables called principal components [27, 28, 29]. The first principal component covers as much of the variation in the data as possible, whereas the second principal component is orthogonal to the first and covers as much of the remaining variation as possible, and so on. HCA calculates the distances (or correlations) between all samples using a defined metric, such as the Euclidean or Manhattan distance [28]. The agglomerative form of HCA is the most common approach, wherein clusters are formed sequentially. The most similar objects are grouped first, and these initial groups are then merged according to their similarities. Eventually, as the similarity decreases, all subgroups are fused into a single cluster. PCA has previously been used to characterize quince fruit [30, 31]. PCA and HCA have also been used to classify fruits, vegetables, and spices based on their in vitro antioxidant activities [27, 32].
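The PCA procedure described above can be sketched in a few lines of Python. The study itself used SPSS 21.0, so this is only an illustration under stated assumptions: the function names and the toy data are hypothetical, the analysis is run on the correlation matrix of standardized variables, and the eigenpairs are approximated by power iteration with deflation rather than by a library routine.

```python
import math


def standardize(data):
    """Column-wise z-scores using the sample standard deviation."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in data) / (n - 1))
           for j in range(p)]
    return [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in data]


def correlation_matrix(z):
    """Correlation matrix computed from already standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(row[i] * row[j] for row in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def top_eigenpairs(matrix, k, iterations=500):
    """Leading k eigenpairs of a symmetric matrix (power iteration + deflation)."""
    p = len(matrix)
    a = [row[:] for row in matrix]
    pairs = []
    for _ in range(k):
        v = [1.0 + 0.1 * i for i in range(p)]  # arbitrary non-uniform start
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _step in range(iterations):
            w = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # remaining variance is numerically zero
                break
            v = [x / norm for x in w]
        av = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
        eigenvalue = sum(v[i] * av[i] for i in range(p))
        pairs.append((eigenvalue, v))
        # Deflate: remove the component just found before the next pass.
        for i in range(p):
            for j in range(p):
                a[i][j] -= eigenvalue * v[i] * v[j]
    return pairs


# Toy data (four samples, two variables); not taken from the study.
toy = [[1, 1], [2, 3], [3, 2], [4, 4]]
corr = correlation_matrix(standardize(toy))
pairs = top_eigenpairs(corr, 2)
# Share of total variance per component; the trace of a correlation
# matrix equals the number of variables.
explained = [value / len(corr) for value, _ in pairs]
```

In this toy case the two variables have a correlation of 0.8, so the first component carries 90% of the variance and the second the remaining 10%.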
Data standardization
(1) Initialization of quality evaluation indices
To eliminate the impact of different dimensions and magnitudes on quality evaluation, the quality evaluation indices were initialized. Initialization involves the use of the absolute value of the difference between the actual value and the ideal value of the quality evaluation index. After initialization of the quality evaluation indices, the closer the original value is to the ideal value, the smaller the initialized value is. The initialized value
(2) Positizing and standardization
After initialization, the value range of the quality evaluation indices was
The effect of positizing and standardization of the quality evaluation indices was to bring original values that were close to the ideal level (in either the positive or negative direction) closer to 1. Thereby, the standardized value range was
Physical and chemical characteristics of 12 strawberry varieties
The composition of the twelve strawberry cultivars is presented in Table 1. The coefficient of variation (CV) was used to assess the variability of each property. The CVs of the 14 properties ranged from 1.31% to 23.72%. TSS, TA, water content, and L value showed little variation among the different strawberry varieties (
In order to determine the variables that are most suitable for the evaluation of fruit quality, we performed correlation analysis using pooled data. Table 2 shows the correlation coefficients between fourteen variables for the twelve samples. Most mechanical variables were significantly correlated (
Descriptive statistics of strawberry physicochemical characteristics
Correlation coefficients between the strawberry quality variables
Note: X1: Fruit weight; X2: Firmness; X3: Vitamin C (V
PCA was applied to the standardized values of analysed parameters of the twelve strawberry cultivars. In order to define the most appropriate descriptors of quality of materials, we used the cross-validation technique to establish that five principal components are sufficient to account for the total variability. The results of the calculations are presented in Tables 3 and 4.
PC1 explained 27.552% of the total variance in the data set, PC2 explained 19.710%, PC3 explained 15.191%, PC4 explained 12.386%, and PC5 explained 11.655%. The cumulative variance contribution of the five principal components extracted by PCA was 86.494%. In other words, 86.494% of the total variance in the 14 variables considered could be condensed into five new variables (PCs).
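As a quick check of the arithmetic, the cumulative contribution can be recomputed from the five reported percentages. The variable names in this short sketch are illustrative only.

```python
# Variance explained by PC1..PC5 (%), as reported in the text.
explained = [27.552, 19.710, 15.191, 12.386, 11.655]

# Running total after each component, rounded to three decimals.
cumulative = []
running = 0.0
for share in explained:
    running += share
    cumulative.append(round(running, 3))

total = cumulative[-1]  # cumulative contribution of the five components
```

The running totals are 27.552, 47.262, 62.453, 74.839, and 86.494, which matches the reported cumulative contribution of 86.494%.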
Scree plots can be used to determine the optimal number of principal components. In a scree plot, the steep part of the eigenvalue curve indicates the components worth retaining. Figure 1 shows the scree plot of the evaluation indices after PCA of the samples; it reveals that the first five principal components account for most of the variation and that the curve is relatively steep over this range. The first five principal components (
Results from the principal component analysis of the first three principal components
Component score coefficient matrix of PCA
Scree plot of evaluation indexes of strawberry by principal component analysis.
The component score coefficient matrix of the five principal components is shown in Table 4. The data reveal that the most important variables for PC1 were fruit weight, firmness, soluble sugar (SS), and crude protein (CP), among which fruit weight, SS, and CP were positively correlated with PC1. PC2 mainly accounted for total phenolic content (TPC), total soluble solids (TSS), and colour (a
After standardization of the original data, all quality indices were clustered into five groups on the basis of the PCA results, using the Euclidean distance and between-group average linkage. The resulting dendrogram, generated by SPSS, is shown in Fig. 2.
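The clustering step can be illustrated with a minimal Python sketch of agglomerative clustering using the Euclidean distance and between-group average linkage. It is not the SPSS procedure used in the study: the function names and the toy points are hypothetical. In the study's setting, each point would be the standardized profile of one quality index across the cultivars.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(points, n_clusters):
    """Merge the two clusters with the smallest mean pairwise distance
    until only n_clusters remain; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Between-group average: mean distance over all cross pairs.
                total = sum(euclidean(points[p], points[q])
                            for p in clusters[i] for q in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy points (not from the study): two tight pairs and one outlier.
toy_points = [[0, 0], [0, 1], [10, 10], [10, 11], [50, 50]]
groups = average_linkage(toy_points, 3)
```

Stopping at a chosen number of clusters corresponds to cutting the dendrogram at the height where that many branches remain.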
Clustering dendrogram of the fourteen evaluation factors.
A hierarchical agglomerative procedure was employed to establish clusters. In hierarchical cluster analysis, samples are grouped on the basis of similarity, without taking class membership into account. The HCA results are shown as a dendrogram in Fig. 2, in which five well-defined clusters are visible. Cluster analysis (CA) uses less information (distances only) than PCA. The first group is clearly discernible and is composed of water, SS, CP, and sugar-acid ratio (SA). These variables are associated with high firmness and lower pH, TS/TA, and fruit weight, in agreement with the results of the PCA. The second cluster consists of firmness and a
Due to the different characteristics of each index, the ideal value of each index (x0) was first determined. Among all indices, colour, TPC, TAC and V
Result of data standardization of strawberry
The contribution and importance of the six core quality evaluation indices were combined with the experts’ experience to generate a hierarchical relational structure and to construct a judgement matrix on the 1–9 scale (Table 6). The weight (WI) of each index was calculated from the pairwise comparison matrix. Each scale value indicates the relative importance of one index compared with another.
Pairwise comparison matrix and its consistency
CR
The sensory score sheet for strawberry raw material is provided in Table S4 of the supporting file. The composite score Y of each strawberry variety (Table 7) was calculated by multiplying the weight of each quality evaluation index by the standardized score of that index and summing the products.
Y
Composite score and sensory score of twelve strawberry varieties
In order to investigate the accuracy of AHP integrated value rating model and its results for strawberry raw material quality, sensory evaluation of strawberry raw material was conducted, and the model was validated by standardized sensory scoring. Since the difference between individual evaluators and their personal preferences may lead to bias in the results, the aim of standardization was to eliminate the impact of individual evaluators. The relationship between the results of sensory evaluation and the calculated results using AHP was analysed by regression analysis in Tables 2–6. The sensory and AHP scores are indicated on the X-axes and Y-axes, respectively. There was a linear relationship between the two variables: y
Fitness test for strawberry of sensory and AHP values.
Considerable variations were observed between strawberry cultivars in terms of physical and chemical properties. PCA, an unsupervised pattern recognition technique, enabled visualization of this complex dataset and revealed the underlying relationships responsible for the clustering observed. Fourteen properties were reduced to five principal components. The twelve strawberry cultivars clustered significantly on PC1, PC2, PC3, PC4, and PC5. The combination of characterization and multivariate data analysis facilitated the inference of similarities and differences between strawberry cultivars based on their physical and chemical properties. This data analysis technique provides insight into the variation of properties between strawberry cultivars, and thus could be applied in breeding programs to select optimum hybrid cultivars tailored for different purposes. Further studies could focus on evaluating the quality of different hybrid strawberry cultivars, in order to find the best hybrid strawberry for production. This approach could also be integrated into commercial agricultural programs in order to optimize freshness, harvest timing, and appeal to consumers.
Footnotes
Acknowledgments
This work was supported by the Shenyang Bureau of Science and Technology Particularly Dispatched Rural Mission (project number: 20-207-3-46). This project was a key planned project for 2020 of the Department of Science & Technology of Liaoning Province (serial number: 2020JH2/10200039).
