Abstract
Proper classification and bale selection are prerequisites to success in a modern cotton spinning operation. Currently, for crops where automatic High Volume Instrument (HVI) classification is the norm, fiber selection is done based on HVI data which does not include adequate characterization of fiber length distribution. This research evaluates the effectiveness of current cotton fiber classification and selection procedures in controlling for variability in fiber length distribution and presents a new approach to adequately clustering cotton bales into homogenous groups based on empirical length distributions. The results show that using the common HVI parameters to group the bales produces categories with uncontrolled length distribution variability. Differences in distribution patterns appeared related to the potential for bales with the same micronaire levels to differ significantly in maturity and thus in propensity to break.
Cotton fiber traits are determined by complex interactions among genetic, environmental and processing conditions. Because of these interactions, fiber properties vary significantly at multiple levels, that is, between fields, between individual plants within fields, and even within single plants and on the same seed. 1 – 3 Thus, the major challenge in cotton processing is to convert a highly variable raw material into a uniform product with quality that remains consistent over long production cycles. To address this challenge, it is critical that all the important fiber properties be adequately measured, and that accepted cotton bale classing based on those measurements be made. Accordingly, cotton classing has historically had a vital impact not only on the economics of cotton production and marketing, but also on the efficiency and the ultimate profitability of the textile manufacturing operation. In fact, decision making in the cotton industry is often, if not always, based on categorizing or clustering cotton bales into relatively homogeneous quality groups using measured fiber properties.
Cotton classing has considerably changed with progress in fiber quality measurement technology over several decades. Early graders manually and visually classified cotton according to grade, staple length and character. 4 The development of technology that enabled automatic and rapid measurement of micronaire, color, then length, strength and trash, led to the current classification system based on High Volume Instruments or HVI. 5 With the widespread adoption of quality measurement and classification technology and thus the availability of fiber information, cotton bale selection and laydown arrangement systems have evolved from the reliance on skills and experience of spinners to highly sophisticated information management and engineered decision-making tools. 6
In order to optimally use this information in fiber selection, significant research efforts have been made and various approaches have been developed over decades. For instance, the concept of Engineered Fiber Selection or EFS® was first developed by Cotton Incorporated in the late 1970s. 7 – 9 El Mogahzy 10 proposed a linear programming approach to optimize cotton purchase and planning decisions, and to control warehouse inventory based on HVI data. Later research went beyond purchase and inventory management to integrate bale picking for laydown mix selection. 11 The various bale picking schemes in use today are based on correctly and efficiently clustering the population of bales into homogeneous groups with respect to selected fiber characteristics. Those properties are limited to the major parameters available through HVI testing (i.e. micronaire, length, strength, and sometimes other characteristics such as color). 12 Micronaire is typically considered a primary criterion in view of the major problems such as fabric barré or color shade differences that inconsistent micronaire can entail. 13 Staple length is often the next essential criterion when mixing laydowns, although in more general terms, fiber properties may vary in importance according to technology and end use. Since the establishment in 1991 of 100% classification by HVI in all USDA classing offices and, over the following decade, the widespread adoption of HVI by spinning mills the world over, 14 little has changed in the fundamentals of the classing system. In an attempt to simplify the selection process by aggregating multiple criteria, complex indices such as the fiber quality index (FQI), the spinning consistency index (SCI), or the premium/discount index (PDI) have been developed based on combinations of fiber properties and on regression models. 15 – 17 Those indices often depend on the range of bales used to develop the equations and are not readily generalizable to characterize the complex multivariate nature of cotton fiber quality. In addition, those indices consist of linear combinations of the same HVI parameters discussed above and thus, fundamentally, they convey the same set of information with the same shortcomings.
In particular, despite intensive research and development efforts, classing data still fails to include meaningful and reliable measurements of some fiber properties now at the forefront of concerns for spinners, namely, neps 18 and short fibers or more generally fiber length distribution.19,20 To evaluate those properties, spinners depend on measurement methods with testing speeds not compatible with those of HVIs. The Advanced Fiber Information System, or USTER® AFIS, is one such method where fibers are individualized using an aeromechanical opener/separator, then individually conveyed through a set of optical sensors which generate electrical signals proportional to fiber length and other dimensions.21,22
Thus, the criteria used as input to control the blends that feed the spinning mill are exclusively based on HVI measurements, while the spinner’s quality concerns at the output of the mixing line are increasingly geared toward parameters that cannot be measured using HVIs, namely neps, short fiber content or fiber length distribution.19,23,24 More generally and beyond fiber length, the intrinsic variability of all fiber properties (within cottons/bales) is not taken into account during fiber selection and laydown arrangement. In practice, each bale is identified by the average values of its HVI fiber characteristics. Information about within-bale variability or about distributions of individual fiber characteristics is usually unavailable at the laydown constitution stage.
The absence of this information from HVI classing data means that critical fiber properties are not taken into account in the fiber selection and laydown constitution process. This may lead to unpredictable changes in within-laydown variability which can be rather detrimental, 25 unless the current procedures allow indirect control of this variability. For instance, if those properties can be predicted using HVI parameters, the current fiber selection practices may have the potential to control for their variability in the laydown. However, this assumption remains to be verified because it is unclear whether controlling micronaire, length, length uniformity and bundle strength is sufficient to control variability in properties such as fiber length distribution. Indeed, fiber length distribution patterns typically show complex features and are therefore difficult to classify using parameters such as mean values.19,24 The research reported in this paper aims at testing the aforementioned assumption with a focus on fiber length distribution. We examine the performance of HVI parameters as criteria for clustering cottons into homogenous distribution patterns and present a new approach to classifying cotton bales using empirical distributions of fiber properties.
Materials and methods
A total of 172 commercial US upland cotton bales with a wide range of fiber properties were included in this study. To ensure the representativeness of the fiber property measurements, each bale was divided into 10 layers and fiber samples were collected from each layer for testing on HVI (High Volume Instrument, four replications for micronaire, four for color, and 10 for length and strength) and AFIS (Advanced Fiber Information System, three replications of 3000 fibers each). All testing was done after proper conditioning (65% RH, 21°C). Testing instrument calibration was checked daily using standard cottons and proper daily maintenance and monitoring procedures ensured reliability of all instruments. 26
Table 1 contains a summary of the properties of the selected cottons and shows the wide range achieved in all variables. In addition to the summary parameters, empirical histograms for length, fineness and maturity were retrieved from the AFIS test. Averages per bale for all HVI and AFIS parameters, as well as for length distribution histograms were derived to fully characterize each bale. Using the data collected, we evaluated bale classification using clustering techniques based on three sets of criteria:
The usual HVI parameters using average values per bale; the parameters considered were micronaire, Upper Half Mean Length (UHML), length uniformity index, and bundle strength (this corresponds to the set of criteria used in common practice).
AFIS length parameters using average values per bale of the mean length by number (Ln), the 5th length percentile, as well as dispersion parameters, namely length CV% by number (LnCV%), and short fiber content (SFCn%).
Empirical histograms of individual fiber length using the average histogram per bale.
Clustering the bales based on the empirical distribution is considered the reference ranking in this analysis since the criteria used constitute the most complete information available about individual fiber properties, which should yield the highest possible homogeneity within quality groups.
Main fiber properties of the selected bales (HVI and AFIS measurements on raw cotton)
With each of the sets of criteria above as dimensions, we used the k-Means clustering algorithm available in the STATISTICA Data Miner program 27 to classify the bales into homogenous groups by minimizing the within-group distances in the respective criteria taken simultaneously. The analysis was conducted using the ‘Generalized EM and K-Means Cluster Analysis’ tool which allows for an a-priori unknown number of clusters (k) and estimates k from the data using the v-fold cross-validation algorithm. 27 Thus, the analysis generates an estimate of the number of clusters (k) from the data, then partitions the observations into the k clusters that minimize the distances or dissimilarities between observations within clusters, and maximize the distance between clusters. Each cluster is characterized by its centroid (the vector of means for the continuous variables or criteria 27 ). The dissimilarities between clusters and between observations within clusters are estimated using the squared Euclidean distance between centroids or, respectively, between each observation and its cluster centroid in the multidimensional space constituted by the classification criteria. For instance, in the cluster analysis using HVI properties as criteria, micronaire, UHML, uniformity and strength constitute a four-dimensional space.
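The procedure described above can be illustrated with a minimal sketch. The study itself used the STATISTICA tool; the code below is not that implementation. It assumes NumPy, a plain Lloyd-type k-means, and an illustrative stopping rule (stop adding clusters once the gain in held-out cost falls below a fraction of the one-cluster cost), since the exact rule used by the software is not given in the text. The function names `assign`, `kmeans`, `cv_cost` and `choose_k`, and the parameters `v`, `k_max` and `min_gain`, are hypothetical.

```python
import numpy as np


def assign(X, centroids):
    """Nearest centroid (squared Euclidean distance) for every observation."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), d2.min(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd-type k-means; X is an (n_bales, n_criteria) float array.

    Criteria are used as given; any rescaling is left to the caller.
    """
    rng = np.random.default_rng(seed)
    # Start from k distinct bales picked at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels, _ = assign(X, centroids)
        # New centroid = vector of means; an empty cluster keeps its old centroid.
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    labels, _ = assign(X, centroids)
    return centroids, labels


def cv_cost(X, k, v=10, seed=0):
    """v-fold estimate of the mean squared distance from held-out bales
    to the nearest centroid fitted on the remaining bales."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), v)
    held_out = []
    for fold in folds:
        train = np.ones(len(X), dtype=bool)
        train[fold] = False
        centroids, _ = kmeans(X[train], k, seed=seed)
        held_out.append(assign(X[fold], centroids)[1])
    return float(np.concatenate(held_out).mean())


def choose_k(X, k_max=10, v=10, min_gain=0.05, seed=0):
    """Assumed rule: keep adding clusters while the drop in held-out cost
    is at least min_gain times the one-cluster cost."""
    costs = [cv_cost(X, k, v, seed) for k in range(1, k_max + 1)]
    for i in range(1, len(costs)):
        if costs[i - 1] - costs[i] < min_gain * costs[0]:
            return i  # costs[i - 1] belongs to k = i
    return k_max
```

With the HVI criteria, `X` would hold one row per bale and four columns (micronaire, UHML, uniformity index, strength); with the empirical histograms, one column per length bin.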
The number of clusters was estimated based on the empirical histogram data. With each set of classification criteria, the length distribution data of the bales partitioned into groups were used to estimate a length distribution centroid for each cluster, and then the squared Euclidean distance between each bale and the corresponding cluster centroid was calculated to estimate the dissimilarity in length distribution patterns within clusters. Likewise, the centroid distributions were used to calculate the distances between clusters.
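As a hedged illustration of this evaluation step, the sketch below takes a partition obtained with any set of criteria and measures it in length-distribution space. It assumes NumPy and that each bale's average histogram is stored as a row of relative frequencies; the function name `distribution_dissimilarity` and its arguments are hypothetical.

```python
import numpy as np


def distribution_dissimilarity(hist, labels):
    """hist: (n_bales, n_bins) average length histogram per bale.
    labels: cluster label per bale, from HVI, AFIS or histogram clustering."""
    hist = np.asarray(hist, dtype=float)
    labels = np.asarray(labels).ravel()
    clusters = np.unique(labels)  # sorted unique labels
    # Centroid distribution of a cluster = mean histogram of its bales.
    centroids = np.array([hist[labels == c].mean(axis=0) for c in clusters])
    # Squared Euclidean distance of each bale to its own cluster centroid.
    own = centroids[np.searchsorted(clusters, labels)]
    within = ((hist - own) ** 2).sum(axis=1)
    # Squared Euclidean distance between every pair of centroid distributions.
    diff = centroids[:, None, :] - centroids[None, :, :]
    between = (diff ** 2).sum(axis=2)
    return clusters, centroids, within, between
```

The `within` vector gives one dissimilarity value per bale, so its spread can be examined cluster by cluster; `between` is a symmetric matrix with zeros on the diagonal.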
Results and discussions
As mentioned above, the classification of the bales using the empirical histograms of the fiber length distributions was considered the reference ranking in this analysis. To derive this classification, the frequencies observed for each length bin were used as classification criteria in the k-means cluster analysis. The number of clusters estimated using the cross-validation algorithm as discussed above was five. Thus, both HVI and AFIS data were used to cluster the bales into five homogenous quality groups. We first examine the reference classification obtained with the individual fiber length distributions, then discuss the clusters derived with the commonly used HVI properties.
Bale classification using empirical histograms of individual fiber length
Figure 1 depicts the observed probability density traces of the individual bales classified into homogenous groups using the k-means cluster analysis with length histograms as classification criteria. The density traces for the individual bales are shown in fine gray lines. The density trace shown in bold broken line represents the centroid for the corresponding cluster. The broken vertical line at x = 30 mm was added to emphasize the relative positioning of the five clusters on the length axis.
Probability density traces of the 172 bales categorized into five clusters using empirical length histograms.
The plots generated for the five groups show distinct patterns across clusters with relatively homogenous distribution shapes within clusters. Therefore, using the k-means clustering approach and the observed length distribution data, it was possible to automatically and quickly classify a sizeable number of cotton bales into groups with homogenous distribution patterns.
Observed cotton fiber length distribution patterns result from a combination of intrinsic (genetic and environmental) and processing factors. Mechanical damage in cotton fiber processing both shifts the fiber length distribution and alters its shape. As a result of these interactions, the distributions exhibit complex, often bimodal, patterns which depend on the degree of fiber damage undergone by the cotton.19,24,28 With such complex shapes, the summary statistics typically used to describe fiber length (means, percentiles, short fiber content, etc.) are not representative of the distribution, and cannot be used to classify cottons into groups with similar distribution patterns. Thus, the common way to compare and classify samples with varied degrees of fiber damage into groups with similar distribution shapes is to visually examine the empirical length histograms. However, this can only be done with a limited number of samples and cannot be practically applied when dealing with hundreds or thousands of bales to constitute laydowns, or when analyzing hundreds of samples to select genotypes in breeding programs. The approach we show above overcomes this problem and allows the automatic and quick classification of a large number of samples into groups with similar distribution patterns.
The distribution groups, shown in Figure 1, differ in both shape and position on the length axis, which, as indicated above, corresponds to both intrinsic and process-related sources of variability. We have sorted the five groups in Figure 1 (from A to E) in order of increasing fiber damage according to the characteristic distribution shapes. 19 In particular, clusters A, B, and C (Figure 1) show a clear bimodal shape with a peak in the range of very short fibers (x < 5 mm), and another distinct peak in the length categories between 20 and 30 mm. This pattern is characteristic of an intermediate stage of the fiber breakage process typically seen in raw cotton that underwent some degree of mechanical aggressiveness in ginning and lint cleaning.19,28 Clusters D and E (Figure 1), on the other hand, still exhibit the peak at x < 5 mm, but the peak at the longer fiber categories appears to dissipate gradually and to almost completely disappear for cluster E. This pattern is indicative of a more advanced breakage process, where fibers shift from the longer length categories to those closer to the origin of the length axis, and thus the dip between the two peaks apparent at the lower damage levels disappears as damage increases.19,28
Figure 2 summarizes the major HVI fiber properties (i.e. micronaire, staple length, length uniformity and strength) observed for each of the five bale clusters discussed above. The results show that the clusters, based on the observed length distribution, differ significantly in HVI fiber properties, except that clusters A and B have equal fiber strength values. Those two clusters, seen above as having a distribution pattern characteristic of a low-intermediate degree of fiber damage, appear to be constituted of the strongest bales (average strength is 30.2 g/tex), and are characterized by the two highest micronaire levels, respectively 4.2 and 4.6 (Figure 2). At the other end of the spectrum, cluster E shows the lowest micronaire (2.9) and strength (24.6 g/tex), and as discussed above, the length distribution pattern with the most advanced fiber damage level. Overall, the different distribution shapes seen across the five groups of bales correspond to different degrees of damage that can be caused by variations in upstream processing conditions (mechanical aggressiveness in ginning and lint cleaning) or variations in the cottons’ propensity to break, which was shown to depend on fiber maturity and strength.19,24 For instance, the distribution shape seen in cluster E (Figure 1) is distinctive of immature-weak cotton that reached a degree of extensive fiber damage even at the bale stage. Therefore, the k-means clustering approach using observed length distributions allowed the classification of the tested cotton bale population into homogenous groups. The various distribution patterns observed for those groups appear to be representative of varying degrees of fiber damage.
Because of the close relationship between fiber damage and maturity and strength,19,24 clustering the cotton bales into homogenous groups according to length distribution patterns shows the potential of effectively discriminating between cottons with differing micronaire and strength levels.
Variation of HVI fiber properties among length distribution pattern clusters. (Vertical bars denote +/− standard errors.)
Parametric classification using HVI and AFIS statistics
In addition to the classification discussed above, both HVI and AFIS parameters were used to cluster the bales into five homogenous quality groups. HVI classification constituted bale clusters based on micronaire, UHML, length uniformity index, and bundle strength. AFIS classification was based on four length parameters by number (mean length, length CV%, length 5th percentile, and short fiber content 19 ). The clustering technique was similar to above; the analysis constituted five groups that minimized the within-group and maximized the between-group variability in the selected classification criteria.
Cluster means for micronaire, staple length (mm), length uniformity index (%), and bundle strength (g/tex)
Cluster means for AFIS mean length by number (Ln), length CV%, and length 5th percentile (Pc5.0)
We now examine the three classifications obtained above and compare the performance of each set of criteria in adequately grouping the bales, that is, in producing distinct and homogenous clusters that minimize the within-group variability and maximize the between-group variability.
Classification performance
As mentioned in the methods section, the dissimilarity of the distribution patterns within clusters was estimated using the squared Euclidean distances between the length distributions of individual bales and the corresponding cluster centroid. Respectively, the dissimilarity of the distribution patterns between clusters was estimated using the squared Euclidean distances between cluster centroids. This was done for each of the three classifications discussed above, namely, the classification based on the empirical histograms and the two ‘parametric’ classifications based on HVI and AFIS parameters. The squared Euclidean distance results were used to calculate the ratio of total between-cluster over the total within-cluster variability of distribution patterns for each of the three classifications. Based on the discussion above, this ratio measures the classification performance because the higher it is, the more distinct and homogenous the clusters are. Figure 3 shows the ratios so obtained for the three classifications.
Between-/within-cluster ratio of Euclidean distance (distribution dissimilarity) based on the three sets of criteria.
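The performance ratio can be sketched as follows. The text does not specify how the two totals are formed, so this example assumes that the between-cluster total is the sum of squared Euclidean distances over all distinct pairs of centroid distributions, and that the within-cluster total is the sum of squared distances from each bale to its own centroid. NumPy is assumed and the function name `between_within_ratio` is hypothetical.

```python
import numpy as np


def between_within_ratio(hist, labels):
    """Ratio of total between-cluster to total within-cluster dissimilarity,
    measured on the length histograms (rows of hist) for a given partition."""
    hist = np.asarray(hist, dtype=float)
    clusters, idx = np.unique(np.asarray(labels).ravel(), return_inverse=True)
    # Centroid distribution of each cluster.
    centroids = np.array([hist[idx == j].mean(axis=0)
                          for j in range(len(clusters))])
    # Assumed within total: sum over bales of squared distance to own centroid.
    within_total = ((hist - centroids[idx]) ** 2).sum()
    # Assumed between total: sum over distinct centroid pairs.
    pair_d2 = ((centroids[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    between_total = pair_d2[np.triu_indices(len(clusters), k=1)].sum()
    return between_total / within_total
```

Calling the function once per classification, with the same histogram matrix and the labels from the histogram-based, AFIS-based and HVI-based clusterings in turn, gives three ratios that can be compared directly; a higher value indicates more distinct and more homogeneous clusters in terms of length distribution.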
It is apparent that, as expected, the criteria based on the empirical histograms produce the classification with the highest ratio. The classification based on AFIS parameters produces an intermediate ratio, while the one based on HVI parameters produces the lowest ratio. This indicates that as we move from the empirical histogram to the parameters used in the industry to classify cotton bales, the probability of obtaining bale categories with heterogeneous distribution patterns increases.
To scrutinize this observation in more depth, we examine the detail of the distances obtained for the individual bales partitioned into the five clusters using HVI parameters. The results of this analysis are shown in Figure 4, where both individual values (upper plot) and standard deviations (lower plot) of the squared Euclidean distances are plotted against the five HVI groups (HVI-1 to -5).
Variability chart for length distribution pattern dissimilarity within HVI clusters (squared Euclidean distance).
The results in Figure 4 show a high dispersion of the Euclidean distance for the clusters with high micronaire levels (cluster HVI-5 and to some extent cluster HVI-4 which has two bales with extreme length distribution dissimilarity in comparison to the cluster centroid). These results indicate that the clustering of the bales based on HVI parameters resulted in some groups of bales with relatively heterogeneous length distribution patterns. The heterogeneity within groupings appears to be higher for the categories with high micronaire levels.
The practical implication of the observation made above is that in constituting the spinning laydowns based solely on HVI data, bales with dissimilar length distribution patterns could be substituted for each other (being from the same category) and could therefore result in variability between laydowns that remains unaccounted for. In the particular case of the population we tested, bales within the 4.1 and 4.5 micronaire categories (see Table 2) could be considered essentially identical because of having similar HVI properties, but may represent significant variability in length distribution. An illustrative example of this variability in each of the two groups of bales is shown in Figure 5. For each group, we plotted length distribution density traces for two bales showing high Euclidean distance from the cluster’s centroid (shown in broken bold line).
Length distribution pattern variability within clusters HVI-4 (a) and HVI-5 (b).
In both cases shown in Figure 5, the distribution patterns are different and exhibit distinct shape features that typically correspond to cottons with different degrees of fiber damage (i.e. different propensities to break and/or processing history). 19 Those bales were classified in different clusters when using the empirical histograms as criteria but were attributed to the same groups when HVI parameters were used as criteria. This result indicates that the four major HVI fiber properties are not sufficient to predict length distribution patterns, since cotton bales having similar HVI measurements may have very distinct length distributions.
The bales depicted in Figure 5 are just examples among about 39 bales (or 23% of all tested bales) that appeared to be misclassified based on HVI data. In order to identify the factors impacting this misclassification, we examined the combinations of fiber properties of pairs of bales that had comparable HVI properties but exhibited significantly different distribution patterns (similar to the cases illustrated in Figure 5). A particular emphasis was placed on those fiber characteristics that are known to impact the cotton’s propensity to break and thus length distribution pattern, which include fiber maturity.19,24
Figure 6 depicts the relationship between micronaire and the two fiber maturity parameters measured using the AFIS (i.e. the immature fiber content (IFC %) and maturity ratio) for the bale clusters exhibiting misclassified bales with HVI. The pairs of bales with similar HVI properties but distinct length distribution patterns are represented using two different point markers depending on the distribution’s positioning relative to the respective cluster’s centroid. Bales with distributions at the left of the centroid, that is those having degraded length with relatively extensive fiber damage (e.g. bales #30 and #12 in Figure 5), are shown in bold dark dots. Bales with distribution at the right of the centroid (i.e. with a lower degree of fiber damage), are shown with asterisk point markers. Two curves (dotted lines) were added to the scatter plots to outline each of the two groups of bales. The rest of the bales are shown in gray circle markers.
Relationship between micronaire and maturity parameters; (a) Immature Fiber Content (IFC%), and (b) Maturity ratio. Bales showing extreme length distribution patterns within clusters are shown in distinct point markers.
It can be seen from Figure 6 that the two groups correspond to pairs of bales having two levels of maturity for the same micronaire values. The samples with degraded length distributions are those that have a higher IFC% and a lower maturity ratio, while the bales with a lower degree of damage have a lower IFC% and a higher maturity ratio for the same micronaire levels. It appears therefore that the misclassification of some of the bales based on HVI parameters is related to the fact that bales having the same micronaire may correspond to different maturity levels, given the nature of micronaire as a complex measure of both maturity and fineness. Thus bales having the same micronaire (i.e. classified in the same HVI categories), but having different maturity levels will ultimately lead to the high variability in fiber length distribution as illustrated in Figure 5.
Conclusions
In order to test the effectiveness of current cotton fiber classification and selection procedures in controlling for variability in fiber length distribution, k-means cluster analysis was used to classify a broad range of 172 commercial cotton bales into homogenous quality groups based on three sets of classification criteria. The first set of criteria consisted of the major HVI parameters commonly used in commercial classification and in fiber selection in spinning operations; the parameters considered were micronaire, UHML (mm), length uniformity index (%) and bundle strength (g/tex). Another set of criteria consisted of fiber length parameters provided by the AFIS. In addition to those parametric criteria, a new approach based on empirical histograms of fiber length distribution was also used. Using this new approach, it was possible to quickly classify a sizable number of cotton bales into groups with homogenous length distribution patterns which appeared representative of varying degrees of fiber damage. Because of the impact of fiber maturity and strength on the propensity to break, clustering the bales based on length distribution patterns resulted in groups with different micronaire and strength levels.
A comparative analysis of the three approaches revealed that when classification was done using HVI properties only, approximately 23% of the bales appeared misclassified (i.e. cottons with significantly different length distributions were attributed to the same categories), which could result in undesirable laydown variability in critical properties such as short fiber content. The examination of the interactions among fiber properties indicated that the misclassification of those bales based on HVI parameters is related to the nature of micronaire as a complex measure of both maturity and fineness. Therefore, bales having the same micronaire may correspond to different maturity levels, and given the link between maturity and fiber damage, this can result in significant variability in fiber length distribution. This result underscores the need for, and the potential usefulness of, high volume measurement tools that could provide separate determination of maturity and fineness. The availability of such methods could prevent bale misclassifications resulting from misinterpretation of micronaire values and thus ensure better control of the variability in fiber length distribution. This research continues in order to quantify the potential impact of this variability on processing performance and, ultimately, on yarn quality, as well as to identify combinations of classification criteria that could help minimize variability within bale categories.
Acknowledgements
This research was funded, in part, by the Food and Fibers Research Grant Program administered by the Texas Department of Agriculture (grant number FF-d1011-7) and by Cotton Inc., Texas State Support (grant number 11–813TX).
