Abstract
Categorical color constancy has been widely investigated and found to be very robust. As one of object material properties, the surface gloss was found to barely contribute to color constancy in a natural viewing condition. In this study, the effect of surface gloss on categorical color constancy was investigated by asking eight observers to categorize 208 Munsell matte surfaces and 260 Munsell glossy surfaces under D65, F, and TL84 illuminants in a viewing chamber with a uniform gray background. A color constancy index based on the centroid shift of the color category was used to evaluate color constancy degree of each color category across illumination changes from D65 to F or TL84 illuminant. The result showed that both matte and glossy surfaces showed almost perfect color constancy on all color categories under F and TL84 illuminants, and there was no significant difference between them. This result suggests that surface gloss has little effect on categorical color constancy in a uniform gray background where the local surround cue was present, which is consistent with the previous findings.
When the intensity and spectral composition of the illuminant change, the same surface reflects different spectra, yet the human visual system maintains a constant color percept of that surface; this is referred to as color constancy (Foster, 2011; Li et al., 2009). Color constancy research has a long history and has mostly been performed with matte surfaces as stimuli. Recently, research has increasingly focused on complex and realistic 3-D scenes (Giesel & Gegenfurtner, 2010; Hedrich et al., 2009; Uchikawa, 2014; Zaidi & Bostic, 2008). Within these scenes, object surfaces have various properties other than color, such as texture, gloss, and 3-D shape, which might influence the color perception of object surfaces. Surface gloss (Ji et al., 2021; Ma et al., 2009) has been found to have a significant effect on the perception of surface saturation and lightness (Giesel & Gegenfurtner, 2010; Isherwood et al., 2021; Xiao & Brainard, 2008), and a smaller effect on the perception of hue (Giesel & Gegenfurtner, 2010), under constant illumination.
Several previous studies investigated the effects of surface gloss on color constancy across illumination changes using achromatic setting and asymmetric color matching methods. Xiao et al. (2012) compared the color constancy of matte spheres and glossy spheres in simulated 3-D scenes. The color constancy of glossy spheres was the same as that of matte spheres when the chromaticity of the uniform gray background was consistent with the illuminant chromaticity, and better than that of matte spheres when it was inconsistent with the illuminant chromaticity. Granzier et al.'s study (2014) showed that glossy cylinders had better color constancy than matte cylinders under a completely black background, with or without a checkerboard. Wedge-Roberts et al.'s study (2020) showed that specular highlights had little effect on color constancy under the uniform gray background, but significantly improved it under the mosaic background. Mizokami et al.'s study (2014) also showed that specular highlights had little effect on color constancy under a normal viewing condition in real 3-D scenes. In the above studies, Wedge-Roberts et al.'s mosaic background condition, Granzier et al.'s completely black background condition, and Xiao et al.'s inconsistent gray background condition are similar: the use of a uniform background color as a cue to aid color constancy was prevented, and the contribution of local surround adaptation to color constancy was limited. It can be concluded from these studies that the surface gloss of 3-D objects improved color constancy only when the local surround cue was absent.
Achromatic setting and asymmetric color matching methods are two traditional methods to investigate color constancy, and focus on the color appearance of one color patch under different illuminants. Color naming is also one typical method to investigate color constancy, in which surface color is classified into one of several color categories, such as the 11 basic color categories proposed by Berlin and Kay (1969), under different illuminants. Categorical color constancy concentrates on how constant the perception of one color category is across different illuminants. The color constancy of one color category was usually measured by its centroid shift across illumination changes. The centroid was calculated as the mean chromaticity coordinates of all surfaces classified into this category.
Previous studies on categorical color constancy employed matte surfaces as stimuli. Olkkonen et al. investigated categorical color constancy in real (Olkkonen et al., 2010) and simulated (Olkkonen et al., 2009) scenes across illumination changes from D65 to chromatic illuminants on the axes of DKL color space. They found robust categorical color constancy in both real and simulated scenes. Ma et al. (2018) investigated categorical color constancy under RGB-LED light sources in real scenes. They found that the color constancy of red, brown, orange, and yellow color categories was poor under blue illumination, which was considered to be caused by spectra of narrow spectral bands of RGB-LED light sources.
Little is known about the effect of surface gloss on categorical color constancy. We commonly need to judge whether a red glossy object is still red after the illumination changes from a white light to a chromatic light. Because surface gloss might influence color perception, the first purpose of this study was to compare the color classification results between matte and glossy surfaces under a constant illumination, in order to investigate whether surface gloss influences categorical color perception. The second purpose was to investigate the effect of surface gloss on categorical color constancy, that is, how the color category of surfaces interacts with their glossiness under changes in the illumination. There are two hypotheses. One is that, consistent with previous results obtained by asymmetric color matching and achromatic setting methods, surface gloss would not improve categorical color constancy under a uniform gray background condition. The other is that surface gloss would impede categorical color constancy, because the gloss perception of surfaces changes across illuminant conditions, which may change the color perception of surfaces.
In addition to the color classification task, observers were also required to choose the best example of each color category, that is, the prototypical color, among the surfaces (Lillo et al., 2014; Moreira et al., 2014; Olkkonen et al., 2009, 2010). Color category prototypes play an important role in learning basic color terms. Moreira et al.'s studies (Lillo et al., 2014; Moreira et al., 2014) compared the use of category prototypes between normal trichromats and red-green dichromats under a constant illumination. Olkkonen et al.'s studies using matte surfaces (2009, 2010) showed good color constancy for category prototypes. The present study also investigated the effect of surface gloss on the color constancy of color category prototypes.
In the study, the observers were asked to classify Munsell matte and glossy surfaces into nine basic color categories under D65, F, and TL84 illuminants in a uniform gray background in real scenes. The centroids and prototypes of color categories were compared between matte and glossy surfaces under D65, F, and TL84 illuminants respectively in CIE LAB color space. The color constancy indices across illumination changes from D65 to F or TL84 illuminant were calculated based on the centroids and prototypes for matte and glossy surfaces.
Methods
Apparatus
The experiment was conducted in a color assessment cabinet (Shenzhen Qiantongcai Color Management Co., Ltd) located in a dark room. A schematic diagram of the experimental setup is shown in Figure 1a. The interior of the color assessment cabinet has a height of 425 mm, a width of 575 mm, and a depth of 505 mm. The experiment was performed under D65, F, and TL84 illuminants. D65 illuminant was generated by two Philips MASTER TL-D 90 Graphica 36W/965 fluorescent lamps, F illuminant by two 40W tungsten lamps, and TL84 illuminant by two Philips TLD 18W/840 fluorescent lamps. All lamps used to produce the illuminants were located in the ceiling of the viewing cabinet and were not covered by a diffuser; the detailed location of all lamps in the ceiling is shown in Figure 1b. The ground of the viewing cabinet was gray with a spectral reflectance of about 25%, roughly corresponding to the gray of Munsell surface N5.5/. In the experiment, the surfaces were laid out on the ground of the viewing cabinet in a random arrangement; the farthest possible surface was 75 cm from the observer at an angle of approximately 22°, and the nearest possible surface was 30 cm away at an angle of 69°.

(a) Schematic diagram of the experimental setup with stimuli and the observer. (b) The location of all lamps in the ceiling of the viewing cabinet.
Observers
Eight naïve observers took part in the experiment. All of them (four females and four males, aged between 21 and 25 years) were students of Taiyuan University of Technology. All observers were confirmed to have normal color vision by pseudoisochromatic plates (Ziping Yu, 6th ed.), the Farnsworth D-15 color vision test, and the Farnsworth-Munsell 100-Hue test, administered under indoor illumination with a color temperature of about 6100 K. Written informed consent was given by all observers prior to the experiments.
Stimuli
We chose as stimuli all matte surfaces on the plane of Munsell value 5 in the Munsell Book of Color Matte Edition (Model: M40291B), 208 in total, and all glossy surfaces on the plane of Munsell value 5 in the Munsell Book of Color Glossy Edition (Model: M40115B), 260 in total. The 208 matte surfaces comprised 207 chromatic surfaces, covering 40 hues (in 2.5-unit steps) and all chroma levels available for each hue, and 1 achromatic surface, N5/. The 260 glossy surfaces comprised 240 chromatic surfaces covering 40 hues and all chroma levels, 1 achromatic surface, N5/, and 19 supplementary chromatic surfaces with additional hues, namely 1.25R5/12, 1.25R5/14, 6.25R5/12, 6.25R5/14, 3.75R5/12, 3.75R5/14, 1.25G5/12, 8.75R5/12, 8.75R5/14, 8.75R5/16, 3.7PB5/12, 6.25PB5/12, 1.25YR5/12, 3.75YR5/12, 6.25RP5/12, 6.25RP5/14, 8.75RP5/14, 3.75RP5/12, 8.75RP5/12. The glossiness levels of 10 matte and 10 glossy surfaces, with 10 hues starting from 5R (in 10-unit steps), value 5, and chroma 6, were measured by a gloss meter (CS-380 20/60/85°, Caipu Technology (Zhejiang) Co., Ltd.). The average glossiness over surfaces is shown in Table 1.
Glossiness levels of Munsell matte and glossy surfaces.
The spectral power distributions of D65, F, and TL84 illuminants were measured by a spectroradiometer (PR-655, Photo Research Inc.) from the Munsell matte white surface with a 45°/0° (illuminating/viewing) geometry, and are shown in Figure 2 (also see Figure 2 in our previous study (Ma et al., 2022)). Table 2 shows the corresponding correlated color temperatures, CIE1976 u'v' chromaticity coordinates, luminance values measured from the white surface (also see Table 1 (Ma et al., 2022)), illuminance levels measured by an illuminance meter (DLY-1802, Delixi) at the center of the cabinet ground, and general color rendering indices Ra.

Relative spectral power distributions of all illuminants.
Properties of all illuminants.
In calculating chromaticity coordinates, we used the spectral reflectances of matte and glossy surfaces (specular excluded) provided by the spectral color research group database at the University of Eastern Finland (https://www.uef.fi/spectral/spectral-database), as in previous studies (Olkkonen et al., 2010; Wedge-Roberts et al., 2020). To examine how well the reflectance data in the database matched those of the real surfaces, we measured the spectral reflectances of 40 matte and 40 glossy surfaces (specular excluded) with 40 hues (in 2.5-unit steps), value 5, and chroma 6 using the PR-655 spectroradiometer. The Euclidean distances in the CIE1976 u'v' chromaticity diagram between chromaticity points calculated with the two kinds of spectral reflectances ranged from 0.0008 to 0.0114 (mean 0.0047, SD 0.0023) for matte surfaces, and from 0.0041 to 0.0345 (mean 0.0139, SD 0.0080) for glossy surfaces. The difference between chromaticity coordinates calculated from the two kinds of reflectance spectra was thus small for matte surfaces and relatively large for glossy surfaces. Considering that the reflectance spectra in the database were obtained with a spectrometer (Lambda 18 UV/VIS, Perkin Elmer), whereas those of the real surfaces were obtained with the PR-655 spectroradiometer, whose lens was far from the measured surfaces, we also used the database reflectance spectra for glossy surfaces. Figure 3 shows the distribution of CIE1976 u'v' chromaticity coordinates under all illuminants of all matte and glossy surfaces, calculated with spectral reflectances from the database.
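The chromaticity comparison described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the CIE XYZ tristimulus values of each surface under a given illuminant have already been computed from the reflectance spectrum, and the function names and numeric values are hypothetical.

```python
import math


def xyz_to_uv_prime(X, Y, Z):
    """CIE 1976 u'v' chromaticity coordinates from CIE XYZ tristimulus values."""
    denom = X + 15.0 * Y + 3.0 * Z
    if denom == 0:
        raise ValueError("X + 15Y + 3Z must be non-zero")
    return 4.0 * X / denom, 9.0 * Y / denom


def uv_distance(p, q):
    """Euclidean distance between two (u', v') points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Hypothetical tristimulus values for one surface, computed once from a
# database reflectance spectrum and once from a measured spectrum.
xyz_database = (20.0, 19.8, 18.0)
xyz_measured = (20.5, 19.8, 17.5)

d = uv_distance(xyz_to_uv_prime(*xyz_database), xyz_to_uv_prime(*xyz_measured))
print(f"u'v' distance: {d:.4f}")
```

Repeating this for each of the 40 measured surfaces and summarizing the distances would give the kind of range, mean, and SD reported above.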

Distribution of CIE1976 u'v' chromaticity coordinates of Munsell matte and glossy surfaces under D65 (denoted by circles), F (squares), and TL84 (triangles) illuminants.
Procedure
During the experiment, the observer sat in front of the color assessment cabinet with a black coat and white gloves. The observer adapted to illuminant for 3 min before classification. All 208 matte or 260 glossy surfaces were laid out on the ground of the viewing cabinet simultaneously in a random arrangement, and observers were instructed to divide them into nine color category groups (red, green, blue, yellow, brown, orange, purple, pink, and gray) and choose the best example for each color category, that is, color category prototype. During the color classification, observers were allowed to interact with each surface and move the surface to the color category group it belongs to.
First, each observer sequentially completed the classification tasks for matte surfaces under D65, F, TL84, and D65 illuminants. Then, the classification tasks for glossy surfaces were completed in the same sequence of illuminant conditions. Each observer completed one session, corresponding to one illuminant condition, per day. The first session under D65 illuminant, hereinafter referred to as the D65(1) condition, was used as a training session. The second session under D65 illuminant, referred to as the D65(2) condition, was used as the standard illuminant condition against which the color constancy degree of each color category under F and TL84 illuminants was evaluated. The D65(1) condition was also compared with the D65(2) condition to obtain the intra-observer variability of classification in the Results section.
Each session took about 30 min on average. The whole experiment took about 32 h in total (0.5 h × 4 sessions × 8 observers × 2 types of color surfaces). Data for matte surfaces have already been presented in our previous study (Ma et al., 2022) as the data of color-normal observers in the comparison with data of color-deficient observers.
Data Analysis
In order to examine whether the surface gloss has an effect on the color classification under a constant illumination, the classification results of matte and glossy surfaces under D65, F, or TL84 illuminant were compared in a*-b* plane of CIE LAB color space. The centroids, prototypes, and response frequency for color categories were also compared between matte and glossy surfaces.
Our visual system has an ability to keep constant color perception of objects across illumination changes, that is, the color perception of objects remains unchanged although their chromaticity coordinates have changed with illumination changes. Arend et al.'s color constancy index (1991) that was used to evaluate color constancy degree in asymmetric color matching was modified in this study to measure the color constancy degree of each color category.
The matched point, the standard point, and the theoretical point in Arend et al.'s constancy index were replaced with the observed centroid, the standard centroid, and the theoretical centroid of one color category. Consequently, the constancy index of one color category was defined as I = 1−dop/dsp, where dop denotes the Euclidean distance in the CIE1976 u'v' chromaticity diagram from the observed centroid of this color category to the theoretical centroid, and dsp denotes the Euclidean distance from the standard centroid to the theoretical centroid. The standard centroid was calculated from the classification results under D65 illuminant: it was the average of the u'v' chromaticity coordinates under D65 illuminant of all surfaces classified into this category under D65 illuminant. The theoretical centroid was calculated on the assumption that all surfaces classified into one color category under D65 illuminant would continue to be classified into this category under F or TL84 illuminant: it was the average of the u'v' chromaticity coordinates under F or TL84 illuminant of all surfaces classified into this category under D65 illuminant. The observed centroid was calculated from the classification results under F or TL84 illuminant: it was the average of the u'v' chromaticity coordinates under F or TL84 illuminant of all surfaces classified into this category under that illuminant. When the observed centroid coincides with the theoretical centroid, color constancy is perfect and the constancy index I equals 1. When the observed centroid coincides with the standard centroid, there is no color constancy and I equals 0, meaning that observers classified colors according to the spectra reflected from the surfaces under F or TL84 illuminant rather than the spectral reflectance characteristics of the surfaces.
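The three centroids and the index defined above can be sketched as follows. This is an illustrative sketch only: the data layout (dictionaries keyed by a surface identifier), the function names, and the toy values are assumptions introduced here, not the authors' implementation.

```python
import math


def centroid(points):
    """Mean (u', v') of a non-empty list of chromaticity points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def distance(p, q):
    """Euclidean distance in the u'v' chromaticity diagram."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def category_constancy_index(uv_std, uv_test, labels_std, labels_test, category):
    """I = 1 - d_op / d_sp for one color category.

    uv_std, uv_test: surface id -> (u', v') under D65 / under the test illuminant
    labels_std, labels_test: surface id -> category named under each illuminant
    """
    in_std = [s for s, c in labels_std.items() if c == category]
    in_test = [s for s, c in labels_test.items() if c == category]
    if not in_std or not in_test:
        raise ValueError("category must be used under both illuminants")
    standard = centroid([uv_std[s] for s in in_std])       # D65 coords, D65 labels
    theoretical = centroid([uv_test[s] for s in in_std])   # test coords, D65 labels
    observed = centroid([uv_test[s] for s in in_test])     # test coords, test labels
    d_sp = distance(standard, theoretical)
    d_op = distance(observed, theoretical)
    if d_sp == 0:
        raise ValueError("standard and theoretical centroids coincide")
    return 1.0 - d_op / d_sp


# Toy data (hypothetical): two surfaces whose names do not change with the
# illuminant, so the observed centroid equals the theoretical centroid.
uv_d65 = {"s1": (0.20, 0.45), "s2": (0.30, 0.45)}
uv_f = {"s1": (0.25, 0.48), "s2": (0.35, 0.48)}
names_d65 = {"s1": "green", "s2": "red"}
names_f = {"s1": "green", "s2": "red"}
print(category_constancy_index(uv_d65, uv_f, names_d65, names_f, "red"))
```

In the study, the index would be computed per observer and per category and then averaged; the sketch handles a single observer and a single category.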
The constancy index for the category prototype was also calculated for illumination changes from D65 to F or TL84 illuminant for matte and glossy surfaces. It reflected how constant the prototypical color chosen by the observer for each color category can be across illuminant changes.
Results
Intra-Observer Variability of the Centroids and Prototypes Between Two Sessions of D65 Illuminant Conditions
Theoretically, for a perfectly consistent observer, all surfaces classified into one color category under the first session of D65 illuminant should be classified into the same color category under the second session of D65 illuminant, that is, the centroid of each color category should be the same between two sessions. The prototypical color of each color category chosen by the observer under the first session of D65 illuminant should also be chosen as the prototypical color of this color category under the second session of D65 illuminant.
Figure 4a shows the color differences ΔE*ab in CIELAB color space between two centroids of each color category obtained under two sessions of D65 illuminant for matte and glossy surfaces. The color differences ΔE*ab of centroids were relatively small for both matte and glossy surfaces, ranging from 1.9 units of blue category to 5.5 units of yellow category for matte surfaces (mean 3.2 units) and from 1.8 units of green category to 5.0 units of yellow category for glossy surfaces (mean 3.3 units).
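For reference, the color difference used here can be sketched as below, assuming ΔE*ab denotes the CIE 1976 color difference, that is, the Euclidean distance between two points in L*a*b* space. The function name and the sample centroid coordinates are hypothetical.

```python
import math


def delta_e_ab(lab1, lab2):
    """CIE 1976 color difference between two (L*, a*, b*) triples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab1, lab2)))


# Hypothetical centroids of one category from two D65 sessions.
centroid_session1 = (50.0, 20.0, 30.0)
centroid_session2 = (50.0, 23.0, 34.0)
print(delta_e_ab(centroid_session1, centroid_session2))
```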

The color differences ΔE*ab in CIELAB color space between two centroids (a) and two prototypical colors (b) of each color category under two sessions of D65 illuminant for matte and glossy surfaces. Error bars represent standard errors of the means over eight observers. Labels on the x-axis correspond to pink (Pk), red (R), orange (O), yellow (Y), brown (Br), green (G), blue (B), purple (P), and gray (Gr) categories from left to right.
In a two-way ANOVA analysis with color category and surface glossiness as the factors, there was no statistically significant difference between matte and glossy surfaces (F(1, 126) = 0.078, p = .781); the overall difference among color categories reached statistical significance (F(8, 126) = 3.229, p = .002). The interaction effect between two factors was not found (p = .947). The result indicates that the intra-observer variability of centroids of color categories in color classification is independent of surface glossiness, but relates to the color category.
Figure 4b shows the color differences ΔE*ab between two prototypical colors of each color category chosen by the observer under two sessions of D65 illuminant for matte and glossy surfaces. In general, the color differences of prototypical colors were larger than those of centroids, meaning that the intra-observer variability was larger for prototypical colors than for centroids of color categories. This can be expected because the centroids were values averaged over surfaces classified into this color category, but the prototypical color was one of surfaces in this category.
As with the color differences of centroids, color category had a significant main effect (F(8, 126) = 3.103, p = .003) on the color differences of prototypical colors, while surface glossiness did not (F(1, 126) = 1.552, p = .215). There was no significant interaction effect between the two factors (p = .249). The result indicates that the intra-observer variability of prototypical colors of color categories is also mainly dependent on the color category.
Comparison of Classification Results Between Matte and Glossy Surfaces
Figure 5 shows the classification results in CIELAB color space under the second session of D65 illuminant, F, and TL84 illuminants for matte and glossy surfaces. For the comparison between matte and glossy surfaces, only glossy surfaces having the same hue, value, and chroma as matte surfaces were retained, a total of 207 (the glossy surface 7.5PB 5/12 was absent from the Munsell Book of Color Glossy Edition); the others were excluded. The color category of each surface corresponds to the one chosen most frequently by observers. As the figure shows, the distribution of color categories in the chromaticity diagram was almost the same between matte and glossy surfaces under the three illuminants. The centroid of each color category tended to fall near the category center, and the prototypical color tended to be the color of a surface with a high chroma level in the category. The shift of the centroid or prototypical color of each color category across illuminants was evaluated by a color constancy index in the section “Color Constancy Across Illuminants”.
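Aggregating over observers by the most frequently chosen category can be sketched as below. The function name and the sample labels are hypothetical, and the source does not say how ties between categories were resolved; this sketch simply returns whichever tied category was encountered first, which is an arbitrary choice.

```python
from collections import Counter


def modal_category(observer_labels):
    """Category named most often for one surface across observers.

    Ties are resolved by first occurrence (an assumption, not from the study).
    """
    if not observer_labels:
        raise ValueError("at least one label is required")
    return Counter(observer_labels).most_common(1)[0][0]


# Hypothetical labels given by eight observers to one surface.
labels_for_surface = ["green", "green", "blue", "green",
                      "green", "blue", "green", "green"]
print(modal_category(labels_for_surface))
```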

Color categories aggregated over observers under three illuminants for matte surfaces (a) and glossy surfaces (b) shown in CIELAB color space. The circles and triangles with black borders represent the centroids and the prototypical colors of color categories respectively. Symbol colors correspond to color categories that they indicate. Note. Please refer to the online version of the article to view the figure in colour.
Figure 6a shows the color differences ΔE*ab in CIELAB color space of centroids between matte and glossy surfaces for each color category under D65(2), F, and TL84 illuminants. The color differences ΔE*ab of centroids between matte and glossy surfaces ranged from 0.9 units of gray category to 6.4 units of yellow category under D65(2) illuminant condition (mean 3.2 units), from 1.6 units of gray category to 8.4 units of yellow category under F illuminant (mean 4.3 units), and from 1.4 units of gray category to 10.5 units of red category under TL84 illuminant (mean 4.8 units). In a two-way ANOVA analysis with color category and illuminant as the factors, the interaction effect between two factors was found (p = .025).

The color differences ΔE*ab in CIELAB color space of centroids (a) and prototypical colors (b) between matte and glossy surfaces for each color category under D65(2), F, and TL84 illuminants. Error bars represent standard errors of the means over eight observers. Note. Please refer to the online version of the article to view the figure in colour.
Figure 6b shows the color differences ΔE*ab in CIELAB color space of prototypical colors between matte and glossy surfaces for each color category under D65(2), F, and TL84 illuminants. The color differences ΔE*ab of prototypical colors between matte and glossy surfaces ranged from 2.2 units of gray category to 16.0 units of green category under D65(2) illuminant condition (mean 10.3 units), from 3.9 units of gray category to 21.0 units of green category under F illuminant (mean 12.6 units), and from 2.1 units of gray category to 17.0 units of green category under TL84 illuminant (mean 11.2 units). These color differences were slightly higher than those between the two D65 sessions shown in Figure 4. In a two-way ANOVA with color category and illuminant as the factors, a significant interaction effect between the two factors was found (p = .032).
Figure 7 shows the response frequency of each color category, that is, the proportion of the number of surfaces classified into this color category in the whole surface collection, under the second session of D65 illuminant, F, and TL84 illuminants for matte and glossy surfaces. Two things can be seen from the figure. First, the response frequency of each color category was close between matte and glossy surfaces under all illuminants. Second, the tendency of response frequency as a function of color categories was almost the same across illuminants: the response frequency was highest for green category (26.4% averaged over illuminants), and lowest for red (4.4%), orange (4.2%), and yellow (5.6%) categories.

The response frequency of color categories averaged over eight observers for matte and glossy surfaces under three illuminants. Error bars denote standard errors of the means.
A three-way ANOVA analysis with color category, illuminant, and surface gloss was performed on the response frequency. It was found that there were significant interaction effects between color category and illuminant (F(16, 378) = 2.407, p = .002). The interaction effect between three factors was not found.
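The response frequency defined above (the share of the surface collection assigned to each category) can be sketched for one observer and one illuminant as follows. The data layout, function name, and sample labels are hypothetical.

```python
from collections import Counter

# The nine categories used in the classification task.
CATEGORIES = ["red", "green", "blue", "yellow", "brown",
              "orange", "purple", "pink", "gray"]


def response_frequency(labels):
    """Proportion of surfaces assigned to each category.

    labels: surface id -> category named by one observer under one illuminant
    """
    if not labels:
        raise ValueError("no classified surfaces")
    counts = Counter(labels.values())
    total = len(labels)
    return {c: counts[c] / total for c in CATEGORIES}


# Hypothetical classification of four surfaces.
labels = {"s1": "green", "s2": "green", "s3": "blue", "s4": "pink"}
freq = response_frequency(labels)
print(freq["green"], freq["red"])
```

Averaging these proportions over observers would give values comparable to those plotted in Figure 7.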
Figure 8 shows the color differences ΔE*ab between the centroid and the prototypical color of each color category under three illuminants for matte and glossy surfaces. The tendency of the color differences as a function of color categories was generally similar between matte and glossy surfaces and also similar across illuminants. On average over illuminants and surface type, the color difference between the centroid and the prototypical color decreased from about 16.2 units of pink category to about 9.1 units of brown category, then increased to about 24.6 units of green category, and finally decreased to about 2.5 units of gray category. That is, the deviation between the centroid and the prototypical color was highest for pink and green categories, and lowest for brown and gray categories.

The color differences ΔE*ab between the centroid and the prototypical color of each color category for matte and glossy surfaces under three illuminants. Error bars denote standard errors of the means over observers.
In a three-way ANOVA analysis with color category, illuminant, and surface glossiness as the factors, color category had a significant main effect (F(8, 378) = 49.918, p < .001); there were no other effects. The result indicates that the deviation between the centroid and the prototypical color of one color category was dependent on the color category, but not illuminant and surface glossiness.
Color Constancy Across Illuminants
Figure 9 shows the overall color naming consistency for three illuminant pairs for matte and glossy surfaces, which was defined as the proportion of surfaces classified into the same category for each illuminant pair in the whole surface collection. Overall color naming consistency for three illuminant pairs was approximately 80% for both matte and glossy surfaces.

Overall color naming consistency for three illuminant pairs for matte and glossy surfaces. Error bars denote standard errors of the means across eight observers.
No significant effects of illuminant and surface glossiness were found (F(2, 42) = 0.712, p = .496; F(1, 42) = 0.337, p = .565). Such degree of consistency is in line with overall color naming consistency of 80% collected in full-cue condition in the previous study (Olkkonen et al., 2010). In our previous study (Ma et al., 2018), the overall color naming consistency between D65 and blue (or yellow) illuminants was approximately 80%, consistent with the present result, but that between D65 and red (or green) illuminants was about 85%, higher than the present result.
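The overall color naming consistency for one illuminant pair, as defined above, can be sketched as the fraction of surfaces that receive the same category under both illuminants. The function name and the sample labels are hypothetical.

```python
def naming_consistency(labels_a, labels_b):
    """Fraction of surfaces given the same category under two illuminants.

    labels_a, labels_b: surface id -> category named under each illuminant.
    Only surfaces classified under both illuminants are counted.
    """
    shared = labels_a.keys() & labels_b.keys()
    if not shared:
        raise ValueError("no surfaces classified under both illuminants")
    same = sum(1 for s in shared if labels_a[s] == labels_b[s])
    return same / len(shared)


# Hypothetical labels for five surfaces under D65 and under F.
labels_d65 = {"s1": "red", "s2": "green", "s3": "blue", "s4": "pink", "s5": "gray"}
labels_f = {"s1": "red", "s2": "green", "s3": "purple", "s4": "pink", "s5": "gray"}
print(naming_consistency(labels_d65, labels_f))
```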
A color constancy index based on the centroids was also used to evaluate the color constancy degree of each color category. The calculation of the constancy index is described in detail in the section “Data Analysis”. Figure 10 shows constancy indices based on the centroids of color categories for matte and glossy surfaces under F (a) and TL84 (b) illuminants. Under F illuminant, the color constancy index ranged from 0.87 of yellow category to 0.97 of gray category for matte surfaces (mean 0.93) and from 0.85 of orange category to 0.96 of gray category for glossy surfaces (mean 0.92). The color constancy index under TL84 illuminant was lower than under F illuminant, ranging from 0.62 of yellow category to 0.93 of gray category for matte surfaces (mean 0.80) and from 0.69 of yellow category to 0.90 of gray category for glossy surfaces (mean 0.81).

Constancy indices based on the centroids of color categories for matte and glossy surfaces under F (a) and TL84 (b) illuminants. Error bars denote standard errors of the means across eight observers.
In a three-way ANOVA analysis with illuminant, color category, and surface glossiness as the factors, both illuminant and color category had significant main effects (F(1, 252) = 62.847, p < .001; F(8, 252) = 7.914, p < .001), and there was a significant interaction effect between them (F(8, 252) = 2.552, p = .011) (significance level: 0.05). No other effects were found. The result indicates that there was no significant difference of color constancy index based on the centroids of color categories between matte and glossy surfaces.
Figure 11 shows constancy indices based on the prototypical colors. In general, color constancy indices for the prototypical colors were lower than for the centroids. Under F illuminant, color constancy indices varied between 0.65 of pink category and 0.92 of gray category for matte surfaces (mean 0.79), and between 0.67 of orange category and 0.93 of gray category for glossy surfaces (mean 0.84). As in Figure 10, color constancy indices under TL84 illuminant (mean 0.53 for matte surfaces and 0.46 for glossy surfaces) were lower than under F illuminant.

Constancy indices based on the prototypical colors of color categories for matte and glossy surfaces under F (a) and TL84 (b) illuminants. Error bars denote standard errors of the means across eight observers.
Constancy indices of the yellow and orange categories under TL84 illuminant were very small or even negative, with large individual variability, for both matte and glossy surfaces. This is because the prototypical colors that some observers chose under F or TL84 illuminant (the observed prototypical colors) were far from the chromaticity points, under the same illuminant, of the prototypes chosen under D65 illuminant (the theoretical prototypical colors).
A three-way ANOVA with illuminant, color category, and surface glossiness as factors revealed a significant interaction between illuminant and color category (F(8, 248) = 2.721, p = .007), but no significant effect of surface glossiness, indicating that color constancy indices for the prototypical colors were not significantly different between matte and glossy surfaces.
Figures 10 and 11 show that constancy indices were higher under F illuminant than under TL84 illuminant, irrespective of surface glossiness. In the following, we reanalyze the degree of color constancy of each color category in terms of the a and b values used to calculate the constancy index based on the centroids.
Figure 12 shows the a values (a) and b values (b) of color categories across illuminant changes from D65 to F and TL84 illuminants for matte and glossy surfaces. The a value denotes the Euclidean distance between the standard centroid and the theoretical centroid in the CIE 1976 u'v' chromaticity diagram; the b value denotes the Euclidean distance between the matched centroid and the theoretical centroid. Because the illuminant change from D65 to F caused larger chromatic shifts for surfaces than the change from D65 to TL84, the a values of all color categories were larger under F illuminant than under TL84 illuminant. The b values averaged over matte and glossy surfaces ranged from 0.003 units for the gray and green categories to 0.012 units for the purple category under F illuminant (mean 0.007 units), and from 0.004 units for the gray and green categories to 0.011 units for the red category under TL84 illuminant (mean 0.008 units).

Figure 12. The a values (a) and b values (b) of color categories in the calculation of constancy index based on the centroids across illuminant changes from D65 to F and TL84 illuminants for matte and glossy surfaces. Error bars denote standard errors of the means across eight observers.
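The a and b values defined above are plain Euclidean distances in the u'v' diagram, so they are straightforward to compute. The following is a minimal sketch, not the authors' code. It assumes that the centroid-based index takes the common form CI = 1 - b/a, which is not restated in this section; the function names and all chromaticity coordinates are hypothetical and chosen only for illustration.

```python
import math


def uv_distance(p, q):
    """Euclidean distance between two (u', v') chromaticity points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def constancy_index(standard, theoretical, matched):
    """Return (a, b, CI) for one color category.

    a: distance between the standard centroid and the theoretical centroid.
    b: distance between the matched centroid and the theoretical centroid.
    CI = 1 - b / a is an ASSUMED form of the index (not taken from this section).
    """
    a = uv_distance(standard, theoretical)
    b = uv_distance(matched, theoretical)
    if a == 0:
        raise ValueError("a is zero: the illuminant change produced no shift")
    return a, b, 1.0 - b / a


# Hypothetical (u', v') centroids for a single category, for illustration only.
standard = (0.200, 0.470)      # centroid under the reference illuminant
theoretical = (0.230, 0.510)   # same surfaces rendered under the test illuminant
matched = (0.233, 0.514)       # centroid of the category observed under the test illuminant

a, b, ci = constancy_index(standard, theoretical, matched)
print(f"a = {a:.4f}, b = {b:.4f}, CI = {ci:.3f}")

# A matched centroid farther from the theoretical centroid than the standard one
# (b > a) yields a negative index under the assumed form.
far_matched = (0.170, 0.430)
_, b_far, ci_far = constancy_index(standard, theoretical, far_matched)
print(f"b = {b_far:.4f}, CI = {ci_far:.3f}")
```

Under this assumed form, the index equals 1 when the matched and theoretical centroids coincide and falls below zero whenever b exceeds a.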
In a three-way ANOVA on the b values with illuminant (F and TL84 illuminants), color category, and surface glossiness as factors, color category had a significant effect (F(8, 252) = 7.476, p < .001); neither illuminant nor surface glossiness had a significant effect (F(1, 252) = 1.619, p = .204; F(1, 252) = 0.075, p = .784). No interaction effects were found. This indicates that the b value depended on the color category, but not on the illuminant or surface glossiness.
The b values shown in Figure 12b were quite close to the distances in Figure 4a between the two centroids of each color category obtained in two sessions under D65 illuminant. Perfect color constancy implies that the classification result under F or TL84 illuminant is the same as that under D65 illuminant, that is, the matched centroids of all color categories overlap with their theoretical centroids. The present result showed that, for all color categories, the deviations of the matched centroids from the theoretical centroids were close to the distances between the two centroids from the two D65 sessions, indicating that the difference in classification results between F (or TL84) illuminant and D65 illuminant was close to that between two sessions under D65 illuminant. Thus it can be concluded that almost perfect color constancy was achieved under F and TL84 illuminants. The significantly larger constancy index under F illuminant than under TL84 illuminant in Figure 10 is therefore attributable to the a values.
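The attribution to the a values can be made explicit with a short derivation. It is a sketch under the assumption, not restated in this section, that the centroid-based index has the common form CI = 1 - b/a:

```latex
\[
\mathrm{CI} = 1 - \frac{b}{a},
\qquad
\frac{\partial\,\mathrm{CI}}{\partial a} = \frac{b}{a^{2}} > 0
\quad (a > 0,\ b > 0).
\]
```

With b nearly equal under the two illuminants (means of 0.007 and 0.008 units), the larger a under F illuminant gives a smaller ratio b/a and hence a larger index, even though the deviation of the matched centroid from the theoretical centroid is essentially the same.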
Discussion
Of the three perceptual attributes of color, saturation and lightness were found to be affected by surface gloss (Giesel & Gegenfurtner, 2010; Isherwood et al., 2021), whereas hue was barely affected (Giesel & Gegenfurtner, 2010). Xiao and Brainard (2008) found that the effect of surface gloss on the color perception of objects was quite small under a constant illuminant in complex scenes and that the visual system compensated for gloss changes to stabilize color appearance. The present study compared classification results between matte and glossy surfaces under each of the D65, TL84, and F illuminants. Color differences of centroids and prototypical colors between matte and glossy surfaces were slightly larger than the intra-observer variability for matte and glossy surfaces, but this is likely due to slight differences in spectral reflectance between the matte and glossy surfaces. In general, surface gloss had little effect on the color classification results, which is consistent with Xiao and Brainard's study (2008).
The result of this study showed little effect of surface gloss on categorical color constancy under a uniform gray background, agreeing with previous findings that surface gloss did not improve color constancy when the local surround cue was present. In Wedge-Roberts et al.'s study (2020), 3-D scenes with a uniform gray background and 3-D objects were generated by CG. The color constancy of matte objects did not differ significantly from that of glossy objects. In Xiao et al.'s study (2012), 3-D scenes were also generated by CG, and a local uniform gray background was presented. The color constancy of glossy spheres was not found to be better than that of matte spheres when the chromaticity coordinates of the uniform gray background were the same as those of the illuminant. Our result is consistent with theirs, although we used real 3-D scenes and 2-D surface stimuli. In Granzier et al.'s study (2014), real 3-D scenes and objects were used. Color constancy was found to be significantly higher for glossy objects than for matte objects, independent of the background. This result is inconsistent with our result and with the results of the two other studies (Wedge-Roberts et al., 2020; Xiao et al., 2012). The inconsistency might be due to the different backgrounds used in their study. They used two kinds of background: one was totally black; the other was a black background on which a checkerboard was additionally placed. In both backgrounds, the local surround cue was silenced. Taken together, the results of our study and of previous studies (Granzier et al., 2014; Wedge-Roberts et al., 2020; Xiao et al., 2012) suggest that the setting of the surroundings, in particular the immediate surroundings of objects, is a key factor in determining whether surface gloss can contribute to color constancy.
In Mizokami et al.'s study (2014), real 3-D scenes and objects were used. Their result showed that specular highlights barely contributed to color constancy in a normal viewing condition. Our result, obtained with 2-D flat surfaces that carry no familiar object identity, is consistent with theirs, indicating that the contribution of surface gloss to color constancy is masked not by the 3-D shape of objects but by the presence of the local surround cue.
In the present study, the three illuminants were produced by different kinds of lamps (see Figure 1b), and they had different correlated color temperatures and illuminance levels. The distances from the three light sources to the stimuli were also different. Both the distance from light source to surface and the illuminance have been found to have a strong effect on gloss perception (Zhu et al., 2022). When the illumination changes from D65 to TL84 or F illuminant, the perception of surface gloss changes, but good color constancy was obtained for glossy surfaces, suggesting that color perception has priority over gloss perception and that color constancy is more robust than gloss constancy.
In color constancy experiments that use the asymmetric color matching method, observers are required to make a paper match, that is, to adjust the color of a test patch under a chromatic illuminant so that it appears as if it were cut from the same piece of paper as a standard patch under a neutral illuminant. In the categorical color constancy experiment, observers were instructed to classify surfaces into the same number of color categories under both neutral and chromatic illuminants, which can be regarded as a kind of color category match between chromatic and neutral illuminants. In general, color constancy experiments involve a comparison or match of surface colors between chromatic and neutral illuminants. However, the selection of the color category prototype under the chromatic illuminant was independent of that under the neutral illuminant: observers selected the prototype for each color category under each illuminant according to the criterion that the prototype should be the best example of that color category under that illuminant, which did not involve a comparison between chromatic and neutral illuminants. The constancy of the prototypical color across illuminant changes would therefore be expected to be poor. The present study, however, showed that the constancy of the prototypical color for each color category was relatively good across illumination changes from D65 to F or TL84 illuminant, indicating that the prototype of each color category is intrinsic to observers and not affected by illumination changes.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by Humanities and Social Science Foundation of the Ministry of Education of China (22YJCZH125); National Natural Science Foundation of China (61705011 and 61872261).
Correction (March 2023):
Article updated to correct the reference cited in the fourth line of second paragraph in section “Color Constancy Across Illuminants”.
