Abstract
Current databases of facial expressions represent only a small subset of expressions, usually the basic emotions (fear, disgust, surprise, happiness, sadness, and anger). To overcome these limitations, we introduce a database of pictures of facial expressions reflecting the richness of mental states. A total of 93 expressions of mental states were interpreted by two professional actors, and high-quality pictures were taken under controlled conditions in front and side view. The database was validated in two experiments. First, a four-alternative forced-choice paradigm was employed to test the ability to select a term associated with each expression. Second, the task was to locate each face within a 2-D space of valence and arousal. Results from both experiments demonstrate that subjects can reliably recognize a great diversity of emotional states from facial expressions. While subjects’ performance was better for front view images, the advantage over the side view was not dramatic. This is the first demonstration of the high degree of accuracy human viewers exhibit when identifying complex mental states from only partially visible facial features. The McGill Face Database provides a wide range of facial expressions that can be linked to mental state terms and can be accurately characterized in terms of arousal and valence.
Introduction
Faces represent a special, very complex class of visual stimuli and have been extensively studied in a wide range of research areas. In particular, facial expressions are among the most important sources of information about the mental states of others. The capacity to make mental state inferences, whether from faces or other sources, is known as theory of mind (ToM), and it is widely agreed that this capacity is essential to human social behaviour. There is also substantial evidence that a ToM deficit may be associated with a variety of clinical conditions, notably autism (Baron-Cohen, Jolliffe, Mortimore, & Robertson, 1997; Baron-Cohen, Wheelwright, Hill, Raste, & Plumb, 2001) and schizophrenia (Bora, Yucel, & Pantelis, 2009; Brüne, 2005; Harrington, Siegert, & McClure, 2005; Sprong, Schothorst, Vos, Hox, & Van Engeland, 2007). Hence, the assessment of ToM is important for the exploration of social cognition in healthy individuals as well as in some patients. It may also be useful to measure a change in the social capacities of patients in psychotherapy. The “Reading the Mind in the Eyes” test (Baron-Cohen et al., 1997, 2001) is a common ToM test in which participants have to choose a mental state term that best characterizes the expression in a picture of someone’s eyes. However, only a small proportion of possible mental states are tested, and the stimuli themselves are of inconsistent quality with respect to image resolution, luminance and perspective. Most other comparable databases of facial expressions of mental states typically include only a small subset of expressions, typically the basic emotions proposed by Paul Ekman (1992): fear, disgust, surprise, happiness, sadness, and anger) – the emotional expressions that are considered universal. However, multiple secondary emotions where two or more primary emotions are mixed (e.g., hatred being a mix of anger and disgust) are highly underrepresented in the databases available. One exception is the “Mind Reading” database (DVD; Baron-Cohen, Golan, Wheelwright, & Hill, 2004) that contains a much wider range of mental states. The Mind Reading DVD is computer-based platform developed to help individuals diagnosed along the autism spectrum to recognize facial expressions. It contains 412 mental state concepts, each assigned to 1 of 24 mental state classes. However, it is designed for commercial and clinical use and specifically targets patients with autism spectrum disorder and Asperger syndrome. A list of popular face stimuli databases is shown in Table 1. Most databases represent only a very small subset of emotions encountered in daily life and often in exaggerated form. To overcome these limitations, we have developed and validated a large new database of pictures of facial expressions – the McGill Face Database – that reflects some of the richness of human mental states. The database contains high-resolution pictures of 93 expressions of mental states that were interpreted by two professional actors (one male and one female) in front and side view – 372 images in total. In this article, we present two different experiments to investigate subjects’ ability to recognize the facial expressions in the database. In Experiment 1, we employ a four-alternative forced-choice (4-AFC) paradigm, based on previous studies (Baron-Cohen et al., 1997, 2001). The task for the observer in this experiment was to choose, out of four terms, the one that best identifies the mental state expressed. Given that a particular “correct” term is only a representation of the actors’ interpretations of the mental state, a second validation experiment (Experiment 2) was carried out, which did not rely on the semantics of the mental state terms. Instead, the observers located each face within a two-dimensional (2-D) space of valence and arousal (mental state – space) employing a “point-and-click” paradigm (Jennings, Yu, & Kingdom, 2017).
Summary of Face Databases.
Note. n.s. = not specified.
Database
Actor Recruitment
Five male and five female professional native English-speaking actors were invited to take part in an audition. The actors’ performance was judged by a panel of two of the authors and a theatre-experienced Professor of Drama and Theatre in the McGill Department of English. During the audition, one male and one female actor engaged in various improvisation exercises. The “best actors” were those who exhibited the most precise, nuanced, and yet readable range of emotional expression in their faces, that is, that clarity of emotional expression – as captured by the camera – was paramount. Some actors were better able to convey different emotions through subtle recalibration of facial expression, while others either got “stuck in look” or fell into exaggerated or melodramatic countenances. The two best performing actors (male, age 29; female, age 23) were chosen to take part in a photo shoot based on a majority vote. The actors gave informed consent and signed an agreement allowing for the pictures to be used for research and other noncommercial purposes. The actors were compensated for their work.
Images
Equipment
The pictures were taken by a professional photographer with a Canon 70 D digital camera mounted on a tripod at a distance of 1.5 m from the actor. The optic was a Canon 85 mm, f1.8 with a shutter speed of
Image acquisition
The pictures were taken in two separate sessions at a studio specifically prepared for that purpose. During the sessions, the actor was positioned in front of a white screen. The instructor provided the mental state term and read the corresponding short explication provided in the Glossary in Appendix B of Baron-Cohen et al. (2001). The actor was given as much time as needed to prepare the interpretation for the relevant expression. When the actor gave a hand signal to the photographer, a single picture was taken in front view. Importantly, to guarantee a natural interpretation of a given expression, we did not restrict the head tilt. The actor then immediately turned to face a mark 30° from the camera, and a second picture was taken. This procedure was repeated three to four times for each of 93 mental state terms used in the Reading the Mind in the Eyes test (Baron-Cohen et al., 2001; Table 2).
Summary of Terms in the McGill Face Database.
Image selection
A focus group, consisting of six referees (four females and two males) were presented with the different images for a given expression and asked to compare their quality and expressivity of mental state. Four out of six referees had to agree on a picture for it to be selected for inclusion in the database. The full database can be downloaded at: http://www.gunnar-schmidtmann.com/stimuli-software#McGillFaceDatabase.
Image specificities
The database contains 372 jpeg image files with a resolution of 5,472 × 3,648 pixel (colour space profile: sRGB IEC61966-2.1). The size of each image is 7.3 MB. The image files have not been postprocessed. Raw image files are available upon request from the first author.
Experiment 1
Methods
Subjects
All participants were recruited via the McGill Psychology Human Participant Pool or via public advertisements. Thirty-three individuals (7 males, 26 females; mean age 21 years, ±2.96 SD) participated in Experiment 1. All subjects were native English speakers and were naïve as to the purpose of the study. Subjects had normal or corrected-to-normal visual acuity. Informed consent was obtained from each observer. All experiments were approved by the McGill University Ethics committee and were conducted in accordance with the original Declaration of Helsinki.
Apparatus
The face stimuli were presented using MATLAB (MathWorks, Natick, Massachusetts, USA) (MATLAB R 2016b, MathWorks) on either a CRT monitor running with a resolution of 1,600 × 1,200 pixel and a frame rate of 60 Hz (mean luminance
Procedure
A 4-AFC paradigm was employed to test the ability of participants to correctly select the term associated with each picture in the database. All 372 pictures (93 male front view, 93 male side view, 93 female front view, and 93 female side view) were tested in one experimental block. The images were presented in random order, different for every observer. Stimuli were presented for 1 s. This presentation time was based on previous results, where identification accuracy for the same face stimuli was measured as a function of presentation time (Schmidtmann, Sleiman, Pollack, & Gold, 2016). The presentation of the face image was followed by the presentation of the target (correct) term as well as three distractor terms. Importantly, to minimize a decision bias caused by specific terms, the distractor terms were randomly selected from the remaining 92 terms shown in Table 2. In other words, each observer was presented with different distractor terms for each face. The terms were presented on a midgrey screen in a diamond-like arrangement (see Figure 1), corresponding to the cursor keys on a computer keyboard, which were used to by the observers to make their choice. The target term could occur in one out of four locations, which was randomly determined. The task for the observer was to choose the term most appropriate to the expression in the picture. Participants were given a break after 93 presentations, that is, three breaks in total.

Experiment 1: Example of decision display. All 372 images in the database were presented in a random order in a single experimental block. Within one experimental trial, a picture was shown for 1 s, followed by a search display illustrated in the figure.
Results
Table 3 summarizes the performance (percent correct) across 33 subjects. The guess rate in a 4-AFC paradigm is 25%. χ2 tests with a Yates (1934) correction for continuity (p > .05) were performed to determine whether performances were significantly different from chance level for a given term. Performances not significantly better than chance are shown by the by the asterisks in Table 3 and by the lines in Figures 2 and 3 showing the sorted percent correct performances for the actors in front and side view as bar plots. Results show that for the pictures of the female actor, subjects performed significantly better than chance in 78 of 93 images (84%) for the front view condition and 74 of 93 images (80%) of the side view pictures. For the male actor, subjects performed significantly better than chance in 67 of 93 images (72%) in front view and 61 of 93 images (66%) in side view. The nonsignificant terms are summarized in Table 4. Interestingly, 13 of these 52 nonsignificant cases occur in judgements of both the female and male actor. Furthermore, in 8 of these 52 terms, subjects performed no better than chance for 3 or 4 of the images. These terms are indicated by the the asterisks in Table 4.
Percent Correct for the Images Averaged Across 32 Subjects.
Note. The guess rate is 25%. Asterisks indicate performances that are statistically not better than chance (χ2 – Yates correction for continuity; α > .05).

Bar plots showing percent correct for the 93 terms in the database for the female actor in both views. The dashed line represents the guessing rate (25%). Performances that are statistically not better than chance (χ2 – Yates correction for continuity; α > .05) are indicated by the solid lines in each graph.

Bar plots showing percent correct for the 93 terms in the database for the male actor in both views. The dashed line represents the guessing rate (25%). Performances that are statistically not better than chance (χ2 – Yates correction for continuity; α > .05) are indicated by the solid lines in each graph.
Summary of Terms (Sorted Alphabetically) in Which Participants’ Performances Were Not Significantly Better Than Chance.
Note. Asterisks indicate cases that were not significant in three or more conditions.
In addition, we conducted parametric Pearson correlation between each combination of the stimuli tested in Experiment 1. Results show statistically significant correlations between results for the female faces in front and side view (r = .555, p < .001, n = 93), male faces in front and side view (r = .598, p < .001, n = 93), and female and male faces in front view (r = .336, p = .001, n = 93). All other correlations are presented in Table A1.
Experiment 2
Methods
Subjects
Thirty-two subjects participated in Experiment 2 (10 males, 22 females; mean age 22 years, ±4.13 SD).
Procedure
We employed a “point-and-click” task that did not rely on any semantic information being presented to observers during trials (Jennings et al., 2017). The complete set of images (372) was presented in a random order. Each image was displayed for 1 s followed by the 2-D mental state-space (Russell, 1980), presented until the observer submitted a response (Figure 4 shows the 2-D space). Once the 2-D space was displayed, the observers’ task was to click a computer mouse on the point within the space deemed most appropriate to the facial expression displayed in the image. The horizontal direction represented a rating of valence (pleasant vs. unpleasant) and the vertical direction a rating of arousal (low vs. high). Example emotions corresponding to different regions of the space are illustrated by the red text (not visible during testing) in Figure 4. The axes as well as the example mental states (red) were used to instruct the observer during training. To evaluate whether participants tended to locate facial expressions in similar regions of the 2-D space, we calculated an agreement score (ηagreement) for each image among 32 observers in the following way.

Experiment 2: The image was presented for 1 s, followed by the presentation of a valence-arousal space, extending from low to high arousal in one dimension and pleasant to unpleasant in the other dimension. Note: The red terms provide illustrations of the appropriate location of mental state terms used (the red text was not visible during testing).
First, the median arousal (Amedian) and valence (Vmedian) coordinates were calculated across all observer responses for a given condition. Second, the Euclidian distance (r) for each of the observers’ response, and hence the mean (rmean, see Equation (1)), was determined. Finally, these values were normalized (based on the highest mean value, rmax) and shifted according to the lowest value (rmin, see Equation (3)). This transformation produced agreement scores (ηagreement) so that a score of 1 corresponds to the greatest agreement between subjects and as the scores decrease, the agreement between subjects’ decreases, that is, emotion ratings were less tightly clustered around the mean location (see Equation (3)). Figure 5 illustrates the procedure for four hypothetical data points located within a subsection of the arousal-valence space.

A subsection of the valence-arousal space showing four hypothetical responses (black dots); the red dot represents the mean valence and arousal. The agreement score (ηagreement) is determined by the mean Euclidian distance r.
Results
Tables 5 and 6 summarize the agreement scores (ηagreement), for the female and male face stimuli, respectively. To visualize the magnitude of the agreement scores within the valence-arousal space, three examples are illustrated in Figure 6. The circles are rendered with a radius equal to the values produced by Equation (1), and the corresponding agreement values are stated for comparison. The results for each of the 93 terms can be found in Supplemental Material Document 1. A Pearson correlation analysis between all stimulus types (female front view, female side view, male front view, and male side view) revealed statistically significant correlations between the following three conditions: (a) male front view versus female front view (r = .34, p < .001, n = 93), (b) male front view versus male side view (r = .54, p < .001, n = 93) and (c) female front view versus female side view (r = .52, p < .001, n = 93). Interestingly, there was no statistically significant correlation between female side view and male side view (r = .074, p = .48, n = 93).
Agreement Scores (ηagreement) for the Female in Front and Side View.
Note. Terms are sorted from high to low scores in each view.
Agreement Scores (ηagreement) for the Male Actor in Front and Side View Terms Are Sorted From High to Low Scores in Each View.
Terms That Show a Difference Between the Observers’ Decisions for the Female and Male Interpretations of Mental States and the Corresponding Results (% Correct) in Experiment 1.
Note. Asterisks indicate performances that are statistically not better than chance (χ2 – Yates correction for continuity; α > .05).

The figure shows a visualization of the magnitude of the agreement scores (ηagreement) within the valence-arousal space for three examples (amused, depressed, and hopeful). The radius of the green circles is equal to the values produced by Equation (1). The corresponding agreement scores (ηagreement) are represented in each graph.
In addition, we have created a method to visualize the amount of overlap between the different circular regions (i.e., rmean, calculated according according to Equation (1)). The figures presented in Supplemental Material (Document 2) show matrices that show combinations with no overlap with respect to rmean in the valence-arounal space (shown in black). In addition, we have also calculated the overall proportion of combinations that have no overlap. This analysis revealed the following results: female front view: 17.8%; female side view: 15.9%; male front view: 13.3% and male side view: 7%. These matrices are very detailed and should be viewed in a higher magnification. The matrices can be found in Supplemental Material Document 2.
Relationships Between Experiment 1 and Experiment 2
A careful inspection of the raw data provided in the Supplemental Material (Document 1) reveals clear differences between the observers’ decisions for the female and male interpretations of the mental states. Specifically, subjects selected coordinates in the left, unpleasant, more negative quadrants for one stimulus type (e.g., male) and the opposite, pleasant quadrants for the other stimulus type (e.g., female), or vice versa. This seems to suggest that specific facial expression can represent various mental states or that specific mental states can be interpreted (by the actor) in different, opposite ways. For instance, in the case of sarcastic, the subjects consistently selected the two quadrants in the unpleasant region (left: red and black data points) for the female version, but the pleasant quadrants for the male face, which is presumably based on the more positive interpretation of this particular mental state by the male actor (i.e., smiling). Interestingly, these differences are also reflected in the results for Experiment 1 for these terms, which are summarized in the Table 7. The differences between subjects’ performances for male and female interpretations of mental states in Experiment 1 are particularly dramatic for anticipating (side view), comforting, confident, contented, and decisive. With the exception of decisive, where the performance is better for the male version, it is usually the female stimulus that elicits better performances. We believe that this is related to not only the actors’ ability to express mental states, but also their interpretation of the mental state. Sarcastic, however, is an interesting case. Performances between the two stimulus types (female and male) are very similar in Experiment 1, but lead to completely different decisions in Experiment 2.
In a final analysis, parametric Pearson correlation tests were conducted between the percent correct performance for each stimulus in Experiment 1 and the agreement score ηagreement for each stimulus in Experiment 2. This analysis showed statically significant correlations between the results for male faces in front view in Experiment 1 and male faces in front view in Experiment 2 (r = –.302, p = .003, n = 93), for male faces in front view in Experiment 1 and female faces in front view in Experiment 2 (r = –.216, p = .038, n = 93) and for male faces in side view in Experiment 1 and male faces in front view in Experiment 2 (r = –.311, p = .002, n = 93; see Table 8).
Discussion
Most currently available image databases of facial expressions of mental states include only a very small range of possible mental states. With the exception of the “Mind Reading” platform (Baron-Cohen et al., 2004), the vast majority of free databases employ the basic emotions proposed by Paul Ekman (1992; e.g., fear, disgust, surprise, happiness, sadness, and anger; see Table 1). Even the full set of emotions, however, constitute only one category of mental state to which ToM is directed. To investigate ToM comprehensively, a more expansive set of stimuli is desirable. The aim of the current study was to develop and to validate a new database of such stimuli reflecting a greater variety of mental states. The McGill Face Database includes four representations of 93 mental state terms. The pictures are unmodified but can be altered if users wish to do so. To determine the usefulness of the database, two validation experiments were carried out. These experiments revealed considerable agreement among participants regarding the mental state expressed by the faces. Results from Experiment 1 demonstrate that subjects can reliably select the correct term associated with a particular mental state despite the sematic complexity of the terms denoting them. Subjects performed significantly better than chance in 78 of 93 front view images and 74 of 93 side view images of the female actor, and they performed significantly better than chance in 67 of 93 front view and 61 of 93 side view images of the male actor. Results from this experiment also show that subjects performed better with images of the female actor, most likely because she was more expressive than the male actor. It is noteworthy that while subjects’ performance was better for front view images, the advantage over the side view was not dramatic (female: 84% vs. 80%; male: 72% vs. 66%). To our knowledge, this is the first demonstration of the high degree of accuracy human viewers exhibit when identifying complex mental states from only partially visible facial features. The Pearson correlation analyses for Experiment 1 show a highly significant correlation between the two views of the same face as well as between front views of the male and female faces. The slightly more difficult side view task together with differences across the male and female faces presumably accounts for the absence of the full complement of correlations. The aim of the validation in Experiment 2 was to develop a task that is independent of the complex vocabulary used in Experiment 1. This approach has a number of advantages. First, some of the mental state terms may be more likely to be chosen just in virtue of their meanings. These biases would distort subjects’ performance. Second, the facial expressions produced by the actors are interpretations of mental state terms, and some interpretations may be more easily associated with a target term than others. In this respect, the relationship between the facial expressions and the mental state terms explored in Experiment 1 is distinctly different from the relationship between the basic emotions and the facial expressions to which they correspond. Although it is widely agreed that each basic emotion is represented by a single characteristic expression, many facial expressions might be thought to correspond to the mental state terms. Finally, it is of particular importance to be able to carry out ToM experiments without difficult vocabulary if one wants to study individuals with intellectual disabilities or those suffering from conditions associated with impaired linguistic ability. The “point-and-click” paradigm in which subjects had to indicate the location of a given facial expression in a logical space (Russell, 1980), along the dimensions of valence and arousal, makes this possible (Jennings et al., 2017). Results from this experiment show that there is substantial agreement across individuals about how to characterize faces along these dimensions. In addition, there is a high correlation between the face stimuli between perspectives and gender. The imperfect correlation between performance in the two experiments can be attributed to the presence of linguistic items in the first experiment and their absence in the second, as well as the difference in the specificity of the judgements required; the 2-D space used in Experiment 2 is a much coarser framework for classifying facial expressions than is the method of assigning a quite specific term to each face. The McGill Face Database thus provides a wide range of facial expressions of mental states that can be linked to mental state terms as well as accurately characterized in terms of arousal and valence independently of any such terms.
Supplemental Material
PEC901671 Supplementary Material1 - Supplemental material for The McGill Face Database: Validation and Insights Into the Recognition of Facial Expressions of Complex Mental States
Supplemental material, PEC901671 Supplementary Material1 for The McGill Face Database: Validation and Insights Into the Recognition of Facial Expressions of Complex Mental States by Gunnar Schmidtmann, Ben J. Jennings, Dasha A. Sandra, Jordan Pollock and Ian Gold in Perception
Supplemental Material
PEC901671 Supplementary Material2 - Supplemental material for The McGill Face Database: Validation and Insights Into the Recognition of Facial Expressions of Complex Mental States
Supplemental material, PEC901671 Supplementary Material2 for The McGill Face Database: Validation and Insights Into the Recognition of Facial Expressions of Complex Mental States by Gunnar Schmidtmann, Ben J. Jennings, Dasha A. Sandra, Jordan Pollock and Ian Gold in Perception
Footnotes
Acknowledgements
We would like to thank Prof. Erin Hurley from the Department of English at McGill University for her help with the audition and recruitment process of the actors and for her helpful comments in that process. We further thank Scott Brevity Cope (photographer) and Edward Schokking for their help.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by a Social Sciences and Humanities Research Council of Canada grant #435–2017-1215, 2017 given to I. G. and G. S.
Supplemental Material
Supplemental material for this article is available online.
Appendix
Parametric Pearson Correlations.
| Exp_1_female_front | Exp_1_female_side | Exp_1_male_front | Exp_1_male_side | Exp_2_female_front | Exp_2_female_side | Exp_2_male_front | Exp_2_male_side | ||
|---|---|---|---|---|---|---|---|---|---|
| Exp_1_female_front | 1 | .555** | .336** | .201 | –.087 | –.09 | –.123 | –.131 | |
| Sig. (two-tailed) | 0 | .001 | .053 | .408 | .389 | .239 | .211 | ||
| N | 93 | 93 | 93 | 93 | 93 | 93 | 93 | 93 | |
| Exp_1_female_side | .555** | 1 | .157 | .193 | –.168 | –.078 | –.145 | –.089 | |
| Sig. (two-tailed) | 0 | .133 | .064 | .107 | .46 | .167 | .397 | ||
| N | 93 | 93 | 93 | 93 | 93 | 93 | 93 | 93 | |
| Exp_1_male_front | .336** | .157 | 1 | .598** | –.216* | –.145 | –.302** | –.125 | |
| Sig. (two-tailed) | .001 | .133 | 0 | .038 | .167 | .003 | .232 | ||
| N | 93 | 93 | 93 | 93 | 93 | 93 | 93 | 93 | |
| Exp_1_male_side | .201 | .193 | .598** | 1 | –.175 | –.163 | –.311** | –.178 | |
| Sig. (two-tailed) | .053 | .064 | 0 | .093 | .119 | .002 | .089 | ||
| N | 93 | 93 | 93 | 93 | 93 | 93 | 93 | 93 | |
| Exp_2_female_front | –.087 | –.168 | –.216* | –.175 | 1 | .520** | .436** | .321** | |
| Sig. (two-tailed) | .408 | .107 | .038 | .093 | 0 | 0 | .002 | ||
| N | 93 | 93 | 93 | 93 | 93 | 93 | 93 | 93 | |
| Exp_2_female_side | –.09 | –.078 | –.145 | –.163 | .520** | 1 | .391** | .297** | |
| Sig. (two-tailed) | .389 | .46 | .167 | .119 | 0 | 0 | .004 | ||
| N | 93 | 93 | 93 | 93 | 93 | 93 | 93 | 93 | |
| Exp_2_male_front | –.123 | –.145 | –.302** | –.311** | .436** | .391** | 1 | .519** | |
| Sig. (two-tailed) | .239 | .167 | .003 | .002 | 0 | 0 | 0 | ||
| N | 93 | 93 | 93 | 93 | 93 | 93 | 93 | 93 | |
| Exp_2_male_side | –.131 | –.089 | –.125 | –.178 | .321** | .297** | .519** | 1 | |
| Sig. (two-tailed) | .211 | .397 | .232 | .089 | .002 | .004 | 0 | ||
| N | 93 | 93 | 93 | 93 | 93 | 93 | 93 | 93 |
Note. *Correlation is significant at the .05 level (two-tailed); **correlation is significant at the .01 level (two-tailed).
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
