Abstract
An artist-led exploration of portrait accuracy and likeness involved 12 Artists producing 12 portraits referencing a life-size 3D print of the same Sitter. The works were assessed during a public exhibition, and the resulting likeness assessments were compared to portrait accuracy as measured using geometric morphometrics (statistical shape analysis). Our results are that, independently of the assessors' prior familiarity with the Sitter’s face, the likeness judgements tended to be higher for less morphologically accurate portraits. The two highest rated were the portrait that most exaggerated the Sitter’s distinctive features, and a portrait that was a more accurate (but not the most accurate) depiction. In keeping with research showing photograph likeness assessments involve recognition, we found familiar assessors rated the two highest ranked portraits even higher than those with some or no familiarity. In contrast, those lacking prior familiarity with the Sitter’s face showed greater favour for the portrait with the highest morphological accuracy, and therefore most likely engaged in face-matching with the exhibited 3D print. Furthermore, our research indicates that abstraction in portraiture may not enhance likeness, and we found that when our 12 highly diverse portraits were statistically averaged, this resulted in a portrait that is more morphologically accurate than any of the individual artworks comprising the average.
Introduction
In August 2017, a public science project, The Science of Portraiture, was undertaken as part of Australia’s National Science Week and involved the Red Point Artists Association (RPAA), the Australian Broadcasting Corporation’s Radio Illawarra (ABC) and two of the University of Wollongong’s research centres: the Centre for Archaeological Science (CAS) and the ARC Centre of Excellence for Electromaterials Science (ACES). The research question underpinning the project was the extent to which likeness judgements of artistic portraits undertaken and exhibited under relatively naturally occurring (ambient) conditions are related to their accuracy of depiction.
This project builds on previous research involving portrait drawings undertaken by one Artist of 30 work colleagues, with each portrait being metrically and visually assessed for likeness (Hayes & Milne, 2011). This was the first study to see if geometric morphometrics (a landmark-based statistical shape analysis more typically applied to analyse variance in biological forms) could be usefully applied to artistic depictions. The results were that this approach can identify relative shape accuracy in artworks, but the findings were generally inconclusive regarding the influence of drawing accuracy on portrait likeness judgements. Contributing factors were likely the relatively small number of assessors (n = 25) having different levels of familiarity with the depicted, and that the likeness assessments involved laboratory-based PowerPoint displays of the portraits paired with the photographs from which they had been derived.
In contrast to the previous study, this current project is far more contextualised within actual art practices. For this project 12 Artists produced 12 unique portraits of the same Sitter, with each Artist achieving the parameters of the Sitter’s face by referencing a life-size 3D print. A public exhibition of the resulting artworks was attended by over 200 members of the general public. Of these visitors, 162 chose to assess the portraits for likeness, with each recording, along with their likeness assessments, their level of prior familiarity with the Sitter’s facial appearance.
Our concern with portrait accuracy is because, more than any other visual art genre, portraiture is understood to be mimetic, an art form that imitates life (von Alphen, 1997; West, 2004). Portraiture is also an unusual art form in that it is inescapably bound to the visual identity of a unique individual (Brilliant, 1991), to the extent that a portrait of low artistic merit will be judged as having greater authenticity if it is one of the few to have been undertaken from life (Barlow, 1997). As such, an exhibition of portraits is typically viewed ‘as a collection of people, rather than a display of art works’ (West, 2004, p. 49). However, while art manuals strongly emphasise that accurately depicting a Sitter’s unique head pose, feature shapes and their spatial configurations is essential within portraiture (e.g., Aristides, 2006; Edwards, 1999; Faigin, 1990; Maughan, 2004; Speed, 1917), the notion that portrait = person is not a literal equivalence. The use of the term ‘likeness’ carries with it an acknowledgement that portraits are similar – but not identical – to the person depicted (Brilliant, 1991). All portraits, whether they are sketched, painted, sculpted, photographed, scanned and/or 3D printed, are always imperfect copies. This is because, of necessity, the process of translating a person into a portrait involves partiality (not everything is included) and mediation (where what is included is transformed) (West, 2004). When this translation includes strategic absences that allow the viewer to actively see what is not there, and transformations of distinctive features through exaggeration of these features, then a ‘good’ likeness is more likely to result (Arnheim, 1974; Gombrich, 1977, 1982; Ramachandran & Hirstein, 1999). And what constitutes a ‘good’ likeness in traditional portraiture is understood to be its capacity to convey immediacy, of eliciting a response in the viewer ‘as if’ the Sitter were present (Brilliant, 1991).
In addition to the active viewing that arises from filling in missing details and a ‘peak shift’ from the exaggeration of salient features (Ramachandran & Hirstein, 1999), it is likely that the ‘immediacy’ of a good portrait likeness includes its capacity to elicit recognition of the depicted. While the role of recognisability in portraiture is not specifically discussed in the literature, to some extent this can be inferred from the legal decision ∼150 years ago that portrait likeness judgements can only be undertaken subjectively, and only by those intimately familiar with the appearance of the depicted (Barnes v. Ingalls, cited in Brilliant, 1991). Stronger support for the likeness judgements of portraits being related to their recognisability comes from face perception studies, where familiar face recognition is typically described as a rapidly processed perceptual Gestalt (i.e., immediacy), while unfamiliar viewers tend to engage in a piecemeal process of feature-by-feature comparison (for a recent review, see Young & Burton, 2017). Furthermore, and more centrally, a new finding by Ritchie, Kramer, and Burton (2018) is that likeness is, in fact, linked to recognition. This study involved ratings of ‘ambient’ (naturally occurring and diverse) portrait photographs depicting the same person, and found that likeness judgements are significantly related to the viewer’s prior experience with the facial appearance of the depicted, with the greater the level of prior familiarity, the higher the likeness ratings.
The use of ‘ambient images’ in face perception studies is a relatively recent revision of experimental procedures that had tended to equate the perception of portrait photographs with actual faces, and an awareness that there is a high degree of variation across portrait photographs of the same person (Jenkins, White, Van Montfort, & Burton, 2011). This shift towards a greater use of naturally occurring (i.e., not purpose built), diverse images is also linked to the notion that overly high levels of experimental control might lead to results that are more about the controls than the variable, and criticisms that researchers do not always differentiate between familiar face recognition and unfamiliar face matching, nor account for individual differences in the capacity to perform either task (Burton, 2013; Young & Burton, 2017). In effect, it is arguable that this revision has reframed photographs as what they are – artworks that are both partial and mediated translations of a person’s face and assessed for likeness according to how well they are recognised by familiars of the depicted.
The treatment of portrait photographs as artistic transformations is, of course, implicitly present in the face perception literature prior to the incorporation of ambient images, with transformative aspects being isolated and their effects studied extensively. This includes, but is not limited to, transformations arising from: head pose (Campbell, Benson, Wallace, Doesbergh, & Coleman, 1999), camera focal length (Třebický, Fialová, Kleisner, & Havlíček, 2016), image resolution (Bruce, Henderson, Newman, & Burton, 2001), texture and hue (Bruce & Young, 1998), lighting direction (Hill & Bruce, 1996; Johnston, Hill, & Carman, 1992), shearing (Hole, George, Eaves, & Razek, 2002), post-processing of feature size and position (Ge, Luo, Nishimura, & Lee, 2003; Hosie, Ellis, & Haig, 1988; Lewandowski & Pisula-Lewandowska, 2008) and automated exaggeration of distinctive features (caricaturing) (Benson & Perrett, 1991; Brennan, 1985; Rhodes, 1996). However, while these portrait photographs are artistic transformations, it has been argued that the use of predetermined variables results in alterations that are relatively simple and unilinear, and as such fall short of the actual diversity present in ambient portrait photographs (Ritchie & Burton, 2017). Nevertheless, compared with portrait drawings, even ambient photographs are relatively limited in what they transform.
Portrait drawings have also been studied, though one criticism is that these studies tend to be overly reliant on visual assessment (Ostrofsky, Cohen, & Kozbelt, 2014). Metric measures that have been applied to artworks include geometry (Day & Davidenko, 2017; Perdreau & Cavanagh, 2014), spatial ratios/indices (Ostrofsky, 2015; Ostrofsky et al., 2014) and counts of stroke frequencies (Kozbelt, Seidel, ElBassiouny, Mark, & Owen, 2010). However, independently of whether the artworks are metrically or visually assessed, there is a marked tendency for the producers of the images to be either non-artists or a mix of non-artists and visual art students. Although there are a small number of studies that involve artists at work (e.g., Konecni, 1991; Miall & Tchalenko, 2001; Solso, 2001), this underrepresentation of artists in studies of artistic depiction is somewhat problematic, though no doubt due to the pragmatics of participant recruitment. A further potentially problematic area is that, whether or not artists are involved in the study, the images are both produced and assessed in ways that are atypical to art practices outside of the laboratory. In other words, as with most studies of portrait photographs, studies involving portrait drawings tend to lack ambience.
To the best of our knowledge, this is the first artist-led research project to explore artistic production, exhibition and likeness judgements under relatively ambient conditions, and one of the few to apply geometric morphometrics to analyse how, and to what extent, the shapes of the face are transformed when 12 Artists produce 12 portraits depicting the same individual (see Figure 1). Our area of particular interest is the extent to which likeness assessments are related to the relative accuracy of depiction and the assessor’s prior familiarity with the Sitter’s facial appearance.
The portraits. The alphanumeric code given to each portrait is indicated below.
Methods
Sitter and Artist Recruitment and Collaboration
Following the Sitter’s agreement to participate, a call-out for RPAA members was undertaken. This resulted in 12 Artist collaborators, most of whom are highly experienced visual artists working across a range of media, though not all had extensive experience in portraiture. All were aware that participation in the project included the active and informed research collaboration of both Sitter and Artists regarding the study design, including identifying the research question, the conditions under which the portraits were produced and documented, the ways the resulting works were exhibited, how the works were assessed and analysed, and the submission of the study as a coauthored research article. Group input and agreement regarding these elements were achieved through a series of face-to-face meetings and e-mail.
3D Print
To reduce variables regarding the Sitter’s head-pose for the portrait and to ensure the Artist-Sitter engagement was similar for all, each Artist set the parameters for their work in reference to a life-size 3D print of the Sitter’s head and face. The Sitter was first digitally photographed at the ABC studios with his head in the standard anatomical position at a distance of 2.5 m using a Canon EOS60D mounted on a tripod, with the Sitter’s sellion (a point on the midline of the face at the deepest part of the nasal bridge below the brow) as the focal point. The Sitter’s head and face were subsequently 3D scanned and printed by ACES, which undertook this work specifically for this project (see Supplementary Information).
Visual Support
None of the 12 Artists had prior familiarity with the Sitter’s facial appearance, and therefore supplementary visual information was provided in preparation for the production of the portraits. This included links to footage of the ABC digital photography session (https://www.facebook.com/abcillawarra/videos/1364291896949993) and the ACES scanning session (https://www.facebook.com/abcillawarra/videos/1402986829747166), together with the digital photographs taken during both sessions (e.g., Figure 3). It was also agreed that each Artist could source additional images of the Sitter from the Internet if they wished.
Production of the Portraits
The 3D print produced by ACES was mounted and placed on an exhibition plinth (Figure 2(a)). The plinth was located in an RPAA drawing studio against a white wall, and a standing artists’ easel was positioned 2.5 m from the sellion of the unrotated 3D print (Figure 2(b)). The 3D print was then rotated 47° from the facial midline, resulting in the right cheek dominating (Figure 2(c)), and digitally photographed using a Nikon D3200 (f/5.6, exposure 1/125 seconds, ISO-224, focal length 55 mm), again with the sellion as the focal point. Each Artist chose which leg of the standing easel was foremost when positioned at an oblique angle to the 3D print.
The ACES 3D print and the artist’s easel orientation within the RPAA drawing studio. (a) Vertical orientation of the 3D print and plinth, (b) Horizontal orientation of plinth and easel, and (c) 3D print orientation (rotated 47° from facial midline).
The Science of Portraiture exhibition promotional poster.

The Artists had 50 minutes to draft their portrait in reference to the 3D print, and each accomplished this using the artistic materials of their own choosing. On arrival at the RPAA drawing studio, each Artist was provided with two sheets of 300 gsm A2 mixed media paper on which to produce their portrait, which was to be unsigned. Prior to the commencement of their work referencing the 3D print, each Artist had their age, sex, height and handedness recorded (to identify if this had any impact on the resulting portrait) against a unique alphanumeric code, and once their time was complete, each Artist photographed the 3D print using the same Nikon D3200 camera and settings as detailed previously, in order to document the visual orientation they used to draw the portrait. The Artists, if they chose to do so, used their own cameras to take additional photographs of the 3D print to assist them in the completion of their portrait.
The Artists had 2 weeks to complete their portraits on one of the sheets of mixed media paper provided, and each chose the medium and scale of their work. The completed 12 portraits were delivered on a set day and time to the RPAA, and each portrait was given a different unique alphanumeric code by DA, a member of RPAA not directly involved in the research for this project. This was to facilitate the portraits remaining anonymous for all parties for the duration of the exhibition, the likeness judgements and all subsequent analyses. The 12 completed and anonymised portraits were then digitally photographed at a height of 1.5 m using the Nikon D3200 camera (f/5.6, exposure 1/60 seconds, ISO-110, focal length 32 mm) with the sellion as the focal point.
Geometric Morphometrics
A set of 19 homologous landmarks and 37 semilandmarks (derived from curves) were identified for the photographs of the 3D print and the portraits (see Supplementary Information Figure 2 and Table 1). The 2D landmark coordinates (x,y) were digitized using the geometric morphometrics software tpsDig2.32 (Rohlf, 2008, 2015). A different geometric morphometric program, tpsSuper64 (Rohlf, 2015), was used to create two statistical averages and the resulting x,y coordinates were exported. The first average involved the baseline photograph of the 3D print – see Figure 2(c) – and each of the 12 photographs of the 3D print taken by each of the 12 Artists. This average photograph of the 3D print, hereinafter referred to as the
The 12 individual Artist photographs of the 3D print, the 12 portraits,
The Procrustes chord Distance arising from the Generalised Procrustes Analysis performed by morphologika2.5 was used to assess how morphologically accurate, overall, each of the portraits and the
morphologika2.5 only reports statistical significance following a multivariate regression. Therefore, the Procrustes registered x,y coordinate landmarks were also entered into the statistics software PAST3 (v.3.16; Hammer, Harper, & Ryan, 2001) and a Principal Components Analysis (PCA) was performed (Bootstrap 1000; Scree plot and broken stick). The PAST3 output includes which PCs, if any, are statistically significant independently of a multivariate regression, and undertaking this supplementary analysis is enabled because the resulting eigenvalues and PC scores from the PAST3 PCA agree, to three decimal places, with the morphologika2.5 output.
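As an illustration of the broken stick criterion mentioned above, the following minimal sketch compares each component's observed share of variance with the share expected under a broken-stick model. It is not PAST3's implementation and does not reproduce the bootstrap. The function names and the eigenvalues are hypothetical and are not taken from this study.

```python
def broken_stick_proportions(n_components):
    """Expected share of variance for each component under the broken-stick model."""
    return [
        sum(1.0 / i for i in range(k, n_components + 1)) / n_components
        for k in range(1, n_components + 1)
    ]


def components_above_broken_stick(eigenvalues):
    """Count the leading components whose observed share exceeds the expected share."""
    total = sum(eigenvalues)
    observed = [value / total for value in eigenvalues]
    expected = broken_stick_proportions(len(eigenvalues))
    retained = 0
    for obs, exp in zip(observed, expected):
        if obs <= exp:
            break  # stop at the first component that falls below the model
        retained += 1
    return retained


# Hypothetical eigenvalues, sorted from largest to smallest (not the study's data).
example_eigenvalues = [0.0060, 0.0015, 0.0010, 0.0008, 0.0007]
print(components_above_broken_stick(example_eigenvalues))
```

In this made-up example only the first component exceeds its broken-stick expectation, which is the pattern described when a single PC carries the interpretable variance.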
More detailed explanations regarding how geometric morphometrics can be applied to analyse shape variance in artistic, forensic and archaeological images of the face can be found in the following: Hayes and Milne (2011) for frontal portraits, Hayes and Tullberg (2012) for witness descriptions of suspects and Starbuck (2014) for morphological variance in archaeological figurines.
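For readers unfamiliar with Procrustes-based measures, the sketch below shows one common way of computing a Procrustes chord distance between two 2D landmark configurations. Each configuration is centred, scaled to unit centroid size and optimally rotated, and the remaining distance is then measured. It assumes paired (x, y) landmarks in the same order, uses complex arithmetic for the rotation step, and is not the morphologika2.5 implementation, whose details may differ. The coordinates are hypothetical.

```python
import math


def to_preshape(landmarks):
    """Centre 2D landmarks on their centroid and scale to unit centroid size."""
    points = [complex(x, y) for x, y in landmarks]
    centroid = sum(points) / len(points)
    centred = [p - centroid for p in points]
    size = math.sqrt(sum(abs(p) ** 2 for p in centred))
    return [p / size for p in centred]


def procrustes_chord_distance(reference, target):
    """Distance between two unit-size shapes after optimal rotation (no reflection)."""
    ref = to_preshape(reference)
    tgt = to_preshape(target)
    # The modulus of the complex inner product is the overlap at the best rotation.
    inner = sum(r * t.conjugate() for r, t in zip(ref, tgt))
    overlap = min(1.0, abs(inner))  # guard against rounding slightly above 1
    return math.sqrt(2.0 - 2.0 * overlap)


# Hypothetical landmark sets (not the study's data): the second has one point displaced.
reference_shape = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 2.0), (0.0, 1.0)]
portrait_shape = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 2.4), (0.0, 1.0)]
print(round(procrustes_chord_distance(reference_shape, portrait_shape), 3))
```

A lower value indicates a configuration closer in shape to the reference, independent of position, scale and rotation.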
Anthropometrics and Facial Distinctiveness
Anthropometric measures and means of adult European male facial dimensions were sourced (George, 2007; Sforza et al., 2010; Sforza, Grandi, Binelli, et al., 2009; Sforza, Grandi, Catti, et al., 2009; Sforza, Grandi, De Menezes, Tartaglia, & Ferrario, 2011) and compared with the Sitter’s facial dimensions. This was to identify the extent to which the Sitter’s facial features are distinctive and to ascertain to what extent any distinctive features were retained and/or exaggerated in the portraits. The Sitter’s facial data included measures taken during the photography session, which were also checked against the 3D print.
Exhibition
During a meeting to discuss the exhibition of the portraits, one of the Artists suggested the works be scaled and converted to greyscale in order to reduce these variables during the likeness assessments. Support for this suggestion was strong, but not unanimous. A compromise was agreed to where the works would be exhibited twice in two separate exhibition spaces – first as Greyscale prints and second as the Original artworks. To achieve the Greyscale prints, the digital photographs of each portrait were loaded into the image manipulation software, Adobe Photoshop CC 2017, and each was scaled to the same facial height referencing the distance between the right exocanthion (outer eye corner) and stomion (midline point where the lips meet). Each portrait image layer had the levels adjusted using the automated levels function within the program, and all were cropped to the same size and converted to greyscale. A3 prints of the greyscale portrait photographs were produced by a local commercial printer, but printed in colour so as to retain the tonal variations of the artworks.
The Greyscale prints were mounted at a height of 1.5 m in the RPAA Workshop Gallery. Each was clearly identified by the Artist’s unique alphanumeric code and exhibited in a randomised order. The 3D print was also displayed within the Workshop Gallery with the same height and orientation as was used for the initial production of the portraits (see Figure 2). The Original artworks were exhibited separately in the RPAA drawing studio, and while the Original artworks were also mounted on the studio walls at a height of 1.5 m, they were exhibited in a different randomised order to that of the Greyscale prints.
Separate A4 Likeness Judgement recording sheets were produced for the Greyscale prints and Original artwork exhibitions. Judges were each allocated a unique numerical code and asked to record their sex (male or female) and age-group (<20, 20–35, 36–50, 51–65, >65). Following this were 5-point Likert scales with verbal low–high cues for familiarity with the Sitter’s face (not at all to very well), the extent to which they would describe themselves as a visual artist (not at all to professional) and their assessment of the likeness of each portrait (very low to very high). The Greyscale and Original likeness judgements were presented in the same order as the Greyscale prints and Original artworks were respectively exhibited, and each likeness assessment had the alphanumeric code and a thumbnail of the corresponding portrait printed adjacent to the scale.
Likeness Judgements
The Science of Portraiture exhibition was promoted to the general public on ABC Radio, through the National Science Week, ABC and RPAA web pages and by the distribution of posters and fliers to a range of local venues (Figure 3). The exhibition opened on the first day of National Science Week and ran for two weeks. Prior to the opening of the exhibition, the Sitter attended a private viewing with SH (who was not one of the portrait artists), assessed the Original artworks for likeness and selected the artwork he most preferred, independently of likeness (which was subsequently given to the Sitter as a gift).
Visitors to the exhibition who wished to participate in the likeness assessments were advised that what was meant by likeness was up to their interpretation (no further explanation was provided) and were encouraged to judge the Greyscale prints before judging the Original artworks (in which case, the same unique identifier was used for each Judge). However, as a public science project, it was not compulsory for visitors to judge either or both versions of the 12 portraits. Out of over 200 visitors, 162 actively participated in the project by anonymously assessing the works (Judges): a small number (n = 7) only assessed the Greyscale prints, 45 only assessed the Original artworks, and 110 assessed both the Greyscale prints and the Original artworks. Once completed, the Judges posted their assessment forms into a purpose-built, sealed carton, and collation of the assessment data was not undertaken until after the exhibition had closed.
Analyses of Judges and Likeness Judgements
In the instances where a Judge’s assessment fell between the numerical boundaries of the Likert scale, these were recorded as half-way scores (e.g., 1.5, 2.5). The statistical picture of the cohort of Judges and the mean likeness judgements of the portraits were compiled using the following functions from the statistics software PAST3: descriptive statistics, bivariate correlation (Spearman’s rho), cluster analysis (Ward’s algorithm, Euclidean similarity index) and Mann–Whitney pairwise post hoc tests with Bonferroni correction.
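Because half-way scores and a 5-point scale produce many tied values, a rank correlation on such data needs to assign tied observations their average rank. The following is a minimal sketch of Spearman's rho computed that way, as a Pearson correlation on average ranks. It is an illustration only, not PAST3's routine, and it does not compute a p value. The function names and scores are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0  # mean of the 1-based positions start..end
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical Likert scores including half-way values (not the study's data).
familiarity = [1, 1.5, 2, 2.5, 3, 4, 5]
likeness = [2, 2, 2.5, 3, 3, 4.5, 4]
print(round(spearman_rho(familiarity, likeness), 2))
```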
Results
Geometric Morphometrics
Analysis of the photographs each Artist took of the 3D print and the artworks they produced indicates that the Artist’s viewpoint, age, sex and height did not significantly impact on the morphology of the 12 portraits. Although handedness did result in photographs of the 3D print that were more in profile for the left-handed Artists (n = 2), this was not evident in the resulting portraits.
The geometric morphometric results of the PCA including the 12 portraits and the
Geometric morphometric analysis of the 12 portraits and the
While PC1 displays the statistically significant morphological variance, the Procrustes chord Distance (PD) scores account for all of the variance being captured by the coordinate landmark data entered into the geometric morphometric analysis. The PD scores attained by each of the portraits, from lowest (most similar) to highest (least similar), are as follows (and see Figure 5, x axis):
The AP121 (0.049), AP123 (0.062), AP116 (0.065), AP118 (0.067), AP115 (0.070), AP112 (0.071), AP122 (0.075), AP120 (0.081), AP119 (0.082), AP117 (0.084), AP113 (0.103), AP111 (0.122)
As can be seen, Portrait AP121 (and not Portrait AP115) is closest to the overall morphological parameters of the
When the 12 portraits are statistically averaged (
Ranking the 12 portraits according to their Procrustes chord Distance from the
Sitter Facial Distinctiveness
Comparison of the Sitter’s facial feature dimensions with available published averages indicates the Sitter’s facial features fall outside the normal range (are relatively distinctive) as follows:
Wide nose relative to eye spacing and mouth width, but within the normal range in actual nose length and width (George, 2007); nasal width and length >1 Standard Deviation (Sforza et al., 2011), though it should be noted that ‘normal’ for this anthropometric study is limited to 13 Italian males of similar age to the Sitter. Wide jaw relative to facial width (George, 2007).
Cohort of Judges
The 162 individuals (Judges) who assessed the portraits for likeness are skewed in sex, age and prior familiarity with the Sitter’s facial appearance: 68% are women (n = 110), 62% are over 51 years in age (n = 100) and 61% considered themselves unfamiliar with the Sitter’s face prior to attending the exhibition (n = 96). Artistic experience is more evenly represented with half of the Judges being experienced artists (>3 on the Likert scale). Spearman’s rho indicates that there is a slight tendency for the Judges with artistic experience to be female (r = .18, p = .04) and for the levels of artistic expertise to increase with age (r = .22, p = .01). There was no noticeably significant correlation (i.e., p < .05) between the Judges’ sex, age and artistic experience, and how they assessed the portrait likenesses.
Paired Portrait Judgements: Greyscale Prints and Original Artworks
Of the 162 Judges, 110 assessed both the Greyscale (cropped and scaled) prints and Original artworks. The profile of this subcohort of Judges is very similar to the total data set (65% female, 61% over 51 years, 61% unfamiliar with the Sitter’s face, 50% with artistic experience), and as with the larger cohort, there are no significant correlations (i.e., p < .05) between the sex, age and artistic experience of the Judges, nor, following Bonferroni adjustment, any significant impact of these individual differences on how the Judges assessed either the Greyscale prints or the Original artworks.
Across both exhibition conditions, each of the portraits received at least one assessment indicating it was a poor likeness (≤1.5) and at least one assessment indicating a good likeness (≥3.5). However, a cluster analysis of the raw likeness data in PAST3 (Ward’s algorithm, Euclidean similarity index) results in each of the portrait pairs forming a distinct cluster pair and indicates a high degree of overall similarity between how the Greyscale prints and Original portraits were judged. There are differences in that 8 of the 12 portraits have a lower mean likeness as a Greyscale print, but this difference is only statistically significant for Portrait AP118 (p = .03, Mann–Whitney pairwise post hoc test with Bonferroni correction). However, while significant, the impact is relatively negligible: Portrait AP118, which is already a monochromatic Original, has the second highest mean likeness in the Greyscale condition and the highest mean likeness as an Original artwork. This movement does not change the order because Portrait AP117, which is a coloured artwork, has the highest mean likeness rating in the Greyscale condition and is second highest within the Original artworks. A more noticeable result arising is one of similarity in the overall range of assessments – Portrait AP121 has the lowest level of agreement in the likeness judgements (highest standard error and highest variance) whether it is exhibited as a Greyscale print or in its Original monochromatic form.
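The pairwise comparisons reported above can be illustrated with a minimal sketch of a two-sided Mann–Whitney test followed by a Bonferroni adjustment. This version uses the normal approximation with a tie correction and no continuity correction. PAST3's exact procedure may differ, and the function names and scores below are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for position in range(start, end + 1):
            ranks[order[position]] = (start + end) / 2.0 + 1.0
        start = end + 1
    return ranks


def mann_whitney_p(sample_a, sample_b):
    """Two-sided p value from the tie-corrected normal approximation to U."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled = list(sample_a) + list(sample_b)
    ranks = average_ranks(pooled)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2.0
    n = n_a + n_b
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    variance = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return 1.0  # all scores identical, so no evidence of a difference
    z = (u_a - n_a * n_b / 2.0) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2.0))


def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capped at 1."""
    return [min(1.0, p * len(p_values)) for p in p_values]


# Hypothetical likeness scores for one portrait under two conditions (not the study's data).
greyscale = [2, 2.5, 3, 2, 3.5, 2.5, 3, 2]
original = [3, 3.5, 4, 3, 4.5, 3.5, 4, 3.5]
raw_p = mann_whitney_p(greyscale, original)
print(raw_p, bonferroni([raw_p, 0.20, 0.60]))
```

The Bonferroni step is deliberately conservative: with 12 portraits compared, a raw p value must be below roughly .004 to remain under .05 after adjustment.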
Judge Familiarity and Likeness Judgements
Given there is little real difference between how the Greyscale and Original artworks were assessed for likeness and to avoid repetition yet maintain similar assessment conditions, the analyses of the likeness assessments involve only the likeness judgements of the Original artworks by those who reported their level of prior familiarity and assessed both the Greyscale and Original artworks (n = 108).
Figure 6 shows the mean likeness assessments according to three groups based on the different levels of prior familiarity the Judges had with the Sitter’s facial appearance (None = Likert scale 1–1.5; Some = Likert scale 2–2.5; Familiar = Likert scale 3–5). Although Spearman’s rho (with Bonferroni correction) indicates that there is general agreement between those with some or no familiarity (r = .089, p = .001) and those with some or higher levels of prior familiarity with the Sitter’s face (r = .77, p = .04), there is little agreement between the extremes of familiarity (None and Familiar, not significant). When the mean likeness assessments according to familiarity level are compared with the individual portraits (Mann–Whitney pairwise post hoc test with Bonferroni correction), for 4 of the artworks, there is a significant difference in the assessments between those with no familiarity and those most familiar with the Sitter’s facial appearance. The most familiar assessors tend to rate Portrait AP112 (p = .04), Portrait AP116 (p = .03) and Portrait AP121 (p = .001) a lower likeness. In contrast, while all of the Judges have tended to rate Portrait AP117 a good likeness, familiar assessors have, on average, given this artwork an even higher likeness rating (p = .02).
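The three familiarity groups defined above can be expressed as a simple binning rule. The sketch below assigns each Judge to a group from their familiarity score and computes the mean likeness per group for one portrait. The thresholds follow the group definitions given in the text. The function names and records are hypothetical.

```python
def familiarity_group(score):
    """Map a 1-5 familiarity score (half-way values allowed) to a group label."""
    if score <= 1.5:
        return "None"
    if score <= 2.5:
        return "Some"
    return "Familiar"


def mean_likeness_by_group(records):
    """Average the likeness scores within each familiarity group.

    records: iterable of (familiarity_score, likeness_score) pairs for one portrait.
    """
    totals = {}
    for familiarity, likeness in records:
        group = familiarity_group(familiarity)
        total, count = totals.get(group, (0.0, 0))
        totals[group] = (total + likeness, count + 1)
    return {group: total / count for group, (total, count) in totals.items()}


# Hypothetical (familiarity, likeness) pairs for a single portrait (not the study's data).
example_records = [(1, 2.0), (1.5, 3.0), (2, 4.0), (3, 5.0), (5, 4.0)]
print(mean_likeness_by_group(example_records))
```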
Comparison of likeness judgements by level of familiarity. The dashed line is the mean likeness judgements (n = 110), while the bars show the level of prior familiarity the Judges had with the Sitter’s facial appearance (n = 108): no familiarity (white), some familiarity (midgrey), moderate to high familiarity (dark grey).
Mean, Variance and Standard Error (SE) of the Portrait Likeness by 110 Judges across both exhibition conditions (scaled Greyscale Prints and Original).
Mean, Standard Error (SE), Variance (Var.) and Standard Deviation (SD) of the Portrait Likeness Assessments by Level of Prior Familiarity (N = 108).
Note. Highest values for each statistic are shown in bold.
Sitter Likeness Assessments and Selection
The Sitter assessed the likeness of the Original portraits prior to the exhibition opening, and these results were not included in the calculations of the mean likeness assessments. The Sitter’s assessments closely agree with the familiar assessor likeness means (Spearman’s rho, r = .83, p = .006 with Bonferroni adjustment), but not the overall mean likeness or the assessments by those with some or no familiarity with his facial appearance. However, while the Sitter agreed with his familiars that Portrait AP117 is the best likeness, closely followed by Portrait AP118, and did not consider Portrait AP121 to be a good likeness, the Sitter also rated Portrait AP122 as a good likeness (equal rank to Portrait AP118), and it was Portrait AP122 that the Sitter selected, without hesitation, as the artwork he most preferred.
Likeness, Familiarity and Morphological Accuracy
Comparison (Spearman’s rho) of the portraits’ Procrustes chord Distance (PD) scores from the
Discussion
This research project involved the mean likeness judgements of 12 portraits depicting the same Sitter being compared with the extent to which each portrait agrees with what each of the 12 Artists used as the basis for their artwork – the morphology of a 3D print. Referencing a monochromatic 3D print (rather than the Sitter in life) was a new and challenging experience for all of the Artists, but it ensured the portrait production conditions were similar for all and constrained the variance of each portrait regarding the depiction of the Sitter’s head pose. Previous studies have found that even subtle variations in head pose are typically the main contributor to morphological variance when geometric morphometrics is applied (e.g., Hayes, 2016; Hayes & Milne, 2011; Hayes & Tullberg, 2012), are a confounding factor for automated facial identification (for a review, see Ding & Tao, 2015) and are similarly problematic for face perception studies involving metric assessments (Young & Burton, 2017).
As an ambient public science project, some elements were unanticipated, but it is arguable that these have contributed to the relative richness of the results. First, even though the research was undertaken with all participants contributing to, and being aware of, the research aims and methods (e.g., the Artists were not the objects of the study), the resulting portraits are highly diverse (Figure 1). A contributory factor could be that, as with most professional portrait commissions, none of the Artists had prior familiarity with the Sitter’s facial appearance. However, what was probably more influential was that while each Artist had access to additional visual support materials, the extent to which this material was incorporated into the portrait varied. For example, the Artists who produced Portrait AP117 and Portrait AP122 accessed a wide range of images of the Sitter, while the Artist who produced Portrait AP121 completed their work solely in reference to the 3D print. Second, although we anticipated that the exhibition visitors would have differing levels of familiarity with the Sitter’s facial appearance, we did not anticipate quite so many attendees with no prior familiarity. As with the 12 Artists, nearly all who assessed the portraits knew of the Sitter, but for a large proportion of the visitors, this was only in the context of being regular listeners to the Sitter’s daily ABC Radio Mornings program. For these visitors, there was a curiosity to see what the Sitter looked like, and therefore their reference to the Sitter’s actual facial appearance was limited to the information provided by the 3D print (Figure 2) and promotional poster (Figure 3), both of which were exhibited alongside the portrait Greyscale prints. 
However, despite the likeness judgements being dominated by unfamiliar assessors, age and sex were not found to be influential factors, and as with a previous study (Kozbelt et al., 2010), we found a high level of overall agreement across assessors with different levels of artistic expertise.
The Artists’ decision to separately exhibit Greyscale versions of their portraits, and have these assessed prior to the Original artworks, was to reduce the influence of the portraits’ variations in scale and hue and provide a more level playing field for the visitors’ likeness assessments. The result arising from the 110 paired likeness assessments undertaken by visitors to the exhibition was that while the Greyscale portraits tended to be ranked lower in likeness, the only real difference was that the two highest ranked portraits for likeness (Portrait AP117 and Portrait AP118) exchanged first and second position when viewed as an Original artwork. Although the impact of hue and scale does not seem to have been previously tested with portraits, the close agreement between the mean likeness assessment pairs is similar to the findings with portrait photographs: (a) that the impact of hue is negligible providing the tonal variation is maintained (Bruce & Young, 1998; Kramer, Manesi, Towler, Reynolds, & Burton, 2018) and (b) that the impact of scale is likely negligible providing the horizontal or vertical spatial distribution patterns of the features are retained (Hole et al., 2002).
Because the likeness assessments were similar across the Greyscale and Original portrait conditions, only the Original assessments were used to compare familiarity and likeness with the morphological accuracy of the portraits. Restricting the assessments of the Original portraits to those visitors who had also assessed the Greyscale prints and included their level of familiarity with the Sitter’s facial appearance (n = 108) meant that, prior to undertaking their assessments, all Judges had been exposed to the 3D print of the Sitter’s face as well as the promotional material. In effect, all assessors had some exposure to the Sitter’s facial appearance prior to assessing the Original artworks. However, the extent to which this exposure constituted a priming effect and provided the otherwise unfamiliar viewers with a mental representation of the Sitter’s face is not known. It has been found that exposure to a range of ambient photographs of the same person can enhance familiarity and therefore recognition (e.g., Andrews, Jenkins, Cursiter, & Burton, 2015; Ritchie & Burton, 2017), but for this study, while the images are highly diverse, their ambience is constrained in that all depict the same view of the Sitter’s face.
Four portraits proved highly relevant to one or more of the core variables examined in this study: morphological accuracy as determined from geometric morphometrics (Principal Components analysis and Procrustes chord distances); the relationship of morphological accuracy to likeness assessments, and in particular the role of feature exaggeration; and differences in likeness assessments between familiar and unfamiliar Judges. The relationship of each of these four portraits to these variables, and how these relate to the relevant literature, is as follows:
AP121: Morphological Accuracy, Unfamiliarity and Face-Matching
Portrait AP121 is the most morphologically accurate depiction of the 3D print, and the 3D print is a life-size, accurate depiction of the Sitter’s facial morphology. Furthermore, AP121 is a monochromatic artwork that only depicts the information provided by the 3D print of the Sitter’s face (i.e., minimal hair and neck), and therefore is the portrait that most emulates the limitations of the 3D print and is further limited in that the Sitter’s eyes appear closed. Portrait AP121 is considered a better likeness by those with no prior familiarity with the Sitter’s facial appearance, and this suggests that many of these unfamiliar Judges assessed the portraits through a process of face-matching – that is, by directly comparing each portrait element to the appearance of this element in the 3D print.
Although this has not been previously tested in portraiture, face-matching has been extensively studied with photographs (e.g., Bruce et al., 1999, 2001; Burton, Wilson, Cowan, & Bruce, 1999; Clutterbuck & Johnston, 2004; Kemp, Caon, Howard, & Brooks, 2016; Megreya & Burton, 2006; Shepherd, Davies, & Ellis, 1981; Young, Hay, McWeeny, Flude, & Ellis, 1985). There is general agreement across these studies that, when faces are unfamiliar, correct identification predominantly involves a direct comparison of each of the facial features, and these comparisons are highly sensitive to, and disrupted by, even slight differences in the depicted individual’s facial appearance (e.g., head pose, hairstyle, expression, lighting, focal length), though this disruption may be reduced when external features are masked (Kemp et al., 2016). Compared with the near perfect recognition rates of familiar faces depicted across highly diverse and highly degraded photographic images, unfamiliar face matching (also referred to as picture recognition; Young & Burton, 2017) has a poor success or high error rate. However, as has been noted, people display considerable variance in their capacity to either face match or recognise faces from photographs (Burton, White, & McNeill, 2010; Kemp et al., 2016).
Individual variation in capacities for unfamiliar face matching and familiar face recognition may account for the high level of variation in the assessments of Portrait AP121 across all levels of familiarity. However, while varied in their responses, visitors who had a moderate to high familiarity with the Sitter’s face prior to attending the exhibition did not tend to judge Portrait AP121 a good likeness, and the difference between familiar and unfamiliar assessors in the likeness ratings is significant. This suggests that had it also been assessed (and it was an unfortunate oversight on our part not to include this assessment), familiar Judges would likely have considered the 3D print to be a relatively poor likeness, particularly given people have difficulty either recognising familiar faces, or determining the sex of an unfamiliar face, when viewing monochromatic surface scans (Bruce et al., 1993, 1991).
AP117: Enhanced Likeness Through Caricature
Portrait AP117 is one of the least morphologically accurate portraits, with this inaccuracy including exaggeration of the Sitter’s nasal and lower face widths (see Figure 4). Portrait AP117 was also one of the two most highly rated portraits for likeness, with visitors having moderate to high prior familiarity with the Sitter’s facial appearance tending to accord both of the most highly rated portraits even higher likeness scores. This finding is in agreement with a recent study of ambient portrait photographs (Ritchie et al., 2018), and for Portrait AP117, the increase was statistically significant.
Anthropometric studies of adult European males indicate that the Sitter’s nasal and jaw widths are relatively distinctive features (George, 2007; Sforza et al., 2011), though frontal widths do not necessarily correspond to facial depths. Exaggeration of salient features is generally known to be a key element of most figurative artworks (Gibson, 1971; Gombrich, 1977; Ramachandran & Hirstein, 1999), including cave-art (Cheyne, Meschino, & Smilek, 2009). Exaggeration of distinctive features is specifically understood to enhance likeness in drawings (Arnheim, 1974) and portraiture (Gombrich, 1982), and a computer-derived ‘caricature effect’ has also been found to be a key element in enhancing recognition of photographs of familiar faces, and of line drawings derived from photographs (e.g., Benson & Perrett, 1991; Brennan, 1985; Rhodes, 1996). It is very likely, therefore, that Portrait AP117 achieved a significant increase in likeness ratings from those already familiar with the Sitter’s facial appearance because it incorporated caricature of the Sitter’s relatively distinctive frontal nasal and jaw widths into a three-quarter view. Incorporation of different views of the facial features into one view (e.g., increasing the profile of the nasal bridge) is one of the known transformations of portraiture (Sturgis, 1998), and it is applied because it enhances the recognisability of the resulting face.
AP118: Morphological Accuracy and Likeness
Portrait AP118 is the fourth most accurate portrait, attained the highest overall mean likeness score, is the highest mean ranked portrait for those with no prior familiarity with the Sitter’s face and second highest for those with some to a high level of familiarity, including the Sitter himself. In many ways, the high ranking of Portrait AP118 contradicts the conclusions drawn from the two preceding artworks. Portrait AP118 has high morphological accuracy, with the geometric morphometric PC1 result indicating it has not exaggerated the Sitter’s distinctive jaw and nasal widths (or at least not exaggerated the characteristics that were measured). Portrait AP118 is also a monochromatic artwork, and while there was no significant difference in the likeness assessments of this work as either a Greyscale or Original portrait, Portrait AP118 is somewhat visually similar to the most morphologically accurate portrait, Portrait AP121. However, unlike Portrait AP121 (and by implication the 3D print), Portrait AP118 was rated a good likeness by familiar ‘recognisers’ as well as unfamiliar ‘face-matchers’, and, as with the less accurate/caricatured Portrait AP117, experienced a ‘peak’ in likeness ratings by those most familiar with the Sitter’s face prior to attending the exhibition.
What this outcome indicates is that accuracy of depiction does have a role in familiar as well as unfamiliar likeness judgements of portraits, and that it is possible that caricature does not have to be strongly present to elicit higher likeness ratings by familiar assessors. This finding is supported to a limited extent by a study of accuracy in face drawings by non-artists (Ostrofsky et al., 2014), which found a weak–moderate relationship between metric accuracy and visual assessments. But what is most missing from our study of metric accuracy for all of the portraits, not just Portrait AP118, is the depiction of facial texture.
Facial texture has been found to play a dominant role in the recognisability of portrait photographs (e.g., Hancock, Burton, & Bruce, 1996), and it is highly probable that a relatively accurate and caricatured depiction of facial texture has enhanced the likeness assessments of Portrait AP118 and Portrait AP117. Software has been developed to statistically analyse facial texture in portrait photographs (Kramer, Jenkins, & Burton, 2017), and this software would no doubt also work with other portrait image types. Unfortunately, the application (Interface) requires fairly frontally orientated representations of the depicted face (e.g., both pupils visible), and as relatively strong three-quarter views that are closer to being profiles, our portraits do not meet this precondition.
AP122: Average Diversity
Portrait AP122 attained only moderate mean likeness ratings across all levels of visitor prior familiarity. Portrait AP122 was, however, assessed as equally second best for likeness by the Sitter (alongside Portrait AP118). This portrait is relatively abstract (in that essential information is depicted with minimal detail), and was produced, as with Portrait AP117, in reference to a large number of additional images of the Sitter. Portrait AP122 is also the portrait the Sitter most preferred.
A study involving professional and amateur portrait artists has found that artists are more capable of abstraction when drawing from life – as opposed to drawing from memory – and that abstraction correlates positively with visual assessments of how interesting and pleasing a portrait drawing is, with pleasing and interesting also being positively correlated (Konecni, 1991). In this study, however, these characteristics were assessed by those unfamiliar with the depicted, and the assessments did not, therefore, include likeness or recognition. If it is the case that abstraction (interesting or pleasing) is positively correlated with recognisability, it would be anticipated that Portrait AP122 would attract higher likeness ratings by familiars, but this did not occur.
As well as being stylistically interesting, Portrait AP122 is also interesting because it is the closest, morphologically, to the statistical average of all of the 12 artworks. Other than a similarity of head-pose and a commonality of art paper used, the 12 portraits the Artists produced are highly diverse. It was an unexpected outcome, therefore, to discover that our 12 divergent depictions of the Sitter’s face and features result in a statistically average portrait that is the most morphologically accurate depiction of the Sitter’s features – that is, more morphologically accurate than AP121 (see Figure 5). Face perception studies have found that averages of divergent photographs of the same face, through their removal of aspects that are irrelevant to facial identity, are highly recognisable for familiar viewers (Young & Burton, 2017). However, while recognisable, compared with individual instances of ambient portrait photographs, facial averages have also been found to attract comparatively low likeness ratings by familiars (Ritchie et al., 2018).
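One way to see why an average of divergent depictions can sit closer to the reference than any single depiction is a toy simulation in which each "portrait" is the reference landmark configuration plus independent random error. The sketch below is offered under several assumptions: the landmarks are synthetic two-dimensional points and not the study’s landmark data, all function names are hypothetical, the Procrustes chord distance is taken here to be the partial Procrustes distance between centred, unit-size configurations, and the iterative mean is a simplified stand-in for a full generalised Procrustes analysis. It does not show that the errors in the actual portraits were independent.

```python
import math
import random

def preshape(landmarks):
    """Centre 2D landmarks and scale to unit centroid size (complex form)."""
    z = [complex(x, y) for x, y in landmarks]
    centroid = sum(z) / len(z)
    z = [p - centroid for p in z]
    size = math.sqrt(sum(abs(p) ** 2 for p in z))
    return [p / size for p in z]

def chord_distance(a, b):
    """Partial Procrustes (chord) distance after optimal rotation."""
    za, zb = preshape(a), preshape(b)
    inner = sum(p * q.conjugate() for p, q in zip(za, zb))
    cos_rho = min(1.0, abs(inner))
    return math.sqrt(max(0.0, 2.0 - 2.0 * cos_rho))

def align_to(target, shape):
    """Rotate preshape `shape` to best match preshape `target`."""
    inner = sum(p * q.conjugate() for p, q in zip(target, shape))
    rotation = inner / abs(inner)
    return [q * rotation for q in shape]

def mean_shape(configs, iterations=5):
    """Simplified iterative Procrustes mean of several configurations."""
    shapes = [preshape(c) for c in configs]
    mean = shapes[0]
    for _ in range(iterations):
        aligned = [align_to(mean, s) for s in shapes]
        avg = [sum(col) / len(aligned) for col in zip(*aligned)]
        mean = preshape([(p.real, p.imag) for p in avg])
    return [(p.real, p.imag) for p in mean]

# Synthetic reference "face": 20 random landmarks (NOT the study data).
rng = random.Random(1)
reference = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(20)]

def noisy_copy(shape, sigma=0.05):
    """Simulate one depiction: the reference plus independent errors."""
    return [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma)) for x, y in shape]

portraits = [noisy_copy(reference) for _ in range(12)]
individual = [chord_distance(reference, p) for p in portraits]
average = chord_distance(reference, mean_shape(portraits))
print("closest single portrait:", min(individual))
print("average of 12 portraits:", average)
```

In this idealised setting the errors of the individual depictions partly cancel when averaged, so the mean configuration lies nearer the reference than the best single configuration. Whether the same mechanism explains the result reported here would depend on how far the Artists’ departures from the 3D print were unrelated to one another.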
This description of portrait photograph averages being recognisable because of their removal of irrelevant detail is similar to that of abstraction, where an artist selects and depicts essential information with minimal detail (Konecni, 1991). Abstraction is, arguably, also an extension of the inherent ‘partiality’ (Gombrich, 1977; West, 2004) of all artistic depiction – it is not possible for any artwork to replicate life and portraits are not people. Portrait AP122 is a relatively abstract depiction of the Sitter’s face that is similar, morphologically, to the average of the portraits, and not the average of portrait photographs of the Sitter’s face. So, while the Sitter’s selection of Portrait AP122 as the one he most preferred is likely related to this portrait being the most pleasing and most interesting, it is an indirect link to further suggest that Portrait AP122 was not rated highly for likeness because it shares characteristics of average photographs. Were this a more direct link, it would be possible to conclude that abstraction does not enhance likeness and recognisability in portraiture.
With respect to how the shapes of a Sitter’s face are depicted in portraiture and how this relates to likeness assessments, a summary of how our research findings agree with previous research is as follows:
likeness assessments are relatively independent of artistic experience, which agrees with a previous study involving visual assessments of accuracy referencing portraits traced from photographs (Kozbelt et al., 2010);
colour and scale do not significantly influence likeness assessments of portraits, a finding that is similar to previous research with portrait photograph recognition (Bruce & Young, 1998; Hole et al., 2002);
familiar viewers rate good likenesses higher than unfamiliar viewers, indicating likeness assessments in ambient portraiture are linked to recognisability, a finding that agrees with recent research involving ambient portrait photographs (Ritchie et al., 2018);
unfamiliar viewers show a stronger preference than familiar viewers for more accurate portrait depictions, and therefore appear to engage in face-matching to ascertain likeness, which has been found to occur with unfamiliar viewers of portrait photographs (Young & Burton, 2017);
exaggeration of distinctive facial features enhances a portrait’s likeness for all viewers, and in particular familiar viewers, which is in agreement with aesthetic, psychological and neurological theories of art (Arnheim, 1974; Brennan, 1985; Gombrich, 1977, 1982; Ramachandran & Hirstein, 1999), as well as face perception studies referencing automated caricatures derived from photographs (e.g., Benson & Perrett, 1991; Rhodes, 1996);
exaggeration of distinctive features does not need to be present for a good likeness to result, which agrees with previous findings of a weak–moderate positive correlation between metric accuracy and subjective likeness assessments of portraits produced by non-artists (Ostrofsky et al., 2014).
Furthermore, our research indicates:
abstraction does not appear to enhance likeness or recognition in portraiture;
highly diverse portraits of a Sitter’s face, when statistically averaged, produce a highly accurate depiction;
meaningful research can be undertaken under ambient conditions that are closer to the everyday of visual art practices (image production, exhibition and assessor participation).
Supplemental Material
Supplemental material for Variation and Likeness in Ambient Artistic Portraiture by Susan Hayes, Nick Rheinberger, Meagan Powley, Tricia Rawnsley, Linda Brown, Malcolm Brown, Karen Butler, Ann Clarke, Stephen Crichton, Maggie Henderson, Helen McCosker, Ann Musgrave, Joyce Wilcock, Darren Williams, Karin Yeaman, T. S. Zaracostas, Adam C. Taylor and Gordon Wallace in Perception
Footnotes
Acknowledgements
The authors would like to thank the following people for their support and assistance with this project: Dulcie Dal Molin, Naomi Arrowsmith and Donna Abbati; Gert van den Bergh, Treva Taylor and Brett Powley; Steve White and Justin Huntsdale; Stephen Beirne and Chris Richards. The authors would also like to acknowledge the enthusiastic support of the visitors to the Science of Portraiture exhibition during National Science Week 2017, and the encouragement and insightful recommendations provided by our two reviewers.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The portrait production, exhibition and analysis were self-funded by the RPAA membership; the scanning and 3D print was undertaken by ACES and the Australian National Fabrication Facility Materials Node, and donated to the project as part of National Science Week 2017.
Supplementary Material
Supplementary material is available for this article online.
References
