Abstract
Little is known about self-report pain intensity scales best suited for young children. We tested the ability of preschool children to use two simplified scales (concrete ordinal and faces). Three- to 5-year-olds (n = 123) were asked to make binary discriminations (‘less’ vs ‘more’ pain) between response options using the Simplified Faces Pain Scale and Simplified Concrete Ordinal Scale and to complete a seriation task. Eighty participants were also asked to use the Simplified Concrete Ordinal Scale, with modified verbal anchors, to rate the loudness of tones and to assess practice effects. Binary discrimination accuracy and seriation ability improved with age. When using the Simplified Concrete Ordinal Scale to rate the loudness of tones, even the 3-year-olds performed significantly better than chance, and performance was better in 4- and 5-year-olds. Little evidence supported the ability of 3-year-olds to use either of the simplified tools in the pain context. The 4-year-olds demonstrated greater accuracy in using the Simplified Concrete Ordinal Scale than the Simplified Faces Pain Scale, suggesting that the Simplified Concrete Ordinal Scale may be more appropriate for this age group.
Introduction
Numerous validated tools assess self-reported pain intensity with children aged 6 years and older, but much less is known about the abilities of 3- to 5-year-olds to use these self-report tools (Birnie et al., 2019; von Baeyer et al., 2017). Although some tools are claimed to be suitable for children as young as 3 years, these claims are typically based on aggregated data, where a small number of 3- and 4-year-olds are included together with older children, resulting in an overestimation of their abilities (von Baeyer et al., 2017).
There is a small body of published research investigating the ability of 3- to 5-year-olds to use self-report tools to report pain intensity. Children aged 3 and 4 years tend not to order the response options of face scales as intended (Beyer and Aradine, 1986; Yeh, 2005), show low correlation between self-report and observational scales (Willis et al., 2003) and make more errors than older children when rating hypothetical pain scenarios (Stanford et al., 2006). A recent study found moderate correlation (Spearman rho = .73) between a concrete ordinal scale using blocks and a faces pain scale among 4-year-olds (Jung et al., 2018). The same authors found that both scales achieved good responsivity to analgesics; however, data were aggregated across the 4- to 7-year age group, making it impossible to interpret specifically for 4-year-olds.
Some evidence suggests that 3- and 4-year-olds may be able to provide self-report pain scores that are more valid and reliable when using scales with fewer response options relative to scales that have six response options (von Baeyer et al., 2013). Hence, Emmott et al. (2017) developed the Simplified Faces Pain Scale (S-FPS) and Simplified Concrete Ordinal Scale (S-COS). Both utilise an initial binary question about the presence or absence of pain, followed by discrimination between three possible levels of pain intensity. The S-FPS requires selecting between three faces depicting varying degrees of pain intensity. The S-COS requires selecting between images of one, two, or three blocks in stacks representing varying degrees of pain. Emmott et al. (2017) found that the 4-year-olds, but not 3-year-olds, were better able to distinguish pain from no pain using the simplified scales than using the Faces Pain Scale – Revised. Moderate correlations were noted between the simplified scales and an observational pain assessment tool (Face, Legs, Activity, Cry and Consolability (FLACC) Scale).
Children’s perceptions of the faces used in S-FPS regarding age and gender have not been studied. If children rely on the technique of matching their experience of pain to one of the faces (Besenski et al., 2007), children who consider the faces as dissimilar to their own may be less able to use the faces as a target for matching.
To understand young children’s ability to self-report pain intensity, it is useful also to consider the literature on their ability to provide graded self-report ratings in other sensory domains. There is some literature regarding children’s ability to rate the loudness of tones (Collins and Gescheider, 1989). The perception of loudness, like pain, is subjective in nature and must be measured using psychophysical methods. Although hearing tests with preschool-aged children typically consist of binary questions, asking children if they can hear a series of progressively softer sounds (Cunningham and Cox, 2003), early psychophysical studies considered children’s ability to make absolute magnitude estimations of sensations such as loudness. An interesting but small study found that 9 of 11 children, aged 4–7 years, and 9 of 12 adults were able to carry out an absolute magnitude estimation task involving cross-modality matching using line length to reflect the loudness of tones (Collins and Gescheider, 1989). Although these results should be treated cautiously due to low sample size, they suggest that unreliable performance did not seem related to age. Most of the children tested had a concept of magnitude relations as well as the capacity to utilise abstract symbols to represent their sensory experience: both cognitive prerequisites for using self-report pain scales (Chan and von Baeyer, 2016).
Aims
The overall aim of the study was to investigate the ability of 3- to 5-year-old pain-free preschoolers to use two recently developed, simplified self-report pain intensity rating scales, namely the S-FPS and S-COS, and to establish whether either format was better suited for use with children at different ages. The study had four specific aims. First, we examined the ways healthy 3- to 5-year-olds, who are not in pain, perceive and understand the response options. Consideration was given to participant perceptions of the age and gender of the faces in the S-FPS, and whether these perceptions were associated with participants’ own age and gender. Second, the study assessed the ability of 3- to 5-year-old children to make accurate binary decisions between various combinations of response options for each of the simplified scales, identifying which option represented greater or lesser pain intensity levels. Third, we examined the ability of 3- to 5-year-old children to correctly order the response options for the S-FPS and S-COS with regard to varying levels of pain intensity. Finally, participants used the S-COS to rate another familiar sensory experience, namely the loudness of sounds. It was hypothesised that a developmental trajectory of improved task accuracy would be found for each of the tasks assessed. A secondary, exploratory aim was to examine whether accuracy improved with practice in discrimination and seriation tasks.
Methods
Study design
The study utilised a cross-sectional design, with a particular focus on between-age-group comparisons, with a secondary within-group follow-up component to assess for practice effects.
Participants
Participants were 3- to 5-year-old children recruited from nine preschools in Sydney, Australia, in 2015–2016. The exclusion criteria were poor English comprehension by the child and/or parent, cognitive impairment, and auditory impairment. Families who met any of the exclusion criteria were not provided with information and consent forms.
Measures
Graded dogs screening task
The graded dogs screening task (Emmott et al., 2017) consists of an image of three identical dogs, each increasing in size. Participants are asked: ‘Do you know what this is? What is it?’ and ‘Can you point to the biggest one?’ The task is intended to facilitate interaction with the child, to check that participants are able to comprehend and respond to basic questions and to demonstrate a concept of size.
S-FPS
The three monochrome faces of the S-FPS (Emmott et al., 2017), reflecting gradations of pain intensity, were used (see Figure 1). The faces were designed to incorporate anatomical features consistent with facial expressions of mild, moderate and severe levels of pain intensity (scored as 3, 7, and 10 on a 0–10 scale).
Figure 1. Simplified Faces Pain Scale. Copyright 2014, CL von Baeyer.
S-COS
The three images developed as part of the S-COS (Emmott et al., 2017) were used in the current study (see Figure 2). The S-COS is a concrete ordinal scale designed to assess children’s self-reported pain intensity. The first image depicts a single block, representing mild pain intensity (scored 3/10). The second shows a stack of two blocks, representing moderate pain intensity (scored 7/10). The third shows a stack of three blocks, representing severe pain intensity (scored 10/10).
Figure 2. Simplified Concrete Ordinal Scale.
With modified verbal instructions, the S-COS may also be used to assess other sensory experiences, such as the loudness of a tone. The image with the single block is described as ‘just a little bit of sound’, the stack of two blocks is described as ‘more sound than that’ and the stack with three blocks is described as ‘the most sound’. Comparable with the original S-COS instructions for administration when assessing pain, a binary question is used to elicit whether or not participants hear any sound. If yes, this is followed by the instruction ‘Please point to the picture that shows how loud the sound is’.
Procedures
Ethical approval was obtained from the institutional Human Research Ethics Committee (Reference number: HREC/14/SCHN/501). Written informed consent was obtained from parents, and verbal assent was obtained from children. When providing consent, parents provided details about the child’s age, gender, and any current or recent painful experiences. Testing was carried out individually with each child in a quiet room in the preschool, with assessments generally lasting 20–40 minutes. Parents were not present at the time of testing. Data were de-identified after participants completed all aspects of testing.
The S-FPS and S-COS were not used in the current study to assess pain, but rather to delineate children’s understanding of the various response options; hence, the measures were not administered in the usual way. Instead, the response options were shown to participants alone or in various combinations, as described below.
Children’s perception and understanding of S-FPS responses
Participants were shown the S-FPS response options and asked, ‘Do you think this person looks like a boy or a girl?’ The order of references to boy and girl was computer randomised for each participant. If the participant hesitated, a third option ‘Or are you not sure?’ was offered. The participant was then asked, ‘Do you think this person is older than you or younger than you?’ The order in which the terms older and younger were stated was randomised. If the participant hesitated, a third option ‘Or are you not sure?’ was offered. The participant was then shown each face from the S-FPS in ascending order of pain intensity and asked for each ‘Do you think this face shows hurt?’
Binary discrimination task
Participants were shown all possible combinations of response pairs (for the S-FPS and S-COS), as well as a blank card representing no pain, in randomised order. Participants were asked ‘Which one shows more/less hurt?’ The terms ‘more’ and ‘less’ were randomised but used with equal frequency. A binary discrimination task score was calculated as the number of correct discrimination responses out of six possible responses.
Seriation task
The three cards depicting pain response options were shuffled and randomly placed on the table (for the S-FPS and S-COS). Participants were asked ‘Can you sort these cards in order from only a little bit of hurt to a lot of hurt?’ If a participant seemed unable to commence the task, the investigator provided additional prompts, such as ‘If we are putting these in order from only a little bit of hurt to a lot of hurt, which one comes first? And then?’ Once the cards were ordered, to help clarify what the participants had done, they were asked ‘Which one shows only a little bit of hurt?’ The seriation task was scored as either correctly ordered or not. The scale order (S-FPS and S-COS) for the binary discrimination and seriation tasks was randomised.
Loudness rating task
Children participating in the second year of data collection were also administered a loudness rating task. They were told ‘I am going to play some sound clips to you and then ask you some questions about how loud they were. Here are the different sound clips you will be hearing. Some of the sounds are quite loud and some are quite soft so listen carefully’. Using headphones, sound clips of an identical C4 piano note (262 Hz) were played in the order of low, moderate and high sound volume. The S-COS was introduced with modified scale anchors appropriate to sound volume: ‘just a little bit of sound’, ‘more sound than that’ and ‘the most sound’. Participants were then asked to listen carefully for a sound and told that a sound might be played or might not be played. Participants then heard either no sound, or the C4 piano note at low, moderate, or high volume. Participants were asked if they heard a sound, and if they did, to point to the option on the S-COS that best showed how loud the sound was. Sound clips were presented in a random order such that each level of sound volume, and no sound, occurred twice.
Practice effects
Following the loudness rating task, the binary discrimination and seriation tasks (as described above) were repeated for the S-FPS and S-COS, again with random ordering of the two scales.
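The six binary discrimination items and the eight loudness trials follow directly from the design described above. The following Python sketch shows one way such presentation orders could be generated. It assumes four cards per scale (a blank 'no pain' card plus the three response options); the card labels, function names and fixed seed are illustrative and are not taken from the study materials.

```python
import itertools
import random

# Hypothetical labels: a blank card plus the three response options of one scale.
CARDS = ["blank", "mild", "moderate", "severe"]

def discrimination_pairs(cards, rng):
    """Return every unordered pair of cards in a random presentation order."""
    pairs = list(itertools.combinations(cards, 2))
    rng.shuffle(pairs)
    return pairs

def loudness_schedule(rng):
    """Return eight trials: no sound and each of three volumes, twice each."""
    levels = ["none", "low", "moderate", "high"]
    trials = levels * 2
    rng.shuffle(trials)
    return trials

rng = random.Random(0)  # fixed seed only so the sketch is reproducible
pairs = discrimination_pairs(CARDS, rng)
schedule = loudness_schedule(rng)
print(len(pairs), len(schedule))  # prints: 6 8
```

Four cards yield six unordered pairs, which matches the maximum binary discrimination score of six.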
Statistical analyses
Descriptive statistics (means, standard deviations, frequencies, and percentages) were used to report children’s perceptions of the S-FPS faces, as well as their responses to the binary discrimination, seriation and loudness rating tasks. Chi square tests were used to assess whether the frequency of responses differed across key variables, such as by gender or age. One-sample t-tests were used to determine whether children’s accuracy of responses differed significantly from what would be expected from random responding. Practice effects were assessed using paired t-tests.
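As an illustration of the one-sample comparison against random responding, the sketch below computes the t statistic and Cohen's d for a set of scores tested against a chance value. The scores are made up for illustration and are not study data. The chance value of 3 is an assumption: it is what six two-choice items would yield under guessing. The function name is hypothetical.

```python
import math
import statistics

def one_sample_t(scores, chance):
    """Return (t, degrees of freedom, Cohen's d) for scores tested against a chance value."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
    t = (mean - chance) / (sd / math.sqrt(n))
    d = (mean - chance) / sd
    return t, n - 1, d

# Hypothetical scores out of six, for illustration only; not study data.
example_scores = [4, 5, 3, 6, 4, 5, 3, 4]
t, df, d = one_sample_t(example_scores, chance=3.0)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```

A p-value would additionally require the t distribution with the stated degrees of freedom, which a statistics package can supply.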
Results
Sample
Informed parental consent was obtained for 139 children; 16 of them did not wish to participate. It is not known how many parents implicitly declined consent by not returning the permission forms. Testing was commenced with 123 children. One 4-year-old did not respond to any questions so testing was terminated. One 3-year-old did not appear to comprehend the questions, failed a basic size seriation screening task, and did not respond in the required way to questions, and was therefore excluded from analyses. Two participants left before the end of the testing session, and incomplete data were obtained from these participants. Analysis of loudness ratings and practice effects was carried out with a subset of the sample, namely participants recruited in the second year of testing (n = 80).
At least partial data were obtained from 121 3- to 5-year-old children, including 45 3-year-olds (31 boys and 14 girls), 46 4-year-olds (22 boys and 24 girls), and 30 5-year-olds (19 boys and 11 girls). These numbers afforded sufficient power (α = 0.05, two-tailed) to detect large age effects if such effects exist. Loudness ratings and practice effects were assessed on a subset of 80 participants. No participants were experiencing pain at the time of the study.
Screening task
Of the full sample (n = 123) that was screened, only one participant (aged 3 years) was unable to complete the graded dogs screening task and other tasks, so their data were removed from analyses.
Perceptions of the faces in the S-FPS
Frequency (percentage) of perceived age of faces reported by each age group.
Frequency (percentage) of perceived gender of faces reported by boys, girls and full sample.
When asked whether the respective faces depicted pain, 29% (n = 13) of 3-year-olds reported that the face intended to depict severe pain did not show pain, whereas only 9% (n = 4) of 4-year-olds and 3% (n = 1) of 5-year-olds made this judgement. For the face intended to depict moderate levels of pain, 42% (n = 19) of 3-year-olds, 35% (n = 16) of 4-year-olds and 40% (n = 12) of 5-year-olds reported that the face did not show hurt. For the face intended to depict low levels of pain, 31% (n = 14) of 3-year-olds, 35% (n = 16) of 4-year-olds and 30% (n = 9) of 5-year-olds reported that the face did not show hurt.
Pain scale binary discriminations
Mean correct binary decisions for each assessment tool by age.
Note: S-FPS: simplified faces pain scale; S-COS: simplified concrete ordinal scale.
* indicates that a one-sample t-test shows that participants performed significantly better than if they were answering randomly (*p < .05; **p < .01).
Seriation task
Figures 3 and 4 show the percentage of participants who correctly completed the seriation task using the S-FPS and S-COS, respectively. If responding randomly, 16.67% of participants would complete the task correctly.
Figure 3. Percentage of participants who correctly performed the Simplified Faces Pain Scale seriation task by age.
Figure 4. Percentage of participants who correctly performed the Simplified Concrete Ordinal Scale seriation task by age.

Figure 3 shows that the 4- and 5-year-olds, but not the 3-year-olds, performed significantly better than chance when completing the S-FPS seriation task. As shown in Figure 4, the 3-, 4- and 5-year-olds all performed significantly better than chance when completing the S-COS seriation task.
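The 16.67% chance level quoted for the seriation task can be reproduced by counting the possible orderings of three cards. The short sketch below assumes that exactly one ordering is scored as correct; the card labels are illustrative.

```python
import itertools
from fractions import Fraction

cards = ["mild", "moderate", "severe"]  # illustrative labels for the three cards
orderings = list(itertools.permutations(cards))
chance = Fraction(1, len(orderings))  # assumes exactly one ordering is correct
print(len(orderings), f"{float(chance):.2%}")  # prints: 6 16.67%
```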
A statistically significant relationship was found between participant age and whether or not they ordered the responses correctly using the S-FPS (χ²(2, N = 73) = 12.6, p = .002; Cramér’s V = .42) and the S-COS (χ²(2, N = 77) = 11.3, p = .003; Cramér’s V = .38). When assessed with a subsample (n = 80), seriation ability with the S-FPS and S-COS was not found to improve with practice.
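The effect sizes reported for the seriation analyses can be recovered from the chi square statistics using the standard formula V = sqrt(χ² / (N × min(r − 1, c − 1))). The sketch below assumes a 3 × 2 contingency table (three age groups by correct or incorrect ordering), which is consistent with the reported two degrees of freedom; the function name is illustrative.

```python
import math

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V = sqrt(chi2 / (n * min(rows - 1, cols - 1)))."""
    k = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * k))

# Reported statistics; a 3 x 2 table (age group by correct/incorrect) is assumed.
v_sfps = cramers_v(12.6, 73, 3, 2)
v_scos = cramers_v(11.3, 77, 3, 2)
print(round(v_sfps, 2), round(v_scos, 2))  # prints: 0.42 0.38
```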
Rating loudness using the S-COS
Mean and median number of times participants correctly identified tone loudness by age.
a Significantly better than random (random response = 2), all p’s < .001.
One-sample t-tests revealed that 3-year-olds (t(30) = 4.1, p = .001, Cohen’s d = 0.74), 4-year-olds (t(26) = 7.1, p < .001, Cohen’s d = 1.87) and 5-year-olds (t(21) = 12.6, p < .001, Cohen’s d = 1.88) all performed significantly better than random in identifying the tone loudness. A significant relationship was found between age and ability to correctly rate the loudness of tones (χ²(2, N = 80) = 27.7, p = .006; Cramér’s V = .42), whereby older children made more correct identifications.
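The random-response benchmark of 2 for the loudness task can be derived from the design: eight trials, with four possible responses on each. The sketch below assumes that a guessing child chooses among the four responses (no sound, or one, two or three blocks) with equal probability; the constant and function names are illustrative.

```python
from math import comb

N_TRIALS = 8     # each of four stimulus levels presented twice
P_GUESS = 1 / 4  # assumed: four equally likely responses under guessing

def p_at_least(k, n=N_TRIALS, p=P_GUESS):
    """Probability of k or more correct responses under pure guessing (binomial)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

expected_correct = N_TRIALS * P_GUESS
print(expected_correct)  # prints: 2.0
```

The helper p_at_least gives the probability that a single guessing child would reach a given score, which is a different question from the group-level t-tests reported above.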
Discussion
The current study investigated the ability of 3- to 5-year-old children to use simplified self-report pain intensity tools, building on very limited literature in this area. Evidence was found for a developmental trajectory in children’s ability to use such self-report tools. Three-year-olds had little ability to make accurate binary discriminations reflecting more versus less pain. As hypothesised, their ability to correctly order the S-FPS responses was not significantly better than chance. In fact, 29% of 3-year-olds reported that the face intended to depict high pain did not show hurt.
Although making a considerable number of errors, 4-year-olds performed better than chance on the binary discrimination task for the S-COS but not the S-FPS. They were also better than the 3-year-olds in correctly ordering the S-COS and S-FPS response options. Five-year-olds made fewer binary discrimination errors and performed significantly better than chance for both S-COS and S-FPS. Their ability to order the S-FPS was superior to both 4- and 3-year-olds, and their ability to order the S-COS was better than 3-year-olds and slightly better than 4-year-olds. These results suggest that by 5 years of age, children are more likely to be able to use a variety of assessment tools to self-report pain. This is consistent with other research supporting the ability of 5-year-olds to use a range of self-report scales (Tomlinson et al., 2010). The 4-year-olds in the current study were somewhat better at using the S-COS than the S-FPS. However, as in previous research (Emmott et al., 2017; von Baeyer et al., 2017), the current study found no evidence to support the ability of 3-year-olds to use either a simplified concrete ordinal or simplified faces scale.
Previous studies suggest that young children may improve in their ability to use pain self-report tools with practice (von Baeyer et al., 2011). The current study found that on the second occasion of completing the binary discrimination tasks, and after using the S-COS to rate sound volume, children of all ages, and with both of the self-report assessment tools considered, showed non-significant trends of improvement. However, there was essentially no improvement with practice on the seriation task. It is possible that experience with differing levels of pain intensity may be more relevant than simply practice using the scales (von Baeyer et al., 2011). Future research may also consider various types and amounts of training, such as the use of additional verbal instructions or modelling of the use of self-report assessment tools using simple materials unrelated to pain.
Participants were not significantly more likely to perceive the faces of the S-FPS as one gender or the other. This gender neutrality may enable participants to match their own experience to one of the faces (Besenski et al., 2007), a strategy which may not be possible if participants perceive the faces as dissimilar to their own. While 5-year-olds perceived the faces as looking younger than themselves, the 3- and 4-year-olds were equally likely to consider the faces as being older or younger than themselves. These results suggest that the faces in the S-FPS are suited for use with preschool-aged children. Although beyond the scope of the current study, further research may be warranted to explore the possibility of ethnic differences in how participants interpret the faces and their similarity to themselves.
Although more than 90% of the 4- and 5-year-olds recognised that the high pain face of the S-FPS depicted pain, the medium pain face was less often correctly identified as depicting pain. This may in part have been related to how the questions were asked. Children may have thought that a different answer was required when the question of whether the face showed pain was repeated for the three faces. Nevertheless, it is possible that a significant portion of children did not perceive that some of the faces in the S-FPS depicted pain, thus compromising the validity with which they would be able to use the scale.
A novel feature of this study was that the S-COS was also used to rate the loudness of tones. This provides a new paradigm for evaluating self-report assessment tools in the context of a real but non-threatening sensory experience. Such a paradigm may be particularly valuable in evaluating the use of self-report scales with very young children, as it both bypasses the challenges associated with testing in a clinical context and avoids asking young children to use a rating scale for a sensory experience that they are not currently experiencing (Jaaniste et al., 2016).
Although older children were more accurate in rating the loudness of tones than young children, even the 3-year-olds performed better than chance when rating the loudness of tones using the S-COS. This finding is important because it indicates that at least some children as young as 3 years demonstrate some ability to use a concrete ordinal scale to report a sensation. However, this raises the question of why the 3-year-old children were capable of reporting sound loudness using the S-COS but seemingly not capable of making accurate binary discriminations regarding levels of pain between S-COS response options. This is of particular relevance because ability to estimate quantities, which the binary discrimination results suggested the 3-year-old children were not capable of, is thought to be a prerequisite to the successful use of pain intensity scales.
One possible explanation is that it may have been easier for younger children to use the S-COS to rate a current sensory experience rather than the more abstract tasks of binary discriminations or response ordering with respect to a pain context not currently being experienced (Atance and Meltzoff, 2005, 2006; Jaaniste et al., 2016). Second, young children may be more adept at rating sound loudness than pain, given their greater familiarity and experience with sound. Third, it is important to note that participants heard the full range of possible tone volumes together with a verbal descriptor before beginning to rate the sound volumes. It is therefore possible that children were matching sound volumes to response options, rather than utilising the assumed prerequisite skills of seriation or quantitative estimation (Chan and von Baeyer, 2016). If this last explanation were true, it suggests that children would need to experience various levels of pain and be clear on which of the response options each pain level corresponded with before they could validly use the S-COS. This may be possible for children who have repeated or long-term pain but would be less feasible for most acute pain contexts.
A number of limitations in the current study warrant mention. The ability of participants to utilise pain assessment tools in a pain-free environment may not be indicative of how they utilise the tools when experiencing pain. Participants in the current study experienced less distress than they would in a painful clinical context. However, it may be easier for children to comprehend response options depicting varying levels of pain when they are actually experiencing pain themselves (Jaaniste et al., 2016; von Baeyer et al., 2011). Moreover, a child’s clinical history of prior painful procedures may also potentially influence their ability to utilise self-report pain intensity tools. Further research is needed in clinical contexts, as well as taking into account pain history.
Second, using self-report rating scales for rating the loudness of tones may have been easier than for rating pain because participants in the current study were exposed to each of the varying levels of auditory loudness before carrying out the tasks, whereas children often have a less clear concept of what constitutes mild, moderate or severe pain before being asked to rate a particular pain incident.
Third, the ability of young children to complete various tasks is dependent on the vocabulary used by the researcher in explaining the tasks. Some of the verbal instructions used in the current study may have been too complex for 3- and some 4-year-olds. For example, it is possible that the poorer performance of 3-year-olds on the seriation task may have been due to not understanding the task demands rather than an inability to seriate. Moreover, although the initial instructions for the seriation task were consistent with a true measure of seriation, the subsequent prompts given to children who were unable to respond to the initial instructions encouraged children to use what Piaget referred to as the extremum method (identifying one end of the sequence and choosing what comes next) (Inhelder and Piaget, 2013). Thus, it is difficult to identify whether children’s responses on the seriation task were based on a true ability to seriate or their use of the simpler extremum method. Future studies that require young participants to complete a seriation task may consider using practice tasks with other sets of items, for example seriation by size or darkness, to ensure that the child understands the task demands before commencing the study task.
Fourth, it is possible that young children may have experienced some degree of respondent fatigue (Lavrakas, 2008), given the relatively lengthy testing protocol, which may have contributed to their poorer performance relative to older children. However, measures were taken to maintain the engagement and focus of all participants, using praise and encouragement as needed, and it was not observed that 3-year-olds who completed the study seemed more fatigued.
Finally, some selection biases may have occurred. For example, parents may have been more likely to give permission for their child to participate if they believed their child would be likely to successfully engage with the researcher, and shy children may have declined to participate.
Notwithstanding these limitations, the current study has paved the way for future scale development and early psychometric work with self-report pain scales, such as concrete ordinal tools, using other sensory contexts. Future researchers may consider other familiar physical or sensory contexts such as rating varying degrees of hunger or thirst, visual brightness, fatigue or the need to urinate.
Implications for practice
Implications for clinical assessment of young children’s pain should be drawn only with caution, with further clinically based testing warranted. The data from the current study suggest that the concrete ordinal scale used in the current study, the S-COS, may be more suitable for use with 4-year-olds than the simplified faces scale. The current study suggests that most 3-year-olds are likely to struggle to use either type of self-report tool; observational/behavioural measures may be preferable.
Conclusions
The current study suggests that among 3- to 5-year-old children, there is a clear developmental progression in their ability to use simple self-report assessment tools such as the S-COS and S-FPS. The 3-year-olds were generally unable to validly use the S-COS or the S-FPS. In contrast, most 5-year-olds were able to use either the S-COS or the S-FPS. Four-year-olds demonstrated some ability to use the simplified tools. Notably though, the 4-year-olds were more accurate in making binary discriminations using the S-COS than the S-FPS, suggesting that the blocks scale may be a more appropriate tool for use with this age group. However, further clinical research is needed before clinical recommendations can be made. The sound volume rating task used in the current study may be a valuable tool in development of self-report pain scales for young children.
Footnotes
Acknowledgements
We acknowledge the support of the preschools involved in the study and their assistance with participant recruitment and facilitation of testing. We appreciate the constructive comments of Nicholas West on an earlier version of this manuscript. We acknowledge Patricia Bernal, who designed the faces and blocks used in the two pain scales, and the scale developers at the University of British Columbia, Canada, for providing us with access to the tools.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Tiina Jaaniste was supported by the Sydney Children’s Hospital Foundation. Ashleigh Burgess and Mathushinee Mohanachandran received medical student funding from the School of Medicine, University of New South Wales.
