Abstract
Use of nonwords is a potentially more appropriate method of assessment for English second language (EL2) learners. A mixed comparative design was used to compare the effects of using nonwords instead of picture-based stimuli to assess the articulation of EL2 learners. Subaims were to compare results between two tests and age groups. In all, 16 Setswana L1 children assigned to two age cohorts were assessed using the Goldman-Fristoe Test of Articulation–second edition (GFTA-2) and a nonword list created via a registered Speech Motor Learning website. Results of the two assessments differed significantly, indicating that lack of semantic information may yield different outcomes for articulation assessments of EL2 learners. Speech sound differences on the GFTA-2 were sounds not found in L1. This agrees with previous research indicating incorrect diagnosis due to speech and language differences. There was no significant difference between the two age cohorts. This research forms the basis for investigations into nonwords as a more accurate method for assessment of EL2 learners.
Introduction
When speech sound disorders (SSD) occur in conjunction with second language (L2) learning, long-term negative effects can result. These negative effects are especially evident when intervention is not received before 5 to 6 years of age (McLeod, Verdon, & Bowen, 2013; Miltiņa & Augstkalne, 2015). The impact of SSD may be greater when the Language of Learning and Teaching (LoLT) differs from the first language, as is the case with English second language (EL2) learners in South Africa (Miltiņa & Augstkalne, 2015). These differences create challenges for a speech-language pathologist (SLP) when trying to identify children with disorders in this population (Preston & Seki, 2011). The main difficulties identified by SLPs in the assessment of EL2 learners with SSD fall within three broad categories: (a) a lack of norms for multilingual speech acquisition, (b) a lack of confidence in differential diagnosis between SSD and speech difference, and (c) a lack of culturally appropriate tools for assessment (McLeod et al., 2013).
The multilingual nature of South Africa is evident from the 11 official languages and additional unofficial languages (South African Statistics, 2011; Van Biljon, Nolte, Van der Linde, Zsilavecz, & Naude, 2015). Only 9.8% of the South African population are English first language (EL1) learners (South African Statistics, 2011). Despite this fact, in most South African classrooms, English is the LoLT (ELoLT) regardless of the heterogeneous language context representing more than 20 different L1s (Málek, 2013; Van Biljon et al., 2015). In this multilingual environment, it is to be expected that the sound systems of the different languages influence one another. The 11 official languages themselves differ from one another in a number of ways (Niesler, Louw, & Roux, 2005).
Differences in accents and dialect are mainly attributed to the different phoneme inventories of the languages (Niesler et al., 2005). English would differ from African languages such as Setswana, which is a Bantu language spoken in South Africa (Van der Merwe & Le Roux, 2014). Some single consonants that occur in English words do not occur in the African languages. Examples of the sounds that appear in English but not Setswana are [z], [v], [θ], and [ð] (Niesler et al., 2005; Snyman, 1989). Setswana and English do not only differ in sound inventory but also differ in the manner of articulation.
Aspiration occurs frequently in Setswana, while in English aspiration happens but is not phonemic. Ejection does not occur in English, but it is part of the production process of consonants in Setswana (Niesler et al., 2005). The dorso-velar fricative [x] occurs in Setswana but not in English, while the lamino-alveolar fricative [z] occurs in English but not in Setswana. Setswana uses the trill [r] instead of the [ɹ] found in English (Cole, 1955). The number of affricates in African languages is significantly greater than the number used in English. Furthermore, English employs none of the clicks used in many African languages (Niesler et al., 2005). English contains many consonant clusters, such as [bl], [bɹ], [spɹ], and [dɹ], which do not occur in Setswana. In addition, Setswana frequently uses clusters (for example, [tlh]), which do not appear in English (Snyman, 1989; Ziervogel, 1967).
In addition to the differences present in the consonant inventories of English and the African languages, vowels differ across these languages. Research established that EL2 speakers find it difficult to distinguish all the vowels of English (Le Roux, 2016; Seeff-Gabriel, 2003). Nineteen vowel sounds have been distinguished in English, while many African languages have a reduced vowel system (Bekker, 2009). An example of this limited vowel inventory is that of Setswana, which consists of only seven basic vowels, four raised vowels, and no diphthongs (Cole, 1955; Snyman, 1989). The four tense vowels [a], [i], [u], and [o] are used in Setswana, but are infrequent in English (Niesler et al., 2005). In contrast, English frequently contains lax vowels which do not occur in Setswana (Niesler et al., 2005). In addition, languages also differ concerning syllabic structure, and the consonant vowel consonant (CVC) structure (closed syllable) is found in English but not Setswana (Van der Merwe & Le Roux, 2014). However, Setswana can have a syllable made up of only one consonant (C), which does not occur in English (Van der Merwe & Le Roux, 2014). Multilingualism is affected not only by the differences between languages, but also by the interaction between the two languages.
Speech differences and variation in pronunciation can occur due to interaction between two speech systems (Zając, 2016). Specifically, bilingual children keep the two languages separate to a large extent, but the two systems can still influence each other (Zając, 2016). Bekker and Eley (2007) describe South African English spoken by EL1 speakers as being distinct from the English spoken by other ethnic groups in South Africa. This distinction is shown by the different accents linked to the different first languages spoken by EL2 speakers (Moonsamy & Kathard, 2015). Multilingualism is also affected by the speakers’ language proficiency.
The complications in assessment of English SSDs presented by multilingualism, differing phoneme inventories, and differing syllabic structures are exacerbated by the different levels of English competency, as seen in the most recent assessment of the literacy skills of international learners, the Progress in International Reading Skills Study (PrePIRLS; Howie et al., 2017). South Africa scored the lowest out of all participating countries (Howie et al., 2017). The 49% of participants (Grades 4 and 5) who wrote the test in an LoLT which was different from their first language scored significantly lower than those who wrote in their first language (i.e., their LoLT and first language are the same; Howie et al., 2017). The difference in scores between EL1 and EL2 learners was equivalent to approximately 2 years of schooling (Howie et al., 2017). These results show that when children are taught in a second language, it can impact their literacy skills. Furthermore, difficulties with precise pronunciation of the sounds of a language also affect literacy skills (Miltiņa & Augstkalne, 2015). An in-depth SLP assessment is important to prevent or minimize these effects of L2 learning. Unfortunately, these assessments come with unique challenges in the multilingual and multicultural context.
Kadyamusuma (2016) reports that even after an articulation assessment, an EL2 learner could be misdiagnosed with the presence (or absence) of an SSD due to the occurrence of speech differences, dialects, and variation in first and subsequent languages. Preston and Seki (2011) therefore stress the importance of differentiating between SSD and speech differences associated with speaking a second language.
Assessment of this population is problematic, not only because of the interaction between the different languages but also due to assessments being administered by speakers of the mainstream dialect, and assessments being designed exclusively for this dialect (Southwood, 2013). A better understanding of an EL2 child’s speech system could be obtained through a bilingual assessment. However, this is an unrealistic goal in the South African context (Pascoe & Norman, 2011). According to Caesar and Kohler (2007), the homogeneity of a predominantly monolingual SLP workforce is also frequently reported globally, especially in English-speaking countries. Preferably, an SLP should have a similar language background to the client and speak the same languages and even have the same dialect (Goldstein & Gildersleeve-Neumann, 2015). Unfortunately, the situation in South Africa is not ideal, as the majority of SLPs speak English or Afrikaans. However, they are aware of their lack of knowledge of African languages and its effect on assessment (Wilsenach, 2016).
The multilingual and multicultural society, coupled with the lack of culturally, linguistically, and contextually appropriate assessment tools, creates assessment challenges for the homogeneous SLP workforce in South Africa (Barratt, Khoza-Shangase, & Msimang, 2012; Pascoe, Rogers, & Norman, 2013; Southwood & Van Dulm, 2015). Pascoe and Norman (2011) suggest the consideration of informal dynamic assessment methods to address the challenges posed by inappropriate assessment tools. Unfortunately, the lack of norms against which the results can be compared and the limited knowledge about typical development of many African languages make dynamic assessment only a partial solution (Pascoe & Norman, 2011). Southwood and Van Dulm (2015) acknowledge the need for valid and reliable normative assessment tools as these are the cornerstones of effective intervention. The proposed partial solution of the use of informal dynamic assessment methods in the South African context therefore leads to less valid and reliable assessment outcomes. We suggest that present methods of formal and normative assessment should be revisited to investigate possible adaptations to these reliable tools, rather than augmentation with informal methods.
The Goldman-Fristoe Test of Articulation (GFTA; Goldman & Fristoe, 2000) is a picture-based articulation test and the most commonly used test in the urban Gauteng area (Van Biljon et al., 2015). Picture naming is affected by abilities other than articulation, including memory, visuospatial skills, vocabulary, visual perception, and lexical–semantic abilities (Yochim, Kane, & Mueller, 2009). Wen-Hui (2016) found that a picture-naming task slows the EL2 learner down and elicits a high rate of errors in this population. During a picture-naming task, EL2 learners produce words at a slower rate and therefore have reduced word output. This word output disadvantage can be attributed to competition between the two languages (Gollan, Montoya, Cera, & Sandoval, 2008). If the sounds of the target words in the second language are similar to those of the first language, the ability to name pictures in the second language will be reduced (Hermans, Bongaerts, De Bot, & Schreuder, 1998; Lee & Williams, 2001). This phenomenon is referred to as cross-linguistic interference, in which the activation of the first language affects the processing of the second language (Wen-Hui, 2016). Wen-Hui (2016) stated that the phoneme inventory/sound repertoire of the first language will have a greater impact on the second language than vice versa.
In addition to the influence of the phoneme inventory of the L1, South African authors Van Biljon et al. (2015) found that when assessing EL2 learners using the picture-based GFTA-2, the target words were not always elicited as expected from the stimulus. An example of this is the picture of the wagon. The wagon is an object most children in the urban South African context have little experience with, and they may instead call it a wheelbarrow, as this is more familiar to them (Van Biljon et al., 2015). All these challenges result in the need for innovative research to bridge the gap between SSD and speech sound difference.
Nonword keywords may eliminate misinterpretation of results associated with different dialects, tools that rely on culturally inappropriate world knowledge, and limited vocabulary (Shriberg et al., 2009). As such, these words have been used in the treatment of motor speech disorders such as apraxia of speech (AOS; Van der Merwe, 2011). The rationale for the use of nonwords in the treatment of AOS consists of three parts. First, when using nonwords in treatment, there is freedom to vary the speech sounds included. This freedom means all consonants and vowels can be targeted in the consonant vowel consonant vowel (CVCV) syllable structure (Van der Merwe, 2011). Second, nonwords still allow for interaction between adjacent syllables (Van der Merwe, 2011); therefore, sounds are still produced in varied phonetic contexts. These nonwords will contain familiar syllable structures, as they must adhere to the phonotactic rules of a language (Van der Merwe, 2011). Third, nonwords have no semantic meaning and are therefore more likely to activate the neural motor areas, without additional activation of language areas (Van der Merwe, 2011). In addition, when nonwords are used for assessment, the second language learner will not struggle to find the necessary vocabulary. It is therefore reasonable to assume that nonwords may assist with the challenges presented by dialect, culturally inappropriate tools, and vocabulary.
Nonword repetition is a task used not only in the treatment of motor speech disorders but also in the assessment of children with language and literacy impairments, such as specific language impairment (SLI; Bishop, North, & Donlan, 1996; Shriberg et al., 2009). Nonwords have been used to diagnose children with possible SLI and other impairments where semantic content may interfere with performance (Reuterskiöld & Grigos, 2015). The rationale for their use in this regard is that the participant cannot access lexical information from long-term memory, which would support their performance (Reuterskiöld & Grigos, 2015). The lexical units comprise the vocabulary of a specific language. The lexical semantic value dictates the way in which the lexical units correlate with the syntax of that language. The value of nonwords, therefore, lies not only in the lack of semantic information, but also in the lack of specific syntax or structure.
Nonword keywords are mostly made up of two or more syllables. This syllabic composition is due to their use in the treatment of people with AOS, as these speakers use shorter words than normal speakers (Edmonds & Marquardt, 2004). This structure generally represents the CVCV structure at the lowest difficulty level (Van der Merwe, 2011). In almost all world languages, infant babbling and early words include the CV syllable. The basic syllable structure has therefore become the preferred basic unit of speech articulation (Van der Merwe, 2011). Reduplicated babbling, CVCV, is also common in the first 50 words of children from different language groups (Edwards & Shriberg, 1983). Therefore, children from different linguistic backgrounds could be assessed using nonwords to identify speech sound differences and articulation errors not associated with learning a second language.
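The freedom to vary speech sounds within the CVCV structure can be illustrated with a minimal sketch that enumerates CVCV strings from small consonant and vowel sets. The inventories and the function name cvcv_candidates are assumptions made purely for illustration; this is not a description of how the SML website compiles its lists, and a usable list would still need to be screened for real words and for the phonotactic rules of the target language.

```python
from itertools import product

# Illustrative inventories only; NOT the study's nonword list or the
# inventories used by the SML website.
CONSONANTS = ["b", "d", "t", "k"]
VOWELS = ["a", "i", "u", "o"]


def cvcv_candidates(consonants, vowels):
    """Return every consonant-vowel-consonant-vowel string from the given sets."""
    return [
        "".join(parts)
        for parts in product(consonants, vowels, consonants, vowels)
    ]


candidates = cvcv_candidates(CONSONANTS, VOWELS)
print(len(candidates), candidates[:5])
```

Each additional consonant or vowel enlarges the candidate pool, which is what allows a list to target every sound of interest in more than one phonetic context.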
Results of the Shriberg et al. (2009) study indicate that nonword tasks could possibly discriminate between SSD and speech differences. Shriberg and his colleagues (2009) stated that nonword repetition difficulties indicate a deficit in memorial processes. Such deficits have been identified as contributing to impairments such as language delays and other verbal traits such as SSD (Shriberg et al., 2009). Children with SSD present with poor performance on nonword repetition tasks (Shriberg et al., 2009). Shriberg et al. (2009) concluded that some of these errors are also due to habitual misarticulations, another indication of SSD. The results of Shriberg et al. (2009) agree with previous findings by Kovas et al. (2005), which stated that the results of a nonword repetition task are sensitive to articulatory accuracy in addition to phonological short-term memory. These research studies suggest that nonwords could be used for assessment of SSD, as speech sound differences and articulation errors would still show up in results, but those productions made due to language difference would not be seen.
Van der Merwe (2011) suggested that the use of nonwords of a CVCV structure will not activate the language areas of the brain. It should therefore be possible to assess the speech production of an EL2 learner without semantic or linguistic interference. English differs from Setswana in both speech sound inventory and grammatical structure (Bekker, 2009; Niesler et al., 2005; Snyman, 1989; Van der Merwe & Le Roux, 2014). A nonword list can be varied to include speech sounds and phonological rules specific to a language. As such, it is possible that nonwords could be used to identify only misarticulations due to SSD, as opposed to those resulting from interaction between the first language of the learner and English, as indicated by Shriberg et al. (2009). Therefore, the aim of the present research was to identify the difference in results when nonwords are used, as SSDs could be identified without semantic and linguistic interference resulting from EL2 learning.
The null hypothesis (H0) stated that despite the lack of semantic information, the use of novel, nonword keywords instead of a picture-based SSD test such as the GFTA-2 would not yield significantly different results in the assessment of articulation skills of EL2 learners. The alternative hypothesis (Ha) stated that the use of novel nonword stimuli instead of a picture-based SSD test such as the GFTA-2 would yield significantly different results in the assessment of articulation skills of EL2 learners due to the lack of semantic information.
Method
Research Design
This study employed a mixed comparative group design. This design allowed the researcher to compare each participant’s results when assessed using the nonword test as opposed to the GFTA-2 (each participant was assessed using both tests to allow comparison). In addition, comparison could be made between the different groups to identify differences based on age.
Ethical Considerations
Ethical clearance was obtained from the Research Committee of the Department of Speech-Language Pathology and Audiology, University of Pretoria, South Africa. The researchers received written permission from the private school, allowing the research to be conducted on their premises. Written consent was granted by the parents or legal guardians for their children to participate in the study. The participants granted written consent by writing their names on the consent form, as well as assent by selecting the correct illustration (tick vs. cross/ happy vs. sad face) on a form of assent. Participant anonymity was ensured through the use of numeric codes.
Participants
Selection criteria for participants
The study included 16 participants assigned to two groups using purposive sampling as the groupings were based on age. Group 1 consisted of seven 6- to 7-year-old EL2 learners. Group 2 consisted of nine 8- to 10-year-old EL2 learners. All participants were required to have English as their LoLT and Setswana as their first language. Male and female participants were selected. Participants came from similar socioeconomic backgrounds. This socioeconomic status is assumed due to all participants being selected from the same private school. Participants were aged between 6 and 9 years and 11 months at the time of the study.
Potential participants were excluded if they had structural deficits (e.g., cleft palate), hearing loss, autism spectrum disorder, a language disorder in their first language, or a previously diagnosed SSD (phonological disorder, articulation disorder, or dysarthria). Learners who displayed these challenges were excluded as the present study only aimed to determine the assessment value for typically developing EL2 speakers. These excluded learners were referred for further assessment and treatment by the qualified SLP working at the school.
Sample size and sampling method
A convenience sampling process was used. Participants were selected from a single private primary school (Grades 1–3). This private school is located in Hammanskraal, a rural area in Northern Gauteng, South Africa. The intended sample size was approximately 24 Setswana-speaking children. Unfortunately, due to limited willingness of the parents to allow their children to participate, and varying first languages in the school, only 19 children were assessed. Of the 19 participants, two were excluded from the research results as they presented with speech delays or disorders. One participant withdrew from the research due to illness. Therefore, only 16 participants were included in the results.
Data Collection
Data collection material
During the pre-experimental assessment, quantitative data were obtained from a parental questionnaire, the Test of Auditory Processing Skills–third edition (TAPS-3; Martin & Brownell, 2005), the Phonological Awareness Test–second edition (PAT-2; Robertson & Salter, 2007), the Test of Auditory Comprehension of Language–fourth edition (TACL-4; Carrow-Woolfolk, 2014), the Tswana Expressive Receptive Language Assessment (TSWERLA; Bortz, 1997), the Oral Facial Examination (OFE; Shipley & McAfee, 2004), and the Gilliam Autism Rating Scale–second edition (GARS-2; Gilliam, 2005). The experimental assessment materials consisted of the Goldman-Fristoe Test of Articulation–second edition (GFTA-2; Goldman & Fristoe, 2000) and an articulation assessment using a nonword list compiled on the Van der Merwe (2011) Speech Motor Learning (SML) website (Online Appendix A).
Data collection procedures
Pre-experimental assessment
The researcher sent out letters of consent to parents before data collection commenced. When consent forms were received, potential participants were assessed in a venue at the school. The researcher, an EL1 SLP, assessed all potential participants using the material as described. For the TSWERLA assessment (Bortz, 1997), a Setswana first language speaker prerecorded the stimuli sentences and phrases. These recordings were used by the EL1 SLP to administer and record responses to the Setswana language test. Potential participants identified as having delays or disorders were excluded from the study and referred for further testing.
Experimental assessment
During the experimental assessment, participants were assessed in the same venue by the EL1 SLP, using the material as described. The SLP transcribed each response from the auditory recording using the International Phonetic Alphabet (IPA).
Participants were assessed using the Sounds-in-Words section of the GFTA-2. If a participant presented with an SSD in their first language, their results were excluded from the study. The presence of an SSD was determined by comparing the GFTA-2 results with the participant’s speech in Setswana: a participant was excluded due to an SSD if a sound found in Setswana was produced incorrectly on the GFTA-2, the participant was not stimulable for that sound, and the same sound was produced with a difference during the TSWERLA test.
The testing procedure for the nonword test was different for the two age groups:
6- to 7-year-olds: With access to the face of the examiner, the words were read out loud once. The participant was requested to repeat the word. One request for clarification or repetition was allowed.
8- to 10-year-olds: The participants were presented with cards displaying single nonwords and were asked to read the word. This method follows that in the decoding section of the PAT-2. In cases where the participants were unable to accurately read the word, the examiner assisted the participant by “reading” the word to the participant once and then asked for oral repetition. Allowing for oral repetition follows the administration guidelines used in the GFTA-2 (Goldman & Fristoe, 2000).
Data Analysis
The results of the different tests were compared with regard to ease of first-time elicitation of the target words. In a study on dynamic assessment, it was found that children with a possible language disorder required more effort on the part of the examiner (Peña, Quinn, & Iglesias, 1992). The amount of effort needed from the examiner can be an important measure to determine the amount and intensity of mediation needed to facilitate future learning (Gutierrez-Clellen et al., 1998). In cases where the participant struggled to read the word (older participants) or asked for repetition (younger participants), the assistance with reading as well as the once-off allowed repetition was accepted as the first-time elicitation of the target words. In other words, first-time elicitation is defined by how accurately the child produced the target word when presented with the stimulus. The percentage of correct productions for each target word was calculated and statistically analyzed. This analysis made use of nonparametric methods due to the small sample size. The mean and SD were calculated, and the Mann–Whitney U test and Wilcoxon signed-rank test were run to compare the assessments. The same procedure was followed for the novel nonword list. Within- and between-participant comparisons were made and statistical differences calculated if present. Between-group comparisons were also made. The Wilcoxon signed-rank test is a nonparametric statistical test used to compare two related samples. As the same participants completed Test 1 and Test 2, this test was used to compare the difference in results. The Mann–Whitney U test is a nonparametric test used to assess for significant differences in a scale or ordinal dependent variable by a single dichotomous independent variable. It was used to compare the differences in results between the two age groups. The independent variable here is dichotomous, as the groups take on the values of either “younger group” or “older group.”
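The between-group comparison described above can be sketched in a few lines. The following is a minimal, standard-library illustration of how a Mann–Whitney U statistic is obtained from pooled ranks; the function names and the scores are hypothetical and are not the study's data or analysis code, and the sketch stops at the U statistic rather than deriving a p value.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            # Positions are 0-based, ranks are 1-based.
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U statistics for two independent groups."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Hypothetical percentage-correct scores (NOT the study's data):
# seven younger and nine older participants, mirroring the group sizes.
younger = [88, 92, 95, 90, 85, 97, 91]
older = [93, 89, 96, 94, 98, 87, 99, 90, 92]
print(mann_whitney_u(younger, older))
```

In practice a statistics package would normally be used instead; for example, scipy.stats.mannwhitneyu returns a p value along with the statistic.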
Results and Discussion
Test Comparison
Each sound with speech sound differences in either test was individually compared across the two tests. The comparison identifies the consistency of differences across the two tests, by identifying sounds with significant variation in the number of speech sound differences. This comparison was made to identify if sounds are produced incorrectly in one test but not the other and vice versa. In this way, the effect of using nonwords can be looked at for each sound. Test 1 (T1) refers to the GFTA-2 and Test 2 (T2) refers to the novel nonword list.
If the p value is less than .05, there is a statistically significant difference between the tests for that individual sound. This would mean there is a significant variation in the number of differences for that sound on the different tests. However, if the p value is greater than .05, the differences between the tests are not statistically significant. The Wilcoxon signed-rank test statistics as well as the corresponding p values are shown in Table 1.
Table 1. Comparison of Individual Sounds Assessed in the GFTA-2 and Nonword List.
Note. T1 = GFTA-2; T2 = nonword list; GFTA = Goldman-Fristoe Test of Articulation.
Significant difference (p value < .05).
As the p value for bɹ_T1 and bɹ_T2 is less than .05, the results for this sound differ significantly between the two tests. Of all the speech sound differences compared, only /bɹ/ showed a significant variation in number of differences between the two tests. This means the other speech sound differences had a similar number of variations in both tests, and the results between the two tests were statistically consistent for these sounds.
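The paired, per-sound comparison can be illustrated with a minimal sketch of the Wilcoxon signed-rank test. The version below computes an exact two-sided p value by enumerating every possible sign pattern, which is feasible for samples as small as 16 pairs. The function names and the scores are hypothetical, zero differences are simply dropped, and the sketch is not the analysis code used in the study.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def wilcoxon_signed_rank(test1, test2):
    """Return (W, exact two-sided p) for paired scores; zero differences are dropped."""
    diffs = [a - b for a, b in zip(test1, test2) if a != b]
    if not diffs:
        return 0.0, 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w = min(w_plus, total - w_plus)
    # Under H0 each rank is equally likely to carry a plus or a minus sign,
    # so enumerate every sign pattern and count those at least as extreme.
    extreme = 0
    patterns = 0
    for signs in product((0, 1), repeat=len(ranks)):
        s = sum(r for r, keep in zip(ranks, signs) if keep)
        if min(s, total - s) <= w:
            extreme += 1
        patterns += 1
    return w, extreme / patterns


# Hypothetical counts of differences on one sound per participant
# (NOT the study's data): T1 = picture-based test, T2 = nonword list.
t1_hypothetical = [0, 1, 0, 2, 1, 0, 1, 0]
t2_hypothetical = [1, 1, 2, 2, 3, 1, 1, 2]
w_stat, p_value = wilcoxon_signed_rank(t1_hypothetical, t2_hypothetical)
print(w_stat, p_value, "significant" if p_value < 0.05 else "not significant")
```

The final line applies the same decision rule as the text: a p value below .05 is read as a significant difference between the two tests for that sound.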
To identify if there is a significant difference between the speech sound differences on each test, an overall comparison was run. To compare the overall consistency of results across the two tests, the overall statistical differences between T1 and T2 are shown in Table 2.
Table 2. Comparison Between the GFTA-2 and Nonword List.
Note. GFTA = Goldman-Fristoe Test of Articulation; T1 = GFTA-2; T2 = nonword list.
Significant difference (p value < .05).
As the p value equals .020, there is a statistically significant difference between T1 and T2. Although only /bɹ/ had a statistically significant difference across the tests, the overall comparison shows that the two tests yield significantly different results. It is therefore necessary to look at the individual sound differences regardless of their significance at an individual level: even when no single sound differs significantly between the two tests, minor differences on each sound can add up to an overall significant difference across the two tests.
To more closely examine the variation between individual sounds, the percentage of differences for each sound in each test is represented in Figure 1.
Figure 1. Chart of error percentage.
Figure 1 indicates that although the variation is not significant for all sound differences, there are differences in results for each sound which, when combined, result in a significant variation between the two tests. This means that overall, there is a significant difference in the number of sound differences on each sound across the two tests.
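As a minimal sketch of how per-sound percentages of the kind plotted in such a chart could be derived, the following computes the percentage of participants who showed a difference on each sound. The sound labels, the flags, and the function name are hypothetical placeholders and do not reproduce the study's data.

```python
def percent_with_difference(flags_by_sound):
    """Map each sound to the percentage of participants flagged with a difference."""
    return {
        sound: 100 * sum(flags) / len(flags)
        for sound, flags in flags_by_sound.items()
    }


# Hypothetical flags for 16 participants (True = difference observed).
# The labels are placeholders, NOT sounds or results from the study.
hypothetical_flags = {
    "sound_A": [True] * 4 + [False] * 12,
    "sound_B": [True] * 2 + [False] * 14,
}
percentages = percent_with_difference(hypothetical_flags)
print(percentages)
```

Running the same calculation separately for each test gives one pair of percentages per sound, which is the form of comparison the chart presents.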
The null hypothesis (H0) stated that the use of novel nonword keywords instead of a picture-based SSD test such as the GFTA-2 would not yield significantly different results in the assessment of articulation skills of EL2 learners due to the lack of semantic information. This hypothesis is rejected as the results indicate that there is a significant difference in results between these two tests.
Figure 1 shows all the speech sounds which had differences and the percentage of participants who presented with these differences. These speech sound differences include /ɹ/ in all positions, /ʧ/ in all positions, /ð/ in initial and medial positions, /θ/ in medial and final positions, /ʤ/ in the medial position, and consonant clusters /bɹ/, /dɹ/, /tɹ/. Speech sounds /ɹ/, /ð/, /θ/ and consonant clusters /bɹ/, /dɹ/, /tɹ/ are English sounds that do not occur in Setswana, and they were substituted with sounds from the Setswana sound inventory. For example, the alveolar approximant /ɹ/ was substituted with the alveolar trill /r/, which occurs in Setswana. It is important to note that the mentioned sounds do not occur in Setswana, as these speech sound differences would not be present if the nonword list followed the Setswana sound system. These findings agree with the report of Shriberg et al. (2009); removing the linguistic content from a speech assessment may allow SLPs who do not speak an African language to accurately assess the participants’ articulation in their first language as they revert to using that sound system. The speech sounds /ʤ/ and /ʧ/ were also produced differently. These speech sounds do occur in Setswana. The /ʤ/ sound was changed to /j/ by one participant in the older group. The /ʧ/ sound was produced by one older participant as /k/ and by two older participants as /ʃ/. For those participants who produced the nonwords differently, the administrator gave the participants an auditory model of the word and asked for repetition. The participants were able to produce these sounds accurately once a correct model was provided. This accurate repetition supported the need for a single verbally modeled production in addition to the initially read attempt. We suggest that these speech sound differences were made not because of an articulation disorder or language difference, but rather due to limited literacy skills in their L2. Likewise, the participants produced the /ʤ/ and /ʧ/ sounds correctly in the GFTA-2, indicating production success and supporting possible reading failure.
The results of the GFTA-2 and the nonword test were compared. Only /bɹ/ showed a statistically significant difference at the level of the individual sound. However, the other speech sounds produced differently also varied between the two tests, and together these variations yield an overall significant difference between the two tests. In summary, the use of nonwords affects results overall rather than at the level of individual sounds. Van der Merwe (2011) states that the lack of semantic content is an important aspect of nonword application in the clinical setting. The present study suggests that removing semantic information induces significant differences when compared with a traditional picture-based test. The two tests contained the same speech sounds; therefore, the significant difference in results is attributed to the lack of semantic information.
The significant difference between the two tests implies that when the semantic information was removed, the participants reverted to the Setswana sound system. As a result, sounds such as /ð/ and /θ/, which the participants produced correctly in English words, were pronounced as /t/ in the nonwords. The potential purpose of using nonwords as an articulation test is to be able to assess a participant in their L1. The Setswana speakers used Setswana sounds when pronouncing the nonwords; this supports the potential use of nonwords because, without semantic information, the sounds of a specific language can still be targeted.
Age Comparisons
The participants of this study were assigned to two age cohorts to determine the functional effect for both groups. Results obtained with nonwords may differ between age groups because of differing literacy levels.
Table 3 compares the two age groups on each individual speech sound difference and reports the Mann–Whitney U test statistics with the corresponding p values.
Table 3. Comparison Between the Different Age Groups for Individual Error Sounds.
Note. Significant difference: p value < .05.
The only p value below .05, and therefore statistically significant, is that for “ɹ_Final_T2.” Thus, the younger and older groups differ significantly only for /ɹ/ in the final position in Test 2 (the nonword list).
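The group comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis: the scores below are hypothetical placeholders (1 = sound difference present, 0 = absent), the group sizes are assumed, and the exact permutation p value is one common way of obtaining a two-sided p value for small samples with ties; the article does not state which variant was used.

```python
from itertools import combinations


def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, two-sided exact permutation p value)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    offset = n_a * (n_a + 1) / 2
    u_a = sum(ranks[:n_a]) - offset
    mean_u = n_a * n_b / 2
    observed = abs(u_a - mean_u)
    # Enumerate every way of assigning n_a of the pooled ranks to group A
    # and count the assignments at least as extreme as the observed one.
    extreme = total = 0
    for combo in combinations(range(n_a + n_b), n_a):
        u = sum(ranks[i] for i in combo) - offset
        total += 1
        if abs(u - mean_u) >= observed - 1e-9:
            extreme += 1
    return u_a, extreme / total


# Hypothetical scores for one sound in one test (not data from the study).
younger = [0, 0, 0, 0, 0, 0, 0, 0]
older = [1, 1, 1, 1, 0, 0, 0, 0]

u_stat, p_value = mann_whitney_u(younger, older)
print(f"U = {u_stat}, p = {p_value:.4f}")
```

Enumerating all assignments is feasible only for small groups such as those in this sketch; for larger samples a normal approximation with a tie correction is the usual alternative.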
When assessed using the GFTA-2, none of the participants produced speech sound differences on /ɹ/ in the final position. During the nonword assessment, only participants in the older group changed this sound, replacing the approximant /ɹ/ with the trill /r/. The younger group was provided with an auditory model that used the approximant /ɹ/ and, as a result, did not switch to the trill. This implies that when the older participants read the nonwords freely (without a verbal model), many of them reverted to the Setswana speech sound inventory and produced the trill /r/. Such sound transfers from one language to another indicate a cross-linguistic effect, which occurs when a bilingual child uses a specific sound in their second language because it is part of their first language speech sound inventory (Goldstein & Gildersleeve-Neumann, 2015). In the case of the Setswana-speaking participants, cross-linguistic transfer occurred as the Setswana /r/ was used in English speech production. The results of this study support Goldstein and Gildersleeve-Neumann’s (2015) statement that although children keep the two languages separate, there is interaction between the two systems. The statistics for the overall differences between the age groups on T1 and T2 are shown in Table 4.
Table 4. Comparison Between the Age Groups When Assessed Using the GFTA-2 and Nonword List.
Note. GFTA = Goldman-Fristoe Test of Articulation; T1 = GFTA-2; T2 = nonword list.
As both p values in Table 4 are greater than .05, the younger and older groups do not differ significantly on either Test 1 or Test 2. The comparison between the two age groups provides information about the effect of removing semantic information at different age and literacy levels.
Despite the different methods used for the two age groups, there is no significant difference in results. Nonwords cannot be represented pictographically; therefore, they need to be read. As the younger group (Grade 1) does not have the necessary literacy level, the examiner is required to provide an auditory model for these participants. In assessments such as the GFTA-2, an auditory model can be used only to assess stimulability of the sound, and the error is still recorded. The lack of a significant difference between the age groups implies that the use of an auditory model for the younger group did not significantly affect the results obtained. Figure 1 suggests that the initial incorrect productions of /ʤ/ and /ʧ/ may relate to incorrect reading of these sounds. It is therefore important to consider having all children repeat after an auditory model rather than asking older children to read the nonwords. In other words, because of the inadequate literacy levels of the children assessed in this study, it may be necessary to remove literacy skills from the assessment. Further research is needed to address this finding because, although the comparison between the age groups indicated no significant difference, use of an auditory model may affect the validity of the assessment.
Limitations or Weaknesses
Because of the explicit inclusion criteria for this study (language background and matched socioeconomic status, determined by attendance at the same school), the pool of potential participants was very limited. This pool was further reduced by parents’ unwillingness to allow their children to participate. As a result, this study has a very small sample size, and the results may not be representative of a larger portion of the target population. In addition, other factors may have limited the strength of the results. The elicitation methods varied from picture naming to reading to repetition, and the canonical structure of the test stimuli varied between the two test conditions. Furthermore, there was a small difference in the number of stimulus items between the two test conditions. Finally, the sound differences resulting from second language influence were not calculated as errors.
Recommendations for Future Studies
If the nonword list is created using only Setswana speech sounds, and aligned with the linguistic rules of this language, these speech sound differences would not be present. The results indicate that in the presence of SSD, the speech sound differences found in the GFTA-2 would also be present when assessed using a nonword list. If the participants’ results on the nonword list were interpreted using the Setswana sound system, no errors would be identified. This would reduce the risk of overdiagnosing children as having SSD in the presence of a speech difference.
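The interpretation step described above, in which a production is read against the L1 sound system before it is counted as an error, can be illustrated with a short Python sketch. It is hypothetical throughout: the function name and labels are invented for illustration, and the inventory passed in is deliberately partial, containing only sounds that this article itself describes as occurring in Setswana. A usable tool would need the complete Setswana inventory and clinical judgment.

```python
# Partial, illustrative inventory: only sounds this article mentions as
# Setswana sounds. This is NOT a full Setswana phoneme inventory.
PARTIAL_SETSWANA_INVENTORY = {"r", "t", "ʤ", "ʧ"}


def classify_production(target, produced, l1_inventory):
    """Label one production relative to the speaker's L1 sound system.

    "correct"    - the target sound was produced as modeled
    "difference" - the target is absent from L1 and was replaced by an L1 sound
    "review"     - anything else; needs follow-up (e.g., an auditory model)
    """
    if produced == target:
        return "correct"
    if target not in l1_inventory and produced in l1_inventory:
        return "difference"
    return "review"


# Substitutions reported in the article, used here as example inputs.
examples = [("ɹ", "r"), ("θ", "t"), ("ʧ", "ʃ"), ("ʤ", "ʤ")]
for target, produced in examples:
    label = classify_production(target, produced, PARTIAL_SETSWANA_INVENTORY)
    print(f"/{target}/ -> /{produced}/: {label}")
```

Under this sketch, replacing /ɹ/ with the trill /r/ is labeled a difference rather than an error, whereas a change to a sound that does exist in Setswana, such as /ʧ/, is flagged for follow-up, in line with the article's suggestion that those productions reflected reading rather than articulation.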
In addition, when the nonword list was used to assess older children, they were required to read the nonwords; therefore, some of the speech sound differences resulted from literacy errors rather than articulation difficulties. More research is needed to identify the effect of having EL2 learners read nonwords, as well as the validity of the assessment when an auditory model of the nonwords is used. Furthermore, some languages internationally have not been included in the norming samples of traditionally used standardized tests. Therefore, research into the use of nonwords with other language groups is recommended.
Finally, the limitations mentioned previously should be considered in future studies. The challenges of using nonwords as stimuli with African languages whose structures differ from that of English call for rigorous research designs.
Conclusion
This investigation shows that speech assessment results of EL2 learners differ when nonwords are used instead of the GFTA-2 as assessment items. We suggest that these differences may be directly related to the lack of semantic information because the GFTA-2 and the novel nonword list contain the same speech sounds in the same structure; the only difference is the lack of semantic information. In addition, when assessed using nonwords, EL2 learners reverted to the sound system of their L1. We propose that the use of a novel nonword list containing only speech sounds found in the participants’ L1 may offer improved articulation assessment opportunities for the diagnosis of SSD. Further research is needed into this potential use of nonwords.
Supplemental Material
Supplemental material, Appendix_A for The Use of Nonword Keywords in the Speech Assessment of English Second Language Learners by Lauren Ross, Salomé Geertsema, Mia le Roux and Marien Alet Graham in Communication Disorders Quarterly
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
References
