Abstract
The search for gender equality in language use is one of the most frequently cited cases of linguistic democratization (e.g., Farrelly & Seoane 2012:394). At the grammatical level, this process implies, for example, that pronouns such as generic
1. Introduction
Pronouns used to refer to a gender-less antecedent (or to one whose gender is irrelevant) are usually called “epicene pronouns” (e.g., Baron 1981; Newman 1997). Because English does not have specific epicene pronouns (Baron 1981:85), speakers may resort to generic
(1)
(2) <[> They will not </[> </{> ordinarily occur <,> in
(3) You’ve told us the meaning of <.> secre </.> a secretor. That is
The choice of any of these pronouns is closely related to democratization processes in that generic
Epicene pronouns have tended to be studied from either of the two following perspectives: (i) a strictly linguistic approach that considers the role played by language-internal variables such as the collective or individualized meaning of the antecedent (e.g., Newman 1997; Laitinen 2002), or (ii) a social approach to pronominal variation based on the role played by prescriptive grammar and feminism in the use and expansion of the different forms (e.g., Pauwels 2001; Balhorn 2009; Paterson 2014). The view adopted here leans more clearly towards the second perspective, in that one of the aims is to determine the role played by democratization as a factor conditioning the use of non-sexist pronouns in three Asian varieties of English, although language-internal variables will also be considered in the analysis. In addition, the adoption of a register approach (e.g., Biber 1988), based on the analysis of corpora (using the International Corpus of English, or ICE), will allow us to see how democratic options (
The term and concept of democratization, described, for example, in Farrelly and Seoane (2012), will need to be used here with caution, as it is a western-centric concept applied to three eastern communities (see Loureiro-Porto & Hiltunen, this issue, for a discussion). In fact, when the term is found in the literature on World Englishes, it can refer to a wide variety of social and socio-political phenomena, such as the writing of a constitution (Schneider 2007); democratization is only occasionally used to refer to the kind of language change described in this issue (e.g., Wasserman & van Rooy 2014; Hackert & Deuber 2015). In order to properly interpret the (social and language-internal) factors conditioning the use of democratic epicene pronouns, the three Asian Englishes will be examined from different perspectives that allow us to capture their singularity.
The focus of this study, then, will be the variation between
RQ1: Is the distribution of epicene pronouns in these three Asian varieties similar to that of their matrilect, i.e., BrE?
RQ2: What does register variation tell us about the spread and diffusion of democratic epicene pronouns (
RQ3: To what extent can the variation between epicene pronouns be explained in terms of socio-cultural phenomena related to each territory explored in this paper?
The paper is structured as follows. Section 2 describes the theoretical framework. Section 3 presents the methodology and the decisions it involved. Section 4 analyzes the data, taking into account both language-internal and language-external factors. Finally, section 5 discusses the main findings and draws some conclusions.
2. Theoretical Background
2.1. Epicene Pronouns: Diachrony and Synchrony
The use of epicene pronouns in English is closely related to the loss of grammatical gender. The traditional view states that, while in Old English times pronouns agreed with the grammatical gender of the antecedent, the leveling of inflections during the Middle English period ushered natural gender (sex) into the linguistic arena (Curzan 2003:42-46).
Nonetheless, as early as Old English, examples of generic
(4) Ne fornime
‘Do not,
(5) weila he seið wa is me þt he oðer heo habbeð swuch word icaht . . . þt is muchel sorhe. for i feole oðer þing he oðer heo is swiðe to herien . . . (Ancrene wisse 47)
‘“Oh dear,” he [the backbiter] says, “Woe is to the one that he or she has caught such talk . . . that is a great sorrow, for in many other ways he or she is greatly to be praised”’ (from Curzan 2003:68; also quoted by Balhorn 2004:96)
(6) Gif oxa ofhnite
‘If an ox gores
Examples such as (6) have led some authors to claim that singular
(7) That
With hem and eek to sellen hem hir ware. (Man of law’s tale 139-140)
‘That everyone wanted to buy from them and also to sell them their merchandise.’ (Balhorn’s [2004:93] paraphrase)
Though these first examples of singular
The end of the eighteenth century saw the publication of prescriptive grammars that clearly proscribed the use of singular
Balhorn (2004) explores BrE by searching the OED for pronouns referring to every-compound antecedents and finds a steady increase of singular
Similar results are found for AmE and AusE. Pauwels (2001) found that singular
2.2. Asian Englishes
Despite the attention that epicene pronouns have received in inner-circle varieties of English, their status in the outer circle remains largely unexplored. For this reason, this paper focuses on three outer-circle (Asian) varieties: HKE, IndE, and SgE. The three share several characteristics, which makes them comparable for a study like this. To begin with, all three have BrE as their matrilect. Further, most of the native languages spoken in these territories exhibit gender-neutral third person singular pronouns, as seen in Table 1. Therefore, language contact effects are unlikely to account for differences among the three Englishes analyzed (as also mentioned in Loureiro-Porto 2020:195).
Table 1. Third Person Singular Pronouns in Asian Substrates (adapted from Loureiro-Porto 2020:195)
This column refers to the English varieties which are potentially affected by the substrates listed in the first column. By no means does it mean that the pronouns in the third and fourth columns are used in any Asian variety of English.
Lastly, the three varieties are documented in the International Corpus of English (ICE) project, and corpora for each of them were compiled in the 1990s and include texts that date from 1990 or later, which allows for synchronic comparisons of the data. The structure of ICE corpora, including twelve different registers (see section 3), also allows for the analysis of register variation, something that has proved to be very significant in the study of World Englishes. In fact, register has come to be considered as significant a variable as region (Kruger & van Rooy 2018:231) and, notably, the registers in the different ICE corpora have been found to be very similar to one another (Kruger & van Rooy 2018:237), which guarantees that the results of the present study will not be biased by corpus compilation decisions.
Notwithstanding these similarities, the three varieties differ in a number of ways, such as the date at which English was first spoken in each territory. It reached India in the 1600s when the country became a trade colony; it entered Hong Kong only in the 1840s, “in the wake of the first Opium War” (Schneider 2007:133); and it arrived in Singapore in 1819, although Singapore did not become a British colony until 1867 (Deterding 2007:2). Following the introduction of English in these territories, the sociolinguistic situation gave rise to two different types of varieties, according to eWAVE (Kortmann & Lunkenheimer 2013). Both HKE and IndE are indigenized L2 varieties, that is, English was introduced in the colonial era via the educational system and is still used in education and other official domains, there never having been a significant number of L1 speakers. As for SgE, eWAVE only records Colloquial Singapore English, classifying it as a high-contact L1 variety, spoken by an ethnically mixed population who consider themselves native speakers of a local variety. No classification is found for Standard Singapore English (SSE).
In addition, the three varieties differ regarding their evolutionary phase, according to Schneider’s (2007) Dynamic Model of postcolonial Englishes, which consists of five evolutionary phases: (1) Foundation (settlers’ establishment), (2) Exonormative stabilization (English is stabilized in the territories, according to British rules, although the lexicon incorporates localisms), (3) Nativization (mixed codes are used and grammar starts to diverge from BrE), (4) Endonormative stabilization (after political independence, speakers are aware of the idiosyncrasies of their own variety and, on occasions, dictionaries are published), and (5) Differentiation (new varieties emerge). According to Schneider (2007:135-139), HKE is the least advanced of the three varieties here, in that it entered phase 3 in the 1960s and shows no traces of having entered phase 4. Following HKE is IndE, which according to Schneider (2007:171) shows little evidence of having entered phase 4, although Mukherjee (2007:182) claims that it is endonormatively stabilized. Finally, SgE is the most advanced variety, having entered phase 4 in the 1970s (Schneider 2007:155-161).
At the social level, as Brooks (2007:3) has noted, there are regional disparities concerning gender equality in the Asian region, with Hong Kong and Singapore being said to have “a high level of social development.” In fact, Hong Kong has undergone the consequences of intensive industrialization since the 1970s, when “increasing numbers of women entered the labour force” (Göransson 2010:199-200), while Indian women and girls are considered “preservers of the ethnic and religious authenticity [. . .] the bearers of traditional values that are to be inculcated into future generations” (Rydstrøm 2010:11). In such a scenario it is not surprising that the presence of a women’s movement in these territories differs considerably. Thus, because Hong Kong remained a British colony until 1997, the women’s movement in Hong Kong broadly mirrors that of western feminism (Lim 2010:144). Singapore, meanwhile, has enjoyed its own strong feminist tradition since the late 1800s, which made possible the emergence of several feminist societies, such as the Singapore Council of Women (after WWII), the National Council of Women (1975), and the Association of Women for Action and Research (1985) (Lyons 2010:76-80). The autonomous feminist movement in India, however, is said to have developed in the 1970s (Madhok 2010:225), yet by the 1990s the situation of women in public was the following:
1990s woman can step out, but she must demonstrate moral purpose—she is off to work, to buy food, to pick up the kids from school and so on. She must also perform respectability on the street, by avoiding eye contact, looking busy, moving with purpose. We have many reports of sexual harassment, of the difficulties for women of renting property and living without a man or family. (Osella 2017:230)
Nonetheless, it should not be forgotten that all these approximations to gender equality in Asia are made from a western perspective, while the true gender picture in Asia is in fact far more complex, as seen, for example, in the challenge to binary gender represented by Indian hijras, a religious community of individuals who embody male and female characteristics, who prefer to be referred to in the feminine gender, and who belong to an institutionalized third gender (Nanda 1999:ix-xiv). In addition, in 2009 a third gender (beyond the male/female distinction) was made available on ballot forms by the Indian election commission (Osella 2017:241). Thus, any working sociological description of gender equality in these Asian territories can only be taken to be superficial, as a means of contextualizing the possible role of social forces in the use of democratic epicene pronouns.
3. Methodology
The methodology adopted for this study is corpus-based, and the corpora chosen for the analysis of the three Asian varieties are the HKE, IndE, and SgE components of ICE (ICE-HK, ICE-IND, and ICE-SIN, henceforth). As stated, the three corpora were released in the 1990s and the texts they include were produced in that decade. Although this material is more than twenty years old, these are the most recent carefully curated corpora of the varieties under analysis, and their advantages outweigh their limitations. For example, they enable synchronic cross-linguistic comparisons and, as Loureiro-Porto (2017) shows, they have proven to be tidier than larger, more recent corpora such as GloWbE (Davies 2013).
The ICE project aims to provide comparable representative corpora of varieties of English throughout the world (Greenbaum 1996). Each ICE corpus consists of one million words (60 percent of spoken material, 40 percent of written material) in twelve broad registers, as shown in Table 2.
Table 2. Registers Included in ICE
None of the three corpora is tagged or parsed. Therefore, examples were retrieved using AntConc (Anthony 2014) and then manually pruned. The word forms searched for include all word forms of the lexemes
Table 3. Scrutinized Forms per Variety
Pruning these examples implied much decision-making having to do with (i) deciding which antecedents allow for variation between generic
Regarding the first of these issues, it was decided that there are probably cultural reasons why generic
Identifying the antecedent of a pronoun often requires the analysis of an extensive context, because a short one may lead to misinterpretations. On occasions, however, an extended context is not available. Where a pronoun appeared in a context in which the antecedent could not be retrieved, it was not included in the data.
The second important issue to take into account when pruning these data concerns the variable morphology in outer-circle varieties of English, where plural -s is not as systematically used as in Standard English (Mesthrie & Bhatt 2008:52). This makes it difficult to establish whether an antecedent is singular or plural:
(8) Now if we observe L two learner
(9) And
(10) When a student go over to France to do a degree perhaps in business or even in engineering how marketable are their degrees. (ICE-SIN:S1B-049)
In sentence (8) the first italicized they is not considered to be an epicene pronoun, despite its antecedent being second language learner, in that the following context suggests that learner is actually an unmarked plural form. Likewise, in (9) the other composer cannot be taken to be a clear singular antecedent, because the verb go does not agree with a singular subject. Something similar happens in (10), where a student, which seems to be undoubtedly singular because of the presence of the determiner a, is followed by the base form go, leaving room for an ambiguous interpretation. For the sake of rigor, and with the aim of minimizing errors, cases such as these were omitted from the analysis, and only very clear instances of epicene antecedents (such as those in 1-3 above) were included in the data.
A total of 2120 tokens of epicenes were found (see Table 4). All of these were then entered into a database including the variables below, which are analyzed through cross-tabulations and descriptive statistics:
Table 4. Frequency according to Variety
epicene pronoun: he, they, he or she
variety: HKE, IndE, or SgE
register: the twelve registers included in ICE
antecedent: indefinite pronoun, noun phrase (NP) with quantifier, indefinite NP, definite NP
speaker gender (only for HKE and IndE): woman or man, as found in the ICE metadata
speaker age (only for HKE and IndE): age range assigned to speakers in the ICE metadata
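The cross-tabulations mentioned above can be illustrated with a minimal Python sketch. The token records below are invented placeholders, not data from the ICE corpora, and the field names (`variety`, `pronoun`) and the function name `crosstab` are assumptions made only for this illustration; the study's own database and tools are not described in this detail.

```python
from collections import Counter

# Hypothetical token records (NOT taken from the ICE corpora): each one
# pairs a variety with the epicene pronoun attested in that token.
tokens = [
    {"variety": "HKE", "pronoun": "he"},
    {"variety": "HKE", "pronoun": "they"},
    {"variety": "HKE", "pronoun": "he or she"},
    {"variety": "IndE", "pronoun": "he"},
    {"variety": "IndE", "pronoun": "he"},
    {"variety": "SgE", "pronoun": "they"},
]


def crosstab(records, row_key, col_key):
    """Count how many records fall into each (row, column) cell."""
    return Counter((r[row_key], r[col_key]) for r in records)


table = crosstab(tokens, "variety", "pronoun")
for (variety, pronoun), n in sorted(table.items()):
    print(f"{variety:5} {pronoun:10} {n}")
```

The same function could, under these assumptions, cross-tabulate any two of the variables listed above, for example `crosstab(records, "register", "pronoun")` if each record also carried a `register` field.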
4. Data Analysis
4.1. General Results
The 2120 tokens of epicene pronouns in the three Asian varieties under investigation are distributed as shown in Figure 1 (which is a graphic representation of Table 4). Figure 1 shows a marked and significant difference between HKE, on the one hand, and IndE and SgE, on the other, the former exhibiting an overall more democratic picture than the latter, with 43.5 percent of democratic epicenes, i.e., singular

Figure 1. Frequency of Epicene Pronouns per Variety
4.2. Type of Antecedent
This section explores the type of antecedents of the three epicene pronouns analyzed, following the classification used by Paterson (2014:50; cf. Paterson, this issue), which presents antecedents on a scale from most indefinite to most definite:
(i) indefinite pronoun (e.g., somebody, everyone), as in (11).
(11) Now when
(ii) NP with quantifier (e.g., any person, each child, every teacher, no student, some assistant), as in (12).
(12) So surely there’s no reason to assume that
(iii) indefinite NP (e.g., a person, another child), as in (13).
(13) I also felt always <,> that
(iv) definite NP (e.g., the person, her friend, the professor’s student), as in (14).
(14) If
Type of antecedent is a key factor in the analysis of epicene pronouns from the perspective of democratization, because there are historical reasons why singular antecedents with a potential plural referent may foster the use of
(15)
Therefore, clear evidence for democratization must be sought in antecedent types (iii) and (iv), namely indefinite and definite NPs.
Figure 2 sets out the distribution of epicene pronouns per variety, taking into account the four types of antecedents. It shows that the frequency of democratic pronouns is consistently lower with antecedent types (iii) and (iv) than with antecedents (i) and (ii) in all three varieties (35.4 percent in HKE, 9.6 percent in IndE, and 9.9 percent in SgE, which comes ahead of IndE when purely democratic antecedents are considered). This finding is not surprising and goes hand in hand with data for BrE for the twenty-first century, as shown by Paterson (2014:60). She finds that while indefinite pronouns select singular

Figure 2. Type of Antecedent per Variety and Pronoun (Raw Frequencies)
4.3. Register Analysis
This section presents the distribution of the three epicene pronouns in the twelve registers included in ICE, reproducing the text codes used in the project (briefly, codes beginning with S indicate spoken registers, and those beginning with W indicate written registers; see Table 2 above for further information). As seen in Figures 3-7, the frequencies of epicene pronouns in some registers are very low (because the size of each corpus section is small). For that reason, the subsequent analysis will mainly focus on general tendencies, rather than on specific fine-grained differences between registers. Figure 3 shows the first overview of the data. Consistent with the results seen so far, a clear difference is observed between HKE, on the one hand, and IndE and SgE, on the other: HKE exhibits democratic epicene pronouns across all registers, whereas the presence of democratic options in IndE and SgE is restricted to certain registers. Particularly striking in IndE and SgE is the frequency of democratic epicene pronouns in W2F (fiction) and, in the case of SgE, in S1A (private dialogues). The fact that fiction often exhibits features of spoken registers has been explained as a result of the tendency for it to contain (informal) dialogue (e.g., Kruger & van Rooy 2018:231). Given that in section 4.2 we have seen that the type of antecedent is an important factor in measuring the democratic value of singular
Figure 3. Distribution of Epicene Pronouns across ICE Registers per Variety (Raw Frequencies)
Figure 4. Distribution of Epicene Pronouns in ICE-HK Registers: Comparison between All Types of Antecedents and Only Antecedents (iii) and (iv) (Raw Frequencies)
Figure 5. Distribution of Epicene Pronouns in ICE-IND Registers: Comparison between All Types of Antecedents and Only Antecedents (iii) and (iv) (Raw Frequencies)
Figure 6. Distribution of Epicene Pronouns in ICE-SIN Registers: Comparison between All Types of Antecedents and Only Antecedents (iii) and (iv) (Raw Frequencies)
Figure 4 shows the results for HKE, where the picture does not differ much if we compare all antecedents with antecedents (iii) and (iv). Beginning with spoken registers (S1A to S2B), we see a clear constant decrease in the percentage of singular
Figure 5 shows that IndE spoken registers also exhibit a decreasing percentage of singular
As shown in Figure 6, the situation in SgE does not change much when all types of antecedents are compared to antecedent types (iii) and (iv). If anything, the elimination of antecedents (i) and (ii) from the analysis makes even more evident the sharp division between S1A (private dialogues) and all other registers, where democratic pronouns are seldom used. In addition to the decreasing proportion of singular
Summing up, although the frequencies of epicene pronouns in some registers are too low to support fine-grained conclusions, the register analysis of these three varieties yields interesting results that point towards general tendencies. Firstly, the three varieties exhibit a higher incidence of democratic epicene pronouns in spoken registers, particularly in the most spontaneous ones, and in fiction, which is expected to include much conversation (e.g., Kruger & van Rooy 2018:231). Secondly, in HKE the use of democratic epicene pronouns is spread across all registers, and singular
4.4. Speaker Age and Gender
This section considers two external variables that may help explain the variation in epicene pronouns in HKE and IndE, namely the
The registers for which metadata is available on the speaker’s gender and age are the spoken ones, namely S1A (private dialogue), S1B (public dialogue), S2A (non-scripted monologue), and S2B (scripted monologue).
The dichotomous approach to gender, differentiating only male and female speakers, represents a simplification of reality and excludes speakers who associate with a third gender, such as the hijras in India. Nonetheless, because this religious group has only gained democratic rights recently (2009, according to Osella 2017:241, as seen in 2.2), they are highly unlikely to have been informants in the compilation of ICE-IND, which took place in the 1990s and which stipulated that authors and speakers had to have been educated through the medium of English (according to the ICE compilation criteria). Other gender identities not culturally recognized in India and Hong Kong might, however, have been included in ICE corpora without proper labeling. Unfortunately, that is something that cannot be known, and the analysis of the data must be conducted without forgetting that such simplification of reality is unavoidable in corpora compiled in the 1990s.
ICE-HK and ICE-IND classify age groups differently. Since the original classification is followed here, age groups will vary from one variety to another.
The number of speakers per age group is very unbalanced. However, the rate of epicene pronouns will be presented in percentage form, which normalizes the frequency of each of the three available forms.
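To see why percentages make groups of unequal size comparable, consider the following minimal sketch. The group labels and counts are invented for illustration and do not come from the ICE metadata; the function name `percent_distribution` is likewise an assumption.

```python
def percent_distribution(counts):
    """Convert raw pronoun counts for one speaker group into percentages."""
    total = sum(counts.values())
    if total == 0:
        # A group with no tokens has no meaningful distribution.
        return {form: 0.0 for form in counts}
    return {form: 100 * n / total for form, n in counts.items()}


# Hypothetical raw frequencies for two groups of very different size.
groups = {
    "younger": {"he": 10, "they": 25, "he or she": 15},
    "older": {"he": 2, "they": 1, "he or she": 1},
}

for label, counts in groups.items():
    shares = percent_distribution(counts)
    print(label, {form: round(p, 1) for form, p in shares.items()})
```

In the invented "older" group, a single token shifts a share by 25 percentage points, which is why percentages based on low raw frequencies call for cautious interpretation.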
Figures 7a and 7b show the percent distribution of epicene pronouns in HKE according to gender and age groups. Figure 7a includes all types of antecedents, while Figure 7b focuses exclusively on antecedents (iii) and (iv). Overall, the HKE picture does not point to any clear tendency, although the youngest groups (less than thirty years old) use democratic pronouns more often than generic
Figures 8a and 8b show the picture for IndE. These figures reveal that, contrary to what happened in IndE written registers (see section 4.3), coordinate


Summing up, the HKE data do not point towards any clear tendency, other than that younger speakers use democratic pronouns more often than generic
5. Discussion and Conclusions
Based on the HKE, IndE, and SgE ICE corpora, this paper has explored the variation between three possible epicene pronouns in 1990s data, one considered sexist, namely generic
Several hypotheses can be explored to account for this ranking, the first of which involves the specific phase of development of each variety according to Schneider’s (2007) Dynamic Model. As seen in section 2.2, the most advanced variety is SgE, the least advanced one is HKE, and IndE lies somewhere in between. There is, then, no correlation between degree of progress in postcolonial terms and a preference for democratic epicene pronouns. This seems to be an important finding, because it has been shown that the developmental phase does indeed correlate with degree of informality. As Kruger and van Rooy (2018:237) observe, “the more advanced non-native varieties become along the stages of the Dynamic Model (Schneider 2007), the more they resemble native varieties in their use of informal features in written registers.” However, HKE has been shown to use democratic pronouns in written registers to a far greater extent than the other Asian varieties, although the latter are more advanced in Schneider’s (2007) model. The same applies to the classification of World Englishes as indigenized L2 varieties (HKE and IndE, according to Kortmann and Lunkenheimer’s 2013 eWAVE) and high-contact L1 varieties (which would be the case of Colloquial Singapore English, CSE). According to Kruger and van Rooy (2018:231), the lowest degree of informality is to be found in high-contact varieties, which does not help to explain the lowest position of IndE in the ranking here. Therefore, perhaps democratic epicene pronouns cannot be interpreted as an informal feature, an assumption which would help disentangle democratization from phenomena such as informalization, although in fact these have been claimed to be closely related (Farrelly & Seoane 2012:393).
Another hypothesis that might explain the differences between the varieties is substrate influence. However, as shown in section 2.2, the languages that English enters into contact with in these three territories are very similar, in the sense that they all have a gender-neutral third person singular pronoun in their inventory, which would foster the use of democratic options rather than generic
In fact, converging evidence is found in Loureiro-Porto (2019), a study on the variation regarding another democratization marker (Farrelly & Seoane 2012:393), namely the replacement of modal must with semi-modals have (got) to, need to, want to. Based on a series of morphosyntactic and semantic features related to grammaticalization, the study shows a ranking of Asian varieties of English according to which HKE is more advanced than SgE, which in turn is ahead of IndE (Loureiro-Porto 2019:134). The correlation between that ranking and the one presented here for epicene pronouns seems to suggest that democratization is at work at different levels in Asian Englishes. In the specific case of epicene pronouns, in addition, there are social reasons, seen from a feminist perspective, that help explain why HKE exhibits the highest rate of democratic epicene pronouns across registers. In fact, different sources point to a higher level of social development in Hong Kong than in other Asian territories (Brooks 2007:3; Lim 2010:144), while the feminist movement in India only emerged in the 1970s, and by 1990 inequality was still very much in evidence in the public role of women (Madhok 2010:225). The differences between HKE and IndE regarding gender equality in language, then, seem to correlate with gender equality in society.
In addition to these general results, this paper also adopted a register approach so as to determine the paths followed in the diffusion of democratic epicene pronouns in World Englishes. The three main findings that emerge are consistent in the analysis of the three varieties. First, singular
Two language external factors were taken into account for HKE and IndE, based on the metadata that accompany ICE corpora. The variables analyzed were
Summing up, the analysis presented here allows us to answer the three questions posed in the introduction. Thus, the presence of democratic epicene pronouns in Asian Englishes does not mirror that of BrE in the 1990s, although HKE is far closer to its matrilect than IndE and SgE (RQ1). The register approach clearly shows that singular
Appendix
This appendix includes the tables corresponding to Figures 7a and 7b (Table A1) and 8a and 8b (Table A2). Because the raw frequencies of some pronouns in each age group were on occasions very low, no statistical test could be used.
Acknowledgements
I am grateful to the reviewers and Turo Hiltunen for their very insightful comments. The usual disclaimers apply.
Funding
The author received financial support for the research, authorship, and/or publication of this article from the Spanish Ministry of Economy and Competitiveness (grant FFI2017-82162-P) and the Academy of Finland (decision 258434).
