Abstract
Traditional Chinese characters are usually written with at least as many strokes as simplified characters, but there are Chinese characters with fewer strokes in their Taiwanese traditional forms than in their Mainland Chinese simplified forms. This study lists these characters and gives the reasons for the differences between their traditional and simplified forms. These reasons will lead to the conclusion that there are motivations for orthographic choices in Mainland China and Taiwan that extend beyond the desire for standardization or simplification. These motivations have been explicitly expressed in texts dictating the rules for standard character forms, but are little known and rarely discussed. This study brings these motivations to the forefront and shows the effects they have had on orthographic policy, in particular where differing motivations between Mainland China and Taiwan have caused Taiwanese traditional characters to consist of fewer strokes than their simplified equivalents. The root cause underlying the different choices made by the Mainland Chinese script authorities and their Taiwanese counterparts post-1949 is a difference in ideology. In Mainland China, the script was a tool to be employed in such a way as to achieve a practical goal, namely mass literacy. Since achieving this goal was thought to require or at least be facilitated by simplifying the script, that course of action was taken. In Taiwan, however, the goal of script policy is to preserve the traditional way of writing characters. In practice, this means that unlike simplified characters, Taiwanese traditional characters are preferably made to structurally conform to their seal script forms in the Shuōwén Jiězì, leaving certain Taiwanese traditional characters with lower stroke counts than their equivalent simplified forms.
Thus, Taiwanese and Mainland Chinese character forms are different because the Taiwanese and Mainland Chinese script authorities ultimately want different things from their scripts.
Introduction
It is conventional wisdom that simplified characters, the characters used predominantly in Mainland China, have fewer strokes than traditional characters, the characters that were formerly standard in Mainland China and still are in places such as Taiwan and Hong Kong. After all, reducing the number of strokes in characters was one of the primary methods of simplification used in the Mainland Chinese simplification programme. And yet careful examination of contemporary Taiwanese traditional and Mainland simplified characters reveals a paradox: there are traditional characters that have fewer strokes than the simplified forms of those same characters. In this study, I will list these anomalous characters and explain the reasons for their existence. These reasons will lead to the conclusion that there are motivations for orthographic choices in Mainland China and Taiwan that extend beyond the desire for standardization or simplification. These motivations have been explicitly expressed in texts dictating the rules for standard character forms, but are little known and rarely discussed. This study brings these motivations to the forefront and shows the effects they have had on orthographic policy, in particular in the cases of differing motivations between Mainland China and Taiwan that have caused the paradox that is the topic of this paper.
I will begin by giving a brief overview of the Mainland Chinese simplification process and orthographic policy in Taiwan after 1949, followed by my methodology for finding characters that meet the criterion of inclusion in this study, namely that their Taiwanese traditional forms consist of fewer strokes than their equivalent Mainland simplified forms. I will then list the forms that meet this criterion, which I have divided into six lists of characters in three categories: characters relevant due to differences in variant elimination (Table 1), characters relevant due to containing a form in Table 1 as a component (Table 3), and characters relevant due to stroke contraction (Tables 5, 6, 8 and 9). Each list comes with a brief description, and each category is followed by some insight into the situation preceding the Mainland–Taiwanese split in orthographic standards and my analysis of the orthographic principles that have caused this paradoxical situation to occur. Finally, I will briefly discuss some of the inconsistencies in characters’ forms and stroke counts that I have encountered while carrying out this research, provide an estimate of the frequency of this type of character, and describe how the character forms that are the topic of this study are encoded in Unicode.
In this article, I will use the term ‘simplified’ (jiǎntǐ 简体) to refer to all the characters that are currently considered standard forms in Mainland China. This includes graphs that have not been changed during the simplification movement of the 20th century, which are therefore ‘simplified’ only in the sense that they are part of the simplified character standard, not in the sense that those graphs themselves have been subject to simplification. I will generally use the term ‘simplification’ (jiǎnhuà 简化) in the way in which it has been used by those who carried out the reform of Chinese characters in Mainland China after 1949, so that ‘simplification’ is used to mean the reduction of the number of strokes in a character or the replacement of a character with another character that has the same pronunciation, thereby reducing the number of characters in use (Chen, 1999: 157). In reality, these are two different types of simplification, since the former simplifies at the level of the individual graph, whereas the latter simplifies at the level of the writing system as a whole. Simplifications at the level of the individual graph can cause complications at the level of the system as a whole, and vice versa. Thus, whether or not the act of reducing the number of strokes in a character or reducing the number of characters in use should be considered a simplification is itself debatable. By attaching more meanings to a character, as is the case when a character is replaced by a homophonous character that already has a meaning of its own, it may become harder to understand the intended meaning of the character in a specific context, in effect also making the character harder to read correctly (Handel, 2013: 40–41). While reducing the number of characters in use has resulted in certain characters being relevant to this study, the focus of this article lies at the level of the individual graph. 
As for reducing the number of strokes in characters, it has been argued that it causes more characters to look alike, making them harder to distinguish and thus harder to read (Chen, 1999: 158, 160). Handel (2013: 41) points out a more fundamental problem with equating stroke reduction to simplification: “using stroke number as the metric by which to judge the efficacy of simplification […] is based on a fundamental misjudgment about Chinese characters: namely, that the stroke is the basic cognitive unit by which script users learn and remember characters.”
A character or character component with more strokes is not automatically more difficult to learn or to remember than a character component with fewer strokes, so stroke reduction does not automatically make a character or component simpler. At the heart of these criticisms of simplification lies the difference in ease of use of characters between the writer and the reader: making a character simpler to write can make it harder to read, so that the term simplification may be applicable to the process of writing a character, but not necessarily to its reading. Furthermore, the way in which the term ‘simplification’ is used in this study is not the only way in which it can be used, for any change made to a character that the one who initiates the change considers to make the character simpler can be deemed a simplification. For example, changes to establish a perfect correspondence between phonetic elements in characters and their pronunciation can be considered simplifications, irrespective of their effect on stroke totals.
In this study, I will use the term ‘pre-simplified traditional’ to refer to the standard characters that were used in Mainland China up until the simplification programme of the second half of the 20th century. Traditional characters are still sometimes used in Mainland China, but since they are no longer the orthographic standard for contemporary characters, that use is not taken into consideration in this study. Conversely, I will use ‘Taiwanese traditional’ to refer to the standard characters that are presently used in Taiwan. By ‘relevant characters’ I will mean characters that have fewer strokes in their standard contemporary Taiwanese traditional forms than in their equivalent Mainland simplified forms, and therefore meet the criterion of inclusion in this study, regardless of the cause of this situation.
Simplification and standardization in Mainland China and Taiwan
From 1949, two distinct points of orthographic standard-setting existed in Mainland China and on the island of Taiwan. The orthographic standards set by the script authorities of these two entities quickly began to diverge, as Mainland China embarked on a process of character simplification while the Taiwanese script authorities avoided large-scale prescriptive changes and kept the vast majority of character forms the same as they had been before the split. The Mainland Chinese drive for character simplification was aimed at aiding the spread of literacy to the population at large (Wiedenhof, 2015: 394). This goal was motivated in part by the Mainland Chinese government’s communist ideology. Work on the simplification of Chinese characters in Mainland China began almost immediately after the founding of the People’s Republic of China in 1949, with the Ministry of Education circulating a list of over 500 simplified characters for discussion in 1950 and with the setting up of the Committee on Script Reform in 1952 (Chen, 1999: 155). Mainland Chinese policy to standardize the character-writing system and reduce the number of characters in use first came into force in 1955 with the abolition of 1053 variant character forms in the Dì Yī Pī Yìtǐzì Zhěnglǐ Biǎo 第一批异体字整理表 (First Batch of Tabulated Variant Forms of Chinese Characters, hereafter: Yìtǐzì Biǎo) (Chen, 1999: 154; Zhongguo, 1955, 1956, 1986, 1988). The 1956 publication of the first Scheme of Simplified Chinese Characters, the Hànzì Jiǎnhuà Fāng’àn 漢字簡化方案, followed soon after, consisting of 515 simplified characters that replaced 544 traditional ones, and 54 simplified character components that each replaced a traditional character component. A complete list of all 2236 characters that were simplified by the first Scheme, including those that were simplified because they contained a character component that was simplified, was published in 1964, and republished with minor changes in 1986 (Chen, 1999: 154).
The first Scheme mostly consisted of character forms that already existed as popular or variant forms, or in cursive script, and were thus by and large not new characters but old characters that had been elevated to the position of standard character, a principle known as shù ér bú zuò 述而不作, ‘recognizing without creating’ (Baldauf and Zhao, 2008: 40). However, the central government wanted to go further, and in 1964 publicly stated its aim of simplifying all characters in common use down to no more than 10 strokes, compared to nearly half of simplified characters in use today consisting of more than 10 strokes (Chen, 1999: 155; Wiedenhof, 2015: 398). To this aim, the Second Scheme of Simplified Chinese Characters (Draft), Dì Èr Cì Hànzì Jiǎnhuà Fāng’àn (Cǎo’àn) 第二次漢字簡化方案 (草案), was published in 1977. It contained 248 new simplified characters intended to be used immediately and 605 characters intended for trial use. However, the Second Scheme proved unpopular, and its use in school textbooks and major national newspapers was halted in 1978 (Baldauf and Zhao, 2008: 51). It was officially repealed in 1986, after having received much criticism (Chen, 1999: 156).
After the 1949 split, the Republican script authorities in Taiwan at first continued considering the issue of character simplification, as they had done on the Mainland before the Japanese invasion in 1937 (Chen, 1999: 153). However, after the institution of simplified characters on the Mainland, the official Taiwanese position on simplified characters shifted drastically, and in 1956 the use of simplified characters in publications was officially banned. Although simplified characters continue to be used in Taiwanese handwriting, they are rarely seen in print (Chen, 1999: 162–163).
Like the Mainland authorities in 1955, the Taiwanese script authorities have also standardized character forms, namely in the Chángyòng Guózì Biāozhǔn Zìtǐ Biǎo 常用國字標準字體表 (List of Commonly Used Standard Character Forms), published in 1979. Furthermore, 40 general rules (tōngzé 通則) and 120 specific rules (fēnzé 分則) describing how to correctly write standard contemporary Taiwanese traditional characters are given in the Guózì Biāozhǔn Zìtǐ Yándìng Yuánzé 國字標準字體研訂原則 (Principles of Research and Designation of Standard Character Forms, hereafter: Guózì Yuánzé), which dates to 1997 and is available online on the Taiwanese Ministry of Education’s website (Zeng and Jiàoyùbù, 1997).
Methodology
In order to determine which Taiwanese traditional characters have fewer strokes than their equivalent simplified forms, it was necessary to consult a primarily Taiwanese traditional character dictionary that also gives information on simplified forms, and a primarily simplified character dictionary that also gives information on the traditional forms of characters. This is because these two types of dictionary provide different types of potentially relevant characters. The former type contributes Taiwanese traditional character forms that are subject to an orthographic standard which has not been applied to simplified characters on the Mainland and does not normally show up in Mainland traditional dictionaries, since they usually take the pre-1949 traditional forms as their standard. The latter type contributes simplified characters that represent multiple traditional characters, at least one of which has fewer strokes than the simplified form that it is replaced by. Since in a traditional character dictionary the multiple traditional forms that are represented by such a single simplified form are given as separate characters, without cross-referencing the reader has no way of knowing that one or more of those traditional forms have been replaced by a simplified form that has more strokes.
For the former type, I manually reviewed all of the characters listed in the Far East 3000 Chinese Character Dictionary/遠東漢字三千字典 (Teng, 2011, hereafter: Far East Dictionary), a Taiwanese traditional character dictionary that lists both the traditional and simplified forms of the 3000 most common Chinese characters and gives the stroke count of each form. Since it was clear from the relevant characters that I found in this dictionary that certain stroke-saving principles are in use that have been consistently applied to all characters that contain a certain component, I then used character-finding tools on Zdic.net and in the Pleco Chinese Dictionary app to compile a list of characters that contain components that would cause them to be relevant to this study but that are not recorded in the Far East Dictionary. For example, having identified the character 流/流 (liú, ‘to flow’) as relevant to this study, I searched Zdic.net for other characters with a similar pronunciation, since such characters may share components due to their use as phonetics, and I used the Pleco app to generate a list of characters containing the component that makes this character relevant.
Many of the characters that I identified in this manner are archaic or very obscure, so in order to limit this study to graphs that have at least some degree of contemporary usage, I attempted to locate them in one of the other character dictionaries that I used for this study, namely the Guómín Zìdiǎn 國民字典 (1974; Taiwanese traditional), the Xīnhuá Zìdiǎn 新華字典 (Xinhua, 1955; Mainland pre-simplified traditional), and the Xiàndài Hànyǔ Guīfàn Zìdiǎn 现代汉语规范字典 (Lü and Li, 1998; Mainland simplified, hereafter: Guīfàn Zìdiǎn). In addition to the aforementioned need for dictionaries both from Mainland China and Taiwan, I have intentionally included dictionaries published both before and after the Mainland Chinese simplification movement, for the purpose of comparison. I then also looked up the characters that feature in at least one of these four dictionaries in the Zhèngzì Biǎo 正字表 (contemporary Taiwanese traditional) (Zhōnghuá, 2017), which I describe in more detail in the next paragraph. In some cases the stroke count has to be deduced by combining the stroke count of the radical listed in the radical index with the remaining stroke count of the character. The reason for these checks, and for not using dictionaries that include a greater number of characters, is primarily that the aim of this study is to chart contemporary Taiwanese traditional characters that have fewer strokes than their equivalent simplified forms, not to compile a list of archaic characters that would hypothetically meet that criterion if they were written today. Therefore, I am using a character’s presence (as a standard form) in at least one of the physical dictionaries mentioned above as a proxy for contemporary use, thereby meriting its inclusion in this study.
Furthermore, only characters that are listed in at least one of the Taiwanese traditional sources and in at least one of the simplified or pre-simplified traditional sources are included in this study, for otherwise it would not be possible to compare the stroke counts and conclude that the Taiwanese traditional form has fewer strokes. The relevant characters identified in this way are those in Tables 3, 5, 6, 8 and 9. Since this methodology is based on an extrapolation of writing principles that can be found in the 3000 most frequently used characters, it is possible that less frequently used characters relevant due to stroke reduction principles not present in this sample are not included in this study.
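The inclusion criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the names `Entry`, `deduce_total` and `relevant_characters`, and the placeholder row are hypothetical and do not come from any of the dictionaries consulted. The stroke counts for 強/强 and 剉/锉 follow the differences described in this study; 國/国 is added as an ordinary counterexample in which the simplified form has fewer strokes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    """One traditional/simplified pair with the stroke counts a source lists."""
    traditional: str
    simplified: str
    traditional_strokes: Optional[int]  # None if no source gives a count
    simplified_strokes: Optional[int]


def deduce_total(radical_strokes: int, remaining_strokes: int) -> int:
    """Deduce a total stroke count from a radical index entry."""
    return radical_strokes + remaining_strokes


def relevant_characters(entries: List[Entry]) -> List[Entry]:
    """Keep pairs whose traditional form has strictly fewer strokes.

    Pairs lacking a stroke count on either side are skipped, because
    no comparison is possible for them.
    """
    kept = []
    for entry in entries:
        if entry.traditional_strokes is None or entry.simplified_strokes is None:
            continue
        if entry.traditional_strokes < entry.simplified_strokes:
            kept.append(entry)
    return kept


SAMPLE = [
    Entry("強", "强", 11, 12),   # one stroke fewer in the Taiwanese form
    Entry("剉", "锉", 9, 12),    # variant eliminated on the Mainland in 1955
    Entry("國", "国", 11, 8),    # usual case: simplified form is shorter
    Entry("?", "?", None, 10),   # hypothetical placeholder, count missing
]

print([entry.traditional for entry in relevant_characters(SAMPLE)])
```

The helper `deduce_total` mirrors the deduction used when a source gives only the radical and the residual count, for example six strokes for a radical plus eleven for the rest of the character.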
Table 1. Characters relevant due to the elimination of their traditional form in favour of a character with more strokes in the Mainland Chinese orthographic standard (55 characters).
Forms marked with ‘x’ are listed as standard forms in the Zhèngzì Biǎo, but are also given as variant forms in the description of the character or alongside the traditional form in the centre column. For example, the Zhèngzì Biǎo entry for 惪 lists the character as a standard character (zhèngzì 正字), but also indicates that it is a variant form (yìtǐ 異體) of the graph 德 (see the full entry at https://dict.variants.moe.edu.tw/variants/rbt/word_attribute.rbt?quote_code=QzAzNzYx (accessed 13 July 2021)). Some characters marked ‘x’ are also listed as variant forms in other sources, such as the Guómín Zìdiǎn; others are not. Their inclusion in this list is therefore debatable, but I have chosen to include them with this annotation for the sake of a comprehensive overview of the relevant forms.
Listed as a variant form in the Zhèngzì Biǎo, but as a standard form in the Guómín Zìdiǎn and Xīnhuá Zìdiǎn.
The stroke count of this character is sometimes given as 10, of which 5 strokes are the radical, but I have followed the Far East Dictionary here, which gives it a stroke count of 9.
Since 強/强 forms part of certain other characters, it has further implications for this study, which are dealt with in Table 3.
Characters relevant due to differences in variant elimination
The left-hand column of characters in Table 1 contains traditional character forms that the Zhèngzì Biǎo gives as contemporary standard forms (zhèngzì 正字), but that were eliminated from the Mainland Chinese orthographic standard in 1955 by the Yìtǐzì Biǎo in favour of the traditional forms in the second column of characters. The criteria deciding which forms were eliminated will be discussed below. In the years following this elimination of variant forms, the Mainland Chinese government carried out its script simplification programme, creating the simplified orthographic standard present in the third column of characters. In most cases in Table 1, the traditional forms in the second column were not structurally altered by the Mainland simplifications. In some cases they were, but the difference in stroke count between the eliminated form in the first column and the maintained form in the second column is larger than the stroke reduction carried out as part of the Mainland simplification, so that the character in question is still relevant to this study. For example, the simplified character 锉 (cuò, ‘file’) has three fewer strokes than its traditional form 銼, but still has three more strokes than its traditional form 剉. A few of the changes made in the Yìtǐzì Biǎo were reversed in the 1980s, which I have taken into account during the compilation of this list.
As can be seen in Table 1, many of the characters relevant to this study were eliminated from the Mainland Chinese orthographic standard in favour of character forms with a higher stroke count in the Yìtǐzì Biǎo in 1955. In fact, the Yìtǐzì Biǎo contains 226 eliminated variant forms that have fewer strokes than their equivalent maintained forms, out of a total of 1053 eliminated forms. These 226 eliminated forms do not all meet the criteria for inclusion in this study, since some of them are also considered non-standard forms by the Zhèngzì Biǎo and since some of the maintained forms had their stroke counts reduced by the Mainland simplifications that came after the Yìtǐzì Biǎo, but they clearly show that stroke count was not the primary criterion for variant elimination. This raises the question: if not on stroke count, then on what basis did the compilers of the Yìtǐzì Biǎo decide to maintain certain forms and eliminate others? The principles of selection underlying the 1955 elimination of variant forms are given by Shan (2001: 49) as follows. Out of the possible forms, the character forms that are maintained: (a) have existing (printing) moulds; (b) if moulds exist for multiple variant forms, are the forms that are most common in general use; (c) if the frequency of use is comparable, are the characters with the broader range of meaning; and (d) as much as possible, are characters with a left–right structure (as opposed to a top–bottom structure). Evidently, stroke count is not included in these principles, which explains why forms were eliminated that have fewer strokes than their maintained equivalent forms.
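One way to read the four selection principles is as an ordered series of tie-breakers. The sketch below is a hedged illustration of that reading, not a reconstruction of the procedure actually followed in 1955: the class `Variant`, the function `maintained_form`, and all values in the example are hypothetical, and "comparable" frequency is simplified to exact equality.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Variant:
    label: str
    has_mould: bool        # principle (a): a printing mould exists
    frequency: int         # principle (b): hypothetical usage count
    meaning_breadth: int   # principle (c): hypothetical number of senses
    left_right: bool       # principle (d): left-right structure
    strokes: int           # recorded, but never consulted below


def maintained_form(variants: List[Variant]) -> Variant:
    """Choose the form to maintain, applying (a) to (d) in order.

    Tuples compare element by element, so a later principle only
    matters when all earlier ones are tied.
    """
    return max(
        variants,
        key=lambda v: (v.has_mould, v.frequency, v.meaning_breadth, v.left_right),
    )


# Hypothetical pair: equal on (a) and (b), decided by (c).
form_a = Variant("form_A", True, 50, 2, True, strokes=15)
form_b = Variant("form_B", True, 50, 1, False, strokes=9)

chosen = maintained_form([form_a, form_b])
print(chosen.label, chosen.strokes)
```

Because `strokes` is absent from the comparison key, the form with more strokes can be maintained, which is the pattern seen in the eliminated forms discussed above.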
Table 2. Frequency of use of characters in Table 1 according to the Zhèngzì Biǎo.
Since this study focuses on standard character forms, it is important to understand the Taiwanese position on which forms are to be considered standard forms, so that we can understand why not all of the forms with fewer strokes that were eliminated in Mainland China by the Yìtǐzì Biǎo are listed as standard forms in our Taiwanese traditional sources and included in this study. As mentioned above, the Taiwanese script authorities standardized character forms in the Chángyòng Guózì Biāozhǔn Zìtǐ Biǎo 常用國字標準字體表 (List of Commonly Used Standard Character Forms), published in 1979. However, the Taiwanese script authorities used a narrower definition of what is considered a variant form than the Mainland authorities, given as follows in the Guózì Yuánzé (Zeng and Jiàoyùbù, 1997):
“字形有數體而音義無別者，取一字為正體 […]。”
If a character has multiple graphical forms and the pronunciations and meanings are indistinguishable, select one [of them] as the standard form.
“字有多體，其義古同而今異者，予以並收。[…] 古別而今同者，亦予並收 […]。”
If a character has multiple forms, and their meanings were the same in ancient times but are different now, all of them are collected [as standard forms]. […] Those that were different in ancient times but are the same now, are also all collected.
Conversely, the Yìtǐzì Biǎo includes forms that have overlapping, but not identical, pronunciations and meanings among its eliminated variant forms. These differing definitions of what counts as a variant form have caused fewer characters to be eliminated from the Taiwanese orthographic standard than from the Mainland standard, and are the reason why the forms in Table 1 are considered standard forms in Taiwanese traditional script, but were eliminated in Mainland China. For example, the traditional characters 氾 and 泛 were merged into the simplified character 泛, even though the Far East Dictionary gives their definitions as ‘spread, fill everywhere; extensive, boundless’ and ‘float; flood; general; pan- (as in “pantheism”)’ respectively. Though there is some overlap between the two sets of definitions, the two traditional characters clearly do not have perfectly identical meanings, and must therefore both be maintained as standard Taiwanese traditional forms according to the rules of the Guózì Yuánzé.
Characters relevant due to containing a form in Table 1 as a component
Table 3. Characters containing 強/强 (five characters).
A blank space indicates that a character is not listed in a source.
Characters in Tables 3, 5, 6, 8, and 9 without a stroke count are not explicitly given a stroke count in the source, but visually conform to the form given.
The stroke count given here is made up of six strokes for the radical ‘衤’, plus eleven strokes for the rest of the character. The traditional radical ‘衤’ is written with five or six strokes depending on the source.
The stroke count for this character is not explicitly given, but can in this case be deduced based on the stroke count of the previous character in the dictionary.
糨 is listed as a variant form of 糡. 糨 is also listed as a variant form, though not directly next to 糡.
The component ‘𧈧’, which appears alongside a ‘弓’ component in the character 強/强 (qiáng, ‘strong’) and in characters that contain that character as a component, is written with a ‘厶’ component in Taiwanese traditional characters, whereas in simplified characters it is written as ‘虽’, that is, with a ‘口’ component. This causes the Taiwanese forms to have one fewer stroke, unless other changes made in the simplification of the character also affect the stroke count. As can be seen in Table 1, the decision to write the simplified form of the character 強/强 with the ‘口’ component was formally made in Mainland China in 1955 in the Yìtǐzì Biǎo, which lists 强 as the standard form from then on, and 強 as a variant form that has been eliminated. These two distinct ways of writing the component ‘𧈧/虽’ are consistent in Taiwanese traditional and simplified forms respectively, though the Xīnhuá Zìdiǎn is inconsistent in this regard. The inconsistent use of ‘𧈧/虽’ in pre-simplified traditional dictionaries will be discussed below.
Table 4. Selected characters containing 強/强 in the Zhōnghuá Dà Zìdiǎn.
Table 5. Characters containing the component consisting of ‘亠’ above ‘厶’ (14 characters).
This character is listed as a variant form of the character 瑠.
Characters relevant due to stroke contraction
The component consisting of ‘亠’ above ‘厶’, which has four strokes in simplified forms, is written with one fewer stroke in Taiwanese traditional forms, since the ‘丶’ stroke of the ‘亠’ component and the first stroke of the ‘厶’ component are written as a single elongated stroke. As can be seen in Table 5, this contracted way of writing the component is applied to all Taiwanese traditional forms in which it appears, but to none of the simplified forms in which it appears. In theory, this same stroke-saving contraction of the ‘丶’ stroke in a ‘亠’ component with the stroke below it could be made in other character components too – for example, in characters containing a ‘玄’ or a ‘亥’ component – but this is not done, at least not in printed or standard forms.
Table 6. Characters containing 卸/卸 (three characters).
The stroke count for this character is not explicitly given, but it is listed based on stroke count between two characters that also have 8 strokes.
Listed as a variant form of the character 銜.
Listed both as a standard form and as a variant form of the character 銜.
Listed as a variant form of the character 衔.
The left-hand component of 卸 is written with one fewer stroke in Taiwanese traditional forms than in simplified forms, since two strokes at the bottom of the component are combined into a single stroke. This contraction is applied to all traditional forms that contain this component, but not to simplified forms that contain it. However, as Table 6 shows, the Xīnhuá Zìdiǎn is inconsistent in the way in which it writes this character component. Of the three forms of this type relevant to this study, it gives the character 御 the same stroke count as the two simplified dictionaries consulted here give it, but the characters 卸 and 啣 with one stroke fewer. A close look at the characters in question (which can be found in Table 7) confirms that for the character 卸, the contracted component is used, explaining the stroke count of eight. However, this is not the case for the character 啣, thus leaving this character without an obvious explanation for its reduced stroke count.
Table 7. Characters containing 卸/卸 in the Xīnhuá Zìdiǎn.
A similar contraction of two strokes into a single stroke is present in certain traditional dictionaries, such as the Guómín Zìdiǎn (1974), the Kāngxī Zìdiǎn 康熙字典 (1716) and the Zhōnghuá Dà Zìdiǎn (1915), in characters containing the component ‘此’ in the upper half of the character, such as 些 (xiē, ‘some’) and 柴 (chái, ‘firewood’), and sometimes also in characters that feature the component ‘此’ in other parts of the character, such as 雌 (cí, ‘female’) or 呲 (cī/zī, ‘to scold’). In such cases the aforementioned dictionaries give the component ‘此’ as consisting of five strokes, as opposed to the six strokes it has in simplified dictionaries and in other traditional dictionaries such as the Far East Dictionary, the Zhèngzì Biǎo and the Hànyǔ Dà Zìdiǎn 漢語大字典 (Xu Z and Hanyu, 1995). In some cases this contraction is visible in the depiction of the character; in other cases it is not, and the component appears to have six strokes despite being listed as having five strokes. Since this contraction is not present in the most recent sources that I have consulted, I have not included these characters in this study.
Table 8. Characters containing 致/致 (one character).
致 is listed as a variant form.
This contraction, in which a horizontal stroke and a left-falling stroke of the component ‘攵’ are contracted into a single ‘㇇’ stroke, yielding the component ‘夊’, could in principle be applied to all characters containing the component ‘攵’, such as 玫 (méi, ‘rose’), but this is not done, at least not in printed or standard forms. The traditional character 緻 (zhì, ‘fine, delicate’) shows that this way of writing the component ‘夊’ is applied to other characters containing the character 致, though I have not found any commonly used Taiwanese traditional characters that contain this character and have fewer strokes in total than their equivalent simplified forms.
Table 9. Characters containing 毒/毒 (two characters).
The component ‘母’ in simplified characters is written as ‘毋’ in traditional characters when it appears below a ‘龶’ component; that is, with the two ‘丶’ strokes replaced by a single ‘丿’ stroke, thereby reducing the stroke count by one. Both in simplified and Taiwanese traditional script ‘母’ and ‘毋’ are recognized as distinct components, though they are often listed as a single radical in dictionaries. This contraction is present in all Taiwanese traditional characters that contain the character 毒/毒 as a component, but not in any of the simplified forms that contain it. This type of contraction cannot be applied to the character 母 (mǔ, ‘mother’) without equating it to the character 毋 (wú, ‘no, not’), but could in theory be applied to other characters that contain the component ‘母’, like 每 (měi, ‘every’), though this is not done, at least not in printed or standard forms.
Selected contraction-type characters in the Zhōnghuá Dà Zìdiǎn.
As can be seen in Table 10, characters containing the component ‘
’ are not consistently given with the same form of the component, with some characters’ stroke count indicating the use of the three-stroke version and others indicating the use of the four-stroke version. However, all characters of this type visually appear to contain the same component, so that it is impossible to tell by looking at the character whether it is considered to be written with the three- or the four-stroke component. For characters containing 卸/卸 and 致/致 the opposite holds: they are consistently given the stroke counts that belong to their contracted versions, but the characters do not always visually conform to the contracted forms that their listed stroke counts require, nor are components that should have the same stroke count always written identically. For example, in Table 10, 御 (yù, ‘imperial’) clearly uses the contracted component ‘
’, whereas 卸 (xiè, ‘to unload’) does not appear to, and the character 緻 visibly contains the contracted component ‘夊’, without the small upwards triangle that indicates the end of a horizontal printed stroke at the top-right of the character, whereas 致 does have this triangle, and extends this horizontal stroke beyond the top of the left-falling stroke which it should be contracted with in order to make a nine-stroke character. Thus, we see inconsistencies both in the stroke counts and in the depictions of contraction-type characters in the source material predating the Mainland–Taiwanese split.
While these inconsistencies exist in pre-simplified traditional sources, the forms given for Taiwanese traditional characters in this study are mostly consistent, especially when we exclude the Guómín Zìdiǎn (1974) and only consider the consistency between and within the Far East Dictionary and the Zhèngzì Biǎo, the more recent sources consulted for this study. The origin of this consistency can at least partially be traced back to the Guózì Yuánzé, published in 1997. The principles listed in this document describe and sanction many of the writing practices discussed in this study. While the aim of Mainland Chinese character authorities was and is primarily to make characters easier to write, for which it may be desirable to change the form of a character, Taiwanese character-writing principles instead aim to preserve characters’ original structures (Zeng and Jiàoyùbù, 1997):
“字之寫法,無關筆劃之繁省者,則力求符合造字之原理。” As for the way of writing characters, regardless of the strokes being more or fewer, [we] strive to conform to the principles of character creation.
“凡字之偏旁,古與今混者,則予以區別。” Whenever the ancient and modern components of characters have been conflated, they are distinguished.
The first of these two quotes shows that the goal of the rules in the Guózì Yuánzé is not to reduce the stroke counts of characters, but to make them conform to the principles of character creation; that is, to make them structurally conform to their original forms. What the Guózì Yuánzé means by ‘original’ forms is clear from the citations used to justify the rules that it lays down, which are citations from the Shuōwén Jiězì 說文解字, a dictionary dating to the 1st century CE which lists and describes seal script characters. The second quote states the intention to differentiate contemporary character components in cases where distinct ancient components might otherwise be written as one identical component in modern script. The following section will discuss how these principles of adhering to the original forms of characters and differentiating components with different ancient forms have caused many of the instances of Taiwanese traditional characters having fewer strokes than their equivalent simplified forms due to stroke contraction.
” components is a “㇜” stroke, [so that the component has] three strokes in total, with the first stroke not being a “丶” stroke.’
’, the consequence of the aim to preserve characters’ original structures is that the component ‘
’
is written with three strokes to match its seal script form ‘
’, which also features an uninterrupted stroke through the horizontal stroke of the component (see general rule 12). This component is an inversion of the pictographic component ‘子’ (zǐ, ‘child, infant’), which is written as ‘
’ in seal script. This explains why the same stroke-saving contraction is not applied to certain other components to which it could theoretically be applied, such as in characters containing ‘亥’ or ‘玄’: the contracted form is used not in order to reduce the stroke count, but to structurally correspond to the ancient form of the character. Therefore, characters whose ancient form does not feature the component ‘
’ will not use a contracted stroke in their contemporary Taiwanese form. The seal script form of ‘亥’ is ‘
’, in which the topmost stroke clearly does not cross the stroke beneath it. The seal script form of ‘玄’ is ‘
’, which has its origin in the even older Shang dynasty character
, which depicts a twisted strand of silk, so that the part of the component below the horizontal stroke should be taken as a sub-unit of its own, rather than combined with the part above the horizontal stroke by means of a single contracted stroke. Consequently, a character containing ‘亥’ or ‘玄’ in its standard script form, if it were to be written with a single contracted stroke, would violate the Taiwanese script authorities’ principle of writing characters in such a way that they conform structurally to the seal script forms of the Shuōwén Jiězì. Examples of characters containing ‘亥’ and ‘玄’ can be found in Table 11. While in previous times, judging by the Zhōnghuá Dà Zìdiǎn, the traditional standard on ‘
’ seems to have been flexible, the Guózì Yuánzé makes the official Taiwanese traditional standard clear, even if people’s actual writing habits may continue to display variation in this regard.
Selected characters containing ‘’, ‘亥’ or ‘玄’ in the Shuōwén Jiězì.*
*The Shuōwén Jiězì forms given in this article can be found in the Hànyǔ Dà Zìdiǎn 漢語大字典.
’ in some older dictionaries, are written with the four-stroke version of the component in contemporary sources. Furthermore, there are other characters containing ‘止’ that are not written with the three-stroke version of the component in any of the sources that I have consulted, such as the character 武 (wǔ, ‘martial’), despite the relevant stroke of the seal script component ‘
’ being identical in all of these characters, as can be seen in Table 12.
Selected characters containing ‘止’ in the Shuōwén Jiězì.
’ (Shuōwén Jiězì radical 203) and ‘夊’ (suī, Kangxi radical 35), which is equivalent to the seal script component ‘
’ (Shuōwén Jiězì radical 198). The distinction here is that the final stroke extends further to the left in the latter component than in the former. The graph 致 is given as an example of a character containing such an extended-stroke component. Both ‘夂’ and ‘夊’ consist of three strokes. The example characters given in specific rules 108 and 111 show that the component ‘攵’ (pū, considered equivalent to 攴, with which it forms Kangxi radical 66) is considered distinct from both ‘夂’ and ‘夊’, though these two rules are not about ‘攵’ itself. ‘攵’ is equivalent to the seal script component ‘
’ (Shuōwén Jiězì radical 92), and consists of four strokes. The difference in stroke count between Taiwanese traditional 致 and simplified 致 follows from the use of ‘夊’ in the former and ‘攵’ in the latter. Table 10 has shown that the use of both three-stroke and four-stroke components in characters containing ‘致/致’ is attested in pre-1949 sources. Unlike their Mainland counterparts, the Taiwanese script authorities decided to write these characters with the three-stroke component ‘夊’, since that more accurately corresponds to the component ‘
’, which is featured in the seal script version of the character 致/致 and in characters that contain it. The fact that the component ‘攵’ in Taiwanese traditional forms has not been contracted into ‘夊’, other than in characters containing 致, is also a consequence of the Taiwanese script authorities’ conformity to seal script character structures. Characters such as 敗 (bài, ‘to lose’) and 枚 (méi, ‘trunk’) are written with a ‘攵’ component because they are written with a ‘
’ component in seal script (see Table 13).
Selected characters containing ‘夊’ or ‘攵’ in the Shuōwén Jiězì.
The reason why only 致/致 and characters that contain it feature ‘夊’ in their Taiwanese traditional forms and ‘攵’ in their simplified forms is that 致/致 is highly unusual among Chinese characters in using ‘
’ in the right half of its seal script form. The characters listed in the Shuōwén Jiězì under the radicals ‘
’ (‘夊’) and ‘
’ (‘攵/攴’) show that almost all characters that contain either of these two components fit the pattern of featuring ‘夊’ in the bottom half of the character or ‘攵’ in the right half of the character. Out of the 15 characters listed under the radical ‘
’ (not including the radical functioning as an independent character), only 致 features ‘
’ in the right half of the character, with the remaining characters all featuring it in the bottom half of the character. It is, however, possible that there are characters containing ‘
’ in their right half that are listed under different radicals. Out of the 76 characters listed under ‘
’ (not including the radical functioning as an independent character), 74 characters feature the component in the right half of the character, and only 2 feature it in the bottom half, namely 變 (biàn, ‘to change’) and 更 (gēng, ‘to change’). The simplified form of 變 is 变, which contains ‘又’ instead of ‘攵’, and 更 does not contain ‘攵’ in its standard script (kǎishū) form, so these two characters have no further relevance to this study. Thus 致/致 is relevant to this study because it and characters that contain it feature a ‘
’ component in a location where it is not found in other characters, but where the component ‘
’, with its very similar standard script form, is common. The simplified form 致, though not historically accurate in its use of ‘攵’ instead of ‘夊’, conforms to the pattern in other standard script characters of writing ‘攵’, not ‘夊’, in the right half of graphs. Conversely, the Taiwanese traditional form 致 structurally corresponds to its seal script form by using ‘夊’, but breaks with the pattern found in other standard script characters by using ‘夊’ in its right half. Guózì Yuánzé general rule 39 shows that the Taiwanese script authorities’ choice for historical accuracy over standardization is a conscious one. Further research is needed to answer the question of whether or not the Mainland Chinese script authorities made this choice consciously, and to what extent they followed or broke with prevailing writing practices by sanctioning 致 over 致.
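The positional pattern described above can be checked with a small tally. The following is only a sketch built from the counts reported in the text (15 characters under the ‘夊’ radical, 76 under ‘攵/攴’); the labels `right_half` and `bottom_half` and the function name are assumptions introduced for illustration.

```python
# Counts as reported in the text for two Shuōwén Jiězì radical sections.
# The keys and labels are chosen for this sketch, not taken from the source.
radical_counts = {
    "夊": {"right_half": 1, "bottom_half": 14},    # 15 characters in total
    "攵/攴": {"right_half": 74, "bottom_half": 2},  # 76 characters in total
}


def position_shares(counts):
    """Return the share of characters per position for one radical section."""
    total = sum(counts.values())
    return {position: n / total for position, n in counts.items()}


shares = {radical: position_shares(c) for radical, c in radical_counts.items()}

for radical, by_position in shares.items():
    formatted = {pos: f"{share:.1%}" for pos, share in by_position.items()}
    print(radical, formatted)
```

On these figures, a right-hand placement is the rare case for ‘夊’ and the overwhelmingly common case for ‘攵/攴’, which is the asymmetry that makes 致 stand out.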
(zhóu, ‘stone roller’) is instead written with a single vertical stroke. However, despite visually conforming to a 13-stroke character, the character
is listed as a 14-stroke character, so that just as with the contraction-type characters discussed above, the graphic forms of certain characters do not correspond to the stroke count under which they are listed, nor are character components that we would expect to be written identically always uniform across the characters in which they appear. Unlike 強/强, 毒/毒 is not listed in the Yìtǐzì Biǎo. Instead, the origin of the difference between the Mainland and Taiwanese ways of writing the character can once again be found in the Guózì Yuánzé. Specific rule 82 explains that the seal script forms of ‘毋’ and ‘母’ are distinct, and that, unlike in the character
in Table 10, the central left-falling stroke in ‘毋’ should be extended below the lowest stroke that it crosses. Table 14 shows that the distinction between the seal script forms of ‘毋’ and ‘母’ corresponds to the distinction in their contemporary forms, namely that ‘毋’ has a single long stroke where ‘母’ has two shorter strokes. Due to the rotation of these components in standard script compared to seal script, the strokes in question are horizontal in seal script, but vertical in standard script. Thus, characters of this type are in line with the aim of the Guózì Yuánzé by corresponding structurally to their ancient forms.
Selected characters containing ‘毋’ or ‘母’ in the Shuōwén Jiězì.
*The Hànyǔ Dà Zìdiǎn gives this form with a ‘
’ component, that is, with two strokes in place of one, despite indicating that it is made up of a component with a single stroke in place of the two strokes, but the form listed in the version of the Shuōwén Jiězì that I have consulted (in Sìkù Quánshū 四庫全書, Jīng Bù 經部) is as given here.
Inconsistencies
When comparing characters across dictionaries it quickly becomes apparent that different sources adhere to different notions of which character form is the standard form and which is the variant form, and of how many strokes a character consists. In some cases, inconsistencies are even found within a dictionary, as has been discussed earlier for the Xīnhuá Zìdiǎn and the Zhōnghuá Dà Zìdiǎn. An example of an inconsistent standard between dictionaries, not included in the lists in this study, is the character 髯/髥 (rán, ‘beard’). The Zhèngzì Biǎo and the Xīnhuá Zìdiǎn list 髯 as the standard form, and the Zhèngzì Biǎo lists 髥 only as a variant form, which is why it is not included in this study; the Guómín Zìdiǎn does the opposite, giving 髥 as the standard form and 髯 as a popular form (súzì 俗字). An example of inconsistent listings of the stroke count of a character is the character 熙 (xī, ‘splendid’), which in the Guómín Zìdiǎn and Kāngxī Zìdiǎn is listed as having 13 strokes, on account of the contraction of a ‘
’ stroke and a ‘
’ stroke into a ‘
’ stroke in the left-hand half of the character, so that the character is written like this:
. Since the more recent dictionaries that I have consulted list this character as having 14 strokes both in its simplified and Taiwanese traditional forms, I have not included it in this study.
Frequency of relevant characters
Taiwanese traditional characters with fewer strokes than their equivalent simplified forms unsurprisingly make up only a fraction of all Chinese characters. No indisputable number of Chinese characters in existence can be given, since new characters can be created and old characters can fall out of use, and there is no clear-cut point at which a character should be added to or excluded from the total number of existing characters (Wiedenhof, 2015: 380–382). Furthermore, the inclusion or exclusion of variant forms and forms considered to be incorrect is also debatable. The same problems arise when trying to put an exact number on the total number of characters that match the criterion of inclusion in this study. I have limited this study to characters with some degree of contemporary usage, in the form of being listed in at least one of the physical dictionaries that I have consulted for this study, but an argument could be made for the inclusion of all characters that would in principle be relevant, no matter their frequency of use. To give a sense of how often an ordinary script user may expect to encounter such characters, we may once again consider the Far East Dictionary. Out of the 3000 characters deemed to be the most frequently used in modern Chinese writing, 15 have traditional forms that contain fewer strokes than their equivalent simplified forms. In other words, roughly 0.5% of characters in frequent use are paradoxically written with fewer strokes in Taiwanese traditional script than in simplified script. Since the characters discussed in this study include both very common and less common characters, we may expect, very roughly, that the rate of such characters remains similar when the sample is enlarged.
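The arithmetic behind the 0.5% figure, and the rough extrapolation it invites, can be sketched as follows. The sample size and count come from the text; the larger sample of 8000 characters is a purely hypothetical value chosen for illustration, and the assumption that the rate stays constant is the same rough assumption made above, not an established result.

```python
frequent_characters = 3000  # sample size reported in the text
fewer_stroke_forms = 15     # traditional forms with fewer strokes than simplified

rate = fewer_stroke_forms / frequent_characters
print(f"Rate among frequent characters: {rate:.1%}")


def projected_count(sample_size, rate=rate):
    """Expected number of such characters if the rate stayed constant."""
    return round(sample_size * rate)


# Hypothetical larger sample; 8000 is an assumed figure, not from the study.
hypothetical_sample = 8000
print(projected_count(hypothetical_sample))
```

Any such projection is only as good as the constant-rate assumption, which the study itself offers very tentatively.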
Character encoding
Selected Unicode characters. The first column shows the characters’ Unicode values in hexadecimal notation, the second through seventh columns show glyphs submitted to the Ideographic Research Group by Mainland China, Hong Kong, Taiwan, Japan, South Korea and Vietnam respectively.
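The relationship between code points and regional glyphs can be illustrated with Python's standard `unicodedata` module. This is a minimal sketch: the choice of characters and their grouping into two tuples are assumptions made for illustration. It assumes, as the regional glyph columns of the table suggest, that characters such as 致, 毒 and 卸 are each encoded once, so that their Mainland and Taiwanese shapes depend on the font rather than on the code point, whereas 強 and 强 are encoded separately.

```python
import unicodedata

# Illustrative grouping; the tuple names are chosen for this sketch.
separately_encoded = ("強", "强")  # two distinct code points
unified = ("致", "毒", "卸")        # one code point each; regional glyphs may differ


def code_point(char):
    """Format a character's code point in the usual U+XXXX notation."""
    return f"U+{ord(char):04X}"


for char in separately_encoded + unified:
    print(char, code_point(char), unicodedata.name(char))
```

Because a unified character is a single code point, a plain-text comparison cannot distinguish its Mainland form from its Taiwanese form; that distinction only becomes visible through the font or language tagging used to render it.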
Conclusion
This study has demonstrated that some Taiwanese traditional characters paradoxically have fewer strokes than their equivalent simplified forms. The three categories to which such characters belong are: (a) characters that were eliminated from the Mainland Chinese orthographic standard as variant forms of characters made up of more strokes, but which are considered standard characters in Taiwanese traditional script; (b) characters that contain a form in the first category as a component; and (c) characters in which two strokes are contracted into one stroke only in their Taiwanese traditional forms. The relevant characters in these categories can variously be explained by the eliminated forms in the Mainland Chinese Yìtǐzì Biǎo and by rules in the Taiwanese Guózì Yuánzé, and examples of characters in all of these categories can be found in the Zhōnghuá Dà Zìdiǎn, a dictionary that predates the Mainland–Taiwanese orthographic split. However, unlike in contemporary sources, the principles that dictate how characters in these three categories are written were not consistently applied in sources that predate the split. The consistency of simplified and Taiwanese traditional forms is due to the fact that both the Mainland Chinese authorities and the Taiwanese authorities have made efforts to standardize the way in which characters are written, and the different standards propagated by these two authorities are the direct cause of the differences between their respective character forms.
The presence of the types of stroke contraction discussed in this study in the Xīnhuá Zìdiǎn and the Zhōnghuá Dà Zìdiǎn shows that Taiwanese traditional characters containing these contractions are not new creations of the Taiwanese script authorities or script users. Rather, the previously inconsistent ways of writing these characters in pre-simplified traditional script were standardized according to a different standard than the one used in Mainland China. The rules quoted from the Guózì Yuánzé have shown that the stroke contractions discussed in this study are conscious choices for the Taiwanese traditional script, and that the elimination of ‘強’ in favour of ‘强’ was a conscious choice for the simplified script. However, further research is needed to determine whether the Mainland script authorities consciously took as the starting point of their simplification programme forms that were already not the shortest way of writing the characters that they represent, and to determine whether Taiwanese script authorities made a conscious decision to uniformly use ‘強’ in all characters that contain ‘強/强’.
The stroke-saving contractions and substitutions discussed in this study certainly could be applied to simplified characters, and applying them consistently to all simplified characters that contain the relevant components, disregarding the originalist ideal of conforming to ancient character structures, would avoid the inconsistencies present in Taiwanese traditional script and satisfy the desire for conformity and simplicity in simplified script. The fact that even minor stroke-saving changes that have already proven themselves to be unproblematic in Taiwanese traditional characters have not been incorporated into the simplified script emphasizes the degree to which Mainland Chinese character-simplification efforts have stalled since the retraction of the Second Scheme and the publication of the final official character scheme in 1986. Ironically, this means that the simplified orthographic standard is in effect a conservative standard, albeit one that conserves a standard that was set only a few decades ago. In any situation wherein an authority seeks to preserve a certain way of writing characters, the crux lies in which time period it seeks to emulate, and which source or sources from that period it considers authoritative. For example, in his study of the evolution of simplified Chinese characters, Bökset (2006: 181) notes that some discrepancies between Chinese and Japanese characters can be explained by the different ancient materials which were used as sources of simplified characters in twentieth-century China and Japan. This study, therefore, may serve as a trigger for script authorities to reflect on their past approaches to character reform and simplification, and on which character-writing principles they wish to uphold in future. 
For example, Mainland Chinese script authorities may consider whether simplified characters should remain the same as they are now indefinitely, or whether they should incorporate whatever additional stroke-saving changes can be found in the Taiwanese traditional script, or in Mainland Chinese script users’ writing practices.
Script users, both native speakers and foreign-language learners, who wish to write characters in accordance with the orthographic rules of Mainland China and Taiwan will want to be aware of the subtle differences between the two standards, such as the ones discussed in this study, so that they may write correctly wherever they go. Furthermore, script users would do well to understand the link between the Taiwanese traditional character standard and seal script characters, and be aware of which contemporary components are distinct from each other despite appearing identical at first glance due to their origins in separate seal script components. In addition to writing Taiwanese characters correctly, this will allow script users to better understand the etymology and function of certain components in the graphs in which they appear, thereby advancing their knowledge of Chinese characters.
The root cause of the differences between Mainland Chinese and Taiwanese standard characters discussed in this study is primarily ideological. On the Mainland, the script was a tool to be employed in such a way as to achieve a practical goal, namely mass literacy. Since achieving this goal was thought to require or at least be facilitated by simplifying the script, that course of action was taken. To the Taiwanese script authorities, however, the goal of script policy is to preserve the traditional way of writing characters. In practice, this means that unlike simplified characters, Taiwanese traditional characters are preferably made to structurally conform to their seal script forms in the Shuōwén Jiězì, leaving certain Taiwanese traditional characters with lower stroke counts than their equivalent simplified forms. Thus, one of the reasons why Taiwanese and Mainland Chinese character forms are different is that their respective authorities ultimately want different things from their scripts.
Footnotes
2. See Unicode (2020) Unihan Database Lookup Tool, entry for U+5378. Available at: http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=5378&useutf8=true (accessed 24 May 2021).
