Abstract
Binomials, coordinated pairs of words, differ as to their reversibility. However, the degree of reversibility of any binomial is not necessarily stable, but is subject to diachronic changes. This article hypothesizes the different pathways of change that a binomial’s degree of reversibility may follow and presents corpus findings to show that all these pathways, and more, do occur. Some 200 high-frequency binomials were analyzed regarding their degrees of reversibility in American English across the twenty decades from 1810 to 2009 using Google Books data. While the reversibility of a binomial may remain stable, changes in terms of freezing, unfreezing, and even order reversal are frequent and probably due to a combination of intra- and extralinguistic factors. The interaction between reversibility changes and frequency changes is discussed in light of usage-based approaches to language. The example of gender binomials shows that sociocultural changes may be reflected in linguistic changes.
Binomials, that is, coordinated pairs of words from the same word class, are more or less reversible. For instance, corpus evidence shows that the binomial arts and crafts is very nearly irreversible; it hardly ever occurs in the sequence crafts and arts. Conversely, the elements in the binomial public and private are quite reversible: private and public occurs with almost equal frequency in corpora. Any given binomial is thus situated at a point on a cline of reversibility ranging from complete exchangeability to frozenness. The middle ground of the cline is occupied by binomials for which both sequences are attested, but one sequence is preferred over the other to a smaller or greater degree. While this concept of a cline of reversibility was introduced by Malkiel (1959:115) more than fifty years ago, it has not been given much attention in research on binomials. What remain completely unresearched are diachronic changes in reversibility, that is, how binomials may move along the cline over time. The present article hypothesizes that a number of pathways of change are conceivable, such as freezing, unfreezing, and reversal in preference (while it is of course also possible for binomials to retain their degree of reversibility). Using the American English subcomponent of the Google Books n-gram data, which spans twenty decades from the 1810s to the 2000s, a corpus-based study into the development of 215 high-frequency binomials shows that all these pathways are indeed attested, plus a number of additional, complex patterns. Examples are given for each pattern of change, and possible motivations for changes in reversibility are also suggested. After an investigation of the interaction between binomial frequency and binomial reversibility, I discuss how pathways of change that are established here for the first time, especially unfreezing, are difficult to integrate into usage-based models of language.
The article ends with a case study into the changes and motivations for change concerning one very special group of binomials, those belonging to the gender domain. This case study suggests that sociocultural changes may be reflected in changes regarding binomial reversibility. All in all, it is shown that the degree of reversibility is not a fixed characteristic of binomials, but is subject to change along more possible pathways than previously assumed.
Binomials: Definition and Previous Research
Binomials are defined by Malkiel (1959:113) in his classic article as “the sequence of two words pertaining to the same form-class, placed on an identical level of syntactic hierarchy, and ordinarily connected by some kind of lexical link.” For the purposes of the present study, binomials are defined as pairs of words from the same word class, restricted to the lexical classes of noun, verb, adjective, and adverb (since other word classes only rarely combine into binomials) and coordinated with the conjunction and. Binomials formed with other conjunctions are omitted since and is by far the most widely used coordinator in binomials (and generally in English; cf. Meyer 1996:31): the Corpus of Contemporary American English (COCA; Davies 2008–) has more than three million tokens of pairs of nouns, verbs, adjectives, or adverbs coordinated with and, but only 405,000 coordinated with or and 35,000 coordinated with but. Moreover, disjunctive and adversative binomials may pattern differently from conjunctive ones, and homogeneity in the sample is preferred. The second of Malkiel’s requirements for the status of binomials (both coordinated elements belonging to the same level of syntactic hierarchy) is certainly required to discount examples in which the conjunction coordinates not two lexical elements, but two phrases or clauses. For example, “we went there and then we did …” is not an example of the binomial there and then. However, this requirement is not easily put into practice in corpus searches that yield thousands of hits, as in the present study. As explained below, all binomial types analyzed in the present article were subjected to checks of randomly selected concordance lines to eliminate items that exhibit a substantial proportion of syntactically nonparallel examples. The term “binomial sequence” in this article refers to one specific sequence of the two coordinated elements (called A and B here), that is, either A and B on its own, or B and A on its own.
The binomial itself (also, “binomial type”) may be realized by either of these two sequences.
Previous research has focused mainly on one specific type of binomials, namely, the irreversible ones in which the order of the two elements cannot be reversed, and in particular on the question of why any sequence A and B came to be fixed rather than B and A. A large number of ordering constraints has been proposed to account for the preference of one sequence over the other, most notably by Cooper and Ross (1975). They postulate a semantic principle, the “Me first principle,” suggesting that elements that are perceived to represent the prototypical speaker are named first. They also propose seven phonological constraints, ordered hierarchically, but all subordinate to the “Me first principle.” These constraints, complemented by a number of newly developed ones, have been empirically tested on the basis of corpus data by Benor and Levy (2006), who suggest a new hierarchy placing semantic constraints before rhythmic constraints and word frequency, while phonological constraints do not appear to determine binomial order. This hierarchy is also largely confirmed in Mollin (2012). Further perspectives that previous research on binomials has taken include the syntactic distribution of binomials (Gustafsson 1975), their distribution across text types (Hatzidaki 1999), their functions within texts (Norrick 1988), as well as their use in historical periods of English (e.g., Leisi 1947; Koskenniemi 1968; Potter 1972; Yada 1973; Tani 2008).
One of the main characteristics of binomials, however, has remained largely unresearched, even though it is routinely mentioned in all studies: this is the fact that binomials are reversible to differing degrees. Some binomials are irreversible, such as day and age, while others can order their elements more freely, such as teachers and students or students and teachers. Many other binomials fall in between, allowing both sequences but preferring one to a greater or lesser degree (e.g., time and money rather than money and time). In focusing on irreversible binomials only, previous research has disregarded both the majority of binomials and a research question that is as interesting as why one specific order came to be fixed in irreversible binomials, namely, why some binomials are more or less reversible than others. Only three previous studies have explicitly tackled the phenomenon of reversibility: Gustafsson (1976), Lohmann (2011), and Mollin (2012). Gustafsson (1976) considers some 2,700 binomials that occur in three small corpora, aiming to uncover different distributions in word class between reversible and irreversible binomials (as well as between frequent and less frequent binomials, an issue not explored here). She finds that there is a larger proportion of nominal pairs among the reversible binomials than in the whole sample, and fewer verbal ones (Gustafsson 1976:635). However, her results may be skewed by her small database (which is actually quite extensive, considering the technological status quo at the time). In general, her collection is so small that only a limited number of binomials (446 out of 2,720) occur more than once, so that no claim as regards the reversibility of the remaining hapax legomena can be made, and her conclusions regarding the degree of reversibility of the 446 binomials that occur multiple times are also doubtful. 
Most of these are classified by Gustafsson as irreversible, since no reversed instances occur in her data; however, if a larger corpus were to be consulted, it could well (and does) yield such examples.
Lohmann (2011) and Mollin (2012), conversely, are fortunate in having megacorpora (in this case, the 100-million-word British National Corpus [BNC]) and modern corpus technology at their disposal. Lohmann (2011) compares a sample of low-frequency, reversible binomials and a sample of high-frequency, irreversible binomials from the spoken component of the BNC concerning how well previously postulated ordering constraints predict the actually observed order of the binomials. He finds that the constraints predict the order of irreversible binomials better. In Mollin (2012) I study the reversibility of some 500 high-frequency binomials from the BNC, from all points along the cline of reversibility. I introduce an (ir)reversibility score that expresses the degree to which a binomial favors one of the two possible binomial sequences over the other through the following formula:
(ir)reversibility score of A and B = freq(A and B) / (freq(A and B) + freq(B and A)) × 100 percent
For example, if the binomial sequence A and B occurs 100 times in the corpus, but B and A only 50 times, the preferred sequence A and B accounts for 67 percent of all occurrences of the binomial, 67 percent being its (ir)reversibility score. The score for B and A would be 33 percent; all values less than 50 percent indicate that a variant is dispreferred. A score of 50 percent represents complete exchangeability of the elements with equal frequencies of both sequences. A score of 100 percent signifies irreversibility, with the binomial only ever occurring in one form. Regarding the most frequent binomials in the BNC, in Mollin (2012), I find that both complete irreversibility and complete reversibility are comparatively rare, since most binomials show some preference for one variant without being limited to it. A greater preference for one sequence is more frequent than a less pronounced one. Turning to the ordering constraints tested by Benor and Levy (2006), I discover that more fixed binomials have a greater tendency to conform to a number of semantic and metrical constraints than more reversible ones. This is the first indication of a factor that drives the freezing of binomials: sequences that are particularly well formed in terms of semantics (especially in terms of ordering according to power differences in society, according to chronological sequence, and according to semantic markedness) and rhythm (i.e., examples in which both a lapse and ultimate stress are avoided, and in which the word with fewer syllables is placed first) apparently lend themselves more easily to freezing.
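The worked example above can be reproduced in a few lines. This is only a minimal sketch of the score as described in the text; the function name and the handling of unattested binomials are assumptions made for illustration.

```python
def irreversibility_score(freq_ab: int, freq_ba: int) -> float:
    """Percentage of all occurrences of a binomial that take the order 'A and B'."""
    total = freq_ab + freq_ba
    if total == 0:
        # Assumption: a binomial unattested in both orders has no defined score.
        raise ValueError("binomial does not occur in either order")
    return 100.0 * freq_ab / total


# The example from the text: A and B occurs 100 times, B and A 50 times.
score_ab = irreversibility_score(100, 50)
score_ba = irreversibility_score(50, 100)
print(round(score_ab), round(score_ba))  # 67 33
```

The two scores of a binomial always sum to 100 percent, so only the score of the preferred sequence needs to be reported.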
Selecting Binomials for the Present Study
The present study moves the understanding of the phenomenon of degrees of binomial reversibility forward by considering diachronic developments, investigating which pathways exist in this type of lexical change and how frequently these occur. The only available corpus that is large enough to allow a study into these questions (and that comes with feasible search facilities) is the American English subcorpus of the Google Books n-gram corpus. It spans the period 1810 to 2009 and can be accessed via the corpus.byu.edu site (further description below). To obtain sufficient numbers of hits for the binomials, which can be individually rather infrequent, only high-frequency items are selected. (However, less frequent items will also be analyzed later in this article to discuss interactions between diachronic changes and frequency.) The list of binomials to be studied here derives from a search for those binomial sequences that are the most frequent in contemporary written American English as it is represented in the COCA corpus (Davies 2008–). Since the diachronic analysis was to proceed on the basis of the Google Books corpus, which contains writing from published books and magazines (more on this corpus below), the initial selection of currently frequent binomials was based on those COCA components that are the most similar to the composition of Google Books: academic writing, fiction, and magazines. Since Google Books contains no newspaper writing or speech, these two COCA components were disregarded. In addition, to focus on contemporary data, a time span from 2000 to 2011 was chosen (ending with the data update of COCA in April 2011). The selected subcorpus of COCA, 137 million words in size, was then searched for combinations of nouns, verbs, adjectives, and adverbs coordinated with and. 
The frequency cutoff for inclusion among these highly frequent binomial sequences lay at 137 occurrences in the subcorpus, that is, one occurrence per million words (pmw), to arrive at a manageable sample size. A number of items needed to be excluded from this list, such as B and A sequences whose counterpart A and B was already present in the list, since each binomial was studied in both possible sequences later on anyway. In addition, echoic binomials such as more and more, in which elements A and B are identical, were excluded since these cannot be assigned an (ir)reversibility score, as were cases of verbal hendiadys (e.g., try and get, go and see), which logically cannot be reversed. Items that represented only a part of a binomial with one or two multiword elements, such as States and Canada, were completed (the United States and Canada). Finally, all of the remaining 221 items were subjected to a check of a random sample of 50 concordance lines to eliminate false positives that do not exhibit syntactic parallelism. If one or more of the 50 concordance lines included an example that transcended syntactic boundaries, the complete type was omitted (e.g., laws and reproduction in “This work is protected by copyright laws and reproduction is prohibited”). The method served to exclude six word pairs that indeed frequently occur in nonparallel uses (here and now, inside and out, off and on, on and off, then and there, and laws and reproduction). The final sample thus comprised the 215 most frequent binomials in contemporary written American English. One note needs to be made regarding the differences between word forms and lemmas. The present study analyzes binomials on the basis of word forms, since word forms of the same lemma may behave very differently in the formation of binomials.
A prime example is the pair of word-form binomials size and shape versus shapes and sizes, which exhibit mild preferences for opposite orders ((ir)reversibility scores of 80 percent and 74 percent, respectively). All word form binomials thus need to be considered and analyzed in their own right.
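The automatic part of this selection procedure (the frequency cutoff, the exclusion of echoic pairs, and the removal of reversed duplicates) might be sketched as follows. The function names and the input format are assumptions, and all counts in the example are invented except those for men and women and policies and practices, which are reported in this article. The exclusion of verbal hendiadys and the concordance check for syntactic parallelism require manual inspection and are not modeled.

```python
SUBCORPUS_MILLIONS = 137  # size of the selected COCA subcorpus, in million words


def per_million(freq, corpus_millions=SUBCORPUS_MILLIONS):
    """Normalize a raw frequency to occurrences per million words (pmw)."""
    return freq / corpus_millions


def select_candidates(sequences, min_pmw=1.0):
    """Filter (a, b, freq) tuples, each standing for the sequence 'a and b'.

    Keeps the more frequent sequence of each binomial type, drops echoic
    pairs, and drops sequences below the pmw cutoff.
    """
    kept = []
    seen_types = set()
    # Process the most frequent sequences first so that the preferred
    # order of each binomial type is the one that is kept.
    for a, b, freq in sorted(sequences, key=lambda s: s[2], reverse=True):
        if per_million(freq) < min_pmw:
            continue  # below the frequency cutoff
        if a == b:
            continue  # echoic binomial, no score can be assigned
        binomial_type = frozenset((a, b))
        if binomial_type in seen_types:
            continue  # reversed sequence of a type already listed
        seen_types.add(binomial_type)
        kept.append((a, b, freq))
    return kept


# Hypothetical input; only the counts 3,850 and 137 come from the article.
raw = [
    ("men", "women", 3850),
    ("women", "men", 400),
    ("more", "more", 5000),
    ("policies", "practices", 137),
    ("salt", "pepper", 90),
]
print(select_candidates(raw))
```

With this input, only men and women and policies and practices survive: the reversed and echoic sequences are dropped, and the last pair falls below one occurrence per million words.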
The binomials range in frequency from 3,850 occurrences in the 137-million-word subcorpus of COCA (27.95 pmw) for the binomial men and women to 137 occurrences (1 pmw, the minimum cutoff point) for policies and practices. The average frequency lies at 326 occurrences (2.37 pmw), as represented for example by the binomial ladies and gentlemen. It has been noted previously (e.g., Gustafsson 1975) that a majority of binomials are combinations of nouns. This is corroborated in the data from the three written COCA components as well. The vast majority of binomials in the sample (73 percent) are nominal. The next most frequent part-of-speech category is adjectives, representing 19 percent of binomials in the sample, while pairs of adverbs and verbs are rather rare (6 percent and 3 percent, respectively). For all 215 binomials, the frequencies for both sequences (A and B as well as B and A) were collected from the subcorpus of COCA to obtain a first picture of their current degree of reversibility. Figure 1 shows how the binomials are distributed along the cline of reversibility, the cline being partitioned into bands of 10 percentage points, with complete irreversibility as a separate category. As in the British English data in Mollin (2012), binomials with a strong preference for one order (90-99.99 percent scores) are the most frequent, followed by binomials with less strong preferences. Both complete reversibility and frozenness are rare. A note of caution regarding this distribution is, however, necessary: since the binomials were selected on the basis of the most frequent individual binomial sequences in the corpus, it is possible that binomial types with high (ir)reversibility scores are overrepresented. A more reversible binomial splits its occurrences between both orders and therefore needs a higher overall frequency for one of its sequences to reach the cutoff than a less reversible binomial does.

Figure 1. Distribution of 215 high-frequency binomials across categories of (ir)reversibility scores in three written COCA registers.
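The partition of the cline into bands of 10 percentage points, with complete irreversibility as a separate category, can be illustrated with a short sketch. The band labels and the function name are assumptions; the two example scores are those reported above for size and shape and shapes and sizes, while the remaining values are invented.

```python
from collections import Counter


def score_band(score):
    """Assign the score of a preferred sequence (50 to 100 percent) to a band."""
    if not 50.0 <= score <= 100.0:
        raise ValueError("the score of the preferred sequence lies between 50 and 100")
    if score == 100.0:
        return "100"  # complete irreversibility is kept as its own category
    lower = int(score // 10) * 10
    return f"{lower}-{lower + 9}.99"


# 80 and 74 are the scores given for size and shape / shapes and sizes;
# the other values are hypothetical.
scores = [80, 74, 99.2, 100.0, 55.5, 93.0]
distribution = Counter(score_band(s) for s in scores)
for band_label in sorted(distribution):
    print(band_label, distribution[band_label])
```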
The Diachronic Development of Binomial Reversibility: Theory and Hypotheses
Figure 1 presents the (ir)reversibility scores of the 215 currently most frequent binomials in written American English (as it is represented in three components of the COCA corpus). However, the present article focuses on the diachronic developments that have led to this status quo. As mentioned earlier, no previous treatments of the diachronic changes in the reversibility of English binomials, whether theoretical or empirical, exist apart from marginal comments in a few of the classic sources on binomials in general. Yet I would like to posit that diachronic developments will be detectable in corpus data and, also, that an understanding of the pathways of change is essential to an understanding of the concept of binomial reversibility as such. For example, synchronic studies such as Lohmann (2011) and Mollin (2012), finding differences in constraint adherence between reversible and irreversible binomials, make the assumption that constraint adherence thus influences freezing, which is a diachronic process. The term “freeze” itself, frequently used for irreversible binomials, makes reference to the freezing process of which it is a result. The first hypothesis is, consequently, that the degree of binomial reversibility is subject to change in the first place. This is also assumed by Malkiel in his seminal 1959 article, in which he claims that irreversibility is the result of a freezing process, through which formerly rather reversible binomials develop a greater and greater preference for one sequence:
Among such loosely attached binomials a fraction of preferred sequences may, with the passage of time, become increasingly current, at the expense of their opposites (as should be statistically demonstrable under ideally favorable conditions), until one particular arrangement of the two words once freely matched stiffens, tending to become obligatory. (Malkiel 1959:116)
The freezing process as one possible pathway of change is also touched on by Cooper and Ross (1975:70) when they speculate that those binomial sequences that adhere more closely to semantic and phonological ordering constraints are more likely to attain irreversibility, “to stand the test of time, to become conventional.” Indeed, as I mentioned earlier, my work in Mollin (2012) has shown that more fixed binomials do satisfy a number of semantic and rhythmic constraints to a greater degree than more reversible ones, with similar results obtained by Lohmann (2011). A number of further publications allude to the freezing process when diachronic aspects are mentioned, such as Koskenniemi (1968:84), who provides a short discussion of reversible binomials in her Old English and Early Middle English texts and states for reversible binomials that “[a]s language develops, only one alternative generally remains in use.” Kohonen (1979:160), likewise considering Old English and Early Middle English, observes that many of the binomials that are reversible in his data are fixed today and suggests that the vacillations in the preferred order that he observed may be indicative of a beginning “process of fixation.”
Freezing thus appears to be the one pathway of change that is generally assumed to exist, as is indeed shown in the term “freeze.” A concomitant diachronic pattern is continuing reversibility, since both Malkiel (1959:116) and Cooper and Ross (1975:70) point out that only a restricted set of reversible binomials ever enter the freezing process. The small number of freezes among the binomials also implies that the freezing process (or at least the completion of the freezing process) is not undergone by all binomials. Indeed, the only empirical study into the diachronic development of binomials (even if focusing on German, and not English binomials), Hüpper, Topalovic, and Elspaß’s (2002) qualitative study of two binomials in the genre of legal oaths in Middle High German, finds an example each for both patterns: one binomial shows freezing (treu und hold ‘faithful and loyal,’ now out of use), while one remains relatively reversible (arm und reich ‘poor and rich’).
However, at least one further pattern of change in reversibility has been noted, and is probably the most noticeable of all, namely a reversal in the preferred order. Evidence for reversed preferences has been given for at least two binomials: ladies and gentlemen as well as mother and father. Potter (1972:314) notes that address terms in Chaucer’s texts mention gentil men before ladies, while in the modern address formula the female term precedes the male. Similarly, Knowles (cited in Moon 1998:153-154) observed in a corpus of children’s literature that the preference for father(s) and mother(s) in the late nineteenth century seems to have given way to a preference for mother(s) and father(s) by the late twentieth. It is not surprising that these two examples of a reversal of preference both concern the gender domain, in which changes concerning reversibility have recently been more dramatic than in other semantic fields, as will be illustrated in the results section below. Examples of reversed preferences in the ordering of binomials are particularly interesting because they suggest that diachronic patterns may be far more complex than generally assumed. For preferences to change, a process of unfreezing needs to set in to the point of reversibility. From that point on freezing resumes again, but in this case in favor of the binomial sequence that was previously dispreferred. Apart from freezing and continuing reversibility, unfreezing must therefore also be a possibility, suggesting that we are not dealing with a unidirectional process.
Thus, since previous treatments of binomials have generally neglected the question of diachronic development and only alluded to freezing, continuing reversibility, and reversals in preference as potential diachronic patterns of reversibility, it is now high time to widen the scope and look at all possible patterns in a systematic fashion. I hypothesize that, in charting the development of (ir)reversibility scores of binomials across a given time span in a corpus, no fewer than six major patterns will all be attested, as illustrated in Figure 2. The graphs in this figure represent the abstract patterns of diachronic developments of the (ir)reversibility score concerning the binomial sequence that is preferred today. The x-axis represents time, while the values on the y-axis represent the (ir)reversibility scores, with 50 percent signifying complete reversibility and 100 percent complete frozenness (as explained earlier). Values in between suggest a more or less strong preference for one of the two possible sequences. The trend line will cross the 50 percent threshold only in the case of a reversed preference, since in these cases the sequence that is preferred today used to be dispreferred at a score of less than 50 percent. For convenience’s and transparency’s sake, only the line of the contemporarily preferred variant is shown, even though the scores for the dispreferred variant could of course be plotted too and may be derived by subtracting the values of the preferred sequence from 100 percent.

Figure 2. Six major patterns of diachronic development in binomial reversibility.
The first three patterns to be found on the left-hand side of Figure 2 (patterns A through C) can evidently not be termed pathways of change, since they are static. Continuing reversibility has been mentioned as a possible diachronic pattern in previous accounts, but I would like to distinguish continuing complete reversibility (with no marked preference for one sequence) from cases that continue to show marked preferences (B and C). The pattern of continuing frozenness (A) also logically emerges from previous accounts of binomials, representing freezes after completion of the freezing process. Static patterns are likely to be present in substantial numbers of examples in any corpus study, especially if the time span covered by the corpus is short. Static patterns have two possible explanations: either the reversibility status of these binomials is indeed unchanged from the beginning (i.e., the time at which the binomial entered the language), or any changes precede the time span of observation. Needless to say, these binomials need not remain static; changes may yet occur following the time span covered by the corpus. The cutoff points used to distinguish between the categories of continuing reversibility and frozenness are 75 percent and 99 percent (ir)reversibility scores, so that all binomials showing a score of more than 99 percent are considered frozen and all binomials showing a score of less than 75 percent are considered reversible, with those cases in between exhibiting a preference for one of the two possible orders. Naturally, these cutoff points of 75 percent and 99 percent are rather arbitrary. For example, in the previously mentioned study of binomial reversibility in a smaller corpus, the BNC, I considered only binomials with a score of 100 percent as frozen (Mollin 2012). 
However, larger corpora are statistically more likely to throw up individual cases of reverse sequences to a binomial that is normally assumed to be frozen, so that a lower cutoff point suggests itself for the Google Books data. The 75 percent threshold is used to (almost) evenly divide the spectrum of nonfrozenness (50 percent to 98.99 percent) into the two categories of a marked preference versus no preference or only a slight preference for one sequence.
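The three-way categorization that follows from these cutoff points can be stated compactly. The sketch below takes the thresholds of 75 percent and 99 percent from the text; the function name and the category labels are assumptions for illustration.

```python
def reversibility_category(score, frozen_above=99.0, reversible_below=75.0):
    """Categorize the (ir)reversibility score of a preferred sequence.

    More than 99 percent counts as frozen, less than 75 percent as
    reversible, and everything in between as a marked preference.
    """
    if not 50.0 <= score <= 100.0:
        raise ValueError("the score of the preferred sequence lies between 50 and 100")
    if score > frozen_above:
        return "frozen"
    if score < reversible_below:
        return "reversible"
    return "preference"


for example_score in (62.0, 74.0, 80.0, 99.0, 99.5):
    print(example_score, reversibility_category(example_score))
```

Keeping the cutoffs as parameters makes it straightforward to apply the stricter 100 percent criterion used for the BNC data by passing a value just below 100 for `frozen_above`.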
The three patterns that illustrate true change over time (patterns D through F) include, among others, freezing, as discussed above. In this case, scores rise from a level of reversibility or moderate preference to frozenness. If we merely see a rising line whereby the preferred sequence becomes preferred to a greater degree over time, without achieving complete frozenness, I consider that tendency a freezing trend, or incomplete freezing. Unfreezing is the inverse process, in which a previous (possibly even fixed) preference for one sequence erodes to a point of reversibility, entailing that the dispreferred sequence gains greater currency. Incomplete unfreezing (an unfreezing trend) ends not with a reversible status, but still shows a development toward a smaller degree of preference for one sequence. Unfreezing is assumed as a potential pathway of change in binomial reversibility because previous literature has remarked on cases of reversed preferences, which necessarily incorporate an unfreezing component. In reversals in preference (pattern F), we find a continuously rising line toward greater preference or frozenness of one variant (scores greater than 75 percent) with the distinctive feature of a starting point less than 25 percent. This means that an unfreezing trend is followed by a freezing trend in favor of the currently preferred sequence. Again, the temporal coverage of the corpus may mean that there are unknown changes that precede or follow the time span for the dynamic patterns just as well as for the static ones. For example, binomials that appear to be cases of freezing in the corpus may in fact represent reversals in preference, if the crossing point of the 50 percent threshold lies too far back in time to be covered by the data.
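A rough way of sorting decade-by-decade trend lines into these patterns is sketched below. It is a simplified heuristic and not the procedure used in this study: it compares only the first and last data points, it treats a crossing of the 50 percent threshold as the sign of a reversal, and the minimum change of 10 percentage points for recognizing a trend is a purely hypothetical parameter. All names and example series are invented for illustration.

```python
def preference_category(score, frozen_above=99.0, reversible_below=75.0):
    """Categorize a score by the strength of preference, whichever order is favored."""
    strength = max(score, 100.0 - score)
    if strength > frozen_above:
        return "frozen"
    if strength < reversible_below:
        return "reversible"
    return "preference"


STATIC_LABELS = {
    "frozen": "continuing frozenness",
    "preference": "continuing preference",
    "reversible": "continuing reversibility",
}


def classify_pathway(scores, min_change=10.0):
    """Classify a chronological series of scores for today's preferred sequence."""
    if len(scores) < 2:
        raise ValueError("at least two data points are needed")
    first, last = scores[0], scores[-1]
    if last < 50.0:
        raise ValueError("the series must describe the sequence preferred today")
    if first < 50.0 < last:
        return "reversal in preference"
    start, end = preference_category(first), preference_category(last)
    if start != "frozen" and end == "frozen":
        return "freezing"
    if start != "reversible" and end == "reversible":
        return "unfreezing"
    # Assumption: smaller movements within one category count as static.
    if last - first >= min_change:
        return "freezing trend"
    if first - last >= min_change:
        return "unfreezing trend"
    return STATIC_LABELS[end]


# Hypothetical trend lines, one score per observation point.
print(classify_pathway([60.0, 70.0, 85.0, 99.5]))  # freezing
print(classify_pathway([30.0, 45.0, 60.0, 80.0]))  # reversal in preference
print(classify_pathway([95.0, 88.0, 80.0]))        # unfreezing trend
```

Because only the end points are compared, a series that rises and then falls again would be misclassified, so any complex pattern would need to be identified by inspecting the full trend line.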
While the diachronic development of the reversibility of binomials as such is uncharted territory, it can be integrated into more general theories of lexical change. In particular, Brinton and Traugott’s concept of lexicalization is relevant. Brinton and Traugott (2005) conceive of lexicalization as the counterpart to grammaticalization. While grammaticalization very broadly may be defined as a change whereby a linguistic item becomes more grammatical, lexicalization is a change whereby an item becomes more lexical. They define lexicalization as
the change whereby in certain linguistic contexts speakers use a syntactic construction or word formation as a new contentful form with formal and semantic properties that are not completely derivable or predictable from the constituents or the word formation pattern. Over time there may be further loss of internal constituency and the item may become more lexical. (Brinton & Traugott 2005:144)
This definition suggests that the term “lexicalization” applies best to frozen and idiomatic binomials, which have moved from the status of ad hoc coordinations to fixed forms with a noncompositional meaning. And indeed the only two binomials that Brinton and Traugott mention as examples of lexicalization in their book are both of the frozen, idiomatic kind: nuts and bolts (Brinton & Traugott 2005:49) as well as pins-and-needles (Brinton & Traugott 2005:90), whose status as one lexeme is also made transparent in the hyphenated spelling. However, nonfrozen and nonidiomatic binomials can also be integrated into the lexicalization framework if one emphasizes lexicalization as a gradual process with frozen and idiomatic items as the ultimate end point. Forms may proceed along the path to full lexicalization and remain at different points for the time being—becoming more fixed, and finally possibly even acquiring idiomatic semantics. One significant process that is, according to Brinton and Traugott (2005:90), frequently, though not always, found in lexicalization is fusion, which “concerns internal tightness of collocation and fixing of sequences.” Lexicalization thus involves a loss of syntactic flexibility, which we see when binomials develop a greater preference for one of their two possible sequences. Understood in this way, the lexicalization of binomials may be equated with the freezing process outlined above, with the additional focus on the acquisition of noncompositional meaning that may follow parallel to or after complete freezing. While Brinton and Traugott’s lexicalization may thus be used to view the diachronic development of binomials as part of a larger type of language change, with parallels to the emergence of word formations, idioms, and multiword units, it has the disadvantage that only one direction of change is emphasized, namely the freezing process. 
Interestingly, however, Brinton and Traugott do very briefly touch on the question of whether a reversal of lexicalization is possible. Since lexicalization involves items moving from the neutral middle ground of the lexical-grammatical cline to the lexical extreme, a reversal, which they term antilexicalization (Brinton & Traugott 2005:102), would move items from the lexical extreme of the cline back toward the neutral middle ground, that is, a process which would make “extremely lexical” items less lexical (but not grammatical). Brinton and Traugott (2005:103) maintain that the only examples of antilexicalization that have so far been documented are cases of folk etymology, and that future research is needed to show whether other types of antilexicalization exist (Brinton & Traugott 2005:148). If examples of unfreezing of previously frozen binomials are found in the data presented here, this gap will have been filled.
A Corpus-Based Analysis of the Diachronic Development of Binomial Reversibility in Late Modern American English
Method
The hypothesis that was stated in the previous section, that all six theoretically conceivable patterns of change in binomial reversibility do occur, is tested with the help of corpus linguistic methods. It is not surprising that such an empirical study into the diachronic development of reversibility has not been undertaken previously since a corpus of sufficient size has only recently become available. Even a synchronic study of the reversibility of binomials requires a rather large corpus since binomials are individually rather infrequent. If a binomial (in either of its two possible sequences) occurs only a few times in the corpus, a reliable (ir)reversibility score cannot be computed. In tracking the diachronic development across decades or centuries, however, each subcomponent of the diachronic corpus needs to be large enough to provide enough hits for the binomial sequences. A pilot study to the present analysis was carried out on the basis of the COHA corpus (400 million words overall; Davies 2010–). This corpus is already extremely large compared to standard historical corpora, providing data for all decades from the 1810s to the 2000s, with the smallest subcomponent (the 1810s data) comprising more than one million words and later decades between seven and thirty million words. This pilot study had to be discontinued because too many decades with insufficient numbers of hits (even for the most frequent binomials in contemporary written American English) made a sound analysis impossible, at least if a large degree of granularity is desired, as in the present project. COHA is of course sufficiently large if the data are collapsed over several decades, but to be able to trace the development of binomials along not just a few but a good number of data points, an even larger corpus was needed.
Therefore, this study benefits greatly from the introduction of the Google Books n-gram data, made available by Google in cooperation with a team of scholars in late 2010 as a result of their extensive book scanning program. The interface provided by the affiliated scholars for searching for n-grams in their corpus is severely restricted from a corpus linguist’s perspective (e.g., no raw frequencies are provided), so that we are fortunate that the corpus.byu.edu site provides a corpus-linguistically-minded interface for searching the American English subcorpus of the n-gram data (Davies 2011–). This American English Google Books corpus has a size of 155 billion words overall, divided into decades from the 1810s to the 2000s, with the size of decade subcomponents ranging from 378 million words for the 1810s to more than 26 billion words for the 2000s. This extremely large corpus is just about large enough for the present purposes since some of the binomials still occur too infrequently in the nineteenth century to provide complete trend lines. The problem is exacerbated by the fact that some of the binomials that are highly frequent today were less common in the past. While the Google Books corpus is currently the only feasible corpus to be used if addressing questions of lexical change, its disadvantages still need to be borne in mind. In particular, neither Google nor the team of scholars (Michel et al. 2011) who initiated and proceeded with the project of making the n-gram data public for research have made available detailed statistics on the composition of the corpus that underlies the n-gram data. Michel and his collaborators merely report that they selected more than five million books of those that had been digitized by Google at the time, resulting in a database of more than 500 billion words, 361 billion of which are written in English. 
Of these, 155 billion represent American English between 1810 and 2009 and are covered in the interface used for our present purposes. Selection criteria for inclusion in Michel et al.’s (2011:176) sample of books were “the quality of their OCR [optical character recognition] and metadata [provided by libraries whose books were scanned].” Google’s ultimate aim is to digitize all books ever published, but whether any further selection criteria apply in which books have been digitized first and thus form the basis for the corpus is unknown.
The corpus presently used is thus composed of published books. While this means that the corpus cannot be said to be representative of American English or even written American English as such (cf. Hoffmann 2011:397), it at least means that it is relatively homogeneous regarding text type (see the discussion of similar problems for the CLMET corpus in De Smet 2005). However, the corpus provides no metalinguistic information on topic or type of publication, such as whether a book represents fiction or academic writing (Nunberg 2011).
Within the BYU interface to the Google Books American English corpus, the 215 binomials extracted as the currently most frequent forms of the types “noun and noun,” “verb and verb,” “adjective and adjective,” as well as “adverb and adverb” from three written sections of COCA (see above) were searched for in both possible sequences (i.e., A and B as well as B and A). One binomial, Harold and Kumar, appears only in COCA, where it is frequent because the corpus contains the movie script to the 2004 movie Harold and Kumar Go to White Castle, but it is not represented in the Google Books n-gram data. The file of binomials analyzed thus shrank to 214 items. Note that the searches, unlike the original search to select candidates from COCA, were not part-of-speech specific. The Google Books n-gram data are not tagged for word class. While the BYU interface does allow for part-of-speech queries, the tags are not based on a full-text tagging of the corpus (as the corpus itself is not made public by Google, but only the lists of n-grams and their frequencies), but on probabilistic assignations. By default, the BYU interface assigns to a word form that part-of-speech tag that it received in more than 50 percent of its occurrences in the tagged COCA corpus (the level of confidence may be increased manually). The word form time, therefore, is assumed to be a noun, so that queries searching for the adverb pair time and again throw up no results, even though it is of course highly frequent. It was therefore decided not to include part of speech in the searches. I am aware that this may lead to a small number of false positives included in the data, when individual examples extracted from the Google Books corpus may not be true examples of pairs of nouns, verbs, adjectives, and adverbs. However, it is unlikely that these false examples significantly skew the picture, even for binomials containing a lexical element that is ambiguous in word class. 
Consider the nominal binomial cause and effect. Cause is ambiguous, being assigned a noun tag in COCA in 55 percent of cases in which COCA assigns an unambiguous POS tag and a verb tag in 45 percent of cases. Effect receives a nominal tag in 97 percent of cases, a verbal tag in 3 percent of cases. The ambiguity of cause, however, disappears in the coordinated form cause and effect: all examples of this coordinated word pair in COCA are examples of the “noun and noun” construction, that is, a true binomial (even in those few cases in which the COCA tagger has assigned the wrong tag). This is due to the fact that the coordinator and can be used only to coordinate syntactic equals (Quirk et al. 1985:945). The number of false positives will thus be very small and at the most include examples of and coordinating not two lexical items, but longer phrases and clauses.
The frequencies of both binomial sequences in the 20 decades provided were exported, and an (ir)reversibility score was calculated for each of the two sequences in each of the 20 decades (as explained earlier). Decades in which the overall frequency of the binomial (i.e., the frequencies of both binomial sequences combined) lay below 50 were disregarded since these frequencies were considered too low to allow for reliable conclusions regarding reversibility. When a binomial is this rare at a certain point in time, individual writers’ preferences may skew the results for the preferred order as well as the degree of reversibility. As a consequence, the trend lines analyzed for some rather rare binomials begin not in the 1810s but later, since the decades in the first half of the nineteenth century are less well represented in terms of sample size. As an illustration, the overall raw frequencies per binomial type range from 0 to 5,274 (gold and silver) in the 1810s (0 to 13.95 pmw), but from a minimum as high as 2,889 (orientation and mobility) to 718,781 (men and women) in the 2000s (0.11 to 26.74 pmw).
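The scoring and filtering step can be illustrated with a minimal Python sketch. It assumes that the (ir)reversibility score of a sequence is its share of the combined frequency of both sequences, expressed in percent, which is how the scores are used in this section; the function names and the toy frequencies are hypothetical and serve illustration only.

```python
def reversibility_score(freq_ab, freq_ba, min_total=50):
    """Share of sequence 'A and B' among both orders, in percent.

    Returns None for decades in which the combined frequency lies
    below the threshold of 50, since such decades were disregarded.
    """
    total = freq_ab + freq_ba
    if total < min_total:
        return None
    return 100.0 * freq_ab / total


def per_million_words(raw_frequency, corpus_size):
    """Normalize a raw frequency to occurrences per million words (pmw)."""
    return raw_frequency / corpus_size * 1_000_000


# Hypothetical decade: 90 hits for 'A and B', 10 hits for 'B and A'.
print(reversibility_score(90, 10))   # 90.0
# Hypothetical decade with only 40 hits overall: excluded.
print(reversibility_score(30, 10))   # None
```

With the 1810s subcorpus size of roughly 378 million words, `per_million_words(5274, 378_000_000)` comes to approximately 13.95, which agrees with the normalized figure given for gold and silver.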
When values for the development of a binomial’s (ir)reversibility score from the 1810s to the 2000s had been calculated, the challenge was then to assign the emerging trend lines for each binomial to one of the six abstract patterns presented in Figure 2. The first step here was to determine whether the fluctuations in (ir)reversibility scores that any binomial exhibits amount to a statistically significant trend over time, or whether these are insignificant, coincidental variations. The statistical procedure applied was that suggested by Hilpert and Gries (2009:389) as best practice for determining the significance of frequency changes in multistage diachronic corpora: correlating the variable of time with the variable of frequency, which, in this case, is operationalized as the relative frequencies of sequence A and B versus sequence B and A, that is, the (ir)reversibility scores of each binomial type. The correlation measure that is suggested is Kendall’s τ (tau), which is consequently also used here. Correlations are reported in terms of the value of Kendall’s τ as well as the level of significance, with significance assumed at a p value of less than .05, as is standard. Of the 214 cases analyzed, as many as 148 (69 percent) received a p value of less than .05, that is, are significantly (or even highly significantly, 128 showing a p value of less than .01) correlated with the time variable, and thus exhibit a significant trend over time. This first result already indicates that there is a great deal of dynamic development in the degree of binomial reversibility, and, as we will see in the following, the significance levels as such even underestimate the proportion of dynamic binomials in our time span, since a number of significant nonlinear trends are hidden in the group of binomials that first appear to be examples of nonsignificant and thus static patterns.
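For readers who wish to replicate the test, the following Python sketch correlates decade with (ir)reversibility score using Kendall’s τ. It is an illustration under stated assumptions rather than the procedure actually run for this study: it computes the tie-adjusted variant τ-b and derives a two-sided p value from the normal approximation without tie correction, and the score series is invented.

```python
import math


def kendall_tau(xs, ys):
    """Return (tau_b, two-sided p) for paired observations xs and ys.

    The p value uses the normal approximation for Kendall's S without
    a correction for ties (an assumption made for this sketch).
    """
    n = len(xs)
    concordant = discordant = ties_x = ties_y = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = xs[j] - xs[i]
            dy = ys[j] - ys[i]
            if dx == 0 and dy == 0:
                continue          # tied on both variables
            elif dx == 0:
                ties_x += 1       # tied on x only
            elif dy == 0:
                ties_y += 1       # tied on y only
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    s = concordant - discordant
    denom = math.sqrt((concordant + discordant + ties_x)
                      * (concordant + discordant + ties_y))
    if denom == 0:
        return 0.0, 1.0           # one variable is constant: no trend
    tau = s / denom
    z = s / math.sqrt(n * (n - 1) * (2 * n + 5) / 18)
    p = math.erfc(abs(z) / math.sqrt(2))
    return tau, p


decades = list(range(1810, 2010, 10))          # the 20 decades
scores = [70 + i for i in range(20)]           # invented, steadily rising
tau, p = kendall_tau(decades, scores)
print(tau, p < 0.05)
```

A positive τ with p below .05 would place the invented series among the freezing trends, a negative one among the unfreezing trends. Statistical packages typically offer exact or tie-corrected p values, which may differ slightly from this approximation for short series.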
Among the binomials that achieved significance in the initial testing, cases were assigned to the category of freezing trends if the correlation coefficient was positive, indicating that with passing time, the (ir)reversibility score has risen. Within this category of binomials, three further patterns were distinguished. First, we distinguish true freezing, which ends in a fixed order with an (ir)reversibility score of at least 99 percent, and second, a freezing trend, in which we witness a significant development toward greater fixedness, but without the binomial (yet) achieving frozenness. True freezing corresponds to pathway D in Figure 2, while the general freezing trend may be seen as an incomplete version of the same. The third type of freezing trend is a reversal in preference, as in pathway F in Figure 2: here the trend line begins with values of less than 25 percent and ends in a score of more than 75 percent, so that a preference for sequence B and A (i.e., a score for A and B of less than 25 percent) has given way to a preference for A and B. Significant cases producing negative values for Kendall’s τ were assigned to the broad category of unfreezing trends, including true unfreezing (ending in a point below 75 percent) and unfreezing trends that merely show a significant falling line without ending up with a reversible status.
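The assignment rules for significant linear trends can be summarized as a small decision function. The Python sketch below is illustrative only: the function name and its arguments are hypothetical, and checking for a reversal in preference before checking for true freezing is an assumption, since the order in which the criteria were applied is not specified.

```python
def classify_linear_trend(tau, p, first_score, last_score, alpha=0.05):
    """Assign a significant linear trend to one of the five dynamic types.

    tau, p       : Kendall's tau and its p value for score against time
    first_score  : (ir)reversibility score at the first analyzable decade
    last_score   : (ir)reversibility score at the last decade
    """
    if p >= alpha or tau == 0:
        return "no significant linear trend"
    if tau > 0:
        # Assumption: a reversal takes precedence over true freezing.
        if first_score < 25 and last_score > 75:
            return "reversal in preference"
        if last_score >= 99:
            return "true freezing"
        return "freezing trend"
    if last_score < 75:
        return "true unfreezing"
    return "unfreezing trend"


# Invented values for illustration.
print(classify_linear_trend(0.8, 0.001, 60.0, 99.5))   # true freezing
print(classify_linear_trend(-0.6, 0.010, 99.0, 83.0))  # unfreezing trend
```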
While the analysis of assigning individual cases to larger categories of pathways of change could have ended here, I was aware that a simple linear correlation, such as Kendall’s τ statistic, may mask nonlinear developments. All cases, whether they tested as significant or not, were thus also graphically plotted to detect such nonlinear patterns. Two emerged in the analysis, both of which are combinations of freezing and unfreezing trends: a binomial may show a trend toward rising scores but at a certain point (the summit) begins to produce a falling trend line. Likewise, we may see an unfreezing trend up to a certain low point, followed by a freezing trend. Such patterns were witnessed both among the significant cases (in which one of the two trend lines was steep or long enough to mask the other) and among the nonsignificant cases (in which the two trends canceled each other out). Cases were assigned to these patterns of a freezing trend followed by an unfreezing trend or an unfreezing trend followed by a freezing trend if both trend lines (the one leading up to summit or low point and the one following on from there) produced significant correlations with the time variable on their own.
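The criterion for the two combined patterns, namely that both partial trend lines must be significant on their own, can be sketched as follows. In the study the turning points were identified from the plotted trend lines; taking the maximum or minimum score as the turning point is a simplifying assumption of this sketch, as are the function names, the normal-approximation p value, and the invented score series.

```python
import math


def trend_sign_and_p(values):
    """Sign of Kendall's S for values against their position, plus a
    two-sided p value from the normal approximation (no tie correction)."""
    n = len(values)
    s = sum((values[j] > values[i]) - (values[j] < values[i])
            for i in range(n - 1) for j in range(i + 1, n))
    if s == 0:
        return 0, 1.0
    z = s / math.sqrt(n * (n - 1) * (2 * n + 5) / 18)
    return (1 if s > 0 else -1), math.erfc(abs(z) / math.sqrt(2))


def detect_turning_pattern(scores, alpha=0.05):
    """Return (label, turning_index), or (None, None) if neither pattern fits."""
    summit = max(range(len(scores)), key=scores.__getitem__)
    low = min(range(len(scores)), key=scores.__getitem__)
    candidates = (
        (summit, 1, "freezing trend followed by unfreezing trend"),
        (low, -1, "unfreezing trend followed by freezing trend"),
    )
    for turn, first_direction, label in candidates:
        # Both segments include the turning point itself.
        sign_before, p_before = trend_sign_and_p(scores[:turn + 1])
        sign_after, p_after = trend_sign_and_p(scores[turn:])
        if (sign_before == first_direction and sign_after == -first_direction
                and p_before < alpha and p_after < alpha):
            return label, turn
    return None, None


# Invented series: ten rising decades followed by ten falling decades.
rise_then_fall = [60 + 4 * i for i in range(10)] + [96 - 3 * i for i in range(1, 11)]
print(detect_turning_pattern(rise_then_fall))
```

For short segments the normal approximation is rough, so borderline cases would be better checked with an exact test.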
A related phenomenon that masks an actual historical trend in the nonsignificant cases was the pattern of a significant trend line that sets in only after a few decades of wild vacillations. These zigzagging patterns are clearly due to low frequencies since they only occur in decades in which the total frequency of the binomial is fewer than 2,000 occurrences, and in most cases fewer than 1,000 occurrences. Remember that only decades in which a binomial had an overall frequency of at least 50 occurrences were included in the study to ensure the reliability of the (ir)reversibility scores computed. For most binomials, the cutoff point of 50 occurrences per decade appears to have been effective, since the values for decades with frequencies greater than 50 (and even fewer than 2,000) are in harmony with those in later decades with many more occurrences, leading to stable flat lines or stable historical trends. In seven cases, a higher threshold would have been necessary. The patterns of these seven cases are likely to be due to the fact that in lower-frequency decades, individual writers, possibly writing books with several occurrences of the same binomial in them, may strongly influence the overall (ir)reversibility scores. All seven cases, which did not initially achieve significance, turned out to show a significant freezing or unfreezing trend if those initial data points that exhibited zigzagging were excluded. These binomials were then also assigned to the categories of freezing or unfreezing trends.
It appears then that there are three types of reasons why binomials did not achieve significant values for a Kendall’s τ correlation. First, the overall correlation may hide significant trends, either combinatory trends or trends obscured by initial vacillations due to low frequencies; both problems were remedied by reassigning the cases concerned to dynamic patterns. Second, cases may be nonsignificant for the simple reason that there is indeed no significant trend and (ir)reversibility scores have remained relatively stable over time. These cases, representing flat lines in the graphs as in pathways A through C in Figure 2, were assigned to the categories of continuing frozenness if the majority of data points lay above 99 percent, to continuing preference if the majority of data points lay between 75 percent and 98.99 percent, and to continuing reversibility if the majority of data points lay below 75 percent. Third, a small number of binomials tested did not achieve significance in the correlation between scores and time because there were too few data points available, since a few items became frequent enough for analysis only in the second half of the twentieth century. These cases were assigned to the “unclear” category, which also contains some items exhibiting wild zigzagging that does not amount to any significant trend line or a stable line. These are most frequently combinations of proper nouns such as Afghanistan and Iraq or Russia and China, whose ordering is likely to be subject to political fluctuations.
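The majority rule for the static categories can be expressed compactly. The Python sketch below is illustrative: the function name is hypothetical, a score of exactly 99 percent is counted as frozen (in line with the threshold of at least 99 percent used for true freezing), and returning “unclear” when no band holds a majority is an assumption not stated in the method.

```python
def classify_static(scores):
    """Assign a binomial without a significant trend to a static category,
    based on where the majority of its decade scores lie."""
    n = len(scores)
    frozen = sum(1 for s in scores if s >= 99)
    preferred = sum(1 for s in scores if 75 <= s < 99)
    reversible = sum(1 for s in scores if s < 75)
    if frozen > n / 2:
        return "continuing frozenness"
    if preferred > n / 2:
        return "continuing preference"
    if reversible > n / 2:
        return "continuing reversibility"
    return "unclear"  # assumption: no band holds a majority


# Invented score series for illustration.
print(classify_static([99.6, 99.8, 99.4, 98.9, 99.7]))  # continuing frozenness
print(classify_static([88.0, 91.5, 86.2, 90.1]))        # continuing preference
print(classify_static([55.0, 61.3, 48.9, 58.4]))        # continuing reversibility
```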
Results: Pathways of Change for Binomial Reversibility
Table 1 shows the results of the meticulous process of assigning the 214 word form binomials studied to the hypothesized patterns of change and nonchange with the help of the correlation statistic. The table gives both the raw frequency of binomial types assigned to a certain category and the percentages (always calculated on the basis of all 214 types). The table is organized from the broadest categorization on the left (dynamic vs. static) to the finest categorization on the right (ten different types of change and nonchange), with the middle columns occupied by the categories of freezing trends, unfreezing trends, nonlinear trends, and no trend. One of the most interesting results is that static patterns are not at all predominant, as might have been assumed considering the small time frame of the study, a mere two hundred years. One could have expected that most of the binomials that are in use today have remained stable in their reversibility status during this period, having stabilized long before. Yet the late modern period is shown to be a period of dynamic changes in this respect, and the phenomenon of binomial reversibility is indeed revealed to be subject to strong diachronic developments. As large a proportion as 76 percent of binomial types reveals significant dynamic changes in the period considered, with only 20 percent remaining stable (and 4 percent being unclear).
Table 1. Frequencies and Percentages of Binomials in the Diachronic Patterns (214 Binomial Types).
Among the rather static binomials, those that continue to prefer one variant over the other more or less strongly are the most frequent (63 percent of static binomials), followed by frozen ones (26 percent). Naturally, however, this distinction is not clear-cut, since the threshold for frozenness could be set differently than at the arbitrarily selected threshold of 99 percent. What does seem to be rare, however, is for binomials to remain rather reversible over a long stretch of time, as these cases make up only 12 percent of all static cases (and only 2 percent of the cases overall). The continuously frozen category consists of well-known fixed phrases such as born and raised, part and parcel, trial and error, or pros and cons, which suggests that these phrases have been frozen since before the beginning of the time span covered by the corpus. Examples that exhibit the same preference over time but also continue to allow the reverse variant are time and money, name and address, North and South, and beliefs and practices. A few of the rare examples with continuing reversibility are physical and mental, how and why, and social and economic. Figure 3 illustrates one example each for the three static categories.

Figure 3. Examples of static patterns: Continuing frozenness (part and parcel), continuing preference (time and money), and continuing reversibility (physical and mental).
Freezing, the pathway of change that has received the most attention in the past, is quite frequent if conceptualized as a broad category of freezing trends at 62 percent of dynamic binomials (and 47 percent of all binomials in the sample). It falls into three categories: true freezing leading to frozenness, a freezing trend stopping short of frozenness, and, as a special case, a freezing trend incorporating a reversal in preference. True freezing is rather rare in the period examined, representing only 6 percent of cases overall. Examples in our data are law and order, track and field, and wear and tear, all of which are clearly freezes today but have achieved this frozen status only at some point within the past 200 years. Binomials showing a freezing trend in general, however, are very frequent in the data, and indeed make up the largest category of patterns of change and nonchange, representing 40 percent of all cases. Examples of freezing trends, with significantly rising trend lines toward a point of greater, if not complete, frozenness are ladies and gentlemen, live and work, arms and legs, and black and white. One of the most interesting categories, even if very rare (less than 1 percent of binomial types), is the special type of freezing trend in which we see a reversal in preference. Since unfreezing has not been mentioned as a possible pathway in previous literature, it is all the more fascinating that not only does unfreezing occur, but in some cases, the gradual loss of preference for one variant goes so far as to lead to a preference for the reverse order. For the two examples of this category (salt and pepper and math and science), the reverse order was the more common one at the beginning of the nineteenth century (or when they were first attested in sufficient quantities later) at an (ir)reversibility score of less than 25 percent for the contemporarily preferred sequence. 
The rarity of examples in this category is almost certainly due to the short time span of the corpus, since a development spanning more than 50 percentage points of the (ir)reversibility score may typically take longer than 200 years to complete. A number of binomials in the data could be examples of a reversal in preference over a longer time span since they show developments from values at around 30 percent to values above 70 percent, such as vitamins and minerals, or flora and fauna. Since, however, we imposed a threshold of 75 percent (and 25 percent, respectively) as signifying preference for one binomial sequence, they do not strictly qualify as reversals in our terminology. Examples of the three types of freezing trends are graphically illustrated in Figure 4.

Figure 4. Examples of three types of freezing trends: True freezing (law and order), freezing trend (ladies and gentlemen), and reversal in preference (salt and pepper).
Unfreezing possibly runs counter to a model of usage-based exemplar learning (e.g., Bybee 2006, 2010; see discussion below) positing that the more frequently a variant is encountered, the more entrenched it ought to become. Even though unfreezing trends are not the most common category (24 percent of all binomials), they do occur, and not infrequently. True unfreezing with the end result of reversibility occurs in 6 percent, and unfreezing trends toward greater reversibility in 18 percent of all cases. True unfreezing is represented by cases such as women and girls, parents and teachers, and doors and windows, while mere unfreezing trends are observed for parents and children, high and low, and hardware and software. Even more interesting, we find as many as 14 cases (27 percent of all cases showing some kind of unfreezing trend) in which a previously frozen binomial with a score of greater than 99 percent at the beginning of its time line shows a significant unfreezing trend (e.g., male and female, sons and daughters, and flesh and blood). This is evidence that frozenness is not necessarily permanent, but a status that may be lost. Figure 5 gives an example each for true unfreezing and an unfreezing trend.

Figure 5. Examples of two types of unfreezing trends: True unfreezing (women and girls) and unfreezing trend (parents and children).
Finally, as explained in the methodology section, in addition to the hypothesized dynamic pathways of change (cf. Figure 2), two more complex patterns were detected in the data, namely, combinations of significant trends of freezing and unfreezing. The combination of a freezing trend followed by an unfreezing trend is very rare (1 percent of all binomials) but does show again that a loss of preference is possible, even if it was only recently acquired. The examples in this category are children and adults, boys and girls, and social and environmental. Likewise, the opposite development is possible: an unfreezing trend is followed by a freezing trend, which is somewhat more frequently attested (3 percent of all binomials). For some reason, a variant loses a degree of preference, but regains it, as in social and political, church and state, food and water, and shapes and sizes. Figure 6 illustrates both nonlinear patterns with an example each.

Figure 6. Examples of two types of nonlinear patterns: Freezing trend plus unfreezing trend (children and adults, summit in the 1950s) and unfreezing trend plus freezing trend (social and political, low point in the 1930s).
To sum up the quantitative results of our analysis, the hypothesis that the six major patterns set out in Figure 2 do occur in authentic data is confirmed, and two further, more complex patterns have been discovered in addition. In general, change in binomial reversibility is common (and is probably even underestimated in studying such a short period). Moreover, while freezing—the one pathway that has received the most attention in the previous literature—is indeed the most frequent of the dynamic pathways of change, a multitude of other pathways exist, some of which need yet to be integrated into theoretical frameworks.
Changes in Binomial Reversibility Related to Changes in Frequency
The results presented show that the category of freezing trends occurs the most frequently (ca. 47 percent) among the different diachronic patterns for the 214 binomials investigated. Could this predominance of freezing be due to the fact that the sample represents the most frequent binomials in the 2000s, that is, the most frequent at the end of the time span? These currently highly frequent items are likely to have a history of having become more frequent over time (in terms of a normalized frequency), and it is not out of the question that rising frequencies may be positively correlated with freezing processes. Lohmann (2011:201n137) points out that in his synchronic sample of rather irreversible binomials (with (ir)reversibility scores of 90 percent and above) extracted from the BNC, there is a significant negative correlation between the overall frequency of the binomial type and the probability of the number of observed reversed instances of the preferred sequence. This suggests that higher frequencies and higher (ir)reversibility scores are positively correlated. Even though it does not directly reflect the diachronic perspective and covers only one extreme of the cline of reversibility, this finding offers a first indication that freezing trends may be connected to rising frequencies over time. To find out whether such a connection results in a bias in the selected high-frequency sample, which would lead to an overestimated proportion of freezing trends and presumably a concomitant underestimation of unfreezing trends, a control sample of binomials with lower frequencies was collected. The same subcorpus of COCA (137 million words, three written genres, 2000–2011) was used as for the selection of the original high-frequency sample to randomly gather fifty binomials from all those with a frequency of 14 occurrences (0.1 pmw). Remember that the lower cutoff point for inclusion into the high-frequency sample was 137 occurrences (1 pmw). 
Using even lower frequencies than 0.1 pmw was deemed unfeasible since extremely low frequencies lead to unreliable (ir)reversibility scores (as the likelihood of turning up reverse examples is low). The same methodological procedure as described above was then applied to the fifty low-frequency items, resulting in the distribution of diachronic patterns of binomial reversibility based on Google Books data found in Figure 7, in which the proportions of patterns are compared for the high- and the low-frequency samples. The largest differences are indeed found for the freezing and unfreezing trends: freezing trends as a broad category make up 47 percent of all cases in the high-frequency sample, but only 14 percent in the low-frequency sample, while unfreezing trends cover 38 percent of the less frequent binomials (and 24 percent of highly frequent ones). The less frequent binomials also contain more static as well as unclear cases.

Figure 7. Proportions (in percentage) of diachronic patterns in the samples of high-frequency and low-frequency 2000s binomials.
A similar result is achieved with a different kind of control sample, namely when analyzing binomials that used to be highly frequent in the past. For this purpose, since the interface to the Google Books Corpus of American English does not allow for an extraction of the most frequent types in a given period, the COHA corpus needed to be used to collect the 100 binomial sequences that were the most frequent at the beginning of the time span, the early nineteenth century (in the decades of the 1810s, the 1820s, and the 1830s collapsed). The overlap with the sample of the 214 contemporarily most frequent binomials is not substantial: 36 binomials occur in both samples and thus contribute to the proportions of patterns in each sample. The proportions of patterns (based on Google Books) in both samples are compared in Figure 8. Once more, the original sample of currently frequent binomials has a higher percentage of freezing trends (47 percent vs. 23 percent) and a lower percentage of unfreezing trends (24 percent vs. 48 percent) than the sample of previously frequent binomials.

Figure 8. Proportions (in percentage) of diachronic patterns in the samples of high-frequency 2000s binomials and high-frequency 1810s–1830s binomials.
The conclusion is thus supported that selecting the most frequent binomials for analysis has indeed biased the results toward more cases of freezing trends and fewer cases of unfreezing trends. Nevertheless, it is a constant finding in all three samples that all hypothesized patterns occur (apart from true freezing and the even more dramatic reversals of preference, which are not witnessed in the—rather small—sample of contemporary low-frequency items). In addition, all three samples have much higher proportions for the dynamic pathways of change than for the static patterns, which account for only between 20 percent and 30 percent of all cases. The question now arises whether the higher proportion of freezing trends in the original sample (and the lower proportion of unfreezing trends) is due to the fact that more binomials in this sample have a history of rising general frequency. To answer this question, all binomials in the three samples were categorized as representing rising, falling, or stable frequencies over time. The categorization was made based on a correlation of frequencies per million words per decade in the Google Books American English data on one hand and the time variable on the other. Cases with significant positive correlations (employing Kendall’s τ) were classified as representing rising frequencies, cases with significant negative correlations as falling frequencies, and cases with no significant correlation as stable frequencies. The figures reveal that the high-frequency 2000s sample is certainly marked by a large proportion of rising frequencies (74 percent compared to 11 percent falling frequencies). The high-frequency 1810s to 1830s sample, on the other hand, is almost the exact opposite with 7 percent rising and 73 percent falling frequencies. The low-frequency 2000s sample sits in the middle, yet rising frequencies are here more frequent than falling ones (48 percent and 20 percent, respectively). 
The types of frequency trends were then cross-tabulated with the diachronic patterns of reversibility identified before. The cross tabulations are presented in Table 2.
Table 2. Cross Tabulations of Types of Frequency Trend and Diachronic Patterns of Binomial Reversibility in Three Data Samples.
The most interesting observation is that we do not universally find a tendency for rising frequencies and freezing trends to co-occur (as well as falling frequencies and unfreezing trends), even though the first sample shows a remarkably high figure (n = 78) for the cell that combines rising frequencies and freezing trends; and the third sample, especially, has a high frequency of cases (n = 37) with both falling frequencies and unfreezing trends. Yet since the row and column totals differ strongly within each sample (reflecting the marked differences in proportions of frequency trends over time), statistical tests are required. Since the interesting issue, both from a theoretical point of view and judging from the data presented in Table 2, is the interaction between freezing/unfreezing trends on one hand and rising/falling frequencies on the other, a chi-square test was performed on these two variables independently for each sample (because sample sizes differ strongly, and Samples 1 and 3 have overlapping cases). Only Sample 1 (the contemporary high-frequency set) produces a significant result, with freezing trends occurring more often than expected with rising frequencies, and unfreezing trends occurring more often than expected with falling frequencies (χ2 = 5.23, p = .022*, df = 1, effect size measured in Cramér’s V = .21). In the other two samples, there is no significant interaction between the two variables (Sample 2: χ2 = 0.873, p = .35, df = 1, V = .20; Sample 3: χ2 = 0.676, p = .411, df = 1, V = .11). Furthermore, Table 2 provides a large number of counterexamples to the hypothesis that freezing trends require rising frequencies: there are 22 different cases of freezing trends with falling frequencies (26 in the table, but four overlap between Samples 1 and 3), such as necessary and proper or body and soul, but no case of true freezing. 
It is thus possible that freezing does need rising frequencies to be completed with the result of frozenness, but this would need further corroboration. In addition, we find 38 different examples of binomials showing an unfreezing trend while becoming more frequent over time (40 cases in Table 2), such as ideas and feelings or fat and sugar. This tendency provides evidence that an unfreezing trend is not merely a symptom of falling frequencies, but a pathway of change in its own right. Usage-based theories (e.g., Bybee 2006, 2010), which assume that rising frequencies inevitably lead to entrenchment, thus face a difficulty in explaining unfreezing trends, especially since we have also noted examples of previously completely frozen binomials unfreezing (see further below).
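The chi-square test of independence and the Cramér’s V effect size used for the cross tabulations can be sketched as follows. This is a minimal illustration, not the procedure actually run for this study: it assumes a Pearson chi-square without continuity correction (the text does not say whether a correction was applied), the function names are hypothetical, and the 2 × 2 counts are invented, since the cell counts of Table 2 are not reproduced here.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    `table` is [[a, b], [c, d]], e.g. rows = rising/falling frequency,
    columns = freezing/unfreezing trend. All margins must be non-zero.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with df = 1."""
    return erfc(sqrt(chi2 / 2.0))


def cramers_v_2x2(chi2, n):
    """Cramér's V; for a 2x2 table this reduces to sqrt(chi2 / n)."""
    return sqrt(chi2 / n)


# HYPOTHETICAL counts, invented for illustration (not taken from Table 2).
example = [[30, 10],   # rising frequency: freezing, unfreezing
           [15, 20]]   # falling frequency: freezing, unfreezing
chi2 = chi_square_2x2(example)
n = sum(sum(row) for row in example)
print(round(chi2, 3), round(p_value_df1(chi2), 4), round(cramers_v_2x2(chi2, n), 2))

# Recompute p-values from the chi-square values reported in the text.
for reported in (5.23, 0.873, 0.676):
    print(reported, round(p_value_df1(reported), 3))
```

With df = 1, the reported chi-square values of 5.23, 0.873, and 0.676 correspond to p ≈ .022, .35, and .41 under this formula, in line with the figures given above.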
Motivation for Changes in Binomial Reversibility
The results showing the dynamic changes in binomial reversibility raise the question of the motivation of such changes. In general, it can safely be assumed that these processes are complex and difficult to determine, since the question of motivation is notoriously difficult to answer in all questions of language change. There are a number of binomials whose developments may be attributed to sociocultural changes concerning their extralinguistic referents, as the developments in binomial reversibility seem to have proceeded parallel to changing preferences in the social order. Such a case could, for example, be made for the unfreezing trend of men and women (with scores between 98 percent and 99 percent from the 1810s to the 1970s, with a sharply falling trend line down toward 83 percent in the 2000s), which is in all likelihood connected to the real-life emancipation of women. After all, Malkiel (1959:145) drew attention to the fact that the ordering of elements A and B in a binomial could be due to “priorities inherent in the structure of a society,” as appears to be evident in ordering tendencies such as male before female, old before young, masters before servants, and so on. Benor and Levy (2006:239) included this “power constraint” as a potential determinant of the ordering of binomials in their quantitative analysis of how frequently different proposed constraints were satisfied or violated. They found that when binomials can be ordered according to social power hierarchies or according to their centrality in society (with the more powerful or more central element in the A slot), they will more often than not be thus ordered, namely in 69 percent of cases (Benor & Levy 2006:252); Mollin (2012:91) puts this figure even higher at 84 percent of cases. 
Therefore, if cultural value hierarchies are reflected in the ordering of particular binomials, this preferential ordering ought also to change when cultural values change, which is what we assume for the case of men and women. A similar speculation could be made for other cases, such as the true unfreezing of family and friends, potentially reflecting the development that self-selected peers have become more important relative to genetic kin in postmodern Western societies (cf. Pahl & Pevalin 2005), or the true unfreezing of Europe and the United States, indicating a potential surge in confidence among Americans vis-à-vis Europeans. However, arguments of this kind are difficult to substantiate with evidence, so that the sociocultural determination of changes in binomial reversibility needs detailed corroboration, preferably on the basis of both linguistic and sociological data. In addition, the majority of binomials cannot be ordered according to the power constraint, simply because their referents do not stand in a hierarchical relationship in society. Likewise, sociocultural changes cannot be assumed to be the motivating force for most changes that we observe in reversibility status.
In the absence of a sociocultural motivation for change, one possible explanation for the freezing process could be deduced from usage-based views of language (e.g., Bybee 2006, 2010; Beckner et al. 2009). Usage-based approaches emphasize the role of frequency in shaping grammar. Frequently occurring items become entrenched and gain separate entries in the mental lexicon. Bybee (2006:715) writes of irregular morphological forms, “High-frequency sequences become more entrenched in their morphosyntactic structure and resist restructuring on the basis of productive patterns that might otherwise occur.” This reasoning may be applied to the freezing of binomials (as of any other type of multiword unit). Freezing is then due to a cognitive automatism connected to input frequency. The more frequently a speaker encounters a binomial (in speech and writing) in one particular order, the more likely he or she is to produce this particular order herself or himself, thus again providing more input for other speakers reinforcing this binomial order. I have previously suggested that this may turn into a self-energizing process:
Once a certain threshold of reversibility score is reached, enough speakers will have internalised this sequence X and Y as a chunk and will produce it exclusively, so that in turn even more speakers are exclusively confronted with X and Y and internalise it as well—until it is completely frozen. (Mollin 2012:100)
While this self-energizing process can explain the upward spiral of (ir)reversibility scores in the freezing process, usage-based approaches have so far (to my knowledge) not addressed the possibility of unfreezing, or antilexicalization in the sense of Brinton and Traugott (2005). Usage-based theory could, however, account for unfreezing that is connected to the falling general frequency of a binomial, given that when a chunk is used less and less frequently, it may lose entrenchment and its status as a chunk stored whole in the mental lexicon. A fixed binomial could therefore also lose fixedness, and become relegated to the status of ad hoc coordination once more, resulting in a proportionally greater frequency of the previously dispreferred variant. This process is possible because, as Bybee and Torres-Cacoullos (2009:188) propose, even autonomous chunks still retain associations to their underlying individual lexical elements. However, the finding that unfreezing does also (if less typically) occur with rising frequencies poses a challenge to this usage-based account of unfreezing. A further challenge to the model is the fact that many binomials, even if they begin a freezing trend, do not complete it, with the dispreferred variant still in existence; after all, true freezes are rather rare (cf. Figure 1). In addition, we have found that individual binomials move between unfreezing and freezing trends, and even show reversals in preference. A purely cognitive-automatic explanation of reversibility change is thus not yet sufficient to explain these patterns.
There must thus be factors that set changes in motion or even reverse changes. General theories and research on language change suggest that there is typically a social motivation in terms of specific individuals or groups of speakers introducing new forms (“innovators” in the sense of Milroy & Milroy 1985) or disseminating changes (“early adopters”). While the makeup of the Google Books corpus does not allow for fine-grained sociolinguistic investigations into who is in the vanguard of a change in the reversibility status of a given binomial, we can nevertheless assume that factors such as positive and/or negative identification with certain other speakers as well as overt and covert prestige will play a role in the dissemination of changes in binomial reversibility, as in other types of language change.
A final factor that may explain changes in binomial reversibility is explicit “verbal hygiene” in the sense of Cameron (1995), in which speakers campaign for the use or nonuse of certain linguistic items or structures, as in the movement toward politically correct language. Regarding binomials, this is conceivable especially in the ordering of word pairs referring to groups of people of, for example, differing ethnicities or genders. The development of the latter is the subject of a small case study.
Case Study: The Diachronic Development of the Reversibility of Gender Binomials
Some of the most dramatic changes in reversibility shown in the data concern gender pairs. There are fourteen binomials in the original high-frequency 2000s sample that juxtapose equivalent terms referring to men and women, twelve of which show significant developments over time. On average, the gender binomials show a value of Kendall’s τ = −0.65 for the correlation of (ir)reversibility scores with time, which means that they show a high negative correlation (i.e., unfreezing), while the average value for the whole group of binomials is τ = 0.18; that is, the (ir)reversibility scores of all binomials in the sample are, on average, only slightly positively correlated with time. The gender pairs have thus witnessed exceptionally extensive historical developments regarding their degrees of reversibility. Naturally, the past decades have also seen dramatic changes in the roles of men and women in Western societies, which have, following the women’s liberation movement of the 1960s and 1970s (also called second-wave feminism in distinction from first-wave feminism focusing on women’s suffrage), moved from patriarchy to (at least nominal) emancipation of the genders (Rosen 2006). This seems to be partly reflected in the reversibility status of binomials: Figure 9 shows the developments of all binomials in the sample that juxtapose equivalent female and male terms (excepting proper names). For better comparability, all figures shown concern the binomial variant with the male element first, whether these are the versions that are preferred today or not. To increase graphical transparency, word form binomials are collapsed into lemmatized binomials in the graph if the two binomials (in the singular and the plural) show exactly parallel developments, as is the case for father(s) and mother(s), male(s) and female(s), as well as brother(s) and sister(s).

Diachronic development of binomial reversibility in gender binomials.
There is a general trend of a falling line, meaning that the preference for the male element to be placed first is eroding. There are only two binomials that show stable lines: Mr and Mrs continues to be frozen, while men and women continues to show a marked preference for the male element in the first slot. All other binomials show significant trend lines, namely freezing trends for those binomials that already prefer the female element first today (Mom and Dad, ladies and gentlemen, and mother(s) and father(s)), and unfreezing trends for binomials that still prefer the male element in the first slot (husband and wife, man and woman, sons and daughters, male(s) and female(s), and brother(s) and sister(s)). Boys and girls is a special case since it combines a significant freezing trend up to the 1950s with a significant unfreezing trend from the 1950s onward. If we consider only the developments in the second half of the twentieth century, however, we can see that twelve out of fourteen gender binomials show significant trends toward a smaller relative frequency of tokens with the male element ordered before the female element, the other two remaining stable. There is no binomial showing a movement toward a greater preference for the male element to come first. This is rather overwhelming evidence in favor of the hypothesis that social developments are mirrored in the diachronic (ir)reversibility scores of gender binomials.
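The two quantities behind these trend lines can be sketched in a few lines of code: the share of tokens in one order per decade, and Kendall’s τ between that share and time. The sketch below rests on assumptions that the text does not confirm. It takes the score to be the percentage of tokens in a given order (as the percentages quoted in this article suggest), it implements the tie-corrected τ-b variant, it omits significance testing, and its function names and decade counts are hypothetical.

```python
from math import sqrt


def order_share(count_ab, count_ba):
    """Percentage of tokens in the order 'A and B' among both orders."""
    total = count_ab + count_ba
    if total == 0:
        raise ValueError("binomial not attested in either order")
    return 100.0 * count_ab / total


def kendall_tau_b(xs, ys):
    """Kendall's tau-b (tie-corrected) by direct pair counting, O(n^2)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = ties_x = ties_y = 0
    n = len(xs)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = xs[i] - xs[j]
            dy = ys[i] - ys[j]
            if dx == 0:
                ties_x += 1
            if dy == 0:
                ties_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    n_pairs = n * (n - 1) / 2
    denominator = sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    if denominator == 0:
        raise ValueError("tau is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# HYPOTHETICAL (male-first, female-first) token counts per decade,
# invented for illustration only.
decades = [1950, 1960, 1970, 1980, 1990, 2000]
counts = [(980, 20), (975, 25), (940, 60), (900, 100), (870, 130), (830, 170)]
shares = [order_share(ab, ba) for ab, ba in counts]
print(shares)
print(kendall_tau_b(decades, shares))
```

A negative τ for the male-first share corresponds to a falling line in Figure 9, that is, an eroding preference for placing the male element first.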
If we consider the changes plotted in Figure 9 in more detail, it is interesting that the binomials in which the tendency to name the woman first prevails (apart from ladies and gentlemen, see below) are those referring to the parental roles of men and women: contrary to usage in the nineteenth century, in which father(s) and mother(s) was strongly preferred, we have witnessed an unfreezing trend to the point of reversibility, with a mild preference today to name mothers first. One may speculate that this is because the mother’s typically more central role in child raising is now seen to be more important than the traditionally larger familial authority of the father. This development is independent of the feminist movement (beginning in the 1960s), given that the two lines for mother(s) and father(s) have been steeply falling since the beginning of the twentieth century. Related is Mom and Dad, which, however, is very rare in writing before the 1930s. All parental binomials now prefer the female element to come first, especially Mom and Dad, which is almost frozen.
Ladies and gentlemen is a special case, being the only gender binomial in our sample to have shown a greater frequency of female-first than male-first occurrences from the beginning of our sampled time frame. As mentioned above, Potter (1972:314) notes an apparent reversal in preference from the usage in Chaucer and speculates (without giving evidence) that in England, the order of the formula changed in the early eighteenth century. In our data, the binomial shows a freezing trend from a reversible status to almost complete frozenness (95 percent of occurrences are in the form ladies and gentlemen). However, there are two different uses to be distinguished: the use as a form of address (e.g., Ladies and gentlemen, I am happy to be speaking to you tonight) and the use for general reference to men and women of elevated social status or as a polite term of reference to men and women in general (e.g., A party of ladies and gentlemen was waiting outside). A case study was undertaken on the basis of the COHA corpus, whose 1,749 occurrences of ladies and gentlemen or gentlemen and ladies were categorized as either terms of address or terms of reference. In general, we find that the use of the binomial for reference has been steadily declining from the 1810s to the 2000s (τ = 0.642**, p < .001), so that almost all occurrences of the binomial today are used to address an audience. The two types of uses also differ in their (ir)reversibility score, as is shown in Table 3. Since the figures are small for the first half of the nineteenth century and for the twentieth century in the case of gentlemen and ladies, they have been collapsed into larger time bands in the table, but the correlations with the variable of time are significant even on a decade-by-decade basis. The scores for ladies and gentlemen used as a term of address begin with a moderate preference for the female element first, showing true freezing up to a frozen status today (τ = 0.627**, p < .001). 
The referential uses, however, were still reversible in the first half of the nineteenth century and show a freezing trend toward a moderate preference for ladies to be named first today (τ = 0.389*, p = .018), yet this use is dying out. We may speculate that the freezing trend for the declining referential use of the binomial has been triggered by the freezing of the address formula, which remains practically the only use of the binomial today. Nevertheless, the question remains why the form ladies and gentlemen, with the female element first, came to be frozen as the address term. After all, the other pair of address terms in the data, Mr and Mrs, continues to be frozen in the opposite order. Ladies and gentlemen has been termed a “politeness formula” by Holmes (2000:148). It is only used in speaking, directly addressing mixed audiences in formal contexts (all occurrences of the address form in the written COHA corpus are representations of speech). In these contexts of direct formal speech, apparently, the ordering constraint of power (Benor & Levy 2006:239) is overridden by one of politeness, in that precedence is given to the weaker gender so as not to draw attention to their less dominant status and to accord them respect. Cooper and Ross (1975:105) also point out ladies and gentlemen as an exception to their ordering principle of male before female, stating that “[t]his freeze represents a politeness convention,” adding, presumably humorously, “[p]oliteness conventions are in general contrary to natural tendencies.” Cooper and Ross thereby confirm that politeness, required in situations of direct address, may reverse the principle of power ordering so that precedence is given to the powerless. These initial reasons, which led to a preference for ladies to be named first, however, are in all likelihood no longer consciously followed, since the binomial has frozen and become a routine formula.
(Ir)reversibility Scores for ladies and gentlemen (term of address vs. term of reference), Based on All Occurrences in COHA.
Thus, apart from two binomials that have remained stable in their almost complete or complete frozen preference for the male element to come first, we see a general trend toward less strong preferences of this type. The developments regarding the parental binomials as well as ladies and gentlemen are unconnected to the women’s liberation movement, but eight of the fourteen gender binomials studied (husband and wife, man and woman, sons and daughters, male(s) and female(s), boys and girls, and brother(s) and sister(s)) show unfreezing trends that become noticeably more marked from the 1970s on. This is the time when the women’s liberation movement in the United States was at its height, with women campaigning for legal and de facto equal rights, and indeed achieving a greater equality of the genders, even if some issues (such as equal pay) are still unresolved (Rosen 2006). The unfreezing trends in binomial reversibility are certainly connected to this sociocultural development, but it is notable that no complete emancipation has been achieved in binomial ordering preferences either. The majority of gender pairs continues to be preferentially ordered male-first, even if these preferences are less strong than they used to be in the 1950s. The data thus show a limited reflection of a more emancipated role distribution in society in binomial reversibility. Interestingly, however, this development is unlikely to be due to explicit language planning, since the issue of ordering in gender binomials has featured rather low on the list of sexist usage targeted by feminists and by language professionals who were alerted to sexism in language by the feminist movement. Most of the discussion on sexist language from the 1970s on has focused on the use of generic he as a pronoun, generic masculine nouns such as man, mankind, chairman, and so on, asymmetric terms of address (Mr. vs. Mrs. and Miss) and asymmetric semantics, as in man and wife, master/mistress, and so on. 
Consulting influential manuals for nonsexist usage, we find that the ordering of binomials occurs at best in a marginal paragraph among other “miscellaneous” issues (as in Miller & Swift 1988:116-118; Maggio 1991:13; Schwartz 1995) or not at all (Frank & Treichler 1989). The somewhat less pronounced preference for male-first binomial ordering that we witness in recent decades thus seems to be due less to verbal hygiene (in the sense of Cameron 1995) practiced in regard to this specific point of usage than to a generally heightened awareness of potentially sexist features of the English language in the wake of campaigns for nonsexism. It is even to be hoped that a part of the motivation for this change lies in an actually changed perception of the values and roles of men and women in society, in that individuals occasionally place women first in binomials because this order is to them a natural reflection of the world. While it is more difficult to change the attitudes that underlie sexist language than to change sexist language itself (e.g., Ehrlich & King 1992), the great success of feminist language planning seems to have been to raise awareness of linguistic choices (Curzan 2003:180). In the wake of this awareness raising, real linguistic changes have followed. As Curzan (2003:181) states, “[T]he feminist movement has been surprisingly successful in pushing through changes in the norms of accepted linguistic behavior, in writing particularly but also to some extent in speech,” citing examples such as the demise of generic he or the decline of the use of girl to refer to adult women. Holmes and Sigley (2002) contribute a further success story concerning the changes in occupational labels; they also point out that corpora today show far higher frequencies of female terms, indicating that women have also become more visible in English. 
Overall, it seems that most of the sexist features targeted by feminist language planning have undergone change toward less sexist alternatives. The case of binomial reversibility gives a first indication that the influence of feminist linguistics goes beyond those features that were explicitly targeted, leading to slowly changing language use even for features that were not expressly denounced. In any case, the times in which prescriptive grammarians insisted on male-first ordering (e.g., Wilson 1560:234, quoted in Bodine 1975:134: “the worthier is preferred and set before. As a man is sette before a woman.”) are certainly past.
Conclusion
This study has been the first to tackle the question of whether binomial reversibility is subject to change over time, and, if so, which pathways of change binomials may follow. It was hypothesized that six abstract patterns would be attested in authentic language data, here written American English from the nineteenth and twentieth centuries. Three of the abstract patterns are static, representing binomials whose degree of reversibility has remained stable during the time span covered (even though changes may have occurred before or may yet occur). The remaining three patterns are genuine pathways of change: binomials may freeze, developing a stronger preference for one of the two possible sequences; they may unfreeze, leading to a greater degree of exchangeability of elements A and B; or there may even be a change in preferred order (from B and A to A and B). Indeed, all six patterns are attested in the data, with the dynamic pathways of change represented by 76 percent of the binomials studied. In addition, the empirical analysis revealed that more complex patterns, such as combinations of freezing and unfreezing, also occur, which suggests that linear trends are not irrevocable, but may be reversed once more. The motivation for changes in binomial reversibility has been touched on, even though such motivation is difficult to ascertain. Changes may reflect sociocultural developments relating to the value attached to the referents of the binomial elements in society, or may even be brought about by explicit language planning (as in the gender word pairs). Less obviously, changes may combine sociolinguistic and inner-linguistic mechanisms, as when the use of a binomial sequence by an influential individual or group tilts the balance toward a greater preference (or dispreference) for one sequence, setting in motion an automatic self-energizing process of freezing or unfreezing, even though this has yet to be empirically shown.
One of the main results to emerge from the study is the evidence for unfreezing processes. Unfreezing has not been envisioned as a potential pathway in the classic literature on binomials. Even the currently most comprehensive account of lexicalization (Brinton & Traugott 2005) neglects processes of antilexicalization: as I argue, binomial unfreezing is an example of this process since it shows how items that are very lexical (i.e., stored as chunks) become less lexical over time. Unfreezing processes have also not been envisaged in the usage-based framework of language, which can offer a plausible theoretical explanation only for unfreezing that takes place with falling general frequencies. A particularly important finding in this respect is the evidence of binomials that undergo unfreezing trends even while becoming more frequent over time. In general, the association between rising frequencies and freezing trends on one hand and between falling frequencies and unfreezing trends on the other is only tentatively supported by the data and is certainly not absolute. Future research within the usage-based framework could therefore direct its attention more generally to reversals of entrenchment, both for multiword units and for constructions, taking into account the possibility that factors other than frequency of occurrence may play a role.
Acknowledgements
I wish to thank the anonymous reviewers and the editors for their careful readings of previous drafts of this article. Parts of this research have been supported by the DFG in project MO 1756/4-1. This support is gratefully acknowledged.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Parts of this research have been supported by the DFG in project MO 1756/4-1.
