Abstract
Combining the frameworks of multi-dimensional (MD) analysis and rhetorical structure theory (RST), this study examines the linguistic co-occurrence patterns in the discourse of corporate annual reports (CARs) and interprets their underlying functional dimensions. Our corpus consists of texts of corporate 10K reports from firms listed on the New York Stock Exchange (NYSE; N of texts = 642, totaling 14,674,047 tokens). Five functional dimensions are quantitatively extracted and qualitatively interpreted: (1) expression of direct persuasion; (2) expression of impersonal stance; (3) subjective versus objective positioning; (4) integrative expression of stance versus fragmented expression; and (5) expression of reliability. All these dimensions contribute to the communicative function of persuasion. The analysis of the rhetorical structures of excerpts with high concentrations of co-occurring linguistic features on each dimension further indicates the communicative strategy of persuasion. The proposed MD model is then applied to analyze the effect of firm performance on the linguistic variation in CAR discourse. We found that firm performance can significantly affect this variation: CAR discourse from firms with good performance is more reliable. The result reveals management’s use of a concealment strategy in impression management. This study has implications for MD analysis, business discourse analysis, language pedagogy and accounting research.
Introduction
Corporate annual reports are formal documents produced by companies in fulfillment of obligations to disclose their financial results, in which both voluntary and legally required information about a firm’s financial situation and prospects are disclosed (Fuoli, 2017). Shareholders are the main addressees of corporate annual reports (CARs). Their target audience groups also include public and private investors, financial analysts, government authorities, specialized media, brokers, creditors, employees, competitors and society at large (Ditlevsen, 2012; Fuoli, 2017). As mandatory filings for corporate information disclosure, CARs are critical in reducing information asymmetry and agency conflicts between managers and stakeholders of a firm, and thus have aroused attention from scholars in various fields.
CAR has long been considered the pulse of corporate realities with a main purpose of informing shareholders about the performance and health of the company (Bhatia, 2010). But linguistic literature shows that it is used strategically to create a positive image (Bhatia, 2008, 2010; Ditlevsen, 2012; Fuoli, 2017) and has changed from being primarily a statutory document to being a public relations document (Ditlevsen, 2012). It is considered to have two communicative purposes: to give a true and fair view of the state of the company’s affairs and to provide a positive image of the company (Ditlevsen, 2012). In other words, they seek both to inform and persuade. The accounting literature also acknowledges the promotional character of CARs, especially under the heading of impression management (Brennan and Merkl-Davies, 2013). Impression management assumes that managements strategically select, display and present narrative information in corporate documents in a manner that is intended to distort readers’ perceptions of corporate achievements (Leung et al., 2015) and influence their impression of firm performance and prospects (e.g. Courtis, 2004; Merkl-Davies et al., 2011). For example, Li (2008) found the annual report texts of firms with lower earnings are harder to read. The reason might be that managers make bad news costly by writing excessively long annual reports with unnecessarily big words and long sentences to reduce readers’ response to bad news (Bloomfield, 2008). Athanasakou and Hussainey (2014) found firms disclose more forward-looking performance information in narrative sections of CARs when raising debt or conveying bad news. Chakrabarty et al. (2018) found managers with higher risk incentives issue less readable annual reports. From these studies, we can see that many text-external factors can affect the language choices in writing CARs, and thus, there could be linguistic variation within this specific register. 
Among those factors, firm performance is a widely recognized one. Therefore, this study examines the effect of firm performance on the linguistic variation in CAR discourse.
CARs are typically divided into many different sections, such as information about the company’s business, financial statements and management’s discussion and analysis (MD&A). Different sections may play different roles in fulfilling the communicative purposes (Fuoli, 2017). The mainly technical and numerical sections are inclined to fulfill the communicative purpose of informing readers, while the more narrative and descriptive sections mainly aim at providing a positive image to persuade readers (Camiciottoli, 2010; Fuoli, 2017). The most important section in a CAR is the accounting information, that is, the financial statement, which is nothing more than numbers displayed in the form of tables, graphs and calculations (Bhatia, 2010). These numbers are audited by public accountants. The narrative sections offer a review of the company’s performance based on the numbers, but they are not certified by the public accountants. Understanding narrative information is important for understanding the financial data, corporate decisions and corporate behaviors (Li, 2010). In the United States, the section that provides general management commentary on the business, accompanying the numerical financial statements in a CAR, is known as the MD&A. According to the Securities and Exchange Commission’s (SEC) requirements, the MD&A should address five specific areas: operations, financial condition, liquidity, forward-looking information, and risk and uncertainty (Clarkson et al., 1999). As the MD&A is a channel for managers to express their perspectives on firms’ past performance, current conditions and future prospects (Feldman et al., 2008), it provides an opportunity to study the language choices in CARs made by managers from different firms. The MD&A has become an emerging focus of textual analysis, and it is the focus of this study as well.
The CAR is a specific genre with its own structural patterning and a high level of intertextuality and interdiscursivity. Ruiz-Garrido et al. (2005) found that CARs showed a textual structure that includes 16 possible moves, such as information about the company, financial review and shareholder information. Bhatia (2010) found that the genre of CAR consists of four different kinds of discourse: accounting discourse, discourse of economics, public relations discourse and legal discourse.
In addition to genre analysis, many studies have investigated some specific discourse features that serve persuasive function in CAR, such as hedging (McLaren-Hankin, 2008), personal pronouns (Camiciottoli, 2014), metaphors (Ho and Cheng, 2016) and multimodality (Camiciottoli, 2019). They aim to examine how ideologies are indexed by these discourse features or how these features interact with text-external factors. Hyland (1998) examined how metadiscourse in CARs facilitates CEOs to control and evaluate the information they provide to direct readers’ understanding and appraisal. Fuoli (2017) examined stance markers in CARs to investigate the role of stance in the discursive process of corporate identity construction. Most previous studies shared the goal of identifying specific linguistic features that can distinguish the register of CAR from others or that have special functions in this register. But ‘register differences are best described in terms of sets of co-occurring linguistic features that have a functional underpinning’ (Biber and Egbert, 2016). Register differences cannot be comprehensively described by one specific linguistic feature. Therefore, this study uses the corpus-based multi-dimensional (MD) analysis framework (Biber, 1988, 1995) to identify the co-occurrence patterns of linguistic features in CAR discourse. We then examine the functions of these patterns further by rhetorical structure analysis (Mann and Thompson, 1987, 1988) to deeply investigate how managers use them to achieve their communicative purposes and to reveal managers’ communicative strategies in CAR.
Analytic frameworks
Two analytic frameworks, MD analysis (Biber, 1985, 1986) and rhetorical structure analysis (Mann and Thompson, 1988), are employed. Both methods focus on the communicative functions of linguistic features, but they do not overlap; rather, they are complementary. MD analysis emphasizes the underlying functions of lexical–grammatical features, while rhetorical structure theory (RST) illuminates the functions of relations between propositions within a text. By employing these two complementary analytic frameworks to study a specific register, one can achieve a better understanding of its communicative functions in terms of its micro-level lexical–grammatical features and macro-level rhetorical structure features. The two frameworks are briefly outlined as follows.
MD analysis was originally proposed by Biber (1985, 1986) and fully developed by Biber (1988) to compare spoken and written registers. It has since been used extensively in describing language variation in a wide range of registers (e.g. Biber, 2006; Conrad, 2018; Kessapidu, 1997). Biber and Conrad (2009) defined a register as a language variety associated with both a particular situation of use and pervasive linguistic features that serve important functions within that situation of use. Functions are at the heart of studying linguistic variation in registers. MD analysis, therefore, as a tool for analyzing linguistic variation in registers, assumes that linguistic features co-occur in texts because they share communicative functions. It uses factor analysis to quantitatively identify which lexical and grammatical features frequently co-occur in a corpus and then qualitatively interprets the functional dimensions underlying those co-occurring sets of features. As linguistic variation is too complex to be analyzed along any single dimension, the description of linguistic variation in a given register must be multi-dimensional (Biber, 1988: 22).
MD analysis has been applied to analyze business registers, such as business letters (Kessapidu, 1997), advertising (Koteyko, 2015), workplace discourse (Friginal et al., 2013) and some more controlled sub-registers, like discourse of outsourced call centers (Friginal, 2008). Most of these studies not only identified the co-occurrence patterns of linguistic features, but also qualitatively interpreted the communicative strategies and business factors indexed by those linguistic patterns. However, the co-occurring linguistic patterns investigated by the previous MD studies only include lexical–grammatical features. Communicative strategies, though, are established at the macro-level as well (Kessapidu, 1997). According to functional grammar, all the units of a language are considered as organic configurations of functions (Halliday, 1994). Every part of a text has a function to play with respect to other parts in the text (Taboada and Mann, 2006). Therefore, the discourse units that come before and after the discourse units where co-occurring lexical–grammatical patterns occur may be as important as those patterns themselves in understanding why writers make the language choice of using those linguistic features. In other words, studying the rhetorical context of the co-occurring patterns of linguistic features can help better understand the role those patterns play in a given situation. Therefore, this study combines RST analysis with MD analysis to investigate the rhetorical relations between the discourse units with high concentrations of co-occurring linguistic features and their adjacent units. We interpret the communicative functions of those rhetorical relations to better understand the underlying functions of those co-occurring patterns.
RST, proposed by Mann and Thompson in the 1980s, is a functional theory of text structure. It describes the relations between text parts in functional terms (Mann and Thompson, 1988). RST assumes that discourse is not a pile of disordered clauses; it consists of discourse units with complex structural relations to each other. The minimal unit of discourse is called an elementary discourse unit (EDU); EDUs combine with each other to form larger, complex units of discourse (Stede et al., 2017). In other words, RST addresses text organization by means of relations that hold between parts of a text (Taboada and Mann, 2006). Those relations are called rhetorical structure relations (Mann and Thompson, 1987).
Mann and Thompson (1987, 1988) defined 24 RST relations, including mononuclear and multinuclear relations, based on pragmatic and semantic functions (Taboada and Mann, 2006). A mononuclear relation contains two types of units: the nucleus, which expresses the more important information in the structure, and the satellites, which contribute to the nucleus and are secondary. A multinuclear relation contains two or more units of equal importance, each of which is assigned the role of nucleus (Carlson and Marcu, 2001). RST postulates a hierarchical, connected structure of texts, in which every part of a text has a function to play with respect to other parts in the text (Taboada and Mann, 2006). Thus, it can be used to analyze the communicative functions of discourse units that have a high concentration of the co-occurring patterns of linguistic features identified by MD analysis.
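The distinction between mononuclear and multinuclear relations can be made concrete with a small data structure. The following is only an illustrative sketch: the class names, the helper function and the placeholder unit texts are assumptions introduced here, not part of RST itself or of any annotation tool.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class EDU:
    """Elementary discourse unit: the minimal unit of discourse."""
    index: int
    text: str


@dataclass
class Relation:
    """A rhetorical relation holding between discourse units.

    Mononuclear: one nucleus plus one or more satellites.
    Multinuclear: two or more nuclei of equal importance, no satellites.
    """
    name: str
    nuclei: List["Node"]
    satellites: List["Node"] = field(default_factory=list)

    def is_multinuclear(self) -> bool:
        return len(self.nuclei) >= 2 and not self.satellites


Node = Union[EDU, Relation]


def edus(node):
    """Return every EDU covered by a node, in text order."""
    if isinstance(node, EDU):
        return [node]
    found = [e for child in node.nuclei + node.satellites for e in edus(child)]
    return sorted(found, key=lambda e: e.index)


# Hypothetical excerpt: the unit texts are placeholders, not corpus data.
tree = Relation(
    name="elaboration",
    nuclei=[EDU(2, "nucleus text")],
    satellites=[Relation("sequence", nuclei=[EDU(3, "..."), EDU(4, "...")])],
)
print([e.index for e in edus(tree)])  # [2, 3, 4]
```

Because every node is either an EDU or a relation over other nodes, the structure is hierarchical and connected, which mirrors the RST assumption that each part of a text has a function with respect to the other parts.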
To our knowledge, MD and RST analyses have not been applied to the register of CAR. CARs are of great importance in the business world. Their function seems to be changing from ‘informing and reporting’ to increasingly ‘promoting’ (Bhatia, 2010), which blurs the functional boundaries of this register. Therefore, it is necessary to investigate the underlying functional dimensions of linguistic variation in CARs. Our main research questions are as follows:
(1) What are the salient patterns of linguistic variation in CARs? (2) What are the functional dimensions underlying these patterns? (3) Can firm performance significantly affect the linguistic variation patterns in CAR discourse?
Research methods
Corpus
Our corpus consists of texts of MD&A in corporate 10K reports from 2012 to 2016 of 135 US firms listed on the New York Stock Exchange (NYSE). We sampled firms listed on the NYSE because it is the largest stock exchange in the world by the market capitalization of its listed firms.
Our corpus comprises 642 texts totaling 14,674,047 words. We selected the sample firms by pseudo-random sampling. First, we eliminated financial firms (industry code 6000–6999) because the content of their CARs differs from that of companies in other industries (Li, 2006). To avoid the potential bias of having a sample skewed toward either large or small firms, or value or growth firms, we sequentially ranked all the NYSE-listed firms in the Center for Research in Security Prices (CRSP)–Compustat database by size and book-to-market (BTM) ratio in fiscal year 2012. Then, we sorted them into terciles by size and BTM, using the benchmark from the Kenneth French website. Fifteen firms were then randomly selected from each of the nine portfolios. We downloaded their 10K filings from the EDGAR website for the five fiscal years (2012–2016), resulting in (15 firms × 9 portfolios × 5 years =) 675 reports. We then extracted the MD&A section from those 10K reports.
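The sampling design described above (a 3 × 3 grid of size and BTM terciles, with 15 firms drawn from each cell) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the synthetic firm list are hypothetical, and the tercile breakpoints are computed from the sample itself, whereas the study used the benchmark from the Kenneth French website.

```python
import random
from collections import defaultdict


def tercile(rank, n):
    """Map a 0-based rank among n firms to tercile 0, 1 or 2."""
    return min(3 * rank // n, 2)


def sample_firms(firms, per_cell=15, seed=0):
    """firms: list of (firm_id, size, btm) tuples.

    Returns {(size_tercile, btm_tercile): [sampled firm ids]}.
    Assumption: breakpoints are in-sample terciles, ranked independently.
    """
    n = len(firms)
    by_size = sorted(firms, key=lambda f: f[1])
    by_btm = sorted(firms, key=lambda f: f[2])
    size_t = {f[0]: tercile(r, n) for r, f in enumerate(by_size)}
    btm_t = {f[0]: tercile(r, n) for r, f in enumerate(by_btm)}

    cells = defaultdict(list)
    for firm_id, _, _ in firms:
        cells[(size_t[firm_id], btm_t[firm_id])].append(firm_id)

    rng = random.Random(seed)  # fixed seed keeps the draw reproducible
    return {cell: rng.sample(ids, per_cell) for cell, ids in cells.items()}


# Synthetic firms (not real data), built so that all nine cells are populated.
firms = [(k, k, (k % 30) * 30 + k // 30) for k in range(900)]
portfolios = sample_firms(firms)
n_firms = sum(len(ids) for ids in portfolios.values())
print(len(portfolios), n_firms, n_firms * 5)  # 9 135 675
```

With nine portfolios of 15 firms and five fiscal years, the sketch reproduces the count of 675 reports given in the text.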
All tables, headings and paragraphs of fewer than ten words were deleted to improve the accuracy of the annotation of linguistic features. Our final corpus was reduced from 675 to 642 texts, as we deleted 33 texts that were either shorter than 2000 words or had too many linguistic features marked as outliers in factor analysis.
Methods
For MD analysis, we extracted co-occurring patterns of linguistic features from our corpus by factor analysis. We began with the 67 lexical–grammatical features identified by Biber (1988) and used the Multidimensional Analysis Tagger 1.3 (Nini, 2015) to annotate our corpus with these features. We deleted features with average frequencies of less than 0.003 per 100 tokens; consequently, 59 linguistic variables were retained for the final analysis. The z-scores of the 59 linguistic features in each text were then imported to SPSS 20.0 for factor analysis. A principal factor analysis (PFA) was conducted on the 59 items with non-orthogonal rotation (Promax). As PFA is a commonly used method for exploratory factor analysis and it accounts for shared variance, it is suitable for the present study. Loadings of features on each factor with an absolute value of less than 0.3 were deleted as insignificant. After factor analysis, we qualitatively interpreted the shared function of the co-occurring linguistic features on each factor and analyzed the underlying functional dimension of each factor with the help of RST analysis.
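The data-preparation steps around the factor analysis can be illustrated with a short sketch. It covers only the frequency filter, the standardization and the loading cut-off; the factor extraction and Promax rotation themselves were run in SPSS and are not reproduced here. The function names and toy values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which variant was used.

```python
from statistics import mean, pstdev


def retain_features(rates, min_mean=0.003):
    """Keep features whose mean frequency per 100 tokens is at least min_mean.

    rates: {feature_name: [frequency per 100 tokens in each text]}
    """
    return {f: v for f, v in rates.items() if mean(v) >= min_mean}


def z_scores(values):
    """Standardize one feature across texts (population SD is an assumption)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def salient(loadings, cutoff=0.3):
    """Keep loadings whose absolute value reaches the cut-off.

    loadings: {feature_name: loading on one factor}
    """
    return {f: l for f, l in loadings.items() if abs(l) >= cutoff}


# Toy values for illustration only; they are not taken from the corpus.
rates = {"feature_a": [0.001, 0.002, 0.003], "feature_b": [0.5, 0.7, 0.9]}
kept = retain_features(rates)
standardized = {f: z_scores(v) for f, v in kept.items()}
print(sorted(kept))  # ['feature_b']
print(sorted(salient({"x": 0.45, "y": -0.31, "z": 0.12})))  # ['x', 'y']
```

Note that the cut-off is applied to the absolute value, so a strongly negative loading such as -0.31 is retained alongside positive ones.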
For RST analysis, we identified some sections in CAR that frequently use the features on each factor as excerpts and annotated their rhetorical structure. To be specific, taking the analysis of the positive co-occurring linguistic features on Dimension 1 as an example, after MD analysis, first, we selected the CAR that has the highest score on Dimension 1. Second, we highlighted the positive features in this CAR text. Third, we extracted one section that has high concentration of these co-occurring features as our excerpt, and then we annotated the rhetorical structure of this excerpt. The rhetorical structures were annotated with the help of RSTTool 3.45 (O’Donnell, 2003), under the guidance of Carlson and Marcu (2001) and Stede et al. (2017). In the annotation, first we broke the excerpts into EDUs – clauses; second, we figured out which adjacent units are to be connected to each other and in what order; third, we identified rhetorical relations between adjacent units and, finally, we decided on the nucleus/satellite status of the linked units. The rhetorical relations we used are those defined by Mann and Thompson (1988). After annotation, we analyzed the communicative function of the rhetorical relation between the discourse unit with high concentration of positive features on this factor and its adjacent units to further interpret the function served by the co-occurring pattern of linguistic features.
For analyzing the association between firm performance and linguistic variation in CARs, we used reported earnings per share (EPS) as the measure of firm performance. EPS is a performance ratio that measures how effectively a firm’s managers are using its various resources to achieve profits (Nickels et al., 2016). We ranked all 642 CARs by EPS and divided them equally into three groups of 214 CARs each. CARs in group 1 (G1) had the highest EPS, while those in group 3 (G3) had the lowest EPS. A one-way multivariate analysis of variance (MANOVA), using Pillai’s trace, was conducted to compare the dimension scores of CARs in G1 with those in G3, so as to explore the effects of firm performance on the dimension scores of CARs. We then followed the MANOVA with separate analyses of variance (ANOVAs) on each of the five dependent variables to find out whether CARs in the two performance groups differ on each dimension. Finally, we qualitatively interpreted the possible causes of the observed linguistic variation.
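The grouping step can be sketched as below. The function name and the synthetic EPS values are hypothetical, and the handling of tied EPS values is an assumption (ties simply keep their sorted order), since the text does not specify it. The MANOVA and follow-up ANOVAs are not reproduced.

```python
def split_by_eps(cars):
    """Rank CARs by EPS (highest first) and split them into three equal groups.

    cars: list of (text_id, eps) tuples.
    Returns (g1, g2, g3), where g1 has the highest EPS and g3 the lowest.
    """
    ranked = sorted(cars, key=lambda c: c[1], reverse=True)
    size = len(ranked) // 3
    return ranked[:size], ranked[size:2 * size], ranked[2 * size:]


# Synthetic EPS values for illustration only.
cars = [(i, float(i)) for i in range(642)]
g1, g2, g3 = split_by_eps(cars)
print(len(g1), len(g2), len(g3))  # 214 214 214
```

Only G1 and G3 enter the comparison described above, so the middle group serves to separate clearly high-performing from clearly low-performing firms.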
An MD model of CAR discourse
For our factor model, the Kaiser–Meyer–Olkin (KMO) value (0.58) is acceptable, and the result of Bartlett’s test of sphericity (approx. chi-square = 13,945.453, df = 1711; p < 0.0001) is significant, indicating that a factor analysis is useful with our data.
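The reported degrees of freedom can be checked against the number of retained features. Bartlett's test of sphericity has p(p − 1)/2 degrees of freedom for p variables, that is, one for each distinct off-diagonal correlation. The sketch below also gives the standard form of the test statistic; the function names are hypothetical, and the log-determinant of the correlation matrix is not reported in the text, so the statistic itself is not recomputed.

```python
def bartlett_df(p):
    """Degrees of freedom: number of distinct off-diagonal correlations."""
    return p * (p - 1) // 2


def bartlett_chi_square(n, p, log_det_r):
    """Standard Bartlett statistic for n texts, p variables and ln|R|.

    log_det_r is the natural log of the determinant of the correlation matrix.
    """
    return -(n - 1 - (2 * p + 5) / 6) * log_det_r


print(bartlett_df(59))  # 1711
```

With the 59 retained features, the result matches the df = 1711 reported above.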
Cattell (1966) suggests using the point of inflection in the scree plot of possible factor models as the cut-off for retaining factors. Our scree plot (Figure 1) shows a sharp break between Factors 5 and 6, and factors to the left of the breaking point were retained.

The scree plot.
In the five-factor solution, each factor was represented by at least five salient loadings, except Factor 3, which was represented by four highly salient loadings; in the six-factor solution, two of the factors were represented by fewer than five salient loadings. In general, five salient loadings are required for a meaningful interpretation of the construct underlying a factor (Biber, 1988: 88). Therefore, a five-factor solution was adopted. Our five-factor solution accounts for 29.0% of total variance, which means it can explain 29.0% of the linguistic variation in the register of CARs. This is smaller than the proportion reported in Biber’s (2006) earlier work, but close to that of many other MD analyses, such as Friginal and Weigle’s (2014) four-factor solution, which accounted for 30.5% of the shared variance.
The five factors and the loadings of the features that constitute them are shown in Tables 1–5 and the underlying functional dimensions of the five factors are interpreted below. In the analysis of each factor, first, the generic function of the linguistic features on the factor is explained based on previous studies; second, the function of the linguistic features in CAR discourse is interpreted by analyzing excerpts from our corpus; and finally, the rhetorical structure of some section in a CAR that has highest score on this dimension is annotated and analyzed to support our interpretation of the function of those co-occurring linguistic patterns and to explore the writers’ communicative purposes and strategies within the situation.
Factor 1: expression of direct persuasion.
Factor 2: expression of impersonal stance.
Factor 3: subjective versus objective positioning.
Factor 4: integrative expression of stance versus fragmented expression.
Factor 5: expression of reliability.
Interpretation of Factor 1
Factor 1 is the most powerful one among the five factors, accounting for 9.54% of total variance (about 1/3 of the explanatory power of the model). It represents a basic functional dimension of the CAR register. Eleven features have weights larger than 0.3. The first nine features co-occur frequently and are in complementary distribution with the last two features.
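The "about 1/3" figure follows directly from the two percentages reported for the model, as this short check shows (the variable names are illustrative only):

```python
factor1_variance = 9.54   # % of total variance explained by Factor 1
model_variance = 29.0     # % of total variance explained by all five factors

share = factor1_variance / model_variance
print(f"{share:.3f}")  # 0.329
```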
Two different forms of that relative clauses are the highest loading features on Factor 1. That relative clauses provide a way to talk about nouns, either for identification or to provide additional information (Winter, 1982). Following that relative clauses are modal verbs. Biber (2006) found modal verbs are frequently used to mark stance in university registers. As epistemic modals, predictive modals and possibility modals have different degrees of certainty. As Biber and Finegan (1989) indicated, a predictive modal is a grammatical marker to show certainty, while a possibility modal marks doubt. In CARs, predictive modals often co-occur with that relative clauses and first-person pronouns to describe the positive outcomes that would be brought by the firm’s own activities, in which the predictive modals play the role of enhancing the certainty of the proposition (see example 1). On the other hand, possibility modals often co-occur with that relative clauses to describe the negative outcomes that may be brought by some external factor(s), in which possibility modals play the role of weakening the certainty of the proposition (see example 2). Thus, these co-occurring features are used to persuade readers to be confident in the firm. (In the examples below, positive-loading features are background highlighted and negative-loading features are underlined. For clauses, only the relative words are marked.) Example 1: Among the key components of consumers’ investment program are projects that will enhance customer value. Example 2: We follow industry news, trade issues, exchange rates, foreign demand, weather, crises and other world events that may affect our ingredient prices. Example 3: After evaluating all available positive and negative evidence, although realization is not assured, we determined that it is more likely than not that the results of future operations will generate sufficient taxable income to realize the deferred tax assets. 
Example 4: As a result of our annual and other periodic evaluations, we may determine that the intangible asset values need to be written down to their fair values, which could result in material charges that could be adverse to our operating results and financial position.

Rhetorical structure 1.
The above analysis suggests that those co-occurrence patterns are used to directly express the firms’ stance, aiming to persuade readers to evaluate their performance from a positive perspective.
This factor has two negative features, total prepositional phrases and past tense, both with small weights. Prepositional phrases are an important device for packing a high amount of information into discourse (Biber, 1988: 237). Past tense shows narrative orientation (Biber, 2006). In CARs, their frequent co-occurrence indicates an informational report of past events. As shown in the following examples, these two features usually co-occur with numbers, further indicating that the shared function of these two features in CARs is neutrally reporting historical financial information. Example 5: South America Example 6: Net sales
Interpretation of Factor 2
Factor 2 is constituted only of features with positive weights. Be as main verb and predicative adjectives have high positive weights. Predicative adjectives indicate evaluative propositions that project the writer’s commitment to the propositional content (Kessapidu, 1997). Agentless passives allow objects to be the grammatical subjects of sentences and thus focus readers on the objects rather than the people who work on them (Conrad, 2018), and thus impersonalize the proposition. In CARs, these two features often co-occur with be as main verb, usually in present tense, to express the firm’s evaluation on its performance and condition in an impersonal way. In this structure, predicative adjectives state the firm’s evaluation and agentless passives impersonalize its evaluation (see examples 7 and 8). Example 7: The credit risk under these interest rate and foreign currency agreements is not considered to be significant. Example 8: The Company’s integration of the Alaska operations is progressing well and is expected to be substantially complete by the end of the third quarter 2016. Example 9: The Company records accruals for legal matters when the information available indicates that it is probable that a liability has been incurred and the amount of the loss can be reasonably estimated. Example 10: A valuation allowance would be established if, based on the weight of available evidence, management believes that it is more likely than not that some portion or all of a recorded deferred tax asset would not be realized in future periods.

Rhetorical structure 2.
Overall, in CARs, when agentless passives co-occur with the other positive features on this factor, they function as a device to distance the firm from negative obligations and to show the firm’s conformity with generic norms. Meanwhile, the other features help to express the firm’s private evaluation to influence readers’ understanding and impression. Therefore, the label ‘expression of impersonal stance’ is suggested here.
Interpretation of Factor 3
As only one linguistic feature has a positive weight on this factor, we discuss the negative-loading features first. Two features have large negative weights: pronoun it and total other nouns. Pronoun it is used as a dummy pronoun on Factor 2, but as a referential pronoun here on Factor 3. Compared with first-person pronouns, pronoun it, as a third-person pronoun, marks relatively inexact reference to persons outside of the immediate interaction (Biber, 1988: 225). The authoring firms use ‘it’ to position the firm as a third entity, so as to give readers the impression that the author describes the firm objectively. The use of pronoun it is thus a salient mark of objectification. As nouns are the primary bearers of referential meaning (Biber, 1988: 104), a high frequency of nouns can indicate an informational focus. When pronoun it is used as a referential pronoun, it often co-occurs with many nouns and long words to provide additional information about the firm’s performance or express the firm’s assessment of its performance (see example 11). Rather than demoting or eliding themselves (as passives do), here the authoring firms use pronoun it to involve themselves in the discourse, though as a third entity, aiming to adopt a more objective tone in claims about themselves. Example 11: Ameren Missouri’s plan outlined its ongoing transition to a more fuel-diverse generation portfolio over the next 20 years, which it believes maximizes the use of its current generation fleet for the benefit of its customers while leveraging energy efficiency, environmental controls, renewable energy resources and lower cost generation to meet future needs.
Figure 4 presents the rhetorical structure of an excerpt from the ‘Overview’ section, which describes capital spending. The nucleus (Unit 2) describes the firm’s investments in utilities, and the satellite (Span 3–5) provides additional information about the future investment trends. In Unit 3, pronoun it co-occurs with nouns to express the firm’s own expectation for its investments, reflecting a more objective tone. Its rhetorical relation to the nucleus (Unit 2) is elaboration, which is a subject matter relation indicating that the satellite provides details or more information on the propositions in the nucleus (Mann and Thompson, 1988), without any attempts to alter readers’ inclinations. Therefore, the firm here is trying to objectively provide additional information to readers.

Rhetorical structure 3.
Only one feature has a positive weight on Factor 3: first-person pronouns. First-person pronouns are markers of ego-involvement in texts (Biber, 1988: 225). This subjective device can help construct an engaging, intelligent and credible persona (Hyland, 2005). The strategic use of first-person pronouns in CARs helps firms to claim their positive persona by emphasizing their efforts, describing their performance and expressing their views, with an underlying purpose of constructing a proactive image (see example 12). Example 12: Based on our available credit facilities, recent issuance of senior unsecured notes and our history of positive operational cash flows, we believe that we have adequate liquidity to meet our needs for fiscal 2013 and beyond.
Interpretation of Factor 4
Six features have positive weights on this factor: average word length, nominalizations, private verbs, subordinator that deletion, present participle WHIZ deletion relatives and phrasal coordination. Chafe (1982) proposes a general communicative function of ‘integration’, which means to create maximum effect with the fewest words (Tannen, 1982). Many of the positive-loading features share the function of integration. Nominalizations and phrasal coordination are devices for achieving integration (Chafe, 1982); forms of present participial WHIZ deletion relatives are more compact and integrated than full relative clauses (Janda, 1985) and longer words convey more specific, specialized meanings than shorter ones (Biber, 1988: 238). Therefore, a frequent use of these features marks informational integration in a text.
Private verbs, another positive feature with large weight, refer to intellectual states and acts (Biber, 1988: 242). In CARs, private verbs often co-occur with subordinator that deletion to express the stance of the authoring firm. As subordinator that deletion is a form of syntactic reduction (Biber, 1988: 244), it may increase the difficulty of understanding its proposition. As a result, to understand the text, readers need to think from the writers’ perspective and understand the writers’ intention. Therefore, when it co-occurs with private verbs to express firms’ stance, it can help to enhance the effect of the stance on readers. Their co-occurrence with other positive-loading features that share an integrative function shows that stance is expressed in a more compact and efficient way (see examples 13 and 14). Example 13: We believe the critical accounting policies, including amounts involving significant estimates, uncertainties and susceptibility to change, include the following: revenue and expense recognition for racing events, accounting for NASCAR broadcasting revenue and purse and sanction fees, revenue recognition for marketing agreements. Example 14: The 2013 annual evaluation found the carrying values for NHMS and KyS exceeded estimated fair value reflecting lowered estimated future cash flows because the economic recovery has been slower and weaker than previous forecasts.

Rhetorical structure 4.
Past tense and sentence relatives have negative weights on this factor. Past tense is usually taken as the primary surface marker of narrative (Biber, 1988: 223). In CARs, past tense is frequently used to describe past operating events. Sentence relatives are used either for attitudinal comments by the speaker (Biber, 1988: 106) or for providing extra information (Xiao, 2009). In CARs, they frequently co-occur with past tense in the ‘Operating Results’ section, where they objectively provide more information about the effects of past events on financial results in a less integrative and more fragmented way. Sentence relatives often appear as insertions between two propositions, which makes the texts seem more fragmented (see examples 15 and 16). They are used to disclose both positive and negative information, indicating that authoring firms do not use them as a device to express stance, but to disclose previous performance neutrally.
Example 15: Metals traffic volume decreased 52,249 carloads, or 28.4%
Example 16: This increase was primarily due to a 6% increase in the volume of milliliters sold

Rhetorical structure 5.
Overall, since the dimension underlying this factor seems to distinguish stance expressions that are informationally integrative from fragmented expressions, the label ‘Integrative Expression of Stance vs Fragmented Expression’ is suggested here.
Interpretation of Factor 5
The features with positive weights on this factor are by-passives, split auxiliaries, downtoners and place adverbials. Downtoners reduce the force of the verb (Quirk et al., 1985); they are used to indicate the reliability of propositions (Chafe, 1985). In by-passives, the agent is demoted, which suggests that the patient is more closely related to the discourse theme than the agent (Biber, 1988: 228). Split auxiliaries occur when adverbs are placed between auxiliaries and their main verb (Biber, 1988: 111), for example, ‘They are seriously injured by …’. In CARs, the adverbs placed between auxiliaries and main verbs are often downtoners, such as ‘partially’; by-passives frequently co-occur with them to describe how one aspect of the firm’s financial performance is affected by another. Downtoners modulate or reveal the specific extent of the impact, which tends to make the description appear more reliable (see examples 17 and 18). Most place adverbials that co-occur with these positive features are text-internal references that point to supporting facts, which may enhance the perceived reliability of the information described (see examples 17 and 18). Therefore, the shared function of the positive-loading features on Factor 5 is to express reliability.
Example 17: This improved performance was partially offset by lower volume as described above.
Example 18: As reflected in the table below, the increase in 2013 resulted from higher volumes, partially offset by lower average revenue per unit as lower market-based export coal rates, the effects of changes in the mix of business and slightly lower fuel surcharges more than offset rate increases.

Rhetorical structure 6.
Although the mononuclear relations in this paragraph are subject matter relations that have no direct pragmatic function, the co-occurrence of the positive-loading features suggests that the underlying function of this factor is to make a reliable impression on readers. The intention of persuasion (i.e. the intention of altering readers’ inclinations) underlying this co-occurrence pattern is expressed less directly than in the other factors.
Only one feature has a negative weight: infinitives. As its weight is low (–0.33) and it has many different functions, it is difficult to identify its function here without any co-occurring features. Infinitives are a form of complementation (Biber, 1988: 232). In CARs, they are frequently used as verb and noun complements, and the head verb or noun frequently encodes the authoring firm’s expectation or intention (see example 19).
Example 19: We expect
Firm performance and linguistic variation in CAR discourse
This study examines how firms’ financial performance affects the variation in the use of linguistic features in CARs. The MANOVA results indicate a significant effect of firm performance on the dimension scores of CARs (F(5, 419) = 4.00, p < 0.01, η² = 0.046). Figure 8 plots the mean dimension scores of CARs from the two firm categories: firms with good performance (high EPS) and firms with bad performance (low EPS).
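As a rough consistency check on the effect size reported above, the sketch below recovers a partial eta squared from an F statistic and its degrees of freedom, using the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). It assumes that the reported η² is a partial eta squared, which the text does not state. The function name `partial_eta_squared` is illustrative and not taken from the study.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F statistic and its degrees of freedom.

    Assumption: the effect size of interest is *partial* eta squared, so that
    eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    if df_effect <= 0 or df_error <= 0:
        raise ValueError("degrees of freedom must be positive")
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Values taken from the reported MANOVA: F(5, 419) = 4.00
manova_eta = partial_eta_squared(4.00, 5, 419)
print(f"partial eta squared = {manova_eta:.3f}")
```

Under that assumption, the value obtained from F(5, 419) = 4.00 rounds to the reported 0.046.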

Comparison of factor scores for five dimensions.
Figure 8 shows that, regardless of firm performance, the degree of direct persuasion in CAR discourse is low (Dimension 1). In terms of differences, the CAR discourse from firms with bad performance is more subjective (Dimension 3), while that from firms with good performance is more reliable (Dimension 5). Dimension 5 is the only dimension in the model that differentiates between CARs from firms with good performance and those with bad performance at a statistically significant level (ANOVA: F = 16.70, p < 0.01, η² = 0.038). A comparison of examples 20 and 21 illustrates the difference highlighted by Dimension 5.
Example 20: Revenues increased $205 million in 2013, but decreased $132 million in 2012. As reflected in the table below, the increase in 2013 resulted from higher volumes, partially offset by lower average revenue per unit as lower market-based export coal rates, the effects of changes in the mix of business, and slightly lower fuel surcharges more than offset rate increases. The decrease in 2012 was due
Example 21: Gaming revenues decreased by $14,360,000, or 6.6%,
The difference highlighted by Dimension 5 could be related to managers’ impression management strategies. Previous studies on managerial disclosure found that managers selectively disclose information out of self-interest and manipulate the content and presentation of information in corporate documents in order to distort readers’ perception of corporate performance and prospects (Leung et al., 2015). Obfuscating bad news and rhetorical manipulation are two concealment strategies frequently used in impression management (Merkl-Davies & Brennan, 2007). In this study, the CAR discourse from firms with bad performance uses fewer place adverbials to refer to supportive information and fewer downtoners, by-passives and split auxiliaries to reveal the specific extent of the effect of corporate behaviors on firm performance, indicating that managers from this firm category are more inclined to use linguistic devices to obfuscate bad news and conceal negative performance. This suggests that firms with bad performance use concealment strategies more frequently in their CARs, which is consistent with Li’s (2008) finding that managers may be opportunistically structuring annual reports to hide adverse information from readers.
Conclusion
The analysis of CAR discourse based on the MD analysis framework and RST reveals many interesting characteristics of this discourse. The five factors extracted by factor analysis illustrate the main patterns of lexical–grammatical features prevailing across this register. The five functional dimensions underlying the co-occurrence patterns of linguistic features all contribute to the complex communicative function of persuasion: Dimension 1, ‘expression of direct persuasion’, marks the degree to which persuasion is expressed directly; Dimension 2, ‘expression of impersonal stance’, marks the degree to which firms’ stance is expressed impersonally to influence readers’ understanding and impression; Dimension 3, ‘subjective versus objective positioning’, distinguishes subjective expressions that mark authoring firms’ ego-involvement in constructing a positive persona from objective expressions; Dimension 4, ‘integrative expression of stance versus fragmented expression’, marks the degree to which firms integrate additional information into their stance expression; Dimension 5, ‘expression of reliability’, marks the degree to which the reliability of propositions is expressed to give readers an impression of the firm as a reliable one. The persuasive function of CAR discourse discovered by this study conforms with Bhatia’s (2008) claim: ‘firms exploit lexical–grammatical linguistic resources to “bend” the norms and conventions of “reporting” to promote a positive image of the company, even in adverse and challenging economic circumstances’.
From an applied perspective, the MD analysis of CARs from firms with different levels of performance shows that firm performance is a significant text-external factor that can affect the linguistic variation in CARs (F = 4.00, p < 0.01, η² = 0.046). Firms with good performance adopt a more reliable tone in their CAR discourse than firms with bad performance (F = 16.70, p < 0.01, η² = 0.038). This variation can be attributed to possible impression management strategies used by firms to influence readers’ perception of corporate performance.
The present study has implications for MD analysis, business discourse analysis, language pedagogy and accounting research. First, this study indicates that MD analysis is remarkably useful in detecting generic features by analyzing distinctive linguistic choices in a specific register, especially when combined with RST analysis. Second, our comprehensive description of the persuasive function of CARs sheds light on studies of the communicative functions of sub-registers of business discourse. Third, distinguishing the linguistic variations in business English from those in general English and identifying linguistic variations in sub-registers of business discourse are useful for promoting register awareness and improving language pedagogy. Fourth, this study provides a new angle for conducting textual analysis in accounting research.
Most of the linguistic features in the present study are borrowed from Biber’s (1988) study on spoken and written registers; as a result, the explanatory power of our model is not very strong. In future studies, we will choose linguistic features that are frequently used in business discourse to enhance the explanatory power. In addition to firm performance, there are many other text-external factors that might influence linguistic choices in CAR discourse. Future studies will compare linguistic features in CARs from firms in different countries and industries, and of different sizes and statuses, to investigate further text-external factors that may affect firms’ stylistic choices in CAR discourse.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received the following financial support for the research, authorship and/or publication of this article: This work was supported by the Social Science Foundation of Ministry of Education of China (grant number: 20YJC740002), the National Social Science Fund of China (grant number: 16BYY178), and the Young Academic Talent Programme and New Faculty Research Programme of Beijing International Studies University.
