Abstract
Aims and Objectives/Purpose/Research Questions:
The purpose of the study is to determine which factors condition the phenomenon of preposition drop (P-drop) in locative, directional and temporal phrases. Specifically, we investigate what kind of phrases allow P-drop in Russian spoken in Highland Daghestan and aim to understand the rationale for this phenomenon.
Design/Methodology/Approach:
We conduct a quantitative analysis of data extracted from the Corpus of Russian spoken in Daghestan, which includes interviews with 53 native speakers of 15 Daghestanian and Turkic languages, amounting to 228 thousand tokens.
Data and Analysis:
Data from 47 (29 male; 18 female) consultants speaking Russian as a second language (L2) who produced a sufficient number of prepositional phrases (PPs) were included in the analysis. 50 PPs were collected from each speaker, resulting in a data set of 2350 PPs. Each PP was annotated for P-drop and several sociolinguistic and linguistic parameters. We fitted a logistic mixed-effects regression model to determine which parameters are significant predictors of P-drop.
Findings/Conclusions:
We show that the probability of P-drop depends on preposition type, phonetic context and the speaker’s fluency in Russian. We propose that the prominence of P-drop in the speech of Daghestanian highlanders results from an interplay of two factors: a typological tendency for certain spatial and temporal locations to be formally unmarked, and incomplete acquisition of the Russian prepositional system.
Originality:
This is the first detailed quantitative study of P-drop based on an inferential statistical analysis of data from a large number of L2 Russian speakers from Daghestan.
Significance/Implications:
The results show that apparently contact-induced phenomena such as P-drop may be explained by both typological tendencies and incomplete acquisition of the L2. This paper is thus important both for the typological study of this phenomenon and for L2 acquisition research.
Introduction
Preposition drop (P-drop) is a cross-linguistic phenomenon. It is attested in standard languages, for example, Modern Greek (Terzi, 2010), but has predominantly been discussed with respect to non-standard varieties. These include dialects (Bailey, 2018 and Myler, 2013 for British English dialects (1); Cattaneo, 2009 for Northern Italian dialects (2) among others) and contact-influenced varieties, for example, multiethnolects such as the Kiezdeutsch variety of German (Wiese, 2009 among others), see (3).
(1) Northwest British English (Myler, 2013, p. 189) John came [to] the pub with me.
(2) Bellinzonese Italian (Cattaneo, 2009, p. 287)
te ve [a] ginasctica
‘You go to gymnastics.’
(3) Berlin Kiezdeutsch German (Wiese, 2009, p. 792)
morgen ich geh [zum] arbeitsamt
‘Tomorrow, I will go to the job center.’
A special case of contact-influenced varieties are creole languages, where P-drop has also been registered as a prominent feature (Holm, 2004, p. 232), see (4) and (5).
(4) Haitian Creole French (DeGraff, 2007, p. 122)
timoun yo al Mache Pòspyewo
‘The children have gone to the Post-Pierrot Market.’
(5) Sierra Leonean Creole English (Yillah & Corcoran, 2007, p. 194)
a. I de [na] tɔng
‘(S)he is in town.’
b. I go [na] tɔng
‘(S)he went to town.’
As can be seen from (1)–(5), P-drop typically occurs when a PP denotes the goal of motion but is not limited to this context. To date, an extreme case seems to be reported for non-standard varieties of Russian: in Russian spoken by native speakers of indigenous minority languages, P-drop affects such contexts as spatial (6) and temporal (7) adverbial phrases, comitative phrases (8), etc.
(6) Erzya Russian (Shagal, 2016, p. 370)
[v] Saranske živët
‘(He/she) lives in Saransk.’
(7) Nanai Russian (Stoynova, 2019, p. 21, ex. 76)
my s Amura priexali sjuda [v] sem’desjat vtorom godu
‘We came here from the Amur region in the year of 1972.’
(8) Archi Russian (Daniel et al., 2010, p. 75)
ja vot podružilas’, podružilas’ [s] avarcami
‘I made friends with the Avars.’
Contact varieties of Russian therefore provide rich and intriguing material for studying the phenomenon of P-drop. This paper focuses on Russian used as a lingua franca in Daghestan, a region of the Russian Federation characterized by remarkable language density. The empirical basis of our investigation is the Corpus of Russian spoken in Daghestan (DagRus; Dobrushina et al., 2018), which provides ample data from Daghestanian highlanders whose native language (L1) is one of the local languages and who speak Russian as a second language (L2).
The paper addresses the following issues:
(a) What factors condition and constrain the phenomenon of P-drop? What kind of prepositional phrases (PPs) allow P-drop and to what extent?
(b) What can be a possible rationale for P-drop? Can any of the existing analyses of P-drop capture the pattern observed in the Russian speech of Daghestanian highlanders?
The remainder of the paper is organized as follows: the second section provides an overview of the existing analyses of P-drop; the third section describes the data, methods of data collection and annotation; the fourth section presents a statistical analysis of the data set, which reveals predictors significant for P-drop; the fifth section interprets the results of the statistical analysis and discusses reasons underlying the observed P-drop pattern; and the sixth section concludes and discusses P-drop from a cross-linguistic perspective.
Background
This section reviews previous treatments of the phenomenon of P-drop, grouped by the type of explanation they propose. This guides our annotation of PPs in the third section. We relate our own analysis to the ones discussed here in the “Relevance to previous research” subsection.
Phonetic reasons
P-drop in contact varieties of Russian has been reported to mostly affect simple prepositions, especially v ‘in(to)’, which is sometimes accounted for by phonetic properties or phonotactic constraints of the speakers’ L1s.
For instance, the Nanai, Ulch and Enets languages (spoken in the Russian Far East and Northern Siberia) are less tolerant of word-initial consonant clusters than Russian, and a quantitative study of Russian spoken by the Nanai, Ulch and Enets people (Khomchenkova et al., 2017; Stoynova, 2019) indeed showed that P-drop may be driven by cluster avoidance. Specifically, in these varieties the drop of v ‘in(to)’ is less likely before vowel-initial complements than before consonant-initial ones, with palatalized consonant-initial complements being an intermediate case. This proposal is supported by two additional facts: (a) cluster simplifications are also occasionally observed at the word level (kusno instead of vkusno ‘tasty’); and (b) P-drop depends on the extent of phonetic interference from L1 exhibited by individual speakers (Stoynova, 2019, p. 21).
Phonetic factors are also mentioned in Daniel et al. (2010) who studied the Russian speech of Archi, Avar and Lak speakers from three Daghestanian villages. They observe omission of only three prepositions, namely v ‘in(to)’, s ‘with; off, from’ and na ‘on(to)’, noting that ot ‘from’, u ‘at’, iz ‘from, of’, and dlja ‘for’ are never dropped and that the preposition v ‘in(to)’ is omitted most frequently. The latter finding is hypothesized to be partially a phonetic artifact of their annotation: v ‘in(to)’ is “sometimes realized as bilabial rather than labiodental and is hardly audible, so that, in many cases, it is not easy to decide whether it is a dropped [v] or its weak to zero realization” (Daniel et al., 2010, p. 75).
Morphosyntactic interference with other languages
P-drop has also been accounted for by appealing to morphosyntactic interference on the part of the minority language, specifically, in papers devoted to contact-influenced varieties of Russian.
Daniel and Dobrushina (2009, 2013), Daniel et al. (2010) propose that this process is conditioned by the morphosyntax of Daghestanian languages, characterized by rich nominal declension paradigms and the employment of postpositions rather than prepositions. Thus, the phrases corresponding to the Russian preposition + case-marked noun complex have the form of a case-marked noun or, less frequently, a case-marked noun + postposition in these languages, as illustrated in (9) and (10) for Mehweb Dargwa.
(9) a. Mehweb Dargwa (adapted from Chechuro, 2019, p. 64)
ustuj-če-b
‘on the table’
b. Russian
na stole
(10) a. Mehweb Dargwa (adapted from Lander, 2019, p. 321)
heč’ dubur-li-če aqu-r
‘over that mountain’
b. Russian
nad toj goroj
‘over that mountain’
In addition, certain spatial locations (especially those introduced by place names) may appear in an unmarked essive form in Daghestanian languages (see, e.g., Daniel & Ganenkov, 2009). Based on the pattern of P-drop exhibited by Archi, Avar and Lak speakers, these authors conclude that “the preposition seems to drop only or primarily if the first language expresses the main meaning of the Russian preposition by morphological means” (Daniel et al., 2010, p. 77).
A range of papers reporting P-drop in regional varieties of Russian attribute it to morphosyntactic interference with the local Finno-Ugric and Turkic languages: Chuvash in Bajda (2018); Komi-Permyak and Tatar in Boronnikova (2014); Karelian, Vepsian, Mordvin, Tatar and Chuvash in Myznikova (2014); and Erzya Mordvin in Shagal (2016). All these minority languages are head-final and express spatial and temporal meanings predominantly by case suffixes, which is what prompts their speakers to drop prepositions when speaking Russian, according to these authors.
Khomchenkova et al. (2017) and Stoynova (2019), investigating contact-influenced varieties of Russian spoken in the Russian Far East and Northern Siberia, also consider the morphosyntactic interference factor. While they propose that the drop of the preposition na ‘on(to)’ can only be accounted for by morphosyntactic influence of L1, most instances of P-drop involve monoconsonantal prepositions and are best accounted for by phonetic factors (see the “Phonetic reasons” subsection).
Markedness principle
One of the features of P-drop often cited in the literature is its tendency to occur in semantically unmarked contexts.
On the one hand, P-drop is discussed as being related to the semantics of the prepositions corresponding to the English ‘to’ and/or ‘at’, which are cross-linguistically dropped most frequently. Gehrke and Lekakou (2013) and Bailey (2018), both referring to Zwarts (2008, 2010), notice that exactly these prepositions have the most neutral and the most basic spatial semantics. A similar point is found in Biggs (2015, p. 223): she quotes Caponigro and Pearl (2008) who assume that the prepositions to and at can be omitted while the preposition from cannot because of the difference in their degree of markedness. This is reminiscent of the so-called goal/source asymmetry—a cognitive bias that has been proposed to underlie various linguistic phenomena. For instance, goal path phrases are linguistically encoded more often than source path phrases (Georgakopoulos, 2018; Lakusta & Landau, 2005; Lakusta et al., 2007). More importantly, an adposition-like element marking a goal may be omitted while the one marking a source typically may not (see Ihara & Fujita, 2000 for Japanese). On the other hand, Gehrke and Lekakou (2013) argue that restrictions on P-drop are explained by the semantics of the complement noun. According to their hypothesis, P-drop may only occur when the corresponding noun denotes a stereotypical location, for example, house and university. Cattaneo (2009, p. 288), in turn, assumes that the possibility of P-drop is conditioned by the “familiarity” of particular locations to the speaker.
Similar argumentation is found in Comrie’s (1986) discussion of locative constructions in Eastern Armenian. His key theoretical suggestion is that the formal markedness of a construction correlates with “the degree of markedness of the locational situation in the world being described” (Comrie, 1986, p. 87). Specifically, a citation form of a noun can only be used in simple statements such as “something is located somewhere”. If the semantics of the predicate is different from “be located at”, a case-marker is employed, sometimes accompanied by a postposition. Finally, a case and a postposition are both needed when one wants to specify the relation between the locatum and location (in, on, under, etc.).
Variation in the extent of formal marking in locative/directional constructions has also been described by Haspelmath (2019) in the context of “differential place marking”, although he prefers to explain this variation in terms of frequency, expectedness and efficient coding (Haspelmath, 2019, p. 328).
Exceptional syntactic structure
P-drop in Indo-European languages has received much attention in the generative literature. All studies we are aware of propose to account for the phenomenon by positing exceptional syntactic structure. As Bailey (2018, p. 56) points out, two major approaches have been taken to analyze P-drop constructions: assuming no PP projection; or positing a null-headed PP. The former approach does not involve a PP at any level of representation and posits pseudo-incorporation of the bare noun denoting location into a verb (Gehrke & Lekakou, 2013 for Modern Greek; and Hall, 2018 for Multicultural London English). Analyses that do assume a PP projection in the contexts of P-drop differ with respect to how the null preposition is “licensed”: either the null P incorporates into the verb (Ioannidou & Den Dikken, 2009 for Modern Greek; Bailey, 2018 and Myler, 2013 for British English dialects) or the noun denoting location undergoes movement into the PP projection (Longobardi, 2001 for Veneto dialects; Collins, 2007 for the English noun home; Cattaneo, 2009 for Bellinzonese Italian; and Terzi, 2010 for Modern Greek). Biggs (2015) stands out, as she proposes the phonetically null element alternating with at/to in Liverpool English to belong to the category κ, rather than P. This κ is a semantically abstract head, having only a basic allative or stative meaning, depending on the context, whose function is to license inherent case on the noun phrase (NP).
While some of these formal analyses (Biggs, 2015; Bailey, 2018; Cattaneo, 2009; Gehrke & Lekakou, 2013) attempt to capture the intuition that P-drop happens in unmarked contexts (cf. the “Markedness principle” subsection), others propose a special syntactic structure as the sole explanation.
Data
We investigate the Russian speech of consultants from highland villages of Daghestan where Russian is used as a lingua franca. Daghestan is a republic within the Russian Federation located in the Northern Caucasus. Around 50 languages are spoken in the relatively small area of mountainous Daghestan (about 50,000 km²). Most of them belong to the Nakh-Daghestanian (East Caucasian) language family, but speakers of three Turkic languages (Azerbaijani, Kumyk, and Nogai) and one Iranian language (Tat) also live there.
Russian began to be taught at local schools that were established by the Soviet government in the 1930s. Initially, it was taught by the locals who had a decent command of Russian and then, in the 1950s, by Russian teachers who were sent by the government to teach in these villages (see Dobrushina, 2013, p. 382). Since then, most Daghestanians have acquired Russian through schooling. They are also exposed to Russian when watching television and traveling to towns. Currently, Russian is widely and rather fluently spoken as L2 by Daghestanian highlanders (for details on the status of Russian in Daghestan see Dobrushina & Kultepina, 2020).
The interaction between ethnic Daghestanian languages and Russian has been studied in a number of papers, including Daniel & Dobrushina (2009, 2013) and Daniel et al. (2010). These studies present an overview of specific linguistic features that are observed in the speech of Daghestanian highlanders. P-drop—the subject of this paper—is also discussed there in qualitative terms (see the “Morphosyntactic interference with other languages” subsection). We now turn to a detailed quantitative study of this phenomenon across a large number of speakers of different L1s.
Sampling
For our research we use data from Dobrushina et al.’s (2018) Corpus of Russian spoken in Daghestan (DagRus). The current version of the corpus comprises 50 sociolinguistic interviews with 53 consultants who are L1 speakers of 15 Daghestanian and Turkic languages. Forty-six interviews were recorded in 25 villages and four in the city of Makhachkala, the capital of the Republic of Daghestan (see Figure 1). The total number of tokens produced by the consultants is about 228,000.

Figure 1. Speakers’ native languages and places where the interviews were recorded.
In order to study P-drop we collected a data set of PPs registered in the speech of all interviewees. Since the interviews are of varying length (10 to 95 minutes), we collected 50 PPs from the middle of each interview; six speakers who produced fewer than 50 PPs were excluded from the sample. The decision to avoid the beginning and the end of the interviews was guided by our aim to capture the most natural speech: bearing in mind the observer’s paradox (Labov, 1972), we expected that at the beginning the speakers would try their best to accommodate to the interviewers, who speak Standard Russian; we also supposed that they would be too tired towards the end, possibly producing more non-standard features than they would when talking to their peers. On average, it took a speaker around 10 minutes to produce 50 PPs. Thus, in our sample, 50 consecutive PPs collected from a 60-minute interview roughly came from the 25:00–35:00 fragment. In most cases, we went back to the original recordings to double-check whether a preposition was preserved or dropped. Whenever our perception clearly diverged from the corpus transcription, we went with the former, noting this discrepancy in our data set.
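The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `middle_window`, the choice to center on the PP index rather than on recording time, and the toy data are ours and are not taken from the study, where the selection may well have been done by hand.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def middle_window(pps: Sequence[T], size: int = 50) -> Optional[Sequence[T]]:
    """Return `size` consecutive PPs from the middle of an interview.

    Assumption: `pps` lists one speaker's PPs in order of occurrence, and
    "the middle" is approximated by centering on the PP index.
    Returns None when the speaker produced fewer than `size` PPs,
    which corresponds to exclusion from the sample.
    """
    if len(pps) < size:
        return None
    start = (len(pps) - size) // 2
    return pps[start:start + size]


# Toy example: a speaker with 130 PPs, identified here by running number.
sample = middle_window(list(range(130)))

# A speaker with only 49 PPs would be excluded.
excluded = middle_window(list(range(49)))
```

Centering on the index keeps the sketch independent of time stamps; a time-based window would need the alignment data of the recordings.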
Annotation
A bare NP was analyzed as involving P-drop whenever semantically warranted, based on comparison with L1 Russian. For example, the phrase postupit’ kursy ‘to enroll in a training program’ in DagRus was analyzed as involving omission of the preposition na ‘on(to)’ that one expects in L1 Russian (postupit’ na kursy ‘to enroll in a training program’).
Each occurrence of a PP (with or without P-drop) was annotated with a number of parameters. Sociolinguistic parameters included the speaker’s sex, year of birth, L1 and education level. Apart from that, observations from previous studies of P-drop (reviewed in the second section) led us to annotate the PPs with the following linguistic parameters:
(i) prepositional head;
(ii) initial phoneme of the prepositional complement (consonant/vowel);
(iii) complement type (toponym, temporal location, institution, other); and
(iv) semantic type (goal, source, location, and other).
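One possible way to store a single annotated PP is sketched below in Python. The class name, field names and value labels are hypothetical and only mirror the parameters listed above; the example record is invented and does not come from the corpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PPAnnotation:
    """One annotated PP; all names and value labels are hypothetical."""
    speaker_id: str
    sex: str               # sociolinguistic: "m" or "f"
    year_of_birth: int     # sociolinguistic
    l1: str                # sociolinguistic: speaker's native language
    education: str         # sociolinguistic: education level
    preposition: str       # (i) prepositional head, e.g. "v", "na"
    initial_phoneme: str   # (ii) "consonant" or "vowel"
    complement_type: str   # (iii) "toponym", "temporal", "institution", "other"
    semantic_type: str     # (iv) "goal", "source", "location", "other"
    p_drop: bool           # True if the preposition is omitted


# Invented record, for illustration only.
record = PPAnnotation(
    speaker_id="spk01", sex="f", year_of_birth=1960, l1="Avar",
    education="secondary", preposition="na", initial_phoneme="consonant",
    complement_type="other", semantic_type="goal", p_drop=True,
)
```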
Finally, we evaluated each speaker’s fluency in Russian, based on two metrics calculated for the examined interview fragment: their speech rate (average number of words per minute); and closeness to the L1 benchmark. The latter metric is the average number of deviations from L1 Russian per 100 words. We considered deviations at the morphological, syntactic and lexical levels, excluding P-drop (see the Appendix for the full list of types of deviations that we counted, along with the examples and the variant expected in L1 Russian). To arrive at a more accurate and less subjective measure, both of us (L1 speakers of Russian) examined each fragment independently. Then, the initial annotator of the text fragment compiled a final, “consensual” list of deviations, including those that had been overlooked during the first round. Having obtained the data on speech rate and deviation ratio, we converted these data into coefficients, with the 0.1 value attributed to the speaker exhibiting the lowest speech rate and the highest deviation ratio, respectively. The coefficient values for the rest of the speakers were calculated using the following formula from Osborne (2013, p. 142): 1 − (native speaker average − individual score) / ((native speaker average − lowest score) / 0.9). The resulting fluency index is the average of the two coefficient values.
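The coefficient formula can be checked with a short Python sketch. The function names and all numeric values below are hypothetical; in particular, the native speaker averages (150 words per minute, 0 deviations per 100 words) are assumptions made for illustration, since the benchmark values are not given in this section.

```python
def coefficient(score: float, native_avg: float, worst: float) -> float:
    """Coefficient following the formula quoted from Osborne (2013).

    Yields 0.1 for the worst-scoring speaker and 1.0 for a speaker
    at the native speaker average.
    """
    return 1 - (native_avg - score) / ((native_avg - worst) / 0.9)


def fluency_index(rate_coef: float, deviation_coef: float) -> float:
    """Fluency index: the average of the two coefficient values."""
    return (rate_coef + deviation_coef) / 2


# Hypothetical benchmark and worst-case values (not from the study).
NATIVE_RATE, WORST_RATE = 150.0, 60.0   # words per minute
NATIVE_DEV, WORST_DEV = 0.0, 12.0       # deviations per 100 words

# Invented speaker: 105 words per minute, 6 deviations per 100 words.
rate_coef = coefficient(105.0, NATIVE_RATE, WORST_RATE)
dev_coef = coefficient(6.0, NATIVE_DEV, WORST_DEV)
index = fluency_index(rate_coef, dev_coef)   # about 0.55
```

The same function serves both metrics because the formula only depends on the distance from the benchmark relative to the worst observed score, whichever direction counts as worse.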
Analysis
In this section we present a descriptive and inferential statistical analysis of the entire data set to find out which linguistic and sociolinguistic parameters have an effect on P-drop. At the end of this section, we discuss the context-type parameter for the relevant fraction of the data and show how it helps to reveal distinct P-drop patterns among speakers.
Descriptive statistics
The collected data set consists of 2350 PPs (50 from each of the 47 speakers), 421 of which involve P-drop. 29 speakers in our sample are male, 18 speakers are female.
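As a quick check of the figures above, 421 omissions out of 2350 PPs correspond to an overall omission rate of roughly 18%. The Python sketch below computes this and shows how per-preposition rates could be tallied; the function name and the toy records are invented for illustration and are not the study's data.

```python
from collections import Counter


def omission_rates(records):
    """Map each preposition to (omissions, total, rate).

    `records` is an iterable of (preposition, dropped) pairs.
    """
    totals, drops = Counter(), Counter()
    for prep, dropped in records:
        totals[prep] += 1
        drops[prep] += int(dropped)
    return {p: (drops[p], totals[p], drops[p] / totals[p]) for p in totals}


# Overall rate from the counts reported in the text.
overall = 421 / 2350   # about 0.179

# Invented toy records: (preposition, was it dropped?)
toy = [("v", True), ("v", False), ("na", False), ("na", False), ("iz", False)]
rates = omission_rates(toy)
```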
Let us look at how P-drop depends on the annotated linguistic parameters. As expected from previous research, different prepositions do not have equal propensity to be dropped. The bar plot in Figure 2 shows that P-drop only ever occurs with seven prepositions, namely v ‘in(to)’, na ‘on(to)’, s ‘with/from/off’, iz ‘from’, za ‘for, behind’, k ‘to’, pro ‘about’.

Figure 2. Number of omissions and productions per preposition.
The preposition v ‘in(to)’ exhibits an especially robust pattern: it is dropped in 42% of the PPs it heads; na ‘on(to)’ comes next with 13.5% of omissions. Minimal pairs illustrating the alternation between preposition omission and preposition retention in the speech of the same consultant are given for v ‘in(to)’ in (11) and for na ‘on(to)’ in (12).
(11) da, on yes, rodilsja on be_born. ‘Yes, he was born [lit. gave birth] in Chuvek, he was born in Chuvek.’ [arhit.хив.42]
(12) a. u nego vse zapisi at 3
[on] latin ‘he had all of his notes taken in latin script [lit. language]’ [yangikent.маллакент.40.]
b. vse zapisi vël all. ‘[he] took all notes in latin script [lit. language]’ [yangikent.маллакент.40.]
The data in Figure 2 suggest that the P-drop pattern displays the goal/source asymmetry discussed in the “Markedness principle” subsection, since the prepositions s ‘with/from/off’ and iz ‘from’ that may mark source are dropped much more rarely than the typical goal-marking prepositions v ‘in(to)’ and na ‘on(to)’. This idea is supported by the data in Table 1, which show that PPs encoding a goal path display P-drop much more prominently than PPs encoding a source path. Note, however, that PPs encoding location allow P-drop roughly to the same extent as goal-encoding PPs. This means that we cannot provide a uniform account for the observed P-drop pattern appealing to the goal/source asymmetry alone.
Table 1. Number of preposition (P)-omissions and semantic type of prepositional phrases.
Since the prepositions v ‘in(to)’ and na ‘on(to)’ are the only ones that are systematically omitted and also constitute the bulk of goal and locative PPs (81.6% and 93.8%, respectively), it is probably their inherent properties that are responsible for P-drop. We suggest that these prepositions are especially prone to omission because they are the ones used in most general locative and directional phrases, not necessarily specifying the relation between the figure and the ground. According to Comrie (1986), these are precisely the environments that tend to be least marked (cf. the “Markedness principle” subsection). Therefore, the prepositions v ‘in(to)’ and na ‘on(to)’ are grouped together in our further statistical analysis.
In Table 2 we can see that the frequency of P-drop does not seem to depend on the phonetic environment: prepositions are omitted more or less equally frequently before vowel-initial and consonant-initial complements.
Table 2. Number of preposition (P)-omissions and initial phoneme of the P-complement.
Let us now examine the relationship between P-drop and three sociolinguistic parameters: sex, L1 family and education level, visualized in Figure 3. Since some of the fifteen L1s were represented by only one speaker, we merged the languages according to their genealogy. As a result, in Figure 3 we have the (Nakh-)Daghestanian family (comprising Andi, Archi, Avar, Bagvalal, Akusha Dargwa, Itsari Dargwa, Mehweb Dargwa, Muira Dargwa, Tsudakhar Dargwa, Lak, Rutul, Tabasaran, and Tokita) and the Turkic family (Kumyk and Azerbaijani). A similar procedure was applied to the education level parameter. While the DagRus corpus distinguishes five levels of education (incomplete secondary, secondary, secondary specialized, incomplete higher, and higher), we unify the former four into non-higher education and contrast it with higher education, since some of these levels characterize one or two speakers only. Building on what we found about the ability of various prepositions to drop, in Figures 3–5 we plot the rate of omissions, only considering PPs that are headed by the seven Ps that are in principle omittable: this way we partially solve the problem of an uneven distribution of omittable and non-omittable prepositions across speakers.
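The two merging steps can be written down as simple lookups. The Python sketch below uses the language and education labels named in the paragraph above; the function names and the exact spelling of the labels are our assumptions.

```python
NAKH_DAGHESTANIAN = {
    "Andi", "Archi", "Avar", "Bagvalal", "Akusha Dargwa", "Itsari Dargwa",
    "Mehweb Dargwa", "Muira Dargwa", "Tsudakhar Dargwa", "Lak", "Rutul",
    "Tabasaran", "Tokita",
}
TURKIC = {"Kumyk", "Azerbaijani"}

NON_HIGHER = {
    "incomplete secondary", "secondary", "secondary specialized",
    "incomplete higher",
}


def l1_family(l1: str) -> str:
    """Merge an individual L1 into its language family."""
    if l1 in NAKH_DAGHESTANIAN:
        return "Nakh-Daghestanian"
    if l1 in TURKIC:
        return "Turkic"
    raise ValueError(f"unknown L1: {l1}")


def education_binary(level: str) -> str:
    """Collapse the five corpus levels into higher vs. non-higher."""
    if level == "higher":
        return "higher"
    if level in NON_HIGHER:
        return "non-higher"
    raise ValueError(f"unknown education level: {level}")
```

Raising an error on unknown labels, rather than defaulting to one category, makes annotation typos visible before they reach the model.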

Figure 3. Rate of omissions and sociolinguistic parameters (sex, native language, and education level).

Figure 4. Rate of omissions and year of birth.

Figure 5. Rate of omissions and fluency in Russian.
As can be seen from Figure 3, the difference between men and women with respect to the frequency of P-drop is more pronounced for the speakers of Turkic languages than for the speakers of Daghestanian languages. We can also see that the speakers of Turkic languages drop prepositions more frequently than the speakers of Daghestanian languages. Finally, speakers with higher education tend to omit prepositions more rarely than those with lower education levels.
Figure 4 shows how the ratio of omissions to the number of produced omittable prepositions depends on the year a speaker was born in. Each point represents one speaker. While the distribution of the points does not allow us to make a definitive conclusion, the linear trend (with the confidence interval around it) shows that there is no significant correlation between the year of birth and the rate of omissions: the confidence interval is too wide for us to be sure that the average number of preposition omissions decreases along the horizontal axis.
Figure 5 shows the relation between the rate of omissions and the speaker’s fluency in Russian. The latter is represented by an index combining speech rate and the ratio of deviations from L1 Russian (see the “Annotation” subsection for details). Each point, again, corresponds to one speaker. The linear trend reveals that speakers who are more fluent in Russian tend to omit prepositions less frequently.
In the following subsection we run a logistic regression analysis using the R software to assess the significance of the factors discussed above.
Logistic regression
The linguistic and sociolinguistic factors described in the “Descriptive statistics” subsection are summarized in Table 3.
Table 3. Variables and effect type.
In order to see how significant each factor in Table 3 is, we employed a logistic mixed-effects model (Baayen, 2008, pp. 242–259; Gries, 2013, pp. 293–315; Levshina, 2015, pp. 254–266). A mixed-effects model was most fitting since it allows incorporating both fixed effects and random effects (here, Speaker as a random effect). Aiming to arrive at an optimal model, we followed the backward stepwise variable selection procedure (Levshina, 2015, pp. 266–267), using the function drop1(). This function checks which predictor could be deleted to obtain a better fitted model (Gries, 2013, p. 266). We applied this function four times and, as a result, left out four parameters (sex, year of birth, education level, and language family), based on the values of the Akaike information criterion. The results of the logistic regression for the three remaining parameters (fixed effects) are presented in Table 4.
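The logic of backward selection by the Akaike information criterion (AIC) can be illustrated with a small Python sketch. It is not the analysis reported here, which was run in R with drop1(): the function `backward_select` is ours, and `toy_aic` with its invented gain values merely stands in for refitting the mixed-effects model after each deletion.

```python
def backward_select(predictors, aic):
    """Greedy backward elimination guided by AIC.

    At each step, compute the AIC of every model that lacks exactly one
    of the current predictors; drop the predictor whose removal gives the
    lowest AIC, and stop when no removal improves on the current model.
    """
    current = list(predictors)
    dropped = []
    while current:
        base = aic(current)
        candidates = {p: aic([q for q in current if q != p]) for p in current}
        best = min(candidates, key=candidates.get)
        if candidates[best] >= base:
            break
        current.remove(best)
        dropped.append(best)
    return current, dropped


# Invented numbers: how much each predictor reduces the deviance.
GAIN = {
    "preposition_type": 120.0, "fluency": 30.0, "phonetic_context": 8.0,
    "sex": 0.5, "year_of_birth": 0.3, "education": 1.0, "l1_family": 1.2,
}


def toy_aic(included):
    """Toy AIC: deviance plus a penalty of 2 per included predictor."""
    deviance = 2000.0 - sum(GAIN[p] for p in included)
    return deviance + 2 * len(included)


kept, dropped = backward_select(GAIN, toy_aic)
```

With these invented gains the loop removes four predictors and keeps three, which mirrors the outcome described above only because the toy values were chosen that way.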
Table 4. Fixed effects of the logistic regression.
Note: *, p-values below the 0.05 significance threshold; **, p-values below the 0.01 significance threshold; and ***, p-values below the 0.001 significance threshold.
Let us discuss the columns with numerical values in Table 4. The first column shows the estimates, which specify the slopes of the regression line. Positive estimates show that the relevant predictor contributes to P-drop, while negative values mean that the predictor (or its particular value) and P-drop are negatively correlated. The second column displays the standard errors of the estimated coefficients. The p-values in the fourth column are based on the z-statistics from the third column. They show how confident we can be in rejecting the null hypothesis that a parameter has no effect on P-drop. Asterisks mark predictors that are statistically significant, that is, those for which the null hypothesis of no effect can be rejected; their number reflects the degree of confidence.
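In a logistic model the estimates are expressed as log-odds, so their sign and size translate into probabilities through the inverse logit function. The Python sketch below illustrates this with hypothetical coefficients: the values of `INTERCEPT`, `B_FLUENCY` and `B_V_NA` are invented, not those of Table 4, and the random effect of Speaker is ignored for simplicity.

```python
import math


def inv_logit(log_odds: float) -> float:
    """Convert log-odds into a probability."""
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical coefficients on the log-odds scale (not the study's estimates).
INTERCEPT = -1.0
B_FLUENCY = -2.0   # negative: higher fluency lowers the probability of P-drop
B_V_NA = 1.5       # positive: v/na are more likely to be dropped


def p_drop_probability(fluency: float, is_v_or_na: bool) -> float:
    """Predicted probability of P-drop for one PP under the toy model."""
    log_odds = INTERCEPT + B_FLUENCY * fluency + B_V_NA * int(is_v_or_na)
    return inv_logit(log_odds)


low_fluency = p_drop_probability(0.3, True)
high_fluency = p_drop_probability(0.9, True)
```

Under these invented values the less fluent speaker has the higher predicted probability of P-drop, which is what a negative fluency estimate means.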
We can see that the intercept (corresponding to a situation when all continuous explanatory variables equal zero, and all categorical variables are at their reference levels) and all three predictors turn out to be significant. Figure 6 plots the effects of all three predictors.

Figure 6. Plots of the fixed effects of the logistic regression model.
These effects plots provide the predicted probability values of the outcome (P-drop) for given values of the predictors. These are obtained by “inserting” the value of a predictor into the model formula; the effect is calculated for one predictor at a time, while the other predictors are taken at mean values multiplied by their regression coefficients. A 95% pointwise confidence interval is drawn around the estimated effect of each predictor, based on standard errors computed from the covariance matrix of the fitted regression coefficients. The rug plot at the bottom of the upper left graph in Figure 6 shows the location of the fluency index values. Furthermore, Figure 6 shows that the most powerful predictors of P-drop are the speakers’ fluency in Russian and preposition type.
Additional observations
During the annotation process, we noticed that the speakers seem to exhibit different patterns of P-drop, depending on the type of preposition and semantic context. In this subsection, we classify the contexts into core and non-core, and show how context type and preposition type reveal the existence of three groups of speakers in our sample.
Contexts of P-drop
Recall that we annotated the collected PPs for the semantic type of the NP appearing with an overt or omitted preposition. In particular, we specified whether this NP denoted a toponym (13), an exact temporal location (14) or an institution (15)—these are NPs that have been observed in the previous literature to be prone to less formal marking (see the “Markedness principle” subsection).
(13) i [v] Čumljax rabotala ona učitelem
‘She worked as a teacher both in Chumli [and. . .].’ [yangikent.янгикент.55]
(14) [v] devjanosto sed’mom godu tam požar byl
‘In ‘97 there was a fire there.’ [shangoda.мегеб.syn-139]
(15) nu on mog by [v] institut postupit’ pravil’no že? net, [v] texnikum pošël
‘Well, he could have gone to college, right? But no, he went to a vocational school instead.’ [archib.арчиб.syn-114]
As can be seen from Table 5, the omission rate of the systematically dropped prepositions v ‘in(to)’ and na ‘on(to)’ turns out to be higher in these three contexts than elsewhere. In addition, these contexts account for the majority (66%) of P-drop cases, so we refer to them as core contexts.
Table 5. Omission of prepositions (Ps) v ‘in(to)’, na ‘on(to)’ and the semantic type of the complement.
Contexts of P-drop and inter-speaker variation
The speakers can be divided into three groups, according to contexts in which they drop prepositions (see Table 6).
Table 6. Groups of speakers according to their preposition drop patterns.
Note: ᵃ Importantly, all these speakers produced 6–15 prepositional phrases headed by v ‘in(to)’ and na ‘on(to)’ which do not belong to core contexts, so core contexts are indeed special for them.
Speakers in the first group only omit prepositions v ‘in(to)’ and na ‘on(to)’ and only in core contexts. The second group of speakers also omit only v ‘in(to)’ and na ‘on(to)’, but do it in other spatial, temporal, and more abstract contexts as well, such as those illustrated in (16) and (17).
(16) kogda Sovetskij Sojuz byl
when soviet. [in] general.
‘In the times of the Soviet Union people here mainly worked in the sovkhoz.’ [kina.кина.нд40]
(17) oni tol’ko [na] tabasaranskom razgovarivali
3
‘They spoke only Tabasaran.’ [arhit.хив.42]
The third group comprises speakers who omit v ‘in(to)’, na ‘on(to)’ and other prepositions in various contexts, e.g., (18) and (19).
(18) [za] pradedušku našego tože vyxodila ona
[for] great-grandfather.
‘She was married to our great-grandfather as well.’ [about a woman who married 12 times] [archib.арчиб.syn-138]
(19) svobodno možno bylo
freely possible be.
[s] rossijskim pasportom proexat’
[with] Russian.
An obvious question to ask at this point is whether the observed P-drop patterns correlate with the speakers’ fluency in Russian. We can see from Figure 7 that the higher the fluency index, the narrower the range of environments with P-drop. However, the difference between the groups does not reach statistical significance (p = 0.29, analysis of variance test).

Figure 7. Preposition-drop patterns and fluency in Russian.
Nevertheless, the fact that the speakers can be neatly divided into three well-defined groups suggests that there might be some psycholinguistic reality behind this classification: the closer the consultant’s speech is to L1 Russian, the narrower the scope of P-drop.
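The group comparison reported above relies on an analysis of variance test. As a rough sketch of what such a test computes, the following code derives the one-way ANOVA F statistic from per-group values. The fluency index values and group labels below are invented for illustration and are not taken from the corpus; the function name is likewise hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical fluency index values for the three speaker groups.
fluency_by_group = {
    "group 1 (core contexts only)": [2.1, 3.4, 1.8, 2.9],
    "group 2 (v/na in any context)": [3.0, 4.2, 2.5, 3.8, 3.1],
    "group 3 (other prepositions too)": [3.9, 2.7, 4.6],
}

f_stat, df_b, df_w = one_way_anova_f(list(fluency_by_group.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The p-value is then read off the F distribution with the two degrees of freedom returned here; a small F relative to that distribution corresponds to a non-significant difference between the groups.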
Discussion
Interpretation of the results
In this subsection we interpret the results of the statistical analysis and discuss what may underlie the significance of the three parameters—preposition type, fluency index, and phonetic environment.
Prepositions v ‘in(to)’ and na ‘on(to)’
We have seen that the prepositions v ‘in(to)’ and na ‘on(to)’ are the ones that are systematically dropped by the speakers featured in DagRus. A possible motivation for the observed pattern is that these prepositions may have quite abstract, ‘empty’ semantics, in particular, in the core contexts for P-drop that involve specific time-referring and place-referring NPs.
In fact, in many prepositional languages, a small group of nouns, such as town/city and street names, may appear without a preposition when denoting a location, as illustrated in (20)–(22). Based on a sample of 147 languages, Stolz et al. (2014, p. 287) found that toponyms in the function of location, direction and source are zero-marked in 90% of the languages. 13 Even Old Church Slavonic and Old Russian marked locative and directional phrases by case only, without prepositional ‘support’ (23). 14
(20) French (Mel’čuk, 2018, p. 272)
on s’est vu rue de Rivoli
‘We saw each other on Rivoli street.’
(21) Maltese (Stolz et al., 2017, p. 463)
jgħallem Għawdex
‘He teaches on Gozo (an island).’
(22) Marshallese (Schlossberg, 2018, p. 139 via Haspelmath, 2019, p. 317)
ļe e=j pād Lojkar
man
‘He is at Lojkar.’
(23) Old Russian (Polnoe sobranie russkih letopisej, 1846, p. 27)
Svjatoslav’’ bjaše Perejaslavci
Svjatoslav be.3
‘Svjatoslav was in [the town of] Pereyaslavl.’
In addition, Haspelmath (2019) distinguishes a group of what he calls topo-nouns “denot[ing] concepts which are commonly used as spatial landmarks” (Haspelmath, 2019, p. 322), see (24).
(24) Modern Greek (adapted from Terzi, 2010, p. 178)
pao/ime [sto] liman
go.
‘I go to/I am at the port.’
Turning now to temporal locations, Haspelmath (1997, pp. 116–119) observed that a class of expressions denoting “various time periods combined with modifiers, especially demonstratives, the adjectives ‘last’ and ‘next’, and the universal determiner ‘every’” systematically appears zero-marked in a number of languages, see (25)–(26).
(25) English
a. in the morning vs. this morning
b. on Friday vs. last Friday
(26) Tagalog (Haspelmath, 1997, p. 117)
a. sa Linggo
at Sunday
‘on Sunday’
b. tuwing Linggo
every Sunday
‘every Sunday’
We observe a similar pattern in our data: for instance, in PPs meaning ‘at this/that time’, omission of v ‘in(to)’ is more frequent than retention: P-drop occurs in 13 out of 18 such examples.
The patterns of P-drop that we observe in DagRus thus accord well with the typological tendencies. In fact, they are even more pronounced in the varieties of Russian spoken in Daghestan than in the languages mentioned above. We suggest that exact temporal locations (27) may be analyzed as modified time-denoting nouns and this is what expands the scope of contexts admitting less marking (P-drop).
(27) i vot dobralis’
and
[v] tri časa uže na plato
[in] three.
‘And so, we reached the plateau already at three o’clock.’ [karata.тукита.нд14]
It might be the case that Russian spoken in Daghestan and other contact-influenced varieties of Russian exhibit P-drop in a wider range of contexts than typically discussed in the literature because prepositional complements are inflected for case in Russian. Moreover, the morphological case required in spatial PPs headed by the prepositions v ‘in(to)’ and na ‘on(to)’ depends on whether the phrase encodes a goal of motion or a static location: v dom ‘into the house (
(28) a. ezdili [v] Maxačkalu ljudi otsjuda
go.
‘people from here used to go to Makhachkala’ [chankurbe.дуранги.add.syn-1]
b. [v] Maxačkale polučila pasport
[in] Makhachkala.
‘[she] got a passport in Makhachkala’ [archib.шалиб.syn-40]
(29) a. ego otpravili [na] godičnye kursy
3
‘They sent him to a one-year training program.’ [chuni.чуни.60ик]
b. vot [na] takie kursax obučalsja
‘So [I] took this kind of training program.’ [yangikent.янгикент.38]
Fluency in Russian
The index of fluency in Russian is another significant predictor for P-drop. That is, speakers who are more fluent in Russian tend to omit prepositions less frequently than those whose speech considerably deviates from the L1 benchmark.
Mastering prepositions is known to be a very challenging task in L2 acquisition (see Celce-Murcia & Larsen-Freeman, 1983; Covitt, 1976 for English). In particular, Celce-Murcia & Larsen-Freeman (1983) observe that language learners make three types of mistakes: (a) using the wrong preposition (30a); (b) omitting a required preposition (31a); and (c) using a superfluous preposition (32a). All three strategies are employed by the speakers in our data sample, as is evident from (30b), (31b), and (32b).
(30) a. L2 English (Celce-Murcia & Larsen-Freeman, 1983, p. 261)
My grandfather picked the name
b. DagRus Corpus [darvag.дюбек.нд15]
udarila
hit.
‘[she] hit [me] on the back’
(31) a. L2 English (Celce-Murcia & Larsen-Freeman, 1983, p. 261)
I served [in] the Army until 1964.
b. DagRus Corpus [archib.арчиб.syn-138]
[v] armii služil
[in] army.
‘[he] served in the Army’
(32) a. L2 English (Celce-Murcia & Larsen-Freeman, 1983, p. 261)
I studied in Biology for three years. (no preposition required)
b. DagRus Corpus [karata.тлибишо.нд32] 15
tol’ko na odnoj kartoške pitalis’
only on one.
‘[we] only fed on potatoes.’
Generalizing Celce-Murcia & Larsen-Freeman’s (1983, p. 250) ideas about English, the following factors may cause difficulties in the acquisition of prepositions in L2:
(a) information that is signaled by a preposition in L2 can be signaled by other means in L1: an inflection on a noun/article and/or a postposition; and
(b) L2 prepositions cannot be directly semantically mapped onto their functional equivalents in L1.
In light of the above, we can try to explain why the predominant “mistake” made by the Daghestanian speakers of Russian in the realm of PPs is omission of the required preposition. This may have to do with the fact that the L1s of our speakers (Nakh-Daghestanian and Turkic) are head-final and, thus, postpositional. In these languages, location in space and time is encoded to the right of the nominal (predominantly by case suffixes and sometimes also by postpositions), whereas in Russian the nominal may be marked both on the left (by prepositions) and on the right (by case suffixes). Therefore, we may expect that an individual who has not fully mastered Russian PPs will tend to omit, rather than replace or insert, a preposition in case of uncertainty. This is most expected in contexts where preposition choice is idiosyncratic and/or where the preposition can be omitted in the target language. In L1 Russian this is most clearly seen in the domain of temporal expressions (33); we note that the variation illustrated in (33c) and (33d) is apparently limited to phrases containing such modifiers as pervyj ‘first’ and poslednij ‘last’ and a restricted set of nouns in the accusative case.
(33) L1 Russian
a. v pjat’ časov
in five.
‘at five o’clock’
b. na sledujuščej nedele
on next.
‘next week’
c. (v) poslednee vremja
in latest.
‘recently’
d. (v) pervyj raz
in first.
‘first time/on the first occasion’
This explanation can be extended to other contact-influenced varieties of Russian mentioned in the second section, since in all those cases the minority language in contact with Russian is head-final. Of course, the most valid test for our hypothesis would be to take a head-initial, prepositional language in contact with Russian and see whether those speakers omit prepositions substantially less frequently; cf. Jarvis and Odlin (2000) who find that native speakers of Finnish (a postpositional language) omit required prepositions in their L2-English, while native speakers of Swedish (a prepositional language) do not. Possible candidates that are most similar to our case in terms of sociolinguistics would be speakers of Romani and speakers of German who have historically lived in Russia. Unfortunately, we do not have the type and amount of data that are needed for comparison with DagRus, so we leave this issue for future research.
Initial phoneme
The factor whose significance we are least certain about (based on the p-value in Table 4 and the effect size in Figure 6) is the initial phoneme of the P-complement. Recall that, according to our statistical model, vowel-initial complements are less conducive to P-drop.
In principle, we could appeal to the properties of the phonological systems of Daghestanian and Turkic languages that our informants natively speak, namely, the absence of the [v] (labiodental fricative) phoneme 16 and the ban on consonant clusters in the syllable onset (Kibrik & Kodzasov, 1990; Shiraliev & Sevortjan, 1971), as was done for other contact-influenced varieties of Russian (Khomchenkova et al., 2017; Stoynova, 2019). However, if we look at individual prepositions, we can see that it is the consonant–vowel preposition na ‘on(to)’ that displays a striking contrast between vowel-initial and consonant-initial complements. It is dropped before vowels in 3.6% of cases and before consonants in 16.8% of cases. The monoconsonantal prepositions v ‘in(to)’ and s ‘with; from, off’, on the other hand, do not show a sharp contrast: v ‘in(to)’ is dropped before 39.7% of the vowel-initial complements and 42.9% of the consonant-initial complements; the respective percentages for s ‘with; from, off’ are 6.5% and 6.8%. 17 Although we do not know the reason underlying the unexpected pattern exhibited by na ‘on(to)’, it definitely cannot be attributed to consonant cluster avoidance.
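One way to make the contrast between the three prepositions concrete is to convert the reported drop rates into odds ratios (consonant-initial versus vowel-initial complements). The sketch below uses only the raw percentages quoted above; it is a descriptive illustration, not part of the regression model, and since the underlying counts are not given here, it offers no confidence intervals. The function and variable names are our own.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1.0 - p)

def odds_ratio(p_consonant, p_vowel):
    """Odds of P-drop before a consonant relative to the odds before a vowel."""
    return odds(p_consonant) / odds(p_vowel)

# Raw drop rates quoted in the text, expressed as proportions.
RATES = {
    "na": {"vowel": 0.036, "consonant": 0.168},
    "v": {"vowel": 0.397, "consonant": 0.429},
    "s": {"vowel": 0.065, "consonant": 0.068},
}

ratios = {prep: odds_ratio(r["consonant"], r["vowel"]) for prep, r in RATES.items()}
for prep, ratio in ratios.items():
    print(f"{prep}: odds ratio (consonant vs. vowel) = {ratio:.2f}")
```

On these figures, the odds of dropping na ‘on(to)’ are roughly five times higher before a consonant than before a vowel, whereas the ratios for v ‘in(to)’ and s ‘with; from, off’ stay close to 1, which matches the asymmetry described in the text.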
Relevance to previous research
In the “Background” section we reviewed four main groups of approaches to P-drop: (a) phonetics/phonotactics-based accounts; (b) accounts appealing to morphosyntactic interference with other languages; (c) markedness-based analyses; and (d) formal syntactic accounts positing exceptional structure for P-drop constructions.
We have just argued that phonetic and phonotactic reasons are rather unlikely to be decisive for the P-drop exhibited by Daghestanian highlanders.
While it is not our aim here to provide a formal syntactic treatment of the phenomenon, if we were to do so, it would be clear that a pseudo-incorporation analysis would not be fitting or at least sufficient, as P-drop observed in DagRus is by no means restricted to bare, non-modified argument NPs. A formalization that appears to be most compatible with our data is the null κP proposal of Biggs (2015), since it is not predicated on the PP’s adjacency to a verb or bareness of the complement NP; in addition, it can capture the fact that NPs in P-drop environments in DagRus typically bear proper case morphology.
Our explanation bears a certain affinity to the markedness and interference accounts. Specifically, our idea is that the prepositions v ‘in(to)’ and na ‘on(to)’ are systematically dropped because they have a very abstract meaning in core contexts; this is close to the idea that these contexts are unmarked. Incomplete acquisition, which we appeal to as another factor conditioning P-drop, is related to the morphosyntactic influence of the consultants’ L1 on their Russian speech. However, this influence is more general than interference with the case and postpositional systems of a particular L1: it is the absence of marking on the left edge of the NP in the speakers’ L1 that makes the acquisition of the Russian prepositional system a particularly challenging task (see Jarvis & Pavlenko, 2008, p. 94 for similar ideas concerning L2 English acquisition).
Thus, our account combines and complements insights from previous research, providing a more general, bipartite explanation.
Conclusion
In this paper, we presented a quantitative corpus study of the phenomenon of P-drop in Russian spoken in highland Daghestan. Based on a data set of 2350 PPs from sociolinguistic interviews with 47 speakers, we found three significant predictors of P-drop: preposition type, fluency in Russian, and phonetic context. We take the first two factors and their synergistic effect to point to deeper reasons underlying P-drop.
The preposition type factor is a manifestation of a cross-linguistic tendency toward less formal marking of certain spatial and temporal locations, such as toponyms and their like, and referential temporal expressions. The idea is that P-drop is particularly prominent precisely with those prepositions that are employed in the aforementioned contexts: v ‘in(to)’, na ‘on(to)’ in Russian and their correlates in other languages.
The fact that a lower fluency level corresponds to more extensive P-drop is reminiscent of one of the strategies that L2 speakers employ to cope with incomplete acquisition of the prepositional system—to avoid prepositional marking in case of uncertainty.
The aforementioned typological tendency can be more or less pronounced in the languages of the world: for instance, several British English dialects exhibit P-drop in a slightly wider range of contexts than Standard English. When this tendency is coupled with incomplete acquisition, the probability that P-drop is extended to more contexts becomes higher than chance. Therefore, we find more P-drop in contact varieties, including pidgins and creoles.
We suggest that the contact-influenced varieties of Russian investigated so far appear to exhibit more prominent P-drop than other languages for two reasons: (a) Russian is in contact almost exclusively with head-final, postpositional languages; and (b) Russian prepositional complements are inflected for case, which encodes key semantic distinctions, such as location and direction, and their metaphorical extensions. The first factor leads L2 speakers of Russian to omit rather than replace prepositions in case of uncertainty. Due to the second factor, prepositions with abstract locative semantics can be dropped without a significant loss of meaning.
To summarize, the prominence of P-drop in the speech of Daghestanian highlanders results from an interplay of two factors: a cross-linguistic tendency for certain spatial and temporal locations to be formally unmarked; and incomplete acquisition of the Russian prepositional system on the part of native speakers of postpositional languages.
Appendix
Types of deviations from monolingual (L1) Russian counted for the purposes of assessing the fluency level.
agreement (noun-modifier; subject-predicate)
(A1) a. DagRus [arhit.хив.26]
sovetskij vlast’
‘Soviet.
b. L1 benchmark
sovetskaja vlast’
‘Soviet.
(A2) a. DagRus [balhar.балхар.нд22]
togda naši proigral
then our.
b. L1 benchmark
togda naši proigrali
then our.
‘Then we [the guys from our village] lost.’
choice of the form of a pronominal expression
(A3) a. DagRus [chuni.чуни.31н]
doroga byla, ego rasširili
road(
b. L1 benchmark
doroga byla, eё rasširili
road(
‘There was this road; it was widened.’
government
(A4) a. DagRus [makhachkala.add-2]
tradicionnyj islam nas učili
traditional.
b. L1 benchmark
tradicionnomu islamu nas učili
traditional.
‘We were taught traditional Islam.’
(A5) a. DagRus [yangikent.янгикент.41]
frukty ne bylo
fruit.
b. L1 benchmark
fruktov ne bylo
fruit.
reflexive verbs without the reflexive affix or vice versa
(A6) a. DagRus [kina.кина.нд40]
vot azerbajdžanskije èto otary byvalis’
b. L1 benchmark
vot azerbajdžanskije èto otary byvali
(A7) a. DagRus [karata.тлибишо.нд22]
sčetovod imel togda v škole
accountant possess.
b. L1 benchmark
sčetovod imelsja togda v škole
accountant possess.
aspect and tense form choice
(A8) a. DagRus [darvag.ерси.мд01]
vot èta kniga, ona najdena v Drezdene
b. L1 benchmark
vot èta kniga, ona byla najdena v Drezdene
found.
‘This book here, it was found in Dresden.’
(A9) a. DagRus [archib.арчиб.add.syn-1]
a začem mne posmotret’
b. L1 benchmark
a začem mne smotret’
‘And why should I look (at it)?’
word order and headedness in relative clauses
(A10) a. DagRus [rikvani.зило.нд01]
pošёl k kto znaet učiteljam
go.
b. L1 benchmark
pošёl k tem kto znaet. . .
go.
(A11) a. DagRus [archib.арчиб.add.syn-2]
v Maxačkale kotoryj oni postroili dom
in Makhachkala.
b. L1 benchmark
dom kotoryj oni postroili v Maxačkale
house which.
(A12) a. DagRus [shangoda.мегеб.syn-153]
u nas <. . .> reputacija xorošaja
at 1
byla čem u nix
be.
b. L1 benchmark
u nas <. . .> reputacija lučše
at 1
byla čem u nix
be.
‘our reputation was better than theirs’
lexical choice
(A13) a. DagRus [archib.арчиб.syn-138]
ručnyje knigi
b. L1 benchmark
rukopisnyje knigi
‘handwritten books’
other grammatical deviations
(A14) a. DagRus [archib.арчиб.add.syn-1]
a papa xočet akkuratno čtoby delat’
b. L1 benchmark
a papa xočet akkuratno delat’
c. L1 benchmark
a papa xočet akkuratno čtoby delali
Acknowledgements
We are grateful to all our colleagues who discussed previous versions of the manuscript with us, especially Peter Arkadiev, Ilya Chechuro, Michael Daniel, Nina Dobrushina, Ezequiel Koile, George Moroz, Maria Polinsky, Ilya Schurov, and Natalya Stoynova. We also thank the two anonymous reviewers and the editorial team of the International Journal of Bilingualism for providing critical feedback on the manuscript. Any remaining errors are our own.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The article was prepared within the framework of the HSE University Basic Research Program.
