Abstract
This study explores how decision-making is collaboratively accomplished in Video Assistant Referee (VAR) protocols as a form of technologically mediated, high-stakes professional communication. Using 107 official VAR recordings from the 2024–2025 Turkish Football Federation season, including a mid-season shift from Turkish to English as a lingua franca (EFL), the analysis uncovers how on-field referees (REFs) and VAR officials reach consensus. Findings reveal that VAR–REF coordination unfolds through three sequential phases: (1) establishing recipiency, (2) framing evidence, and (3) closing the protocol. Turkish-language episodes are relationally rich, using honorifics (e.g. hocam) and kinship terms (e.g. abi) to expand sequences and manage disagreement, while EFL episodes are procedurally concise and linguistically minimal, emphasizing clarity and progressivity. Across both settings, decisions emerge through coordinated talk, gesture, and embodied action. The study advances understanding of institutional interaction, multilingual teamwork, and technologically mediated judgment, highlighting how language shapes the organization of (dis)agreement in professional decision-making.
Keywords
Introduction
Referees are the constant decision-makers in a high-stakes professional football match. The main on-field referee is required to monitor every movement of 22 players across a 7000-square-metre field, continuously assessing whether each action complies with the Laws of the Game and deciding when to intervene and how to restore order when those rules are broken (International Football Association Board, 2025). There is no pause, no moment of stillness for at least 90 minutes. The referee’s attention is split across multiple sensory channels: The crowd’s roar, the players’ appeals, the movements of the assistants, and the referee’s own position on the field. While attending to the unfolding play, the referee simultaneously maintains audio and visual coordination with three on-field colleagues: The two assistant referees positioned on the touchlines and the fourth official standing between the benches. Together, they form a live network of human judgment, embodied vigilance, and verbal coordination (Zglinski, 2022: 6).
But this network extends beyond what the referees can see or hear on the pitch. Somewhere else, often in a remote operations room, another team of referees is watching the same match through multiple broadcast camera feeds. Every decision is subject to instant re-evaluation from different angles and in slow motion (Gasparetto and Loktionov, 2023). When a possible error is detected, this off-field team intervenes. Suddenly, a voice enters the earpiece: “Possible handball. I recommend an on-field review.” The main referee blows the whistle, makes the standard TV screen signal, jogs to the sideline, and watches the replay footage on a monitor (see Figure 1). Through the headset, the on-field main referee (REF) and the Video Assistant Referee (VAR) discuss the footage, each describing what they see, testing interpretations, and jointly working toward a single institutional outcome: The final decision.

Figure 1. The REF making the TV screen signal while blowing the whistle.
This scene captures the essence of football’s technologically mediated officiating system: An interactional space where human judgment and digital vision are combined under extreme temporal and institutional pressure. The introduction of VAR has thus transformed not only how decisions are made but also how professional communication is accomplished in real time. That is, decision-making in football is no longer a solitary or purely visual act. It has become a collective communicative achievement involving talk, gesture, technology, and accountability before a global audience (see Dyer, 2015 for a detailed review of the controversies surrounding sports technology).
While the VAR system was introduced to increase fairness and accuracy, it also introduced new challenges of coordination, understanding, and authority (Rogerson et al., 2024). The VAR and REF must rapidly establish mutual orientation, describe events, negotiate interpretations, and reach consensus, all through the constrained channel of headset communication. These moments of talk are brief but consequential: The timing of a single word, the form of address, or the tone of agreement can shape not only the decision but also the legitimacy of officiating itself. These interactions provide a unique window into how institutional reasoning is organized (and accomplished) through language and embodied conduct.
This study investigates how REFs and VARs establish recipiency, frame evidence, display (dis)agreement, and conclude the protocol during VAR interventions in the Turkish Football Federation’s 2024–2025 Super League season. The season is particularly distinctive because midway through, in Week 20, the Federation shifted the language of officiating from Turkish to English as a lingua franca (EFL) when international VAR officials were appointed (Atabay, 2024). This change provides an exceptional comparative opportunity to explore how professional coordination unfolds across native (Turkish) and non-native (EFL) communicative environments within the same institutional system. Drawing on 107 official VAR recordings released by the Turkish Football Federation, this study employs fine-grained sequential analysis to examine how communication practices (such as address forms, sequential timing, and embodied gestures) are used to secure attention and align decision-making across technological and linguistic boundaries. By analyzing these episodes, the study contributes to our understanding of how consensus and accountability are interactionally achieved in one of the most scrutinized institutional arenas of modern sport.
The paper is organized as follows. It first introduces the professional and technological configuration of refereeing and the interactional ecology of the VAR system. The next section outlines the data and method, detailing how official recordings were collected and analyzed. The subsequent section presents the findings, organized around the three sequential phases of the VAR protocol. The paper concludes by discussing the implications of these findings for research on institutional communication, multilingual coordination, and decision-making under technological mediation. At the core of the analysis lies a comparative focus on how decision-making is interactionally accomplished in English-as-a-lingua-franca (EFL) VAR protocols versus native Turkish VAR protocols.
Refereeing in professional football and VAR
Refereeing in professional football is a collective and distributed activity organized through multiple interdependent roles (see Aragão e Pina et al., 2018). A standard officiating team consists of four on-field referees who jointly sustain the order of the match through coordinated observation, communication, and decision-making. The main referee (REF), also known as the center referee, is the ultimate authority on the pitch, responsible for enforcing the Laws of the Game, managing player conduct, and rendering final judgments. Supporting the REF are two assistant referees (linesmen) positioned along the touchlines. They contribute localized monitoring of offside positions, boundary decisions (throw-ins, corner kicks, goal kicks), and fouls occurring in their immediate zones. A fourth official (4RE), located near the technical area, manages substitutions, controls the display of stoppage time, and assists in administrative and disciplinary matters (International Football Association Board, 2025: 69–79).
Over the past decade, this configuration has been expanded by the introduction of the Video Assistant Referee (VAR) system, which integrates digital technology into the referee team’s decision-making infrastructure (Spitz et al., 2020). The VAR system is designed to support the on-field officials in resolving incidents that may critically affect the outcome of a match. According to FIFA regulations, VAR intervention is restricted to four categories of potentially decisive incidents: (1) the scoring of a goal and any preceding infringement; (2) penalty decisions; (3) direct red-card offenses; and (4) cases of mistaken identity in disciplinary sanctions (de Oliveira et al., 2023).
The VAR team, operating from a remote Video Operations Room (VOR), typically consists of a Video Assistant Referee (VAR), an Assistant VAR (AVAR), and a Replay Operator (RO). The VAR and AVAR are licensed referees trained in video review procedures, while the Replay Operator handles the technical control of broadcast feeds and camera angles (see Figure 2). Together, they continuously monitor live broadcast footage from multiple synchronized cameras to identify potential “clear and obvious” errors. When a possible error is detected, the VAR communicates directly with the REF through the headset system, describing the incident and recommending a review. The REF may accept the information immediately or conduct an On-Field Review (OFR) using the sideline monitor. Regardless of VAR input, the final decision always rests with the REF, who signals an on-field review by forming a rectangular “TV screen” gesture before approaching the monitor (International Football Association Board, 2025: 146–156).

Figure 2. The VAR room.
This communicative network creates a technologically mediated and institutionally regulated interactional ecology. All four on-field officials (i.e. the REF, two assistants, and 4RE) remain connected via an open-microphone, full-duplex communication system, which enables real-time, simultaneous talk and listening. The VAR team is linked to this same network, meaning that both co-present participants (the on-field referees and players) and non-co-present but hearable participants (VAR officials in the VOR) are integrated into a single audio field. Communication is encrypted, recorded, and time-stamped for transparency and later assessment. During a VAR review, however, only the VAR and the REF are authorized to exchange talk relevant to the decision, while the assistants and the 4RE typically listen silently unless explicitly addressed (Zglinski, 2022).
From an interactional point of view, the VAR protocol represents an institutional form of talk-in-interaction conducted in a high-stakes, time-critical, and technologically mediated environment (see Drew and Heritage, 1992). Communication between the VAR and REF is not merely the exchange of factual information but the joint accomplishment of decision-making under accountability constraints. Each review unfolds as a sequentially organized episode in which participants must establish recipiency, display mutual understanding, and achieve an actionable conclusion, often within seconds, while multiple audiences (players, coaches, spectators, and broadcast commentators) observe the process.
Comparable to other institutional contexts such as emergency service dispatch calls (Whalen et al., 1988), crisis management in London Underground line control rooms (Heath and Luff, 1992), or cockpit coordination in aviation (Nevile, 2004), VAR communication is characterized by its emphasis on precision, timing, and coordination: Talk is minimal, highly routinized, and embedded in an environment of overlapping visual and auditory activities. It is through this organized interplay of talk, technology, and embodied conduct that referees collectively maintain the integrity and pace of the game.
Within this framework, the present study investigates how VARs and REFs reach consensus during the review sequences, focusing first on how VARs secure the attention of REFs who are simultaneously managing the on-field environment, then on how both parties orient to their accounts, and finally on how they bring the VAR protocol to an end. By comparing interactions conducted in Turkish and in English as a lingua franca (EFL), the study also highlights how linguistic and cultural resources are mobilized to coordinate professional judgment and uphold institutional accountability within football’s evolving, technology-mediated officiating system (see Mauranen, 2018 for the conceptualization of English as a lingua franca).
Data and method
This study is based on the official VAR recordings released by the Turkish Football Federation (TFF) on its official YouTube channel during the 2024–2025 Turkish Super League season (Turkish Football Federation, 2025). The dataset provides high-quality, synchronized audio–visual materials capturing the real-time communication between the REF and the VAR officials during match incidents. These recordings include both the verbal exchanges within the communication system and the embodied conduct of REFs on the pitch (e.g. gestures, whistle use, and monitor consultation). In total, 107 documented VAR interventions were systematically collected and archived for detailed analysis.
A significant development in this dataset is the mid-season shift in the institutional language of communication. From Week 20 onwards, the TFF introduced international VAR officials, and all subsequent VAR–REF exchanges were conducted in English as a lingua franca (Atabay, 2024). This transition from Turkish to English interaction offers a rare opportunity to compare two institutional communication settings within the same sport and organizational system. It allows for a fine-grained, comparative examination of how VARs and REFs establish recipiency, initiate review sequences, and negotiate decision-making in native (Turkish) and non-native (EFL) professional environments.
Across the season, the dataset covers a diverse range of VAR interventions with match-changing consequences. Of the 107 documented cases, 54 resulted in penalties (32 for defender foul play and 21 for handball offenses). VAR recommendations also led to 20 direct red cards, primarily for violent conduct or denial of an obvious goal-scoring opportunity, and the cancellation of 16 goals (10 for attacking fouls, 4 for offside, and 2 for handball). Additionally, 10 penalty calls were overturned: four due to no foul, three for no handball, two for offside, and one for deceptive foul play. One disciplinary card was canceled following confirmation of offside interference. Also, seven cases involved explicit disagreement between the VAR and the REF regarding the appropriate on-field decision (see Icbay, 2026). The disagreement cases refer to instances where the REF rejects the VAR’s intervention and maintains their original decision. These moments of divergence, as a form of breaching the preferred agreement routine, provide critical interactional evidence of how (dis)alignment is achieved, resisted, or reformulated through talk and embodied conduct in technologically mediated settings (see Garfinkel, 1967 for breaching experiments). The full list of interventions and their classifications is presented in Appendix 1.
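The outcome counts reported above can be tallied as a quick arithmetic check. The following is a minimal sketch: the figures are copied from the preceding paragraph, while the dictionary keys and function name are hypothetical labels chosen for illustration and are not the classifications used in Appendix 1.

```python
# Counts copied from the paragraph above. The keys are hypothetical labels
# for this sketch, not the category names used in Appendix 1.
REPORTED = {
    "penalty_awarded": {"stated": 54, "parts": {"defender_foul": 32, "handball": 21}},
    "direct_red_card": {"stated": 20, "parts": None},
    "goal_cancelled": {"stated": 16, "parts": {"attacking_foul": 10, "offside": 4, "handball": 2}},
    "penalty_overturned": {"stated": 10, "parts": {"no_foul": 4, "no_handball": 3,
                                                   "offside": 2, "deceptive_foul_play": 1}},
    "card_cancelled": {"stated": 1, "parts": None},
    "disagreement": {"stated": 7, "parts": None},
}


def tally(reported):
    """Return {category: (stated total, sum of itemized sub-counts)}."""
    rows = {}
    for name, entry in reported.items():
        parts = entry["parts"]
        # Categories without an itemized breakdown are taken at face value.
        summed = sum(parts.values()) if parts else entry["stated"]
        rows[name] = (entry["stated"], summed)
    return rows


rows = tally(REPORTED)
grand_total = sum(stated for stated, _ in rows.values())
mismatches = [name for name, (stated, summed) in rows.items() if stated != summed]

print(rows)
print("sum of stated category totals:", grand_total)
print("categories whose sub-counts differ from the stated total:", mismatches)
```

Run as written, the check reports that the two itemized penalty sub-counts sum to 53 against a stated 54, and that the six stated category totals sum to 108 against 107 documented interventions. Whether this reflects an unitemized case or overlap between categories cannot be settled from the summary figures alone; the case-level classification in Appendix 1 is the basis for reconciling them.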
This study adopts Ethnomethodology and Conversation Analysis (EMCA) as its analytic framework to examine how REFs and VARs collaboratively accomplish decision-making in football. Rooted in Garfinkel’s (1967) conception of Ethnomethodology (EM), the study views decision-making not as a pre-given structure but as a practical accomplishment emerging from the participants’ situated actions and interpretive work (Icbay, 2025a). Within this framework, Conversation Analysis (CA) (Sacks et al., 1974) offers a set of established methodological tools for tracing how institutional tasks are accomplished turn by turn. In this study, the analysis draws in particular on (1) turn-taking organization (e.g. how participants manage timing, overlap, and progressivity), (2) adjacency pairs and sequence organization (e.g. summons–answer, recommendation–acceptance/decline, and practices of sequence expansion and closure), (3) repair (how troubles of hearing/understanding are addressed and resolved), and (4) preference organization (how agreement and disagreement are designed, delayed, mitigated, or accounted for). These tools are complemented by a multimodal orientation to how embodied conduct (gaze, gesture, headset-touching, whistle use, the TV-screen signal, and movement to the monitor) is temporally coordinated with talk in accomplishing each phase of the VAR protocol.
Rather than treating the VAR protocol as a fixed institutional script, this study approaches it as an interactional achievement, continually negotiated through talk, gesture, gaze, and technological mediation. The analysis relies on naturally occurring, video-recorded data, transcribed using Jeffersonian conventions (Jefferson, 2004) to capture both verbal and embodied conduct (e.g. the rectangular “TV signal,” whistle use, and gaze shifts). Each analytic claim is grounded in participants’ own displayed orientations. VAR sequences are institutionally formatted interactional practices (Drew and Heritage, 1992), comprising ordered stages of initiating review, framing evidence, (dis)agreeing on interpretation, and closing the protocol. These sequences reveal the dynamic display of epistemic (knowledge) and deontic (authority) stances (Heritage, 2012; Stevanovic and Peräkylä, 2012), showing how professional authority in refereeing is interactionally negotiated rather than procedurally imposed.
In this paper, epistemic stance refers to the participants’ moment-by-moment displayed orientations to knowledge, perception, and interpretive access. That is, it demonstrates how speakers index what they know, how they know it, and with what degree of certainty and authority they can treat a description as warranted. Following the EMCA work on epistemics, there is a difference between epistemic status and epistemic stance. Epistemic status concerns the relatively stable, role- and context-based distribution of access to knowledge (who is structurally positioned to know and see what), whereas epistemic stance concerns the locally displayed position a speaker takes in a given turn (e.g. claiming strong vs weak access, asserting vs querying, upgrading vs downgrading certainty) (Heritage, 2012; Heritage and Raymond, 2005).
This distinction is particularly consequential in VAR decision-making because the protocol creates systematic epistemic gradients. The REF’s epistemic status is grounded in embodied, real-time perception and the management of the match environment, while the VAR team’s epistemic status is grounded in technologically mediated access to multiple camera angles, replay, slow motion, and comparative viewing. However, the analyses in this paper do not treat epistemic authority as a fixed attribute of a role. Instead, this study explores how epistemic authority is enacted, challenged, and ratified turn by turn. In the following excerpts, I track recurrent practices through which the participants display epistemic stance, such as: (1) claims of perception and evidential grounding (e.g. formulations that tie an assessment to a particular view, angle or replay moment), (2) graded assertability (strong assertions vs tentative proposals, and upgrades/downgrades of certainty), (3) epistemic primacy displays (treating one’s access as decisive, or inviting the co-participant’s access through interrogatives and candidate understandings), and (4) epistemic negotiation practices through which participants test interpretations and pursue alignment (Curl and Drew, 2008; Heritage, 2012; Heritage and Raymond, 2005).
The epistemic stance in VAR talk is not only about “having seen” an event. Rather, it is also about the competence to interpret what is seen in institutionally relevant terms. Accordingly, epistemic stance is treated as bound up with the production of professional vision: the socially organized ways in which professionals are trained to see, categorize, and render events accountable through institutionally sanctioned coding schemes (Goodwin, 1994). In the VAR interaction ecology, the epistemic stance is routinely displayed through practices that render visual evidence consequential: directing attention to a specific moment, angle, or point of contact, invoking rule-relevant categories, and treating particular visual details as sufficient grounds for an institutional action (e.g. continuing the check, recommending an on-field review, or closing the sequence).
Accomplishing decision-making in VAR protocols
The VAR protocols are analyzed through their temporal and sequential unfolding, focusing on three interactional phases: (1) initiating the protocol, (2) framing evidence, and (3) closing the protocol. This analytic segmentation follows the key principle of “order at all points” (Sacks, 1985), emphasizing how participants (REFs and VARs) themselves organize and display the procedural stages of an activity through talk, timing, and embodied conduct. Each phase captures a locally meaningful trajectory in which participants orient to the institutional task of decision-making within the constraints of real-time gameplay. These dimensions also provide an interactional basis for comparing how consensus is achieved across Turkish and EFL settings. In particular, the analysis attends to turn construction (the sequential development of disagreement and agreement) and embodied coordination (e.g. gaze, gesture, bodily alignment, and technological mediation) through which the referee displays understanding, stance, and accountability.
However, these interactional phases in the VAR protocols are the researcher’s arbitrary categorizations, an analyst’s formal interpretation, not the participants’ own orientation to their construction of decision-making as an interactional accomplishment. For the referees in the game, the only stage that they orient to as a decision-making process is the continuous 90-minute period (Icbay, 2022). At the same time, temporal structuring is not an analytic imposition but a participants’ phenomenon, something accomplished through their own orientations (Garfinkel, 1967). By segmenting the VAR–REF review into phases such as establishing recipiency or framing evidence, the analysis does not impose an external chronology but recovers the orderliness that the participants themselves display. For example, the transition from “VAR recommends on-field review” to “REF approaches the monitor” is marked by distinctive embodied and linguistic cues (e.g. whistle, TV-sign gesture, mutual acknowledgment). These actions publicly display mutual orientation to a new phase in the institutional sequence, thereby constituting the activity’s temporal organization.
Thus, the analytic claims in the following sections are grounded in the participants’ publicly available orientations rather than in analyst inference or contextual knowledge. Following the core EMCA principles, each excerpt is examined through (1) action formation (what social/institutional action a turn is designed to implement), (2) sequence organization (how actions are positioned and made conditionally relevant in adjacency pairs and expansions), and (3) turn design and turn-taking (lexical choice, address terms, modality, timing, overlap, and progressivity) (Sacks et al., 1974; Schegloff, 2007; Sidnell, 2010; ten Have, 2007). The analyses rely on the next-turn proof procedure, treating subsequent uptake, compliance, resistance, repair, or reformulation as the primary evidential basis for what prior turns are understood to be doing (Heritage, 1984; Hutchby and Wooffitt, 2008; Schegloff, 1992).
Because VAR review is conducted in a technologically mediated and video-supported ecology, the analysis also takes a multimodal perspective. That is, talk is analyzed together with the temporally coordinated embodied conduct through which officials display recipiency, suspension of on-field activity, transition to review, and closure (see Goodwin, 2000; Heath et al., 2010; Mondada, 2013). Accordingly, the analytic claims about “authority,” “certainty,” or “disagreement” are only made when they are demonstrably displayed in turn design and sequential consequences (e.g. mitigations, delays, accounts, repair initiations, upgraded directives, or explicit confirmations) rather than presumed from role relations. Analytically, I proceed in a constant-comparative way typical of CA: I build collections of recurrent practices across the dataset (e.g. summons–answer openings; recommendation/acceptance formats; evidence-framing formulations; closure confirmations), analyze multiple instances to establish the practice’s interactional environment and variability, and then select excerpts that are representative and/or deviant to test the robustness of the account (Drew, 1995; Schegloff, 1992).
Initiating the protocol
In the VAR–REF system, the opening phase of each review sequence begins with the establishment of recipiency: The interactional practice by which one participant’s talk (VAR) becomes recognizably directed to another (REF), the recipient ratifies their participation, and both parties move toward a jointly coordinated institutional action (Ford and Stickle, 2012; Lerner, 2003; Mortensen, 2009). In a typical VAR protocol, this is achieved through a summons–answer sequence, a fundamental conversational structure in which the first pair part (FPP) summons the recipient’s attention, and the second pair part (SPP) ratifies that summons and establishes mutual orientation (Schegloff, 1968).
Episode 1: W28KA77—EFL.
In Episode 1 (W28KA77), recipiency is accomplished through a sequentially ordered adjacency pair between lines 1–3. The summons (“umut, julian speaking” in line 1) displays both recognition and self-identification. That is, the inclusion of both names is a reflexive procedure that secures mutual recognition across physical distance and possible linguistic diversity (Icbay, forthcoming). The answer (“yea-” and “yes”) is accompanied by the REF’s touching of the headset, an embodied signal to nearby players and assistants in the field that a remote communication is occurring. This gesture serves as an embodied display of recipiency (Goodwin, 2000), indexing to multiple audiences that the talk is being received and the REF is entering a new institutional practice. This adjacency pair also fulfills the conditions of conditional relevance (Schegloff, 1987): The VAR’s summons makes a response relevant, and the REF’s reply completes the adjacency pair, thereby enabling the subsequent institutional move (i.e. the VAR’s recommendation in lines 4–5). The subsequent embodied actions (i.e. REF’s whistle, TV signal, and run to sideline) display the closure of the recipiency phase and the transition to review action.
Similarly, in Episode 2 (W15SG96), the opening summons–answer pair occurs in lines 1–2, where the VAR’s summons (“turgut hocam” in line 1) projects both attention and deference. The REF’s immediate answer (“I’m listening” in line 2) demonstrates unmarked uptake (Heritage, 1984), ratifying the summons and displaying readiness for task engagement. The subsequent embodied moves (lines 5–7) once again mark the completion of the recipiency phase and transition into the review.
Episode 2: W15SG96—TUR.
In Episode 3 (W37KA07), the VAR’s self-identification (“jeroen speaking” in line 2) ensures recognition across linguistic variation: A practice rarely seen in the Turkish corpus, where the participants (i.e. Turkish speaking REFs and VARs) share a common linguistic and cultural frame. The REF’s minimal “okay” in line 3 functions as a continuer rather than an affiliative response, allowing the sequence to progress efficiently without additional relational elaboration.
Episode 3: W37KA07—EFL.
Similar to the recipiency-establishing practice in the previous three episodes, in Episode 4 (W25SK76), the REF’s “yes, I’m here” in line 2, produced together with the act of touching the headset, is both a verbal and an embodied confirmation. It again indexes to multiple audiences that the talk is being received and the REF is entering a new institutional practice.
Episode 4: W25SK76—EFL.
The four opening sequences so far have demonstrated how VAR–REF review sequences are opened through recipiency-establishing practices that make the VAR’s talk recognizably addressed to the on-field referee and secure the referee’s ratified participation before the institutional task (the review recommendation) proceeds (Ford and Stickle, 2012; Mortensen, 2009). Across the four episodes (EFL and Turkish), recipiency is recurrently accomplished via a summons–answer adjacency pair, often built through address terms/naming and, in some cases, self-identification (“X speaking”), which works as a turn-allocation device that selects the next speaker and establishes mutual orientation (Lerner, 2003; Sacks et al., 1974; Schegloff, 1968). The summons makes a response conditionally relevant, and the referee’s minimal uptake (“yes/okay/I’m listening”) functions as the second pair part that completes the opening and enables the next institutionally relevant move (the VAR’s recommendation) (Schegloff, 1968, 2007). Recipiency is also shown to be multimodal: REF’s headset-touching, whistle use, and the TV-screen gesture display to co-present players and officials that a remote exchange is underway and that the REF is transitioning into a distinct phase of work, consistent with findings on embodied action and multimodal coordination in professional settings (Ford and Stickle, 2012; Goodwin, 2000; Heath et al., 2010; Mondada, 2013). Finally, the referee’s “okay”-type responses can be treated as progressivity-oriented continuers that keep the sequence moving efficiently toward review initiation rather than expanding affiliative talk, a well-documented interactional resource in CA (Schegloff, 1982).
In some sequences, however, the summons and the review directive are produced within the same turn, integrating both actions before any ratifying response. This structure slightly alters the sequential trajectory but maintains conditional relevance through immediate post-directive acknowledgment. In the following two episodes (W35BF55 and W11SA23), for example, the summons and directive are syntactically and sequentially fused, making the VAR’s turn a compound action that both summons and instructs. As a result, the REF’s response (“okay” in line 4 in Episode 5, and “okay hocam” in line 4 in Episode 6) serves a dual function: ratifying the address and accepting the institutional directive.
Episode 5: W35BF55—EFL. Episode 6: W11SA23—TUR.
In addition, a recurrent feature of the Turkish episodes (Episodes 2 and 6) is the use of the honorific address term hocam. In these Turkish sequences, recipiency is achieved through an honorific address term, as in “turgut hocam” in Episode 2 or “tamam hocam” in Episode 6. The term hocam (literally “my teacher” in Turkish) is a multifunctional indexical resource that operates at the intersection of language, culture, and institutional interaction (Keshavarz, 2022). Its use in VAR protocols carries both interactional and moral work: It acknowledges the recipient’s epistemic authority while simultaneously displaying affiliation and respect (Heritage, 2012; Hofstede, 2001). In contrast, EFL sequences exhibit a more procedural orientation to recipiency. Summonses are shorter and stripped of relational markers, as in “burak, on-field review” in line 1 in Episode 5 or “atilla jeroen speaking” in lines 1–2 in Episode 3. The absence of honorifics reflects a lingua-franca environment (Baker, 2018; Firth, 1996), where participants draw on minimal linguistic resources to maintain clarity and efficiency.
The next interactional practice within the opening phase involves the VAR’s formulation of the review request. Once mutual attention and readiness are secured, the VAR proceeds to the core institutional action: Delivering the official recommendation to initiate an on-field review (see Cunningham et al., 2012 for a similar finding in refereeing). This action is formatted through a standardized turn containing the formulaic expression “I recommend you an on-field review” followed immediately by the VAR’s account specifying the reason for the recommendation. The reason component provides the relevant grounds for the upcoming review, such as “possible handball” (line 5, Episode 1), “potential red card” (line 4, Episode 2), or “possible penalty” (line 4, Episode 3).
This first focal action is designed in a highly routinized, procedure-oriented format (“I recommend you an on-field review”), which is characteristic of institutional settings where the participants must produce actions that are recognizable, efficient, and accountable as doing the job rather than negotiating interpersonal relations (Drew and Heritage, 1992). This recommendation is immediately followed by a reason/grounds component (e.g. “possible handball,” “potential red card,” “possible penalty”), which functions as an account that specifies what the recommendation is for and thereby makes the upcoming disruption of play publicly warrantable (Houtkoop-Steenstra, 1990). Providing grounds here is not an “extra description” but a sequentially strategic practice: It projects the next course of action (review initiation) and can pre-empt delay, uncertainty, or resistance by supplying the relevant basis for compliance in the same turn as the directive/recommendation (Heritage and Lindström, 2012).
The closing of the opening phase is marked by the REF’s acceptance of this recommendation. Acceptance in the episodes here is not usually verbalized but accomplished through a recurrent multimodal action sequence, composed of three consecutive embodied moves: (1) producing the TV screen signal with the hands, (2) simultaneously blowing the whistle to suspend play, and (3) running toward the sideline to access the on-field review monitor. These actions are temporally and sequentially aligned, forming a recognizable and institutionally ratified pattern across all examined cases.
The VAR’s review recommendation and the REF’s response together form another adjacency pair within the opening phase of the VAR protocol. The first pair part (FPP) is the VAR’s directive-format proposal (i.e. “I recommend you an on-field review”), which makes conditionally relevant a second pair part (SPP) in which the REF either accepts or rejects that recommendation. In principle, the VAR’s suggestion is recommendatory, not binding: The REF retains ultimate authority to decline the review and allow play to continue. Thus, the structure embodies an asymmetrical but negotiable distribution of deontic rights (Stevanovic and Peräkylä, 2012): The VAR is entitled to propose, but the REF is authorized to decide.
From an interactional perspective, this recommendation-acceptance/decline pair is positioned as a next action made conditionally relevant by the completion of the summons–answer sequence. Once recipiency has been established, the interactional slot is open for the VAR to advance the review proposal. In this position, the VAR’s turn performs both a directive and an institutional report: it calls upon the REF to act (review the case) while simultaneously displaying the VAR’s epistemic access to video evidence unavailable to the on-field official (Heritage and Raymond, 2005: 16).
The reason component functions as an accounting practice that legitimizes the review proposal and ensures procedural transparency (Garfinkel, 1967). By providing a minimal but sufficient explanation—such as “for a possible handball at the APP”—the VAR demonstrates professional accountability and mitigates the deontic asymmetry of directing another official’s action. This design reflects the institutional requirement for rationalization of action (Heritage and Clayman, 2011): the VAR’s authority to recommend is contingent on being seen to act on observable, evidence-based grounds.
Framing evidence
The second major phase in the VAR–REF interaction is the framing of evidence: A phase in which the participants collaboratively produce, display, and evaluate the relevance of visual materials to construct a justified and accountable decision (Cunningham et al., 2012). Framing evidence here refers not only to the selection of video segments but also to the interactional work through which the VAR and REF render those segments intelligible, observable, and accountable as professional evidence (see Goodwin, 1994). The production of an evidential account is therefore not merely a technical display but a socially organized achievement, embedded within institutional roles, linguistic resources, and technological mediation (Due and Licoppe, 2020; Lynch, 1988). The framing evidence phase thus accomplishes two parallel goals: (1) to display that the REF’s final judgment is based on observable phenomena, and (2) to publicly justify the decision-making process to an imagined audience of players, officials, and viewers. In this way, the talk constitutes a form of reflexive accountability: The interaction itself provides evidence that due process has been followed.
Episode 7 (W5KF35) is taken from the match between Kasımpaşa and Fenerbahçe. In the 35th minute, the referee allows play to continue even though a Fenerbahçe player (number 97) is injured following a challenge by a Kasımpaşa defender (number 20) inside the penalty area. The referee’s initial decision is not to award a foul or penalty, allowing play to go on. However, a few seconds later, the VAR team intervenes and recommends an on-field review.
Episode 7: W5KF35—TUR.
The episode begins with the establishment of mutual contact between the REF and the VAR (lines 1–8). Through this recipiency work, the participants display mutual attention and readiness to engage, marking the transition into the institutional task. Following this opening, the framing evidence phase begins as the REF runs toward the sideline to inspect the footage provided by the VAR team. While the REF is in motion, the VAR delivers the first preliminary account of the event, offering an early formulation of what the video evidence shows: “number twenty steps on the foot of number ninety seven” (lines 9–14) (see Figure 3). This early verbal framing is a crucial resource that orients the REF to the focal point of the upcoming review, effectively shaping how the evidence will be seen and interpreted. Upon reaching the monitor, the REF initiates the formal viewing sequence with “let’s see, yes” (line 17), which signals both his physical engagement with the screen and his acknowledgment of the VAR’s prior formulation. In doing so, the REF claims epistemic access to the visual data and begins to recontextualize the VAR’s account through his own observation (see Figure 4).

Figure 3. The VAR’s initial account while the REF is running to the OPR.

Figure 4. The REF in the upper right corner, the VAR officials in the lower right corner, and the shared review screen displaying the relevant match footage on the left side.
During the initial viewing of the first camera angle, the VAR highlights the critical moment in the footage by explicitly verbalizing the key action: “yes, this is the step moment” (line 21). This utterance is produced in immediate response to the REF’s own preliminary reformulation of the scene—“after the ball left, number twenty” (line 19), which functions as an early interpretive move identifying the relevant conduct (“the step on the foot”). Through this adjacency pair, the VAR’s utterance operates as a second-position confirmation (Heritage, 2012), displaying alignment with the REF’s interpretation and thereby stabilizing the evidential focus of the review. Concurrently, the REF orients to another aspect of the institutional decision framework by ticking off a procedural criterion: “the ball is in the field” (line 22), followed by its immediate self-repetition in the next turn, “yes, the ball is in the field” (line 23). This self-confirmation acts as an audible demonstration of verification, publicly displaying that the REF is systematically evaluating the situational conditions for the incident (International Football Association Board, 2025). The VAR’s subsequent agreement (“yes hocam” in line 24) serves both as ratification of the REF’s assessment and as an affiliative response, maintaining epistemic symmetry and professional rapport.
This sequence is followed by the VAR’s offer to display a second viewing angle, which the REF declines by requesting that the current footage be played at normal speed. Through this rejection, the REF reasserts control over the review process, displaying both epistemic independence and procedural authority. Rather than relying on further visual input, the REF begins to construct his own evidential account of the incident:
This extended formulation serves as a composite narrative that integrates both temporal (“after the ball left”) and causal (“stepped on”) components, demonstrating the REF’s interpretive synthesis of what has been jointly observed. However, this emerging account is momentarily interrupted by the VAR’s simultaneous formulation (“he stepped on the foot of player ninety-seven” in line 33). This overlapping turn functions as an aligning but competing account. It confirms the core observation while momentarily reasserting the VAR’s evidential authority. The overlap here does not signal disagreement but rather co-presence in professional seeing (Goodwin, 1994): Both participants articulate the same perceptual finding in near-synchrony, displaying shared understanding through parallel talk. The sequence then converges as the VAR explicitly ratifies one of the REF’s earlier checks, confirming “the ball is in the game” (line 37). This agreement completes a local cycle of verification, reinforcing intersubjective alignment between the two officials.
The next interactional step in the evidential framing phase is initiated by the REF’s request for a new visual perspective: “can I see it from another offside camera offside camera” (lines 39–40) (see Figure 5). This request marks a transition from preliminary verification to expanded scrutiny, displaying the REF’s ongoing pursuit of adequacy and completeness. That is, the turn projects sequence expansion (Schegloff, 2007), temporarily suspending closure of the assessment to seek further epistemic access. The request is not merely informational. It performs institutional accountability work, demonstrating that the REF is systematically exhausting all available visual resources before finalizing a high-stakes decision. This action indexes what Heritage (2012) describes as procedural objectivity: A methodical showing that the decision process is based on comprehensive evidence rather than personal intuition. As the VAR complies (“right away, yes” in line 41) and the offside camera footage appears, the participants initiate a new co-viewing sequence (lines 43–47). Here, the review is no longer only about what happened but also about how clearly it can be seen.

Figure 5. The second angle in the review process.
The framing-evidence phase concludes with the REF declining the VAR’s final offer to display a third viewing angle. The VAR’s offer, formulated as a continuation of procedural thoroughness, is met with the REF’s decisive rejection (“I don’t want it, I’m sure” in line 51). This utterance marks a clear epistemic boundary: The REF publicly claims sufficient knowledge to render a final judgment, thereby terminating the co-viewing sequence. The expression of certainty (“I’m sure”) functions as an explicit stance marker (Heritage, 2012), converting a perceptual finding into a definitive institutional decision. By rejecting further visual input, the REF transforms a collaborative evidential process into an individual act of authority, signaling the shift from interpretive exploration to epistemic finality. This move constitutes a sequence-closing third (Schegloff, 2007), which formally ends the multi-turn review phase and projects the upcoming transition to decision announcement.
In this episode, the framing of visual evidence (i.e. accounts) is achieved through collaborative, overlapping, and mutually elaborative turns. The talk is characterized by incremental confirmation and joint formulation, a sequential pattern that reveals how the REF and VAR work to see and describe the same event in real time. Here, the VAR and REF are not engaged in a simple question–answer or statement–acknowledgment exchange. Rather, they are co-producing a shared description of the replay. The overlaps create what Goodwin (1994) terms “mutual alignment of professional vision”—a step-by-step process where both ratify and refine the critical moment of the incident. Also, sequentially each “yes” operates as a continuer (Schegloff, 1982) that keeps the joint description alive, rather than closing it. Further, the alternation of turns (REF → VAR → REF → VAR) forms a choral alignment, showing that understanding is being built, not merely claimed. Thus, the participants are constituting the evidence as a social object: the “moment of stepping” only becomes the moment of contact through this mutual, sequential elaboration. The institutional meaning (i.e. “the step on the foot after the ball left”) is produced in the talk rather than extracted from the footage.
In contrast, the EFL episodes (Episode 3: W37KA07 and Episode 5: W35BF55) organize the same institutional task, evaluating the footage, in a much more linear and segmented manner. At the start of the review in Episode 3, the VAR unilaterally frames the visual scene:
Episode 3: W37KA07—EFL (continues)
Compared to the Turkish episode (Episode 7: W5KF35) given above, the interaction is sequentially simple: Each VAR turn is followed by a minimal acknowledgment by the REF. There is no overlapping, no co-elaboration, and no collaborative reformulation. Instead, turns are designed for procedural clarity: The VAR provides declarative descriptions, the REF confirms recognition, and the review progresses.
This sequential pattern enacts what Firth (1996: 243) describes as the “let it pass” principle of lingua franca talk: Participants orient to comprehensibility and task completion, not to linguistic elaboration or co-construction. The VAR’s “you see” formulations in lines 13, 21, and 23 function as directives packaged as evidential descriptions: Their sequential position makes them less negotiable and more instructional. Similarly, when the REF asks for another angle, the structure remains minimal:
Episode 3: W37KA07—EFL (continues)
The adjacency pair is tight and closed: one request, one confirmation, one receipt token. There are no expansions or repetitions, and the talk moves immediately to the next procedural step. This compression shows a preference for projectability: Each action projects the next relevant one (Heritage and Clayman, 2011). The absence of overlapping talk and continuers indicates that participants treat the VAR review as a fixed procedural script rather than a locally negotiated evaluation. Thus, “seeing the point of contact” (line 18) is not collaboratively built but asserted and acknowledged—an epistemic asymmetry that aligns with the institutional hierarchy (VAR as expert, REF as confirmer).
The same procedural pattern appears in the second EFL episode (Episode 5: W35BF55), but in an even more abbreviated form. The VAR frames the key scene:
Episode 5: W35BF55—EFL (continues)
The REF’s repeated “okay”s in lines 20 and 23 here are minimal response tokens indicating acknowledgment without engagement. The turns are syntactically independent, with long pauses and no overlap, demonstrating a sequential thinning typical of institutional lingua-franca exchanges. When the REF initiates a request:
Episode 5: W35BF55—EFL (continues)
The REF’s request in line 28 (“I I want the eh- other angle”) and his follow-up request for another angle in line 32 (“no no no it’s not good angle”) display on-line reasoning. That is, the REF’s evaluation is accomplished as a private thought made public, not a collaborative assessment. Sequentially, the REF’s repeated requests replace what, in Turkish, would be a multi-turn confirmation sequence between participants. Thus, in the EFL corpus, epistemic access is individual and declarative (“I want. . .,” “I can see. . .”), whereas in the Turkish episode it is collective and iterative (“yes hocam,” “another angle yes”). The difference is not linguistic but sequential: in the Turkish talk, understanding is displayed through shared turns while in the EFL talk, through independent completions.
The other key interactional distinction between the Turkish and EFL episodes lies in the use of address terms such as hocam (“my teacher”) and abi (“older brother”), both of which are deeply embedded in Turkish institutional talk. These terms are not merely politeness markers. They are sequentially consequential resources that accomplish alignment, deference, and moral stance in ongoing professional interaction. They appear precisely where epistemic rights are negotiated or relational balance is at stake, thereby shaping how evidence, uncertainty, and authority are sequentially organized.
In the Turkish episodes (Episode 7: W5KF35 and Episode 2: W15SG96), hocam is positioned in first or second position within the summons–response sequence, establishing a hierarchical yet collegial footing from the outset. In W5KF35, the interaction opens with:
Episode 7: W5KF35. Episode 2: W15SG96.
In each case, hocam works as a summons with moral weight. It does more than gain attention: It acknowledges authority and situates the upcoming recommendation within an asymmetrical, but respectful, frame. The responses “yes sir” or “I’m listening” close the adjacency pair and complete the moral alignment before the institutional task begins. Sequentially, this opening adds an extra move that is absent in the EFL data. In W37KA07, the interaction starts more procedurally:
Episode 3: W37KA07.
Here, the VAR’s use of the first name “atilla” lacks relational indexicality: It summons but does not encode deference. The sequence moves directly to task talk. Thus, hocam in Turkish builds a pre-institutional alignment phase, establishing moral and epistemic footing before the procedural recommendation occurs.
In mid-sequence positions, hocam also appears in confirmation and alignment turns that maintain epistemic solidarity during joint viewing.
Episode 7: W5KF35.
The sequential placement is critical. It occurs after a series of overlapping turns, when participants have achieved convergence on what is visible. Inserting hocam here not only ratifies agreement but reframes the act of agreeing as collegial respect. Hocam thus functions as a reflexive display of professional accountability. It shows that alignment is not only perceptual but also moral — that due respect to institutional hierarchy has been observed while reaching consensus.
In Episode 2 (W15SG96), the REF employs another Turkish relational form, abi, at a key juncture of the review:
Episode 2: W15SG96.
Here, abi surfaces just before the REF announces his decision to “stay with his call” in line 25. Sequentially, this is a turn-initial preface to a potentially disaligning move: The REF’s decision rejects the implicit expectation that the VAR’s review might change the call. By inserting abi before asserting authority, the REF mitigates possible disaffiliation. It redefines the epistemic asymmetry (decision rights) as a relation of camaraderie rather than confrontation.
In Turkish institutional settings, abi is frequently used to soften directive or oppositional actions. Its placement here is interactionally strategic: it prefaces the closure of a joint evidential sequence, converting an authoritative declaration (“I keep my decision”) into a collegial act. The later turn by the fourth referee (addressed to REF) reinforces this moral framing:
Episode 2: W15SG96.
Here, abi reappears from a third party (the fourth referee), indexing familiarity and shared professional space even while relaying procedural information. These placements show that abi organizes relational solidarity at transition points — where talk shifts from analysis to action, or from deliberation to closure.
The Turkish address terms have clear sequential consequences. Each insertion of hocam or abi expands the sequence, adding a moral or affiliative layer to otherwise procedural talk. This expansion manifests in overlaps, continuers, and additional confirmation tokens (“yes hocam” or “okay abi”), which keep the interaction open and responsive. In contrast, the EFL episodes lack such expansions, resulting in a more compressed sequential architecture: The exchanges rely on minimal uptake (“yeah,” “okay”), producing a smooth, one-turn-per-step rhythm. The absence of such terms in EFL episodes marks a shift from relational accountability to procedural accountability, that is, from a context where moral and institutional orders are intertwined to one where professionalism is achieved through standardized communicative efficiency (Baker, 2018; Due and Licoppe, 2020; Firth, 1996).
Ending the protocol
The final phase of the VAR protocol marks the transition from joint evidence evaluation to individual decision declaration. This stage typically unfolds in three interlinked steps: (1) the REF formulates a final stance, either revising the initial decision (i.e. agreeing with the VAR’s intervention) or confirming it (i.e. keeping the initial decision), (2) embodied ratification follows through the standardized gestures (the “TV rectangle” signal, whistle, and movement back toward the field), and (3) the REF verbally or visually announces the restart of play (see Icbay, 2025b for similar practices in different contexts). Across the Turkish and EFL episodes, this procedural architecture is similar, yet the interactional methods used to accomplish closure diverge significantly. The Turkish closings are interactionally expansive and relationally textured, while the EFL closings are procedurally compact and monologic.
Episode 5: W35BF55—EFL. Episode 6: W11SA23—TUR.
In Episode 5 (W35BF55), the VAR protocol ending is initiated with the REF’s declaration of decision ownership: “I’ll stay to my decision, it’s not a penalty” (lines 42–43). This utterance marks the point where the REF publicly reclaims epistemic and institutional authority over the call, thereby disagreeing with the VAR’s prior intervention (i.e. that a penalty offense may have occurred). As disagreement is a dispreferred action in interactional terms (Pomerantz, 1984), the REF also provides an account that renders the disalignment acceptable within the professional frame: “it’s natural position” (line 45). This account functions as a justification designed to manage both institutional accountability and peer alignment (Heritage, 1984; Stivers, 2008).
The closing is then ratified through a standardized embodied sequence: the REF performs the “TV screen” signal, blows the whistle, and jogs back to the field (lines 47–50). These multimodal actions publicly transform the verbal declaration into an official institutional act: the match’s restart. The VAR’s minimal acknowledgment (“okay”) confirms understanding but does not challenge the REF’s authority, thereby marking the formal termination of the review. The sequential organization here is procedurally compact and monologic: the REF’s verbal declaration is immediately followed by embodied ratification, leaving little space for further co-construction or relational elaboration.
In Episode 6 (W11SA23), a similar three-part structure is observable, yet the interactional texture of closure is notably different. The REF opens the final phase with a name and honorific (“onur hocam,” line 42), a recurrent form in Turkish institutional discourse indexing respect and collegiality. Through this address, the REF not only orients to the VAR as a professional peer but also establishes an affiliative frame for the forthcoming disagreement. The REF’s next turns (lines 43–46) include an explanatory account preceding the final decision: “I to the ball, after the blue hit the foot an inevitable touch, I think this is the case, that is why I keep my decision.” This multi-unit turn combines both evidential description and epistemic stance-taking (“I think”), demonstrating reflexive accountability (Garfinkel, 1967) before asserting authority. In doing so, the REF manages the dispreferred nature of disagreement through reason-giving and stance softening, transforming a potentially face-threatening act into a collegial resolution.
The VAR’s affiliative response (“got you hocam,” line 48) reciprocates this relational orientation, displaying understanding and solidarity while yielding epistemic primacy to the REF. The closing then proceeds through the familiar embodied package—the TV signal, whistle, and return to the field (lines 49–50)—which indexes institutional continuity and game resumption.
In the Turkish data, decision closure is accomplished through turn-competitive interruptions, repetitions, and relational address terms that display epistemic finality and moral alignment. In Episode 7 (W5KF35), for instance, the VAR begins to elaborate further evidence, “the contact on the foot=” (line 49), but the REF cuts in, saying “no– I don’t want it I’m sure” (lines 50–52). This repetition, coupled with “I’m sure,” publicly performs the REF’s certainty. The action is both epistemic and sequential: It terminates the possibility of further talk, signaling that the joint review has reached its endpoint. The REF’s next utterance, “I’m starting the game with penalty decision” (lines 53–55), formally transitions from evaluation to re-initiation of play. The decision is thus not only stated but demonstrated as the product of due deliberation.
Episode 7: W5KF35—TUR.
A similar pattern occurs in Episode 2 (W15SG96), where the REF prefaces his stance with “I abi-” (line 25) before declaring “I keep my decision” (line 26). Here, abi (literally “older brother”) functions as a relational mitigator: It cushions the epistemic asymmetry of maintaining a personal decision against the VAR’s earlier recommendation. The VAR’s response, “okay it’s your decision” (line 30), aligns with this moral order, reaffirming the REF’s institutional right to decide. Sequentially, these relational tokens (abi, hocam) transform what could be an abrupt unilateral move into a morally balanced closure.
Episode 2: W15SG96—TUR.
The EFL episodes contrast sharply in their organization of stance-taking. In Episode 3 (W37KA07), the REF’s closing account takes the form of a short, self-contained monologue in lines 57–63. There is no overlap, no repetition, and no relational preface. The sequence is linear and self-sufficient, displaying procedural rather than interpersonal accountability. In Episode 5 (W35BF55), the closure is even more condensed: “okay, I’ll stay to my decision, it’s not a penalty” (lines 41–45). Thus, the REF’s declarative stance alone performs the closing function. The brevity of these utterances reflects the preference for procedural clarity that typifies lingua-franca interaction (Jenkins, 2015): participants orient to the institutional script rather than the moral nuances of collegial hierarchy.
Episode 3: W37KA07—EFL.
The handling of final evidential offers further illustrates this difference. In the Turkish-speaking VAR protocols, closure is actively negotiated. In Episode 7 (W5KF35), after multiple angle requests (lines 39–46), the REF explicitly refuses further input (“I don’t want it”) before declaring certainty (lines 50–52). This explicit refusal is not merely negative but interactionally productive, signaling readiness to close the evidential phase. In Episode 2 (W15SG96), the REF similarly transitions from visual exploration to verbal closure through “I abi- I keep my decision” (lines 24–25), combining a relational preface with an assertion of epistemic autonomy. By contrast, in the EFL episodes, no explicit refusals or relational markers accompany closure. In Episode 3 (W37KA07), the review simply progresses from the final viewing to the REF’s decision statement, while in Episode 5 (W35BF55), the REF self-terminates the viewing by saying “no no no it’s not good angle” (line 32), moving directly to stance without negotiation. The EFL sequences thus compress what in Turkish talk is a multi-turn transition into a single move.
From an interactional perspective, these patterns demonstrate two contrasting interactional architectures of accountability. In the Turkish closings, certainty and finality are performed sequentially—through interruptions, repetitions, relational address forms, and explicit verbalization of procedural steps. The decision becomes visible as an interactional accomplishment of consensus, moral alignment, and procedural correctness. In the EFL closings, finality is declared institutionally: The decision is already formatted as a procedural endpoint and therefore requires no relational elaboration. Turkish sequences display “interactional richness”—epistemic stance and moral order are co-produced—whereas EFL sequences display “interactional thinness,” privileging efficiency and standardization.
Conclusion
This study explored how professional decision-making is collaboratively accomplished in technologically mediated football officiating, focusing on the interaction between REFs and VARs. Through fine-grained sequential analysis of official VAR protocols recorded in both Turkish and EFL, the study uncovered how decision-making is not a simple cognitive process but an interactionally achieved practice organized through sequential talk, embodied action, and institutional accountability. Across both linguistic settings, the VAR protocol was found to follow a recurrent institutional architecture: Establishing recipiency, framing evidence, and concluding the protocol. Each phase involved distinct interactional practices through which the participants display orientation to their institutional roles and to the procedural norms of the VAR system. However, the comparison between Turkish and EFL sequences reveals key differences in how consensus (agreement and disagreement) is communicatively managed, reflecting both linguistic resources and the moral order embedded in each interactional environment.
In Turkish episodes, decision-making is characteristically interactionally rich and relationally textured. Address forms, such as hocam (“my teacher”) and abi (“older brother”), function as moral and epistemic resources that sustain collegial alignment and soften moments of potential disaffiliation. These linguistic elements are not peripheral politeness devices but core components of how institutional authority and accountability are enacted (see Keshavarz, 2022). They expand sequences, generate affiliative turns, and transform unilateral declarations into morally balanced closures. In disagreement sequences, such as when the REF decides to keep his initial call, the participants use these address terms to display deference, mitigate tension, and sustain a sense of professional solidarity. In this sense, decision-making in the Turkish VAR talk is an interactional accomplishment of both procedural and moral order, combining accountability with relational alignment (see Heritage, 2012).
By contrast, in EFL episodes, decision-making is procedurally compact and linguistically minimal. Sequential trajectories are shorter, overlaps are rare, and turns are designed for clarity and progressivity. The participants orient to efficiency and comprehensibility, the hallmarks of lingua-franca interaction (see Baker, 2018; Firth, 1996; Mauranen, 2006; Seidlhofer, 2013), rather than to the relational elaboration seen in Turkish. The absence of culturally embedded address forms and the use of declarative, self-contained utterances (“I’ll stay to my decision,” “it’s not a penalty”) produce an interactional thinness, where accountability is performed through procedural adherence rather than interpersonal negotiation. The “let it pass” principle (Firth, 1996) becomes visible in how minimal acknowledgment tokens (“okay,” “yeah”) suffice to ratify institutional actions, even in moments of disagreement.
The comparative analysis thus points to two distinct interactional architectures of accountability. In Turkish, accountability is dialogic, co-constructed, and relationally embedded whereas in EFL, it is monologic, standardized, and procedural. Both accomplish legitimacy, but through different semiotic economies. The Turkish mode relies on moral order as an interactional resource, displaying respect, affiliation, and epistemic balance, while the EFL mode relies on institutional order as a procedural script, foregrounding precision, transparency, and individual authority. These findings align with prior studies on institutional talk in multilingual and technologically mediated contexts, which highlight how professional coordination depends on the availability and deployment of shared linguistic and cultural resources (Drew and Heritage, 1992; Mondada, 2013).
The findings also contribute to the growing body of research on English as a lingua franca in professional communication, where participants use reduced linguistic means to sustain complex institutional actions (Firth, 1996; Mauranen, 2018). In VAR talk, EFL functions less as a medium of interpersonal expression and more as a pragmatic toolkit for procedural efficiency. What is lost in interactional richness is compensated by predictability and transparency, qualities highly valued in transnational professional contexts (Jenkins, 2015). Thus, the shift from Turkish to EFL officiating represents not only a linguistic change but also a transformation in how institutional authority and accountability are interactionally accomplished. From an ethnomethodological perspective, these sequences illustrate members’ methods for constructing order in real time (Garfinkel, 1967). VAR–REF coordination demonstrates how the participants treat talk, gesture, and technology as interdependent resources for sense-making. Each whistle blow, headset touch, or repetition (“I’m sure”) indexes the reflexive accountability of professional judgment. Agreement and disagreement alike are organized as publicly observable and reportable actions, sustaining the institutional demand for procedural fairness.
A more critical implication is that neither technology nor language choice should be assumed to straightforwardly enhance “objectivity” or “clarity.” VAR is often publicly framed as a corrective device that reduces human error, yet the excerpts show that what technology provides is not decisions but interactionally managed visibility—a negotiated “professional vision” in which officials formulate, propose, test, and ratify what the footage is taken to show (Garfinkel, 1967; Goodwin, 1994). Likewise, the shift to EFL may increase standardization and procedural transparency, but it can also reduce the availability of relational and culturally embedded resources for delicately managing disagreement and restoring alignment, thereby relocating the burden of legitimacy from collegial negotiation to the procedural script itself (Drew and Heritage, 1992; Firth, 1996). This trade-off matters because institutional accountability is not only a matter of producing a decision, but of making the decision recognizable as warranted and legitimate in real time, often through small interactional practices such as formulations, confirmations, and agreement displays (Heritage, 2012; Heritage and Raymond, 2005). From this perspective, future work and training might productively treat VAR talk not as a “neutral channel” but as a skilled interactional domain—one where standard phraseology is necessary yet insufficient, and where practices for clarification, repair, and disagreement management are crucial for maintaining both progressivity and legitimacy (Drew and Heritage, 1992; Schegloff, 2007).
In broader terms, this study offers insight into how multilingual professionals manage technologically mediated collaboration under extreme temporal and institutional constraints. The comparison between Turkish and EFL episodes shows that linguistic diversity does not merely alter the surface form of interaction—it reshapes the moral, procedural, and epistemic foundations of decision-making itself. Future research might extend these findings to other multilingual and multimodal professional settings—such as air-traffic control, remote surgery, or online arbitration—where technology mediates both perception and authority. In conclusion, decision-making in the VAR system is not an act of solitary judgment but a collective, interactionally achieved practice. Whether conducted in Turkish or in EFL, the VAR protocol embodies the dual demands of modern professionalism: precision and accountability on the one hand, and affiliation and moral alignment on the other. Through their talk, gestures, and timing, referees make fairness visible—turn by turn, gesture by gesture—under the watchful gaze of millions.
Appendix 1
The list of VAR recordings available on the official YouTube channel of the Turkish Football Federation.
Appendix 2
Appendix 3
Episode 7: W5KF35—TUR.
Appendix 4
Episode 6: W11SA23—TUR.
Appendix 5
Episode 2: W15SG96—TUR.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
