Abstract
Habits allow environmental and interoceptive cues to trigger behavior in an automatized fashion, making them liable to deployment in inappropriate or outdated contexts. Over the long term, repeated failure of a once-adaptive habit to satisfy current goals produces extinction learning that suppresses the habit’s execution. Less attention has been afforded to the mechanisms underlying real-time habit suppression: the capacity to stop the execution of a cued habit that is goal conflicting. Here, I first posit a model by which goal-relevant stimuli can (a) bring unfolding habits and their projected outcomes into awareness, (b) prompt evaluation of the habit outcome with respect to current goals, and (c) trigger cessation of the habit response if it is determined to be goal conflicting. Second, I propose a modified stop-signal task to test this model of goal-directed stopping of habit execution. Finally, I marshal evidence indicating that the ventrolateral prefrontal cortex, situated at the nexus of salience detection, action-plan assessment, and motor inhibition networks, is uniquely positioned to coordinate the overriding of habitual behaviors in real time. In sum, this perspective presents a testable model and candidate neurobiological substrate for our capacity to “snap out of autopilot” and override goal-conflicting habits in real time.
Habits develop after taking the same action in a given context consistently leads to reward (Dickinson, 1989; Dolan & Dayan, 2013; Thorndike, 1927). In such contexts, conscious deliberation about what action to take to achieve the goal eventually becomes unnecessary. When the stimuli associated with these contexts present themselves, they automatically trigger the action that has repeatedly yielded reward in that context. The gradual transfer of such behaviors from the control of a deliberative, goal-directed behavior system to a reflexive, habit-based behavior system facilitates metabolic efficiency and frees up cognitive resources to engage elsewhere (Robbins & Costa, 2017). A rich literature has detailed the neural processes devoted to this transfer mechanism, whereby frequently repeated goal-directed behaviors encoded in dorsomedial striatum are shuttled and reconfigured via the dopaminergic striatonigrostriatal pathways (Belin & Everitt, 2008; Haber et al., 2000) to storage as stimulus–response habits in dorsolateral striatum (Everitt & Robbins, 2005, 2016).
The downside to habits is that what is gained in efficiency is lost in flexibility. Habits are generally insensitive to changes in current goals, outcome values, and contingencies (Dickinson, 1994; Robbins & Costa, 2017). Therefore, when a habit-cuing stimulus takes place amid a larger context in which goals may be different than usual, habits can lead us astray. “Have a great flight!” says the airport check-in employee after you are done checking in. “You too!” you say back instinctively. The first intersection after leaving home comes up, and you make the usual left turn toward work despite being on your way to the doctor’s office. You walk out of the bathroom and turn the lights off, even though your spouse is still in there brushing their teeth. These instances of “habitual action slips” (Gillan et al., 2011; Norman, 1981) underscore the potential for usually goal-satisfying habits to deploy in inappropriate contexts that undercut our goals. Although the action slips described here may amount to minor nuisances, action slips occur with greater frequency in certain neuropsychiatric disorders, such as obsessive-compulsive disorder (Gillan et al., 2011), that can contribute to severe functional impediments.
In healthy individuals, the repeated failure of a once-adaptive habit to satisfy the goals of a changed environment eventually results in extinction learning that suppresses the habit’s execution (Bouton, 2004; Goodman et al., 2016; Rescorla, 2001). But what if we need to halt an active or nonextinguished habit in real time? Unlike physiological reflexes that bypass the brain, habits are not inescapably executed to completion on the incidence of their initiating stimuli (Pollard, 2006). A shared human experience is suddenly catching ourselves amid some seemingly automatic action (often one we did not consciously initiate) that is leading us astray from our current goal, and then changing course. What are the mechanisms that facilitate this “snapping out of autopilot” capacity?
In the following sections, I formalize a model for overriding habits in real time, propose a task-based means for evaluating this model, and review evidence that points to the ventrolateral prefrontal cortex (vlPFC) as a crucial brain region for this capacity.
A Model of Overriding Habits in Real Time
Traditional conceptualizations of goal-directed and habitual behavior posit that these are two independent and noninteracting systems that govern separate aspects of behavior (Dickinson, 1985). After all, habits, by most definitions, are stimulus-triggered actions that bypass considerations of goals and outcomes. However, recent work has increasingly recognized the interplay between the goal-directed (model-based) and habit (model-free) systems at the computational, neural, and behavioral levels (Daw, 2018; Keramati et al., 2016; Krueger & Griffiths, 2018). Along these lines, I posit a model of goal-directed stopping of goal-conflicting habits that entails interaction between the goal-directed and habitual behavior systems.
After an initial stimulus/cue (Fig. 1a) has begun to initiate a habit response (Fig. 1b), the process of overriding the execution of that habit in real time requires three sequential steps. First, there must be subsequent perception of a goal-relevant stimulus/cue (Fig. 1c). As a default, stimulus–response habit cascades occur beneath the level of awareness; this secondary, goal-relevant stimulus snaps the system out of autopilot and brings the unfolding habit into awareness (Fig. 1d). Second, this awareness prompts an evaluation of the habit response’s prospective outcome in the context of current goals (Fig. 1e), which entails a direct comparison of the outcome utility to be provided by executing the habit and that to be provided by taking an alternative, goal-directed action. Third, if the result of this evaluation is a determination that the habit’s execution will conflict with current goals, the habit response is inhibited or ceased and the alternate, goal-directed action is executed (Fig. 1f).

Fig. 1. Depiction of the sequence of events (a–f) involved in goal-directed stopping of an unfolding goal-conflicting habit in real time.
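The sequence in Figure 1 can be restated as a small decision routine. The following Python sketch is only an illustration of the logic of the model; the names `Cue` and `resolve_habit` and the use of a single scalar utility for each action are assumptions made for the example and are not part of the model as stated.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cue:
    """A secondary stimulus perceived while a habit is unfolding."""
    goal_relevant: bool
    # Utility of the alternative, goal-directed action signaled by the cue
    # (hypothetical scalar; the model does not specify units).
    alternative_utility: float = 0.0


def resolve_habit(habit_utility: float, cue: Optional[Cue]) -> str:
    """Return 'habit' or 'alternative', following steps (c)-(f) of Fig. 1."""
    # (c) Without a perceived goal-relevant cue, the habit runs on autopilot.
    if cue is None or not cue.goal_relevant:
        return "habit"
    # (d) The cue brings the unfolding habit into awareness, and
    # (e) its projected outcome is compared with that of the alternative.
    if cue.alternative_utility > habit_utility:
        # (f) Goal conflicting: stop the habit and take the alternative.
        return "alternative"
    # Goal consistent: the habit is allowed to complete.
    return "habit"
```

For instance, `resolve_habit(1.0, Cue(goal_relevant=True, alternative_utility=5.0))` returns `"alternative"`, whereas the same call with a goal-irrelevant cue returns `"habit"` regardless of the utilities involved.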
Testing the Model
Motivating a task design
The laboratory task that comes closest to modeling the overriding of a habit in real time is the stop-signal task (Li et al., 2006; Logan et al., 1984). Although subjects are not tasked to override a habit per se, they are tasked to cease the execution of an already initiated, prepotent, stimulus–response cascade. At a minimum, the stop-signal task demonstrates that even amid an already initiated stimulus–response cascade, the perception of a secondary stimulus can prompt the cancellation of this cascade when its outcome conflicts with a newly current goal signaled by the secondary stimulus (Verbruggen et al., 2019).
In the traditional stop-signal task, however, the secondary “stop” signal likely does not promote stopping by prompting awareness of the initial stimulus–response cascade and facilitating consideration of its outcome relative to current goals. Rather, it initiates a parallel, competing action (to halt movement) via the hyperdirect cortical-subthalamic nucleus pathway that cancels ongoing movements (Aron et al., 2007; Chen et al., 2020). In this way, stopping in the traditional stop-signal task is reflexive rather than goal-directed. To prompt and evaluate goal-directed stopping of a habit that is recognized to be goal conflicting, I propose several modifications to the traditional stop-signal task.
The secondary stop-signal stimulus must be voided of its automatic association with stopping in order to probe goal-directed stopping and not reflexive stopping. That is, the subject must not be explicitly trained to inhibit behavior in response to the presentation of a secondary stimulus as a default action. A variation of the traditional stop-signal task, the conditional stop-signal task (Aron et al., 2007; Obeso et al., 2011), achieves this by presenting one of two types of primary go stimuli before the presentation of the secondary stop stimulus: a critical primary stimulus or a noncritical primary stimulus. Presentation of the noncritical primary stimulus (e.g., a left-pointing arrow) instructs the subject not to inhibit the go response even if the secondary stop stimulus is subsequently presented. Alternatively, presentation of the critical primary stimulus (e.g., a right-pointing arrow) instructs the subject to inhibit the go response if the secondary stop stimulus is subsequently presented. However, despite voiding the secondary stimulus of an automatic mandate to stop, doing so by imbuing the primary habit-cuing go stimulus with a prescribed conditional instruction does not test the capacity and model of snapping out of autopilot that is of interest here. First, this manipulation arguably prevents the formation of a strong prepotent response in relation to the primary stimulus, because the subject must engage active attention both during the presentation of the primary stimulus (to identify whether it is critical or noncritical and to retrieve the appropriate response contingency) and after it (to maintain the appropriate response contingency in working memory until the secondary stimulus is or is not presented). Second, it does not provide a way to examine whether the nature of the secondary stimulus (i.e., goal relevant vs. goal irrelevant; goal conflicting vs. goal consistent) modulates the extent to which an ongoing habit response can be recognized and adjusted in a goal-directed manner. Accomplishing this requires the presentation of varied secondary stimuli, for which it is the subject’s prerogative to determine, in a goal-directed manner, whether task conditions and current goals warrant stopping. Therefore, in order to (a) void the secondary stimulus of its automatic association with stopping, (b) promote the formation of a strong prepotent response to the primary stimulus, and (c) test the capacity of different classes of secondary stimuli to snap the system out of autopilot, I propose the following variation of the stop-signal task.
A goal-directed stop-signal task
First, a primary stimulus is presented: a dollar amount that decreases rapidly with every millisecond that the subject has not pressed the “take” keyboard button. Then, on 70% of trials (goal-irrelevant trials), a set of three random (nonmonetary) secondary stimuli is also presented quickly after initial stimulus onset. This structure builds up a prepotent response to press the take button as quickly as possible after initial stimulus presentation, even as secondary stimuli are presented as well. On the remaining 30% of trials (goal-relevant trials), one of the secondary stimuli is another dollar amount. In half of these instances (goal-relevant, goal-consistent trials), the amount is substantially smaller than the original amount, in which case the subject’s optimal action is still to press the take button. In the other half (goal-relevant, goal-conflicting trials), the amount is substantially larger than the original amount, in which case the subject’s optimal action is to inhibit pressing the take button. This structure allows subjects to determine, in a goal-directed manner, which secondary stimuli warrant overriding the habitual button-press response.
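As a rough illustration of this trial structure, the Python sketch below generates a shuffled trial list with the 70/15/15 split described above. Everything beyond that split is an assumption made for the example: the function name `make_trials`, the $5 to $10 range for the primary amount, and the 0.25 and 4.0 multipliers used to make the secondary amount clearly smaller or larger are placeholders, not parameters of the proposed task.

```python
import random


def make_trials(n_trials: int, seed: int = 0) -> list:
    """Build a shuffled trial list: 70% goal irrelevant, 15% goal consistent,
    15% goal conflicting (integer arithmetic keeps the counts exact)."""
    rng = random.Random(seed)
    n_irrelevant = (n_trials * 7) // 10
    n_relevant = n_trials - n_irrelevant
    n_conflicting = n_relevant // 2
    n_consistent = n_relevant - n_conflicting
    kinds = (["goal_irrelevant"] * n_irrelevant
             + ["goal_consistent"] * n_consistent
             + ["goal_conflicting"] * n_conflicting)
    rng.shuffle(kinds)

    trials = []
    for kind in kinds:
        # Starting dollar amount of the primary stimulus (assumed range).
        primary = round(rng.uniform(5.0, 10.0), 2)
        if kind == "goal_irrelevant":
            secondary = None  # only nonmonetary secondary stimuli appear
        elif kind == "goal_consistent":
            secondary = round(primary * 0.25, 2)  # clearly smaller amount
        else:
            secondary = round(primary * 4.0, 2)   # clearly larger amount
        trials.append({
            "kind": kind,
            "primary_amount": primary,
            "secondary_amount": secondary,
            # Withholding is optimal only when the secondary amount is larger.
            "optimal_action": "withhold" if kind == "goal_conflicting" else "take",
        })
    return trials
```

Fixing the counts and shuffling, rather than drawing each trial type at random, guarantees that every simulated session contains exactly the intended proportions.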
This task structure is thereby suited to examine (a) the capacity of goal-relevant stimuli to prompt recognition of an ongoing habit and (b) the capacity of goal-relevant stimuli that are also goal conflicting to stop the habit’s execution. In doing so, the task allows measurement of subject variability in the three component aspects that the model proposes are necessary for goal-directed stopping of goal-conflicting habits: (a) reflexive reorienting to goal-relevant stimuli (increase in reaction time on goal-relevant compared with goal-irrelevant trials), (b) comparative evaluation of the utility of the habit response outcome compared with the goal-directed stopping outcome (accuracy on goal-relevant trials), and (c) motor inhibition (the speed of stopping on goal-relevant, goal-conflicting trials).
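The first two of these measures can be computed directly from per-trial records, as the following Python sketch suggests. The record layout (keys `goal_relevant`, `goal_conflicting`, and `rt_ms`, with `rt_ms` set to `None` when the take response was withheld) and the function names are hypothetical. The third measure, the speed of stopping, is left out because stop latency is not directly observable from button presses and would need to be estimated, as is done for stop-signal reaction time in the standard stop-signal task.

```python
from statistics import mean


def reorienting_cost(trials):
    """(a) Mean RT on goal-relevant minus goal-irrelevant trials,
    using only trials on which the take button was pressed."""
    def mean_rt(relevant):
        return mean(t["rt_ms"] for t in trials
                    if t["goal_relevant"] == relevant and t["rt_ms"] is not None)
    return mean_rt(True) - mean_rt(False)


def evaluation_accuracy(trials):
    """(b) Proportion of goal-relevant trials with the optimal action:
    withhold on goal-conflicting trials, take on goal-consistent trials."""
    relevant = [t for t in trials if t["goal_relevant"]]
    correct = [t for t in relevant
               if (t["rt_ms"] is None) == t["goal_conflicting"]]
    return len(correct) / len(relevant)
```

A positive reorienting cost would indicate that goal-relevant secondary stimuli slow the prepotent response, which is the behavioral signature the model associates with being snapped out of autopilot.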
Other variations of the stop-signal task have shown that leveraging environmentally salient stimuli (e.g., red and green traffic lights), which already have habitual go/stop responses associated with them, can be an effective means to ensure the examination of a genuine habitual behavior on the task (Ceceli et al., 2020; Hochman et al., 2018). Moreover, tasks of this nature have shown that monetary incentives—when combined with real-time feedback—can weaken habitual behaviors (Ceceli et al., 2020). Thus, to parse potential influences of habit strength and monetary reward on the capacity to snap out of autopilot, one may need to compare performance on the version of the task proposed here (which uses monetary stimuli and performance-linked monetary rewards) with performance on versions that instead use environmentally salient stimuli and/or do not use performance-linked monetary rewards. Furthermore, it may also be useful to include a habit-check component that indexes the strength of habit formation for each subject. This could include, on a small subset of trials, having the primary stimulus increase in monetary value over time, in which case optimal subject behavior is to withhold pressing the take button. Insensitivity to this change in outcome—gauged by each subject’s average payout on such trials—could provide a useful index of habit strength.
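One possible way to turn the habit-check trials into a per-subject index is sketched below in Python. The normalization is an assumption made for illustration: each trial is represented as a hypothetical `(payout, max_payout)` pair, where `max_payout` is the amount the subject would have earned by withholding the take response, and the index is the mean fraction of that amount forgone.

```python
def habit_strength_index(habit_check_trials):
    """Mean fraction of the available payout forgone on habit-check trials.

    Each trial is a (payout, max_payout) pair (hypothetical record layout).
    A value of 0 means the subject always withheld and earned the maximum;
    values near 1 mean the subject kept pressing early despite the changed
    contingency, consistent with a stronger habit.
    """
    forgone = [1.0 - payout / max_payout
               for payout, max_payout in habit_check_trials]
    return sum(forgone) / len(forgone)
```

Expressing the index as a fraction rather than a raw average payout would make it comparable across subjects who saw different dollar amounts, although the raw average described above would serve the same purpose when amounts are matched.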
Role of the vlPFC
The vlPFC is a strong candidate substrate for orchestrating the component processes required to override a goal-conflicting habit in real time. In humans, the vlPFC occupies the inferior frontal gyrus of the frontal lobe and consists of three broad anatomical and functional subregions: the pars orbitalis (Brodmann Area 47), the pars triangularis (Brodmann Area 45), and the pars opercularis (Brodmann Area 44). These vlPFC subregions have been variously implicated in salience detection, outcome evaluation, action-plan switching, arbitrating between the execution of habitual and goal-directed behavior, and motor inhibition. Importantly, structural connections link each of these subregions to each other (Petrides & Pandya, 2002; Saleem et al., 2014), allowing for the transfer of information between the vlPFC’s functional subunits. Below, I review the evidence for the vlPFC’s roles in these functions and the structural and functional connections that facilitate them (Fig. 2).

Fig. 2. The neural connections of the ventrolateral prefrontal cortex that place it at the nexus of functions (motor inhibition, action-plan value assessment, and reflexive reorienting) needed to override the execution of a habit in real time.
Salience detection and reflexive reorienting
The vlPFC is a primary node in both the salience network—which also includes the adjacent anterior insula and the dorsal anterior cingulate cortex (Seeley et al., 2007)—and the bottom-up fronto-parietal-temporal ventral attention network (Corbetta et al., 2008). The salience network and ventral attention network are central to rapid detection of and orienting attention toward goal-relevant stimuli. The role of vlPFC in these processes is largely facilitated by direct inputs received from higher-order sensory processing areas and from the amygdala. These connections, delineated by nonhuman primate tract-tracing studies, are detailed below.
Visual input
The vlPFC receives direct inputs from extrastriate visual cortical areas in the inferotemporal cortex via the uncinate fasciculus (Bullier et al., 1996; Ungerleider et al., 1989). These connections help form the ventral visual what stream that processes object identity (Mishkin et al., 1983). vlPFC neurons in Areas 12/47 and 45 that are innervated by these inferotemporal visual areas have been shown to be face and object selective (Scalaidhe et al., 1997; Wilson et al., 1993).
Auditory input
Similarly, vlPFC receives input from auditory processing regions of the superior temporal cortex. vlPFC Areas 12/47 and 45 receive input from the lateral auditory belt, which processes complex auditory stimuli such as species-specific vocalizations and band-passed noise (Romanski et al., 1999). This has led to the supposition that, similar to its role for visual stimuli, vlPFC is involved in identification of auditory stimuli (Romanski et al., 1999). Indeed, individual vlPFC neurons have been shown to integrate both auditory and visual information of this kind (Sugihara et al., 2006).
Amygdalar input
The same vlPFC subregions that receive processed visual and auditory input are also reciprocally connected to the amygdala. Areas 45A and 45B in particular have amygdalar connections that are markedly richer than those of surrounding areas (Gerbella et al., 2014). It is thought that these connections may imbue the sensory input that vlPFC receives with value or emotional valence in order to guide visual gaze and attention toward salient and behaviorally relevant visual and auditory stimuli (Adolphs, 2010; Gerbella et al., 2014).
Overall, the higher-order sensory and amygdalar connections of the vlPFC provide it with direct access to the type of information needed to recognize and reorient attention toward stimuli that are salient or goal relevant. It is postulated that these information streams ultimately contribute to the formation of priority maps in the vlPFC that guide attention to the most behaviorally relevant stimuli (Gerbella et al., 2014).
Evaluating action-plan alternatives
Converging evidence from human functional MRI (fMRI) and animal models has long pointed to distinct corticostriatal circuits for the processing of goal-directed behavior (i.e., dorsolateral prefrontal cortex, ventromedial prefrontal cortex, orbitofrontal cortex, and the caudate) and habitual behavior (i.e., premotor cortex, supplementary motor areas [SMAs], and the putamen; Balleine & O’Doherty, 2010; Yin & Knowlton, 2006). The prospective planning functions of dorsolateral prefrontal cortex and adaptable action-value representations in ventromedial prefrontal cortex and orbitofrontal cortex help to facilitate flexible goal-directed behavior, whereas action sequences and stimulus–response associations stored in the motor cortical areas and putamen drive efficient habitual behaviors (Balleine & O’Doherty, 2010; de Wit et al., 2012; McNamee et al., 2015; Smittenaar et al., 2013; Tricomi et al., 2009; Valentin et al., 2007; van Steenbergen et al., 2017; Yin & Knowlton, 2006). Nonhuman primate tract-tracing studies demonstrate that the vlPFC is connected to the cortical components (Petrides & Pandya, 2002; Saleem et al., 2014) and striatal components (Ferry et al., 2000; Korponay et al., 2020) of both circuits. Moreover, vlPFC has recently been linked in human fMRI studies to facilitating comparative evaluations of goal-directed and habitual actions as well as to arbitrating which system predominates in a given context (Gruner et al., 2016; Lee et al., 2014).
Early fMRI studies highlighted that activity in the vlPFC accompanied behavioral switching on reversal-learning tasks (Cools et al., 2002) and the cessation of habitual behavior on go/no-go tasks (Garavan et al., 1999). More recently, a computational fMRI study (Lee et al., 2014) in humans used a sequential Markov decision task, which differentially favored either goal-directed (model-based) or habitual (model-free) control of behavior on a given trial to assess where the brain arbitrates the reliability of each behavioral system. It found that bilateral vlPFC activity encoded both goal-directed and habitual reliability signals as well as, importantly, the output of a comparison between those two signals (a maximum reliability signal). In other words, vlPFC activity correlated with whichever system made the better prediction on a given trial. Furthermore, functional connectivity between the vlPFC and the posterior putamen and SMA (regions underlying habitual behavior) was modulated by the maximum reliability signal in the vlPFC: the more the maximum reliability signal in vlPFC favored goal-directed control of behavior, the more negative coupling there was between the vlPFC and the habitual behavior regions. In a separate study, patients with obsessive-compulsive disorder, who exhibit imbalance between goal-directed and habitual behavior, were found to have reduced global functional connectivity in the same region of vlPFC (Anticevic et al., 2014), and another study in patients with obsessive-compulsive disorder found reduced functional connectivity between the vlPFC and caudate that was associated with poor cognitive flexibility (Vaghi et al., 2017). 
Furthermore, in healthy human subjects, striatal dopamine levels associated with goal-directed (model-based) behavior have been found to accompany goal-directed encoding in the vlPFC (Deserno et al., 2015), suggestive of a possible link between the vlPFC and the dopamine-driven dorsomedial to dorsolateral striatal shifts (Everitt & Robbins, 2013). Most recently, transcranial direct-current stimulation applied to the vlPFC was found to causally modulate the favoring of either model-free (habit) or model-based (goal-directed) learning systems (Weissengruber et al., 2019).
Collectively, these findings implicate the vlPFC as a central cortical component in evaluating action plans and arbitrating between the execution of goal-directed or habitual behavior depending on current goals.
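The idea of a maximum reliability signal can be made concrete with a toy example. The Python sketch below is not the computational model used by Lee et al. (2014); it assumes, purely for illustration, that each system's reliability can be summarized as one minus its mean absolute prediction error (with errors scaled to the range -1 to 1), and the function names are hypothetical.

```python
def reliability(prediction_errors):
    """Toy reliability: 1 minus the mean absolute prediction error,
    floored at 0. Assumes errors are scaled to the range [-1, 1]."""
    mean_abs = sum(abs(e) for e in prediction_errors) / len(prediction_errors)
    return max(0.0, 1.0 - mean_abs)


def arbitrate(goal_directed_errors, habitual_errors):
    """Return the favored system and the maximum reliability signal."""
    rel_goal = reliability(goal_directed_errors)
    rel_habit = reliability(habitual_errors)
    favored = "goal_directed" if rel_goal >= rel_habit else "habitual"
    # The maximum reliability signal tracks whichever system has been
    # predicting better on recent trials.
    return favored, max(rel_goal, rel_habit)
```

With recent goal-directed errors of `[0.1, -0.1]` and habitual errors of `[0.5, -0.5]`, the goal-directed system is favored, which in the account above would correspond to more negative coupling between the vlPFC and habit-related regions.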
Inhibition
The vlPFC is known to play a role in inhibition across numerous domains, including the inhibition of actions, memories, thoughts, and emotions (Dillon & Pizzagalli, 2007; Hooker & Knight, 2006). Converging findings from fMRI, lesion, and transcranial magnetic stimulation studies point to vlPFC Area 44, corresponding to the pars opercularis of the inferior frontal gyrus in humans, as the vlPFC subregion most consistently linked to motor inhibition (Aron et al., 2014). The structural connections of vlPFC Area 44 strongly support its role in inhibitory functions. For instance, nonhuman primate tract-tracing studies show that vlPFC Area 44 receives strong input from premotor cortex (Frey et al., 2014), and corticostriatal projections from Area 44 converge with those from premotor cortex in the ventrolateral putamen (Korponay et al., 2020). Human fMRI studies of motor inhibition across a range of tasks find that these three regions (vlPFC Area 44, premotor cortex, and ventrolateral putamen) are consistently activated in relation to motor inhibition (Jahanshahi et al., 2015; Majid et al., 2013; Schel et al., 2014; Simmonds et al., 2008; Zandbelt & Vink, 2010). For example, one study found that ventrolateral putamen activity that co-occurred with vlPFC and SMA activity was associated with reduced activity in motor cortex and corresponded with successful inhibition of motor activity on the stop-signal task (Zandbelt & Vink, 2010). Another study found a similar combination of dorsal caudal vlPFC activity, pre-SMA activity, and ventrolateral putamen activity during volitional stopping on the marble task (Schel et al., 2014). Furthermore, a meta-analysis of studies using the go/no-go task found that vlPFC Area 44, pre-SMA Area 6/32, premotor cortex Area 6, and putamen were among the regions consistently activated during successful inhibition on no-go trials (Simmonds et al., 2008).
Overall, the structural and functional circuitry of the vlPFC places it at the nexus of salience detection, action-plan evaluation, and inhibition networks, which support the three functions proposed as necessary for overriding habits in real time. In this way, the vlPFC is equipped to alert us to the presence of goal-relevant stimuli, evaluate whether continued execution of the habit or switching to an alternative goal-directed action aligns better with current goals, and inhibit execution of the habit if it is determined to be goal conflicting. Signatures of vlPFC activity during these stages of the modified stop-signal task would help to bolster this account.
Conclusion
The brain has a well-established, long-term mechanism—extinction learning—for decreasing the deployment of no-longer-adaptive habits over time. The process by which the brain temporarily halts goal-conflicting habits in real time has received less examination. Here, I reviewed evidence of the brain’s capacity for snapping out of autopilot, posited a model through which the brain implements this function, proposed a task to test and assess the model, and marshaled evidence in support of the central role of the vlPFC in this function. Collectively, this framework provides a jumping-off point for future research into the brain’s capacity for goal-directed stopping of habits in real time. For example, examination of aberrance in this function may be warranted in forms of psychopathology that feature deficits in reinforcement learning, behavioral flexibility, and/or balancing goal-directed and habitual behavior, such as substance use disorder (McKim et al., 2016), obsessive-compulsive disorder (Gillan et al., 2011), and action disorganization syndrome (Niki et al., 2019).
