Abstract
Traumatic brain injury (TBI) affects a large population, resulting in severe cognitive impairments. Although cognitive rehabilitation is an accepted treatment for some deficits, studies in patients are limited in ability to probe physiological and behavioral mechanisms. Therefore, animal models are needed to optimize strategies. Frontal TBI in a rat model results in robust and replicable cognitive deficits, making this an ideal candidate for investigating various behavioral interventions. In this study, we report three distinct frontal TBI experiments assessing behavior well into the chronic post-injury period using male Long-Evans rats. First, we evaluated the impact of frontal injury on local field potentials recorded simultaneously from 12 brain regions during a probabilistic reversal learning (PbR) task. Next, a set of rats were tested on a similar PbR task or an impulsivity task (differential reinforcement of low-rate behavior [DRL]) and half received salient cues associated with reinforcement contingencies to encourage engagement in the target behavior. After intervention on the PbR task, brains were stained for markers of activity. On the DRL task, cue relevance was decoupled from outcomes to determine if beneficial effects persisted on impulsive behavior. TBI decreased the ability to detect reinforced outcomes; this was evident in task performance and reward-feedback signals occurring at beta frequencies in lateral orbitofrontal cortex (OFC) and associated frontostriatal regions. The behavioral intervention improved flexibility and increased OFC activity. Intervention also reduced impulsivity, even after cues were decoupled, which was partially mediated by improvements in timing behavior. The current study established a platform to begin investigating cognitive rehabilitation in rats and identified a strong role for dysfunctional OFC signaling in probabilistic learning after frontal TBI.
Introduction
Traumatic brain injuries (TBIs) are associated with significant risk for a variety of psychiatric diseases. 1 Cognitive deficits, including impaired behavioral flexibility and increased impulsivity, also occur at high rates in the TBI population. 2,3 These deficits are chronic and can affect work productivity and family life and increase risk for substance abuse. 4 Research in cognitive rehabilitation has promise to treat these problems, 5,6 but findings are inconsistent. A critical review of clinical cognitive rehabilitation in TBI indicated low statistical power and minimal controlling for or reporting of comorbidities, 7 highlighting the need for quality research in this domain. Similar issues may have also contributed to recent false advertising claims leveled against commercial “brain training” programs by the U.S. government. 8 A stronger understanding of the mechanisms driving impaired behavioral flexibility and impulsivity may yield new insights to develop effective cognitive therapies for TBI.
Behavioral flexibility is the adaptation of behavior in response to changing contingencies. 9 Flexibility can be measured experimentally by changing reinforcing contingencies after a response strategy is acquired. The attentional set-shifting task was effective at identifying flexibility deficits after frontal TBI, 10 but the probabilistic reversal learning (PbR) task may be even stronger for repeated testing, given its challenging, ambiguous reinforcement contingencies. 11 Impulsivity is a heterogeneous construct defined as actions that may result in short-term gain at the cost of long-term benefits. 12 Impulsivity is measured experimentally by various tasks that concurrently pit a reinforced response against the need to inhibit that response under certain conditions (e.g., cues, time). Impulsivity tasks capture deficits in rat models of focal bilateral frontal, but not unilateral parietal, TBI. 13-15 However, an underexplored impulsivity task for TBI is the differential reinforcement of low-rate behavior (DRL) task, which requires a minimum amount of time between responses for reinforcement. 16 This task has components of inhibition of acute (“disinhibitory”) impulsive responding and inhibition of responses under times more relevant for impulsive choice. To date, only a single study has used this task in chronic TBI. 17
Both behavioral flexibility and impulsivity are reliant on circuits involving the frontal cortex (medial prefrontal, orbitofrontal cortices) and connections to the dorsal and ventral striatum, making these critical to study. Dopaminergic and noradrenergic modulation of the frontal cortex is a critical regulator of such behaviors 9,18,19 and striatal tonic and evoked dopamine activity is chronically reduced in a severity-dependent fashion after injury. 20 These changes in dopamine signaling across frontostriatal networks may lead to reduced outcome salience (i.e., less salient reinforcer-action contingencies) and impair behavioral flexibility or exacerbate impulsivity. Indeed, rats with frontal TBI showed differences in outcome salience by tracking the location associated with reinforcer delivery (i.e., goal-tracking) over reinforcement-predictive cues (i.e., sign-tracking). 21 Thus, a strategy for behavioral rehabilitation may involve augmenting the saliency of reinforcement contingencies. For example, a therapist may correct outcomes or explain why an outcome is not desired. However, rats 22,23 and patients 24,25 with TBI have difficulty discriminating outcomes which may limit automation or scaling of this approach.
An alternative method of cognitive rehabilitation may be to strengthen response-outcome relationships by forcing engagement in target behaviors—something that can readily be studied in animal models. This is often an element of various types of cognitive-behavioral therapy, which emphasize engaging in specific actions. A clinical trial showed modest promise of this approach for psychological distress after TBI, 26 but larger samples are needed to evaluate other injury-related outcomes. In non-injured rats, engagement in a target behavior (timing intervals) improved self-control for beneficial delayed outcomes, highlighting this therapeutic approach. 27,28 A recent study demonstrated efficacy of spatial-based behavioral intervention for experimental TBI, 29 but was focused on hippocampal-related functions and may not translate for frontally-dependent behaviors such as flexibility and impulsivity.
Thus, the goal of the current study was to establish an animal model for studying behavioral interventions after frontal TBI in rats by encouraging them to engage in target actions (i.e., respond correctly on a task). We performed three studies. In the first, we recorded local field potentials from 12 frontostriatal regions that are heavily involved in the assessed behaviors during the PbR task after sham or TBI surgery to gain a mesoscopic view of brain activity similar to human electroencephalography (EEG) in order to better understand the mechanisms driving post-TBI functional deficits and reorganization. 30-33 Next, we evaluated using a cue to guide behavior (thereby increasing engagement with target behaviors) to drive performance on the PbR task. We then extended this rehabilitative paradigm to the DRL task, using the high-resolution behavioral data to parse impairments in disinhibitory versus timing-related impulsivity. To evaluate whether treatment effects persisted, we decoupled cues from reinforcing contingencies and tested for an additional 2 weeks. Finally, we probed the effects of cues in post-mortem brain using early activation and dopamine/norepinephrine markers. Across these experiments, we hypothesized that TBI would disrupt frontostriatal circuits, resulting in behavioral dysfunction that could be rescued by behavioral intervention.
Methods
Experimental design
In Experiment 1, to study how neural activity is affected by TBI during decision-making in aged rats, we recorded local field potentials while rats performed a self-paced probabilistic reversal learning (PbR) task. We compared brain activity collected simultaneously from 12 electrode sites during reward-feedback in either TBI or Sham rats from Weeks 3-11 post-injury (Fig. 1A). In Experiments 2 and 3, to determine whether introducing salient cues to engage target behaviors could rescue chronic TBI-induced impairment, we conducted two behavioral studies and one histological study in young adult rats. The core design for these studies was a 2 × 2 factorial with rats in either TBI or Sham conditions, and Cue or Control conditions. In Experiment 2, we assessed initial behavioral flexibility using the Attentional Set-Shifting task (AST) and then determined whether impairments could be rescued by cues associated with the highest probability choice on the PbR task (Fig. 1B). We then measured neural activity in target regions using histological markers of early activation. In Experiment 3, we used a differential reinforcement of low-rate behavior (DRL) 20-sec task to measure impulsive behaviors associated with disinhibitory responding and poor timing of delays. After behavior was stable, cue importance was degraded (always on) to probe whether the intervention had lasting effects on impulsivity (Fig. 1C). Rats were randomly (or pseudorandomly when baseline data were available) assigned to conditions.

Figure 1. Study design.
Animals, brain injury, and electrode implant
Full details are in the Supplementary Material. A total of 123 male Long-Evans rats were used initially; 108 remained after surgical deaths and exclusions due to technical issues. In Experiment 1, rats (age: 8.5 months) were water-scheduled (2 h free access/day), and in Experiments 2-3, rats (age: 3 months) were food-restricted to approximately 85% of ad libitum weight. Bilateral frontal controlled cortical impacts were delivered according to previous studies and prior to behavior testing. 14,34 A 5.0 mm diameter flat-faced tip impacted the exposed medial prefrontal cortex (+3.0, 0.0 from bregma) at a rate of 3 m/sec to a depth of 2.5 mm. Sham controls underwent an “intact” sham procedure with no craniectomy. Previous researchers have expressed concern over craniotomy shams, 35 but we have not found differences in craniectomy versus intact procedures. 14
Rats in Experiment 1 underwent a second surgery to implant local field potential microwires one week following the TBI/Sham surgical procedure. The implantation procedures have been previously described in detail 36 and are expanded upon further in the Supplementary Material. The probes consisted of 32 wires, separated into eight metal cannulae (four wires/cannula). The wires within each cannula targeted the same AP/ML coordinates, but each individual microwire was cut to a unique DV depth. For this paper, we focused our analysis on 12 regions (Fig. 1A) previously identified as linked with reward-processing 32 and/or those adjacent to the injury site.
Behavioral apparatus
Full details are in the Supplementary Material. Rats were tested in a custom acrylic operant chamber (Experiment 1) 37 and in standard five-choice operant chambers (Experiments 2-3; Med Associates, St. Albans, VT).
Experiment 1: Electrophysiological correlates of behavioral flexibility
One cohort of rats performed Experiment 1 (N = 22). This experiment evaluated reward feedback-related neural activity associated with behavioral flexibility performance on the PbR task following TBI. Neural activity oscillates at periodic frequencies that may serve to coordinate communication within and between distributed brain areas. There are several canonical frequency bands (delta 1-4 Hz; theta 4-8 Hz; alpha 8-15 Hz; beta 15-30 Hz; gamma >30 Hz). Prior work has correlated distinct aspects of behavior and cognition with the power of certain frequency bands, which can also indicate disease states or predict response to treatment. 30-32,38,39
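As a minimal sketch of the band boundaries listed above, the following Python function maps a frequency to its canonical band. The name `band_for`, the half-open interval convention at shared edges, and the treatment of gamma as a single band are assumptions made for illustration; the analyses split gamma into low and high bands whose boundary is not stated in this section.

```python
# Canonical bands as listed in the text: (name, low_hz, high_hz).
# Half-open intervals [low, high) are an assumption used to resolve shared edges.
BANDS = [
    ("delta", 1.0, 4.0),
    ("theta", 4.0, 8.0),
    ("alpha", 8.0, 15.0),
    ("beta", 15.0, 30.0),
]


def band_for(freq_hz: float) -> str:
    """Return the canonical band name for a frequency in Hz (hypothetical helper)."""
    if freq_hz < 1.0:
        raise ValueError("frequencies below 1 Hz are outside the listed bands")
    for name, low, high in BANDS:
        if low <= freq_hz < high:
            return name
    # Everything at or above 30 Hz is treated as gamma in this sketch.
    return "gamma"
```

For example, `band_for(20)` falls in the beta band that the later analyses focus on.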
Prior to surgery, rats were habituated (three sessions) and trained to perform sequential nose port responses for water reinforcers but were naïve to the final PbR task. Due to COVID-related lab closures, rats began habituation 7 months after arrival (∼8.5 months of age). On average, training took 2 weeks to complete (5-15 sessions), at which time surgeries were conducted. Once rats recovered from the second surgery, they began training on the self-paced PbR task (2 weeks post-injury; Fig. 1A). The behavioral training and simultaneous electrophysiology recording occurred in 60-min sessions two to four times/week in Sham (n = 10) and TBI (n = 12) rats.
On the PbR task, different probabilities of reinforcement were randomly assigned to each choice port at the start of a session (“correct” nose port = 80% reinforcement, “incorrect” nose port = 20% reinforcement). Each trial began with houselights off and the middle nose port LED on. A response to the middle nose port initiated the trial, and LEDs in the choice ports to the left and right of the middle nose port then illuminated to signal an available choice. A response in the “correct” port led to 2 sec (20 μL) of water 500 msec after the response on 80% of trials. Selecting the “incorrect” port led to 2 sec (20 μL) of water on only 20% of trials. Trials with no reinforcer resulted in tone presentation (500 msec) and illumination of the houselight to signal lack of water. There was a 5-sec intertrial interval after the outcome. The probabilities associated with each choice port were reversed when eight out of the last ten responses in a moving window were “correct” choices (regardless of reinforcement outcome). For rats with implants still intact (n = 17), the experiment ended at 11 weeks post-injury (minimum 8 weeks post-injury). Analyses were based on 275 behavioral sessions (average 11 sessions/rat; Fig. 1A). Local field potential data were pre-processed offline using custom MATLAB scripts and functions from EEGLAB. 30,31,36 Data were aligned to the choice event, artifacts were removed, and waveforms were extracted. More details are available in the Supplementary Material. Confirmation of electrode placement is detailed in Figure S1 in the Supplementary Material.
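The reinforcement and reversal rules described above can be sketched in Python as follows. This is an illustrative simulation under stated assumptions, not the task-control code used in the study: the class name `PbRTask`, clearing the moving window after each reversal, and allowing a reversal before ten responses have accumulated are all assumptions not specified in the text.

```python
import random
from collections import deque


class PbRTask:
    """Two-choice probabilistic reversal task with a moving-window reversal rule."""

    def __init__(self, p_correct=0.8, p_incorrect=0.2, window=10, criterion=8, seed=None):
        self.rng = random.Random(seed)
        self.p_correct = p_correct
        self.p_incorrect = p_incorrect
        self.criterion = criterion
        self.recent = deque(maxlen=window)  # 1 = chose the "correct" port
        # Probabilities are randomly assigned to ports at the start of a session.
        self.correct_port = self.rng.choice(["left", "right"])
        self.reversals = 0

    def step(self, choice):
        """Score one trial; return True if water would be delivered."""
        is_correct = choice == self.correct_port
        p_reward = self.p_correct if is_correct else self.p_incorrect
        rewarded = self.rng.random() < p_reward
        self.recent.append(1 if is_correct else 0)
        # The reversal depends on choices only, regardless of reinforcement outcome.
        if sum(self.recent) >= self.criterion:
            self.correct_port = "right" if self.correct_port == "left" else "left"
            self.reversals += 1
            self.recent.clear()  # assumption: the window restarts after a reversal
        return rewarded
```

A simulated agent that always picks the currently "correct" port triggers a reversal on its eighth response, which illustrates why the number of reversals per session serves as the primary flexibility measure.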
Experiment 2: Effects of behavioral intervention on behavioral flexibility
Because the rats in Experiment 1 were aged, we also wanted to evaluate whether effects of injury generalized to young rats or whether those findings were due to age-related susceptibilities. This experiment also extended the previous one by evaluating common paradigms for assessing behavioral flexibility, implementing a salient behavioral intervention to encourage tracking of reversal contingencies and thereby improve function after TBI, and evaluating neural activity in post-mortem tissue using immediate early gene markers. One set of rats went through Experiment 2 parts a through c (N = 41).
Experiment 2a: attentional set-shifting
To establish a baseline for flexibility deficits prior to long-term testing, the AST was carried out as previously described, 9 at 5 weeks post-injury in Sham (n = 22) and TBI rats (n = 19). Rats were approximately 2.5 months old at the time of injury. Rats were habituated (two sessions, 10 sucrose pellets freely available) to the chamber and then trained to lever press using an autoshaping procedure where a pellet was delivered every 35 sec on average, but 10 sec prior to pellet delivery, both levers extended, and any presses were immediately reinforced. Three sessions were conducted until 40 presses occurred, or pressing was hand shaped by the experimenter. Rats then underwent retractable lever press training and a side preference assessment. AST testing began the next day. Each phase of AST consisted of 200 discrete trials per day, or until 10 consecutive correct responses occurred. The initiation of a trial was marked by the illumination of either the left or right stimulus light, which varied pseudorandomly. After 3 sec, both the left and right levers were extended, and the rat was allotted 10 sec to make a response. Phase 1 (“set”), the cue discrimination task, reinforced responses to the location of the stimulus light. Phase 2 (“shift”), the response discrimination task, reinforced responses to one side (left/right), regardless of light position. Phase 3 (“reversal”), the response reversal task, reinforced responses to the side opposite the previous phase (e.g., left to right; Fig. 1B).
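The per-phase stopping rule (10 consecutive correct responses within at most 200 trials) can be illustrated with a short Python sketch of a trials-to-criterion count. The function name `trials_to_criterion` and the choice to let any non-correct trial (including an omission) break the streak are assumptions for illustration; the handling of omissions is not described in this section.

```python
def trials_to_criterion(outcomes, run_length=10, max_trials=200):
    """Return the 1-based trial on which `run_length` consecutive correct
    responses are completed, or None if the criterion is never met.

    `outcomes` is a sequence of booleans (True = correct response).
    Assumption: any non-correct trial, including an omission, resets the streak.
    """
    streak = 0
    for trial, correct in enumerate(outcomes[:max_trials], start=1):
        streak = streak + 1 if correct else 0
        if streak >= run_length:
            return trial
    return None
```

A rat that errs on the tenth trial and then responds correctly ten times in a row would reach criterion on trial 20 under this scoring.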
Experiment 2b: effects of behavioral intervention on probabilistic reversal learning
One week after completing the AST, rats began PbR testing and continued for 5 weeks according to previous protocols. 11 Each PbR session consisted of 200 discrete trials separated by a 15-sec intertrial interval; trials began on presentation of both levers and lasted 10 sec. During the first trial, differential probabilities of reinforcement were assigned to each lever (“correct” lever = 80%, “incorrect” lever = 20%) as described in Experiment 1. After eight consecutive “correct” choices (regardless of reinforcement), the probabilities associated with each lever reversed (Fig. 1B). For the behavioral intervention, approximately half the rats had a cue light illuminated above the “correct” lever 3 sec prior to extension and throughout the trial, resulting in four groups: Sham (n = 12), TBI (n = 10), Sham-Cue (n = 10), and TBI-Cue (n = 9). Rats were matched for AST performance (trials to criterion on “shift”) and pseudorandomly assigned to cue or control condition.
Experiment 2c: histology
Further details are available in the Supplementary Material. During Week 11 post-injury, and at 90 min after behavioral testing started (task lasted 45-50 min, providing c-Fos peaks reflecting middle of task), rats were transcardially perfused. Slices were stained with c-Fos (1:500) and anti-TH (1:1000) and quantified in three regions of interest: prelimbic cortex, lateral orbitofrontal cortex (lOFC), and nucleus accumbens core. TH identifies dopaminergic, noradrenergic, and adrenergic neurons. Because TH+ cells were sparse outside of the accumbens and because dopaminergic and noradrenergic projections to frontal regions were also of interest, any cell with multiple colocalized pixels clustered together was counted as putative synapses onto active cells marked with c-Fos.
Experiment 3: Effects of behavioral intervention on impulsivity
The following experiments evaluated whether a behavioral intervention to encourage waiting for the elapsed delay could rescue impulsive deficits and whether improvements persisted after the intervention was removed. One set of rats went through Experiment 3 parts a and b (N = 45).
Experiment 3a: differential reinforcement of low-rate behavior
Rats were approximately 2.5 months old at the time of injury. Two weeks post-injury, rats were habituated and underwent autoshaping as described in Experiment 2, but with only the left lever extended. Three sessions were conducted until 40 presses occurred, or pressing was hand shaped by the experimenter. Rats then performed three sessions of fixed ratio-1 responding. At 4 weeks post-injury, rats began testing on 60-min sessions of the DRL-20 task for 5 weeks. On the DRL task, presses were only reinforced if they were separated by more than 20 sec. Any press made prior to that reset the timer (Fig. 1C). 16 For the behavioral intervention, approximately half the rats had a cue light illuminated after the 20 sec timer had elapsed, resulting in four groups: Sham (n = 13), TBI (n = 10), Sham-Cue (n = 11), and TBI-Cue (n = 11).
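The DRL-20 contingency can be sketched in Python as a scoring function over lever-press timestamps. This is an illustrative sketch only: the name `score_drl` is hypothetical, and leaving the first press of a session unscored (because it has no preceding press) is an assumption, since the handling of the session start is not described here.

```python
def score_drl(press_times, min_interval=20.0):
    """Score a DRL session from press timestamps in seconds.

    Returns (inter-response times, number reinforced, percent reinforced).
    A press is reinforced only if it follows the previous press by more than
    `min_interval` seconds; every press restarts the timer.
    Assumption: the first press is not scored because it has no preceding press.
    """
    irts = [later - earlier for earlier, later in zip(press_times, press_times[1:])]
    reinforced = sum(1 for irt in irts if irt > min_interval)
    percent = 100.0 * reinforced / len(irts) if irts else 0.0
    return irts, reinforced, percent
```

For presses at 0, 5, 30, 52, and 60 sec, the inter-response times are 5, 25, 22, and 8 sec, so two of the four scored presses would be reinforced.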
Experiment 3b: persistence of behavioral intervention effects
After 5 weeks of DRL task performance, behavior was stable. To determine whether the cue training persistently improved impulse control, the salience of the cue was degraded by turning on the cue light at all times for all rats. Rats were then re-tested for 2 additional weeks.
Statistical analysis
Additional details are in the Supplementary Material. The critical p value for all statistics was set at 0.05.
Electrophysiology data (normalized power) were analyzed using linear mixed-effects models with frequency band as the slope parameter of the random effect for each subject (Frequency|Subject, to account for interdependence) and Time as the random effects, and with Injury (Sham/TBI), Frequency (delta, theta, alpha, beta, low-gamma, and high-gamma), Trial (Reward/No Reward), and their interactions as the fixed effects. Another linear mixed-effects regression was used to examine normalized beta power across electrodes, with individual Subject intercepts and Time as the random effects, and Injury (Sham/TBI), Electrode (12 total locations), and their interactions as the fixed effects. Finally, a linear regression was used to determine the relationship between peak beta power during reward-feedback (500-2500 msec after response) and behavioral measures in the PbR task. Post-hoc tests were Bonferroni corrected. All data were analyzed using SPSS and visualized using GraphPad Prism software.
All other data were analyzed using R statistical software (http://www.r-project.org/; version 4.0.2) with the lme4, lmerTest, and stats libraries. Most repeated-measures behavioral data were analyzed using linear mixed effects regression using individual subject intercepts as the random effect, and Injury (Sham/TBI), Intervention (Non-Cued/Cued; Experiments 2-3 only), Week and their interactions as the fixed effects. The AST data were analyzed with repeated measures analysis of variance (ANOVA; Injury × Phase). A planned comparison was made with a t-test for the last session of performance in Experiment 3b between the TBI and TBI-cue group to determine if the intervention had a lasting beneficial effect. Histological data were analyzed as a mixed-effects ANOVA with individual subject as the random effect and Injury, Intervention, and their interactions as fixed effects.
Inter-response time (IRT) distributions in the DRL task were analyzed by fitting a compound curve consisting of an exponential decay and a Gaussian distribution (Fig. 1C). From this, disinhibitory responding was calculated by taking the area under the curve up to the 5-sec point, and the ability to time the delay was estimated from the t0 value of the equation. These were then used as outcome measures as above. Prior studies divided IRTs into early and late, 40,41 but this approach models the interdependency between the two distributions, which we recently demonstrated affects false positive rates. 42
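To make the two IRT-derived measures concrete, the following Python sketch evaluates one possible compound curve and its area up to 5 sec. The exact equation is not given in this section, so the parameterization here (amplitude `a` and rate `k` for the decay; amplitude `b`, center `t0`, and width `sd` for the Gaussian) is an assumption, as are the function names. The fitting step itself (e.g., nonlinear least squares) is omitted.

```python
import math


def irt_curve(t, a, k, b, t0, sd):
    """Exponential decay plus Gaussian peak at inter-response time t (sec).

    Hypothetical parameterization: a * exp(-k t) + b * exp(-(t - t0)^2 / (2 sd^2)).
    """
    decay = a * math.exp(-k * t)
    timing_peak = b * math.exp(-((t - t0) ** 2) / (2.0 * sd ** 2))
    return decay + timing_peak


def auc(params, upper=5.0, steps=1000):
    """Trapezoidal area under the curve from 0 to `upper` seconds."""
    h = upper / steps
    ys = [irt_curve(i * h, *params) for i in range(steps + 1)]
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
```

Under this sketch, a larger area below 5 sec would correspond to more disinhibitory responding, while `t0` locates the timing peak relative to the 20-sec requirement.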
Results
Brain injury impaired probabilistic reversal learning (Experiment 1)
We assessed behavioral performance on the self-paced PbR task after frontal TBI as measured by the number of reversals, number of trials, trials to the first reversal (initial discrimination block), probability of staying given a win (“Win-Stay” or [p(Stay|Win)]), and probability of staying given a loss (“Lose-Stay” or [p(Stay|Loss)]). All measures were analyzed using a linear mixed effects model [Fixed: Injury (Sham, TBI) × Time, Random: Subject]. For the primary outcome measure, the number of reversals, the interaction of Injury × Time was significant (p = 0.001; Fig. 2A; Tables S1 and S2 in the Supplementary Material). TBI rats performed fewer reversals than Sham rats and took longer to improve behavior across sessions (p < 0.001). Since the task was self-paced, fewer reversals may reflect fewer trials completed overall; however, injury did not significantly change trial count between groups (Injury p = 0.055; Fig. 2A).
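The Win-Stay and Lose-Stay measures can be illustrated with a short Python sketch that computes both conditional probabilities from a trial sequence. The function name `stay_probabilities` is hypothetical, and the sketch assumes omitted trials have already been excluded and that reversal boundaries are ignored; neither detail is specified in this section.

```python
def stay_probabilities(choices, rewards):
    """Return (p(Stay|Win), p(Stay|Loss)) from parallel per-trial sequences.

    `choices` holds the port chosen on each trial; `rewards` holds whether
    that trial was reinforced. A "stay" is repeating the previous choice.
    """
    counts = {True: [0, 0], False: [0, 0]}  # previous outcome -> [stays, opportunities]
    for prev_choice, prev_reward, choice in zip(choices, rewards, choices[1:]):
        won = bool(prev_reward)
        counts[won][1] += 1
        if choice == prev_choice:
            counts[won][0] += 1

    def ratio(stays, total):
        # NaN marks an undefined probability (no trials of that outcome type).
        return stays / total if total else float("nan")

    return ratio(*counts[True]), ratio(*counts[False])
```

A rat that always repeats a rewarded choice and always switches after an unrewarded one would score 1.0 for Win-Stay and 0.0 for Lose-Stay.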

Figure 2. Experiment 1: Performance on the self-paced probabilistic reversal learning (PbR) task 3-11 weeks post-injury.
Although it was not significant, we were concerned over divergence in total trials. To closely examine learning deficits related to behavioral flexibility, we also assessed the number of trials prior to reversal in the initial discrimination block and the probability of choosing the same option given a win or a loss. Trials in the initial discrimination block may be advantageous to measure the most ambiguous set of trials where rats must learn the reinforcing contingencies. TBI rats took more trials to complete the first reversal, and therefore had more difficulty discriminating compared with Sham rats (p < 0.001; Fig. 2B). There was no difference between groups in Win-Stay behavior (Fig. 2C). However, the Lose-Stay behavior was greater in TBI than Sham groups (p = 0.033; Fig. 2D). There was also a main effect of Time (p = 0.002; Fig. 2D) such that the probability of making the same choice after a loss decreased across weeks of testing.
Beta power was modulated by rewarded outcome and impaired after TBI (Experiment 1)
First, to identify neuronal communication signals most affected by the injury, we examined differences in normalized power between frequency bands during reward-feedback (500-2000 msec after response) on the lOFC electrode. We selected the lOFC electrode because it is a cardinal reward-processing region in the frontostriatal network and is located outside the focal injury site. Normalized power across frequencies was analyzed with a linear mixed effect model [Fixed: Injury (Sham, TBI) × Trial (Reward, No Reward), Frequency (Delta, Theta, Alpha, Beta, LGamma, HGamma). Random: (Frequency|Subject), Time].
There were significant Injury × Trial and Trial × Frequency interactions (p's < 0.001; Fig. 3A). On Reward trials, Sham rats had significantly greater beta frequency power and delta frequency power (p = 0.019; p = 0.029; Fig. 3A; Table S3 in the Supplementary Material) compared with TBI rats. However, on No Reward trials, there was no difference at any frequency band (Fig. 3A; Table S3 in the Supplementary Material). The contrast of [Reward-No Reward] trials was only significant at beta frequencies (p = 0.011; Fig. 3A; Table S3 in the Supplementary Material). To expand upon these analyses, we plotted the beta power over time relative to the Reward or No Reward signal (Fig. 3B). lOFC beta power during the outcome period differentiated between presence or absence of reward over time in Sham rats, but the difference was attenuated after TBI (Fig. 3B). Interestingly, the power modulation ramped up rapidly during the reward feedback period (500 msec after response) and was maintained for several seconds, but remained low in TBI rats, indicating a less salient neural response to presence or absence of reward.

Figure 3. Experiment 1: Local field potential activity during the reward-feedback period (500-2500 msec after response) of the probabilistic reversal learning (PbR) task.
Given the importance of beta frequency oscillations identified, we sought to probe additional brain locations to evaluate whether these effects were region specific. First, we analyzed all 12 electrodes across the frontostriatal network in a linear mixed effect model [Fixed: Injury (Sham, TBI) × Electrode (12 electrodes) × Trial (Reward, No Reward). Random: Subject, Time]. These were selected to include areas focal and local to the site of injury, as well as distal that receive projections from prefrontal cortex (see Fig. 1A for region names and abbreviations). The model revealed a main effect of Electrode (p < 0.001; Fig. 3C) and main effect of Trial (p < 0.001; Fig. 3C; Tables S4 and S5 in the Supplementary Material). There were also significant Injury × Electrode, Injury × Trial, and Trial × Electrode interactions (p = 0.013; p < 0.001; p < 0.001; Fig. 3C; Table S4 in the Supplementary Material). Breaking these interactions down, on Reward trials TBI rats had significantly lower beta power in lOFC and the nearby anterior insula (Ains) electrode (p = 0.011; p = 0.037). However, across No Reward trials, TBI rats showed a significant increase in beta power on multiple electrodes, including A32D, LFC, ALM, vOFC, lOFC (p's < 0.032). When these were combined to the contrast of [Reward-No Reward], only the lOFC and Ains electrode effects remained significant (p's = 0.040; Fig. 3C; Table S3 in the Supplementary Material). To better understand if these effects were merely localized to the injury (no significant individual effects in distal circuits), we next examined beta frequencies across the grouping regions of focal, local, or distal electrodes to the injury site. There was a significant decrease in beta frequencies on the contrast of [Reward-No Reward] regardless of the location of recording (Focal t(4) = 5.84, p = 0.004; Local t(8) = 3.53, p = 0.008; Distal t(6) = 3.85, p = 0.008; Fig. 3D). 
These data suggest TBI broadly impacted beta power signaling during outcome processing.
Finally, we tested the relationship between beta power and PbR performance. Peak beta power during the reward-feedback period had a moderate positive relationship with the probability of staying given a win (p(Stay|Win); p = 0.026, Fig. 3E; Table S6 in the Supplementary Material). However, peak beta power had no relationship with probability of staying given a loss (p(Stay|Loss); p = 0.489, Fig. 3E; Table S6 in the Supplementary Material). This indicates that beta power is not only related to discriminating winning versus losing trials (at least in sham rats), but that it also relates to future behavior based upon winning outcomes.
Behavioral intervention rescued deficits in probabilistic reversal learning (Experiment 2)
Prior to the behavioral intervention, rats were tested on the AST for baseline deficits in flexibility. There were no TBI effects, other than a small increase in omitted trials in the reversal phase (p = 0.001; Fig. S2 and Table S7 in the Supplementary Material). Rats were then pseudorandomly assigned (based on AST trials to criterion in the “shift” phase) to behavioral intervention or normal conditions on the PbR task (Fig. 1B). All outcome variables on the PbR task were analyzed using linear mixed effects regression [Fixed: Injury (Sham, TBI) × Intervention (Non-Cued, Cued) × Time, Random: Subject; Tables S8 and S9 in the Supplementary Material].
For the primary outcome measure, the number of reversals, the three-way interaction of Injury × Intervention × Time was significant (p = 0.042; Fig. 4A). Evaluation of this effect revealed that because TBI rats started lower than Shams, they had higher rates of learning (p = 0.026). In contrast, behavioral intervention restored reversals in TBI-Cue rats to Sham levels from the start of testing and showed no difference in learning rate (p = 0.473). For total omissions on the PbR task, there was an Injury × Time effect (p < 0.001) such that TBI rats started high but declined over time (Fig. 4B).

Figure 4. Experiment 2: Performance on the probabilistic reversal learning (PbR) task.
To better understand the pattern of choice, the probabilities of choosing the same option given a win (“Win-Stay”; [p(Stay|Win)]) or a loss (“Lose-Stay”; [p(Stay|Loss)]) were examined (Fig. 4C, 4D). For Win-Stay behavior, there was a significant Intervention × Time effect (p < 0.001), such that Cued rats increased the likelihood of choosing the same option again relative to Non-Cued rats. There was also a significant Injury × Time effect (p = 0.041), such that TBI decreased the likelihood of choosing the same option again relative to Sham. For Lose-Stay behavior, there was a significant effect of Intervention × Time (p < 0.001), such that Cued rats increased the likelihood of choosing the same option again relative to Non-Cued rats. There was also a main effect of Injury (p = 0.012), such that TBI rats were more likely to stay after a loss.
Behavioral intervention increased c-Fos expression in the orbitofrontal cortex (Experiment 2)
To evaluate how the behavioral intervention and reversal learning were affecting neural activity, rats were euthanized after completing the task and examined for c-Fos expression. The number of c-Fos+ cells was analyzed in mixed-effects Poisson regressions [Fixed: Injury (Sham, TBI) × Intervention (Non-Cued, Cued), Random: Subject; Table S10 in the Supplementary Material], and c-Fos+ cells colocalized with TH+ were analyzed in two-way ANOVAs [Injury × Intervention].
For c-Fos+ cells, there were no differences in the prelimbic or nucleus accumbens core (p's > 0.107). However, in the lOFC, there was a significant Injury × Intervention interaction (p = 0.006; Fig. 5B), such that TBI severely reduced the number of active cells, but the behavioral intervention improved this. For TH-colocalized cells, there was a significant main effect of Injury (p = 0.039; Fig. 5A), such that TBI increased c-Fos+/TH+ cells. There were no other effects for colocalized cells (p's > 0.214).

Experiment 2: Active neurons following the probabilistic reversal learning (PbR) task.
Behavioral intervention reduced impulsivity after TBI (Experiment 3)
A separate set of rats was evaluated on an impulse control task to determine the effects of a behavioral intervention on an additional frontally-mediated function. The primary measure of impulse control, percent reinforced responses, was analyzed using linear mixed effects regression [Fixed: Injury (Sham, TBI) × Intervention (Non-Cued, Cued) × Time, Random: Subject; Tables S11 and S12 in the Supplementary Material]. The three-way interaction of Injury × Intervention × Time was significant (p < 0.001; Fig. 6A). A breakdown of this effect revealed that while Sham rats improved over time, this effect was not present in TBI rats (p < 0.001), and the behavioral intervention led to no difference between TBI-Cue and Sham-Cue rats (p = 0.309). Because data were percentage-based, a secondary measure of total responses was also analyzed in the same fashion to verify effects were not driven by low response rates in TBI rats. The three-way interaction of Injury × Intervention × Time was significant (p < 0.001). Comparison of this effect revealed the same findings: that injury impaired TBI rats relative to Sham over time (p < 0.001), but there was no difference between TBI-Cue and Sham-Cue rats (p = 0.484). Only six instances of fewer than 100 presses were recorded (lowest: 65), and the average was 260.64 presses per session (data not shown).

Experiment 3: Performance on the differential reinforcement of low rate (DRL) task
To break down the disinhibitory and ability-to-wait components of the task, individual IRT distributions were fit to the compound exponential + Gaussian curve (Fig. 1C) using 1 week's worth of data per subject. The raw curves are shown in Figure 7. Parameter values were then used to estimate proportion of disinhibitory responding using the area under the curve and ability to wait for the delay using the t0 parameter. These values were analyzed as described above. For disinhibitory responding, only the Intervention × Week effect was a significant contributor to the model (p = 0.023; Fig. 6B) such that Intervention reduced disinhibitory responses. For the timing t0 parameter, the three-way interaction of Injury × Intervention × Time was significant (p = 0.006; Fig. 6C); TBI rats were significantly less likely to wait (i.e., left-shifted values or shorter IRTs) across weeks relative to Sham (p < 0.001), while behavioral intervention prevented this effect in TBI-Cue rats relative to Sham-Cue (p = 0.968).
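To make the decomposition concrete, the sketch below shows one way an IRT distribution could be separated into a fast, disinhibitory component and a timed component. It assumes a two-component mixture (an exponential for short-IRT bursts plus a Gaussian centered on t0) fit by expectation-maximization. The exact functional form and fitting procedure behind the curve in Fig. 1C are not given in this section, so the parameterization, the function names, and the use of EM are all assumptions. Under this assumed form, the mixture weight `w` plays the role of the area under the initial (exponential) curve and the Gaussian mean plays the role of t0.

```python
import math


def weighted_components(t, w, lam, t0, sigma):
    """Weighted densities of the two assumed components at IRT t (seconds)."""
    burst = w * lam * math.exp(-lam * t)  # exponential: disinhibitory presses
    timed = (1 - w) * math.exp(-0.5 * ((t - t0) / sigma) ** 2) / (
        sigma * math.sqrt(2 * math.pi)
    )  # Gaussian: presses timed around t0
    return burst, timed


def fit_irt_mixture(irts, n_iter=200):
    """Fit the assumed exponential + Gaussian mixture to positive IRTs by EM.

    Returns (w, lam, t0, sigma), where w is the estimated proportion of
    disinhibitory responses and t0 is the center of the timed responses.
    """
    data = sorted(irts)
    n = len(data)
    mean_all = sum(data) / n
    # Crude starting values (an assumption; other initializations are possible)
    w = 0.5
    lam = 1.0 / mean_all
    upper = data[n // 2:]
    t0 = sum(upper) / len(upper)
    sigma = math.sqrt(sum((t - mean_all) ** 2 for t in data) / n)

    for _ in range(n_iter):
        # E-step: probability that each IRT came from the exponential component
        resp = []
        for t in data:
            burst, timed = weighted_components(t, w, lam, t0, sigma)
            resp.append(burst / (burst + timed))
        # M-step: weighted maximum-likelihood updates
        r_sum = sum(resp)
        g_sum = n - r_sum
        w = r_sum / n
        lam = r_sum / sum(r * t for r, t in zip(resp, data))
        t0 = sum((1 - r) * t for r, t in zip(resp, data)) / g_sum
        var = sum((1 - r) * (t - t0) ** 2 for r, t in zip(resp, data)) / g_sum
        sigma = max(math.sqrt(var), 1e-6)  # guard against a collapsed variance
    return w, lam, t0, sigma
```

In this sketch, a lower `w` would correspond to less disinhibitory responding and a larger `t0` to longer waiting, which is the direction of the intervention effects described above.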

Experiment 3: inter-response time (IRT) distributions from the differential reinforcement of low rate (DRL) task were used to identify disinhibitory versus timing aspects of impulsivity. Panels from top to bottom show performance across each post-injury week and the final session. The first horizontal break (between Weeks 8 and 9) represents the cessation of the behavioral intervention, and the second break (after Week 10) isolates the last session of post-injury Week 10 that was analyzed for persistence of rehabilitative effects (in Fig. 6). The dotted line indicates the 20 sec mark after which presses were reinforced. Note differing Y axes to better display data. Lines represent group means (± standard error of the mean shaded). Color image is available online.
Behavioral intervention effects persisted after discontinuation for TBI rats (Experiment 3)
To evaluate persistence of effects, cues were turned on at all times to degrade their relevance. After 2 weeks, a planned comparison between TBI and TBI-Cue groups was conducted with a t-test for percent reinforced responses, disinhibitory responding area under the curve, and the timing t0 parameter on the final session. There was a significant improvement in the TBI-Cue group in percent reinforced responses (t(17.98) = -2.97, p = 0.008; Fig. 6D) and fewer total presses (t(17.54) = -2.46, p = 0.025). However, when the variables from the IRT distribution were analyzed, there was no significant improvement in disinhibitory responding (t(18.61) = -0.94, p = 0.358; Fig. 6E), but there were significant improvements in the t0 parameter, the ability to appropriately wait for an interval (t(17.58) = 2.15, p = 0.046; Fig. 6F).
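The fractional degrees of freedom reported above (e.g., t(17.98)) are consistent with Welch's unequal-variance t-test, although the text states only that a t-test was used. Assuming Welch's test, the statistic and its Welch-Satterthwaite degrees of freedom can be computed as in the following sketch. The function name is hypothetical, and the p-value is omitted because it requires a t-distribution CDF that is not in the Python standard library.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on samples a and b."""
    # Squared standard error contributed by each group
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

Because the degrees of freedom depend on the sample variances, they are generally non-integer and fall below the pooled value of n1 + n2 - 2 when group variances or sizes differ.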
Discussion
There is a need for effective cognitive rehabilitation paradigms to treat individuals suffering from chronic psychiatric symptoms after TBI. Animal models using behavioral interventions to approximate cognitive rehabilitation can investigate physiological and behavioral mechanisms and optimize therapeutic strategies. Chronic impulsivity and decision-making deficits occur in patients with brain injuries 2,43 and in animal models of injury. 14,44 Cognitive rehabilitation paradigms in patients with TBI show some potential 5,6; however, there are also criticisms of the rigor and degree of therapeutic effects for current methods. 7 Ultimately, a stronger understanding of the neuroplastic processes underlying these effects could augment therapies through pharmacology, neuromodulation, or modified behavioral paradigms.
In the current study, both behavioral and neurophysiological results suggested insensitivity to reinforcement was underlying behavioral flexibility deficits after frontal TBI. Blunted valence signaling in frontostriatal brain areas was particularly evident in loci associated with outcome processing such as the lOFC and anterior insula (Fig. 3C). TBI rats were poorer at adjusting behavior to changing contingencies on the PbR task (Fig. 2). This was in line with reward-feedback impairments indicative of poor reinforcement salience at beta frequencies (Fig. 3). Therefore, a simple behavioral intervention to engage rats in the target behavior was evaluated: “correct” responses were cued. This successfully rescued deficits in behavioral flexibility (Fig. 4A) and even increased activity of cells in the lOFC (Fig. 5B), indicating potential for plastic reorganization of these circuits after TBI. However, it was key to demonstrate that this would translate to lasting effects and generalize to other frontally-dependent behaviors. To evaluate this, we assessed an intervention paradigm on an impulsivity measure. Rats with TBI were more impulsive, showing reduced inhibition and poor waiting during the DRL task (Fig. 6 and 7). Our behavioral intervention improved these deficits and, critically, improvement persisted when intervention was discontinued (Fig. 6D-F). Together, these data establish a framework for evaluating behavioral interventions and investigating physiological substrates such as improved reinforcer salience in reward-related networks.
In Experiment 1 (Fig. 2 and 3), local field potentials captured brain dynamics on a mesoscopic scale, which offers translational potential to human EEG. 33,39,45 We identified a reward-feedback signal at beta frequencies that was impaired following TBI. Beta frequency activity is related to aspects of reward value and outcome evaluation. 32,46–48 In Sham rats, beta oscillations modulated activity based on outcome: higher for rewarded, lower for non-rewarded (Fig. 3A-D). In injured rats, beta activity failed to differentiate between presence and absence of reward (Fig. 3A-D), a deficit that was evident at locations very close (“focal”), near (“local”), and away from (“distal”) the injury site (Fig. 3D). The effect was most prominent in the lOFC and anterior insula (Fig. 3C). Peak beta activity was positively related to Win-Stay, and not Lose-Stay behavior, suggesting importance of beta activity in guiding future decisions (Fig. 3E). In humans, insular beta oscillations increased during salient outcomes (positive or negative) and correlated with other cortical areas during reward presentation. 49 Beta oscillations may be inversely related to reward-prediction error signal, 50 and dopamine or norepinephrine disruptions following TBI may modulate changes in beta-driven reward feedback.
We also observed decreased reward-related delta power following injury (Fig. 3A), consistent with a previous report of task-related delta linked to positive rewards. 51 The electrophysiology closely tracked impaired flexibility on the PbR task (Fig. 2A-D). TBI rats were able to learn the task, but completed fewer reversals and demonstrated poor initial discrimination, suggesting impairments may be driven by outcome evaluation in general. These findings map onto clinical populations with brain injury. The ability to adapt to changing outcomes is impaired after TBI 24 and patients displayed difficulty identifying the contingencies governing behavioral tasks. 25 Because of deficits in outcome evaluation, we hypothesized a behavioral intervention targeting engagement in correct actions would be effective. This simple approach has parallels in motor rehabilitation; for example, in constraint-induced movement therapy after stroke, affected limb use is forced by restraining the functional limb. 52
We used a simple intervention: cueing correct responses, which had the function of engaging the rat in the targeted behavior. This substantially improved flexibility (Fig. 4A) and impulsivity (Fig. 6A). In the PbR task, associating cues with the higher-probability option enabled rats to stay with choices through wins and losses (Fig. 4C-D), indicating that TBI rats can use salient signals to guide correct behavior. Prior to rehabilitation, there was no injury effect on our three-session version of the AST (Fig. S2 in the Supplementary Material), highlighting the importance of task selection for deficit identification. Had we found an effect, we could have used this to stratify and match rehabilitative treatment. However, an AST variant using within-session changes (more similar to PbR task) did detect frontal TBI deficits. 10
In intact rats, reversal learning requires integration of the OFC, PFC, and ventral striatum. 11,53,54 Engagement in target behaviors under the intervention on the PbR task recovered a deficit in OFC activity, as measured by c-Fos staining (Fig. 5B), providing further evidence for a role of the OFC in mediating outcome salience deficits. The frontostriatal regions involved in behavioral flexibility are sensitive to changes in dopamine, with impairments from both increased and decreased dopamine activity. 55,56 Interestingly, despite reports of decreased dopamine signaling after TBI, 20 there was a significant injury-related increase in c-Fos+/TH+ co-labeled cells in the injury-adjacent prelimbic cortex, which could indicate changes in dopamine or norepinephrine regulation of this region (Fig. 5A).
In the DRL task, using a cue light to indicate that the 20 sec timer had elapsed substantially improved the percent of responses reinforced (Fig. 6A), again confirming the ability of TBI rats to use salient cues. To better understand this high-resolution behavior, we fit a model to the distribution of IRTs (Fig. 1C) and extracted parameters for disinhibitory impulsivity (i.e., the area under the initial curve) and waiting-related impulsivity (i.e., the timing t0 parameter). The behavioral intervention reduced disinhibitory responding and improved waiting for the time interval (Fig. 6B, 6C; Fig. 7). Critically, improvements in TBI rats persisted even after cues were decoupled from reinforcing contingencies (Fig. 6D), which was partially mediated by improvements in timing (Fig. 6F).
While this is a simplified approach to an animal model of cognitive rehabilitation, it provides a base to refine clinical applications. The development of effective paradigms is a priority for treating TBI and other neurological and psychiatric disorders. A recent successful patient study used a similar strategy to the current research: prompting engagement in target behaviors. 57 Unfortunately, these solutions are still inadequate for the population at large, 58 highlighting the need for rigorous animal experiments that can also explore neuroplastic mechanisms. Recent animal studies indicate this strategy may be applicable across a variety of injury types and locations. Hippocampal deficits were rescued after a diffuse central brain injury with 7-14 days of free exploration in a complex, changing spatial environment. 29 In a model of lateral, moderate TBI, pre-training on the Morris water maze with no external cues improved performance after injury 59 and post-injury training also improved function. 60 Combined with the current study, these data suggest that rehabilitation efforts should be designed to engage specific circuits of interest.
A corollary strategy, environmental enrichment, is also widely studied in TBI, stroke, and related disorders. Introducing novel toys, larger environments, and/or group housing improves motor- and learning-related outcomes after TBI or stroke, though cognitive effects are often less robust than motor. 61–64 It must also be considered that any extensive cognitive behavioral testing may have similarities to enrichment manipulations. For example, we do not currently know what the effect of AST training in Experiment 2a was on PbR performance in Experiment 2b. These effects will need to be parametrically assessed in future studies.
Many challenges remain to optimize behavioral interventions for TBI and similar conditions. In the field of stroke, the initiation, duration, and intensity of exercise for rehabilitation are still heavily debated, with cautionary tales against intervening too early or too late. 65,66 In the current study, we delayed treatment until the chronic period when there are pervasive deficits in impulsivity and decision-making, but no gross deficits in motivation or motor function 14,44 so as not to bias cognitive measurements (though motor function was not assessed in the current rats). The current findings suggest the window for behavioral intervention may be wider than that for motor rehabilitation (Fig. 6D); however, this timeframe needs to be defined, and earlier intervention may be more efficacious. Other timing-related approaches such as training on fixed interval schedules reduced impulsive choice in intact rats, 27,28 suggesting there may be many potential strategies to treat impulsive deficits.
Finally, in pursuit of replicable and generalizable science, experiments in this manuscript were conducted in parallel at different sites, and with critical differences in procedure (see the “Transparency, Rigor, and Reproducibility Summary” section for detail). Varying procedural details ultimately provides stronger confidence for a general effect because it reduces the chance of false positives occurring due to procedural artifact. 67 Experiment 1 was conducted in aged animals and with chambers and behavioral procedures optimized for electrophysiology. Despite these differences, the effect of brain injury (reducing flexibility) was the same. Moreover, while very different probes of pathology were applied to the studies (electrophysiology vs. histology), both identified the OFC as a nexus of this impairment. More still needs to be done to evaluate the nuances of how critical variables like age contribute to these findings and to verify that they further replicate beyond these experiments.
Overall, the current data provide a starting point to investigate cognitive rehabilitation in animal models of disease, identify potential physiological mediators (Fig. 2, 3, and 5B), and most critically, demonstrate lasting functional recovery (Fig. 6D, 6F). More research will be needed to isolate the physiological substrates and determine how behavioral interventions may be combined with other environmental, pharmacological, or neuromodulatory interventions. Although combining these disparate approaches is challenging, basic research will need to incorporate these various strategies to better promote circuit reorganization in models of disease and develop effective treatments for clinical conditions.
Transparency, Rigor, and Reproducibility Summary
The current study was performed in two different locations (Experiment 1: UCSD, Experiments 2-3: WVU). Both used the same rat supplier and breed, but likely sourced from different locations. Despite different lab locations, personnel, and slight differences in methodology and reinforcer for Experiment 1 versus Experiment 2 (water vs. sucrose reinforcer, self-paced vs. fixed trials, 8-month vs. 3-month-old rats), the results were largely concordant. While this is not direct replication, it provides additional confidence in effect generalization; that is, the effects of TBI were observed even with these protocol and outcome differences. These experiments were in male rats only as they were conducted before both laboratories began regularly using females. A total of seven rats died during or shortly after surgery, five rats were excluded due to lack of damage from the TBI, and three were excluded due to lack of electrophysiology signal. Experiment 1 was performed as a single cohort; Experiment 2 was performed across three cohorts; Experiment 3 was performed across two different cohorts. A block of four brains was damaged during histological preparation for Experiment 2, resulting in lower numbers in the TBI group. Data will be made fully available (see below) and a preprint was uploaded to bioRxiv (available at https://doi.org/10.1101/2023.07.02.547397). Electrophysiology data (Experiment 1) are also re-used in a separate publication 32 with only the sham controls from the current study to optimize processing parameters and integrate data analysis from additional implant sites. TBI and Sham were tested and recorded concurrently in the current study.
Data Availability
Data from Experiment 1 will be made publicly available at the Open Data Commons for Traumatic Brain Injury database (behavioral; search Dhakshin Ramanathan) and DANDI: Distributed Archives for Neurophysiology Data Integration (electrophysiology; search Dhakshin Ramanathan). Data from Experiments 2 and 3 will be made publicly available at the Open Data Commons for Traumatic Brain Injury database (search Vonder Haar). Associated analysis code will be publicly posted to the corresponding author's GitHub (https://github.com/VonderHaarLab/).
Footnotes
Acknowledgments
We would like to thank Anastasios Lake, Virginia Milleson, Kristen Pechacek, and the other members of the Injury and Recovery Laboratory for helping with behavioral testing as well as Tianzhi Tang, Sidharth Hulyalkar, and Alyssa Terry for help with behavioral training, electrophysiology, and data analysis.
Authors' Contributions
Miranda Koloski: Designed research, performed research, analyzed data, wrote manuscript. Christopher O'Hearn: Designed research, performed research, analyzed data, wrote manuscript. Michelle Frankot: Performed research, analyzed data. Lauren P. Giesler: Performed research, analyzed data. Dhakshin Ramanathan: Designed research, wrote manuscript. Cole Vonder Haar: Designed research, analyzed data, wrote manuscript.
Funding Information
This research was funded by the National Institutes of Health (R01-NS110905, P20-GM109098, P30-CA23100, T32-MH018399), VA Office of Research and Development (Career Development Award to DR, Career Development Award IK2BX006125 to MK), VA Excellence for Stress and Mental Health, the Southern Regional Education Board, and West Virginia University.
Author Disclosure Statement
No competing financial interests exist.
Supplementary Material
References
