Abstract
This article argues that the dual-process position can be a useful first approximation when studying human mental life, but it cannot be the whole truth. Instead, we argue that cognition is built on association, in that associative processes provide the fundamental building blocks that enable propositional thought. One consequence of this position is to suggest that humans are able to learn associatively in a similar fashion to a rat or a pigeon, but another is that we must typically suppress the expression of basic associative learning in favour of rule-based computation. This stance conceptualises us as capable of symbolic computation but acknowledges that, given certain circumstances, we will learn associatively and, more importantly, be seen to do so. We present three types of evidence that support this position: The first is data on human Pavlovian conditioning that directly support this view. The second is data taken from task-switching experiments that provide convergent evidence for at least two modes of processing, one of which is automatic and carried out “in the background.” And the last suggests that when the output of propositional processes is uncertain, the influence of associative processes on behaviour can manifest.
Preface
This article is based on the inaugural Mackintosh Lecture given at the 20th Associative Learning Symposium held at Gregynog, Easter 2016. It stems from discussions that Nick Mackintosh and I had during a visit he paid to Exeter a few years before during which we agreed to update the McLaren, Green, and Mackintosh (1994) position on implicit and explicit learning, which was eventually published as McLaren et al. (2014). At that time, it became clear that we had more to say than could be contained in that update and that in making the case for the multiple (as opposed to single) process view of learning we had, in large part, neglected to say exactly what our take on it was. This is an attempt to redress that omission by making clearer the nature of the “dual” or multiple process theory of learning that Nick and I subscribed to, as well as offering some evidence for it. As such, it is sadly missing one author, but I would like to acknowledge the role that Nick played in developing these ideas over the course of many years.
Ian McLaren
The debate as to whether human learning, memory, and cognition (but particularly human learning, which is our focus here) are best considered to be due to a single set of propositional processes or rather a combination of these with other, associative processes is ongoing (see, for example, De Houwer, 2009; Mitchell, De Houwer, & Lovibond, 2009). This article is an attempt to chart a course through this debate and provide a dual-process account that avoids many of the well-founded criticisms often levelled at this class of theory by advocates of the single process position. At the same time, it does not shy away from placing associative processes at the very centre of our dual-process account and postulates that propositional processing is built upon associative foundations (for an earlier and very brief statement of this position, see the last few pages of McLaren et al., 2014). The basic computational process is taken to be association, but by implementing this in a complex, controlled, recurrent architecture, propositional, or what we shall term cognitive processing, emerges. The argument is that we are capable of propositional thought because the more basic, associative elements that make up our mind combine in such a fashion as to allow symbolic computation. This is the key idea that underpins what follows and has been at the core of the first author’s work over the past 30 years.
It has always been apparent that the modal dual-process account (see, for example, McLaren et al., 1994), which implied (even if it did not explicitly state) that associative and cognitive processes ran in parallel and independently of one another, was only a useful approximation to the true state of affairs. Why would two completely independent systems for learning evolve in the first place? Surely one would exploit the potential of the other—the view taken here—with rule-based processing being constructed from associative computational elements. But if this is the case, then it raises a number of questions that need to be answered. Why has the assumption of independent processes operating in parallel proven so useful? Can we find evidence that challenges this view while still supporting a dual-process account? The rest of this article will try to answer these questions by expanding on the novel interpretation of the dual-process position offered here.
We will also try to answer some other questions that have proved problematic for this debate over many years. Why is evidence of implicit learning so hard to come by (cf. Shanks & St John, 1994)? Why are “associative” effects often shown, on careful analysis, to be rather small or, in some cases, non-existent? And more specifically, why is human Pavlovian conditioning, in a number of experiments, demonstrably driven by conscious, cognitive expectancy rather than being a simple function of the reinforcement schedule as in the case of the rat (see Lovibond & Shanks, 2002, for all of these issues)? Surely, associative processes should apply equally in both cases (rat and human) and inevitably lead to the same outcome. The answer to all these questions emerges naturally from our position: that we are propositional entities constructed from an associative substrate. If this is the case, and our ability for propositional thought is to be of any effective use, then it must be given primacy in controlling behaviour as a default. This will require either active suppression of any associative outputs or the facilitation of cognitive processing so that it largely excludes the possibility of associative processes influencing behaviour to any marked extent. The net result will be to make it hard to detect evidence for associative processes under normal circumstances, and even when such effects are found, they will tend to be small. Furthermore, if propositional processing is the basis for control of much of behaviour, then conscious expectancy will correlate well with that behaviour, even when it contradicts what might be expected by an appeal to associative learning based on the contingency between stimulus and outcome.
Before embarking on a more detailed exposition of these points, and considering some evidence we have available that bears on them, a few other considerations are worth airing. One is why there is such widespread belief that associative processes are part of our psychological makeup. The answer to this question is apparently very simple; everyone has day-to-day access to evidence that it is the case. Or rather, they think they have. We can all call to mind times when we have done something without consciously thinking of doing it. From this observation, we deduce that conscious, propositional processing is not necessary for control of behaviour, and so the case for associative processes is made. There are at least two things wrong with this logic. One is that this point is about memory, and not necessarily about learning. There are many propositional theorists happy to allow that retrieval of previously learned episodes or instances can drive behaviour without any necessity for further propositional analysis at the time of retrieval (e.g., Mitchell et al., 2009). The associative theorist at this point might be tempted to say that these instances are associatively retrieved, and so were associatively learned, arguing that it makes little sense to postulate associative memory without associative learning. This is an internally consistent account of the phenomenon, and may even be true, but it does not have to be the case. It is quite possible that learning requires conscious awareness and propositional analysis at the time of learning, that the products of that learning do not tap into awareness in the same way, and that retrieval is not associative in the sense that it required earlier associative learning. A memory that is content addressable as a result of earlier propositional processing is entirely viable and would produce a similar result.
We must also beware of immediately identifying the process of automatisation with associative processing. This would be one explanation for how things become automatised, and there is a great deal of evidence that supports a distinction between the controlled and the automatic. Associative processes may themselves be automatic, yes, but the transition from the controlled to the automatic does not have to be a transition from the propositional to the associative. In fact, taking the view expressed here makes it quite unlikely that this would straightforwardly be the case. An interpretation of automatisation as associative learning “taking over” from propositional processing implies two relatively separate and independent systems, which is exactly the view that we are arguing against. It would make more sense for automatisation to be viewed as some more complex combination of the propositional and the associative from this standpoint. This is an issue that we will defer until a later article.
It is clear, then, that there is work to be done to convince the wider psychological community of the correctness of the position articulated in this article. It is not as obvious as might be thought that a dual-process account of human learning is correct, and it is fair to say that the role of associative processes in human mental life could be construed as somewhat uncertain at present. How, then, are we to proceed? The answer adopted here is to use the theoretical framework given earlier to specify circumstances where associative processes might reveal themselves. We can expect to find this evidence when people are, in some sense, uncertain and so cannot apply the usual level of cognitive control to their responses. Other circumstances where cognitive control of responding might be weak are after a lot of practice leading to automatisation of the response (but here the interpretation of the effect is an issue), where learning is incidental (so that people are not consciously using the relevant information to control responding—note that this can be different to being unaware of that information), or where the instructions given favour associative processing (a form of mental judo where we pit propositional processes against themselves to reveal associative influences). In what follows, we offer examples of some of these (automatisation being the exception). The first experiment focuses on human Pavlovian conditioning using a bi-conditional discrimination (AX+, AY–, BX–, BY+) employing electrodermal conditioning and expectancy and skin conductance response (SCR) as dependent measures. This provides evidence that conditioning can be under the control of conscious cognitive expectancy in some circumstances, but not in others, even in the same participants in the same experiment. 
The second experiment looks at another bi-conditional discrimination, but this time in the context of a task-switching experiment employing reaction time (RT) and errors as our measures of performance. The focus in these experiments is both on learning the stimulus → response mappings in the experiment and on how the current trial is affected by the preceding trial. Here, we are able to offer evidence for an important feature of this account, the notion that while associative processes might not be in control of behaviour at some point, they are automatic in their operation and so associative learning continues “in the background” at the same time. Our last experiment concerns the Perruchet effect, which many consider to provide some of the best evidence for a dual-process account of learning. In this experiment, the focus is entirely on the influence of earlier trials on later ones, and the basic effect is that this influence dissociates for measures of expectancy and measures of performance (RTs and motor-evoked potentials [MEPs]). In our treatment, we attempt to carefully dissociate components of this effect and conclude that we can only make sense of our data if we are able to appeal to both propositional and associative processes.
Human Pavlovian conditioning: the bi-conditional discrimination
Our departure point for this experiment was the work of Peter Lovibond (see, for example, Lovibond, 1992, 2004; Lovibond & Shanks, 2002). One of his key findings is that Pavlovian conditioning with humans, typically using an electrodermal paradigm, produces a conditioned response (CR, typically a change in skin conductance) in those who become aware of the contingencies between conditioned stimulus (CS, typically a visual stimulus) and unconditioned stimulus (US, in this case the shock), but no CR in those who do not possess this knowledge (following earlier work by Dawson & Biferno, 1973). This knowledge can be assessed by post-experiment interview/questionnaire or by online measurement of expectancy of shock on a trial-by-trial basis—the result is the same. Only those participants aware of the contingencies show conditioning, and so he concludes that conditioning is mediated by conscious expectancy and is propositionally driven by an analysis of the form: “The CS has occurred, it will be followed by a shock, and that’s going to be unpleasant.”
We have observed a similar correspondence between online expectancy and CR in this paradigm, in that those participants who showed a CR also expected a shock. Our problem has been that we have had very few participants (approximately 2%; the percentage is a good deal higher in Lovibond’s studies) who were unaware of the contingencies, which meant we were unable to meaningfully test whether they produced a CR or not. There are a number of possible reasons for this difference between our studies and Lovibond’s, but as will become clear, we have no reason to doubt Lovibond’s empirical claim. Our first experiment tries to replicate his effect by resorting to a somewhat more complex design, a bi-conditional discrimination, which others have already shown is difficult for animals (including humans) to learn (Harris & Livesey, 2008; Harris, Livesey, Gharaei, & Westbrook, 2008). The motivation for using this design was that this complexity/difficulty would result in fewer participants becoming explicitly aware of the contingencies and so allow us to test whether conscious cognitive expectancy really does predict whether a CR will occur or not. We added a further refinement to this study: after training, we continued with a test phase in extinction that involved the introduction of novel stimuli. This was done to create uncertainty in the minds of our participants once they realised that the experimental conditions had changed. We give only a brief account of the experimental procedures here, but it should nevertheless be sufficient to give a clear idea of how our results were obtained.
Experiment 1
The experiment had a 2 (CS1/CS2) × 2 (yellow/blue background) mixed-measures design. Each of the 72 participants was pseudo-randomly assigned to one of the counterbalanced sets of the two conditions and received £5 in exchange for taking part. The two CSs used were a grey cylinder and a grey square embedded on either a yellow or blue background—both colours chosen to ensure the best possible perceptual discrimination. The red background presented during test was selected to minimise generalisation from these colours (see Figure 1).

Figure 1. Schematic showing counterbalanced stimulus and background combinations for training (each participant received either 1.1 or 1.2; six trials of each CS with that background colour and outcome contingency) and test (one of 2.1-2.6; one trial with each CS on a given background in the order shown). In addition, order of CS presentation was counterbalanced at test.
The method and procedure followed those used by McAndrew, Jones, McLaren, and McLaren (2012). During the experiment, two dependent variables were obtained: SCR, measuring participants’ autonomic response, and conscious expectancy ratings (on a scale from 1 to 5). Skin conductance was measured on one hand using finger electrodes on the third and fourth fingers, and shocks designed to be “uncomfortable but not painful” (normally 5-20 mA, lasting 500 ms) were delivered to the index finger of that hand through stainless steel electrodes. Conscious expectancy was measured using the other hand with five buttons labelled as follows: 1, “There will definitely not be a shock”; 2, “There might not be a shock”; 3, “Not sure either way”; 4, “There may be a shock”; and 5, “There will definitely be a shock.”
The training phase was identical (except for counterbalancing) for all participants. There were two blocks of trials that contained three presentations of each compound, making 12 trials per block. On each trial, a shape appeared for 5 s, on a coloured background (yellow/blue). During stimulus presentation, participants rated their expectancy of shock. On CS+ trials, the shock occurred during the last 500 ms of the compound presentation. The background colour of the next trial was on screen during the intertrial interval. For skin conductance to return to baseline, and to prevent people from predicting the beginning of the next trial, the intertrial interval varied randomly between 30, 35, and 40 s. To avoid habituation to the shocks, participants were recalibrated between the two blocks. The test phase comprised a single presentation of each compound and one presentation of each shape on the new red background, making six trials in all, with the order counterbalanced across participants (see Figure 1). At the end of the experiment, a structured interview assessed the participants’ awareness of the experimental contingencies, dividing them into “Aware” and “Unaware” groups.
Results
The post-experimental interview found 45 participants who were aware of the contingencies and 27 who were not. This classification was arrived at by simply asking participants to describe the contingencies between the two stimuli, background colour, and shock. If they could describe at least one component of the contingencies (e.g., pink square on yellow background leads to shock, brown cylinder does not), they were classified as “Aware.” Analysis is based on these two groups of participants (Aware and Unaware), with awareness included as a factor.
Training: expectancy data
Figure 2 gives the mean expectancy ratings for Aware and Unaware participants averaged over training. These ratings were analysed using a 2 (CS1 vs. CS2) × 2 (yellow vs. blue background) × 2 (Aware vs. Unaware) repeated-measures analysis of variance (ANOVA), in which CS1 was the CS+ on the Blue background and CS2 was the CS+ on the Yellow background. If the bi-conditional discrimination has been learned successfully, the CS × Background interaction should be significant, as expectancy of a shock to the CS should be dependent on the background it was presented in. No main effects were significant, but the three-way interaction between CS, Background, and Awareness was significant, F (1, 68) = 83.15, p < .001,

Figure 2. Mean expectancy ratings (y-axis, scale of 1 [do not expect shock] to 5 [expect shock]) for the CS+ and CS– (averaged over background colour) in the Aware and Unaware groups during training (as defined by the post-experiment interview).
A significant CS × Background interaction was observed in Aware participants, F (1, 44) = 250.67, p < .001,
Training: skin conductance data
During training, the raw skin conductance measure was obtained by subtracting the average SCR during the 5,000 ms before the onset of the CS (“pre-CS”) from the average SCR during the first 4,500 ms of the CS presentation. To control for individual variability in the baseline response, all SCR data were scaled logarithmically (see McAndrew et al., 2012, for details). The log-transformed data were analysed using a repeated-measures ANOVA with the same factors as the expectancy analysis.
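The baseline-corrected measure described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the sampling rate argument, and the list-of-samples input are hypothetical, and `log_scale` is only a placeholder transform (a sign-preserving log(1 + |x|)), since the exact logarithmic scaling is given in McAndrew et al. (2012) rather than here.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def raw_scr_change(samples, sample_rate_hz, cs_onset_s):
    """Mean conductance in the first 4.5 s of the CS minus the 5 s pre-CS mean."""
    onset = int(round(cs_onset_s * sample_rate_hz))
    pre_start = onset - int(round(5.0 * sample_rate_hz))
    cs_end = onset + int(round(4.5 * sample_rate_hz))
    if pre_start < 0 or cs_end > len(samples):
        raise ValueError("recording does not cover the scoring windows")
    return mean(samples[onset:cs_end]) - mean(samples[pre_start:onset])


def log_scale(delta):
    """Placeholder transform (an assumption): sign-preserving log(1 + |x|)."""
    return math.copysign(math.log1p(abs(delta)), delta)
```

For example, a hypothetical 10 Hz recording that sits at 2.0 µS for the 5 s before CS onset and at 2.5 µS afterwards gives a raw change of 0.5 µS from `raw_scr_change(samples, 10, 5.0)`.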
The three-way interaction between CS, Background, and Awareness was once again significant, F (1, 68) = 11.27, p = .0013,

Figure 3. Mean log-transformed SCR (y-axis: Log(∆SCR) in µS) for CS+ and CS– averaged over both backgrounds in Aware and Unaware participants during training. Aware participants conditioned successfully, indicated by higher SCR to CS+ than CS–. Unaware participants showed no differential conditioning.
Test: expectancy data
Expectancy ratings (see Figure 4, top panel) taken during the test phase were analysed using a 2 (CS1/CS2) × 2 (yellow/blue) × 2 (Aware/Unaware) repeated-measures ANOVA. The data obtained for the CSs presented on the red background were omitted from analyses as they did not generate any differential effects on either measure, and this enabled us to use the same approach taken in analysing the training data. As expected, the three-way interaction between CS, Background, and Awareness was significant, F (1, 68) = 16.28, p < .001,

Figure 4. Top panel: Mean expectancy rating (y-axis, scale of 1 [do not expect shock] to 5 [expect shock]) for the CS+ and CS– in the Aware and Unaware groups during test. Bottom panel: Mean log-transformed SCR (y-axis, log(∆SCR) in µS) for CS+ and CS– in Aware and Unaware participants during the test phase. In both cases, the values for the CS+ and CS– are averaged across both backgrounds.
Test: skin conductance data
The log-transformed skin conductance changes (see Figure 4, bottom panel) used for the analysis of test trials were the difference between the “post-US” and the “pre-CS” period. The post-US period was defined as the 5 s after CS termination (i.e., just after the shock would have been delivered in training). This was not used during training because any SCR due to CS presentation would have been confounded with the direct effects of shock, but we have previously found this measure to be more sensitive when testing in extinction.
A repeated-measures ANOVA was run with CS, Background, and Awareness as factors. This time there was a significant CS × Background interaction, F (1, 68) = 4.69, p = .034,
Discussion
An interesting picture emerges from our analysis of this experiment. During training, awareness of the contingencies governs whether participants show differential expectancies or skin conductance changes indicating that they have solved the bi-conditional discrimination. If they are aware of the contingencies, then they show an effect; if they are unaware of them, they do not. This is what we would expect if cognitive expectancy was driving the SCR measure. An explanation of these data that simply pointed to expectancy of shock as causing the autonomic response accompanying that combination of stimulus and background covers the facts. In fact, it covers the facts rather better than simply explaining the pattern of responding in the Aware participants; it can also provide an explanation of the data provided by the Unaware participants. If we look at their expectancy ratings, we see that they give a middling score around 3, which is distinctly higher than that to CS– in the Aware group (but lower than that group’s rating for CS+). If we take that as indicating that they expect to get shocked about half the time (which was the case) but do not know when the shock will occur (i.e., they believe that it is stochastic), then we can explain this rating by appealing to this belief and also explain their results on the skin conductance measure (see Figure 3). This last is also distinctly higher than that to CS– in the Aware group, which suggests that they have learned to expect shocks to some extent, but are simply unable to predict when they will occur. In short, we can argue that one group (Aware) knows when shocks will be delivered, and another group (Unaware) knows that shocks will be delivered, but not when. Both are equally frightened of the shocks, and so the Unaware group simply gives the same reaction to CS+ and CS–, a reaction similar to that to CS+ for the Aware group. 
The picture here, then, is one of control of autonomic responding by conscious cognitive expectancy in both groups, an account entirely congruent with a single, propositional system governing learning and performance.
A different picture emerges on test, and it is here that the single process, propositional account of learning encounters difficulty. If we agree that expectancy was driving skin conductance changes in training, then what should we expect on test? Given the expectancy ratings taken at test (which are very similar to those obtained during training), the answer must be that we would expect very similar results to those observed in training, but actually the pattern is quite different. Instead of the significant difference in differential expectancy being accompanied by a similar pattern in the autonomic measure, we now have a decoupling of these two measures, with skin conductance changes in both groups being broadly similar irrespective of the strong differences in expectancy.
This is quite an intractable result for a single process, propositional account to accommodate. We have already noted that, on this account, given the difference in differential expectancy between Aware and Unaware groups, we would expect this to show up in SCR. The crucial result here is not so much that there is evidence of conditioning in the Unaware participants, as this could be explained by appealing to learning later on during training that only shows up on test. But on a single process account, this would imply that expectancy, taken during test, should also show differential effects in this group (and typically larger ones than for this autonomic measure). It is the decoupling of the two measures at test, when they clearly were correlated during training, that is problematic and would not obviously be predicted by a propositional account.
A dual-process account of the type envisioned here, however, has no difficulty with these results. It is happy to concede that performance during training was under the control of cognitive expectancy, but acknowledges that the uncertainty induced by the introduction of a novel background and the altered contingencies at test changes this. Instead of propositional processes dominating performance, the uncertainty means that they relinquish control, and as there is not enough time during test for a new set of propositions to be generated to control behaviour, the associative learning that has taken place “in the background” during training can now manifest. As the contingencies were the same for Aware and Unaware participants, the extent of the associative learning is the same for both groups, and so the effects on skin conductance are the same. Cognitive expectancy, however, is not driven by associative learning but by beliefs, and so defaults to roughly the values arrived at during training (again because there is not sufficient time for them to alter in response to changing circumstances).
For this account to be viable, at least two things require further examination. One is the role of uncertainty in allowing the expression of associative learning in controlling behaviour, and we will return to this point when considering our last experiment. The other is the notion of “learning in the background” as applied to associative processes, which simply instantiates the idea that such learning is automatic and will occur as a result of exposure to the relevant contingencies. On our account, it is the expression of this learning that is by no means guaranteed and is in fact unlikely if propositional processing is in control of behaviour (see, for example, Jones & McLaren, 2009, for a similar argument). Our case would be considerably strengthened if we could demonstrate that associative learning does occur “in the background,” and this is the issue we examine next.
Task switching: the bi-conditional revisited
To tackle this issue, we now consider a quite different paradigm to electrodermal conditioning that nevertheless uses the same bi-conditional design. This is cued task switching, in which a cue is given which signals the type of decision to be made to a subsequent stimulus. A typical procedure for this task involves telling the participants the rules for the task and what cues will be available to signal which set of rules apply. The participant is then asked to respond as rapidly and as accurately as possible once the stimulus appears, making this a standard RT task of the type often used by experimental psychologists. In the paradigm we will consider, two types of decision have to be made to the stimuli, which are one of the four digits, 1, 2, 7, or 8. The first task, cued by either a blue or a green circle, is to decide whether the stimulus (the digit) is odd or even. The second task, cued by either a yellow or a red circle, is to decide whether the stimulus is higher or lower than 5. These decisions demand binary responses, which involve pressing either a left (for “odd” and “lower”) or a right (for “even” or “higher”) key. The experiment proceeds by presenting cues, stimuli requiring a response, and then giving feedback for that response before moving on to the next trial. The decision rule cued sometimes changes from one trial to the next, a “switch,” or stays the same, which is designated a “repeat” trial. Typically, in these experiments, performance is better on repeat rather than switch trials—something taken to indicate the need to reconfigure task set when switching from one decision rule to the other. This difference is known as the “switch cost” and is one of the key measures of performance in this type of experiment.
The effect cueing has on the response required depends on the stimulus presented on that trial. If the stimulus in our example is either 1 or 8 (known as “congruent” stimuli), then the cue is, in effect, irrelevant because a “1” is both odd and lower than 5, requiring a left response for either task, and 8 is both even and higher than 5, requiring a right response in all cases. But if the stimulus is either 2 or 7 (“incongruent” stimuli; see Figure 5), then the cue does determine the response to be made on that trial to that stimulus, and the response (right or left) will differ according to the task cued. The difference in performance to the congruent and incongruent stimuli is known as the “congruency effect” and is defined as mean RT or errors for the incongruent stimuli minus the same for the congruent stimuli. It is typically positive, indicating that performance to the incongruent stimuli is somewhat poorer than to the congruent stimuli, a result that is often also taken to indicate persistence of task set from one trial to another. This will not affect performance to congruent stimuli (because the response required is not affected by task) but will cause interference on some occasions for the incongruent stimuli (when the response required to the current stimulus by the task in play on the previous trial is different to that required by the currently cued task).
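The cue, task, and response rules described above can be written out as a short Python sketch, which also derives which digits are congruent directly from those rules. The task labels "parity" and "magnitude" and the function names are hypothetical labels introduced for illustration only.

```python
# Cue colours and the task each one signals (from the description above).
CUE_TO_TASK = {
    "blue": "parity",
    "green": "parity",
    "yellow": "magnitude",
    "red": "magnitude",
}
DIGITS = (1, 2, 7, 8)


def correct_response(task, digit):
    """Left key for "odd"/"lower", right key for "even"/"higher"."""
    if task == "parity":
        return "left" if digit % 2 == 1 else "right"
    if task == "magnitude":
        return "left" if digit < 5 else "right"
    raise ValueError(f"unknown task: {task}")


def is_congruent(digit):
    """A digit is congruent when both tasks call for the same key."""
    return correct_response("parity", digit) == correct_response("magnitude", digit)


congruent = [d for d in DIGITS if is_congruent(d)]
incongruent = [d for d in DIGITS if not is_congruent(d)]
```

Applied to the four digits used here, `congruent` evaluates to `[1, 8]` and `incongruent` to `[2, 7]`. Only for the incongruent digits does the cue change the required key.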

Figure 5. The basic cued task-switching paradigm used in this article (shown right) and an alternative perspective on it afforded by associative learning (shown left).
But, as can be seen from inspection of Figure 5 (left side), there is another way of construing this task that reveals it to be a combination of the bi-conditional discrimination already considered in our first experiment, plus a simple component discrimination, if we take the cue as playing the role of the context (i.e., background colour in our earlier Pavlovian conditioning experiment), and the stimulus (digit) as the CS. The congruent stimuli simply act as CSs paired with invariant responses—this is the simple discrimination. The incongruent stimuli, in combination with the cues, amount to a bi-conditional because the response required to a stimulus varies according to the cue that it is paired with. This realisation led to the following idea, exploited in Forrest, Monsell, and McLaren (2014). Giving our participants the rules (the Task condition) made the paradigm a conventional task-switching experiment, but if instead we simply gave them the mappings from cue + stimulus to response or even required them to learn these mappings by trial and error (our cue + stimulus → response or CSR condition), then this was now much more akin to an associative learning experiment. Forrest et al. (2014) did exactly this and found that the pattern of performance in the two conditions was quite different, with those in the CSR condition showing a significantly larger effect of stimulus congruence than those in the Task condition, that is, performance to congruent stimuli (the simple discrimination) was much better than to incongruent stimuli (the bi-conditional), and this difference was significantly larger than that observed in the Task condition. 
There was the converse effect for switch cost (mean performance on switch trials – mean performance on repeat trials), however, with this measure being considerably (and significantly) higher in the Task condition than in the cue + stimulus → response condition where the mappings had to be learned in the absence of any overarching rule. Forrest et al. (2014) concluded that there was good evidence for two different modes of processing being used by participants when doing what was effectively the same task, with the performance of the group learning the individual mappings in line with that produced by associative models of learning.
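The switch cost measure defined above (mean performance on switch trials minus mean performance on repeat trials) can be sketched as follows. This is an illustrative Python example only: the task labels, the data layout, and the RT values are assumptions, not the analysis code used by Forrest et al. (2014).

```python
from statistics import mean

def switch_cost(trials):
    """Mean RT on switch trials minus mean RT on repeat trials.

    `trials` is a list of (task, rt) pairs in presentation order. The first
    trial has no predecessor, so it counts as neither switch nor repeat.
    """
    switch_rts, repeat_rts = [], []
    for (prev_task, _), (task, rt) in zip(trials, trials[1:]):
        if task != prev_task:
            switch_rts.append(rt)
        else:
            repeat_rts.append(rt)
    return mean(switch_rts) - mean(repeat_rts)

# Hypothetical sequence of cued tasks and RTs in ms.
trials = [("parity", 600), ("parity", 580), ("magnitude", 700),
          ("magnitude", 620), ("parity", 720), ("parity", 600)]
cost = switch_cost(trials)
```

A larger value of `cost` indicates a greater penalty for changing task from one trial to the next.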
Experiment 2
We used this paradigm to see whether participants were able to change from one mode of processing to the other when instructed to do so (to anticipate—they were) and whether while explicitly using one mode they were able to learn about the task in a similar fashion to participants explicitly engaging with the other mode. The design is shown in Table 1.
Table 1. The design of the task-switching experiment.
Task indicates that participants were asked to use the rules shown in Figure 5. CSR indicates that they had to learn the cue + stimulus → response mappings by trial and error.
In essence, participants started the experiment using either CSR or Task instructions, and then halfway through the experiment they were asked either to change to the other set of instructions (half our participants) or to stick with their current instructions (the other half). We used the same procedures as in Forrest et al. (2014) and the cues, stimuli, and responses already discussed in our introduction to this experiment. On each trial, a cue colour (one of the two assigned for that task) was presented either 100 or 1,200 ms before the stimulus digit appeared. Participants then had to respond using their left or right index finger to press the corresponding (“z” or “/”) key. The response-to-stimulus interval was kept constant at 1,700 ms by controlling the ITI appropriately. There were 20 blocks of 49 trials each, in which the probability of a switch from one task to the other was one-third so that people did not tend to anticipate the switch. Each of the four groups had 16 participants in it after exclusions for not following instructions, which was checked by means of a post-experiment structured interview. The results for the congruency measure are shown in Figure 6, as this is a measure that directly captures the difference between performance on the simple discrimination and the bi-conditional.

Figure 6. Results for the congruency effect (i.e., median performance on incongruent stimuli minus that on congruent stimuli) over the two halves of our experiment broken down by group. The oval draws attention to the two points for Task-CSR and CSR-CSR that do not differ significantly, although the former has only received half the training under CSR instructions of the latter. The implication is that the Task-CSR group has learned the associations between cue + stimulus → response “in the background” while operating under Task instructions.
Results and discussion
The two groups that did not experience a change in instructions over the course of the experiment showed different patterns of performance. Task-Task had the Task instructions throughout, and showed a relatively stable and small congruency effect as we would expect based on our previous studies (Forrest et al., 2014). CSR-CSR, however, produced a much larger congruency effect (again, as expected) and showed a marked decline from the first half of training to the second half. We can interpret this in terms of them more easily learning the simple discriminations involving the congruent stimuli (which would produce a large initial congruency effect as performance on the incongruent stimuli would be relatively poor at this stage) and then acquiring the bi-conditional discrimination involving the incongruent stimuli over the course of training, thus reducing the difference in performance between them and the congruent stimuli. Group CSR-Task shows a change from a relatively large congruency effect under CSR instructions to a relatively small one under Task instructions as might be expected from the patterns of performance in our two “pure” groups, but the group of real interest here is Task-CSR. This group shifts from a low congruency effect under Task instructions to a higher one when learning the individual mappings, but the key issue is how high? Do they exhibit the kind of effect that might be expected if the mappings that now have to be learned are entirely novel to them, similar to that displayed by the CSR-CSR group in the first half of training, or do they produce the same magnitude of effect as the CSR-CSR group in the second half of training after they have made progress in learning the bi-conditional? 
The answer is that they do the latter; their congruency effect in the second half of training is not significantly different from CSR-CSR in the second half of training, but is significantly different from the first half value for this group, F(1, 30) = 4.41, p < .025, d = .595. This implies that they are roughly as good at using this instruction set as participants that have practised it for 10 blocks. Can we say, then, that they have been able to learn the CSR mappings “in the background” while using the Task instructions?
We think the answer to this question is yes and that the strongest evidence for this position comes from a novel use of the state-trace methodology introduced by Bamber (1979) and further developed by ourselves (Yeates, Wills, Jones, & McLaren, 2015). Figure 7 plots the median RT performance (we get similar results for errors) on each block for the incongruent stimuli that form the bi-conditional discrimination against the median performance for the same block on the congruent stimuli that comprise the simple discrimination. We do this separately for each group, starting with the 2nd block (the 1st and 11th are discarded as practice blocks during which instructions are provided), and then we connect up the points for each group in block order using arrows to show the transition from one block to another. This gives us detailed information about performance on congruent and incongruent stimuli on a block-by-block basis for each group that is visualised as a graph in Figure 7. We call these state-trace trajectories, and they reveal some interesting characteristics of performance on this task under the two instructional sets.
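The construction of a state-trace trajectory described above can be sketched in a few lines of Python. This is a minimal illustration under assumed inputs: the tuple layout, function name, and RT values are hypothetical, and it assumes every retained block contains at least one congruent and one incongruent trial.

```python
from statistics import median

def state_trace(trials, skip_blocks=(1, 11)):
    """Return [(block, median congruent RT, median incongruent RT), ...] in block order.

    `trials` is an iterable of (block, congruent, rt) tuples, where `congruent`
    is a bool. Blocks listed in `skip_blocks` are discarded as practice blocks,
    following the text.
    """
    by_block = {}
    for block, congruent, rt in trials:
        if block in skip_blocks:
            continue
        cell = by_block.setdefault(block, {True: [], False: []})
        cell[congruent].append(rt)
    return [(block, median(cell[True]), median(cell[False]))
            for block, cell in sorted(by_block.items())]

# Hypothetical RTs in ms for one group.
trials = [(1, True, 900), (1, False, 1000),
          (2, True, 600), (2, True, 620), (2, True, 640),
          (2, False, 700), (2, False, 720), (2, False, 760),
          (3, True, 580), (3, False, 690)]
trajectory = state_trace(trials)
```

Plotting the third element of each tuple (incongruent) against the second (congruent), and joining successive points with arrows, gives one trajectory per group.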

Figure 7. State-trace trajectories for the four groups in this experiment. The left panel shows how the trajectory for Task-CSR compares with Task-Task and CSR-CSR, and the right panel serves a similar function for CSR-Task. Filled symbols denote CSR instructions, and open symbols denote Task instructions. Solid lines denote no change in instruction, and broken lines indicate a change in instruction halfway through the experiment. The arrows track the progression from early to late blocks.
The first is that the two sets of instructions produce different functions that lie in different parts of the congruency space. Performance under Task instructions (open symbols) lies on one roughly linear function (the lower one on each graph) and that under CSR instructions (filled symbols) on another (the upper function on each graph). This supports the idea that our instructional manipulation leads to the use of different sets of processes to perform the task (cf. Forrest et al., 2014). It’s not hard to see how this would come about. Taking the Task instructions case first, performance in these groups is generally better (they have been told the rules and so know how to do the task) which means RTs are generally lower. The CSR groups have to learn the mappings, and this takes time and leads to slower RTs (and more errors). As the experiment progresses all the groups benefit from practice reducing RTs on both sets of stimuli, but the CSR groups benefit more (they have more to learn) and so show generally steeper gradients.
The really striking thing about this plot, however, is how it captures the transition from Task to CSR and from CSR to Task instructions. The former (left panel) progresses until roughly halfway along the Task function while Task instructions are in force and then cuts across to a similar point (i.e., halfway) on the CSR function when the instructions change to learning the individual mappings. The latter (right panel) progresses along the CSR function, but then quite clearly shifts to the start of the Task function at the beginning of the second half of the experiment when the instructions change. This graphically illustrates “learning in the background” for participants trained under Task conditions then switched to CSR instructions and strongly implies that no equivalent occurs when participants are trained under CSR conditions and then switched to Task instructions. It makes sense that performance using a rule in the CSR-Task group during the second half of the experiment would not benefit from practice under CSR instructions when they were not using that rule. This is all that is needed to explain why they go back to the “start” of the Task function on our plot when their instructions change from CSR to Task. But the only plausible way to explain why this does not happen for the Task-CSR group, and that instead they transition to the point on the CSR function that the CSR-CSR group has arrived at when beginning the second half of the experiment, is to argue that they must have benefited from practice under Task conditions; that is, this practice, while using rules to make decisions about responses, nevertheless led to some learning of the individual mappings required under CSR instructions. Our conclusion is that if we agree with Forrest et al. (2014) that performance under CSR instructions is mediated by associative processes, then there is good evidence here for their automatic operation even while participants are engaged in rule use, resulting in learning “in the background” that manifests when circumstances permit, but no evidence for rule learning while engaged in associative processing. Once instructional set permits, the latent learning of the individual mappings can manifest in the control of performance after a switch from Task to CSR conditions.
This brings us to the final issue to be addressed in this article: when do circumstances permit human behaviour to be influenced by associative processing? We will focus on one of the answers that we have offered, namely that this is likely to happen when the outcome predicted by propositional processing is in some sense uncertain. This is not the only set of circumstances that afford control of behaviour by associative processes, but as we have already noted, it does seem to capture some aspects of the results in our first experiment. Our next experiment looks at this in the context of one of the key sources of evidence for associative processing in humans: the Perruchet effect.
The Perruchet effect: RT, MEPs, and expectancy
The Perruchet effect (Perruchet, 1985) is one of the most straightforward and robust pieces of evidence for a dual-process view of learning in humans. A stimulus is followed by an unpredictable (50:50) occurrence of an outcome, and expectancy for that outcome and another response measure dependent on the outcome are measured online. The essence of the effect is that while expectancy of the outcome tends to decrease over a run of outcome occurrences (the so-called “gambler’s fallacy” result, Burns & Corpus, 2004), the other response is facilitated, making it unlikely that one is driving the other. In Perruchet’s original experiments, the paradigm was eyeblink conditioning to a tone (the CS) that was followed by a puff of air (the US) 50% of the time. The other response (the CR) was blinking, and the expectancy measure a rating taken during the trial. He found that as expectancy of an airpuff decreased over runs of reinforced trials, the blink CR increased in both vigour and frequency in his participants.
This phenomenon can also be demonstrated using the electrodermal conditioning paradigm discussed earlier (see McAndrew et al., 2012). Here runs of trials where the CS (a coloured shape) is followed by the US (shock) lead to a decrease in the expectancy of shock but an increase in SCR to the CS. Conversely, when runs of the CS followed by no US occur, expectancy of shock goes up but the CR diminishes. McAndrew et al. introduced a slightly different method of analysis for their data that considered the mean CR on negative runs (trials which had been immediately preceded by at least one extinction trial, that is, −1, −2, −3) versus that on positive runs (trials which had been immediately preceded by one or more reinforced trials, that is, +1, +2, +3). They called this the Extinction/Excitation factor and constructed a complementary factor of Level which averaged over this factor to measure the effect of run length within either positive or negative runs. The effect of the Extinction/Excitation factor was non-significant in their study, but there was an increase in CR with Level (and a corresponding decrease in expectancy) where Level 1 corresponded to the mean for runs of +1 and −3, Level 2 to that for +2 and −2, and Level 3 to that for +3 and −1. It was this within-run type of effect that was taken as evidence of the Perruchet effect, and Perruchet (2015) has also used this type of analysis in a recent review. We will use this effect over Level (with Level 1 corresponding to the mean of performance to the +1 run length and the largest negative run length for which we have useable data, and maximum Level corresponding to the mean of the largest positive run length and the −1 run) as our index of what we will call the “true” Perruchet effect in what follows.
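The Level and Extinction/Excitation factors described above can be made concrete with a short Python sketch. It takes mean CRs already aggregated by run length and applies the pairing stated in the text (Level 1 averages +1 with −3, Level 2 averages +2 with −2, Level 3 averages +3 with −1). The function names and values are hypothetical, and weighting each run length equally in the Extinction/Excitation means is an assumption of this sketch rather than a documented detail of the original analysis.

```python
from statistics import mean

def level_means(by_run, max_run=3):
    """Level k averages run +k with run -(max_run + 1 - k)."""
    return {k: mean([by_run[k], by_run[-(max_run + 1 - k)]])
            for k in range(1, max_run + 1)}

def extinction_excitation_means(by_run, max_run=3):
    """Mean CR over negative runs and over positive runs (equal weight per run length)."""
    negative = mean([by_run[-k] for k in range(1, max_run + 1)])
    positive = mean([by_run[k] for k in range(1, max_run + 1)])
    return {"negative": negative, "positive": positive}

# Hypothetical mean CRs keyed by signed run length.
by_run = {-3: 1.0, -2: 2.0, -1: 3.0, 1: 3.0, 2: 4.0, 3: 5.0}
levels = level_means(by_run)
ee = extinction_excitation_means(by_run)
```

In this made-up example both factors show an effect: the CR rises across Level and is higher after positive than after negative runs, which shows how the two components can be separated in the same data set.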
There can be an effect of Extinction/Excitation (i.e., negative vs. positive runs), however, which is typically included in the standard run length analysis of the Perruchet effect. This would now seem to be a quite separate component of the overall effect from that attributable to Level. Weidemann, McAndrew, Livesey, and McLaren (2016) have offered an analysis of eyeblink conditioning in humans that makes just this point, showing that there is a strong effect of Extinction/Excitation that is relatively unaffected by variations in the stimulus used as the CS. They used a procedure whereby one stimulus was used as CS+ and the other as CS– to produce the standard 50:50 reinforcement sequence across all trials used for the Perruchet effect. The similarity of CS+ to CS– did not greatly affect the Extinction/Excitation effect which was quite large in all their experiments, but an increase in CR with Level was only detectable if the CS+ and CS– were very similar to one another (i.e., they were effectively acting as one CS as in the conventional Perruchet effect), and not if they were quite dissimilar (thus promoting differential conditioning).
Perhaps the best evidence for the Extinction/Excitation component of the Perruchet effect, however, comes from Verbruggen, McAndrew, Weidemann, Stevens, and McLaren (2016). The procedure used was very similar to that shown in Figure 8, but in this experiment, the runs of reinforcement and non-reinforcement were completely predictable. For five trials, a brown cylinder was followed by the requirement to press a response key (Go), and then five trials were given in which no response was needed. This cycle repeated throughout the experiment, and they measured RTs, expectancy of having to press the key, and motor evoked potentials (MEPs, measured at the finger in response to sub-threshold stimulation of motor cortex) for the hand used to make the response. They found that participants’ expectancy of pressing the key (which was entirely veridical as the runs were completely predictable) rather surprisingly did not predict the MEPs on at least some of the trials. In particular, the first Go trial of a run (which corresponds to run length −5) had a much lower MEP than the other Go trials (which are run lengths +1, +2, +3, and +4) even though its expectancy rating was similar to theirs. In fact, the MEP for the first Go trial was more akin to that for the NoGo trials corresponding to the other negative run lengths (−1, −2, −3, and −4) which had much lower expectancy ratings, whereas the first NoGo trial of a run (run length +5) had a much higher MEP than the other NoGo trials and was instead more like that of the other Go trials (which corresponded to run lengths +1, +2, +3, and +4; see Figure 10, leftmost lower panel, for a summary of their results that facilitates comparison to those of the experiment reported later in this section). Another way of putting this is to say that the plot of MEP by run length quite naturally captured the form of the data, while the plot by sequence position (i.e., first, second, third, etc. Go trial or first, second, third, etc. NoGo trial) did not do this for MEPs, but did quite naturally accommodate variations in expectancy.
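The distinction between run length and sequence position can be illustrated with a small Python sketch that codes each trial by the signed length of the run preceding it. The function name and the boolean encoding of trials are assumptions made for illustration.

```python
def signed_run_lengths(outcomes):
    """For each trial, return the signed length of the run that precedes it.

    `outcomes` is a list of booleans (True = Go, False = NoGo). A value of +n
    means the trial was preceded by n Go trials in a row, and -n means it was
    preceded by n NoGo trials in a row. The first trial has no history and
    is coded as None.
    """
    runs = [None]
    for i in range(1, len(outcomes)):
        length = 1
        # Extend the run backwards while earlier trials match the previous one.
        while i - length - 1 >= 0 and outcomes[i - length - 1] == outcomes[i - 1]:
            length += 1
        runs.append(length if outcomes[i - 1] else -length)
    return runs

# Fully predictable design: five Go trials, then five NoGo trials, repeated.
sequence = ([True] * 5 + [False] * 5) * 2
runs = signed_run_lengths(sequence)
```

Under this coding, the first Go trial of a run in the predictable design has run length −5 and the first NoGo trial has run length +5, which is why these two trials sit at the extremes of the run length plot.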

Figure 8. Go and NoGo trials were presented in quasi-random order to give runs of trials of a given type following a binomial distribution with a maximum length of five. The diagram gives an example of the trial sequence changing from NoGo to Go. Each trial started with the presentation of a blank screen. After a variable time interval, a brown cylinder appeared, and participants rated the extent to which they thought the NoGo stimulus would appear. After 5 s, the Go stimulus “peanut butter” or NoGo stimulus “brown sugar” appeared. The Go stimulus remained on screen until a response was made, whereas the NoGo stimulus disappeared after 2 s. A TMS pulse was delivered at one of two different time points in a trial, either 2.5 s into the blank interval (Pulse 1) or immediately after the participant had indicated their expectancy rating (Pulse 2).
How are we to explain this? Obviously, expectancy is not driving the MEPs in this experiment, although conscious controlled processes are driving overt responding (this must be the case given that participants made very few mistakes). Our analysis is that the MEPs reflect processes other than those due to cognitive control and instead index more automatic effects. Once we have taken this position, the most straightforward explanation for these results is simply that if a response has been given on one trial, then this automatically facilitates responding on the next trial. Thus, the MEP is boosted for positive runs, and this applies even to the first NoGo trial (which is preceded by a Go trial), but not to the first Go trial (which follows a NoGo trial). The interpretation that naturally follows from this analysis is that the Extinction/Excitation effect is actually a priming effect, whereby the CR (pressing a key in this case, blinking in the eyeblink paradigm) is facilitated by repetition. It is only in the case of electrodermal conditioning, where rapid habituation to the US occurs, and so there may be no discernible priming of the SCR, that we have been unable to demonstrate this effect.
Here, we report a Perruchet sequence RT experiment (Destrebecqz et al., 2010; Perruchet, Cleeremans, & Destrebecqz, 2006) using the same procedure as in Verbruggen et al. (2016). The idea was to see whether the response priming that was found in that experiment also appeared under these conditions and whether there was a component of the usual run length effect corresponding to response priming, and another corresponding to the Level effect (i.e., the “true” Perruchet effect). It also gave us the opportunity to add MEPs to our usual dependent variables of RT and Expectancy. If MEPs more directly assay automatic (potentially associative) effects, this would help us interpret the results for the other response measures.
Experiment 3
Our experiment had 16 participants with a mean age of 20 years. We were confident (based on our previous work) that we could obtain a Perruchet effect on the RT and expectancy measures with this N using these procedures, but it remained to be seen if we would have sufficient power to get reliable MEP effects. We, in part, addressed this issue by focusing on data for runs up to ±3, as this ensured more observations at each run length, improving the reliability of the MEP data. All participants were right-handed, screened for any relevant health issues, and paid £15. The procedure used is shown in Figure 8. Participants were told that a brown cylinder would be presented as a warning cue, and then they had to make a response with their left hand if the outcome that followed was “peanut butter” but not if it was “brown sugar.” During the presentation of the brown cylinder (the CS), they were asked to rate their expectancy of the outcome being “brown sugar” from 1 (very low) to 9 (very high) using their right hand to press one of nine buttons.
MEPs for the left hand were recorded in response to a TMS pulse that was delivered on each trial either 2.5 s into the ITI (Pulse 1) or immediately after an expectancy rating during CS presentation (Pulse 2), as in Verbruggen et al. (2016). The first pulse is intended to capture the effects of responding (or not) on the previous trial during the ITI, while the CS is not present. The second pulse can then be used to assay the impact of CS presentation on MEPs by comparison with the first pulse. The delivery of a pulse during the CS (Pulse 2) was contingent on the participant making a rating, which could be at any point during the 5-s CS period; on average, this occurred at 1.28 s after CS onset. However, if a rating was not made (0.97% of trials), then a pulse was delivered as the CS terminated to ensure a pulse was delivered on each trial. Testing took place over 12 blocks with an average of 29 trials per block, with an enforced break between each block to allow for readjustment of the coil and for the participant to rest. There were four practice trials prior to the start of the experiment, two Go trials and two NoGo trials presented in a randomised order; no stimulation was given during these trials.
Results and discussion
We recorded RTs, expectancy ratings for the outcome that did not require a response (1 = lowest, 9 = highest), and MEPs on each trial. We focused attention on the outcome not requiring a response because McAndrew et al. (2012) had shown that this was an effective procedure for obtaining the RT version of the Perruchet effect. Raw expectancy ratings were transformed into 10 − rating (we have shown in previous experiments that this gives a measure of expectancy for the outcome that did require a response). Thus, a score of 1 now indicates that participants definitely did not expect to make a response (i.e., a NoGo trial), and a score of 9 that they definitely did expect to respond (i.e., a Go trial). Figure 9 shows the results for RT and expectancy on Go trials up to run lengths of ±4 (there are no +5 data for Go trials given the maximum run was 5 in a row). It also shows the effects across Level and Extinction/Excitation computed on the −3 to +3 runs. This was done because of the relatively few data points at the ±4 run lengths, especially when split by pulse for the MEP analysis.
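The rating transformation described above amounts to reflecting the 1 to 9 scale about its midpoint. A minimal Python sketch follows; the function name and the range check are illustrative assumptions.

```python
def go_expectancy(nogo_rating):
    """Convert a 1-9 expectancy rating for the NoGo outcome into a 1-9
    expectancy score for the Go outcome, using 10 minus the raw rating."""
    if not 1 <= nogo_rating <= 9:
        raise ValueError("rating must be between 1 and 9")
    return 10 - nogo_rating
```

A raw rating of 9 (NoGo very likely) therefore becomes a Go expectancy of 1, and the midpoint rating of 5 is unchanged.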

Figure 9. Top panels show RT and lower panels show expectancy. The leftmost panels give the means by run length where, for example, −4 denotes trials preceded by 4 NoGo trials and +1 denotes trials preceded by 1 Go trial. The middle panels give the results by Level, where those for Level 1 correspond to the average of −3 and +1 run lengths, for Level 2 to −2 and +2 run lengths, and so on. The rightmost panels show the overall means over positive and negative runs. Error bars give SE for the mean.
RT and expectancy analysis
We can see the typical Perruchet effect in the run length plot shown in the left panels of the figure. As run length changes from −4 to +4, RT decreases (participants speed up) at the same time as expectancy of having to make a response declines. This general pattern of effect manifests in both Level (middle panels) and Extinction/Excitation (right panels). Analysed across runs of +3 to −3 (for consistency with other analyses), there was a significant decreasing linear trend across run type, F (1, 15) = 14.81, mean square error (MSE) = 0.220, p = .002,
MEP analysis
Figure 10 (top panels) gives our MEP results analysed in the same way (i.e., by run length, Level and Extinction/Excitation) as the RT and expectancy data and also includes the data from Verbruggen et al. (2016) analysed in similar fashion for comparison (lower panels, although note that in this case Levels 1-5 can be reported as each run length has the same amount of data contributing to it). The electromyographic (EMG) signal was recorded on each trial in the 10 to 90 ms interval after stimulation. The amplitude was defined as the difference between the maximum and minimum EMG signal in this 80 ms interval. Trials on which the coil had drifted more than 7 mm away from the target hotspot (6.75% of trials) were excluded from analysis to ensure that TMS stimulation remained focused in the motor hotspot.
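The amplitude definition and exclusion rule above can be sketched in Python as follows. The sampling rate, the representation of the EMG trace as a list of samples, and the function names are assumptions made for illustration, since the text does not specify them.

```python
def mep_amplitude(emg, sample_rate_hz, pulse_index, start_ms=10, end_ms=90):
    """Peak-to-peak EMG amplitude in the window from start_ms to end_ms after the pulse.

    `emg` is a sequence of samples and `pulse_index` is the sample at which
    the TMS pulse was delivered.
    """
    start = pulse_index + round(start_ms * sample_rate_hz / 1000)
    end = pulse_index + round(end_ms * sample_rate_hz / 1000)
    window = emg[start:end + 1]
    if not window:
        raise ValueError("EMG trace does not cover the analysis window")
    return max(window) - min(window)

def keep_trial(coil_drift_mm, max_drift_mm=7):
    """Trials on which the coil drifted more than max_drift_mm are excluded."""
    return coil_drift_mm <= max_drift_mm

# Hypothetical trace sampled at 1000 Hz with the pulse at sample 0.
emg = [0.0] * 200
emg[5] = 1000.0    # large deflection before the window opens, ignored
emg[30] = 400.0    # peak inside the window
emg[35] = -300.0   # trough inside the window
amplitude = mep_amplitude(emg, sample_rate_hz=1000, pulse_index=0)
```

Starting the window 10 ms after stimulation means that any deflection immediately following the pulse, such as the one placed at sample 5 here, does not contribute to the amplitude.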

Figure 10. Top panels show MEP data for the current experiment and lower panels show MEP data from Verbruggen et al. (2016) cast into the same form. The leftmost panels give the MEPs by run length, where −4 denotes the MEP on a trial preceded by 4 NoGo trials and +1 the MEP for a trial preceded by 1 Go trial. The middle panels give the results by Level where those for Level 1 correspond to the average of −3 and +1 run lengths, for Level 2 to −2 and +2 run lengths, and so on. The rightmost panels show the overall means over positive and negative runs.
Following the approach taken by Verbruggen et al. (2016), we analysed each pulse type separately. There was a significant increasing linear trend over run length for Pulse 1, F (1, 15) = 7.23, MSE = 4,536,249, p = .017,
There can be little doubt that we have found some of the effects observed by Verbruggen et al. (2016) in our data, both in our RT and our MEP results for both Pulse types, although the function relating run length to expectancy is quite different. One commonality is that the MEPs for Pulse 2 are significantly lower than those to Pulse 1 in both experiments, and we follow Verbruggen et al. (2016) in attributing this to the development of inhibition around the time of Pulse 2 to prevent premature responding (see also Duque & Ivry, 2009; Duque, Lew, Mazzocchio, Olivier, & Ivry, 2010). Another commonality is that in both measures performance was facilitated (lower RT, higher MEP) on trials that followed Go trials relative to those after NoGo trials, that is, for positive runs compared to negative runs. The lack of any effect on expectancy supports the claim that it is a form of response priming that is manifesting here and that the facilitation is not the product of conscious preparation. The significant effect for the Pulse 1 data supports this claim and cannot easily be explained by reference to CS–US associations either. Taken together with Verbruggen et al. (2016) and Weidemann et al. (2016), the evidence is now very strong for this component of the Perruchet effect.
But we may have more than this. When we look at the effects over Level, we find the “true” Perruchet pattern here for some (but not all) of our measures. RT significantly declines over Level and so does expectancy—so as people come to expect to not have to respond, responses actually speed up. And this RT and expectancy pattern is accompanied by an increase in MEP across Level for Pulse 2, but not for Pulse 1. Admittedly, we do not have a significant interaction in this experiment to support our claim that the effect for Pulse 2 is different to that for Pulse 1, F(1, 15) = 2.08, p = ns, but, given that the result for Pulse 2 in Verbruggen et al. (2016) is a numerical decline across Level, the marginally significant increasing linear trend in our data is perhaps more impressive than it might seem at first glance. There is evidence in this experiment (under conditions of uncertainty) for a different pattern for Pulse 2 when contrasted to that observed in Verbruggen et al. (2016) under conditions of complete predictability, and the trends in these two experiments are significantly different if we allow ourselves to make this comparison, F(1, 15) = 4.57, p = .04. It would seem that the rather strong response priming effect is able to manifest under both sets of conditions for Pulse 1, but that it is only when outcomes are uncertain that we see RT varying across level in the same fashion as MEP and then only for Pulse 2 (and relatively weakly). Note that if MEP were to directly translate into RT, then we would expect there to be an increase in RT across Level in Verbruggen et al. (and apart from the first Go trial being slow, no effect on RT was observed). Thus, we do have some evidence consistent with the position that weak associative effects will be more easily detected under conditions of uncertainty—when propositional processing is not able to settle on a definite belief as to the expected outcome.
General discussion and conclusion
At the beginning of this article, we asked a number of questions that have helped shape its content. Perhaps the primary question that acted as a precursor to all the others was why would two learning systems, one associative and one propositional, evolve in parallel? Our answer to this question is that they wouldn’t and they didn’t. We are supporters of the dual-process position, yes, but our interpretation of that position is that associative processes provide the fundamental computational substrate and that propositional processes are built out of them by deploying these associative processes in complex architectures (see McLaren et al., 2014; Verbruggen, McLaren, & Chambers, 2014, for similar statements)—two sets of processes, but one system for learning.
If this last point is not entirely clear, then consider the following. In our introduction, we referred to what might be thought of as the “traditional” dual-process account, in which associative learning and propositional learning co-existed side by side and operated independently of one another. Using our current terminology, this would amount to the “two processes, two systems” version of the dual-process approach to human learning and performance. What we are suggesting instead is that this simple separation of processes into systems is not the case, and that what we have is one system that under some conditions (most of the time) will appear to consist of propositional processes, but under other circumstances can show characteristics of associative processing. And the way that we explain how this comes about is to construct the propositional from the associative.
If this is so, then why has the parallel independent process assumption, often either explicitly made or implicitly assumed by dual-process theorists, proven so useful? In essence, this is a question about the nature of the interaction between cognitive and associative processes, and we have been at pains to make a case for that interaction being asymmetric and powerful. We are convinced that cognitive processes can prevent the expression of any associative learning. They don’t have to, but they can do so, and this is the default. Otherwise, we would be at the mercy of events and our environment. As an example of what might happen if this were not the case, if you saw a chair, you would inevitably sit in it because of the long-standing association between stimulus and response. If this is not to be the outcome, then the expression of associative learning has to be inhibited by cognitive control in most circumstances. However, associative processes do support learning in the background. This learning might not inevitably be expressed, but it does automatically take place. Given this asymmetry between learning and performance for our two sets of processes, it’s not hard to see how they might quite often mimic dual independent processes operating in parallel. In particular, “knocking out” cognitive processes in a variety of ways might well reveal a seemingly independent associative component to performance.
Is there evidence that challenges the type of dual-process account that postulates independent systems running in parallel? We believe we have provided some here. Our first experimental example showed that human behaviour could be correlated with beliefs, and then later decoupled from them. To re-iterate and clarify our explanation of the data from Experiment 1, we see it as a demonstration of the transition from the cognitive to the associative. In the first phase of Experiment 1, during training, cognitive control of behaviour is visible. Our participants arrive at one of two possible sets of beliefs based on their experience during training: Either they realise that one CS is paired with shock on one background colour, and vice versa for the other, or they form the impression that shocks occur about 50% of the time but the CS does not predict the shock. Both sets of beliefs are correct and consistent with the programmed contingencies, but the former is obviously a more complete characterization of those contingencies than the latter. Nevertheless, their responding as assayed by expectancy ratings and SCR changes reflects these beliefs quite precisely. Our “Aware” group know what the contingencies are, rate the CS+ highly and the CS– significantly lower, and show a similar pattern in terms of SCR change. The “Unaware” group are actually aware of some aspects of the contingencies, but not others, and consequently rate both CS+ and CS– fairly highly (as 50% partially reinforced stimuli) and also show no differential responding in terms of change in SCR, but quite large changes in SCR to each stimulus. We conclude that both groups are using expectancy of shock to drive their SCR responses to the stimuli at this point in the experiment.
But at test, the situation changes. Participants are given new information that challenges their beliefs and, in time, would cause them to change them. A new background colour appears, and no further shocks occur. What are they to make of this? Our argument is that they continue to give expectancy ratings in line with their earlier beliefs because they have not had time to adjust them; at this point, they simply do not know what to believe, only that things have changed. But this triggers a relaxation of control, and expectancy now becomes decoupled from changes in SCR, which instead can now reveal the influences of the associative processes which were active and supporting learning during training, but whose expression was masked by the cognitive control exerted at that point in time. The net result is that (weak) conditioning is revealed at the same level in both groups of participants as they had been exposed to the same contingencies for the same length of time.
This pattern of results is hard to explain on a dual-process account where both sets of processes operate independently and in parallel. If the autonomic behavioural response (SCR) were to be taken to be due to the combination of independent associative and propositional mechanisms, with the latter only effective in the case of the Aware participants, then admittedly we could make some progress in explaining our data. If the associative contribution was small, then this could explain why differential conditioning was so much stronger in the Aware group and correlated with expectancy. But expectancy was still high for this group at test, so this difference between the two groups should have persisted during test and did not. This points to a lack of independence between the two hypothetical systems, with, in this case, the cognitive giving way to the associative. Our argument is that this lack of independence is more plausibly and elegantly explained by associative and cognitive processes being bound up within the same computational system, rather than postulating separate systems and then constructing some post hoc method of interaction between them.
Our second experimental example provided evidence for associative learning “in the background,” something we needed to postulate in our explanation of the results of Experiment 1. In this experiment we were able to show that people could either learn which task rule governed the contingencies between stimulus and response based on a cue that signalled which task was in play (the Task condition), or they could more laboriously learn a set of cue + stimulus → response mappings (the CSR condition). These two rather different sets of instructions led to different performance profiles on the same set of contingencies. This, in itself, is an interesting result, and can be used to argue for different sets of processes governing human learning and performance, but what makes it quite compelling evidence for two different modes of processing in humans is the fact that changing from one set of instructions to the other had asymmetric effects on performance. If participants changed from Task to CSR instructions, then essentially they picked up on the CSR version of the experiment as though they had previously been trained on it instead of on the Task version. But if they were changed from CSR to Task, this was not the case. In these circumstances, they had to learn the Task version “from scratch.” Our analyses, both in terms of conventional statistical tests on RTs and accuracy, and the more recently developed technique of using state-trace plots, support this contention. It is this asymmetry that argues against a standard dual-process account.
At first sight, one might think that the results of Experiment 2 are largely explicable by an appeal to associative and cognitive processes running independently and in parallel. On this account, each of our different conditions would have some mixture of cognitive and associative processes contributing to learning and performance, with a greater contribution from associative processes under CSR instructions. We agree that the ideas of independence and automaticity underpin the claim that associative learning can happen “in the background.” But if this is the case, then, on the standard dual-process account, we should see evidence of their effect even when rule-based instructions are in play in the Task condition, which makes it very strange that when switching from CSR to Task instructions our participants apparently return to a near-naïve level of performance. Where is the expected associative contribution to performance? We would argue that the most plausible explanation is that once propositional processing is engaged (as exemplified by rule use), associative contributions to performance are suppressed. Why should this be so? As we have already argued, the most obvious answer to this is that if propositional thought is made possible by controlled associative processing, then it is not surprising that when this control is engaged the expression of associative learning per se is inhibited by that same control. To a first approximation, we are arguing for two modes of operation here, one propositional and one associative, with the former suppressing the latter. But this suppression is not some artificial add-on that instantiates competition between two different systems. It is, instead, a consequence of one set of processes (associative) acting as the substrate for the other (propositional). 
It is this arrangement that leads to the asymmetry in our data, as the more cognitive mode of operation allows for learning on an associative basis, but the associative mode does not, by definition, engage any propositional processes.
Our final experimental example considered the Perruchet effect, which, defined as the effect over Level, is already known to pose real problems for a single process account; and we contrasted this with a predictable version of our design that produced strong evidence for response priming. Ironically, the challenge that this dataset poses for the dual-process account is that the response priming effect is clearly a large part of the Perruchet effect defined over run length, and there is the possibility that this effect may not be associative in nature. We cannot rule this possibility out, but note that there is an explanation in terms of associative learning for the response priming observed in the two conditions. In fact, there are two types of explanation for the response priming effect most clearly seen in the Predictable condition that we can think of. One is to simply attribute it to residual activation of units that support the response—a non-associative and non-propositional account. These units are activated when a response is made, that activation persists for some time, and when the response next has to be made, the units’ activation starts from a higher baseline than if the response had not occurred on the previous trial, hence priming. This explanation fits the facts and is particularly suited to explaining the strong response priming found at Pulse 1 in the ITI, but some questions remain about its ability to cover all of our data. First, can enough activation persist to produce the strong priming effect also seen much later in the trial at Pulse 2? And second, how does this account explain the decline in MEP across Level? Of course, some post hoc explanation of this pattern in our data is possible. We can imagine that the units in question become progressively fatigued due to chronic activation, and this leads to less and less priming over the run (though this would not explain the effect for negative runs).
All this notwithstanding, there is another explanation for response priming in terms of associative processes that quite naturally explains the trend across Level, can deal with delay, and applies a similar analysis to that used for the Perruchet condition (a build-up of associative strength over runs) to the Predictable condition. The only caveats here are that we need to postulate an internal representation that is available to associate with outcomes when they occur and that includes some representation of what occurred on the previous trial. This internal representation can be modulated by external stimuli (the context, the CS), but exists (i.e., is active) in some form at all times. The component due to the previous trial is perhaps best modelled using recursion as in the Simple Recurrent Network (SRN) (Elman, 1990), although simple associative chaining will suffice in this instance. In simulations that we have run of this account using a feedforward backpropagation network (Rumelhart, Hinton, & Williams, 1986), the role of the internal representation is taken by the hidden units en masse, as they have a resting (zero input) activation of 0.5, and in addition, we have units representing the outcome of the preceding trial as input for the current trial. The idea is simple: When a response is required or elicited, the internal representation and the units representing the previous trial associate with some representation of either that response or the outcome that provokes the response. As a consequence, over time, the component of the internal state representing the previous trial will pick up considerable associative strength, as the outcome on the previous trial predicts the outcome on the current trial 80% of the time. It is this component that gives the basic (and large) response priming component in the Predictable condition.
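The build-up of strength on the previous-trial component can be illustrated with a minimal sketch. It is not the backpropagation network used in the simulations mentioned above. It assumes a simple linear delta rule, a hypothetical always-active "state" unit and a hypothetical "previous outcome" unit, and it applies trial-averaged (expected) updates rather than sampling a trial sequence. The learning rate and number of steps are arbitrary choices.

```python
def expected_delta_rule(p_repeat=0.8, lr=0.2, steps=1000):
    """Trial-averaged delta-rule updates for two input weights.

    w_state: weight from an always-active internal-state unit (assumed coding).
    w_prev:  weight from a unit that is on only when the previous trial
             contained the outcome (assumed coding).
    p_repeat: probability that the current trial repeats the previous one.
    """
    w_state, w_prev = 0.0, 0.0
    for _ in range(steps):
        # Expected error after a positive trial: the previous-outcome unit
        # is on and the outcome occurs with probability p_repeat.
        err_after_pos = p_repeat - (w_state + w_prev)
        # Expected error after a negative trial: the previous-outcome unit
        # is off and the outcome occurs with probability 1 - p_repeat.
        err_after_neg = (1.0 - p_repeat) - w_state
        # Positive and negative trials are equally frequent in the long run.
        w_state += lr * 0.5 * (err_after_pos + err_after_neg)
        w_prev += lr * 0.5 * err_after_pos
    return w_state, w_prev


w_state, w_prev = expected_delta_rule()
print(round(w_state, 3), round(w_prev, 3))
```

Under these assumptions the previous-outcome weight ends up well above the state weight, which is the sense in which that component carries most of the prediction when one trial reliably predicts the next.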
When we probe with Pulse 1 in the ITI, we see response priming dependent on internal state activation at that time, and this will give a large MEP on those trials preceded by a response due to that component of the input activation being present. Pulse 2 results are not a problem for this account as associations persist over time, although the effect might be expected to be weaker as more time will have passed allowing some extinction to occur. The real benefit that this explanation of response priming offers, however, lies in its explanation of the trend over Level, and we turn to this next.
When the CS occurs, the representation of the CS also associates with the outcome that follows it. But, in doing so, it competes with the other representations (which are typically more salient) for associations with the outcome or response representation. Initially, it will tend to be overshadowed as a consequence, but will gradually build up an association over a run of positive trials. Because the CS representation is not active during the ITI, it will not contribute to Pulse 1 effects; these will be driven entirely by the internal state including previous outcome representations. The result of the competition between CS and the other input representations, however, is that the latter will gradually lose strength over a run of positive trials exactly because the CS representation acquires greater associative strength over that run, and the combination of CS and other input representations is being trained to the same fixed asymptote. Hence, we are able to explain the decline in Pulse 1 MEPs for positive runs in Figure 10. There should be relatively little effect over negative runs for this pulse, as the strong association that will have formed between units representing a response on the previous trial and the outcome will not be activated. Instead, any component due to the previous trial will now be predicting no outcome. The pattern for Pulse 2 can be expected to be different to that for Pulse 1 because in this case, the CS representation is added to the other representations and the net effect should be relatively stable. Hence, there should not be a marked decline for positive or negative runs. This last claim may seem to run counter to the data for Pulse 2, but note that actually there is no overall downward trend in the Pulse 2 data for negative runs because of the higher MEP to −1 trials.
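The competition described here relies on a shared error term. As a minimal sketch, assuming a Rescorla-Wagner-style update (again, not the network simulations referred to earlier) and hypothetical salience values, the following shows only the overshadowing component: a weakly salient CS trained in compound with a more salient previous-outcome representation gains less strength than it would alone, while the compound as a whole approaches the same fixed asymptote. It does not attempt to reproduce the trial-by-trial trends across Level.

```python
def train_compound(saliences, trials=20, asymptote=1.0):
    """Rescorla-Wagner-style updates for cues that always occur together.

    saliences: dict mapping cue name -> learning-rate parameter (assumed values).
    """
    strengths = {cue: 0.0 for cue in saliences}
    for _ in range(trials):
        # All cues present on the trial share one summed prediction error.
        error = asymptote - sum(strengths.values())
        for cue, alpha in saliences.items():
            strengths[cue] += alpha * error
    return strengths


# Hypothetical saliences: the previous-outcome representation is more salient.
compound = train_compound({"previous_outcome": 0.3, "cs": 0.1})
cs_alone = train_compound({"cs": 0.1})
print(compound, cs_alone)
```

Because the cues are trained to a common asymptote, whatever strength one cue holds is unavailable to the others, which is the constraint the argument above depends on.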
Hence, the large response priming effect can be attributed to the build-up of strong associations between representations of the previous outcome and the outcome on the current trial, driven by the fact that if the previous trial required a response then the next trial is very likely to and vice versa; and the trend over Level can be attributed to cue competition between the CS and these other input representations. This simple account using well-established principles of associative learning can, to a first approximation, explain the Predictable condition results and also generate the Perruchet effect seen when there is no reliable relationship between one trial and another (it is basically the standard associative explanation of this phenomenon). In the case of the Perruchet design, it is worth noting that both CS and state representations will work together to produce the increasing linear trend seen over run length and Level—which can explain why it is still possible to get a Perruchet effect in this type of RT paradigm without using a CS (see Mitchell, Wardle, Lovibond, Weidemann, & Chang, 2010, analysis of this). Of course, we will still expect to see response priming in the ITI due to the association that will form from state to outcome, but there will not be any reliable prediction based on the outcome of the previous trial to drive this effect this time, and so response priming will be considerably weaker, and any cue competition effects will be weaker as well.
This concludes our analysis of the data we have presented in this article, but we have a few final thoughts for researchers working in this field. The first is for readers that find our arguments for this version (or indeed any version) of a dual-process account unconvincing. We urge you to reconsider, to the extent that you explicitly take into account the possibility that both sets of processes may contribute to behaviour when designing your experiments. Of course, one can simply assume that all learning is propositional in nature and be confident that the data obtained from any experiment will be susceptible to some post hoc cognitive interpretation. Anything can be explained with propositions after the fact (and anything can be explained with associations as well—both types of system are complete computationally, cf. McCulloch & Pitts, 1943), but this does not offer much by way of predicting behaviour, and does not explain why the types of effect considered in this article should exist. It did not have to be the case that human learning and behaviour turned out this way; no propositional account demanded that it be so. So simply asserting (correctly) that there are propositional explanations available for these data does not really move us forward.
Another version of this argument is to take the view that propositions are, in some sense, the computational primitives for human learning and that associations are constructed out of them. It could be argued that this is also a dual-process view, but one that is complementary to that espoused here that has associations as the computational substrate for propositions. Both accounts are dual process, both can explain these and other results, so do we really care which is correct? Does it matter? The answer given here is yes, it does, because while both sets of processes are in play in both accounts, only one of them (to our mind) can successfully explain why, when circumstances are such as to undermine the usual levels of learning and performance achievable by humans, what emerges seems best explained in terms of associations rather than propositions. As cognition is pared back, what emerges is associative processing in its raw form. Surely this would not be expected on the basis of an account that had propositions as its computational primitives. In essence, the thesis offered here is the one implicit in the connectionist and reinforcement learning traditions (e.g., Botvinick, Niv, & Barto, 2009; McClelland & Rumelhart, 1985; Sutton & Barto, 1981, 1989; O’Reilly & Frank, 2006), and as such, it is worth pointing out that it differs importantly from the more traditional AI perspective that is fundamentally propositional in nature.
Given this, setting out to investigate associative processes in humans without bearing this dual-process analysis in mind can result in falling into the trap of designing experiments that “should” tap into associative processing (according to associative theory) but then apparently do not. Design is, on its own, not enough in these circumstances. Procedure is everything here. It is not enough, for example, to run a blocking design with humans, get blocking, and then claim that this is due to error-correcting associative learning. It may be, but unless you have taken the necessary precautions, it may also be (and probably is) due to propositionally based inference (see Le Pelley, Oakeshott, & McLaren, 2005; McLaren, Forrest, & McLaren, 2012, for examples of how to dissociate blocking due to associative processes from that due to cognitive inference). We hope that we have made a convincing case that special efforts have to be made to separate the cognitive and the associative if we start from a position where both might be in play. To some extent, the reason why the very notion of associative processing in humans is under challenge is because researchers have claimed this type of processing for circumstances where it does not have to be the case that it is controlling behaviour at all (and later on is shown not to be). Here, we have attempted to give some useful guidance on how to disentangle the cognitive and the associative, but in the long run, we would argue that this should not be the ultimate goal of our investigations. The real challenge is to better understand how cognitive and associative processes work together to produce the full range of human learning and behaviour.
Acknowledgements
The lead author would like to express his gratitude to his co-authors for their help with the work reported in this paper.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This work was supported by studentships from the Economic and Social Research Council to A.M. and W.B., an EPS grant to I.M., and a starting grant to F.V. from the European Research Council (ERC) under the European Union’s Seventh Framework Programme (FP7/2007-2013)/ERC Grant Agreement No. 312445.
