The task is not enough: Processing approaches to task-based performance

Abstract

This article reports on three research studies, all of which concern second language task performance. The first focuses on planning, and compares on-line and strategic planning as well as task repetition. The second study examines the role of familiarity on task performance, and compares this with conventional strategic planning. The third study examines the effect on task performance of different types of post-task transcription. The three studies are also examined in relation to one another for the broader generalizations that they permit. These suggest that repetition can be stronger in its effects than on-line or strategic planning, but that planning is more potent in its effects than simply familiarity with the material being spoken about. In addition, what is termed supported on-line planning and post-task transcription are associated with less error in performance. The three studies are discussed in terms of the wider second language performance processes of complexifying, rehearsing and monitoring. These processes are linked to the Levelt model of speaking, and applied to the need to analyse tasks in a manner consistent with pedagogic goals.

Keywords

TBLT (task-based language teaching), language learning task, task performance, task conditions, repetition, planning, accuracy, fluency, complexity

I Introduction

Much current research explores how tasks can be important for our understanding of second language performance, and also for the effectiveness of instruction. In a way, the transition from the label ‘communicative language teaching’ to the label ‘a task-based approach’ reflects a greater centrality for research, for empirical work, and for the recognition of the accountability to evidence of claims that are made about pedagogy. This change of label is also often associated with a greater psycholinguistic orientation to research, and with attempts to link what happens during tasks to theories of second language performance. One aspect of this is that earlier hopes that tasks in themselves would contain all that is needed for sustained second language development, and so parallel first language acquisition, have given way to accounts which recognize that, within tasks, there needs to be some degree of focus-on-form (FonF: Long and Robinson, 1998). Simply completing the task is not sufficient, important though it is. There also have to be ways in which second language learners are induced not to forget form, and instead to wrestle with form–meaning connections, so that what is developed is not simply strategies of communication but also control over a developing interlanguage system.

Even so, if we assume a FonF-informed approach, there remains the question of how tasks and task performance can best be researched. Obviously, there are many theoretical viewpoints, and each of these offers priorities and directions for research (see, for example, Robinson et al., 2009; Skehan, 2009a; Van den Branden et al., 2009). In addition, any particular theoretical viewpoint will progressively focus on questions that are seminal for that particular perspective on second language acquisition and performance, e.g. the importance of negotiation of meaning and recasting, or the relevance of information processing limitations. The difficulty that arises, as a result, is that there are many disparate studies in the field, and so the contribution that is made by any one new study may not therefore be as clear as it might be. We are too dependent on what might be called ‘one shot’ studies: studies that may be sensible in their own terms, but which do not make clear enough connection with any broader sense of progress in the field. In addition, there is the general problem that different studies use different measures, rendering comparison and meta-analyses particularly difficult (Norris and Ortega, 2006).

To combat any reliance on one-shot, independent studies, the present article will report on a series of investigations all done within the same general framework (Skehan, 1996), and using broadly the same set of measures. In this way, it is hoped that the different studies will complement one another, lead to cumulative progress, and also make for easier comparison because of the common measures. The general framework is shown in Figure 1. Essentially, all the framework does is locate research studies as focusing on the before, during, and after phases of task use (Skehan, 1998). The ‘before’ phase has, so far, emphasized planning (although many other possibilities exist), while the ‘after’ phase has mostly focused on the effects on performance of anticipating a post-task activity that will come. The figure makes clear, however, that the range of research options is much greater with the ‘during’ task phase, since here the task itself – as well as the conditions under which it is done – comes into play.

Figure 1

A framework for task-based research

There has been considerable work in each of these areas. Some of this work has been theoretically motivated. Studies of task characteristics and task conditions, for example, have been conducted to explore the respective claims of the Trade-off (Skehan, 2009a) and Cognition Hypotheses (Robinson, 2001). But the framework in Figure 1 also functions non-theoretically, as an organizing device so that individual studies can be related to one another more easily. That is how the studies reported in this article function. The emphasis, as in much research, is on contributing to our understanding of the nature of tasks and how they can be implemented. But the opportunity for making connections with actual instructional situations is slightly easier, because the linked studies relate to pre-task, during-task and post-task stages of task implementation, which correspond to ways in which teachers think about tasks, as do the task selection and calibrating features.

Three studies will be reported on here. The first explores different types of planning, with greatest emphasis on the distinction between strategic planning and on-line planning. This study also explores the effects of task repetition. The second is also concerned with strategic planning, but contrasts this with task familiarity, which is explored, in this study, as a form of planning. The third study focuses on the post-task stage, and examines whether the anticipation of having a linked activity to do after a task is finished might change the way in which the task is done. We turn next to reviewing the respective literatures underlying each of these studies.

II Literature review

The literature on the effects of planning on task performance is now extensive (Ellis, 2005). Broadly, strategic planning, i.e. time given for planning before the task itself is started, is associated with consistent significant effects for language complexity and fluency, and slightly less consistently with accuracy (Ellis, 2009). The effect sizes for complexity and fluency are also greater than those for accuracy (Skehan and Foster, in press). These effects operate across a range of tasks, including personal information exchange, narratives, and opinion/discussion tasks. They emerge for monologic and interactive tasks, although there is the generalization that planning has a greater effect when tasks are more demanding, and has less effect when a task is based on familiar information (Skehan and Foster, in press). Most studies of strategic planning have used a planning time of 10 minutes (with note-making encouraged or required, but with the notes taken away before the actual task performance; Crookes, 1989). Some studies have used other lengths of planning time and, in general, also show significant effects of planning, although these are related to how many minutes are made available for it (Mehnert, 1998; Wigglesworth, 1997).

If, then, we assume a contrast in the effects of strategic planning on complexity and fluency, on the one hand, and accuracy, on the other, some suggestions by Ellis (2005) are relevant. He distinguishes between strategic and on-line planning, where on-line planning is presumed when there is a lack of time pressure in performance. He proposes that, under these conditions, planning is possible during ongoing performance, and that in such a case on-line planning will be particularly effective in connecting with greater accuracy. He based these claims on two studies, by Ellis and Yuan (2004) and Yuan and Ellis (2003), with spoken and written performance respectively. This analysis proposes that planning is not simply one thing, and that different sorts of planning can be associated with different performance goals, with complexity more effectively influenced by strategic planning, and accuracy by on-line planning.

The literature on strategic planning has tended to assume that the operative variable is the time that is provided, pre-task, to enable participants to prepare. But this, in turn, assumes that the speaker is going to talk about something initially unfamiliar and that the planning time will enable them to get themselves into a position to speak more effectively about this ‘new’ topic. But that, of course, is not the only form of preparedness that exists. Two alternative forms of preparedness can be discussed fairly straightforwardly. First, there is the possibility that whatever is to be spoken about has already been spoken about. For example, we often retell stories and we assume that the stories get better in the retelling. Ellis (1987) explored preparedness in this way, with a condition where participants spoke a narrative after previously having written about it. Ochs (1979), an early influence on the study of planning, also conceptualized planning in this way. A particular form of this type of preparation is where a task is repeated (Bygate, 2001; Plough and Gass, 1993). In this case the previous performance is part of the research study, and so it is controlled in some way, with the advantage that the two performances – original and repeated – can be compared. So we could also think of planning in terms of previous engagement with the material involved (Carrell and Eisterhold, 1983). Related, but distinct, is a second sense of preparedness: familiarity with the content domain involved. In other words, there may be areas of experience that are very familiar, or events that have occurred many times, or domains that have been studied more formally. In each case, the ideas that are to be expressed will not need excessive conceptualization, since they are available, perhaps as schemas, in long-term memory. In such cases, the preparation has already taken place through the participant’s previous life.

In general, studies of preparedness have focused on strategic planning. It is clear, however, from the previous paragraph, that this is not the only way to conceptualize preparation; other ways of being ready to speak exist. That leads to the interesting question as to whether these different forms of preparedness have distinguishable effects upon performance. It will be a goal of the studies to be reported here to explore if this is the case, i.e. to see whether the following impact upon performance in different ways:

preparedness as time to assemble what one is going to say;

preparedness as having expressed similar thoughts before;

preparedness as a repeated performance;

preparedness as familiarity with information.

In some contrast, one can also explore the impact of post-task activities. These activities come in many forms. Pedagogically, such activities may focus on language made salient by the earlier task performance to consolidate, practise, and extend (Skehan, 2007; Willis and Willis, 2007). Such an approach would use the task performance as the vehicle for subsequent learning. A different orientation would be to explore whether anticipation of a post-task activity might change the way a task is done and how attention is directed during the task. This was the starting point for two studies by Skehan and Foster. In Skehan and Foster (1997), participants in an actual classroom worked on tasks in pairs under the ‘threat’ that they would subsequently need to give a public performance of the (private) task they were doing. This post-task condition led to greater accuracy on one task type, decision-making, but not on a personal information exchange task or a narrative. Building on this, in a subsequent study, Foster and Skehan (2011) used a different post-task condition, the need to transcribe some of one’s own task performance (Lynch, 2001, 2007), and showed significant effects for accuracy in the earlier task performance for both a narrative and a decision-making task, and also for complexity on the decision-making task. They interpreted these results in terms of attention directed selectively to form during task performance. Because the task, as it were, was not over when the task phase finished, the anticipated later activity induced learners to focus, to some degree, on form: they knew they would be confronted by their own performance later, and so were more attentive to error during the actual performance.

In the next section, we will use and extend this brief literature review as we describe three studies. Each is based on doctoral research, but each was conducted within the framework briefly outlined earlier.

III The three studies

Study 1: Strategic planning, on-line planning, and repetition: Wang (2009)

We noted earlier that there is a contrast, in the planning literature, between strategic and on-line planning. Wang (2009), the study on which this section is based, suggests that there have been some problems in earlier research with on-line planning. In Ellis and Yuan (2004) and Yuan and Ellis (2003), on-line planning was defined by the time made available for a task to be done, where the time allocated was based on how much time had been taken in a pilot study where participants were unpressured for time. It was reasoned that if participants were given less than was typically taken in unpressured conditions, this would prevent opportunity for on-line planning, whereas if they were given a more generous time allocation, this would make on-line planning possible. In other words, engaging in on-line planning was not based on any direct evidence, e.g. retrospection, or performance indicators of on-line planning activities (Skehan and Foster, 2005), but instead was inferred from the time conditions used. A problem arises here because neither group in the studies, whether strategic or on-line, used all of the time that was allocated to them. Average time taken was below what was available, suggesting that participants might not have been pressured as much as the researchers had expected.

To address this issue, Wang (2009) used a different approach to designing materials. She used narrative retellings, based on wordless Mr Bean videos. She was able to use video-editing software to slow such videos by 60% without compromising the naturalness of the developing story (and observers of the manipulated versions of such videos did not realize they had been slowed). In this way she could use the same storyline and events, but with two different videos, each imposing a controlled, and different, time pressure, i.e. the original vs. the original plus 60% more time. This overcomes the difficulty noted with the other on-line planning studies of lack of standardization of time available. In addition, Wang (2009) used a repetition condition, in which some participants retold the same video story (and this was at original speed), but after an interval of one minute. A range of research questions was proposed, and to address these, six conditions were used, as shown in Figure 2.

Figure 2

The experimental conditions in Wang, 2009

Most research with strategic planning has used input that can be read and worked on during a planning period of (say) 10 minutes. In Wang (2009), the input was a video narrative (to exert better control over time) and so this convention of providing input reading time was inappropriate. Instead ‘watched’ conditions were used in which the video was viewed silently to ensure familiarity, broadly, if not exactly, equivalent to the 10 minute planning periods in other studies. (Note making was not allowed here.) In addition, for the second experimental condition, the ‘watched’ phase was followed by a strategic planning phase in which participants reflected on the video they had watched, and had time to think how they would simultaneously narrate the video while it was running.

Obviously, all conditions could be contrasted with the control group. In addition, it was possible to compare the repetition condition with all other conditions. Even more selectively, it was possible to compare the two on-line planning conditions with the two strategic planning conditions, and then at an even finer level of detail, to compare different versions of the strategic planning and of the on-line planning conditions. We are now at a point in the planning literature where we have some idea of the general and positive effects of planning. Wang (2009) tried to be even more focused in locating precisely where planning effects occur.

The results for the study, for the various conditions, are given in Table 1. The table shows results for complexity (clauses per analysis of speech unit, or AS unit), accuracy (percentage of error free clauses), and fluency (average length of pauses at the end of AS units), for all comparisons with the control group. For each comparison, the effect size is shown numerically and verbally where significance is obtained, otherwise the non-significance is indicated.
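The three measures just described can be illustrated with a minimal sketch. The coding scheme below (an `ASUnit` record with clause counts and an end-of-unit pause length) and the sample values are assumptions made purely for illustration; they are not the coding procedure or data used in Wang (2009).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ASUnit:
    """One coded AS unit (hypothetical coding scheme, for illustration only)."""
    clauses: int             # number of clauses in the unit
    error_free_clauses: int  # how many of those clauses contain no error
    end_pause_sec: float     # length of the pause at the end of the unit


def complexity(units):
    """Clauses per AS unit."""
    return sum(u.clauses for u in units) / len(units)


def accuracy(units):
    """Percentage of error-free clauses."""
    total_clauses = sum(u.clauses for u in units)
    return 100.0 * sum(u.error_free_clauses for u in units) / total_clauses


def fluency(units):
    """Average length (in seconds) of pauses at the end of AS units."""
    return mean(u.end_pause_sec for u in units)


# Invented values for a single speaker, not taken from any of the studies.
sample = [ASUnit(2, 1, 0.5), ASUnit(1, 1, 1.0), ASUnit(3, 2, 1.5)]
print(complexity(sample), accuracy(sample), fluency(sample))
```

Note that a higher value indicates better performance for the first two measures, whereas for the pause-based fluency measure a lower value indicates more fluent speech.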

Table 1

Effect sizes for the different planning conditions

	Watched	Watched strategic planning	Repetition	On-line planning	Watched on-line planning
Complexity	0.94 Large	1.14 Large	1.89 Huge	ns	1.16 Large
Accuracy	ns	ns	3.47 Huge	ns	0.89 Large
Fluency	0.85 Large	ns	2.92 Huge	ns	ns

Notes: ns = non-significant; following Cohen (1988), effect sizes of 0.2 to 0.5 are considered small; 0.5 to 0.8 medium; 0.8 to 1.2 large; and beyond that, huge
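As a rough illustration of how the verbal labels in Table 1 relate to the numerical values, the sketch below computes a standardized mean difference and maps it onto the bands given in the notes. The pooled-standard-deviation formula is one common way of computing Cohen's d and is an assumption here, since the article does not state which formula was used; the function names, the "negligible" label for values below 0.2, and the treatment of boundary values are likewise illustrative choices.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference using a pooled standard deviation.

    Assumption: the article does not specify the formula behind its
    effect sizes; this is one common variant.
    """
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled


def label(d):
    """Map an effect size onto the bands given in the table notes."""
    size = abs(d)
    if size < 0.2:
        return "negligible"  # label assumed; the notes do not name this band
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    if size <= 1.2:
        return "large"
    return "huge"


# Values taken from the repetition and watched columns of Table 1.
for value in (0.94, 1.89, 3.47):
    print(value, label(value))
```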

Various points can be made about these results. First, we can consider the two leftmost columns. These show that strategic planning, broadly, has its usual results, that complexity and fluency are raised, with the effect sizes large. The watched strategic planning result for fluency is not significant though, which contrasts somewhat with the existing literature. The lack of significance for accuracy, in contrast, is consistent with other assessments in the literature (Ortega, 2005).

A very interesting set of results is provided in the rightmost two columns. Strikingly, none of the ‘pure’ on-line planning results generates significance, even with accuracy. On the other hand, both complexity and accuracy show large effect sizes for the watched on-line planning condition (whereas fluency does not reach significance). In other words, under carefully controlled time-pressure conditions, on-line planning alone does not lead to significant differences from the control group. However, on-line planning clearly does have an effect, but it appears something more is needed to trigger its effectiveness. In this case, it is the opportunity to watch the video before re-watching and telling the slowed video story. It seems that on-line planning does function in the way Ellis claims, but it requires some degree of support. One can regard the watched part of the condition here as a sort of low level strategic planning, since it does lead to some familiarity with the general outline of the story, and, presumably, of some level of detail. It is interesting here that complexity is also raised, suggesting that under these conditions, form-in-general is enhanced, not simply accuracy. Wang (2009) based her research partly on the Levelt (1989) model of first language speaking. This posits three broad stages in speaking. The first, Conceptualization, is concerned with developing the ideas to be expressed; the second, Formulation, ‘clothes’ the ideas in language elements, both lexical and then syntactic; and the third stage, Articulation, produces actual phonological realization of the plans output by the Formulator (for further discussion, see Skehan, 2009a). Possibly, in Wang’s research, we have an example of the watched part of the condition helping conceptualizer operations (Levelt, 1989) while the on-line part of the condition supports more effective formulator use.

The remaining column to be considered, repetition, is also very interesting, because here there are huge effect sizes in every area. The puzzle is to account for (a) the size of the effects, and (b) the fact that all performance areas are strongly affected, even if the effect size for complexity is not quite as large. To complete the task the first time requires some conceptualization, some formulation, some articulation: all the phases of the Levelt model. It may be that the engagement during the first performance does enable better ‘packaging’ of ideas in the second (Bygate, 2001), and so raises complexity. But the effect sizes in the other two areas suggest massive facilitation for formulation. Language seems to have been primed and attention seems to be directed to ongoing performance in a very effective manner. The impact of articulation in this priming is a new finding in the field.

Study 2: Familiarity and planning: Bei (2010)

Earlier, we discussed the way that preparedness can come in many forms. Strategic planning is one of these, but in the literature it has tended to be used in research designs where something new is given to speakers as a topic, and then performance under conditions allowing planning is compared with performance where no such opportunities are available. But preparedness can also mean engaging with material that has been encountered before, and that may be known well. This links strongly with the literature (mostly in reading and listening comprehension) on the relevance of schematic knowledge for understanding material (Carrell and Eisterhold, 1983). It may also connect with the second language task performance studies on structured vs. unstructured tasks, especially narratives (Skehan and Foster, 1999; Tavakoli and Skehan, 2005).

Bei (2010) wanted to explore the effects of such preparedness on second language performance. He also wanted to compare this performance with the effects of conventional planning. In his research design, he compared two groups of participants: medicine majors and computer science majors studying in universities in Hong Kong, who would be judged to be at intermediate to low advanced levels. He gave each of these groups the same two tasks: to describe the functioning of, and treatment for, a computer virus, and to do the same thing with a virus in the human body. In this way, each group had a match condition, e.g. medicine majors and a human virus, and each had a mismatch condition, e.g. computer science majors and a human virus. He had the additional independent variable of planning, defined in the conventional way as 10 minutes to prepare, with notes required, but with these notes taken away for the actual performance. In this way, he was able to explore if there was a different effect of strategic planning when the focus for the planning is already familiar (the match condition) vs. when it is for unfamiliar material (the mismatch condition). The design also enables comparison of the main effects of the two major variables, familiarity and planning, to see if they differ in their importance, as indexed by relative effect sizes. The results for this study are presented in Table 2.

Table 2

Familiarity and planning effects on second language performance

	Familiarity		Planning
	p	D	p	D
Speech rate	0.001	0.26	0.001	0.58
Average mid-clause pausing	0.020	0.28	0.001	0.71
Accuracy	0.020	0.22	ns	–
Complexity	ns	–	0.02	0.39
Lexical sophistication	0.001	0.41	ns	–

Notes: ns = non-significant; following Cohen (1988), effect sizes (shown as D) of 0.2 to 0.5 are considered small; 0.5 to 0.8 medium; 0.8 to 1.2 large

The measures of accuracy and complexity here are the same as those used in Study 1. But there are some differences elsewhere. There are two measures of fluency. One is speech rate, in words per minute. The other is a measure of pausing, as before, but here the measure focuses specifically on pauses mid-clause, rather than pauses anywhere. This (Skehan, 2009b) has been shown to be a more sensitive measure in capturing differences in non-native speaker performance. In addition, there is a measure of lexical sophistication, or the capacity to use less frequent words in performance (Skehan, 2009b). This has been shown to be distinct from lexical diversity (Read, 2000), and from syntactic complexity, and is measured here by a version of Meara’s Lambda index (Bell, 2003; Meara and Bell, 2001), which provides an estimate of extent of use of rarer words.
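The two fluency measures can be sketched in a few lines. The article does not specify whether the mid-clause pausing measure is a count or a length, so the second function below, which averages pause length, is an assumption made for illustration, as are the function names and the position labels used to code each pause.

```python
def speech_rate_wpm(word_count, speaking_time_sec):
    """Speech rate in words per minute."""
    return word_count / (speaking_time_sec / 60.0)


def mean_mid_clause_pause(pauses):
    """Average length (seconds) of pauses coded as occurring mid-clause.

    `pauses` is a list of (position, seconds) pairs, where position is
    either "mid_clause" or "clause_end" (hypothetical coding labels).
    Returns 0.0 when no mid-clause pauses were coded.
    """
    mid = [sec for position, sec in pauses if position == "mid_clause"]
    return sum(mid) / len(mid) if mid else 0.0


# Invented values, not taken from Bei (2010).
coded_pauses = [("mid_clause", 0.4), ("clause_end", 1.0), ("mid_clause", 0.6)]
print(speech_rate_wpm(150, 90))
print(mean_mid_clause_pause(coded_pauses))
```

Filtering on pause position before averaging is what distinguishes this measure from one that treats all pauses alike.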

Familiarity does produce some significant effects here, for both of the fluency measures, for accuracy, and for lexical sophistication, but all these generate small effect sizes. In other words, speaking about something one is familiar with does produce performance advantages in these various measures, but the advantage is surprisingly small. Given the lack of an effect for complexity, it can also be argued that the advantage is located within the formulator, and concerns on-line processing. Being prepared – in the sense of engaging with well-organized material that has previously been thought about – seems helpful, but only to a limited extent. Second language speakers show no greater syntactic complexity, some greater lexical complexity as they incorporate specialist words, and slightly greater control, in the form of fluency and accuracy. In contrast, ‘conventional’ planning produces more impressive effects. Significance is attained for fluency, with medium effect sizes, and for complexity, with a small effect size. Neither accuracy nor lexical sophistication produces significant effects. Possibly here these planning effects are more concerned with conceptualizer operations. The effects are also fairly standard when compared to the general planning literature. Conventional planning enables faster speech, with more natural pausing, and also slightly more syntactically complex speech. Somewhat curiously, being given preparation time seems more effective here than speaking about something familiar.

In passing, it should be noted that, relative to many of the studies that have been done of planning, these participants were of a reasonably high proficiency level, and so it is possible that different, and stronger, effects might have been found at lower levels. These students might have been at a level that enabled them to function on-line so that they were able to overcome lack of familiarity and lack of planning opportunity to some degree, and so the strength of effects might have been attenuated.

Study 3: Post-task effects on performance: Li (2010)

The third study to be considered is of a post-task influence. Li (2010) drew upon Skehan and Foster (1997) and Foster and Skehan (2011) to try to extend their work with post-task influences upon performance. Foster and Skehan (2011) had used a post-task transcription condition to explore whether anticipation of a post-task activity to follow a task would lead learners, through their awareness of what was to come, to prioritize accuracy and to use attention selectively to achieve this. The original Foster and Skehan (2011) study had simply required participants to transcribe one minute of their own performance. Li (2010) reasoned that one could explore different options with transcription to see whether they might have different influences on the language that was used in the actual task. This was done for both a decision-making and a narrative task, administered in a counterbalanced order. She proposed two general variables as having relevance here. The first concerns the participant structure for the transcription. She contrasted individual transcription (the condition that was operative in Foster and Skehan, 2011), where each individual learner transcribed their own performance, alone, with a condition where the transcriptions were done in pairs. At any one time, what was being transcribed was of one person’s performance, but both members of the pair that did the task collaborated in the transcription, taking turns to focus on one another’s work. The second variable concerned whether there was revision. This contrasted a condition where participants simply transcribed with one where, after the transcription, they rewrote the text but with the requirement that they try to identify any mistakes and correct them, in so doing producing an ‘ideal’ version of what they had written. In fact, Li (2010) also used a teacher transcription/ideal version condition, but this was a remarkably ineffective condition. Any significances which were found were for conditions where participants themselves were directly involved with their own performances.

Broadly, Li (2010) confirmed the results from Foster and Skehan (2011): compared to the control group, giving learners a post-task condition generated significant post-task effects, principally for accuracy but also for complexity, with these effects being for narrative and decision making tasks. The significance level attained was always beyond p < .01, and effect sizes were generally very large and sometimes huge. Once again, using a post-task condition seemed to induce the use of selective attention, and this seemed to be towards form. But in addition to this general finding, she was also able to show some more selective influences between different experimental conditions. These were:

transcription done by an individual produced a significantly higher lexical sophistication score than transcription done in pairs;

transcription done in pairs produced a significantly higher structural complexity score (clauses per AS unit) than did transcription done by individuals;

for the decision-making task only, transcription done with revision generated greater accuracy in task performance than did transcription done without revision.

These are very interesting results. First, it is worth reiterating that these effects are effects on actual task performance of the anticipation of something that will happen later. Contrasts between the experimental conditions are contrasting effects of different anticipations, i.e. anticipating transcribing alone produces slightly different effects than anticipating transcribing in a pair, for example. Second, some of the selective effects are intriguing. Lexical sophistication and structural complexity both concern complexity, but one is lexical and the other is syntactic. Remarkably, they have been influenced in reverse directions regarding individual vs. pair based transcription, which adds to the insight that they are not the same thing (Skehan, 2009b). Similarly, it is striking that the revision condition, albeit with only one task, selectively influences accuracy specifically. This seems to correspond most clearly to the original insight for researching post task effects, since that too was directed only at accuracy in performance.

IV Integrating the studies

We have now examined the findings from three studies, each conducted within the framework outlined in the introductory section. The three studies have focused on different areas, but each has used broadly comparable performance measures. We can try to summarize the results by presenting them together, using the different independent variables that were studied, i.e. familiarity from Bei (2010), repetition from Wang (2009), and strategic planning from Wang (2009) and Bei (2010). This is shown in Table 3.

Table 3

Summary results of the three studies

	Fluency	Accuracy	Syntactic complexity	Lexical complexity
Familiarity	✓	✓	ns	✓
Strategic planning	✓	ns	✓	ns
Repetition	✓	✓	✓	na
Supported on-line planning	ns	✓	✓	na
Post-task	ns	✓	✓	✓

Notes: ns = not significant; na = not applicable, i.e. no data
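
Purely as an illustrative aid, and not part of the analysis in the original studies, the pattern in Table 3 can be encoded as a small lookup structure so that the significant effects for any measure or condition can be read off directly. The sketch below assumes Python; the names TABLE_3, variables_affecting and measures_affected_by are hypothetical, and the encoding (True for ✓, False for ns, None for na) is an assumption made here for convenience.

```python
# Table 3 encoded as a nested mapping (illustrative sketch only).
# True  = significant effect (✓ in Table 3)
# False = not significant (ns)
# None  = not applicable, i.e. no data (na)
TABLE_3 = {
    "Familiarity": {
        "fluency": True, "accuracy": True, "syntactic": False, "lexical": True,
    },
    "Strategic planning": {
        "fluency": True, "accuracy": False, "syntactic": True, "lexical": False,
    },
    "Repetition": {
        "fluency": True, "accuracy": True, "syntactic": True, "lexical": None,
    },
    "Supported on-line planning": {
        "fluency": False, "accuracy": True, "syntactic": True, "lexical": None,
    },
    "Post-task": {
        "fluency": False, "accuracy": True, "syntactic": True, "lexical": True,
    },
}


def variables_affecting(measure):
    """Return the independent variables marked ✓ for the given measure."""
    return [name for name, row in TABLE_3.items() if row[measure] is True]


def measures_affected_by(variable):
    """Return the performance measures marked ✓ for the given variable."""
    return [m for m, effect in TABLE_3[variable].items() if effect is True]
```

For example, variables_affecting("accuracy") lists familiarity, repetition, supported on-line planning and post-task, which is the set of conditions that Table 3 associates with accuracy.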

The table brings out that all performance measures can be influenced, but by a range of independent variables. In addition, displaying data in this way enables wider generalizations to be made. To do this, one needs to focus on some psycholinguistic processes that underlie performance on tasks by second language speakers. These are:

Working with ideas, extending and organizing ideas: This is concerned with the content of what is being said, and relates most naturally to Levelt’s conceptualizer stage in speech performance. It is also the area where there may be least difference between native and non-native speakers, i.e. each group can think just as effectively about what they want to say.

Rehearsing: Here the focus is on learners preparing themselves for subsequent performance but where they do this by assembling what they think will be the exact language that they use, or at least an approximation of this language.

Monitoring: In this case we are concerned with actual performance, and with the allocation of attention, selectively, to monitor what is being said just before it is said.

We can now link these processes to the findings summarized in Table 3 and to the three studies. In so doing, we can see that each process implicates more than one finding, and that the three studies all contribute to the larger picture that emerges.

Ideas/extending is concerned with the content of what is to be said, and is taken here to be indexed by increases in measures of language complexity. So we can explore which of the three studies have had an impact in this regard. Interestingly, four independent variables each demonstrate relevant relationships. These are familiarity, from Bei (2010), strategic planning, from Wang (2009) and Bei (2010), repetition, from Wang (2009), and post-task transcription from Li (2010). In other words, there are multiple influences that push up complexity. Familiarity produced a small effect size with lexical sophistication in Bei (2010); strategic planning produced a small effect with structural complexity in Bei (2010), but large effects in Wang (2009); repetition produced a huge effect size in Wang (2009); and post-task transcription produced large effect sizes in Li (2010). All these variables have an impact, therefore, but with considerable variety. Speaking about a well-understood and familiar area increases complexity a little, but it seems that preparation with unfamiliar material, either through strategic planning or previous performance, has a much greater effect. It is interesting that preparation in the form of actually speaking seems to be the most effective form of intervention, and greater than spending time formally preparing. The memory of what one has done in actual performance seems to have a greater impact, and seems to prime participants so that they say more complicated things (Bygate, 2001). Finally, anticipation of a post-task transcription also seems to direct attention to producing more complex language.

Rehearsing was at issue only in Wang (2009), and is reflected in higher accuracy and fluency scores. What is interesting is that two sources of rehearsing are possible. The first is planning, again. This brings out that planning can be concerned with more than one thing (Ellis, 2005). We have seen an impact of planning on conceptualization, and language complexity. But with Wang (2009) it is assumed that planning can also be directed at rehearsal, with this probably depending on the processing preferences of the speaker. So one can use the time available to anticipate the specific language that will be needed in the task and then to assemble it. In this way, planning may not impact upon the ideas to be expressed, but instead will make ready and prime actual language. This clearly will be dependent on memory factors. But much more significant in operation here, with effect sizes which are huge for accuracy and fluency, is repetition, the second source of rehearsing. To rehearse language, it seems most effective to use the very language that can be re-used (Bygate, 2001). The act of first use then seems to prime later use, to sensitize it, and thereby to enable the speaker to exploit the greater accessibility of the language and avoid errors that were made the first time around. It may also be that speakers are more likely to automatize language when it is being reused in this supported way. Actually using the language in order to prepare seems far more effective for memory than setting down the language, in the form of notes, so that it can be drawn on, given that the notes will be taken away.

The final process to consider here is monitoring. This implicates Wang (2009), with on-line planning, and Li (2010), with the post-task transcription condition. In each case, the conditions during the actual performance induce selective attention towards accuracy. In Wang (2009), the watched on-line planning condition (and note that the on-line condition by itself did not produce an effect) seemed to create the conditions for an effective capacity to direct attention to accuracy. The watched component of the condition enabled the broad task demands to be understood, and the macrostructure of the story to be appreciated. Then, under the on-line condition, with the slowed video, a greater amount of attention was available, and at least some of this was directed towards avoiding error. In other words, there was time available for speakers to monitor their performance and make accuracy an important goal. In Li (2010), effective monitoring was achieved slightly differently. There was no easing in the processing conditions, as was the case in watched on-line planning in Wang (2009). Instead, learners seemed to direct attention towards accuracy selectively, because they valued the importance of avoiding error in order to avoid, in turn, the problem of transcribing their own errors at the post-task stage. Anticipation of the post-task, in other words, triggered monitoring. Monitoring therefore assumed greater importance than it would in normal communication.

One can summarize these studies in the following propositions:

Developing conceptualization and more advanced language seems to involve exploiting familiarity and strategic planning.

Developing greater accuracy and greater control seems to involve rehearsal through strategic planning and repetition, and monitoring through on-line planning and using post-tasks.

V Extending a framework for second language oral tasks

In Skehan (2009a), a framework was proposed to organize influences upon task-based second language performance. The framework was based on task research findings, and so can be claimed to be empirically founded. It was organized around a ‘spine’ based on Levelt’s model of first language speaking, with stages of Conceptualization, Formulation: Lemma retrieval, and Formulation: Syntactic Encoding. Then, ‘supportive’ influences and ‘demanding’ influences were proposed. The former comprise ‘easing’ influences, which reduce the processing demands of a task, and ‘focusing’ influences, which push the learner towards greater accuracy. The latter consist of ‘complexifying’ influences, which push the learner to make the task more difficult, and ‘pressuring’ influences, which require the learner to complete the task under more difficult conditions. This framework is reproduced in Table 4. However, the table from Skehan (2009a) has been extended, to incorporate the findings from the three studies reported in this article. These additions are marked with an initial in parentheses to indicate which process, described in the last section, is relevant. This highlights what is additional in this table relative to Skehan (2009a).

Table 4

Task research, the Levelt model and performance

Complexifying/pressuring influences	Stages of the Levelt model	Easing/focusing influences
Planning: extending (I/E); Complex cognitive operations; Complex information type	Conceptualizer	Concrete/static; Less information; Easier cognitive operations; Familiarity of information (I/E); Repetition (I/E)
Infrequent lexis; Non-negotiable task	Formulator: Lemma retrieval	Planning: organising; Dialogic; Repetition (R)
Time pressure (M); Heavy input pressure; Monologic	Formulator: Syntactic encoding	Planning: rehearsing (R); Structured tasks; Dialogic; Post-task condition (M); Supported on-line (M); Repetition (R)

Notes: I/E = Ideas, extending; R = Rehearsal; M = Monitoring

It is important to stress that the influences shown in Table 4 are based on research studies exploring task-based second language performance. The table, in other words, is empirically based. Using the theoretical spine derived from the Levelt model enables the range of influences, and associated psycholinguistic processes, to be organized in a way which contrasts how tasks can be made more difficult, by being complexified or by being pressured, with how they can be facilitated, either by being eased, in general, or by being directed towards accuracy. Robinson (2007), in contrast, sees conceptualizer demands (as these impact upon task complexity) as pushing formulator operations without attentional limitations, an issue which is discussed more fully in Skehan (2009a, 2009b). The present approach contrasts markedly with this account, because it assumes the different influences have to be researched first separately and then additively, as the most effective method of accounting for performance differences on tasks.

This has a number of implications. First, there are implications for the way we decide upon task difficulty. More complex and more pressured tasks, following the influences covered in the table, are likely to be more difficult. This clearly has implications for language testing (Skehan, 2009c). It also has implications for pedagogy. It is a long-standing problem in using task-based approaches to instruction that we need to have a stronger sense of task difficulty, especially as this may impact upon task sequencing. The suggestions in the table give many ideas as to how tasks can be calibrated in this way. They can be eased, for less proficient learners. Equally, they can be made more difficult for more advanced learners who need to be stretched (Luo, 2007). Analysing tasks through the components of the table can be very important in pedagogic decision making. Such analysis enables the teacher to think about how different pedagogic goals, e.g. promoting accuracy, can be supported through task choice, e.g. familiar information tasks with low lexical demands, or through task implementation, e.g. supported on-line planning, or post-task activities.

Second, there are implications for how tasks can be implemented. Many of the components of the table concern task types and task characteristics, but other components are concerned with the conditions under which tasks are done. Here we have seen that planning is multifaceted, and can enhance complexification or accuracy. There is scope here to learn how to direct planning to particular performance areas, rather than simply view it as a generally desirable part of the teacher’s range of resources. Similarly, we have seen how monitoring can raise accuracy, whether this is brought about by on-line planning or by the skilful use of post-task activities. Tasks and task characteristics are an important starting point, but their actual impact on performance and pedagogy can be changed through the conditions under which they are done.

Third, we can link the components of the table to a view of pedagogy. In that respect, it is useful to distinguish between knowledge construction, on the one hand, and knowledge activation and use, on the other (Samuda, 2001). Knowledge construction is concerned with developing an underlying knowledge system, and is, in turn, likely to involve:

noticing

hypothesizing

complexifying, extending

restructuring, integrating

All of these are likely to be most influenced by the Complexifying stage from the framework in Table 4, and the associated influences. So tasks which push for complexification are likely to create conditions for growth in the underlying interlanguage system, provided that appropriate supportive conditions are operative; for discussion, see Skehan (2007, 2011). In contrast, knowledge activation and use is not concerned with developing a system so much as learning to use a system effectively and in real time. This is likely to involve:

repertoire creation, disponibilité (i.e. availability, accessibility);

achieving supported control, avoiding error;

automatizing;

lexicalizing.

Here we are concerned with formulator operations, in Leveltian terms, and so the sets of influences in the second and third rows of Table 4 are relevant. These make suggestions as to how, given an interlanguage system of some sort, the learner becomes more able to exert control over that system.

Looking at tasks and pedagogy in this way, one has a framework for the way a teacher can maximize the chances, through task choice and task implementation, that new language can be brought into focus (knowledge construction) and then control can be gained over that language (knowledge activation and use). Table 4 can be interpreted in two ways. The first is simply as an empirically based account of influences on performance. The second is a framework for a principled approach to pedagogy using a task-based approach to instruction.

Acknowledgements

The authors would like to thank two anonymous Language Teaching Research reviewers who provided very constructive comments on an earlier version of this article. They would also like to thank the Research Grants Council of Hong Kong for the financial support that made the writing of the article possible.

References

Bei (2010). The effects of topic familiarity and strategic planning in topic-based task performance at different proficiency levels. Unpublished PhD thesis, Chinese University of Hong Kong, China.

Bell (2003). Using frequency lists to assess L2 texts. Unpublished PhD thesis, University of Swansea, UK.

Bygate (2001). Effects of task repetition on the structure and control of oral language. In Bygate, Skehan, & Swain (Eds.), Researching pedagogic tasks (pp. 23–48). London: Longman.

Carrell & Eisterhold (1983). Schema theory and ESL reading pedagogy. TESOL Quarterly, 17, 553–73.

Cohen (1988). Statistical power analysis for the behavioural sciences. 2nd edition. Hillsdale, NJ: Lawrence Erlbaum.

Crookes (1989). Planning and interlanguage variation. Studies in Second Language Acquisition, 11, 367–83.

Ellis (1987). Interlanguage variability in narrative discourse: Style shifting in the use of the past tense. Studies in Second Language Acquisition, 9, 1–20.

Ellis (2005). Planning and task-based performance: Theory and research. In Ellis (Ed.), Planning and task performance in a second language (pp. 3–34). Amsterdam: John Benjamins.

Ellis (2009). The differential effects of three types of task planning on the fluency, complexity, and accuracy in L2 oral production. Applied Linguistics, 30, 474–509.

Ellis & Yuan (2004). The effects of planning on fluency, complexity and accuracy in second language narrative writing. Studies in Second Language Acquisition, 26, 59–84.

Foster & Skehan (2011). Anticipating a post-task activity: The effects on accuracy, complexity and fluency of L2 language performance. Unpublished manuscript, University of Auckland, New Zealand.

Levelt (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.

Li (2010). Focus on form in task based language teaching: Exploring the effects of post-task activities and task practice on learners’ oral performance. Unpublished PhD thesis, Chinese University of Hong Kong, Hong Kong, China.

Long & Robinson (1998). Focus on form: Theory, research, and practice. In Doughty & Williams (Eds.), Focus on form in classroom second language acquisition (pp. 15–41). Cambridge: Cambridge University Press.

Luo (2007). Re-examining factors that affect task difficulty in task-based language assessment. Unpublished PhD thesis, Chinese University of Hong Kong, Hong Kong, China.

Lynch (2001). Seeing what they meant: Transcribing as a route to noticing. ELT Journal, 55, 124–32.

Lynch (2007). Learning from the transcript of an oral communication task. ELT Journal, 61, 311–19.

Meara & Bell (2001). P_Lex: A simple and effective way of describing the lexical characteristics of short L2 texts. Prospect, 16, 5–19.

Mehnert (1998). The effects of different lengths of time for planning on second language discourse. Studies in Second Language Acquisition, 20, 52–83.

Norris, J.M., & Ortega (2006). The value and practice of research synthesis for language learning and teaching. In Norris, J.M., & Ortega (Eds.), Synthesizing research on language learning and teaching (pp. 3–50). Philadelphia, PA: John Benjamins.

Ochs (1979). Planned and unplanned discourse. In Givon (Ed.), Syntax and semantics: Volume 12: Discourse and semantics (pp. 51–80). New York: Academic Press.

Ortega (2005). What do learners plan? Learner-driven attention to form during pre-task planning. In Ellis (Ed.), Planning and task performance in a second language (pp. 77–109). Amsterdam: John Benjamins.

Plough & Gass (1993). Interlocutor and task familiarity: Effects on interactional structure. In Crookes & Gass (Eds.), Tasks and language learning: Integrating theory and practice (pp. 35–56). Clevedon: Multilingual Matters.

Read (2000). Assessing vocabulary. Cambridge: Cambridge University Press.

Robinson (2001). Task complexity, cognitive resources, and syllabus design: A triadic framework for examining task influences on SLA. In Robinson (Ed.), Cognition and second language instruction (pp. 287–318). Cambridge: Cambridge University Press.

Robinson, Cadierno, & Shirai (2009). Time and motion: Measuring the effects of the conceptual demands of tasks on second language speech production. Applied Linguistics, 30, 533–54.

Samuda (2001). Guiding relationships between form and meaning during task performance: The role of the teacher. In Bygate, Skehan, & Swain (Eds.), Researching pedagogic tasks (pp. 119–40). London: Longman.

Skehan (1996). A framework for the implementation of task based instruction. Applied Linguistics, 17, 38–62.

Skehan (1998). A cognitive approach to language learning. Oxford: Oxford University Press.

Skehan (2007). Task research and language teaching: Reciprocal relationships. In Fotos & Nassaji (Eds.), Form-focused instruction and teacher education: Studies in honour of Rod Ellis (pp. 55–69). Oxford: Oxford University Press.

Skehan (2009a). Modelling second language performance: Integrating complexity, accuracy, fluency and lexis. Applied Linguistics, 30, 510–32.

Skehan (2009b). Lexical performance by native and non-native speakers on language-learning tasks. In Richards, Daller, Malvern, D.D., & Meara (Eds.), Vocabulary studies in first and second language acquisition: The interface between theory and application (pp. 107–24). London: Palgrave Macmillan.

Skehan (2009c). Models of speaking and the assessment of second language proficiency. In Benati (Ed.), Issues in second language proficiency (pp. 202–15). London: Continuum.

Skehan (2011). Researching tasks: Performance, assessment, pedagogy. Shanghai: Shanghai Foreign Language Education Press.

Skehan & Foster (1997). Task type and task processing conditions as influences on foreign language performance. Language Teaching Research, 1, 185–211.

Skehan & Foster (1999). The influence of task structure and processing conditions on narrative retellings. Language Learning, 49, 93–120.

Skehan & Foster (2005). Strategic and on-line planning: The influence of surprise information and task time on second language performance. In Ellis (Ed.), Planning and task performance in a second language (pp. 193–216). Amsterdam: John Benjamins.

Skehan & Foster (in press). Complexity, accuracy, fluency and lexis in task-based performance: A meta-analysis of the Ealing Research. In Housen, Kuiken, & Vedder (Eds.), Dimensions of L2 performance and proficiency: Investigating complexity, accuracy and fluency in SLA. Amsterdam: John Benjamins.

Tavakoli & Skehan (2005). Planning, task structure, and performance testing. In Ellis (Ed.), Planning and task performance in a second language (pp. 239–73). Amsterdam: John Benjamins.

Van den Branden, Bygate, & Norris (2009). Task-based language teaching: A reader. Amsterdam: John Benjamins.

Wang (2009). Modelling speech production and performance: Evidence from five types of planning and two task structures. Unpublished PhD thesis, Chinese University of Hong Kong, Hong Kong, China.

Wigglesworth (1997). An investigation of planning time and proficiency level on oral test discourse. Language Testing, 14, 85–106.

Willis (2007). Doing task-based teaching. Oxford: Oxford University Press.

Yuan & Ellis (2003). The effects of pre-task planning and on-line planning on fluency, complexity and accuracy in L2 monologic oral production. Applied Linguistics, 24, 1–27.