Abstract
To be able to solve complex information problems in a digital environment is a key 21st century skill. Technology users usually expect to achieve their goals in a fast and accurate way. However, the actual relationship between time-on-task and task outcome is currently not well understood. We analyzed data from a large-scale international study in which representative samples of adults had to solve more or less complex problems using standard computer applications. Our results indicate that different task characteristics influence the relationship between problem-solving performance and time-on-task in specific ways. Spending more time on a task is more likely to compensate an average problem solver when task complexity can be attributed to intrinsic task and technology drivers than when complexity stems from the cognitive/metacognitive activities belonging to information problem-solving processes per se, especially acquiring and evaluating information. Thus, the interpretation of time-on-task should take the source of difficulty into consideration. Implications for personal and professional development are discussed.
Keywords
Communication with others and handling information in a digital environment pervade the everyday activities of individuals living in modern societies. Digital problem solving (DPS) comprises the ability to use digital technology, multiple and multimodal digital resources to perform tasks (Jacobs & Castek, 2018; Organisation for Economic Co-operation and Development [OECD], 2013). Therefore, DPS has become an ever more important ability. DPS is considered a key element of 21st century skills that is linked to several life-achievements (e.g., Scherer et al., 2015; Sonnleitner et al., 2013).
Digital problems generally represent complex information problems because most of them require not merely a sequence of routinized actions but more complex cognitive and metacognitive operations: handling a large quantity of information, using strategies to work through sub-goals, and monitoring the sub-processes may all be needed to reach a solution (Azevedo et al., 2010; Fischer & Funke, 2011; Greiff, Wüstenberg, et al., 2014; Lazonder & Rouet, 2008; OECD, 2009; Whitelock-Wainwright et al., 2020; Winne, 2011). Handling a large quantity of information may include the acquisition, evaluation, and integration of sometimes contradictory information from different sources, locations, and presentation forms (e.g., Rouet et al., 2017). Numerous studies on multimedia learning or DPS use cognitive load theory (Sweller, 2011) to show that these processes can generate extraneous cognitive overload and challenge working memory capacity (Anmarkrud et al., 2019; DeStefano & LeFevre, 2007; Kalyuga, 2007). Consequently, individuals may experience difficulties or engage in inefficient behavior while solving information problems with computers (e.g., Lazonder & Rouet, 2008; Raes et al., 2010). Therefore, the assessment of DPS skills has gained increased attention recently (e.g., OECD, 2009). Most frequently, the assessment of skills is based on task outcomes. However, the time spent on task is considered an important feature of the solution processes (e.g., Chang, 2015; De Boeck & Jeon, 2019; Goldhammer & Kroehne, 2014; He et al., 2019; Naumann, 2019). A major benefit of the digital environment is that, in addition to the outcome, the computer delivery mode allows the automatic and objective recording of DPS processes and their timing. Several modelling approaches have examined how time-on-task relates to accuracy and how it characterizes problem-solving behavior and the task itself.
Yet, no systematic method has emerged that is capable of evaluating the relationship between time-on-task, problem-solving processes, task features, and accuracy (De Boeck & Jeon, 2019; Greiff, Krkovic, et al., 2014).
Our investigation contributes to the research on time-on-task as covariate models (see De Boeck & Jeon, 2019). In these models, response time is a predictor variable and response accuracy is the dependent variable. These models show that time-on-task affects the outcome of DPS tasks and reveal signs of speed-accuracy trade-off and dual processing, but they do not clarify how task characteristics influence the length of problem-solving efforts that are compensated by an increase in the accuracy of non-routine task solution processes. In this paper, we study how task characteristics, as determinants of different task difficulty dimensions, moderate the effect of time-on-task on the accuracy of complex information problem-solving by analyzing a large dataset from the Problem-solving in Technology-rich Environments module of the Programme for the International Assessment of Adult Competencies (PIAAC PS-TRE; OECD, 2009). Thus, this paper also advances our understanding of how task features and solution sub-processes contribute to the cognitive load associated with DPS.
Theoretical Background and Hypotheses
According to the speed-accuracy trade-off, the accuracy of task solution processes increases until a certain point as a curvilinear function of time-on-task (e.g., Chen et al., 2018; Goldhammer, Steinwascher, et al., 2017; Heitz, 2014; Wang & Hanson, 2005). Moreover, the positive effect of time-on-task was shown to become stronger in the case of harder non-routine tasks (Goldhammer et al., 2014). However, task accuracy is not always positively linked to the amount of time individuals are engaged with the task (Goldhammer et al., 2013; Goldhammer, Naumann, et al., 2017; Petscher et al., 2015; Scherer et al., 2015; Vörös & Rouet, 2016). Higher ability test-takers were shown to spend less time on reading tasks and to accomplish reading tasks correctly in less time than lower ability test-takers (Petscher et al., 2015; Su & Davison, 2019). Finally, it was proposed that the link between time-on-task and accuracy depends on the quantity of routinized sub-tasks of problem-solving processes (Goldhammer, Naumann, et al., 2017; Naumann & Goldhammer, 2017). According to dual-processing models, controlled processes are executed slowly under attentional control while automated processes are fast and do not require attention (Mayer, 1992; Schneider & Shiffrin, 1977). Yet, apart from the results on task difficulty, there is very little empirical evidence about how task characteristics, such as the quantity and form of information to be processed, the length and clarity of instruction, or the digital tools and commands to be used, influence the effect of time-on-task on digital problem-solving accuracy (De Boeck & Jeon, 2019; Greiff et al., 2018). For example, the results of Chen et al. (2018) reveal that the time interval during which the higher probability of arriving at the correct outcome compensates for the efforts of problem-solving is shorter for more knowledge-based tasks than for mainly cognitive-ability-based tasks.
In this paper, we argue that the effect of time-on-task on accuracy varies by task characteristics and problem-solving sub-processes.
To test our hypotheses, we analyze the results of PIAAC PS-TRE tasks. The purpose of the PS-TRE domain was to assess adults’ ability to use digital resources to access and process information in purposeful non-routine settings. Thus, PS-TRE tasks can be considered complex information problems (see Rouet et al., 2016). The PIAAC PS-TRE framework identified three core dimensions that may affect the difficulty of a problem: technology (amount and novelty of applications, tools, commands and diversity of representations), cognitive (types, complexity and amounts of cognitive operations; goal setting and progress monitoring, planning/self-organizing, access/evaluating information, and making use of information), and task (likelihood of impasses, minimum number of steps to solve the task, number of options at various stages, number of constraints to be satisfied, and complexity of computations, amount of transformation required to communicate a solution) dimensions (OECD, 2013). Additionally, initial task complexity (explicitness and length of instruction, opening screen complexity) and text difficulty were also determined as task difficulty dimensions.
We hypothesize that the dimension of task difficulty influences the extent to which longer time-on-task is able to compensate an average problem solver for increasing task difficulty. Thus, the time spent on the solution may increase the chance of finding the correct outcome at a different pace and until a different point depending on whether the source of complexity is cognitive, technology based, or inherently related to the task itself. More precisely, we argue that time-on-task may compensate for difficulties deriving from the technological and task dimensions to a greater extent than for higher complexity stemming from the cognitive dimension. Digital environments are designed in a way that individuals can discover and experiment with functions. It is always possible to discover how the system in which the tasks are embedded works. Individuals can test functions, click on icons, use the menu system, and check what happens. Digital environments are thus designed to offer a scaffold that helps to overcome technological difficulties. Time may also compensate to a relatively large extent for difficulties linked to task features coupled with heightened time intensity, such as a large number of solution steps, longer instructions, or more information to read. Conversely, time-on-task may make up less efficiently for cognitive load stemming from sub-activities primarily related to problem-solving processes (i.e., goal setting, acquiring information, monitoring the solution process, evaluating sources, contrasting or integrating information, and navigating). For example, concept maps were shown to ease the cognitive load of navigating through digital information and to promote the understanding and recall of text contents (Vörös et al., 2011; Whitelock-Wainwright et al., 2020). However, typically, there are no external supports to overcome cognitive difficulties related to DPS processes. Therefore, we formulate the following four hypotheses.
H1: During DPS, the positive effect of time-on-task may get stronger in the case of difficulties linked to the usage of the electronic environment (technology).
H2: During DPS, the effect of time-on-task may get stronger in the case of difficulties linked to inherent task complexity.
H3: The positive time-on-task effect may weaken for tasks with high cognitive load of problem-solving.
H4: The positive effect of time may fade when the level of proficiency is higher.
Material and Methods
PS-TRE tasks were delivered on a laptop computer, typically in the respondents’ homes. All computer-test-taker interactions were logged during the completion of the tasks.
Material
Fourteen PS-TRE tasks reflecting different levels of presumed difficulty were created. The solution of tasks involved either the manipulation of a mail client, a word processor, a spreadsheet, a web browser or the combination of those environments. In a basic task for example, test-takers located the ID number of a member of a club in a spreadsheet table enumerating 200 names. There was a single constraint to be satisfied: match the name with the membership number in the spreadsheet. Respondents could use the Search or Sort tool or just scroll down the list to find the required entry. In a medium task, test-takers were directed to buy a book for their friend’s birthday that was in two weeks. Participants had to activate the links in the list of five web sites to go to the corresponding pages and locate the book meeting the three conditions in the problem statement. In an advanced task, test-takers read e-mails from different departments in order to make meeting room reservations through a simulated online system. The task involved three environments and the solution process consisted of about 20 steps. A conflict between two requests presented an impasse that the respondents had to solve (see OECD, 2013, for screen examples). The items were divided into two modules. Each PS-TRE test-taker was randomly assigned to one or both modules.
Table 1 contains the task proficiency levels and the difficulty level of tasks by task characteristics that were identified as task difficulty sources along the three difficulty dimensions (see the Theoretical Background and Hypotheses). Higher numbers represent higher difficulty levels of the task characteristics.
PS-TRE Task Characteristics.
Note. Tasks are coded as in the OECD data file (https://webfs.oecd.org/piaac/puf-data/).
a,b These characteristics were defined by an expert group of four university professors. The other characteristics are based on OECD (2013).
Participants
Here, we study the PS-TRE data of Canada. We chose Canada simply because the number of test-takers in this country is much higher than in other countries. In Round 1 of the PIAAC, 26,683 test-takers took the test in Canada. The data collection took place from 1 August 2011 to 31 March 2012. The target population of the PIAAC study included the entire non-institutionalized population, aged 16–65 years, residing in the given country at the time of data collection. A multi-stage sampling design with a probability sample representative of the target population was used. PIAAC consisted of three domains: literacy, numeracy, and PS-TRE. Respondents were randomly assigned to take one or two of the three domains. Respondents to the PS-TRE domain also had to possess basic computer, literacy, and numeracy skills, and to agree to being tested on a computer (OECD, 2013).
For the purpose of this study, only the results of the 20,978 native English or French speakers were taken into consideration. Moreover, we considered only those subjects who were assigned to solve at least one module of the PS-TRE test, and only the results of those tasks where the time to first action was above zero, i.e., the respondent reached the task. In the end, we analyzed the results of 66,504 tasks solved by 8,100 test-takers.
Data Transformation and Analyses
Generalized linear mixed models (GzLMM) with a binary logistic link function were used to investigate the role of time-on-task, task difficulty, and task characteristics in the outcome of PS-TRE tasks. GzLMM is a flexible tool to analyze grouped data as it allows both random and fixed effects. To analyze the data, we built our basic GzLMM model with a fixed intercept, task and person as random intercepts, and a by-person and by-task time-on-task fixed effect (M1) (De Boeck, 2008; Goldhammer et al., 2014). Thus, in model M1, the logit of the probability that a person $p$ solves task $t$ correctly equals $\beta_0 + \beta_1 t_{pt} + b_{0p} + b_{0t} + \varepsilon_{pt}$, where
$\beta_0$ = universal fixed intercept;
$\beta_1 t_{pt}$ = by-person and by-task time-on-task fixed effect;
$b_{0p}$ = by-person random intercept;
$b_{0t}$ = by-task random intercept; and
$\varepsilon_{pt}$ = residuals.
So, we built a Rasch model extended with a fixed time-on-task effect. To allow the time-on-task effect to vary across tasks, the fixed time-on-task effect was extended with a by-task adjustment (M2). Hence, our Model M2 is $\beta_0 + \beta_1 t_{pt} + b_{1t} t_{pt} + b_{0p} + b_{0t} + \varepsilon_{pt}$. Additionally, by comparing M2 to a restricted M2 (M2r), in which the correlation between the by-task adjustment to the time-on-task effect and the random task intercept was not allowed, we tested whether the random variances of the time-on-task effect are linked to the time intensity of the given task (see Goldhammer et al., 2014). As our primary interest lies in the question of what task characteristics made a PS-TRE task difficult and moderated the effect of time-on-task on task outcome, we did not formulate any hypothesis on the by-person variances of the effect of time-on-task. However, to see whether it is necessary to extend M2 with a by-person adjustment, we tested whether introducing a by-person adjustment to the fixed time-on-task effect would result in a better fitting model (M3).
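The structure of M1 and M2 can be illustrated with a minimal Python sketch. It is not the estimation code used in this study (the models were fitted with glmer in R); it only evaluates the linear predictor described above and maps it to a probability through the inverse logit. The function name, the argument names, and the numeric values in the usage lines are illustrative assumptions, and the residual term is omitted.

```python
import math


def solve_probability(time_z, beta0, beta1,
                      b0_person=0.0, b0_task=0.0, b1_task=0.0):
    """Probability of a correct response under an M2-style model (sketch).

    time_z    -- transformed time-on-task of the person on the task
    beta0     -- fixed intercept
    beta1     -- fixed time-on-task effect
    b0_person -- by-person random intercept
    b0_task   -- by-task random intercept
    b1_task   -- by-task adjustment to the time-on-task effect
                 (setting it to 0 gives the M1 structure)
    """
    eta = beta0 + (beta1 + b1_task) * time_z + b0_person + b0_task
    return 1.0 / (1.0 + math.exp(-eta))  # inverse logit


# Hypothetical parameter values, chosen only to show the mechanics.
p_m1 = solve_probability(1.0, beta0=-0.5, beta1=0.5)               # no by-task slope
p_m2 = solve_probability(1.0, beta0=-0.5, beta1=0.5, b1_task=0.3)  # steeper slope for this task
```

With a positive by-task adjustment, the same amount of additional time raises the probability of success more than under the common slope, which is the kind of between-task variation that M2 is meant to capture.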
Model M2 served as a source model for the second part of our investigation. In that part, we explored the task characteristics that made a task more or less difficult and contributed to the deviations of the time-on-task effect. For these analyses, we studied the impact of only one chosen task characteristic at a time and considered tasks similar in all their other features. To do so, the random variables related to tasks were replaced by the examined characteristic variable in M1 and M2. The result of each full task characteristic model was compared to its corresponding restricted version, and we also tested whether the goodness of fit of the full task-characteristic model improved compared to a model without the by-task characteristic random time effect.
Before the analyses, we first standardized (z-score) and then log-transformed the time-on-task data. Time-on-task values more than three standard deviations above or below the mean were replaced by the limit value, i.e., the mean plus or minus three standard deviations. Task outcomes were considered correct or incorrect.
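The outlier treatment and the standardization described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the original preprocessing script: the function names are hypothetical, the sample standard deviation is an assumed choice, and the sketch deliberately leaves out the log transformation and the ordering of the steps.

```python
import statistics


def clip_to_three_sd(values):
    """Replace values beyond mean +/- 3 SD by the corresponding limit value."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (assumption; population SD is equally plausible)
    lower, upper = mean - 3 * sd, mean + 3 * sd
    return [min(max(v, lower), upper) for v in values]


def z_scores(values):
    """Standardize values to mean 0 and (sample) SD 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical times in seconds: twenty short responses and one extreme one.
times = [1.0] * 20 + [100.0]
clipped = clip_to_three_sd(times)   # the extreme value is pulled back to the upper limit
standardized = z_scores(clipped)
```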
To test our hypotheses, the glmer function of the R package lme4, the most widely used package to analyze mixed models, was used (Bates et al., 2012).
Results and Discussion
The probability that an average test-taker solves an average task was 37.1% (β = −0.5287, z = −1.726, p < 0.1). However, the difficulty level of tasks seems to fluctuate. Task U01, for example, was solved by 61% of respondents, while only 11% of test-takers answered item U02 correctly (Var(b0t) = 1.303). We also found a significant positive effect of time-on-task (M1: β1 = 0.56484, p < 0.001). Still, the results suggest that the time spent on task does not have the same meaning for all tasks. For example, subjects who solved U01a spent on average less time on the task (M = 111,037 ms) than their peers who did not find the right answer (M = 138,472 ms). On the other hand, an average person who arrived at the right solution of task U02 spent nearly twice as much time on it (M = 510,861 ms) as an average person who did not answer it correctly (M = 261,577 ms) (see Table 2).
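As a check on the reported figures, the fixed intercept can be converted to a probability with the inverse logit, and the time-on-task coefficient of M1 can be read as an odds ratio. The odds-ratio reading is an added illustration rather than a statistic reported in the tables, and it holds the person and task random effects fixed.

```latex
\[
P = \frac{1}{1 + e^{-\beta_0}} = \frac{1}{1 + e^{0.5287}} \approx \frac{1}{1 + 1.697} \approx 0.371,
\qquad
e^{\beta_1} = e^{0.56484} \approx 1.76 .
\]
```

Under M1, a one-unit increase in transformed time-on-task therefore multiplies the odds of a correct response by roughly 1.76.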
Descriptive Statistics by Task and Outcome.
Note. Tasks are coded as in the OECD data file (https://webfs.oecd.org/piaac/puf-data/).
Figure 1 shows the variations of time-on-task by the type of answer for each task (incorrect = 0, correct = 1).

Time-on-Task Box-Plots Showing the Variations of Time-on-Task by the Type of Answer for Each Task.
M2 indicates that the time-on-task effect varies across tasks (Var(b1t) = 0.3139) and that the correlation of the by-task random adjustment with task difficulty is negative (Corr(b0t, b1t) = −0.46). Moreover, the model comparison tests show that M2 fits the data better than its restricted version (M2r) and our base model (M1). This means that the positive effect of time-on-task was even stronger in the case of difficult tasks. M3 shows that the variance of the time-on-task effect is small (Var(b1p) = 0.02196) and that its correlation with task difficulty is −1.00. A correlation of −1 signals overparametrization of the model. In addition, according to the restricted model (M3r), the time-on-task effect does not vary across test-takers (Var(b1p) = 0.00000). Taken together, the results suggest that the data are insufficient to show that spending more time on a task helped low-skilled subjects more than skilled problem solvers. Therefore, we did not extend M2 with the by-person random time-on-task effect. Table 3 displays the results of the first line of analyses.
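To make the meaning of the negative intercept-slope correlation concrete, the following Python sketch computes the expected by-task adjustment to the time-on-task effect for a task whose random intercept lies a given number of standard deviations from the mean. It assumes bivariate-normal random effects, as is standard in linear mixed models; the function name is hypothetical, and only the two M2 values reported above are used.

```python
import math


def expected_slope_adjustment(k_sd_intercept, corr, var_slope):
    """E[b1t | b0t = k * SD(b0t)] under bivariate-normal random effects.

    For a bivariate normal pair, the conditional mean of the slope
    adjustment is corr * SD(b1t) * k, where k is the intercept deviation
    expressed in standard-deviation units.
    """
    return corr * math.sqrt(var_slope) * k_sd_intercept


# Values reported for M2: Var(b1t) = 0.3139, Corr(b0t, b1t) = -0.46.
# A task one SD below the mean intercept (harder than average):
harder = expected_slope_adjustment(-1.0, corr=-0.46, var_slope=0.3139)
# A task one SD above the mean intercept (easier than average):
easier = expected_slope_adjustment(+1.0, corr=-0.46, var_slope=0.3139)
```

The adjustment is positive for the harder task and negative for the easier one, which is the sense in which the positive time-on-task effect is stronger for difficult tasks.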
GLMM Statistics for the First Line of Analyses.
†p < .1. *p < .05. **p < .01. ***p < .001.
In the second line of analyses, we explored the major task characteristics that influenced the outcome and the time propensity of the tasks. All full task characteristic models fit the data better than their baseline version without the by-task characteristic random time effect. Yet, the random by-task characteristic correlations show considerable dissimilarities (see Tables 4 and 5). The results show that the random time effect of proficiency level (pl) and of the cognitive sub-task of acquiring and evaluating information (ae) had a positive correlation with the random intercept (Corr(b0pl, b1pl) = 0.12, Corr(b0ae, b1ae) = 0.28). This means that the positive time-on-task effect became weaker for tasks belonging to a higher proficiency level or tasks with a high demand on information acquisition and evaluation. Thus, spending more time on tasks of higher proficiency levels, or on tasks imposing a higher load on information acquisition and evaluation processes, had a weaker positive influence on task outcome than in the case of easier tasks. Put more simply, compared with easier tasks, at higher proficiency levels and for tasks imposing a higher cognitive load on information acquisition and evaluation, there were more test-takers who could not solve the tasks even if they spent more time on them. Moreover, we could identify three cognitive task characteristics of problem-solving processes that potentially contributed to the escalation of the time-on-task effect: monitoring progress (Corr(b0mp, b1mp) = −0.92), goal setting (Corr(b0gs, b1gs) = −0.91), and the number of cognitive processes (Corr(b0ncp, b1ncp) = −0.97). However, the correlation of the random effects is very high for all three. Such strong relationships mean that the positive time-on-task effect does not vary along those three task characteristics in a meaningful way. Consequently, H3 and H4 are supported.
Descriptive Statistics by Task Characteristic and Outcome.
GLMM Statistics for the Task Characteristics Models.
*p < .05. **p < .01. ***p < .001.
Intrinsic task and ICT-related characteristics clearly contributed to the escalation of the overall positive effect of time-on-task. For all task characteristics belonging here, the corresponding full Mtc was the best fitting model and the correlation of the random by-characteristic effects is negative (technology load: Corr(b0itl, b1itl) = −0.85; inherent task difficulty: Corr(b0itd, b1itd) = −0.37; initial task complexity: Corr(b0itc, b1itc) = −0.1; text difficulty: Corr(b0ill, b1ill) = −0.86). As a result, H1 and H2 are supported.
The full models on the effect of the minimum number of steps (nms) and impasses (ni) showed overparametrization (Corr(b0nms, b1nms) = −1, Corr(b0ni, b1ni) = −1). The corresponding restricted models proved superior to the baseline models, so there is some variance of the positive time effect along those characteristics, but the data are insufficient to characterize it further.
Overall, it seems that time-on-task can compensate a person with average abilities more efficiently for difficulties deriving from intrinsic task and ICT complexity than for the complexity of the cognitive/metacognitive processes of solving information problems, especially evaluating and acquiring information. Figures 2 and 3 show the estimated parameters of a randomly chosen test-taker.

Outcome in Function of Time-on-Task at the Different Levels of Intrinsic Task Complexity. Estimated Parameters of a Randomly Chosen Test-Taker.

Outcome in Function of Time-on-Task at the Different Task Difficulty Levels Stemming From Acquiring and Evaluating Information. Estimated Parameters of a Randomly Chosen Test-Taker.
Conclusions
To be able to solve complex information problems in an electronic environment is a key 21st century skill that would be important to teach and learn. Additionally, DPS should be supported by appropriate environment design. However, considering purely the outcome of DPS activities, it is not possible to know when and why individuals fail while trying to achieve their goals. The time spent on specific tasks may help researchers to discover more about the reasons behind failures or inefficient DPS behavior. Our results indicate that different task characteristics do not contribute in the same way to DPS performance and relate to problem-solving efforts differently. Spending more time on a task is more likely to compensate an average problem solver for intrinsic task and technology drivers of difficulty than for the imposed cognitive activities of information problem-solving processes, especially the need to acquire and evaluate information. Thus, our results, showing that more time on a complex problem-solving task increases performance, are in line with the speed-accuracy trade-off and dual processing theories (Chen et al., 2018; Goldhammer et al., 2014). Additionally, this study produced new information on how the sub-processes of complex problem-solving and task characteristics moderate the link between time-on-task and accuracy (see De Boeck & Jeon, 2019; Greiff et al., 2018).
In sum, the meaning of time-on-task is not consistent across problem-solving tasks. The interpretation of time-on-task should take the task difficulty along with the source of difficulty into consideration. Teachers, web designers and test-developers would make good use of the analyses of task characteristics along with the outcome and time-on-task data to calibrate task difficulty, to identify potential sub-processes and task features that cause problems, to gain better understanding of individual abilities, and most importantly, to reveal ways to help individuals to enhance their performance and decrease the cognitive load of DPS. PIAAC PS-TRE results (OECD, 2019) reinforce that educational interventions aimed at the development of DPS skills would be justified. As the PIAAC PS-TRE framework document (OECD, 2009) mentions, information-rich problems, where the task complexity lies in the information to be accessed and used, deserve particular attention.
Finally, we have to mention the limitations of our study, and we also propose some future research pathways. First, as we did not control the time that test-takers spent on the tasks, we can talk about associations between time-on-task and task outcome, but we are not able to state that the amount of time spent on a task influences accuracy. Second, PIAAC PS-TRE consisted of fourteen tasks. More tasks that could be divided into more unique categories along the characteristics, and that are designed to be clearly distinguishable along those characteristics, would be helpful to corroborate the results. It would also be interesting to research what interface designs would lead to higher performance. Interfaces (menu systems and icons, for example) are usually designed to offset technical difficulties but are typically less supportive in helping to overcome difficulties deriving from processing a large quantity of information. Additionally, log-files reveal the processes leading to specific outcomes. Combining analyses of task characteristics, DPS processes, and their timing is a promising technique to identify the causes of low DPS performance and to counterbalance it by advanced environment design. Third, we did not manipulate or account for the motivation of test-takers. The motivation level of test-takers, i.e., how hard they tried to solve the test or how intensively they used their time, arguably has a major effect on the relationship between time-on-task and outcome. Further studies should identify subjects who fall below a certain engagement level and spend time on processes not related to the task. Analyzing the data of other countries and other DPS task results would be useful to see if the results are generalizable.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The project was financed by the Higher Education Institutional Excellence Programme of the Ministry of Human Capacities in Hungary, within the framework of the 4th thematic programme “Enhancing the Role of Domestic Companies in the Reindustrialization of Hungary” of the University of Pécs (reference number of the contract: 20765-3/2018/FEKUTSTRAT) and the European Union, co-financed by the European Social Fund (Grant no.: EFOP-3.6.1.-16-2016-00004 entitled by Comprehensive Development for Implementing Smart Specialization Strategies at the University of Pécs).
