Task Characteristics as Source of Difficulty and Moderators of the Effect of Time-on-Task in Digital Problem-Solving

Abstract

The ability to solve complex information problems in a digital environment is a key 21st-century skill. Technology users usually expect to achieve their goals quickly and accurately. However, the actual relationship between time-on-task and task outcome is currently not well understood. We analyzed data from a large-scale international study in which representative samples of adults had to solve problems of varying complexity using standard computer applications. Our results indicate that different task characteristics influence the relationship between problem-solving performance and time-on-task in specific ways. Spending more time on a task is more likely to compensate an average problem solver when task complexity can be attributed to intrinsic task and technology drivers than when complexity stems from the cognitive/metacognitive activities belonging to information problem-solving processes per se, especially acquiring and evaluating information. Thus, the interpretation of time-on-task should take the source of difficulty into consideration. Implications for personal and professional development are discussed.

Keywords

digital problem-solving, time-on-task, task difficulty, large-scale assessment, cognition, cognitive load

Communication with others and handling information in a digital environment pervade the everyday activities of individuals living in modern societies. Digital problem solving (DPS) comprises the ability to use digital technology and multiple, multimodal digital resources to perform tasks (Jacobs & Castek, 2018; Organisation for Economic Co-operation and Development [OECD], 2013). Therefore, DPS has become an ever more important ability. DPS is considered a key element of 21st-century skills that is linked to several life achievements (e.g., Scherer et al., 2015; Sonnleitner et al., 2013).

Digital problems generally represent complex information problems because most of them require not merely a sequence of routinized actions but more complex cognitive and metacognitive operations: handling a large quantity of information, using strategies to work through sub-goals, and monitoring the sub-processes may all be needed to reach a solution (Azevedo et al., 2010; Fischer & Funke, 2011; Greiff, Wüstenberg, et al., 2014; Lazonder & Rouet, 2008; OECD, 2009; Whitelock-Wainwright et al., 2020; Winne, 2011). Handling a large quantity of information may include the acquisition, evaluation and integration of sometimes contradictory information from different sources, locations and presentation forms (e.g., Rouet et al., 2017). Numerous studies on multimedia learning or DPS use cognitive load theory (Sweller, 2011) to show that these processes can generate extraneous cognitive overload and challenge working memory capacity (Anmarkrud et al., 2019; DeStefano & LeFevre, 2007; Kalyuga, 2007). Consequently, individuals may experience difficulties or engage in inefficient behavior while solving information problems with computers (e.g., Lazonder & Rouet, 2008; Raes et al., 2010). Therefore, the assessment of DPS skills has gained increased attention recently (e.g., OECD, 2009). Most frequently, the assessment of skills is based on task outcomes. However, the time spent on a task is considered an important feature of the solution process (e.g., Chang, 2015; De Boeck & Jeon, 2019; Goldhammer & Kroehne, 2014; He et al., 2019; Naumann, 2019). A major benefit of the digital environment is that, in addition to the outcome, the computer delivery mode allows the automatic and objective recording of DPS processes and their timing. Several modelling approaches have examined how time-on-task relates to accuracy and how it characterizes problem-solving behavior and the task itself. Yet, no systematic method has emerged that is capable of evaluating the relationship between time-on-task, problem-solving processes, task features and accuracy (De Boeck & Jeon, 2019; Greiff, Krkovic, et al., 2014).

Our investigation contributes to the research on models that treat time-on-task as a covariate (see De Boeck & Jeon, 2019). In these models, response time is an independent variable and response accuracy is the dependent variable. These models show that time-on-task affects the outcome of DPS tasks and reveal signs of a speed-accuracy trade-off and dual processing, but they shed little light on how task characteristics influence the length of problem-solving efforts that are compensated by an increase in the accuracy of non-routine task solution processes. In this paper, we study how task characteristics, as determinants of different task difficulty dimensions, moderate the effect of time-on-task on the accuracy of complex information problem-solving by analyzing a large dataset from the Problem-solving in Technology-rich Environments module of the Programme for the International Assessment of Adult Competencies (PIAAC PS-TRE; OECD, 2009). Thus, this paper also advances our understanding of how task features and solution sub-processes contribute to the cognitive load associated with DPS.

Theoretical Background and Hypotheses

According to the speed-accuracy trade-off, the accuracy of task solution processes increases up to a certain point as a curvilinear function of time-on-task (e.g., Chen et al., 2018; Goldhammer, Steinwascher, et al., 2017; Heitz, 2014; Wang & Hanson, 2005). Moreover, the positive effect of time-on-task was shown to become stronger in the case of harder non-routine tasks (Goldhammer et al., 2014). However, task accuracy is not always positively linked to the amount of time individuals are engaged with the task (Goldhammer et al., 2013; Goldhammer, Naumann, et al., 2017; Petscher et al., 2015; Scherer et al., 2015; Vörös & Rouet, 2016). Higher-ability test-takers were shown to spend less time on reading tasks and to accomplish them correctly in less time than lower-ability test-takers (Petscher et al., 2015; Su & Davison, 2019). Finally, it was proposed that the link between time-on-task and accuracy depends on the quantity of routinized sub-tasks of problem-solving processes (Goldhammer, Naumann, et al., 2017; Naumann & Goldhammer, 2017). According to dual-processing models, controlled processes are executed slowly under attentional control while automated processes are fast and do not require attention (Mayer, 1992; Schneider & Shiffrin, 1977). Yet, apart from the results on task difficulty, there is very little empirical evidence about how task characteristics, such as the quantity and form of information to be processed, the length and clarity of instruction or the digital tools and commands to be used, influence the effect of time-on-task on digital problem-solving accuracy (De Boeck & Jeon, 2019; Greiff et al., 2018). For example, the results of Chen et al. (2018) reveal that the time interval during which the higher probability of arriving at the correct outcome compensates for the efforts of problem-solving is shorter for more knowledge-based tasks than for tasks based mainly on cognitive ability. In this paper, we argue that the effect of time-on-task on accuracy varies by task characteristics and problem-solving sub-processes.

To test our hypotheses, we analyze the results of PIAAC PS-TRE tasks. The purpose of the PS-TRE domain was to assess adults’ ability to use digital resources to access and process information in purposeful non-routine settings. Thus, PS-TRE tasks can be considered complex information problems (see Rouet et al., 2016). The PIAAC PS-TRE framework identified three core dimensions that may affect the difficulty of a problem (OECD, 2013): the technology dimension (amount and novelty of applications, tools and commands, and diversity of representations), the cognitive dimension (types, complexity and amount of cognitive operations; goal setting and progress monitoring, planning/self-organizing, accessing/evaluating information, and making use of information), and the task dimension (likelihood of impasses, minimum number of steps to solve the task, number of options at various stages, number of constraints to be satisfied, complexity of computations, and amount of transformation required to communicate a solution). Additionally, initial task complexity (explicitness and length of instruction, opening screen complexity) and text difficulty were defined as task difficulty dimensions.

We hypothesize that the dimension of task difficulty influences the extent to which longer time-on-task is able to compensate an average problem solver for increasing task difficulty. Thus, the time spent on the solution may increase the chance of finding the correct outcome at a different pace and up to a different point depending on whether the source of complexity is cognitive, technology-based or inherent to the task itself. More precisely, we argue that time-on-task may compensate for difficulties deriving from the technological and task dimensions to a greater extent than for higher complexity stemming from the cognitive dimension. Digital environments are designed so that individuals can discover and experiment with functions. It is always possible to discover how the system in which the tasks are embedded works: individuals can test functions, click on icons, use the menu system and see what happens. In this way, digital environments offer a scaffold that helps to overcome technological difficulties. Time may also compensate to a relatively large extent for difficulties linked to task features coupled with heightened time intensity, such as a large number of solution steps, longer instructions or more information to read. Conversely, time-on-task may make up less efficiently for cognitive load stemming from sub-activities primarily related to problem-solving processes, i.e., goal setting, acquiring information, monitoring the solution process, evaluating sources, contrasting or integrating information, and navigating. External supports can reduce such load; for example, concept maps were shown to ease the cognitive load of navigating through digital information and to promote the understanding and recall of text contents (Vörös et al., 2011; Whitelock-Wainwright et al., 2020). However, typically, there are no external supports to overcome cognitive difficulties related to DPS processes. Therefore, we formulate the following three hypotheses.

H1: During DPS, the positive effect of time-on-task may get stronger in the case of difficulties linked to the usage of the electronic environment (technology).

H2: During DPS, the effect of time-on-task may get stronger in the case of difficulties linked to inherent task complexity.

H3: The positive time-on-task effect may weaken for tasks with high cognitive load of problem-solving.

PS-TRE tasks were sorted into three proficiency levels based on test outcomes. As cognitive load is supposed to be the main carrier of the proficiency level of a problem-solving task, we argue that above a certain proficiency level, time may have compensated an average respondent only to a relatively small extent (see Rouet et al., 2016).

H4: The positive effect of time may fade when the level of proficiency is higher.

Material and Methods

PS-TRE tasks were delivered on a laptop computer, typically in the respondents’ homes. All interactions between the test-taker and the computer were logged during the completion of the tasks.

Material

Fourteen PS-TRE tasks reflecting different levels of presumed difficulty were created. The solution of tasks involved either the manipulation of a mail client, a word processor, a spreadsheet, a web browser or the combination of those environments. In a basic task for example, test-takers located the ID number of a member of a club in a spreadsheet table enumerating 200 names. There was a single constraint to be satisfied: match the name with the membership number in the spreadsheet. Respondents could use the Search or Sort tool or just scroll down the list to find the required entry. In a medium task, test-takers were directed to buy a book for their friend’s birthday that was in two weeks. Participants had to activate the links in the list of five web sites to go to the corresponding pages and locate the book meeting the three conditions in the problem statement. In an advanced task, test-takers read e-mails from different departments in order to make meeting room reservations through a simulated online system. The task involved three environments and the solution process consisted of about 20 steps. A conflict between two requests presented an impasse that the respondents had to solve (see OECD, 2013, for screen examples). The items were divided into two modules. Each PS-TRE test-taker was randomly assigned to one or both modules.

Table 1 contains the task proficiency levels and the difficulty level of tasks by task characteristics that were identified as task difficulty sources along the three difficulty dimensions (see the Theoretical Background and Hypotheses). Higher numbers represent higher difficulty levels of the task characteristics.

Table 1.

PS-TRE Task Characteristics.

Task	Proficiency level	Monitoring progress	Impasses	Goal setting	Acquiring/ evaluating info	Cognitive processes	Technology load	Inherent task complexity	Minimum no of steps	Text difficulty^a	Initial complexity^b
U19a	1	2	No	2	1	1	1	1	1	2	3
U16	1	2	No	2	1	1	1	1	1	1	1
U01a	1	1	No	1	1	2	1	1	1	2	1
U19b	2	2	Yes	3	2	3	1	2	1	4	4
U01b	2	1	No	1	1	2	1	1	1	2	1
U07	2	3	Yes	4	2	3	2	2	2	3	3
U03a	2	1	No	1	1	2	2	1	1	1	2
U21	2	2	Yes	3	2	3	3	3	2	4	3
U23	2	3	Yes	4	2	4	3	3	2	4	4
U06b	2	2	No	3	3	4	2	2	2	2	2
U06a	3	2	No	2	3	3	1	1	1	3	2
U02	3	3	Yes	4	3	5	3	3	2	1	1
U11b	3	2	Yes	3	2	3	1	1	1	1	2
U04a	3	3	Yes	4	3	5	2	3	2	4	4

Note. Tasks are coded as in the OECD data file (https://webfs.oecd.org/piaac/puf-data/).

^a,b These characteristics were defined by an expert group of four university professors. The other characteristics are based on OECD (2013).

Participants

Here, we study the PS-TRE data of Canada. We chose Canada because the number of test-takers in this country is much higher than in other countries. In Round 1 of the PIAAC, 26,683 test-takers took the test in Canada. The data collection took place from 1 August 2011 to 31 March 2012. The target population of the PIAAC study included the entire non-institutionalized population, aged 16–65 years, residing in the given country at the time of data collection. A multi-stage sampling design with a probability sample representative of the target population was used. PIAAC consisted of three domains: literacy, numeracy and PS-TRE. Respondents were randomly assigned to take one or two of the three domains. Respondents to the PS-TRE domain also had to possess basic computer, literacy and numeracy skills, and to agree to being tested on a computer (OECD, 2013).

For the purpose of this study, only the results of the 20,978 native English or French speakers were taken into consideration. Moreover, we considered only those subjects who were assigned to solve at least one module of the PS-TRE test, and only the results of those tasks where the time to first action was above zero, i.e., where the respondent reached the task. The final dataset comprised the results of 66,504 tasks solved by 8,100 test-takers.

Data Transformation and Analyses

Generalized linear mixed models (GzLMM) with a binary logistic link function were used to investigate the role of time-on-task, task difficulty and task characteristics in the outcome of PS-TRE tasks. GzLMM is a flexible tool for analyzing grouped data as it allows both random and fixed effects. We built our basic GzLMM model (M1) with a fixed intercept, task and person as random intercepts, and a by-person and by-task time-on-task fixed effect (De Boeck, 2008; Goldhammer et al., 2014). Thus, in model M1, the logit (log-odds) of the probability that a person p solves task t correctly equals β₀ + β₁t_pt + b_0p + b_0t + ε_pt, where

β₀ = universal fixed intercept;

β₁t_pt = by-person and by-task time-on-task fixed effect;

b_0p = by-person random intercept;

b_0t = by-task random intercept; and

ε_pt = residuals.
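To make the role of each term concrete, the following minimal Python sketch evaluates an M1-style linear predictor and converts it to a probability through the logistic link. It omits the residual term, and the function name `m1_probability` as well as all example values are illustrative assumptions, not estimates from the study.

```python
import math

def inverse_logit(eta: float) -> float:
    """Map a value on the logit (log-odds) scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def m1_probability(beta0, beta1, t_pt, b0_person=0.0, b0_task=0.0):
    """Probability of a correct response under an M1-style linear predictor.

    beta0     : fixed intercept
    beta1     : fixed time-on-task effect
    t_pt      : transformed time-on-task of person p on task t
    b0_person : by-person random intercept (hypothetical value)
    b0_task   : by-task random intercept (hypothetical value)
    """
    eta = beta0 + beta1 * t_pt + b0_person + b0_task
    return inverse_logit(eta)

# Illustrative values only: with every term at zero the probability is 0.5.
p_zero = m1_probability(beta0=0.0, beta1=0.0, t_pt=0.0)
# With a positive time effect, above-average time-on-task raises the probability.
p_more_time = m1_probability(beta0=0.0, beta1=0.5, t_pt=1.0)
```

Because the random intercepts enter the predictor additively, a more able person or an easier task shifts the whole curve upward without changing the slope of the time-on-task effect.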

Thus, M1 is a Rasch model extended with a fixed time-on-task effect. To allow the time-on-task effect to vary across tasks, the fixed time-on-task effect was extended with a by-task adjustment (M2). Hence, Model M2 is β₀ + β₁t_pt + b_1t t_pt + b_0p + b_0t + ε_pt. Additionally, by comparing M2 with a restricted M2 (M2r), in which the correlation between the by-task adjustment to the time-on-task effect and the random task intercept was not allowed, we tested whether the random variance of the time-on-task effect is linked to the time intensity of the given task (see Goldhammer et al., 2014). As our primary interest lies in the question of which task characteristics made a PS-TRE task difficult and moderated the effect of time-on-task on task outcome, we did not formulate any hypothesis on the by-person variance of the effect of time-on-task. However, to see whether it is necessary to extend M2 with a by-person adjustment, we tested whether introducing a by-person adjustment to the fixed time-on-task effect would result in a better fitting model (M3).
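Written out on the logit scale, the two models can be sketched as follows. The distributional statements are an assumption (the usual formulation of a logistic mixed model with normally distributed random effects); the text itself specifies only the terms. The symbols σ and ρ are notation introduced here for illustration, and the residual term ε_pt is left out because in this formulation the Bernoulli outcome carries the residual variation.

```latex
\begin{align}
\text{M1:}\quad \operatorname{logit} P(Y_{pt}=1) &= \beta_0 + \beta_1 t_{pt} + b_{0p} + b_{0t},\\
\text{M2:}\quad \operatorname{logit} P(Y_{pt}=1) &= \beta_0 + (\beta_1 + b_{1t})\, t_{pt} + b_{0p} + b_{0t},\\
b_{0p} \sim \mathcal{N}(0, \sigma^2_{0p}),&\qquad
\begin{pmatrix} b_{0t}\\ b_{1t}\end{pmatrix} \sim \mathcal{N}\!\left(\mathbf{0},
\begin{pmatrix}\sigma^2_{0t} & \rho\,\sigma_{0t}\sigma_{1t}\\
\rho\,\sigma_{0t}\sigma_{1t} & \sigma^2_{1t}\end{pmatrix}\right).
\end{align}
```

In this notation, the restricted model M2r corresponds to fixing ρ = 0, and M3 would add a by-person adjustment b_1p to the coefficient of t_pt.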

Model M2 served as a source model for the second part of our investigation. In that part, we explored the test characteristics that make a task more or less difficult and contributed to the deviations of the time-on-task effect. For these analyses, we studied the impact of only one chosen task characteristic at a time and considered tasks similar in all their other features. To do so, the random variables related to tasks were replaced by the examined characteristic variable in M1 and M2. The result of each full task characteristic model was compared to its corresponding restricted version and it was also tested if the goodness of fit of the full task-characteristic model shows improvement compared to a model without the by-task characteristic random time effect.

Before the analyses, we first standardized (z-score) and then log-transformed the time-on-task data. Time-on-task values more than three standard deviations above or below the mean were replaced by the corresponding limit value, that is, the mean plus or minus three standard deviations. Task outcomes were coded as correct or incorrect.
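The preprocessing can be sketched in a few lines of Python. The sketch rests on several assumptions that are not confirmed by the text: it applies the logarithm before standardizing (a logarithm is undefined for the negative values that z-scores take), it standardizes within whatever group of times is passed in, and it uses the sample standard deviation. The function name `transform_times` is hypothetical.

```python
import math
import statistics

def transform_times(times_ms, limit_sd=3.0):
    """Log-transform, z-standardize, and cap time-on-task values.

    times_ms : positive time-on-task values in milliseconds
    limit_sd : values beyond +/- limit_sd standard deviations are
               replaced by the limit itself
    """
    logs = [math.log(t) for t in times_ms]      # requires all times > 0
    mean = statistics.fmean(logs)
    sd = statistics.stdev(logs)                 # sample standard deviation
    z = [(x - mean) / sd for x in logs]
    # On the z scale the mean is 0 and the SD is 1, so the limit
    # "mean plus or minus three SDs" is simply +/- limit_sd.
    return [max(-limit_sd, min(limit_sd, v)) for v in z]

# Illustrative values only (not taken from the PIAAC data).
example = transform_times([90_000, 120_000, 150_000, 400_000])
```

Capping rather than dropping extreme values keeps every response in the dataset while limiting the leverage of very long or very short times.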

To test our hypotheses, we used the glmer function of the R package lme4, a widely used package for fitting mixed models (Bates et al., 2012).

Results and Discussion

The probability that an average test-taker solves an average task was 37.1% (β = −0.5287, z = −1.726, p < 0.1). However, the difficulty level of tasks varies considerably. Task U01a, for example, was solved by 61% of respondents while only 11% of test-takers answered item U02 correctly (Var(b_0t) = 1.303). We also found a significant positive effect of time-on-task (M1: β₁ = 0.56484, p < 0.001). Still, the results suggest that the time spent on task does not have the same meaning for all tasks. For example, subjects who solved U01a spent on average less time on the task (M = 111,037 ms) than their peers who did not find the right answer (M = 138,472 ms). On the other hand, an average person who arrived at the right solution of task U02 spent nearly twice as much time on it (M = 510,861 ms) as an average person who did not answer it correctly (M = 261,577 ms) (see Table 2).
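As a quick check of these figures, the fixed intercept can be converted from the logit scale to a probability, and the time-on-task coefficient can be read as an odds ratio. The sketch below uses only the two estimates reported in this paragraph; reading β₁ as the effect of a one-unit increase in transformed time-on-task is an interpretation offered here, not a statement from the analysis.

```python
import math

beta0 = -0.5287   # fixed intercept reported in the text (logit scale)
beta1 = 0.56484   # fixed time-on-task effect reported for M1

# Inverse logit: probability that an average test-taker solves an average task.
p_average = 1.0 / (1.0 + math.exp(-beta0))

# Odds ratio associated with a one-unit increase in transformed time-on-task.
odds_ratio = math.exp(beta1)

print(f"P(correct) = {p_average:.3f}")   # about 0.371, i.e. 37.1%
print(f"odds ratio = {odds_ratio:.2f}")  # about 1.76
```

The first value reproduces the 37.1% quoted above; the second suggests that, on average across tasks, the odds of a correct response rise by roughly three quarters per unit of transformed time-on-task.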

Table 2.

Descriptive Statistics by Task and Outcome.

Item id	Result	n	M_Time (ms)	SD_Time (ms)	M_zTime	SD_zTime
U01a	Not correct	2030	138472.5	106216.1	0.181294	1.087718
U01a	Correct	3217	111037.3	85180.71	−0.12111	0.891944
U01b	Not correct	2138	154288.4	115262.5	−0.30469	1.199317
U01b	Correct	2878	181697.9	106694.4	0.232033	0.677676
U02	Not correct	4067	261577.8	386146.3	−0.12337	0.988859
U02	Correct	562	510861.7	358943.4	0.892414	0.401037
U03a	Not correct	2833	124174.8	496151.8	−0.38575	0.983929
U03a	Correct	2173	185589.5	121682.1	0.498939	0.733501
U04a	Not correct	3688	388835.5	487033.1	−0.10637	1.019589
U04a	Correct	662	565871.4	262339.7	0.610982	0.426228
U06a	Not correct	3554	156948.7	179941.9	−0.0655	1.054497
U06a	Correct	1464	163128.3	87999.37	0.161923	0.742838
U06b	Not correct	2288	134914.1	94924.36	−0.18831	1.043426
U06b	Correct	2153	166307.2	115682.6	0.2046	0.877624
U07	Not correct	2241	103945.2	74352.9	−0.48278	0.988161
U07	Correct	2559	172891.6	91273.79	0.432225	0.741069
U11b	Not correct	3229	140810	330723.1	0.042677	1.043494
U11b	Correct	1062	106555.8	73796.13	−0.12542	0.732719
U16	Not correct	1726	136058.9	111292.7	−0.39878	1.088563
U16	Correct	2819	192297.3	136637.1	0.25136	0.812541
U19a	Not correct	1347	155970.4	110918.9	0.025566	1.175731
U19a	Correct	3663	142364.7	85509.78	−0.00706	0.89583
U19b	Not correct	1928	202333	152004.5	−0.3143	1.117328
U19b	Correct	2625	261571.8	360006.2	0.238116	0.766199
U21	Not correct	2810	192061	105669.2	−0.2064	1.061253
U21	Correct	2253	231717.9	115331.7	0.278562	0.69971
U23	Not correct	2757	115429.3	95787.91	−0.26681	1.064203
U23	Correct	1778	156941.8	91163.35	0.418382	0.671047

Note. Tasks are coded as in the OECD data file (https://webfs.oecd.org/piaac/puf-data/).

Figure 1 shows the variations of time-on-task by the type of answer for each task (incorrect = 0, correct = 1).

Figure 1.

Time-on-Task Box-Plots Showing the Variations of Time-on-Task by the Type of Answer for Each Task.

M2 indicates that the time-on-task effect varies across tasks (Var(b_1t) = 0.3139) and that the correlation of the by-task random adjustment with the random task intercept b_0t, which reflects task easiness, is negative (Corr(b_0t, b_1t) = −0.46). Moreover, the model comparison tests show that M2 fits the data better than its restricted version (M2r) and our base model (M1). This means that the positive effect of time-on-task was even stronger in the case of difficult tasks. M3 shows that the variance of the by-person time-on-task effect is small (Var(b_1p) = 0.02196) and that its correlation with the by-person random intercept is −1.00. A correlation of −1 signals overparametrization of the model. In addition, according to the restricted model (M3r), the time-on-task effect does not vary across test-takers (Var(b_1p) = 0.00000). Taken together, the results suggest that the data are insufficient to show that spending more time on a task helped low-skilled subjects more than skilled problem solvers. Therefore, we did not extend M2 with the by-person random time-on-task effect. Table 3 displays the results of the first line of analyses.
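To give a sense of scale for the by-task variance, the sketch below computes the range within which roughly 95% of task-specific time-on-task effects would fall if the by-task adjustments were normally distributed around the fixed effect. Normality of the random effects is assumed here for illustration; the inputs are the M2 estimates reported in Table 3.

```python
import math

beta1_m2 = 0.6949   # fixed time-on-task effect in M2 (Table 3)
var_b1t = 0.3139    # variance of the by-task adjustment in M2 (Table 3)

sd_b1t = math.sqrt(var_b1t)
lower = beta1_m2 - 1.96 * sd_b1t
upper = beta1_m2 + 1.96 * sd_b1t

print(f"SD of by-task adjustments: {sd_b1t:.2f}")
print(f"approx. 95% range of task-specific effects: [{lower:.2f}, {upper:.2f}]")
```

Under this assumption the range includes zero and slightly negative values, which is in line with the descriptive finding that for some tasks, such as U01a, correct responses were on average faster than incorrect ones.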

Table 3.

GLMM Statistics for the First Line of Analyses.

Model	Fixed effect β ₁	Variance of random effect	Correlation of random effects	Model comparison to M1 Chisq (df)	Model comparison to its restricted version Chisq (df)
M1	0.56484***
M2	0.6949***	0.3139	−0.46	1757.9 (2)***	3.2242 (1)^†
M2r	0.6930***	0.3105
M3	0.54942***	0.02195	−1.000	66.048 (2)***	66.046 (1)***
M3r	0.56485***	0.000

^†p < .1. *p < .05. **p < .01. ***p < .001.
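The model comparisons in Table 3 are reported as chi-square statistics with their degrees of freedom. For one and two degrees of freedom, the upper-tail probability of the chi-square distribution has a closed form, so the significance markers can be checked with the Python standard library alone. The helper name `chi2_sf` is hypothetical, and the sketch is a way of reading the table, not the authors' procedure.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable for df = 1 or 2."""
    if df == 1:
        # A chi-square(1) variable is the square of a standard normal one.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # A chi-square(2) variable is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("closed form implemented only for df = 1 or 2")

p_m2_vs_m2r = chi2_sf(3.2242, 1)   # M2 against its restricted version
p_m2_vs_m1 = chi2_sf(1757.9, 2)    # M2 against M1
```

The first value lies between .05 and .10, matching the † marker in the table, while the second is far below .001.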

In the second line of analyses, we explored the major task characteristics that influenced the outcome and the time propensity of the tasks. All full task characteristic models fit the data better than their baseline versions without the by-task characteristic random time effect. Yet, the random by-task characteristic correlations show considerable dissimilarities (see Tables 4 and 5). The random time effect of proficiency level (pl) and of the cognitive sub-task of acquiring and evaluating information (ae) correlated positively with the random intercept (Corr(b_0pl, b_1pl) = 0.12, Corr(b_0ae, b_1ae) = 0.28). This means that the positive time-on-task effect became weaker for tasks belonging to a higher proficiency level or placing high demands on information acquisition and evaluation. Thus, spending more time on such tasks had a weaker positive influence on task outcome than in the case of easier tasks. Put simply, compared with easier tasks, at higher proficiency levels and for tasks imposing a higher cognitive load on information acquisition and evaluation, there were more test-takers who could not solve the tasks even if they spent more time on them. Moreover, we identified three cognitive task characteristics of problem-solving processes that potentially contributed to the escalation of the time-on-task effect. However, the correlation of the random effects is very high for all three: monitoring progress (Corr(b_0mp, b_1mp) = −0.92), goal setting (Corr(b_0gs, b_1gs) = −0.91) and the number of cognitive processes (Corr(b_0ncp, b_1ncp) = −0.97). Such strong relationships mean that the positive time-on-task effect does not vary along those three task characteristics in a meaningful way. Consequently, H3 and H4 are supported.

Table 4.

Descriptive Statistics by Task Characteristic and Outcome.

Task characteristics	TC level	Result	N	M Time (ms)	SD Time (ms)	M zTime	SD zTime
Proficiency level	1	Incorrect	5103	142274.9	109492.1	−0.05601	1.140068
	1	Correct	9699	146486.8	107821	0.03022	0.883761
	2	Incorrect	16995	145413.9	227570.9	−0.30471	1.066664
	2	Correct	16419	195775.1	178102.4	0.322498	0.748031
	3	Incorrect	14538	241459.2	379062.2	−0.06803	1.026978
	3	Correct	3750	270318.2	268624.6	0.269299	0.74218
Monitor progress	1	Incorrect	7001	137516.8	327225	−0.19658	1.110488
	1	Correct	8268	155227.3	109181.2	0.164775	0.820608
	2	Incorrect	16882	159689.3	190490.1	−0.14013	1.08263
	2	Correct	16039	185940.3	181859.2	0.154607	0.818348
	3	Incorrect	12753	239084.3	363651.3	−0.21262	1.023676
	3	Correct	5561	248729.2	227888	0.495586	0.675029
Impasses	No	Incorrect	15916	142885.4	241014.5	−0.16925	1.09734
	No	Correct	18367	160280.2	110049.7	0.148236	0.839992
	Yes	Incorrect	20720	213972.3	321551.7	−0.18145	1.045738
	Yes	Correct	11501	235199.7	244061.3	0.336963	0.731077
Goal setting	1	Incorrect	7001	137516.8	327225	−0.19658	1.110488
	1	Correct	8268	155227.3	109181.2	0.164775	0.820608
	2	Incorrect	6627	151309.1	152211	−0.13379	1.100688
	2	Correct	7946	163904.9	109170.3	0.115754	0.848262
	3	Incorrect	10255	165104.7	211405.2	−0.14422	1.070833
	3	Correct	8093	207575.5	230004.9	0.192755	0.78606
	4	Incorrect	12753	239084.3	363651.3	−0.21262	1.023676
	4	Correct	5561	248729.2	227888	0.495586	0.675029
Acquiring/evaluating info	1	Incorrect	10074	139734.4	279671.6	−0.20152	1.121693
	1	Correct	14750	159117.8	111143	0.13865	0.843167
	2	Incorrect	12965	149297.7	193459.4	−0.22103	1.068252
	2	Correct	10277	198824.6	206583.9	0.288937	0.744078
	3	Incorrect	13597	247432.6	358676.3	−0.11456	1.024526
	3	Correct	4841	259985.6	243866.8	0.327116	0.784711
No of cognitive processes	1	Incorrect	3073	144786.8	111549.4	−0.21278	1.146914
	1	Correct	6482	164080.2	113411.2	0.105326	0.87002
	2	Incorrect	7001	137516.8	327225	−0.19658	1.110488
	2	Correct	8268	155227.3	109181.2	0.164775	0.820608
	3	Incorrect	13762	158058.6	203690.7	−0.17169	1.06712
	3	Correct	9963	201053.7	208676.9	0.247173	0.754621
	4	Incorrect	5045	124266	95879.84	−0.23121	1.055451
	4	Correct	3931	162071.2	105392.4	0.301294	0.797933
	5	Incorrect	7755	322097	441607	−0.11528	1.003561
	5	Correct	1224	540613.7	311526.5	0.740202	0.437776
Technology load	1	Incorrect	15952	154116.5	195233.9	−0.1027	1.115104
	1	Correct	17728	168226	173681.4	0.095319	0.823436
	2	Incorrect	11050	210627.8	401555.3	−0.2713	1.020436
	2	Correct	7547	209140.3	170949.6	0.402177	0.771079
	3	Incorrect	9634	199477.6	269229.6	−0.18864	1.033834
	3	Correct	4593	236927.3	192753.7	0.407799	0.686289
Intrinsic task complexity	1	Incorrect	16857	143569.8	273058.1	−0.12606	1.097464
	1	Correct	17276	156226.5	108165	0.124389	0.83115
	2	Incorrect	6457	144296.5	116657.2	−0.32813	1.054702
	2	Correct	7337	202687.1	234792.3	0.295982	0.798466
	3	Incorrect	13322	251898.5	353904.6	−0.16586	1.03053
	3	Correct	5255	278366.1	230319.8	0.433395	0.662609
Minimum no of steps	1	Incorrect	18785	149601	263810.8	−0.14538	1.100972
	1	Correct	19901	170121.9	168868.9	0.13939	0.823758
	2	Incorrect	17851	218330.4	314155.2	−0.20852	1.032192
	2	Correct	9967	227079.7	189559.8	0.383672	0.739985
Text difficulty	1	Incorrect	11855	177573.8	381136.9	−0.18094	1.033123
	1	Correct	6616	203391.6	185226.5	0.326651	0.793621
	2	Incorrect	7803	144783.2	106873.8	−0.08712	1.138506
	2	Correct	11911	147735.3	100444.5	0.058166	0.856067
	3	Incorrect	5795	136451.6	150529.8	−0.22687	1.049137
	3	Correct	4023	169338.7	90207.41	0.33386	0.75294
	4	Incorrect	11183	239833.1	314982.3	−0.20691	1.061214
	4	Correct	7318	254487	264707.3	0.328096	0.707014
Initial complexity	1	Incorrect	9961	191711.9	267532	−0.14792	1.092047
	1	Correct	9476	180384.5	165087.5	0.157059	0.823443
	2	Incorrect	11904	140536.1	315876.1	−0.13598	1.045109
	2	Correct	6852	162482.1	109789.6	0.237677	0.810647
	3	Incorrect	6398	153598.7	104605.7	−0.25437	1.078482
	3	Correct	8475	175335.9	102598.4	0.201511	0.824105
	4	Incorrect	8373	255865.5	357414	−0.20708	1.061264
	4	Correct	5065	264615.1	308214.1	0.35013	0.709196

Table 5.

GLMM Statistics for the Task Characteristics Models.

Model name	Effect tested	Fixed effect β₁	Variance of random effect	Correlation of random effects	Model comparison to its baseline Chisq (df)
Mtc1	random effect by proficiency level	0.4250**	0.07591	0.12	681.4623 (1)***
Mtc1r	Mtc1 without random effect correlation	0.4250**	0.0759		0.0397 (1)
Mtc2	random effect by monitoring progress	0.5414***	0.0388	−0.92	270.6855 (1)***
Mtc2r	Mtc2 without random effect correlation	0.5421***	0.03825		5.3863 (1)*
Mtc3	random effect by number of impasses	0.48225***	0.01006	−1.00	76.381 (1)***
Mtc3r	Mtc3 without random effect correlation	0.48103***	0.00887		17.790 (1)***
Mtc4	random effect by goal-setting	0.49655***	0.0356	−0.91	266.4855 (1)***
Mtc4r	Mtc4 without random effect correlation	0.49642***	0.0354		6.9104 (1)**
Mtc5	random effect by acquiring/evaluating	0.48839***	0.009599	0.28	71.7317 (1)***
Mtc5r	Mtc5 without random effect correlation	0.48866***	0.009552		0.2315 (1)
Mtc6	random effect by number of cognitive activities	0.6719***	0.112	−0.97	187.213 (1)***
Mtc6r	Mtc6 without random effect correlation	0.6751***	0.1139		14.537 (1)***
Mtc7	random effect by intrinsic task difficulty	0.5900***	0.03885	−0.37	603.1827 (1)***
Mtc7r	Mtc7 without random effect correlation	0.5899***	0.03886		3.8458 (1)*
Mtc8	random effect by necessary minimum steps	0.5087***	0.02463	−1.00	208.438(1)***
Mtc8r	Mtc8 without random effect correlation	0.5080***	0.02384		16.186 (1)***
Mtc9	random effect by text difficulty	0.51794***	0.03485	−0.86	366.2302 (1)***
Mtc9r	Mtc9 without random effect correlation	0.5189***	0.03957		5.2415 (1)*
Mtc10	random effect by initial complexity	0.48286***	0.02305	−0.1	<2e−16 (1)***
Mtc10r	Mtc10 without random effect correlation	0.48284***	0.02304		0.836
Mtc11	random effect by technology load	0.5838***	0.05536	−0.85	394.7869 (1)***
Mtc11r	Mtc11 without random effect correlation	0.5831***	0.0553		0.4311 (1)

*p < .05. **p < .01. ***p < .001.

Intrinsic task and ICT-related characteristics clearly contributed to the escalation of the overall positive effect of time-on-task. For all four task characteristics belonging here, the corresponding full Mtc was the best fitting model and the correlation of the random by-characteristic effects is negative (technology load: Corr(b_0itl, b_1itl) = −0.85; inherent task difficulty: Corr(b_0itd, b_1itd) = −0.37; initial task complexity: Corr(b_0itc, b_1itc) = −0.1; text difficulty: Corr(b_0ill, b_1ill) = −0.86). As a result, H1 and H2 are supported.

The full models on the effect of the minimum number of steps (nms) and impasses (ni) showed overparametrization (Corr(b_0nms, b_1nms) = −1, Corr(b_0ni, b_1ni) = −1). The corresponding restricted models proved superior to the baseline models, so the positive time effect does vary to some degree along those characteristics, but the data are insufficient to characterize this variation further.

Overall, it seems that time-on-task can compensate a person with average abilities for difficulties deriving from intrinsic task and ICT complexity more efficiently than for the complexity of the cognitive/metacognitive processes of solving information problems, especially evaluating and acquiring information. Figures 2 and 3 show the estimated parameters of a randomly chosen test-taker.

Figure 2.

Outcome as a Function of Time-on-Task at the Different Levels of Intrinsic Task Complexity. Estimated Parameters of a Randomly Chosen Test-Taker.

Figure 3.

Outcome as a Function of Time-on-Task at the Different Task Difficulty Levels Stemming From Acquiring and Evaluating Information. Estimated Parameters of a Randomly Chosen Test-Taker.

Conclusions

Solving complex information problems in an electronic environment is a key 21st-century skill that should be taught and learned. Additionally, DPS should be supported by appropriate environment design. However, the outcome of DPS activities alone does not reveal when and why individuals fail to achieve their goals. The time spent on specific tasks may help researchers discover more about the reasons behind failures or inefficient DPS behavior. Our results indicate that different task characteristics do not contribute to DPS performance in the same way and relate differently to problem-solving effort. Spending more time on a task is more likely to compensate an average problem solver for intrinsic task and technology drivers of difficulty than for the imposed cognitive activities of information problem-solving processes, especially the need to acquire and evaluate information. Thus, our results, showing that more time on a complex problem-solving task is associated with higher performance, are in line with the time-accuracy trade-off and dual processing theories (Chen et al., 2018; Goldhammer et al., 2014). Additionally, this study produced new information on how the sub-processes of complex problem-solving and task characteristics moderate the link between time-on-task and accuracy (see De Boeck & Jeon, 2019; Greiff et al., 2018).

In sum, the meaning of time-on-task is not consistent across problem-solving tasks. The interpretation of time-on-task should take both task difficulty and the source of difficulty into consideration. Teachers, web designers, and test developers could use analyses of task characteristics, together with outcome and time-on-task data, to calibrate task difficulty, to identify sub-processes and task features that cause problems, to gain a better understanding of individual abilities, and, most importantly, to find ways to help individuals enhance their performance and decrease the cognitive load of DPS. PIAAC PS-TRE results (OECD, 2019) reinforce that educational interventions aimed at developing DPS skills would be justified. As the PIAAC PS-TRE framework document (OECD, 2009) mentions, information-rich problems, where the task complexity lies in the information to be accessed and used, deserve particular attention.

Finally, we note the limitations of our study and propose some pathways for future research. First, because we did not control the time that test-takers spent on the tasks, we can report associations between time-on-task and task outcome, but we cannot state that the amount of time spent on a task influences accuracy. Second, PIAAC PS-TRE consisted of fourteen tasks. A larger set of tasks, designed to differ clearly along the characteristics and to fall into more distinct categories, would help corroborate the results. It would also be interesting to investigate which interface designs lead to higher performance. Interfaces (menu systems and icons, for example) are usually designed to offset technical difficulties but are typically less supportive in helping users overcome difficulties deriving from processing a large quantity of information. Additionally, log files reveal the processes leading to specific outcomes. Combining analyses of task characteristics, DPS processes, and their timing is a promising technique to identify the causes of low DPS performance and to counterbalance it through advanced environment design. Third, we did not manipulate or account for the motivation of test-takers. The motivation level of test-takers, i.e., how hard they tried to solve the test or how intensively they used their time, arguably has a major effect on the relationship between time-on-task and outcome. Further studies should identify subjects who fall below a certain engagement level or who spend time on processes unrelated to the task. Analyzing the data of other countries and the results of other DPS tasks would be useful to see whether the results are generalizable.

Footnotes

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The project was financed by the Higher Education Institutional Excellence Programme of the Ministry of Human Capacities in Hungary, within the framework of the 4th thematic programme “Enhancing the Role of Domestic Companies in the Reindustrialization of Hungary” of the University of Pécs (reference number of the contract: 20765-3/2018/FEKUTSTRAT) and by the European Union, co-financed by the European Social Fund (Grant no.: EFOP-3.6.1.-16-2016-00004, entitled “Comprehensive Development for Implementing Smart Specialization Strategies at the University of Pécs”).

Author Biographies

Zsófia Vörös is a researcher at the Faculty of Business and Economics of the University of Pécs, Hungary. Her research investigates cognitive dimensions that may underlie digital reading, learning and problem solving. She earned her PhD in psychology from the University of Poitiers, France (2009).

Dániel Kehl is an associate professor in the Department of Economics and Econometrics at the University of Pécs. He holds a PhD in Economics (UP, 2012). His main research interests include a wide range of applied statistical methods in various fields.

Jean-François Rouet (PhD, University of Poitiers, France) is a senior research scientist with the French Centre National de la Recherche Scientifique (National Center for Scientific Research). His research addresses the cognitive underpinnings of reading literacy and information technology use. He has been involved since 2006 in the OECD's PISA and PIAAC surveys of teenager and adult skills, respectively. He has recently coauthored Literacy beyond text comprehension: A theory of purposeful reading with Anne Britt and Amanda Durik (Taylor and Francis, 2018).

References

Anmarkrud, Ø., Andresen, & Bråten (2019). Cognitive load and working memory in multimedia learning: Conceptual and measurement issues. Educational Psychologist, 54(2), 61–83.

Azevedo, Moos, D. C., Johnson, A. M., & Chauncey, A. D. (2010). Measuring cognitive and metacognitive regulatory processes used during hypermedia learning: Issues and challenges. Educational Psychologist, 45(4), 210–223.

Bates, Maechler, & Bolker (2012). lme4: Linear mixed-effects models using S4 classes (R Package Version 0.999999–999990). http://cran.r-project.org/web/packages/lme4/index.html

Chang, H. H. (2015). Psychometrics behind computerized adaptive testing. Psychometrika, 80(1), 1–20.

Chen, De Boeck, Grady, Yang, C.-L., & Waldschmidt (2018). Curvilinear dependency of response accuracy on response time in cognitive tests. Intelligence, 69, 16–23.

De Boeck (2008). Random item IRT models. Psychometrika, 73(4), 533–559.

De Boeck & Jeon (2019). An overview of models for response times and processes in cognitive tests. Frontiers in Psychology, 10, 102.

DeStefano & LeFevre, J. A. (2007). Cognitive load in hypertext reading: A review. Computers in Human Behavior, 23(3), 1616–1641.

Fischer, A., S., & Funke (2011). The process of solving complex problems. The Journal of Problem-Solving, 4, 19–42.

Goldhammer & Kroehne (2014). Controlling individuals’ time spent on task in speeded performance measures: Experimental time limits, posterior time limits, and response time modeling. Applied Psychological Measurement, 38(4), 255–267.

Goldhammer, Naumann, & Keßel (2013). Assessing individual differences in basic computer skills: Psychometric characteristics of an interactive performance measure. European Journal of Psychological Assessment, 29(4), 263–275.

Goldhammer, Naumann, Rölke, Stelter, & Tóth (2017). Relating product data to process data from computer-based competency assessment. In D. Leutner, J. Fleischer, J. Grünkorn, & E. Klieme (Eds.), Competence assessment in education (pp. 407–425). Springer.

Goldhammer, Naumann, Stelter, Tóth, Rölke, & Klieme (2014). The time-on-task effect in reading and problem-solving is moderated by task difficulty and skill: Insights from a computer-based large-scale assessment. Journal of Educational Psychology, 106(3), 608–626.

Goldhammer, Steinwascher, M. A., Kroehne, & Naumann (2017). Modelling individual response time effects between and within experimental speed conditions: A GLMM approach for speeded tests. The British Journal of Mathematical and Statistical Psychology, 70(2), 238–256.

Greiff, Krkovic, & Nagy (2014). The systematic variation of task characteristics facilitates the understanding of task difficulty: A cognitive diagnostic modeling approach to complex problem-solving. Psychological Test and Assessment Modeling, 56(1), 83–103.

Greiff, Molnár, Martin, Zimmermann, & Csapó (2018). Students’ exploration strategies in computer-simulated complex problem environments: A latent class approach. Computers & Education, 126, 248–263.

Greiff, Wüstenberg, Csapó, Demetriou, Hautamäki, Graesser, A. C., & Martin (2014). Domain-general problem-solving skills and education in the 21st century. Educational Research Review, 13, 74–83.

Liao & Jiao (2019). Clustering behavioral patterns using process data in PIAAC Problem-Solving items. In B. Veldkamp & C. Sluijter (Eds.), Theoretical and practical advances in computer-based educational measurement. Methodology of educational measurement and assessment (pp. 189–212). Springer.

Heitz, R. P. (2014). The speed-accuracy tradeoff: History, physiology, methodology, and behavior. Frontiers in Neuroscience, 8, 150.

Kalyuga (2007). Enhancing instructional efficiency of interactive e-learning environments: A cognitive load perspective. Educational Psychology Review, 19(3), 387–399.

Jacobs, G. E., & Castek (2018). Digital problem solving: The literacies of navigating life in the digital age. Journal of Adolescent & Adult Literacy, 61(6), 681–685.

Lazonder, A. W., & Rouet, J.-F. (2008). Information problem-solving instruction: Some cognitive and metacognitive issues. Computers in Human Behavior, 24(3), 753–765.

Mayer, R. E. (1992). Thinking, problem-solving, cognition (2nd ed.). W. H. Freeman and Company.

Naumann (2019). The skilled, the knowledgeable, and the motivated: Investigating the strategic allocation of time-on-task in a computer-based assessment. Frontiers in Psychology, 10, 1429.

Naumann & Goldhammer (2017). Time-on-task effects in digital reading are non-linear and moderated by persons’ skills and tasks’ demands. Learning and Individual Differences, 53, 1–16.

Organisation for Economic Co-operation and Development. (2009). Problem-solving in technology-rich environments: A conceptual framework (OECD Education Working Paper No. 36). OECD Publishing.

Organisation for Economic Co-operation and Development. (2013). Technical report of the survey of adult skills (PIAAC). http://www.oecd.org/site/piaac/_Technical%20Report_17OCT13.pdf

Organisation for Economic Co-operation and Development. (2019). Skills matter: Additional results from the survey of adult skills. OECD Publishing. https://doi.org/10.1787/1f029d8f-en

Petscher, Mitchell, A. M., & Foorman, B. R. (2015). Improving the reliability of student scores from speeded assessments: An illustration of conditional item response theory using a computer-administered measure of vocabulary. Reading and Writing, 28(1), 31–56.

Raes, Schellens, & Wever, B. D. (2010). The impact of web-based collaborative inquiry for science learning in secondary education. In K. Gomez, L. Lyons, & J. Radinsky (Eds.), Proceedings of the 9th international conference of the learning sciences (Vol. 1, p. 6). International Society of the Learning Sciences.

Rouet, J. F., Britt, M. A., & Durik, A. M. (2017). RESOLV: Readers’ representation of reading contexts and tasks. Educational Psychologist, 52(3), 200–215.

Rouet, J. F., Vörös, & von Davier (2016). Assessing problem-solving in technology-rich environments: What can we learn from online strategy indicators? In Y. Rosen, S. Ferrara, & M. Mosharraf (Eds.), Handbook of research on technology tools for real-world skill development (pp. 706–724). IGI Global.

Scherer, Greiff, & Hautamäki (2015). Exploring the relation between speed and ability in complex problem-solving. Intelligence, 48, 37–50.

Schneider & Shiffrin, R. M. (1977). Controlled and automatic human information processing. Psychological Review, 84(1), 1–66.

Sonnleitner, Keller, Martin, & Brunner (2013). Students’ complex problem-solving abilities: Their structure and relations to reasoning ability and educational success. Intelligence, 41(5), 289–305.

Davison, M. L. (2019). Improving the predictive validity of reading comprehension using response times of correct item responses. Applied Measurement in Education, 32(2), 166–182.

Sweller (2011). Cognitive load theory. In J. P. Mestre & B. H. Ross (Eds.), Psychology of learning and motivation (Vol. 55, pp. 37–76). Academic Press.

Vörös & Rouet, J. F. (2016). Laypersons’ digital problem-solving: Relationships between strategy and performance in a large-scale international survey. Computers in Human Behavior, 64, 108–116.

Vörös, Rouet, J. F., & Pléh (2011). Effect of high-level content organizers on hypertext learning. Computers in Human Behavior, 27(5), 2047–2055.

Wang & Hanson, B. A. (2005). Development and calibration of an item response model that incorporates response time. Applied Psychological Measurement, 29(5), 323–339.

Whitelock-Wainwright, Laan, Wen, & Gašević (2020). Exploring student information problem-solving behaviour using fine-grained concept map and search tool data. Computers & Education, 145, 103731.

Winne, P. H. (2011). A cognitive and metacognitive analysis of self-regulated learning. In B. J. Zimmerman & D. H. Schunk (Eds.), Handbook of self-regulation of learning and performance (pp. 15–32). Routledge.