Abstract
Background.
Aim. As
Need for support. Study 1 was
Model progression and worked examples. In Studies 2 and 3, the need for support was addressed by
Implications. The pattern of results across the three studies is discussed with regard to students’ use of available resources, the influence of prior knowledge, and the relationship between performance and learning.
Keywords
Technology-enhanced inquiry learning environments enable students to learn science by doing science, offering resources to develop a deep understanding of a domain by engaging in scientific reasoning processes such as hypothesis generation, experimentation, and evidence evaluation. The central aim of this investigative learning mode is twofold: students should develop domain knowledge and proficiency in scientific inquiry (see Gobert & Pallant, 2004).
Computer-mediated simulations have long resided at the heart of these environments. A simulation can be defined as a program that incorporates an interactive model, which can be repeatedly changed and re-run in order for students to understand that model (Alessi, 2000). Several studies have shown that, compared to traditional, more expository forms of instruction, learning with simulations is more effective for promoting science content knowledge, developing process skills, and facilitating conceptual change (e.g., Eysink et al., 2009; Scalise et al., 2011). These promising results, however, only hold when the inquiry process is adequately structured and scaffolded.
Simulations are increasingly being supplemented with opportunities for students to build computer models of the phenomena they are investigating via the simulation. Given that scientists often use models during their inquiry, modeling is considered an integral part of the inquiry learning process: students can build computer models to express their understanding of the relations between variables (de Jong & van Joolingen, 2008; van Joolingen, de Jong, Lazonder, Savelsbergh, & Manlove, 2005; White, Shimoda, & Frederiksen, 1999).
According to de Jong and van Joolingen (2008), when combining learning from simulations and learning by modeling, both approaches will reinforce each other. During modeling, students ideally go through four distinguishable stages: (1) model sketching, (2) model specification, (3) data interpretation, and (4) model revision (cf. Hogan & Thomas, 2001). Combining these stages within the inquiry learning activities provides a description of the integrated inquiry learning approach (cf. van Joolingen et al., 2005). Students have to gain their understanding of the phenomenon by performing experiments with a simulation, which does not show them the computational model that drives the simulation. So when students have no prior knowledge about the domain, they conduct one or more exploratory experiments to gain an initial understanding of the phenomenon. Students with prior knowledge can skip this step and immediately start sketching a model outline to express their initial understanding. Subsequently, students form hypotheses, which they can investigate with the simulation. The results of these experiments are then used to transform the model sketch into a runnable model by specifying the relations between the variables in the model. A student’s model can thus be conceived of as a set of hypotheses derived from prior knowledge or simulation output. During data interpretation students compare their model to data from the simulation, which during the conclusion phase feeds their decisions to revise the model.
This ideal sequence of steps is rarely observed in practice. Students generally have difficulty with both inquiry and modeling, which challenges the educational effectiveness of the integrated inquiry learning approach. For example, students are unable to infer hypotheses from (simulation) data, design inconclusive experiments, show inefficient experimentation behavior, and ignore incompatible data (for extensive reviews, see de Jong & van Joolingen, 1998; Zimmerman, 2007). Regarding modeling, the review by VanLehn (2013) suggests that students have difficulties in understanding the presentation and modeling language. For instance, Löhner, van Joolingen, and Savelsbergh (2003) found that students have difficulty entering equations in a modeling language and that it is thus advisable to use a graphical modeling language. Additionally, VanLehn showed that problems arise because students fail to adequately test and debug their model. For instance, Hogan and Thomas (2001) noticed that students often failed to let model output guide their revision of the model, and Stratford, Krajcik, and Soloway (1998) demonstrated students’ lack of persistence in debugging their model to fine-tune its behavior.
These findings suggest that students’ difficulties with both inquiry and modeling lie at a conceptual level. Most students manage to design and conduct experiments with a simulation; inferring knowledge from these experiments appears to be the major source of difficulty. Likewise, students are capable of building syntactically correct models, but they find it difficult to express their ideas in a modeling formalism (Sins, Savelsbergh, & van Joolingen, 2005). As this ineffective behavior is a serious obstacle to learning, students might benefit from additional support during their inquiry and modeling practices.
Our work over the past few years has aimed to identify and provide for students’ support needs during this integrated inquiry learning approach (see Mulder, Lazonder, & de Jong, 2010, 2011, 2014). These studies were conducted in a technology-enhanced inquiry learning environment that provided students with a simulation of a charging capacitor and a model editor to mimic the behavior of this capacitor. The first study (Mulder et al., 2010) concerned an empirical assessment of high school students’ need for support. Main results showed that the domain novices generally exhibited the same inquiry behavior as their more knowledgeable counterparts. However, the domain novices created rather naïve models, which indicated that they acquired almost no knowledge from their inquiry. This suggests that, without prior knowledge (and instructional support) the expert-like behavior was probably less appropriate and certainly less effective for novices.
This outcome suggested that support for the integrated inquiry learning approach should help students to align their inquiry activities with their level of domain knowledge. Model progression (White & Frederiksen, 1990) is probably the least intrusive form of support that aims to pave students’ way through an inquiry by carefully structuring the task content according to a simple-to-complex sequence. Model progression was found to lead to higher performance success in some studies (Alessi, 1995; Eseryel & Law, 2010; Rieber & Parmley, 1995; Swaak, van Joolingen, & de Jong, 1998), but other studies report less favorable results (de Jong et al., 1999; Quinn & Alessi, 1994). These differential effects might be attributable to the slightly different configurations of the simple-to-complex sequencing. Following White and Frederiksen (1990), two types of model progression can be distinguished. Model order progression (MOP) gradually increases the specificity of the relations between variables, whereas model elaboration progression (MEP) gradually expands the number of variables in the task. To examine which type of model progression would best suit students’ needs, the second study (Mulder et al., 2011) compared MOP, MEP, and a control group that received no additional support. Model progression in general was found to lead to higher performance success, and participants in the MOP condition outperformed those from the MEP condition.
However, despite the statistical significance of the performance improvement (p = .001), the absolute learning gains in this study were quite modest. It thus seems that students need more explicit support for the integrated inquiry learning approach to be effective. Such support could take the form of worked examples, which are a proven fruitful means to enhance problem-solving performance (e.g., Atkinson, Derry, Renkl, & Wortham, 2000; Sweller, Ayres, & Kalyuga, 2011; Sweller & Cooper, 1985). Worked examples typically include a problem statement, a step-by-step account of the procedure to solve the problem, and the final solution. More recently, a variant that can be applied in non-algorithmic problem-solving situations has been proposed (Hilbert & Renkl, 2009; Hilbert, Renkl, Kessler, & Reiss, 2008). These so-called heuristic worked examples do not emphasize the specific action sequence students should follow to solve a problem, but exemplify the heuristic reasoning underlying the choice and application of this action sequence. Recent reviews of worked-examples research have demonstrated that heuristic worked examples can effectively be applied in a variety of domains such as doing mathematical proofs, concept mapping, and second language learning (Renkl, Hilbert, & Schworm, 2009; Sweller et al., 2011). Therefore, the third study (Mulder et al., 2014) explored whether complementing model progression with such heuristic worked examples that explain what the activities in each model progression phase entail, and demonstrate how they should be performed, would further enhance students’ inquiry and modeling performance and learning. Main findings confirmed that worked examples improved students’ inquiry and modeling behavior as well as the quality of the models they created. These models were nevertheless of mediocre quality, as were students’ scores on a knowledge posttest. The latter finding implied that the worked examples did not help advance learning outcomes.
The present article aims to uncover why performance success, despite improvements, remained low in the three studies as reported in the Mulder et al. (2010, 2011, 2014) articles. Toward this end, we examine patterns in students’ inquiry activities and modeling performance across the three studies. Three themes guided this research synthesis: students’ use of available learning resources, the influence of prior knowledge, and the relationship between performance and learning. In the sections below, we first introduce the setup and methods of the three studies, then summarize some of their key results, and finally discuss how these findings can help advance the design of support for the integrated inquiry learning approach.
Method of the Three Studies
Three studies were conducted to investigate how learning with computer-mediated simulations and models can be improved by integrated support; a brief overview of the studies is presented in Table 1.
Table 1. Overview of the Three Studies.
The Learning Environment
Students in all three studies worked on an inquiry task about the charging of a capacitor. As the charging of a capacitor is a process that changes over time, this topic lends itself well to System Dynamics modeling. The students’ assignment was to examine an electrical circuit in which a capacitor was embedded, and create a computer model that mirrors the capacitor’s charging behavior. Participants performed this task within a modified stand-alone version of the Co-Lab learning environment (van Joolingen et al., 2005) that housed a simulation of an electrical circuit containing a voltage source, two light bulbs, and a capacitor. Through systematic experimentation with this simulation participants could induce four physics equations: (1) Ohm’s law, (2) the junction rule of Kirchhoff’s law, (3) the loop rule of Kirchhoff’s law, and (4) the behavior of capacitors.
The learning environment also contained a model editor tool for participants to represent their knowledge of the four physics equations in an executable computer model. Due to the dynamic nature of the physics phenomenon, these models had the graphical structure of a stock and flow diagram (as can be seen from Figure 1) that consists of variables and relations. Variables are the constituent elements of a model and can be of three different types: variables that do not change over time (i.e., constants), variables that specify the integration of other variables (i.e., auxiliaries), and variables that accumulate over time (i.e., stocks). Relations define how two or more variables interact. Each relation is visualized by an arrow connector to indicate the causal link between model elements, and specified by a quantitative formula to indicate the exact nature of this relationship. For example, a basic element that changes over time and has an initial value (Charge) is represented as a stock. Flows are connected to a stock and indicate the changes in that stock. These changes are specified from the basic elements that remain constant (i.e., constants; e.g., capacitance (C), power source (S), and resistance (R1 and R2)) and from auxiliary elements (i.e., auxiliaries; e.g., potential difference across the capacitor (Vc), potential difference across the resistances (Vr), current (I), and total resistance (R)), which are connected by relation arrows.

Figure 1. Screen capture of the model editor tool displaying the reference model students had to build from their prior knowledge and/or insights gained through experimenting with the simulation. Students could add, delete, and change elements and relations in their models with the buttons on the left. The buttons in the top center of the screen enabled students to run their models.
The model editor also enabled participants to test their understanding by running the model and analyzing its output through a table and graph tool. These tools further allowed students to compare model and simulation output in a single window. Students could use the results of this comparison to adjust or fine-tune their model and thus build an increasingly elaborate understanding of a charging capacitor.
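To make the stock-and-flow structure concrete, the following minimal Python sketch runs a model of this kind with simple Euler integration. It is an illustration only, not the Co-Lab implementation: the parameter values are hypothetical, and the parallel arrangement of the two light bulbs is an assumption, because the text does not specify the circuit layout. The variable names follow the reference model elements listed above (Charge, C, S, R1, R2, Vc, Vr, I, R).

```python
# Hypothetical parameter values; the values used in Co-Lab are not reported.
S = 12.0    # constant: voltage source (V)
C = 0.5     # constant: capacitance (F)
R1 = 10.0   # constant: resistance of light bulb 1 (ohm)
R2 = 10.0   # constant: resistance of light bulb 2 (ohm)


def simulate(t_end=30.0, dt=0.01):
    """Run the charging-capacitor model and return (time, charge, vc, current) rows."""
    charge = 0.0  # stock: accumulates over time, starts at its initial value
    # ASSUMPTION: the two bulbs are wired in parallel (not stated in the text).
    r_total = 1.0 / (1.0 / R1 + 1.0 / R2)  # auxiliary: total resistance R
    history = []
    steps = int(round(t_end / dt))
    for step in range(steps + 1):
        vc = charge / C          # auxiliary: behavior of capacitors (Vc = Q / C)
        vr = S - vc              # auxiliary: loop rule (S = Vr + Vc)
        current = vr / r_total   # auxiliary: Ohm's law (I = Vr / R)
        history.append((step * dt, charge, vc, current))
        charge += current * dt   # flow: the current changes the stock (Euler step)
    return history


if __name__ == "__main__":
    for t, q, vc, i in simulate()[::500]:
        print(f"t={t:5.1f} s  Q={q:6.3f} C  Vc={vc:6.3f} V  I={i:6.3f} A")
```

Plotting Vc against time from such a run gives the kind of model output that students could compare with the simulation output in the table and graph tools.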
An embedded help file tool contained the assignment and offered explanations of the operation of the tools in the learning environment. The help files contained no domain information on electrical circuits and capacitors as this knowledge should be inferred from interacting with the simulation.
Variants of the Learning Environment Used in the Different Studies
All conditions used the same instructional content (i.e., electrical circuits), but differed with regard to the scaffolding mechanisms (see Figure 2). All participants in Study 1 and the control condition participants in Study 2 worked with the standard configuration of the environment (as described above) and thus received no scaffolding.

Figure 2. Schematic overview of the scaffolding in the four variants of the learning environment. Students in the control condition worked with a full-complex simulation and had to induce and build a full quantitative model. Students in the model elaboration progression (MEP) condition were given an increasingly elaborate simulation in each phase, which they had to model quantitatively. Students in both the model order progression (MOP) condition and the MOP + worked examples (MOP+WE) condition investigated a full-complex simulation in all three phases and had to induce and build increasingly specific models. Students in the MOP+WE condition additionally received worked examples for each model progression phase.
In the MEP condition in Study 2, the complexity of the simulation was gradually increased by adding components to the electrical circuit. The simulation in Phase 1 contained a circuit with a voltage source and one light bulb, enabling discovery of Ohm’s law. A second light bulb was added to the electrical circuit in the simulation in Phase 2, introducing the junction rule of Kirchhoff’s law. The capacitor was added to the simulation in Phase 3, introducing both the loop rule of Kirchhoff’s law and the behavior of capacitors. Participants had to induce and build a quantitative model of the circuit in each simulation. Over phases, participants could extend their model to incorporate the new elements. The possibility to engage in qualitative modeling was disabled in this condition.
Participants in the MOP condition in Study 2 and Study 3 received a full-complex version of the simulation, and were asked to induce and build increasingly specific models. Specificity pertained to the relations in the model and progressed in three phases from identifying a relation to quantitatively specifying that relation (cf. Lazonder, Wilhelm, & van Lieburg, 2009; Mulder et al., 2010). In Phase 1 students only had to indicate the model elements (variables) and which ones affected which others (relationships) – but not how they affected them. In Phase 2 students had to provide a qualitative specification of each relationship so as to indicate the general direction of effect (e.g., if resistance increases, then current decreases). In Phase 3 students had to specify each relationship quantitatively in the form of an actual equation (e.g., I = V / R).
The MOP+WE condition in Study 3 provided students with seven worked examples: one introductory example to introduce the domain, and two specific examples for each model progression phase. These examples demonstrated the heuristic strategies students should apply to cycle effectively through the processes of hypothesis generation, experimentation, and evidence evaluation. In each model progression phase, one worked example displayed these strategies for the students’ inquiry activities with the simulation; the second example concerned the use of these strategies during modeling. Together, the two worked examples showed how to coordinate simulation, model, and data-inspection activities. The worked examples were presented on a website and were accessible during the entire experimental session regardless of the model progression phase a student was in. Participants’ interactions with the website’s movie player that showed the worked example videos (e.g., pressing the play and stop buttons) were stored in a log file.
Participants
Study 1 compared the inquiry activities and modeling performance of domain novices with two more knowledgeable reference groups. This study was conducted with 31 Dutch students. They were selected for their levels of prior domain knowledge and classified as either low-level novices (10 junior high school students (aged 14 - 15) without prior knowledge), high-level novices (10 senior high school students (aged 18 - 20) from the science track with some prior domain knowledge), or experts (11 university students (aged 20 - 27) in electrical engineering).
Study 2 assessed the relative effectiveness of model order progression (MOP) and model elaboration progression (MEP). Participants in this study were 90 Dutch high school students from a science track, aged 15-17, with little to no prior domain knowledge. Participants were assigned to either the MOP condition (n = 28), the MEP condition (n = 26), or the control condition (n = 36).
Study 3 examined the merits of supplementing model order progression with worked examples. The study’s sample comprised 15- to 17-year-old Dutch high school students from a science track with low prior domain knowledge. Thirty-six participants were allocated to the model order progression condition (MOP); the remaining 46 students were assigned to the model order progression condition with worked examples (MOP+WE).
Knowledge Tests
In Study 3, a posttest was administered to assess participants’ conceptual knowledge of electrical circuits. The test contained 16 questions that addressed the meaning of key domain concepts and students’ understanding of an electrical circuit containing a charging capacitor. An example of a key domain question is: “State the function of a capacitor in an electrical circuit”. A typical item to gauge students’ qualitative understanding of the task would ask students to select a correct qualitative specification of a relation (e.g., “if resistance increases, then current decreases”). Finally, a typical question addressing the physics equations is: “Ohm’s law describes the relationship between voltage, current and resistance. What is the formula for Ohm’s law?”. Participants’ answers were scored using a rubric that allocated one point to each correct response. The inter-rater reliability estimate was .96 (Cohen’s κ). In Studies 2 and 3, an abridged version of this test was used to assign students to the conditions (see Appendix).
Procedure
Data for Study 1 was collected in individual sessions that took place in the research lab. At the beginning of a session the experimenter demonstrated the learning environment and handed out a reference guide on the modeling syntax that participants could consult during the task. Participants then started the task while thinking aloud; non-directive prompts were given when necessary. Maximum time to complete the task was 1.5 hours.
Both Studies 2 and 3 consisted of two sessions: a 50-minute introduction, and a 100-minute experimental session that were carried out in regular classrooms. The time between sessions was one week maximum. During the introductory session participants first completed the pretest, then received a guided tour of the learning environment, and finally completed a brief modeling tutorial. During the experimental session the students worked individually on the task and could ask the experimenter for technical assistance only. Participants could stop ahead of time if they had completed the assignment. Participants in Study 3, in addition, completed the posttest one week after the experimental session.
All three studies were concluded with a short debriefing of the participants. This was considered important to ensure a constructive contribution to students’ science education, as the experimental activities were part of their regular curriculum. During this plenary debriefing, students had the opportunity to reflect on the task they had just completed. Any remaining questions regarding the learning domain and/or task were addressed by the experimenter. Debriefing took place after all data was collected so as not to influence the studies’ results. As such, the debriefing did not influence the experimental setup, nor were students’ responses analyzed. It served merely the educational purpose of embedding the experiment in the students’ curriculum.
Coding and Scoring
Data analysis focused on experimentation behavior and performance success. Experimentation behavior was defined as the number of times participants clicked the “Start” button in the simulation (simulation experiment) or model editor (model experiment).
Performance success scores were assessed from participants’ final models. Where possible, both a model content and a model structure score were calculated. The model content score represented how many of the four knowledge components of charging capacitors (i.e., Ohm’s law, Kirchhoff’s law (including its two rules: the junction rule and the loop rule), and the behavior of capacitors) were reflected in the participants’ model. One point was awarded for each correctly specified knowledge component, leading to a four-point maximum score. The inter-rater reliability estimate was 1.0 (Cohen’s κ).
The model structure score was established in accordance with Manlove, Lazonder, and de Jong’s (2006) model coding rubric. This score represented the number of correctly specified variables and relations in the models. “Correct” was judged from the reference model shown in Figure 1. One point was awarded for each correctly named element; an additional point was given if that variable was of the correct type (i.e., constant, auxiliary or stock). Concerning relations, one point was awarded for each correct link between two variables and one point was awarded for the direction. The maximum model structure score was 38. Inter-rater reliability estimates were .74 (variables) and .92 (relations) (Cohen’s κ).
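The scoring rule for the model structure score can be sketched in a few lines of Python. This is a hedged illustration of the rule as described above, not the rubric of Manlove et al. (2006) itself: the reference fragment below is hypothetical and far smaller than the reference model in Figure 1, and the function name `structure_score` is introduced here for illustration only.

```python
# HYPOTHETICAL reference fragment; the full reference model is shown in Figure 1.
REFERENCE_VARIABLES = {
    "Charge": "stock",
    "C": "constant",
    "Vc": "auxiliary",
}
# Directed relations as (cause, effect) pairs; also hypothetical.
REFERENCE_RELATIONS = {("Charge", "Vc"), ("C", "Vc")}


def structure_score(variables, relations):
    """Return (variable_points, relation_points) for a student model.

    variables: dict mapping variable name -> type ("constant", "auxiliary", "stock")
    relations: set of (source, target) pairs drawn by the student
    """
    variable_points = 0
    for name, var_type in variables.items():
        if name in REFERENCE_VARIABLES:
            variable_points += 1                      # correctly named variable
            if var_type == REFERENCE_VARIABLES[name]:
                variable_points += 1                  # variable has the correct type

    undirected_reference = {frozenset(pair) for pair in REFERENCE_RELATIONS}
    relation_points = 0
    for source, target in relations:
        if frozenset((source, target)) in undirected_reference:
            relation_points += 1                      # correct link between two variables
            if (source, target) in REFERENCE_RELATIONS:
                relation_points += 1                  # link points in the correct direction
    return variable_points, relation_points


if __name__ == "__main__":
    student_variables = {"Charge": "stock", "C": "auxiliary", "switch": "constant"}
    student_relations = {("Vc", "Charge"), ("C", "Vc")}
    print(structure_score(student_variables, student_relations))
```

In this toy case the erroneous element "switch" earns no points, the mistyped constant earns one of two points, and the reversed link earns the link point but not the direction point.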
Learning outcomes in Study 3 were indicated by students’ scores on the posttest; the maximum score was 14 points.
Results
This section first describes students’ learning activities for Study 1, Study 2, and Study 3, respectively, followed by the results on students’ performance success and learning outcomes for each study.
Learning Activities
Participants could develop and verify their knowledge of charging capacitors by running the simulation or their models (see Table 2). In Study 1, using Pillai’s trace, MANOVA with the number of simulation and model experiments as dependent variables, showed no between-group differences with regard to experimentation behavior, V = 0.21, F(4, 46) = 1.31, p = .280.
Table 2. Summary of Learning Activities.
From these statistical analyses it appears that unsupported novices predominantly followed the same approach as experts. Due to their lack of prior knowledge low-level novices could only base their modeling efforts on insights gained through experimentation, or engage in trial and error activities. Therefore, Study 1 participants’ think-aloud protocols were analyzed to reveal the reasoning behind subsequent model changes (i.e., model hypotheses). Results indicated that the low-level novices hardly reasoned at all. Of all the changes the low-level novices made to their model 87% were not the result of proper reasoning and thus likely the result of trial and error activities. The changes to models that were guided by reasoning could be considered ‘data-driven’.
In contrast, 83% of experts’ model changes were the result of reasoning based on prior knowledge. Of the remaining model changes, 12% were ‘data-driven’, often involving statements about previous model runs, 2% were based on logical reasoning, and 3% were not the result of proper reasoning.
In the think-aloud protocols of the high-level novices 89% of the changes made to the model were the result of proper reasoning. This reasoning was based on prior domain knowledge (28%), data from prior experiments (33%), information found in the assignment (28%), or logical reasoning (11%).
In Study 2, where both MOP and MEP were tested as means to support novices, MANOVA (using Pillai’s trace) showed a significant effect of experimental condition on the number of experiments with the simulation and the model, V = .33, F(4, 162) = 7.90, p < .01. Subsequent univariate ANOVAs revealed that model progression significantly increased the number of model experiments but not the number of simulation experiments (model experiments: F(2, 81) = 18.00, p < .01; simulation experiments: F(2, 81) = 0.42, p = .66). Helmert planned contrasts showed that the number of model experiments was higher for both model progression conditions compared to the control group, and that participants in the MOP condition performed more model experiments than participants in the MEP condition (MEP+MOP vs. control: t(81) = 4.36, p < .01, r = .44; MOP vs. MEP: t(81) = 4.03, p < .01, r = .41).
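The effect sizes r reported with these contrasts are consistent with the common conversion from a t statistic, r = sqrt(t² / (t² + df)). The text does not state which formula was used, so the short Python sketch below is an assumption that happens to reproduce the reported values; the function name `r_from_t` is introduced here for illustration.

```python
import math


def r_from_t(t, df):
    """Effect size r from a contrast's t value and its degrees of freedom.

    ASSUMPTION: the paper does not name its formula; this is the standard
    conversion r = sqrt(t^2 / (t^2 + df)).
    """
    return math.sqrt(t ** 2 / (t ** 2 + df))


if __name__ == "__main__":
    # Contrasts reported for the number of model experiments in Study 2.
    print(round(r_from_t(4.36, 81), 2))  # MEP+MOP vs. control
    print(round(r_from_t(4.03, 81), 2))  # MOP vs. MEP
```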
In Study 3, students in the experimental condition were supported with worked examples that provided an explicit account of the inquiry activities in both the simulation and the modeling tool. In this study MANOVA (using Pillai’s trace) showed a significant effect of experimental condition on the number of experiments with the simulation and the model, V = 0.14, F(2, 79) = 6.25, p = .003. Subsequent univariate ANOVAs revealed that worked examples significantly increased the number of simulation experiments but not the number of model experiments (simulation experiments: F(1, 80) = 12.57, p = .001; model experiments: F(1, 80) = 0.60, p = .443).
More detailed analyses of students’ learning activities were performed in Study 3 to clarify and explain the above findings. Table 3 shows the frequency and duration of the learning activities in both conditions. ANOVAs (with Bonferroni correction; α = .01) indicated that students from the MOP+WE condition engaged more often in simulation activities, F(1, 80) = 10.72, p = .002, and data inspection activities, F(1, 80) = 8.49, p = .005. The conditions did not differ in the number of model activities, F(1, 80) = 4.45, p = .038, nor in the number of times they consulted the help files, F(1, 80) = 1.21, p = .274.
Table 3. Mean Frequency of and Percentage of Time Spent on Learning Activities in Study 3.
Furthermore, in Study 3 differences between conditions were found on the relative time students spent on these inquiry and modeling activities. ANOVAs with a Bonferroni correction (α = .01) indicated that the MOP+WE students spent relatively more time on data inspection, F(1, 80) = 9.37, p = .003, whereas the MOP students spent more time with the model, F(1, 80) = 57.00, p < .001. No statistical differences were found for simulation activities, F(1, 80) = 2.50, p = .118, and help file seeking activities, F(1, 80) = 0.58, p = .450.
Performance Success and Learning Outcomes
Performance success was assessed from the participants’ final models (see Table 4). A distinction was made between model structure and model content scores. Participants’ model structure scores were analyzed by MANOVA with both model structure aspects (i.e., variables and relations) as dependent variables. In Study 1, using Pillai’s trace, this analysis produced a significant between-subjects effect, V = 0.44, F(4, 56) = 3.96, p = .007. Subsequent ANOVAs yielded significant between-group differences for both aspects (variables: F(2, 31) = 4.48, p = .021; relations: F(2, 31) = 9.84, p = .001). Helmert planned contrasts revealed that experts included more correct variables in their models than novices, t(81) = 3.26, p = .011, r = .34, and specified relations of higher quality between these variables, t(81) = 5.80, p ≤ .001, r = .54. The comparison between the two groups of novices showed no significant differences between high-level novices and low-level novices on these measures (variables: t(81) = 1.80, p = .218; relations: t(81) = 1.50, p = .354).
Table 4. Summary of Performance Success.
As little or no variation on the model content measure was detected, especially for low-level novices, this measure was analyzed by a non-parametric Kruskal-Wallis test, which showed a significant effect of prior knowledge, H(2) = 23.99, p = .001. Post hoc comparisons, using Mann-Whitney U tests with Bonferroni correction (α = .0167) revealed significant differences for all pair-wise comparisons (low-level novices vs. experts: U = 0.00, r = .92; low-level novices vs. high-level novices: U = 15.00, r = .70; high-level novices vs. experts: U = 7.50, r = .77).
The differences in the final models between the experts and the novices in Study 1 can be considered as an indication that support is necessary. Even though the learning environment provided participants with all necessary tools to induce all content knowledge, these differences suggest that, without support, novices do not acquire full comprehension of the domain. A closer inspection of the development of the learner models was performed in Study 1 to reveal why novices’ behavior was less effective.
A look at participants’ initial models shows that experts’ models contained nearly all basic elements from the target model (i.e., 1 stock and 4 constants) (M = 4.45, Range = 3-5). Novices included as many elements in their first model (low-level novices: M = 4.33, Range = 2-6; high-level novices: M = 4.00, Range = 3-5). However, the low-level novices’ models also contained a few erroneous elements such as ‘loading time’ and ‘switch’ (M = 0.89, Range = 0-2), which remained in their models all through the learning task (erroneous elements in low-level novices’ final models: M = 1.22, Range = 0-4).
Although low-level novices had a pretty good sense of which elements to include in their initial models, they were probably ignorant of the relationships between model elements. The modeling tool in Co-Lab anticipated this by offering participants the possibility to specify relationships qualitatively. Surprisingly however, only two low-level novices and one expert made use of this feature.
To facilitate students’ relationship construction, MOP was implemented in Study 2. In this study, MANOVA (using Pillai’s trace) with both model structure aspects (i.e., variables and relations) as dependent variables showed significant between-subjects differences on the quality of the created models, V = 0.21, F(4, 162) = 4.74, p = .001. Subsequent univariate ANOVAs validated the conjecture that model progression has no effect on the number of correct variables in the students’ models, F(2, 81) = 0.85, p = .431, but does enhance the quality of the relations between these variables, F(2, 81) = 9.53, p < .001. Helmert planned contrasts revealed that the model progression conditions combined had significantly higher scores for relations than the control condition, t(81) = 2.45, p = .006, r = .26, and that the MOP condition outperformed the MEP condition on this measure, t(81) = 3.56, p = .001, r = .37.
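The effect sizes reported for the two planned contrasts are consistent with the common conversion of a t statistic into a correlation-type effect size, r = sqrt(t² / (t² + df)). The sketch below assumes this conversion was used (the text does not say so) and recomputes r from the reported t values; the function name is hypothetical.

```python
import math


def r_from_t(t: float, df: int) -> float:
    """Convert a t statistic with df degrees of freedom to effect size r."""
    if df <= 0:
        raise ValueError("df must be positive")
    return math.sqrt(t * t / (t * t + df))


# t values and degrees of freedom as reported for the Helmert contrasts.
for t in (2.45, 3.56):
    print(f"t(81) = {t}: r = {r_from_t(t, 81):.2f}")
# t(81) = 2.45: r = 0.26
# t(81) = 3.56: r = 0.37
```

Both recomputed values match the reported effect sizes to two decimals.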
A Kruskal-Wallis test showed a significant effect for experimental condition on participants’ model content scores in Study 2, H(2) = 13.16, p = .001. The post hoc comparisons showed no differences in model content scores between the MOP condition and either the MEP condition, U = 114, or the control condition, U = 174. Comparison between the latter two conditions revealed a significant difference in favor of the MEP condition, U = 298, r = .33. In interpreting these results, it should be noted that few MOP participants reached the third phase, where they could specify their model quantitatively; most MOP participants therefore obtained a model content score of zero. This can often be explained by students’ slow progress through the phases.
Study 3 showed that adding worked examples to MOP further enhanced students’ performance success. Using Pillai’s trace, MANOVA showed a significant effect for condition on the variables and relations aspects of the model structure score, V = 0.162, F(2, 79) = 7.65, p = .001. Subsequent univariate ANOVAs revealed significant worked example effects on both the variables aspect, F(1, 80) = 15.38, p < .001, and the relations aspect, F(1, 80) = 9.45, p = .003. The model content scores in this study indicated that none of the participants reached a correct quantitative understanding of the physics equations. As no variation in scores was detected, the model content measure was not analyzed further.
A posttest was used to establish learning outcomes in Study 3. Performance was low for both groups, with mean scores of only about 3 out of 14 (MOP+WE: M = 2.84, SD = 2.02; MOP: M = 2.97, SD = 1.70). Univariate ANOVA on this measure revealed no significant difference between the two conditions, F(1, 75) = 0.10, p = .759.
Discussion
In general, the studies reported here show several patterns with regard to supporting the integrated inquiry learning approach. The first pattern relates to the use of the resources (i.e., the simulation and modeling tools) provided in the inquiry learning environment. In all studies, regardless of both level of expertise and amount of support, students performed experiments with both the simulation and their self-constructed models. Given that figuring out the model underlying a simulation is almost always easier than figuring out how to build the same model (Alessi, 2000), learning from a simulation might be more appealing to novice learners. It was therefore surprising that the novices in Study 1, who had to acquire all relevant information for their models from the simulation, performed as few experiments as their more knowledgeable counterparts. This expert-like behavior appeared ill-suited to novice learners: no sound reasoning seemed to underlie the construction and modification of their models, which consequently were poorly constructed. Comparable results were found in recent meta-analyses on inquiry learning in general (Alfieri, Brooks, Aldrich, & Tenenbaum, 2011; Minner, Levy, & Century, 2010), which conclude that the merits of inquiry learning only hold when inquiry learning is supported. Hence, a common mistake in an integrated inquiry learning approach is to offer it to novice learners without any support, because without support novice learners’ investigative efforts seldom lead to knowledge acquisition.
Fortunately, Studies 2 and 3 showed that model progression positively influenced experimenting with the model, and that the complementary worked examples positively influenced experimenting with the simulation. This confirms that the integrated inquiry learning approach has potential to integrate modeling into the inquiry process. Students who received support by both model progression and worked examples frequently conducted activities in both the simulation and the modeling tool. Although they spent most of their time building the model, they also spent a substantial amount of time with the simulation tool. Additionally, the students who received this support created better models.
However, one slightly atypical finding emerged with regard to participants’ tool use. Comparing the results across studies shows that the frequency with which participants performed experiments with both the simulation and the modeling tool was much lower in Study 1 than in the subsequent studies. This is atypical as the low-level novice students in Study 1 were comparable with regard to prior knowledge to the students in the control condition in Study 2, and neither group received support. This could be the result of a methodological difference between these studies: in Study 1 participants were required to think aloud, which was not the case in subsequent studies. A previous study found that thinking aloud with non-directive probes had no disruptive influence on participants’ inquiry learning process (Wilhelm & Beishuizen, 2004). Nevertheless, thinking aloud may have made participants more selective in the number of experiments they performed in this study.
The second pattern relates to the influence of prior knowledge. All studies involved Dutch high school students who had some experience in both modeling and conducting and reporting lab experiments. These students can be considered to know the overall steps to take in using the scientific method as inquiry learning is introduced early on in their science curriculum. The steps to take in modeling, however, might have been less familiar as they had comparatively less experience in this area. Therefore, prior to the experimental session, all participants in the reported studies completed a modeling tutorial to familiarize them with the learning environment. With regard to domain knowledge, the high school students were largely unfamiliar with the topic of charging capacitors. Literature suggests that domain knowledge influences students’ inquiry process and thus how much they can pick up from an inquiry learning task (Hmelo, Nagarajan, & Day, 2000; Klahr & Dunbar, 1988; Lazonder, Wilhelm, & Hagemans, 2008; Schauble, Klopfer, & Raghavan, 1991). A review of school curricula and teacher statements showed that students were (or should be) familiar with electrical circuits and concepts such as power source, resistance and Ohm’s law. This knowledge is prerequisite to the topic of charging capacitors, which had not yet been taught in students’ physics classes. To confirm both assumptions, a prior knowledge test was administered in Studies 2 and 3 that addressed both the allegedly familiar knowledge about electrical circuits and the new and unfamiliar knowledge about charging capacitors. Students’ performance on this test (a score of 1 or 2 out of 8) indicated that students could indeed be considered domain novices. In hindsight, however, this low score also suggests that they may have lacked the necessary prerequisite knowledge, which may have negatively impacted their inquiry process.
This suggests that insufficient prior domain knowledge is another pitfall when using the integrated inquiry learning approach. To prevent the negative influence of too low entry levels of domain knowledge in future research, the prerequisite knowledge could be recapitulated before or during the studies (Lazonder, Hagemans, & de Jong, 2010). However, the poor performance in Study 1 of high-level novices, who had some prior domain knowledge, suggests that increased prior domain knowledge without additional support will not guarantee that students learn from the inquiry task either. A recapitulation of prior knowledge might thus mainly help students to take more advantage of the support they receive.
A third pattern concerns the relation between performance success and learning. Positive effects for both model progression and worked examples on performance success were found in Studies 2 and 3, respectively. But do students who perform better also learn more? As performance success appears a prerequisite for learning, Studies 1 and 2 did not address this question. Theoretical and empirical evidence suggests that the performance measures (i.e., model quality scores) that assessed the instructional effects of model progression are indicative of the knowledge students acquired during the experiment. This assumption was based on constructionism, an instructional paradigm in which learning is considered synonymous with the knowledge construction that takes place when learners are engaged in building objects (Kafai & Resnick, 1996). Research has confirmed that the construction of models is associated with cognitive learning (e.g., van Borkulo, 2009) and that the quality of students’ models is associated with their reasoning processes (Sins et al., 2005). It thus seemed plausible to infer the instructional effects of model progression from the students’ task performance. However, to paint a more complete picture, a posttest measuring learning outcomes was administered in Study 3. Contrary to expectations, the favorable effects found on the performance measure did not show on the learning outcomes measure. One possible explanation is that the posttest was not sensitive enough to the students’ learning during the task. The posttest was constructed to cover the contents of all model progression phases. As only a few students reached the third phase, many students were tested on subject matter they had not been able to investigate during the task, or had spent only little time on. Future research should investigate how learning outcomes can be assessed more accurately.
Until then, it seems that the integrated inquiry learning approach, provided that additional support is given, has potential for increasing scientific content knowledge. However, the present studies do not allow for such a definitive conclusion.
As for practical implications, the results of these studies show that students are more likely to benefit from the integrated inquiry learning approach when their entry level of domain knowledge is sufficient. Therefore, in actual practice teachers should keep an eye out to detect these knowledge gaps in time. It would be advisable that teachers respond to these knowledge gaps by first recapitulating the required prior knowledge. As Lazonder et al. (2010) demonstrated, providing students with domain knowledge before or during the task facilitated their inquiry learning processes and outcomes. Future research should demonstrate whether these beneficial effects of providing domain knowledge hold for the integrated approach to inquiry learning.
Alternative suggestions to make such a learning approach more feasible for classrooms include extending time on task and providing additional support during the task. However, teachers may consider extra class time unfeasible or undesirable, and one might indeed wonder how much extra time should be devoted to a relatively small topic such as the charging of a capacitor. Even though inquiry learning admittedly takes more time than direct instruction, the scope of an inquiry unit should match the amount of time available in the curriculum. A more practical solution might therefore be to offer additional support. In actual practice, teachers can provide students with the relevant domain knowledge or procedural assistance during the task. As a result, students can reap the benefits that inquiry learning and modeling have to offer without getting stuck on the difficult challenges that this integrated approach poses. In particular, for teachers who wish to implement learning from simulations combined with modeling, our advice is to supplement the inquiry and modeling task with model order progression and worked examples.
In conclusion, the studies do not allow for a definitive conclusion on the implementation of such support, although they suggest a positive effect of model progression and worked examples. More specifically, they suggest a positive effect of support where the model increases in specificity and of worked examples that show what the activities in each model progression phase entail, and how they should be performed. While such support assists learners during an integrated inquiry learning approach, which combines learning from simulations and learning by creating models, insufficient prerequisite knowledge still threatens students’ performance on, and thus what they can learn from, such a task. As with any novel application of learning support, continued iterative rounds of design and evaluation are needed to discover its true potential.
Acknowledgements
We are very grateful to all students and teachers who participated in this study. We would also like to thank the reviewers and editors for their useful comments and suggestions on earlier drafts.
Author Contributions
All authors contributed to this article, in content and in form. YGM prepared and carried out the experiments, did the statistical data interpretation, and wrote the manuscript. AWL and TDJ contributed to the design of the interventions, both in focus and in form, and to the design of the experiments. Additionally they made numerous critiques and suggested specific wording during the editing of the manuscript.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The writing of this paper was partially funded by the project “Learning through modeling and self explanations”, which is part of the National Initiative Brain and Cognition (NIHC) funded by the Dutch Organization for Scientific Research (NWO), grant 056-31-011.
