Abstract
Computational thinking (CT) has attracted significant interest among educators around the globe. Despite this growing interest, research on CT and programming education in elementary school remains at an initial stage. Many relevant studies have adopted only one type of method to assess students’ CT, which may lead to an incomplete view of students’ CT development, while other studies employed small sample sizes, which may increase the chance of assuming a false premise to be true. Moreover, conventional programming courses typically have two limitations (i.e., limited student active learning and low student engagement). Given these gaps, this study investigates the effects of a theory-based (5E framework) flipped classroom model (FCM) on elementary school students’ understanding of CT concepts, computational problem-solving performance, and perceptions of flipped learning. To achieve this, a pretest-posttest quasi-experimental study was conducted in a rural elementary school, including 125 students in the experimental group and 122 students in the control group. The results showed that the 5E-based FCM significantly improved students’ understanding of CT concepts and computational problem-solving abilities. The results also revealed positive student perceptions of the FCM. The benefits and challenges of the 5E-based FCM are discussed.
Introduction
In recent years, computational thinking (CT) has attracted much attention from educators and researchers (Hsu et al., 2018). CT can be regarded as a specific form of problem solving (Kale & Yuan, 2021), which enables students to recognize problems, think critically, and make decisions (Falloon, 2016). Wing (2014) even argued that CT skills are as important as reading, writing, and arithmetic skills. Courses on CT and programming are traditionally conducted for university students majoring in computer science (Wei et al., 2021). Yet, recently, a growing number of scholars (e.g., Flórez et al., 2017) have suggested that CT learning should start in the early years of schooling, such as elementary school, as it has positive benefits on students’ future learning (Papadakis et al., 2016). Compared with students in secondary school or university, younger students are more likely to have a positive attitude toward CT learning (Pérez-Marín et al., 2020). Although CT is still a relatively new topic in elementary education (Ching et al., 2018), some extant empirical studies of CT learning have shown promising progress. For example, engaging elementary schoolers in unplugged activities (e.g., del Olmo-Muñoz et al., 2020) or training them to apply the embodiment method when tackling issues (e.g., Sung et al., 2017) can improve their performance in CT and problem-solving. Studies have also suggested the efficacy of programming teaching to improve CT knowledge (e.g., Rodríguez-Martínez et al., 2020; Wu & Su, 2021). Yet, despite this progress in CT education, several gaps or constraints remain that call for further empirical investigation. When introducing programming training into elementary school classrooms, two major constraints (i.e., limited student active learning and low student engagement) may prevent young novices from developing CT (see “Why the FCM and 5E Conceptual Framework” section for more discussion). 
There is a continuing need for experimental research to explore empirically based effective strategies for teaching CT to elementary schoolers (Chalmers, 2018).
Given these gaps, this study investigated the effects of an innovative 5E-based FCM on the CT development of elementary schoolers. In this study, a pretest-posttest quasi-experimental design with experimental–control groups was adopted. Two hundred forty-seven elementary schoolers in the fourth grade were enrolled and divided into two groups. The experimental group (n = 125 students) learned CT through programming under the 5E-based FCM, while the control group (n = 122 students) learned CT under the non-flipped condition. Different types of measurements were applied to evaluate students’ projects, their responses to test items, and the perceptions of students and the instructor, which can help the researchers obtain a more comprehensive insight into the effectiveness of the proposed intervention (Román-González et al., 2019). This paper makes three contributions to the literature on CT. First, it provides a detailed description of the development of a 5E-based flipped classroom approach to teach elementary schoolers CT. This detailed description is provided to encourage replication by other researchers and practitioners. Second, it presents the results of an experiment applying this model with 247 fourth graders. Third, it provides several suggestions for using this approach to support elementary schoolers’ CT learning. This set of suggestions can provide helpful guidance for other instructors.
The rest of the manuscript is organized as follows. The “Conceptual Background” section reviews the relevant literature. The “Methodology” section details the methods used in this study, consisting of an overview of the context and participants, development of the 5E-based FCM, experimental design, data collection, and data analysis. The “Results” and “Discussion” sections present the results and the corresponding discussion of those results. This is followed in the “Implications for Relevant Research and Practitioners” section by the study’s implications for relevant research and practitioners. Finally, the manuscript concludes with the “Conclusion and Limitations” section.
Conceptual Background
CT Dimensions
CT can be defined as “the thought processes involved in formulating a problem and expressing its solution(s) in such a way that a computer can effectively carry [them] out” (Wing, 2014, p. 1). Specifically, CT represents a set of generally applicable skills, including decomposition and algorithmic thinking (Rose et al., 2017). Since its introduction, various frameworks have been proposed to delineate CT dimensions in K-12 education (e.g., Angeli et al., 2016; Rose et al., 2017). In this study, we drew on both the framework proposed by Brennan and Resnick (2012) and the operational definition developed by ISTE and CSTA (2011). The framework proposed by Brennan and Resnick (2012) has been widely employed as the theoretical basis of many studies (e.g., Falloon, 2016; Sáez-López et al., 2016; Zhang & Nouri, 2019). This CT framework comprises three dimensions: computational concepts, computational practices, and computational perspectives. The first dimension comprises essential concepts used in programming (e.g., sequences, loops, conditionals, operators). The second focuses on the processes involved in tackling problems (e.g., being incremental and iterative, testing and debugging). The third refers to learners’ evolving perception of computing, which is not the focus of this study. This framework provides a valuable reference for research in terms of teaching content and evaluation.
Another popular operational definition of CT was developed by ISTE and CSTA (2011). It views problem-solving abilities as an embodiment of CT skills and considers CT a problem-solving process that covers six components: defining and decomposing problems, collecting and processing data, representing data via abstractions, automating solutions via algorithms, formulating efficient solutions against evaluation criteria, and transferring knowledge to tackle other issues. When ideas are transformed into algorithms, control flow structures, including sequence, repetition, and selection structures, are applied to represent the computer programs (Angeli et al., 2016). Specifically, the sequence structure denotes a series of ordered steps to be followed. The repetition structure refers to the process of running specific repeating commands by constructing an iterative plan “involving the identification of the repeating instructions and the condition governing the end or continuation of the repetition” (Chao, 2016, p. 204). The selection structure represents the decision-making process of running different command sets based on conditions by establishing “relationships between conditions and the corresponding computer instructions” (Chao, 2016, p. 204).
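The three control flow structures described above can be illustrated with a minimal Python sketch. The command strings (e.g., "move 100") and the function names are hypothetical stand-ins for Scratch blocks, chosen only for illustration; they are not taken from the study’s materials.

```python
def draw_square_sequence():
    """Sequence: every step is written out in the order it must run."""
    return [
        "move 100", "turn 90",
        "move 100", "turn 90",
        "move 100", "turn 90",
        "move 100", "turn 90",
    ]


def draw_square_repetition(sides=4):
    """Repetition: identify the repeating instructions and the stop condition."""
    commands = []
    for _ in range(sides):  # the loop ends after `sides` passes
        commands.append("move 100")
        commands.append("turn 90")
    return commands


def next_command(touching_wall):
    """Selection: a condition decides which command set runs."""
    if touching_wall:
        return "turn 180"
    return "move 10"


if __name__ == "__main__":
    # Both versions yield the same ordered list of commands.
    print(draw_square_sequence() == draw_square_repetition())
    print(next_command(touching_wall=True))
```

The repetition version produces the same eight commands as the sequence version with a quarter of the written instructions, which is the kind of simplification a loop block offers a novice.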
Using Scratch to Develop CT Through Programming
There are various approaches to developing CT in K-12 contexts (Pérez-Marín et al., 2020), among which programming is an effective approach (Flórez et al., 2017; Scherer et al., 2020). In recent years, various child-friendly programming tools have been developed (e.g., Bee-Bot, Kodu). Among these tools, Scratch stands out as one of the most common visual programming tools (Rodríguez-Martínez et al., 2020). Scratch simplifies the programming mechanics (e.g., replacing the conventional code-typing method with an intuitive drag-and-drop approach), offers support for learners (e.g., providing visual models that display the process and result of executing certain commands), and motivates learners to learn to program (e.g., its media-rich programming environment used for designing various products). Evidence from empirical studies (e.g., Durak, 2020) has shown that learning via Scratch can enhance students’ learning attitudes and outcomes. For example, Rodríguez-Martínez et al. (2020) found that programming learning via Scratch improved sixth graders’ achievement on computational concepts.
Why the FCM and 5E Conceptual Framework
Although programming using Scratch can promote students’ CT development, two constraints may prevent the effective implementation of programming training in elementary school classrooms.
Lack of student active learning. The definition of CT (e.g., Brennan & Resnick, 2012) emphasizes the importance of both concepts and skills. Students need to put the CT concepts into practice to develop actual problem-solving skills (e.g., decomposition, algorithm design). Program design is a complex task for novices, which requires them to spend a considerable amount of time solving programming problems (Chis et al., 2018). Yet, traditional programming courses are usually centered on the teacher’s didactic lectures and concept attainment (Tsai, 2019). Due to the limited student active learning found in many elementary schools, it is difficult for young novices to engage in practices to fully develop their CT skills (Papavlasopoulou et al., 2019).
Low level of engagement. Programming is cognitively challenging for novices (Papavlasopoulou et al., 2019). Students’ passive absorption of course content in a traditional classroom setting may lead to their superficial understanding of knowledge (Strelan et al., 2020), which may discourage them from participating actively in classroom discussions and program designs. Moreover, limited in-class activity time prevents students from engaging in solving problems fully. As a result, students may feel frustrated and lack the motivation to learn to program.
To alleviate the aforementioned constraints, effective teaching strategies for programming training are sorely needed to make students interested in learning (Papavlasopoulou et al., 2019) and allow them to actively engage in tackling computational problems (Erol & Kurt, 2017). We, therefore, suggest that two models, namely the FCM and 5E framework, can be integrated to address the challenges in implementing programming courses in elementary school classrooms.
FCM
The FCM could be used to alleviate the aforementioned constraints. Its pre-class learning sessions can help students learn basic concepts related to programming before coming to class (Tsai et al., 2015). The flexibility of learning through video lectures can help young learners manage their cognitive load (Abeysekera & Dawson, 2015). Direct lectures are moved out of the in-class session to the pre-class session, enabling students to learn the content at their own pace. Furthermore, the FCM provides more opportunities for student active learning activities (i.e., group discussions, problem-solving). This may help promote students’ behavioral (e.g., involvement, increased interaction with teachers/peers), affective (e.g., interest), and cognitive engagement (e.g., understanding, focusing on task) (Bond, 2020).
5E Conceptual Framework
According to Tanner (2010), the order in which we design and sequence the course activities (e.g., what comes first, in-between, and last) is critical. For many instructors who have used a conventional lecture-based teaching approach, considerations of order have been mainly about the sequence of lecture ideas (Tanner, 2010). However, with the increasing use of active-learning strategies such as the flipped classroom approach, class sessions are moving from having a single main activity (i.e., a lecture) to having different activities over the course (Tanner, 2010). This raises an important question – how can an instructor sequence the various activities to maximize student learning?
The 5E framework can provide useful guidance to instructors in sequencing the course activities. The 5E framework was developed from various educational theories and models (see Bybee et al., 2006) and focuses on student-centered learning. The 5E framework requires teachers to stimulate students’ interests and activate their prior knowledge. It also encourages students to develop their initial understanding through exploration and to challenge their comprehension through complicated tasks to eliminate misconceptions. This framework encompasses five phases: Engagement, Exploration, Explanation, Elaboration, and Evaluation (Figure 1). Each phase has salient characteristics that offer teachers guidance during flipped course design. The application of the 5E framework can address the lack of theoretical underpinnings in the development of flipped teaching (Lo et al., 2018), which may help improve the coherence and logic of flipped teaching activities. The 5E framework thus provides theoretical support for the design and organization of the course activities in a flipped course.

Figure 1. Description of the 5E Conceptual Framework.
Recently, in higher education, some researchers have begun to explore using the 5E framework to design the FCM. Aşıksoy and Ozdamli (2017), for example, designed their video lectures guided by the first three phases. These lectures aimed to intrigue students, encourage their exploration through open-ended questions, and explain the lesson content. Subsequent in-class time was dedicated to the discussion and experiment planning in the elaboration stage and problem-solving in the evaluation stage. The results suggested that this method helped students obtain higher scores on the physics test compared with traditional teaching. Hew et al. (2018) designed two flipped courses based on the 5E model. The results indicated that 92% of the participants agreed that the courses were more engaging than non-flipped instruction.
However, the previous 5E flipped classroom research has focused mainly on higher education. University students are typically able to self-regulate their learning better than elementary schoolers, who often need assistance to manage their learning. Since the flipped classroom approach places greater demands on learners’ self-regulation, we know very little about the efficacy of a 5E-based FCM for elementary school students’ CT development. Additionally, Chinese elementary students, especially those in rural areas, are generally accustomed to teacher-centered teaching and rarely participate in active learning (Sit, 2013), which may, to some extent, undermine the efficacy of flipped classrooms. Therefore, it is essential to explore the effects of a 5E-based FCM on students’ CT in Asian elementary schools. The following research questions were addressed in this study:
Methodology
This study aimed to explore the effects of the innovative 5E-based FCM on students’ CT knowledge improvement. Empirically-based effective pedagogy should be developed to address challenges such as limited student active learning and a low level of engagement, especially when implementing programming teaching for young novices in elementary schools. The current study used a pretest-posttest quasi-experimental design with experimental-control groups. The independent variable of the study was the instructional approach (5E-based FCM/non-flipped), while the dependent variables were the student understanding of CT concepts and their computational problem-solving performance. In addition, students’ and the instructor’s perceptions of the FCM were examined.
Context and Participants
The study was conducted in a public elementary school. The teacher responsible for the fourth-grade Information and Communications Technology (ICT) course was new to flipped teaching and did not have much experience in programming. Therefore, the researcher, in collaboration with the teacher, undertook the main responsibilities of designing the learning materials and teaching programming. The researcher discussed the course content with the teacher to ensure that the same content was taught and similar activities were applied in the experimental and control groups. The teacher was present in the computer lab during all lessons to assist with monitoring teaching, managing the classroom, and providing technical support.
Two hundred and forty-seven fourth graders participated in this study. Table 1 shows the demographic information for the students in the two groups. All participants had at least one and a half years of experience using computers in the ICT course taught by the same teacher, but most (about 95%) had not learned either programming or Scratch. Three classes were randomly assigned to the experimental group, consisting of 125 students, while the remainder were assigned to the control group, consisting of 122 students. Before the intervention, the participants were informed of the group in which they were enrolled. However, they were not told whether they belonged to the control or experimental group.
Table 1. Demographic Information for Participants of Two Groups.
Procedures
As presented in Figure 2, the computational thinking test (CTt) was administered at the beginning of the intervention. One week later, the programming training was implemented in the ICT course, which provided one lesson per week and lasted five weeks. After completing the course, students were asked to finish two final programming projects individually. In addition, all students completed the CTt posttest. Then, some students in the experimental group were selected to attend individual interviews within two weeks.

Figure 2. Diagram of the Experiment Design.
In programming learning, each student was equipped with a computer on which the Scratch offline editor was installed. The first introduction lesson focused on explaining the main learning activities students needed to complete. The students were also taught the sequence concept. Following that, they learned other essential CT elements in the remaining lessons. Referring to the curriculum content of some research in K-12 (i.e., Basogain et al., 2018; Jun et al., 2017), a lesson syllabus was developed (Figure 3). Since all learners were unfamiliar with CT, we focused mainly on core concepts, such as sequence, loops, and conditionals. These fundamental concepts are among the most widely used in the CT literature (Tsai, 2019). Due to its importance and children’s poor performance in it (Choi et al., 2017; Falloon, 2016), problem decomposition was also included in this study.

Figure 3. Course Content for Learning Programming.
Toward a 5E-Based FCM for the Experimental Group
The experimental group adopted the 5E-based FCM, which includes two stages: the technology-supported self-learning stage and the interactive in-class learning stage. Figure 4 shows the activity design of flipped learning guided by the 5E framework. In this study, the flipped classroom had an average of 20 minutes of pre-class activities (e.g., watching video lectures and answering quizzes), while the control group did not have any pre-class activities to complete. However, the duration of the face-to-face in-class time was similar for both groups.

Figure 4. Depiction of the 5E Flipped Classroom CT Approach.
Technology-supported self-learning before class. In general, students prefer streaming content (e.g., watching videos) to reading texts (Smith, 2013). Therefore, the instructor provided video segments for students to conduct self-learning before class. Students often have a certain amount of other homework after school. Therefore, the combined time required to go through all videos for lesson preparation should not exceed 25 minutes, to avoid creating an excessive pre-class workload for students (Lo & Hew, 2017).
In each video lecture, the instructor used games or real-world questions to trigger students’ learning interests to engage them in video watching. The instructor also presented students with examples they were familiar with to elicit their prior knowledge (engagement phase). For example, the video displayed snowflake pictures and encouraged students to imagine drawing a snowflake in Scratch. Furthermore, by providing useful learning materials (e.g., code segments), video clips guided students’ exploration (e.g., encouraged students to think about how to simplify the script of drawing a square) to help develop their basic understanding of relevant concepts and their application (exploration phase). The video also provided brief definitions of new concepts or demonstrated specific command operations (explanation phase), helping students consolidate their conceptual understanding. Video lectures contained a live screen capture of the PowerPoint along with the instructor’s audio narration. The students had to watch two or three video clips for each lesson.
After the video segments, online quizzes were used for students to apply the knowledge acquired to solve problems on their own (exploration phase). Using quizzes can lead to better learning performance (Strelan et al., 2020). Brief explanations of the correct solution to each quiz question were provided to help clarify students’ misunderstandings (explanation phase). Knowing which question was answered wrongly enables students to self-evaluate the content they did not understand well (evaluation phase), thereby encouraging them to re-watch the video segments with purpose. Moreover, by checking student performance on quizzes, the instructor could evaluate whether they had prepared for the class and how much they had learned (Zappe et al., 2009). These learning resources (including video clips and online quizzes) were posted on the learning management system (https://cas.xueleyun.com/), easily accessed by the students.
Interactive learning during the in-class sessions. Four phases were used to organize the student-centered active learning activities during the face-to-face meeting. First, the instructor began the lesson with a brief review (engagement phase). Inviting students to play games related to the learned concepts or revisiting the real-life issues provided outside the class could arouse their attention. The instructor also helped the students recall what they had learned outside the class to activate their prior knowledge. Next, the instructor organized whole-class or group discussions (explanation phase), focusing on difficult quiz questions with a low correctness rate. This activity allowed students to express their doubts about questions and argue over different solutions with peers. The instructor was there to provide advice and clarify students’ misunderstandings (Lai & Hwang, 2016).
In the elaboration phase, students were provided with opportunities to participate in solving problems (e.g., creating programs), which helped extend their conceptual understanding to arrive at a higher level of cognition, that is, concept application. Students were asked to struggle through the problems on their own. The instructor only gave technical help regarding the Scratch programming tool, rather than giving solutions to the students regarding the problems they needed to solve. Students were encouraged to discuss and elaborate ideas with their peers.
Finally, in the evaluation phase, the instructor assessed whether the students had mastered what they had learned. Students were invited to present their solutions to the problems. In addition, they were required to assess their own learning performance by answering several questions, such as what they had learned in the lesson and what learning content they found challenging to understand.
Non-Flipped Teaching for the Control Group
In the control group, the 5E framework was not used to guide the activity design. In the non-flipped group, the instructor taught students using lectures, in which the instructor’s primary concern was the sequence of the lecture ideas, supplemented with the teacher asking students questions and providing feedback on the students’ answers. During this period, the instructor taught students the same content that the experimental group students learned in the video lectures. Moreover, the instructor presented students with the same quiz questions. For simpler questions, the instructor provided students with correct answers and brief explanations, while for complicated ones, the instructor offered them detailed explanations. The questions explained in detail in the flipped and non-flipped groups were generally the same. After the teacher-centered lecture, students were required to complete the same programming tasks as the experimental group students did. They had opportunities to share ideas with their peers in the programming practice. Unlike in the experimental group, the instructor gave specific directions for solving the problems if the students asked for her help. Finally, the remaining time was spent on students’ solution sharing, teacher feedback, and student self-reflection. Although the control group lacked additional pre-class activities compared with the experimental group, both groups learned the same content (including quizzes and programming tasks).
Data Collection
CT Concepts
We selected the CTt developed by Román-González et al. (2017) to evaluate students’ understanding of CT concepts because this test covers the main CT contents taught in our programming training. We adapted the CTt consisting of 28 multiple-choice items as a diagnostic test to measure students’ prior knowledge. Since the visual blocks used in 20 items and options in the original CTt were not like the blocks in Scratch, all of these blocks were replaced by Scratch programming blocks. This test covered seven computational concepts, including basic directions and sequences, loops (repeat, repeat until), conditionals (if-then, if-then-else), loops and Boolean operators (repeat until and not operator), and simple functions. Furthermore, three kinds of cognitive tasks were required in this pretest: 1) a sequencing task including all types of concepts with 14 items; 2) a completion task covering all concept types with nine questions; and 3) a debugging task involving only five types of concepts excluding Boolean operators (with loops) and functions.
The posttest was modified from the CTt pretest to avoid the problem of students merely recalling the questions and corresponding answers in the pretest. Although the posttest questions were different from those of the pretest, they were similar in difficulty level and content scope. Both tests were checked by three experts with extensive programming experience (one is a faculty member responsible for teaching programming language in a top-tier Asian university, and the other two are advanced doctoral students).
Problem-Solving Performance via Programming Projects
To evaluate their problem-solving abilities, the students needed to create programs at the end of the course. This project-based assessment method, different from the CTt that focuses on gauging students’ knowledge of CT concepts, was used to assess students’ overall problem-solving performance and their ability to apply control flow structures (belonging to algorithm design) and problem decomposition abilities. This helped the researchers to obtain a more holistic view of students’ CT performance (see “Programming Projects” section for more details).
Two projects were developed by the researcher, who consulted two experts (as mentioned above) for suggestions. One of the assignments involved drawing a house using basic geometric patterns, in which the loops concept was applied. The other was to guide a virtual character along a prescribed route to arrive at a destination and collect four moneybags, which involved the conditionals and loops concepts. The students completed both projects individually and were given 30 minutes for the first project and 20 minutes for the second.
Perceptions of the 5E-Based FCM From Students and the Instructor
Based on the CTt posttest scores, students of the experimental group were divided into two subgroups, high-level and low-level achievement subgroups. Twenty-seven students from two subgroups (high-level achievement: 15 students; low-level achievement: 12 students) attended individual interviews, which lasted 5–10 minutes each. The interview focused on the student’s perception of flipped learning. The relevant questions were adapted from several surveys used in flipped classroom research (e.g., Jeong et al., 2016; Zappe et al., 2009) and focused on the use of video lectures (e.g., Did watching video lectures before class help you complete in-class activities? Could you explain more?), online quizzes (e.g., Did the completion of online quizzes help you realize that you have ignored certain concepts in video watching? If yes, what measure did you usually take?), and in-class activities (e.g., Did you find interactions in programming practice helpful to your learning? Could you explain more?). Moreover, the instructor interview aimed to understand the benefits of this 5E-based FCM and challenges that may arise in the flipped teaching implementation.
Data Analysis
CTt
There were no missing values in the two tests since the instructor checked all answer sheets when collecting data to ensure all participants had written their names on them. For each test, an individual’s score was computed by summing the number of correct answers, for a maximum of 28 points. To evaluate the effects of the 5E-based FCM on student achievement in the whole test as well as the three types of subtasks, a 2 × 2 × 3 mixed ANOVA was conducted. This analysis consisted of one between-subjects factor, pedagogy (i.e., 5E-based FCM and non-flipped instructional model), and two within-subjects factors, measurement time (i.e., the pretest and posttest) and task type (i.e., sequencing, completion, and debugging task). As the numbers of multiple-choice questions for each task were different (see “CT Concepts” section), the students’ correctness rates were used as the dependent variable. When this three-factor ANOVA yielded significant results, we conducted further explorations through Bonferroni-corrected pairwise comparisons.
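As a minimal sketch of why correctness rates make the three task types comparable, the following Python snippet converts per-task correct counts into rates. The item counts for sequencing (14) and completion (9) come from the “CT Concepts” section; the debugging count (5) is an assumption inferred from the 28-item total, and the function names and sample student record are hypothetical.

```python
# Items per task type; the debugging count is inferred as 28 - 14 - 9 (assumption).
ITEMS_PER_TASK = {"sequencing": 14, "completion": 9, "debugging": 5}


def total_score(correct_counts):
    """Raw test score: number of correct answers across all tasks (max 28)."""
    return sum(correct_counts[task] for task in ITEMS_PER_TASK)


def correctness_rates(correct_counts):
    """Proportion of items answered correctly within each task type."""
    rates = {}
    for task, n_items in ITEMS_PER_TASK.items():
        correct = correct_counts[task]
        if not 0 <= correct <= n_items:
            raise ValueError(f"{task}: expected 0..{n_items}, got {correct}")
        rates[task] = correct / n_items
    return rates


if __name__ == "__main__":
    # Hypothetical student record, not data from the study.
    student = {"sequencing": 7, "completion": 9, "debugging": 0}
    print(total_score(student))
    print(correctness_rates(student))
```

Because each rate is bounded between 0 and 1 regardless of how many items a task contains, the rates can serve as a common dependent variable across the task-type factor.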
Programming Projects
We developed a coding scheme based on the operational definition of ISTE and CSTA (2011) and similar evaluation frameworks developed by other researchers (e.g., Chao, 2016), to assess the extent to which the students could use the Scratch blocks to conduct computational problem-solving activities independently.
Figure 5 shows the two categories used for the project evaluation: goal attainment and code organization. The goal attainment category mainly represents the effectiveness of the design solution, namely, the extent to which the students could achieve the desired goals by devising their programs. The students obtained one point per goal they accomplished, with 26 points representing full marks for the project design. Figure 6 presents script examples of four students to show how the subgoals they achieved were scored. Two sub-skills involved in this problem-solving process were further checked. The first sub-skill evaluated whether students used the control flow structures correctly to algorithmize their solutions. Higher values in this sub-category mean that the students accurately applied repetition and selection structures to complete their algorithms. The second sub-skill evaluated whether the students used a decomposition strategy to break a problem into several manageable subtasks covering all goals that the whole task was intended to achieve. The students received one point if they succeeded; otherwise, they received zero.

Figure 5. Evaluation Criteria for Student Performance in Solving Problems.

Figure 6. Script Examples of Four Students With Different Sub-Goals Achievement.
Code organization refers to the students’ ability to propose efficient solutions with high readability. It mainly assesses whether there are extraneous commands in the students’ programs. Two kinds were considered: redundant commands, which were executed but made no contribution to goal attainment (see Mouza et al., 2016), and dead commands/procedures, which were never invoked (see Denner et al., 2012). Redundant commands impede the solution’s efficiency. Dead commands, in contrast, are harmless to the solution’s efficiency but can lead to visual clutter and distraction for novices. The number of redundant commands and dead commands/procedures in each student’s project was counted. A higher count indicated relatively poorer performance in this category.
There were no missing values in the project data because the instructor checked the students’ projects during data collection to ensure that all participants had named their projects using their own names. In total, 494 projects (two projects per student) were carefully coded. We opted to code all 494 programming projects rather than a sample of them to avoid any potential sampling error.
Two researchers (the lead author and a doctoral student) independently coded approximately 52% of the data. For the goal attainment category and its sub-categories, Cohen’s kappa was 0.984 (p < 0.001, 95% CI: 0.978 to 0.990), showing almost perfect inter-rater reliability (Landis & Koch, 1977). For the code organization category, as the raters had to count the number of relevant commands, the intraclass correlation coefficient (ICC) was used to check inter-rater reliability. The ICC was 0.913 (p < 0.001, 95% CI: 0.900 to 0.924), demonstrating excellent agreement beyond chance between the two raters (Landis & Koch, 1977). The differences were resolved through discussion, and the lead author then coded the remaining data. A chi-square test was conducted to explore whether there was a significant difference between the two groups in the completion rate of the problem decomposition work. Since the data in the other (sub)categories violated the normality assumption, a non-parametric test (the Mann-Whitney U test) was used to compare the learning scores of the two groups.
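For readers unfamiliar with the agreement statistic used for the goal attainment codes, the sketch below shows how Cohen’s kappa corrects raw agreement for agreement expected by chance. The two rating lists are invented for illustration (1 = goal achieved, 0 = not achieved) and are not the study’s data; the function name is likewise an assumption.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who assigned categorical codes to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal proportions, summed over codes.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Invented codes for ten sub-goals (not the study's data).
rater_a = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
rater_b = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))  # 0.783
```

In this toy example the raters agree on 9 of 10 items (0.90), but 0.54 agreement would be expected by chance alone, so kappa is lower than the raw agreement rate.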
The issue of possible outliers was also checked. A few outliers existed in the project data. We repeated the same analyses after removing these outliers and found that the results with and without outliers were similar; that is, the outliers did not materially affect the findings. Thus, we decided to keep these outliers in our data analysis and results.
Individual Interviews
The qualitative data collected from interviews helped provide insights into students’ and the instructor’s perceptions of the 5E-based FCM. The interviews were first transcribed to document the conversations and then thematically analyzed to form meaningful themes (Corbin & Strauss, 2008). When reporting the interview findings, some of the excerpts of interview transcripts were translated into English and directly quoted to elaborate on the categories generated (Johnson, 1997) and dispel any misinterpretations.
Results
We used 0.05 as the significance level for all analyses. Bonferroni adjustment was used for pairwise comparisons to minimize the probability of Type-I error. The parametric tests reported the partial Eta-squared correlation coefficient (
Student Achievement on the CTt
Table 2 presents the descriptive data on the two groups’ correctness rates in different tasks and at different measurement times. The mixed ANOVA revealed that the interaction between pedagogy and task type was not significant (F(2,244) = 1.75, p = 0.177). However, a significant main effect of pedagogy was revealed (F(1,245) = 11.12, p = 0.001), which suggests a marked difference in correctness rates for the whole test between the two groups, favoring the experimental group. This analysis also uncovered a significant main effect of task type (F(2,244) = 39.14, p < 0.001). The follow-up comparisons showed that the students performed best in the sequencing task and worst in the debugging task. All comparisons reached the significance level.
Table 2. Correctness Rates in the Pretest and Posttest for Each Group in Each Task and ANOVA Analysis Results Concerning Pedagogy.
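The Bonferroni adjustment applied to the follow-up comparisons among the three task types can be sketched as follows. With three tasks there are three pairwise comparisons, so each raw p-value is multiplied by three (capped at 1) before being compared with the 0.05 level. The raw p-values below are hypothetical and serve only to show the mechanics; they are not the study’s results.

```python
from itertools import combinations

ALPHA = 0.05
TASKS = ("sequencing", "completion", "debugging")


def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Three task types give three pairwise comparisons.
pairs = list(combinations(TASKS, 2))

# Hypothetical raw p-values, one per pair (not taken from the study).
raw_p = [0.004, 0.030, 0.0002]
adjusted = bonferroni_adjust(raw_p)
significant = [p < ALPHA for p in adjusted]

for pair, p_adj, sig in zip(pairs, adjusted, significant):
    print(pair, round(p_adj, 4), sig)
```

Note that the second hypothetical comparison would pass an unadjusted 0.05 threshold but not the adjusted one, which is the intended protection against Type-I error.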
The ANOVA revealed an interaction between measurement time and pedagogy (F(1,245) = 28.01, p < 0.001), showing a difference in correctness rates between the two groups before and after the intervention. Specifically, in the pretest, the correctness rate of the students in the experimental group was similar to that of the control group (p = 0.644), indicating that both groups were similar in terms of their initial CT knowledge. In contrast, the difference between the two groups in the posttest was statistically significant, and the experimental group performed better (p < 0.001,
Moreover, there was an interaction effect between measurement time and task type (F(2,244) = 20.73, p < 0.001). Specifically, there were significant differences between the three tasks for both tests (i.e., pretest: p < 0.001,
Finally, there was a marked main effect of measurement time (F(1,245) = 339.88, p < 0.001), indicating that the students’ correctness rates increased significantly from pretest to posttest. After further analyzing the simple main effect of measurement time, the results showed that for both groups, regardless of the task, the students obtained higher correctness rates in the posttest than in the pretest (all comparisons achieved the significance level with p < 0.001, except that of the completion task in the control group with p = 0.004). There was no three-way interaction (p = 0.629).
In summary, students’ correctness rates on the CTt in the flipped and non-flipped groups were initially equivalent. Although the achievement of both groups improved significantly after programming instruction, the flipped group performed better, with medium to large effect sizes. When considering the task type, the results showed that the two groups initially performed similarly on each task, and both obtained higher correctness rates on the completion and sequencing tasks than on the debugging task. After the interventions, the performance of both groups on each task increased markedly from pretest to posttest. However, the improvement of the flipped group was greater than that of the non-flipped group. Both groups had the highest correctness rates on the sequencing task.
Student Achievement in Applying CT to Solve Problems
For problem-solving performance, the differences between the two groups were statistically significant (Table 3). Specifically, the students in the 5E-based flipped classroom achieved more goals (Mdn = 26) through their self-created programs than those in the non-flipped classroom (Mdn = 20), U = 4920.00, z = −5.076, p < 0.001. Furthermore, the experimental group generally wrote fewer extraneous commands in programs than the control group (U = 5399.50, z = −4.103, p < 0.001).
Table 3. Mann-Whitney U Test on Student Scores for Problem-Solving Performance.
When further considering the extent to which the students could apply control flow structures during algorithm design, results showed that the experimental group performed better (Mdn = 25) than the control group (Mdn = 21), U = 4849.00, z = −5.091, p < 0.001, r = 0.324. Specifically, as shown in Table 4, the 5E-based FCM helped the students better master the repetition structure (U = 4818.50, z = −5.160, p < 0.001) and selection structure (U = 6240.50, z = −3.343, p < 0.001), thus improving their potential to use relevant concepts to achieve goals.
Table 4. Mann-Whitney U Test on Student Scores for the Application of Control Flow Structures.
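The text does not state how the effect size r accompanying the Mann-Whitney U test was computed. One common convention, assumed here, is r = |z| / √N, where N is the total number of participants. Under that assumption, the reported z = −5.091 and the combined sample of 125 + 122 students reproduce the reported r = 0.324. The function name below is illustrative.

```python
import math


def effect_size_r(z, n_total):
    """Effect size r for a rank-based test, using the convention r = |z| / sqrt(N)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return abs(z) / math.sqrt(n_total)


N_TOTAL = 125 + 122  # experimental + control group sizes reported in the study

# z statistic reported for the overall application of control flow structures.
r_control_flow = effect_size_r(-5.091, N_TOTAL)
print(round(r_control_flow, 3))  # 0.324
```

By Cohen’s widely used benchmarks (0.1 small, 0.3 medium, 0.5 large), a value of about 0.32 corresponds to a medium effect.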
As for their decomposition ability (Table 5), the chi-square test revealed a significant difference in the students’ completion of decomposition work between the two groups (
Table 5. Chi-Square Test on Student Performance on Decomposition.
Individual Interviews
Student interview data mainly showed their viewpoints on the learning processes outside and inside the class. Most students agreed that the out-of-class session benefitted learning (Figure 7). For example, the video lectures gave them greater learning autonomy and supported their problem-solving and acquisition of knowledge. Furthermore, online quizzes guided their learning and deepened their understanding of knowledge.

Figure 7. Advantages of Learning Outside and Inside the Class With the 5E-Based FCM.
Three advantages of in-class learning were identified (Figure 7). Ten students mentioned that whole-class revision and discussions contributed to their learning. When asked whether they interacted with peers during the activities (n = 26), all interviewees stated that they engaged in learning interactions in which they asked questions (n = 24), helped others (n = 21), or both (n = 19). Furthermore, the majority of those who received help reported that the interactions helped their learning (n = 19).
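The interaction counts reported above are internally consistent, which can be verified with the inclusion-exclusion principle: students who asked questions plus students who helped others, minus those counted in both groups, should equal the number of interviewees who interacted at all. The short check below uses only the counts given in the text; the function name is illustrative.

```python
def students_interacting(asked, helped, both):
    """Number of students who asked questions, helped others, or did both."""
    return asked + helped - both


asked, helped, both = 24, 21, 19  # counts reported in the interviews

total = students_interacting(asked, helped, both)
only_asked = asked - both    # asked questions but did not help others
only_helped = helped - both  # helped others but did not ask questions

print(total, only_asked, only_helped)  # 26 5 2
```

The total of 26 matches the number of interviewees, confirming that every interviewee reported at least one form of peer interaction.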
The instructor’s interview primarily focused on the benefits and challenges of the 5E-based FCM, which can further help us gain in-depth insights into the effects of the 5E flipped teaching on students’ learning. The instructor emphasized the importance of the 5E framework in the design and organization of flipped teaching activities. For example, the instructor stated, “5E framework directed me to organize the order of various learning activities better to maximize the positive effect of the flipped teaching”. Moreover, according to the instructor, one of the main benefits of the 5E-based FCM was that it enabled student self-paced learning (i.e., “Before coming to class, the pre-class activities allowed students in the 5E-based FCM to conduct self-paced learning. Video lectures allowed students to drag the progress bar to pause, replay, or skip certain content. This can help students manage their cognitive load.”). Another benefit identified from the instructor’s interview was that the 5E flipped model supported a higher level of student engagement (i.e., “the exploration phases in the pre-class learning enabled students to identify gaps in their understanding of concepts … in the explanation phase during the in-class learning, many students always show great enthusiasm for viewpoint sharing…in the subsequent elaboration phase, I felt that students were more actively involved in programming tasks.”).
Aside from these advantages, two issues were identified based on the interview data of the instructor. One issue appeared at students’ out-of-class learning stage, while another arose in their in-class session (Figure 8).

Figure 8. Challenges of Implementing the 5E-Based FCM.
Discussion
Overall, the results show that the 5E-based FCM is indeed an effective pedagogy to help students develop their CT. The main findings are discussed in four subsections: student understanding of CT concepts, student ability to apply CT to solve problems, essential components of the 5E-based flipped classroom, and suggestions to address the identified challenges.
Student Understanding of CT Concepts
The average correctness rates of both groups on the pretest were similar, at around 43%. This result is to be expected, given that CT training is not included in the school curriculum. Moreover, both groups achieved their lowest correctness rates (33%) on the debugging task. Debugging is a complex cognitive process (Klahr & Carver, 1988). The error detection stage in debugging is especially challenging for young novices, as it requires them to use concepts and strategies they are unfamiliar with to solve problems (Falloon, 2016).
Despite relatively low initial correctness rates, both the experimental and control groups experienced significant improvements in correctness rates on the three tasks (sequencing, completion, and debugging). This suggests that learning programming with Scratch can enhance elementary school students’ understanding of computational concepts, which is consistent with the findings of previous research (e.g., Pérez-Marín et al., 2020; Sáez-López et al., 2016). Moreover, both groups performed best on the sequencing task in the posttest. This may be attributed to the routine exercises (e.g., online quizzes and programming project production) in which students learned how code segments work to achieve specific goals and attempted to develop their own scripts to solve problems.
In addition, the flipped group significantly outperformed the non-flipped group, indicating that the 5E-based FCM had a positive impact on the mastery of computational concepts. Specifically, the flipped group saw a noticeable increase in correctness rates on the debugging task. This may be because flipped teaching helps strengthen student understanding of concepts and gives students more time to engage in debugging practice while designing projects. However, this does not necessarily mean that the students are now experts in program debugging. Program debugging is too complex to master within a relatively short period (Zhang & Nouri, 2019). Future research should collect different data (e.g., students’ debugging performance during program refinements) to examine the development of debugging ability.
Student Ability to Apply CT to Solve Problems
Our findings revealed a marked difference in overall problem-solving performance between the two conditions. First, the students in the flipped group achieved more sub-goals by designing programs. Second, they tended to make their programs clearer and easier to understand. Specifically, most of them developed a habit of deleting dead commands from the script area, which helps them avoid visual clutter and focus on comprehending how programs work (Aivaloglou & Hermans, 2016). Moreover, they were more likely to propose a higher-quality solution without redundant commands.
When further measuring student performance on CT sub-skills, our results again showed that the 5E-based FCM could strongly benefit students. For example, more students in the flipped group used high-level structures (e.g., repeat, repeat until, and if-then) and applied them correctly to deal with problems. More students in the flipped group could also successfully divide the problem into the required subtasks essential to achieve the ultimate goal. In contrast, the non-flipped group students performed worse in these two aspects, such as making mistakes when applying control flow structures and being unable to complete the problem decomposition work.
Essential Components of the 5E-Based Flipped Classroom
Based on the findings for the third research question, “What are the students’ perceptions of the 5E-based flipped lessons?”, we can infer that the positive CT and problem-solving performance of the flipped group can be attributed to the in- and out-of-class activity design guided by the 5E framework. Therefore, this section discusses some key components of the flipped classroom to show how they contribute to students’ CT development.
The Out-of-Class Self-Learning Stage
The learning stage before class mainly includes two activities: 1) video lectures containing the engagement, exploration, and explanation phases and 2) follow-up online quizzes covering the exploration, explanation, and evaluation phases. The primary objective is to familiarize students with basic computational concepts and their applications, focusing on knowledge recall, understanding, and application.
The video lecture is the core teaching strategy at this stage. It can help students conduct self-learning based on their learning needs (van Alten et al., 2020), such as replaying specific video clips to seek answers when they faced difficulties or skipping certain easy-to-understand content to save time. Video lectures were designed guided by three 5E phases. They first piqued students’ learning interest, then allowed students to explore the meanings and applications of relevant concepts, and finally provided students with the teacher’s explanations and demonstrations. The students interviewed considered the video lecture a helpful approach to preview the lessons, commenting that learning via videos could help them familiarize themselves with new knowledge, solve quiz questions, and acquire strategies to deal with in-class problems.
Online quizzes are also an essential component. Many studies have adopted online exercises as a method of measuring students’ pre-class learning (Akçayır & Akçayır, 2018). Computerized feedback offers students evidence of their misunderstandings, which may encourage them to act further to eliminate misconceptions on programming concepts. For example, when faced with difficult questions, some students chose to ask their parents for help, while some tended to watch the video clips repeatedly until they could address those questions. Lo et al. (2018) also agreed that online exercises could guide student learning, for example, by motivating them to re-attempt quizzes until they achieve full marks. Such actions may deepen students’ understanding. Furthermore, student responses to quiz questions allow the teacher to track their mastery of knowledge and determine which items should be emphasized during class. Most students completed tasks of watching video clips and answering quiz questions, indicating that limiting the pre-class learning duration for each lesson to an average of 20 minutes is an acceptable level when considering students’ pre-class workloads.
The In-Class Interactive Learning Stage
Creating programs is cognitively challenging for novices. The four phases (i.e., engagement, explanation, elaboration, and evaluation) in the 5E framework were used to guide the activity design of the in-class learning session, which encouraged the students to engage in higher-order cognitive activities (i.e., applying, analyzing, evaluating, and creating).
Introducing an engagement phase at the start of each lesson is necessary, considering that some students sometimes forgot the content learned before class. A brief review could help students activate their prior knowledge and facilitate students’ engagement in subsequent active learning activities. The students reported that this activity helped strengthen their understanding of the required expertise. In the explanation phase, as Lai and Hwang (2016) suggested, the teacher guided the whole class to discuss quiz questions that had high error rates, so that the students could express their misunderstandings. Our interview results showed that the discussions helped eliminate student misconceptions.
Conducting problem-solving activities in the elaboration phase is a crucial way to use in-class time productively. CT refers to the capacity to use fundamental computer concepts to solve problems. Problem-centered activities provide students with opportunities to struggle through the problems on their own. In these activities, they think about and explore a problem, and then create programs to accomplish specific goals during which they experience problem decomposition, strategy selection, abstraction creation, algorithm design, and error fixing, which contribute to students’ mastery of CT skills (Topalli & Cagiltay, 2018).
Suggestions to Address the Identified Challenges
Based on the findings for the fourth research question, “What are the instructor’s perceptions of the 5E-based flipped lessons?”, we identified two challenges associated with online quizzes and in-class evaluation activities. In this section, we provide remedial suggestions to address these challenges in future practice, which may further enhance the positive effect of 5E-based flipped teaching.
When doing the quiz, students could not replay the videos to search for forgotten knowledge until they had submitted answers to all questions. Offering students scaffolding (e.g., learning sheets) may encourage them to observe the videos seriously, prevent them from forgetting knowledge learned in the video lectures, and improve their correctness rates on quizzes. Furthermore, some students tended to answer unknown quiz questions randomly. Consequently, offering a means of communication through which the teacher can provide timely feedback to dispel student doubts may further promote student learning outside of the classroom (Bhagat et al., 2016).
The teacher found that many students completed their self-reflections casually without offering clear and detailed descriptions due to the limited time and motivation. Therefore, future research should leave more time for students’ self-evaluation. Moreover, the teacher’s feedback may motivate them to reflect on their learning seriously.
Implications for Relevant Research and Practitioners
This study has important implications for enriching empirical studies on CT training in elementary schools. There are several methodological limitations in existing empirical studies. First, many researchers have adopted only one type of assessment method to measure students’ CT performance (e.g., Choi et al., 2017; Rose et al., 2017; Sáez-López et al., 2016; Sung et al., 2017), which can lead to an incomplete view of student development in CT skills (Román-González et al., 2019). Furthermore, some studies have relied merely on self-report CT instruments (e.g., Jun et al., 2017), which may not provide reliable evidence for changes in students’ actual abilities. Second, the sample sizes of previous studies tend to be small, ranging from 28 to 40 subjects (e.g., Chen et al., 2017; Pugnali et al., 2017; Rose et al., 2017). Using a small sample size can undermine the study’s validity and increase the chance of assuming a false premise to be true (Faber & Fonseca, 2014). It also makes the research findings very preliminary (Zhao & Shute, 2019), impeding their generalization. Third, several studies did not use a control group (e.g., Sáez-López et al., 2016) or had substantially unequal numbers of participants in the experimental and control groups (e.g., Pérez-Marín et al., 2020). This can decrease the reliability of the (comparison) results. This study addresses these limitations. We adopted a quasi-experimental design with experimental and control groups to explore to what extent the 5E-based FCM can promote CT development among 247 elementary schoolers. We also used various quantitative (i.e., CTt test and project evaluation) and qualitative (i.e., individual interviews) methods to collect more reliable data and to develop a comprehensive understanding of the effectiveness of the 5E-based FCM.
In addition, this study has significant implications for teaching practices. First, we demonstrated that the 5E framework could guide the implementation of the flipped classroom in elementary schools, which may offer instructors an effective teaching method. Second, our results highlight the efficacy of 5E flipped teaching for elementary schoolers’ CT and problem-solving abilities development, thereby enriching teaching resources for CT training.
Conclusion and Limitations
This study integrated five instructional phases into the FCM and evaluated its effects on the CT development of elementary schoolers. Overall, our results show that the 5E-based flipped teaching model can enhance 4th graders’ understanding of CT concepts and their computational problem-solving performance (including specific CT skills).
The positive outcome of the 5E flipped model could not be attributed solely to more class time. Although the flipped classroom approach can help free up more time for students and instructors to use, it is not the time per se that determines how effective the approach will be. Rather, prior work indicates that the 5E framework promotes student achievement more than the traditional lecture-based approach (e.g., Boddy et al., 2003; Mullins, 2017). The 5E framework achieves this by fostering active student participation (Tanner, 2010). For example, the elaboration phase during the in-class sessions allowed for active learning in which the students were asked to think about a problem (e.g., how to draw a flower using Scratch) and to struggle through it on their own. The instructor only gave technical help regarding the Scratch programming tool, rather than giving the students solutions to the problems they needed to solve. Thinking about a task without immediate answers from the instructor can lead to deeper cognitive processing, which in turn can help students understand the content better (Deslauriers et al., 2019). Active learning can therefore enhance learning compared with lecturing, in which the instructor usually explains the solutions instead of allowing students to think about them first (Deslauriers et al., 2019). During the in-class session, the flipped group showed a higher level of engagement than the non-flipped group. A brief review to activate students’ prior learning at the beginning of the class helped the flipped group prepare better for subsequent in-class learning activities. Many flipped group students were able to share their viewpoints in in-class discussions. They were also more actively involved in exploring the solutions to programming tasks, during which they exchanged ideas with peers.
Nevertheless, several limitations should be considered when interpreting our results. First, due to the ICT course schedule at the participating school, we could only conduct the programming class once a week. Moreover, the CTt posttest was administered in the third week after the last class. Such an arrangement may not have maximized the intervention effect. The effect of flipped teaching on student learning may be more pronounced if future research adopts a more intensive class schedule. Second, the teaching duration was short. Longitudinal studies should be conducted to investigate changes in student achievement and attitudes over a longer period, providing more informative guidelines for implementing the 5E-based FCM in K-12 contexts. Finally, we only enrolled 4th graders in our programming course. We hope future studies will explore whether this 5E-based FCM works in other subject areas and with students of different grade levels.
Despite these limitations, given the importance of CT development in children and the feasibility of the FCM in elementary education, we believe that this study provides valuable advice on how to design the flipped classroom to promote student understanding of CT concepts and computational problem-solving abilities in elementary schools.
Footnotes
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
