Abstract
To combat the high cost of textbooks, open (digitally free) textbooks have recently entered the textbook market. Griggs and Jackson (2017) reviewed the open introductory psychology textbooks presently available to provide interested teachers with essential information about these texts and how they compare with traditional (commercial) introductory textbooks. They did not, however, include any discussion of the research that has examined the effects of open introductory psychology textbooks and other open educational resources versus traditional introductory textbooks on students’ course performance (e.g., course grades). The present study provides a review of this research. The review indicated that no firm conclusions can be drawn, not just because there is a limited number of studies with seemingly conflicting findings but, more importantly, because of numerous uncontrolled relevant variables in all of the studies. To aid researchers who want to conduct future studies on this topic and reviewers who will evaluate these studies, we discuss these variables and the control issues they create.
Griggs and Jackson (2017) reviewed all of the open (digitally free) introductory psychology textbooks presently available and concluded that their overall quality in this early stage of their development was clearly less than that of current traditional (commercial) introductory textbooks. Nonetheless, they acknowledged that these open texts meet an introductory psychology textbook market need to help financially and academically underprepared (at-risk) students, especially at community colleges, with the cost of their college education. This is a large need in psychology because community college students comprise half of the 1.5 million annual enrollment in the introductory psychology course (Ewing et al., 2010).
The purposes of Griggs and Jackson (2017) were to make introductory teachers aware of what open introductory texts were available and to provide critical assessments of each of these texts to aid introductory teachers who might want to consider open textbooks in their text selection process. There is, however, other critical information that would aid teachers considering adoption of an open textbook in making their selection decision. It concerns the effects of open introductory psychology textbooks and other open educational resources (OERs) versus traditional introductory texts on students’ course performance (e.g., course grades). OERs are any type of educational materials that are in the public domain or introduced with an open publishing license. In addition to textbooks, OERs include resources such as course assignments, tests, lecture notes, syllabi, videos, review materials, and so on. Anyone can legally copy, use, adapt, remix, and reshare such resources. Thus, open introductory psychology textbooks can be adapted to fit a teacher’s or department’s version of the introductory course, and OERs can be incorporated in such adaptations. Chapters can be deleted or shortened, new content can be added, video links can be incorporated, and so on. Such customizability is especially relevant to the introductory psychology course because of its enormous variability across teachers and departments in terms of goals, methods, and content (Weiten & Houska, 2015). This seemingly unlimited customizability of open textbooks along with the obvious cost savings to students are what make open textbooks appealing to some introductory psychology teachers, especially those at schools with large numbers of at-risk students. But how do such open textbooks impact students’ course performance compared to traditional introductory textbooks? It is this question that the present study addressed. 
Just as Griggs and Jackson provided descriptions and critical assessments of the available open introductory psychology textbooks, we provide descriptions and critical assessments of the studies that have been conducted hitherto on the effects of open textbooks and resources versus traditional textbooks on students’ performance in the introductory psychology course.
Method
To identify the research studies that have been conducted on the effects of open introductory psychology textbooks or other OERs versus traditional introductory psychology textbooks on student learning outcomes, we performed Google searches using numerous queries, such as open introductory psychology textbooks, open educational resources, effectiveness of open textbooks on student learning, and impact of open texts on student success. Also, upon finding a relevant article, we searched its reference list for other pertinent articles and conducted online searches for other relevant studies by the article’s author(s). These searches identified eight possibly relevant articles and a review article on studies of student and faculty perceptions of OERs and their general efficacy.
Upon careful inspection, however, only two of the eight identified articles examined the effectiveness of open introductory psychology textbooks or OERs in the introductory course: Hilton and Laman (2012) and Fischer, Hilton, Robinson, and Wiley (2015). In his review article, Hilton (2016) mentioned a ninth study, an unpublished doctoral dissertation—Robinson (2015). Actually, one of the two previously identified relevant studies (Fischer, Hilton, Robinson, & Wiley, 2015) was a follow-up study to this dissertation. Hence, we included Robinson’s dissertation in our analyses. Hilton’s review focused on the general findings across the numerous disciplines examined in the nine studies on student and faculty perceptions of OERs and their efficacy and did not detail or summarize the findings for courses in any specific discipline, such as the introductory psychology course, the focus of our study. Thus, in chronological order, we describe and critically assess the efficacy findings for the introductory psychology courses included in each of the three relevant studies.
Hilton and Laman (2012)
Hilton and Laman (2012) was the first published research study to evaluate the effects of an open introductory psychology textbook on students’ course performance. This study took place in 2011 at Houston Community College (HCC), a large community college with more than 70,000 students at that time. Hilton and Laman pointed out that many of these students were not only academically at risk but also financially underprepared for the cost of college. They further reported that because some students did not purchase books at all and other students used out-of-date editions or nonassigned textbooks, the HCC psychology faculty decided to make cost a primary adoption criterion and thus consider open textbooks for the introductory course.
In the fall 2011 term, an HCC adaptation of Stangor’s (2011) open introductory textbook, Introduction to Psychology, was used in 23 sections of the introductory course with a total enrollment of 690 students. Baseline control data were taken from a sample of introductory classes from the preceding spring 2011 term in which traditional introductory textbooks had been used. Three measures of course performance were used: average class grade point average (GPA), student withdrawal rate, and departmental final exam score. Hilton and Laman found that average GPA improved from 1.6 to 2.0 from spring to fall, the withdrawal rate dropped from 14% to 7.1%, and the average grade on the departmental final exam improved from 67.6% to 71.1%. Similar results were reported for two instructors who taught sections in the spring term using traditional texts and in the fall term using open texts. However, no statistical comparisons were reported for any of the results. Thus, we do not know whether these differences are statistically reliable.
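As an illustration of the kind of statistical comparison that was not reported, the following minimal Python sketch applies a two-proportion z test to the withdrawal rates. The 14% and 7.1% rates are those reported by Hilton and Laman, but the group sizes are assumptions: the fall N of roughly 370 is the approximate value in their Table 1, and the spring N of 400 is purely hypothetical because none was reported. The function name is ours, and the output therefore says nothing about the actual study.

```python
import math


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Return (z, two-sided p) for the difference between two proportions."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both groups are all events or all non-events: no difference to test.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


SPRING_N = 400  # HYPOTHETICAL: the spring sample size was not reported
FALL_N = 370    # approximate N for the open textbook condition (Table 1)

spring_withdrawals = round(0.14 * SPRING_N)  # 14% reported withdrawal rate
fall_withdrawals = round(0.071 * FALL_N)     # 7.1% reported withdrawal rate

z, p = two_proportion_z_test(spring_withdrawals, SPRING_N,
                             fall_withdrawals, FALL_N)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

A fuller analysis would also require the actual group sizes and comparable tests for GPA and final exam scores, none of which can be reconstructed from the published report.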
In addition to the absence of any statistical comparisons, there were many confounds operating in this study that likely impacted the results. For example, the psychology department committee (a separate committee from the textbook adoption committee) changed the course objectives and final exam test bank for the fall term, and because of the open nature of the Stangor text, a group of HCC psychology faculty were able to customize the Stangor text to align it with the changed standards and course objectives. They also adapted the text so that the reading level was lowered to one that they thought appropriate for their students (Hilton & Laman, 2012, p. 267). How this was done was not specified. In brief, the Stangor text was adapted to specifically fit the course and students at HCC, but the traditional texts used in the spring term obviously could not have been adapted in this manner. In fact, none of the five different traditional textbooks used in the spring term were identified so it is impossible to compare the level of difficulty of these texts versus that of the Stangor text (classified as a brief, lower level text by Griggs & Jackson, 2017). Level of text difficulty obviously impacts student learning outcomes and thus likely played an important role in this study, especially because the low-level Stangor textbook was adapted to be even lower in level. For a discussion of level of text difficulty, see Griggs (1999).
Also, the N used to compute the findings for the open textbook condition in fall 2011 (∼370, see Table 1, p. 268) was not 690 as given in the method section for this condition. Even if one adds the Ns for the two instructors who taught sections in both spring and fall 2011 (see Tables 2 and 3, pp. 268–269), the N is not 690 but rather 792. In brief, the N for the fall 2011 open textbook condition does not match the reported sample size of 690 students for this condition in the method section. It follows then that approximately 370 fall students were likely selected from the 690 students in sections using the adapted open textbook in fall 2011. However, Hilton and Laman did not say that they were selected, much less how they were selected. In addition, no N is reported in the method section for the control group selected from the spring 2011 term, and no rationale is provided for how the classes from the spring term were selected. We are only told that “baseline data from a sample of classes that had been offered in the spring 2011 semester” were taken (p. 268). Also, 4 of the 6 Ns given in Tables 1–3 are given as approximate Ns, as if the exact numbers of participants were not known.
Furthermore, several ancillaries for student use were developed for the HCC adaptation of the open Stangor text, including a narrated final examination review with slides, and the 100 questions on the final examination used in all the fall sections of the course were revised to be compatible with the HCC adaptation. The fact that there is no mention of a comparable narrated final examination review for students in the spring term is important because if there were no such review in the spring term, the students in the open textbook condition in the fall term would have an unfair advantage, confounding the final exam score measure. In addition, the authors point out that the revised final examination test bank used in the fall term may have been simpler than the one used in the spring (Hilton & Laman, 2012, p. 270). It is also important to realize that the act of revising the final exam test bank for the fall term compromised the departmental final exam score measure, rendering the findings for this measure largely meaningless. This modification of the final exam also likely affected one of the other performance measures, average class GPA, in favor of the open textbook condition. Regardless, given the confounds created by the revision of the final exam between spring and fall terms and the narrated review for the fall final exam and the ensuing impact of these confounds on the departmental final exam score (and very likely, the average class GPA), the study’s methodological flaws that we discussed, and the lack of any statistical analyses of the results of the study, it is clear that no firm conclusions can be drawn from this poorly controlled and reported study.
Robinson (2015)
Robinson (2015) is an unpublished doctoral dissertation done by Thomas Jared Robinson at Brigham Young University. Robinson examined student success in seven courses, one of which was introductory psychology, using OERs in the 2012–2013 academic year at seven different colleges associated with the Project Kaleidoscope (PK) Open Course Initiative. This was the pilot year of the initiative, supported by grant funding from Next Generation Learning Challenges of the Bill and Melinda Gates Foundation. PK brings together colleges in a collaborative effort to improve the success of at-risk students through the adoption, measurement, and improvement in course design for high-enrollment introductory courses using only OERs. By course design, they mean a set of open materials and assessments necessary to deliver the course, that is, a traditional textbook replacement package composed only of open materials. No information, however, on what specific open textbooks and resources and traditional introductory psychology textbooks were used in the various introductory psychology courses was provided. Hence, we have an unknown OERs treatment pitted against an equally unknown traditional treatment. Obviously then, the impact of treatment level of difficulty cannot be addressed. Lastly, because the OERs used were not specified, we will use the general term, OERs, when we refer to this condition.
The psychology data came from only 3 of the 7 schools: Chadron State College (a 4-year college in Nebraska), College of the Redwoods (a community college in California), and Tompkins Cortland Community College (a community college in New York). All three schools, like all the schools in the PK Open Course Initiative, serve predominantly at-risk students, designated as such by each college’s internal evaluation (Bliss, Hilton, Wiley, & Thanos, 2013; Bliss, Robinson, Hilton, & Wiley, 2013). The Ns for the two conditions were extremely different in size with 1,849 students in the traditional textbook (control) condition and only 223 in the OERs (treatment) condition.
The student course performance variables that are of interest here were course grade and probability of passing with a C− or better. Because of the quasi-experimental design that does not allow random assignment of participants, Robinson used propensity score matching to reduce the risk of selection bias and provide more valid estimates of the effects of using OERs versus traditional textbooks. As Robinson pointed out (p. 35), such matching strengthens the experimental design by minimizing preexisting differences between the control and treatment groups by balancing the probabilities of being in either group (for more detail on propensity score matching and how it is used, see Austin, 2011; d’Agostino, 1998; Luellen, Shadish, & Clark, 2005).
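For readers unfamiliar with the technique, the following Python sketch shows one common form of propensity score matching: greedy one-to-one nearest-neighbor matching within a caliper. It is an illustration only and not necessarily the procedure Robinson used. The function name, the student identifiers, the scores, and the caliper of 0.05 are all hypothetical, and the propensity scores (each student's estimated probability of being in the OERs condition given background covariates) are assumed to have been estimated beforehand, for example by logistic regression.

```python
from typing import Dict, List, Tuple


def match_nearest(treated: Dict[str, float],
                  control: Dict[str, float],
                  caliper: float = 0.05) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching on propensity scores.

    Each treated unit is paired with the closest unused control unit,
    provided the score difference does not exceed the caliper.
    """
    available = dict(control)  # controls not yet used (no replacement)
    pairs = []
    # Process treated units in order of ascending propensity score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is matched at most once
    return pairs


# HYPOTHETICAL propensity scores for a few students in each condition.
oer_students = {"t1": 0.30, "t2": 0.62, "t3": 0.99}
traditional_students = {"c1": 0.28, "c2": 0.60, "c3": 0.90, "c4": 0.31}

matched = match_nearest(oer_students, traditional_students)
print(matched)
```

In this toy data set, the student with a score of 0.99 has no control within the caliper and is dropped, which illustrates why matching can shrink the analyzed sample when the groups differ greatly in size or composition. Outcomes would then be compared only across the matched pairs.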
The statistical analyses for the psychology course data indicated that both course grade and the probability of passing with a C− or better significantly differed between the two conditions. Students in the OERs condition scored approximately half of a course grade lower than those in the traditional psychology textbook condition, a significant difference. Psychology students in the OERs condition also showed a significant decrease in probability of passing with a C− or better. These findings for psychology, along with comparable ones for the business courses included in the study, are important because they are the first findings of a negative effect associated with an OER adoption. It is worth noting though that no such negative effects were found for the other five courses (algebra, biology, geography, reading, and writing) examined in the study.
Robinson (2015) described some limitations of his study that he thought constituted threats to its validity, especially its external validity, and thus provided sufficient reason to interpret the findings with caution (see discussion on pp. 62–65). For example, the self-selection of the colleges in the study meant that these colleges would only represent a very small subsection of all postsecondary institutions and thus limit the generalizability of the findings. It is also important to realize that as part of the PK grant funding for these schools, the open curriculums used in the seven courses were curated and vetted by external PK staff, who also trained teachers in how to use the curated OERs. Hence, this abnormal adoption scenario may have also contributed to the findings, thereby limiting their generalizability. Astutely, Robinson pointed out that threats to external validity are inextricably tied to the use of OERs because of their flexible nature in that they can be revised and remixed freely so that it is not very likely that any two classes would use the same OERs. This flexibility of OERs means that teachers can use these open resources (as in the Hilton & Laman, 2012, study) in ways that would not be possible with traditional textbooks because of their rigidity. Thus, because OERs were used in the courses examined in Robinson’s study and in the Fischer et al. (2015) study to be discussed next, definite threats to the external validity of both studies were present and the generalizability of their findings would be limited.
Given the wildly discrepant Ns for the OERs and traditional conditions, it seems almost a certainty that the sets of instructors for the two conditions were different. Hence, there was no control for instructor characteristics. For example, one such characteristic that varies greatly among instructors and would be very relevant to this study would be the instructors’ grading policies, especially with respect to leniency versus toughness. Course performance would be better in sections taught by instructors who graded leniently versus those taught by instructors who were tough graders. This is not the only instructor characteristic that would impact the results of studies such as this one. Another good example is instructors’ teaching proficiency. Students in sections taught by more proficient instructors would likely perform better than those in sections taught by less proficient instructors. To control for such possible confounds, a more appropriate way to conduct this study and others like it would be to use only one set of instructors with each instructor teaching an equal number of OERs and traditional sections of the introductory course. Without controlling for instructor characteristics in this manner, any differences in students’ course performance could just be a function of differences in instructors’ grading inclinations and teaching proficiencies.
Fischer, Hilton, Robinson, and Wiley (2015)
Fischer et al. (2015) followed up Robinson’s (2015) study at 10 PK institutions in the 2013–2014 academic year (the second year of PK’s Open Course Initiative), 4 of which were among those included in Robinson’s study. Fifteen courses, including four different introductory psychology courses, that each had sections using OERs and sections using traditional textbooks, were examined. No information on what specific OERs or traditional textbooks were used in the courses is given so the effect of text level of difficulty again cannot be addressed. There were three student success measures of course performance: course grade and the probability of passing the course with a C− or better, as in Robinson’s study, and course completion rate. However, due to small sample sizes, propensity score matching could not be applied for these measures. Because students in OERs conditions were only compared with students in traditional textbook conditions who were taking the same introductory course, there was an insufficient number of students in the OERs condition to do propensity score matching when diffused across the 15 courses.
Four different introductory psychology courses were analyzed separately. As in Robinson’s (2015) study, the Ns were very discrepant, ranging from 52 to 822 in the traditional textbook condition and from 26 to 109 in the OERs condition. The differences in Ns for three of the four courses were very large (e.g., for one course, the N was 822 in the traditional condition and only 26 in the OERs condition). There were a total of 2,052 students in the traditional textbook condition and only 323 in the OERs condition. Thus, as in Robinson, the sets of instructors were almost certainly not the same, so there was no control for instructor characteristics, such as their grading policies, thereby confounding the findings.
There were no significant differences in course completion rate for any of the four introductory psychology courses. However, in one of the four courses, students in the OERs condition had significantly higher grades, but there were no significant differences with respect to course grades in the other three courses. With respect to passing with a C− or better, no significant differences were observed for any of the four introductory courses. Thus, Robinson’s findings for these latter two student course performance measures were not replicated. We should also note that when the conditions were not significantly different for a course, Fischer et al. only reported that they were not significant and did not provide any comparative numerical data for the two conditions. Hence, the direction of the nonsignificant differences cannot be determined.
Why the negative effects associated with the use of OERs in introductory psychology observed by Robinson (2015) disappeared in this follow-up study is a difficult question to answer. Several factors could be involved. We will speculate about a few of these. At least some of the schools in the Fischer et al. (2015) study were in their second year of using the OERs, and thus some teachers in the OERs condition might have been more proficient in their use of the OERs. Because some of the psychology data in Robinson’s study came from two schools (Chadron State College and Tompkins Cortland Community College) that were also in the Fischer et al. school sample, it is possible that this increased teacher proficiency factor could have been operating, at least for two of the introductory psychology courses. However, we do not know for sure that any of the introductory psychology course data came from these two schools that were in both studies because Fischer et al. did not report the college from which the data for each course were taken.
It is also possible that the traditional textbooks used in the Fischer et al. (2015) study were higher in level of difficulty than those used in Robinson’s (2015) study, resulting in poorer student performance in the traditional textbook conditions in the Fischer et al. study and thereby creating the difference in results for the two studies. Because no information on these texts is provided in either study, this possibility cannot be examined. Also, because faculty characteristics were not controlled, it is possible that the teachers in the traditional text condition in Robinson’s study were better teachers or more lenient graders than those in the OERs condition, leading to a greater level of student success in their courses; and this might not have been the case in the Fischer et al. study, causing the superior performance of students in the traditional textbook condition to disappear. Similarly, variance in exam difficulty within and between the studies might play a role in the observed differences between the two studies. In sum, there is just too much uncontrolled variance across the two studies to determine the specific causes underlying the different results observed in the studies. Furthermore, all of the threats to external validity that Robinson pointed out for his study also apply to the Fischer et al. study. Thus, as with the results of Robinson’s study, the Fischer et al. findings should be interpreted with caution.
Conclusions
What can we conclude from the three studies that we have discussed about the effects of OERs and open textbooks versus traditional textbooks on students’ performance in the introductory psychology course? Simply put, no firm conclusions can be drawn. Why? All three studies that we discussed involved uncontrollable variance across instructors, textbooks and course materials, courses, exams, and institutions. Thus, a host of variables (teacher characteristics, especially teaching proficiency and grading leniency; how the classes were taught; the level of difficulty of the course textbooks and OERs; the difficulty of course exams; and so on) were not controlled and thereby could have impacted the results of these studies. The authors of all of the studies thus far seem to be oblivious to the importance of the textbook difficulty factor, which is extremely important in comparing introductory psychology textbooks (Griggs, 1999). In the Robinson (2015) study and Fischer et al.’s follow-up study, identifications of the traditional textbooks used in the control conditions were not given, and no descriptions of the OERs used in the treatment conditions were provided. It is as if the difficulty of course materials were not important to student course performance. To control for this factor, future researchers should ensure that they use lower level traditional introductory textbooks because the three currently available open introductory psychology textbooks are lower level (Griggs & Jackson, 2017). Also, to aid in controlling instructor variance, all of the instructors involved in future efficacy studies should be concurrently teaching both traditional and open sections of the introductory course in the same term. In sum, although there are only three studies thus far, they make us acutely aware of the many control issues that are part and parcel of addressing the efficacy question of OERs versus traditional introductory psychology textbooks on students’ course performance.
Not only will future researchers need to address these control issues so that the only difference between conditions is the type of textbook being used, but reviewers of such studies will need to ensure that researchers have done so.
The situation of studying this efficacy question, however, is even more problematic because of another factor that provides uncontrollable variance and very likely played a role in the three studies that we discussed. It is the unlimited customizability of open textbooks and OERs to fit individual introductory courses. Openly licensed texts and resources can be revised by individual teachers and departments, so that they are tailored to a teacher’s, or department’s, specific course objectives, design, and content. Content can be added, deleted, remixed, or revised to create a specific course design. Such limitless customizability is clearly not possible with traditional introductory textbooks. This ability to tailor an open text to a specific version of the introductory course is a critical difference between open textbooks and resources, on the one hand, and traditional texts, on the other, that is independent of the quality of the texts or resources and that would always favor the open text or resources with respect to their effect on students’ course performance. This factor likely played an important role in the results of all three of the studies that we reviewed, but especially in Hilton and Laman (2012), given that the HCC psychology faculty not only adapted the Stangor open introductory textbook to fit their version of the introductory course but also lowered the text’s reading level to be more closely aligned with the reading level of their students.
Given the host of variables that need to be controlled and the critical customizability difference between open and traditional textbooks, it may not be possible to do efficacy studies that are sufficiently controlled. In addition, as Robinson (2015) pointed out, such seemingly boundless customizability of open textbooks and OERs severely limits the external validity of such efficacy studies. For example, even if two schools are using adaptations of the same open text, these adaptations would likely be very different. A good example of this is the College of Lake County’s adaptation of the HCC adaptation of the Stangor open introductory psychology text (retrievable from http://dept.clcillinois.edu/psy/IntroductionToPsychologyText.pdf). Although the College of Lake County’s adaptation is directly based on the HCC adaptation, the two adaptations are very different. For example, there are only 11 chapters in the Lake County adaptation versus 14 in the HCC adaptation, and the chapters in the Lake County adaptation have been reordered and further revised to fit the Lake County introductory course. Also, the external validity of the studies thus far has been seriously limited by the fact that all of the colleges involved in these studies have been schools with predominantly at-risk students, clearly not representative of the general college population. In sum, given the difficulty in successfully controlling all of the relevant variables in open textbook or OERs versus traditional introductory psychology textbook efficacy studies and the inherent limits on the external validity of such studies, we think that it is a moot question whether such research can yield truly meaningful results. Possibly, future efficacy research will prove us wrong, but the efficacy research conducted thus far certainly does not.
Declaration of Conflicting Interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Both authors are authors of commercial psychology textbooks. This authorship could potentially be perceived as a financial conflict of interest, given that this paper evaluates studies of the effects of open (digitally free) versus commercial textbooks on student performance.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
