Abstract
This study used a pretest-posttest design to measure student learning in undergraduate statistics. Data were derived from 185 students enrolled in six different sections of a social statistics course taught over a seven-year period by the same sociology instructor. The pretest-posttest instrument reveals statistically significant gains in knowledge for each course section and all sections combined. The results demonstrate that pretests can establish students’ prior knowledge at the beginning of the semester, while posttests measure learning at the end of the course. Namely, pretest-posttest knowledge gain was influenced more by the content and presentation of the social statistics course than by students’ statistical ability and/or test-taking skills prior to the class. Recommendations on the use of pretest-posttest data to inform pedagogy in social statistics are included in the discussion.
The baccalaureate degree curricula of most undergraduate sociology and other social science programs require one or more courses in statistics. Despite the importance of quantitative data analysis in these disciplines, students with limited mathematics backgrounds and anxiety over statistics present a challenge to professors teaching these courses (Bandalos, Finney, and Geske 2003; Blalock 1987; Cerrito 1999; Forte 1995; Garfield and Chance 2000; Macheski et al. 2008; Wilder 2010). Not surprisingly, faculty assigned to teach statistics search for approaches that improve students’ statistical skills. Numerous classroom techniques (e.g., small-group work, collaborative testing, humor, computer-assisted instruction, active learning, etc.) have been described in college statistics courses (Delucchi 2006; Helmericks 1993; Schacht and Stewart 1990; Schumm et al. 2002; Strangfeld 2013). Faculty using these practices report greater student satisfaction with the course (Fischer 1996; Perkins and Saris 2001; Potter 1995; Stork 2003), reduction of anxiety (DeCesare 2007; Lomax and Moosavi 2002), and a belief that learning was greater than students could have achieved without the instructional innovation (Auster 2000; Wybraniec and Wilmoth 1999; Yamarik 2007).
Upon close review, many of these studies provide little or no direct empirical evidence (i.e., direct assessment) that students’ statistics skills and knowledge (i.e., learning) actually increased because of the teaching strategy. Assessment is subjective and frequently relies on the perceptions of students or faculty (Fisher-Giorlando 1992; Lomax and Moosavi 2002; Marson 2007; Schacht and Stewart 1992). While not without some merit, comments based on informal impressions, and even quantitative measures of course satisfaction (e.g., student evaluations of teaching or alumni surveys), do not directly signify student learning. As indicators of perceived knowledge (rather than actual knowledge), these indirect methods of assessment are limited because assumptions must be made about what such self-reports mean (Price and Randall 2008).
Students’ academic performance, such as examination scores and course grades, is often used as a proxy for learning (Borresen 1990; Delucchi 2007; Perkins and Saris 2001; Smith 2003; Yamarik 2007), but it also does not represent direct evidence of learning (Baker 1985; Chin 2002; Garfield and Chance 2000; Lucal et al. 2003; Wagenaar 2002; Weiss 2002). Why not? Learning is different from performance. Learning is increased knowledge: that is, the difference between what students know at the beginning of the semester and what they know at the end. Performance is demonstrating mastery: for example, accurate computation of a statistic or a correct answer to a multiple-choice question. This distinction between learning and performance is important since students enter courses with unequal knowledge, skills, and educational experiences. For instance, a student may begin the semester knowing little, learn a great deal, perform adequately, and receive average grades, or a student may enter a course knowing a great deal, learn a small amount, perform very well, and earn high grades (Neuman 1989). Consequently, pretests are necessary to establish prior knowledge, and posttests are requisite to measure learning.
Direct assessment of student learning in undergraduate statistics courses, based on pretests and posttests, is rare (Bridges et al. 1998; Price and Randall 2008). An objective of this study is to evaluate the effect of course completion on students’ statistical knowledge. Therefore, I use a pretest-posttest design to measure students’ prior course knowledge and to measure learning at the end of the semester. This method of assessment is followed by recommendations on the use of pretest-posttest data to inform pedagogy in undergraduate social statistics.
Data and Methods
Institutional Context
The study was conducted at a small (approximately 1,500 students) state-supported baccalaureate degree–granting university in the United States. The “Carnegie classification” describes the institution as a Baccalaureate College–Liberal Arts (McCormick 2001). The institution is coeducational (68 percent women; 32 percent men), ethnically diverse (59 percent ethnic minorities), and composed predominantly of students of nontraditional age (65 percent are 25 years of age or older). Eighty-two percent of students are employed (40 percent working more than 31 hours per week), and all students commute to the campus.
Course Description
Statistical Analysis is an undergraduate course taught in the Division of Social Sciences. The course introduces students to descriptive and inferential statistics. Completion of College Algebra (or a higher-level mathematics course) with a grade of “C” or better is the prerequisite. Statistical Analysis is one of two methods courses required for all social science majors (e.g., anthropology, economics, political science, psychology, and sociology) at the university. In addition, the course fulfills a core requirement for professional studies majors (e.g., business/accounting and public administration). As a result, approximately 68 percent of the students enrolled in Statistical Analysis are social science majors and 32 percent come from professional studies.
Sample
Student data were derived from enrollment lists and class records for six sections of Statistical Analysis that I taught over a seven-year period. Complete information was obtained for 185 of the 214 students enrolled in the course at the beginning of each semester, an 86 percent response rate. The class met for 75 minutes, twice a week, during a 16-week semester. The course consisted primarily of lectures on descriptive and inferential statistics that paralleled chapters in the text and readings in a booklet about the Statistical Package for the Social Sciences (SPSS) (Stangor 2000).
Requirements for the course included Examination 1 (15 percent), Examination 2 (20 percent), a final examination (35 percent), two small group projects worth 10 percent each, and 12 quizzes weighted a combined 10 percent. Textbook homework and computer exercises were assigned but not collected on a regular basis. This approach was intended to encourage students to be proactive and self-diagnostic with regard to course content. Students reporting difficulty completing these assignments were invited to seek my assistance prior to the due date. While the text (most recently Levin and Fox 2011) and SPSS booklet (Stangor 2000) changed as new editions became available, the instructor, lectures, homework assignments, quizzes, group projects, examinations, and grading criteria were essentially constant across the six sections of the course.
Pretest-Posttest Instrument
To assess students’ statistical knowledge, a comprehensive multiple-choice examination was administered at the second class meeting during the first week of the semester. This pretest contained 30 questions on descriptive and inferential statistics derived from “typical” computational and quantitative reasoning skills covered in the Statistical Analysis course. (See the appendix for pretest-posttest content areas.) The same instrument was administered as a posttest at the last regular class session, one week prior to the final examination. Students were given 45 minutes to complete each test and could use a calculator and consult their textbook. Pretest and posttest scores did not count as a grade or earn extra credit. Only students who completed both tests were included in the data set. The Office of Institutional Research and Assessment (serving as the campus institutional review board for faculty research proposals requiring student and course-level data) approved the Statistical Analysis course pretest-posttest project upon which this article is based. Information collected and analyzed did not include student names or other individually identifiable information.
In this study, the term learning refers to actual improvement over the span of a semester in measurable skills and knowledge regarding social statistics. The dependent variable (Improvement) represented learning or knowledge gained from the course. Improvement was computed by subtracting the percentage of correct answers (out of 30) that students received on the pretest from the percentage correct on the posttest. Positive values denoted an increase in students’ statistical knowledge from the beginning to the end of the course (pretest/posttest gain = Improvement), while zero or negative values signified no improvement. The higher the value, the more knowledge a student gained.
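As a minimal sketch of the Improvement coding described above, the following Python fragment converts counts of correct answers on the 30-item instrument into percentages and subtracts the pretest percentage from the posttest percentage. The function names and the sample scores are hypothetical and chosen only for illustration; they are not data from the study.

```python
from typing import List, Tuple

N_ITEMS = 30  # the instrument described above contains 30 questions


def percent_correct(n_correct: int, n_items: int = N_ITEMS) -> float:
    """Convert a count of correct answers into a percentage."""
    return 100.0 * n_correct / n_items


def improvement(pre_correct: int, post_correct: int) -> float:
    """Posttest percentage correct minus pretest percentage correct."""
    return percent_correct(post_correct) - percent_correct(pre_correct)


# Hypothetical (pretest, posttest) counts of correct answers, one pair per student.
scores: List[Tuple[int, int]] = [(13, 19), (10, 21), (18, 18), (15, 12)]
gains = [improvement(pre, post) for pre, post in scores]

for (pre, post), gain in zip(scores, gains):
    label = "gain" if gain > 0 else "no improvement"
    print(f"pre={pre:2d} post={post:2d} Improvement={gain:6.1f} ({label})")
```

Because both percentages share the same 30-item base, Improvement is expressed in percentage points, and a student who answers the same number of items correctly on both tests receives a value of zero.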
Course Characteristics
Class size and course meeting times were recorded. In addition to completing the pretest and posttest, students completed three examinations during the semester. These examinations required students to perform computations and interpret data. During each 75-minute test, students worked independently but were permitted to use calculators, textbooks, lecture notes, quizzes, and homework assignments. The three examinations were scored on a 0- to 100-point scale.
Approximately once a week during the final 10 to 15 minutes of class, students completed a quiz. Each quiz involved computations and interpretations similar to (but less rigorous than) those on examinations. Students could use calculators, textbooks, lecture notes, and their homework but were required to complete quizzes independently. The first four quizzes covered descriptive statistics and paralleled the quantitative skills assessed on Examination 1. Quizzes 5 through 8 covered inferential statistics and involved knowledge evaluated on Examination 2. The last four quizzes focused on statistical relationships and demanded knowledge similar to that found on the final examination. The 12 quizzes were scored on a 0- to 10-point scale.
Course requirements also included completion of two group projects. Approximately four weeks prior to a project’s due date, students were instructed to organize themselves into groups containing two to four members. Groups decided how to divide the workload, but each member was required to be involved in all stages of the project. Students were collectively responsible for their project, and all members received a group grade. To discourage “free riders” (i.e., individuals who contribute little or nothing to the project), students were asked to apprise me of members who did not attend group meetings or were not performing their share of responsibilities. The class was informed that individuals who did not contribute their “fair share” to the project would have their grade lowered accordingly. After the initial formation of the groups, students met outside of class. Students were encouraged to meet with me when they had questions and to submit rough drafts of their papers.
Group Project 1 introduced students to material that would appear on Examination 1. Working in groups, students used SPSS to compute frequency distributions, cross-tabulations, and descriptive statistics (i.e., measures of central tendency and dispersion) for nominal, ordinal, and ratio scale variables. After obtaining an SPSS printout, the group was required to interpret the data and write up the results in a two- to three-page paper. Group Project 2 paralleled content on the final examination (e.g., correlation and regression). Students were required to select one scholarly article from several that I placed on reserve in the library. Each group was instructed to discuss the selected article and interpret its findings. Subsequently, the group was required to compose a two- to three-page paper demonstrating their ability to interpret multiple regression analysis as it appeared in the article. Grades were assigned to the group projects on a 0- to 12-point scale, where 12 = A, 11 = A–, 10 = B+, and so on.
Student Characteristics
Demographic information was collected on the pretest and posttest. Student characteristics included age, gender, major, and prior knowledge (percentage of correct answers on pretest). Table 1 presents coding and descriptive statistics for all course and student characteristics.
Table 1. Course and Student Coding, Means, and Standard Deviations (N = 185).
Results
Pretest-Posttest Differences
To determine whether course completion was associated with learning in social statistics, I had to establish that knowledge increased at the end of the semester. The study’s design generated appropriate data, while a statistical test was necessary to determine significant differences between pretest and posttest means (Improvement, i.e., the dependent variable). I applied a paired-sample t test to each of the six sections of the course Statistical Analysis.
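For readers who want to see the arithmetic behind a paired-sample t test, the sketch below computes the t statistic and degrees of freedom from matched pretest and posttest percentages using only the Python standard library. The function name and the six pairs of scores are hypothetical illustrations, not data from the study. Obtaining a p value would additionally require a t distribution, which is omitted here.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(pre: Sequence[float], post: Sequence[float]) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for matched pre/post scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need at least two matched pretest-posttest pairs")
    # The test operates on each student's own difference score.
    diffs = [b - a for a, b in zip(pre, post)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = sd / math.sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1


# Hypothetical percentage-correct scores for six students.
pretest = [40.0, 46.7, 33.3, 50.0, 43.3, 36.7]
posttest = [63.3, 66.7, 56.7, 70.0, 60.0, 63.3]

t_stat, df = paired_t(pretest, posttest)
print(f"t = {t_stat:.2f}, df = {df}")
```

A positive t indicates that the mean posttest score exceeds the mean pretest score. The statistic is then compared against the t distribution with n - 1 degrees of freedom.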
Pretest-posttest means, standard deviations, and differences for each class taught and all classes combined appear in Table 2. The table displays the mean percentages of correct responses for the pretests and posttests. The difference between means is statistically significant for each course and all courses combined (t = 22.0, p < .001), revealing substantial improvement (i.e., knowledge gain) in test scores. The overall mean pretest score (percentage correct) is 43.9 percent, compared with the mean posttest score of 64.8 percent. This difference between the pretest and posttest means equals 20.9 percentage points. Given the significant paired-sample t tests, I conclude that students’ social statistics knowledge was greater at the end of the semester than at the beginning of the semester. This increased learning occurred in addition to the effects of students’ prior knowledge, as measured by the pretest. Consequently, it is unlikely that gains in learning can be attributed to student experiences prior to enrollment in my social statistics course.
Table 2. Pretest-Posttest Means, Standard Deviations, and Differences.
Note: The values for the difference column are the changes in the percentage correct from the pretest to the posttest. *p < .001 (two-tailed test).
Discussion
The results reveal statistically significant gains in knowledge for each course section and all sections combined. The pretest-posttest instrument consistently documents improvement in student learning: that is, on average, a gain of nearly 21 percentage points in correct responses between test administrations. I attribute this increase to learning and the acquisition of quantitative skills. Pretest-posttest knowledge gain was influenced more by the content and presentation of the social statistics course than by students’ statistical ability and/or test-taking skills prior to the class.
Pedagogical Implications
This study has application to higher education in the areas of pedagogy, student learning, and assessment. The findings are germane to faculty in general and to those who teach statistics, one of the most challenging courses in the undergraduate curriculum, in particular. By integrating a pretest-posttest design into social statistics courses, sociologists generate data that may be used to improve their pedagogy and enhance student learning of quantitative skills.
The results are by no means representative of all students or institutions, so the conclusions drawn are best viewed as tentative. Clearly, students performed better, on average, on the posttest. However, the present study suggests only that students can learn to interpret and analyze quantitative data in an undergraduate course. The next step is to examine the relative effectiveness of different teaching strategies for instruction in social statistics. Below, I offer pedagogical advice.
A pretest-posttest instrument, once put into practice, can be used to improve the process of teaching statistical skills to undergraduates. For example, posttest content on which students performed poorly can be revised, and increased emphasis and class time can be devoted to these topics. Then again, some students may enter the course with stronger than expected quantitative skills; consequently, they grasp complex statistical concepts more readily than anticipated by the instructor. Pretests can identify a student’s prior knowledge of course content, enabling faculty to spend less time on those areas. In short, both the instructor and students can benefit from a pretest-posttest course design.
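One way to locate posttest content on which students performed poorly is a simple item-level tally. The sketch below is an illustration under stated assumptions: responses are stored as 1 (correct) or 0 (incorrect), the 50 percent cutoff is arbitrary, and the small response matrix is hypothetical rather than drawn from the study.

```python
from typing import List


def item_percent_correct(responses: List[List[int]]) -> List[float]:
    """responses[s][i] is 1 if student s answered item i correctly, else 0."""
    n_students = len(responses)
    n_items = len(responses[0])
    return [
        100.0 * sum(student[i] for student in responses) / n_students
        for i in range(n_items)
    ]


def weak_items(responses: List[List[int]], threshold: float = 50.0) -> List[int]:
    """Return 1-based item numbers whose percent correct is below the threshold."""
    percents = item_percent_correct(responses)
    return [i + 1 for i, pct in enumerate(percents) if pct < threshold]


# Hypothetical posttest responses: four students by five items.
posttest_responses = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 1, 0],
    [1, 1, 1, 0, 0],
]

print(item_percent_correct(posttest_responses))
print(weak_items(posttest_responses))
```

Item numbers flagged this way can be matched to the content areas listed in the appendix to decide where additional class time might be spent. Running the same tally on pretest responses indicates which topics students already know on entry.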
The assessment model described here provides faculty with the opportunity to identify groups of students with similar pretest-posttest results. For instance, an instructor may notice that students who received high pretest scores exhibited the most improvement on the posttest. Conversely, students who showed the least pretest-posttest improvement may have performed poorly on the pretest. Such information is invaluable for those teaching social statistics. If “at-risk” students can be detected with the pretest, remediation of problem areas is possible. Students identified early in the semester can be offered review and preview course materials and/or grouped with students who scored high on the pretest to work on collaborative learning projects.
While pretest-posttest assessment can document student learning, the absence of knowledge gain or “value added” from class completion indicates a need to improve the course. In response to such an outcome, sociologists can use a pretest-posttest course design to evaluate the effectiveness of innovative pedagogy. For example, faculty could use hierarchical regression analysis (Schutz et al. 1998) to assess the net effect of teaching strategies (e.g., active learning exercises, use of computer technology in data analysis, and cooperative group projects) on students’ pretest-posttest results. If experimentation with cooperative group activities reveals a strong positive effect on student learning, faculty possess empirical evidence to justify course modifications that expand application of this pedagogical innovation. The same analytic procedure can identify the effects of student characteristics (e.g., age, gender, field of study) on pretest-posttest differences. Multivariate analysis, in combination with pretest-posttest data, helps faculty answer the following pedagogical question: “How much of the total variance in students’ learning is explained by specific teaching strategies and/or student and course characteristics?”
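The variance question posed above can be illustrated with a two-step (hierarchical) regression in which the change in R-squared between steps estimates the additional variance explained by a teaching strategy. The sketch below is a bare-bones ordinary least squares implementation in standard-library Python, offered as an assumption-laden illustration rather than the procedure of Schutz et al. (1998). The variable names and the eight student records are hypothetical, and in practice a statistical package such as SPSS would be used.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(predictors: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """R-squared of an OLS regression of y on the predictors plus an intercept."""
    rows = [[1.0, *p] for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical records: age, group-activity section (1 = yes), Improvement.
age = [22.0, 35.0, 28.0, 41.0, 24.0, 30.0, 38.0, 26.0]
group_activity = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0]
improvement_scores = [10.0, 26.7, 13.3, 30.0, 23.3, 16.7, 20.0, 6.7]

# Step 1: student characteristic only. Step 2: add the teaching strategy.
r2_step1 = r_squared([[a] for a in age], improvement_scores)
r2_step2 = r_squared([[a, g] for a, g in zip(age, group_activity)], improvement_scores)
delta_r2 = r2_step2 - r2_step1
print(f"Step 1 R^2 = {r2_step1:.3f}, Step 2 R^2 = {r2_step2:.3f}, change = {delta_r2:.3f}")
```

Entering student characteristics first and the teaching strategy second means the change in R-squared reflects variance in Improvement associated with the strategy after the earlier variables have been taken into account.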
Academic departments, colleges, and universities seeking to maintain accreditation and demonstrate compliance with professional and government guidelines increasingly must rely on assessment of students. Pretest-posttest course designs are one means of documenting that learning is taking place in the classroom. As distance learning technology and instructional software replace traditional approaches to teaching social statistics, there is increasing need for practical and accessible ways to determine the efficacy of these methods. When designed to evaluate students’ statistical knowledge and skills, pretest-posttest assessments inform pedagogy and focus faculty on the most important assessment goal: improving students’ learning of social statistics.
Suggestions for Future Research
Although the findings of this study suggest that students’ statistical abilities increased over the course of a single semester, the results do not inform the debate over other salient issues in social statistics, for example, assessment of particular teaching strategies. Consequently, I suggest the following areas for exploration. First, future research should identify course characteristics that improve learning in social statistics at different types of institutions and on diverse student populations. Modifications in course design and implementation may be required for the effective application of instructional innovation in different environments. Second, more studies are required that connect pedagogical practices to actual student learning (i.e., direct assessment). Using student evaluations of teaching, attitude surveys, and even course grade point averages as learning outcomes does not adequately measure whether a particular technique increased students’ statistical skills and knowledge. I suggest using gains in information content-learning to evaluate outcomes (Gelles 1980). Third, there is a need for more experimental assessments of pedagogy in social statistics courses. This would include research designs that use a systematic method of comparison, using both pretest-posttest and experimental and control groups (Baker 1985; Chin 2002). For example, faculty assigned to teach multiple sections of statistics might use a “new” method of instruction in one section and compare the amount learned with a traditionally taught course.
Conclusion
During the seven-year period in which the six sections of social statistics were taught, student evaluations of the course were very high, exceeding the campus average. Evidence that students report high levels of satisfaction is gratifying, especially when one considers the notorious reputation that social statistics has for many students. Nevertheless, had the high evaluations been juxtaposed with little or no evidence of student learning, skepticism about the efficacy of my instruction would be justified. Faculty seeking new ways to teach social statistics should continue to experiment with their pedagogy. However, they must also assess student learning and be prepared to modify the content and delivery of social statistics when evidence of knowledge gain cannot be linked to course completion.
Appendix
Pretest-Posttest Content Areas for the Statistical Analysis Course
| Item No. | Topic |
|---|---|
| 1 | Organizing raw data—descriptive statistics |
| 2 | Frequency distributions—descriptive statistics |
| 3 | Contingency (cross-tabulation) tables—descriptive statistics |
| 4 | Contingency (cross-tabulation) tables—descriptive statistics |
| 5 | Histogram—descriptive statistics |
| 6 | Scatter plot—descriptive statistics |
| 7 | Skewness—descriptive statistics |
| 8 | Percentiles—descriptive statistics |
| 9 | Central tendency (mean)—descriptive statistics |
| 10 | Central tendency (mode)—descriptive statistics |
| 11 | Central tendency (median)—descriptive statistics |
| 12 | Variance and standard deviation—descriptive statistics |
| 13 | Normal curve—inferential statistics |
| 14 | Normal curve—inferential statistics |
| 15 | Confidence interval—inferential statistics |
| 16 | Hypothesis testing—inferential statistics |
| 17 | Hypothesis testing—inferential statistics |
| 18 | Hypothesis testing—inferential statistics |
| 19 | Hypothesis testing—inferential statistics |
| 20 | Hypothesis testing—inferential statistics |
| 21 | t test—inferential statistics |
| 22 | t test—inferential statistics |
| 23 | Analysis of variance (ANOVA)—inferential statistics |
| 24 | Analysis of variance (ANOVA)—inferential statistics |
| 25 | Chi-square—inferential statistics |
| 26 | Correlation |
| 27 | Correlation |
| 28 | Regression |
| 29 | Regression |
| 30 | Multiple regression |
