Abstract
Research has identified a number of problems limiting the implementation of content standards in the classroom. Curriculum materials may be among the most important influences on teachers’ instruction. As new standards roll out, there is skepticism about the alignment of “Common Core–aligned” curriculum materials to the standards. This analysis is the first to investigate claims of alignment in the context of fourth-grade mathematics using the only widely used alignment tool capable of estimating the alignment of curriculum materials with the standards. The results indicate substantial areas of misalignment; in particular, the textbooks studied systematically overemphasize procedures and memorization relative to the standards, among other weaknesses. The findings challenge publishers’ alignment claims and motivate further research on curriculum alignment.
The federal role in classroom instruction has dramatically grown over the past decade. Throughout the last quarter of the 20th century, many states developed and implemented content standards specifying the knowledge and skills students were to acquire at particular grade levels. However, the federal government required standards in only a few grades and subjects. Though there had been sporadic efforts at establishing national content standards prior to 2001, Bill Clinton’s effort in 1994 coming the closest, decisions about instructional content had always been left to states and even districts and schools.
The No Child Left Behind Act of 2001 (NCLB) mandated states adopt their own content standards in mathematics and reading, and all states followed suit. But research indicated problems with the 50 sets of state standards. For instance, NCLB-era standards were criticized for being poorly structured from grade to grade (Schmidt, Wang, & McKnight, 2005), for emphasizing procedural skills at the expense of conceptual understanding (Polikoff, Porter, & Smithson, 2011), and for varying dramatically from state to state (Porter, Polikoff, & Smithson, 2009).
Citing international competitiveness and the need to prepare students for success in college and careers, a group of governors came together with educational leaders in 2009 to create the Common Core State Standards (CCSS) in grades K–12 mathematics and English language arts. Spurred by the Obama administration’s Race to the Top program, more than 40 states have fully adopted these standards. Given the historic fragmentation of K–12 education policy, this is a milestone in the history of standards-based education reform in the United States. Many scholars believe that the CCSS have the potential for resulting in meaningful improvements in the content of instruction and student mastery of advanced content (e.g., Carmichael, Martino, Porter-Magee, & Wilson, 2010; Cobb & Jackson, 2011; Schmidt & Houang, 2012).
To be sure, there are numerous challenges to implementing the standards. Foremost among these is the quality and alignment of supporting documents and policies, including assessments, textbooks/curriculum materials, and teacher professional development. The theories of standards-based reform argue that teachers must receive clear and mutually reinforcing messages from standards, assessments, and curriculum materials about the content they should be teaching (Smith & O’Day, 1991), and research indeed suggests that coherent (i.e., better-aligned) policies are associated with stronger instructional responses to standards (Polikoff, 2012a). In this analysis I define alignment as agreement on both topic and cognitive demand (Porter, 2002), though I explore alternative definitions of alignment as well. Following this definition, in order for a material, such as a textbook, to be aligned to the CCSS, the book must address all of the content specified in the standards at the target grade levels and no content that is outside the scope of the standards.
As the past decade of implementation of state standards has shown, teachers will not be able to successfully implement the standards if they are not supported with high-quality, aligned materials (Hill, 2001; Polikoff, 2012b; Spillane, 2004). Standards contain vague and sometimes complex language that is difficult for teachers to interpret (Hill, 2001). Even if teachers are able to appropriately interpret the standards, it is challenging to find curriculum materials that adequately cover the target material—school district personnel making adoption decisions are often forced to rely on nonexistent or problematic information about curriculum quality and alignment (Zeringue, Spencer, Mark, & Schwinden, 2010). Thus, in order for the CCSS to achieve their desired effects, the standards must be supported with aligned textbooks and curriculum materials.
We also know from research that textbooks often vary substantially in their effects on student achievement. For instance, an experiment conducted by the National Center for Educational Evaluation found that first and second graders taught using Saxon Math outperformed those taught using Scott Foresman–Addison Wesley by 0.17 standard deviations (Agodini et al., 2010). And a study of student achievement in Indiana highlighted that schools using the SBG mathematics curriculum significantly outperformed those using the Saxon curriculum (Bhatt & Koedel, 2012).
There has also been increasing national attention paid to curriculum. For instance, a report from the Brookings Institution argued that “the Common Core standards will only have a chance of raising student achievement if they are implemented with high-quality materials, but there is currently no basis to measure the quality of materials” (Chingos & Whitehurst, 2012, p. 1). The authors of that report emphasized that curricular reforms are relatively inexpensive as compared with popular policy reforms, such as teacher quality, school choice, and class size reduction. In short, high-quality curricula can have sometimes substantial impacts on achievement at relatively low cost. There is less evidence about what causes some curricula to be more effective than others, but a logical hypothesis is that the best textbooks are the ones that are best aligned to the standards and assessments. These materials help teachers faithfully implement the standards and, therefore, produce gains on assessments aligned to those standards. To the extent that the standards represent a “good” target, such gains would presumably be of value.
Given the adoption of the CCSS nearly nationwide and the recent evidence as to the effectiveness of curricula, there is a great need for work that can help teachers implement curricula aligned with the standards. Textbook companies have been making claims of alignment since shortly after the standards were created (Sawchuk, 2012), but there is reason to be skeptical of these claims given their poor track record (Schmidt et al., 2001). Rather, what is needed is a rigorous technique to use high-quality alignment methods, giving information about the alignment of curricula directly to teachers, schools, and districts. This study represents the most detailed analysis of textbook alignment to standards yet conducted. I apply a widely used alignment technique to three “Common Core–aligned” textbooks, three editions of the same textbooks aligned to previous Florida standards (the Next-Generation Sunshine State Standards [NGSSS]), and one commonly used text that is not explicitly aligned to any standards in fourth-grade mathematics. I rate the textbooks on their alignment to the CCSS and to state standards and identify areas of alignment and misalignment. Specifically, I address the following research questions:
To what extent are four popular textbooks aligned with the Common Core standards for fourth grade mathematics? To the extent that there is misalignment, what are the sources of that misalignment?
To what extent do Common Core–aligned textbooks differ in content coverage from versions aligned with prior state standards?
In what follows, I first briefly describe the prior literature on curriculum material alignment and describe the centrality of alignment in standards-based reform’s theory of action. Next, I describe the alignment methodology and the sampling strategy. Finally, I present results from a series of alignment analyses. The results are not intended to apply to all CCSS-aligned texts, but they show a troubling lack of alignment with the standards for the chosen texts. In particular, the textbooks cover almost all the topics in the standards, but they fail to reach the advanced levels of cognitive demand called for by the standards. The CCSS-aligned texts are also more closely aligned with prior NGSSS-aligned versions of the texts than should be the case given the differences between fourth-grade CCSS math and Florida’s NGSSS. Given these findings, I call for more, and more detailed, alignment analyses of curriculum materials in other grades and subjects by researchers, practitioners, and publishers.
Background
Standards-Based Reform
For two decades, policymakers have implemented standards-based education reforms to bring coherence to K–12 policy and practice. The vision of standards-based reforms is that clear, coherent content standards, coupled with mutually reinforcing supports in the form of assessments, curriculum materials, and teacher in-service and preservice professional development, will result in improved curriculum alignment and instructional quality, leading to concomitant improvements in student achievement in the target content areas (Clune, 1993; Smith & O’Day, 1991). NCLB effectively mandated the basic provisions of standards-based reform, including content standards, aligned assessments, and school accountability, in 2002.
In spite of the conceptual appeal of the vision of standards-based reform, research suggests that its effects on instruction and student outcomes have been modestly positive, at best. For instance, while teachers generally report great efforts at instructional improvement (e.g., Hamilton & Berends, 2006), instruction remains poorly structured across grades and sites (Polikoff, 2012b; Schmidt, Cogan, Houang, & McKnight, 2011). Recent work provides startling evidence of instructional incoherence and poor standards alignment in mathematics (Jacobs et al., 2006; Polikoff, 2012b, 2012c). Furthermore, variation in curricular coverage accounts for a meaningful proportion of the persistent achievement gaps in mathematics (Schmidt et al., 2011).
Perhaps not surprisingly, given these weak instructional responses, the achievement effects of standards-based reforms have been underwhelming. The most methodologically sophisticated research suggests that NCLB and related accountability policies have had positive effects of roughly 0.2 standard deviations in mathematics (Dee & Jacob, 2011; Hanushek & Raymond, 2005; Rouse, Hannaway, Goldhaber, & Figlio, 2007), with no consistent evidence of achievement gap reductions. These gains, while perhaps impressive in light of persistent difficulties in raising achievement nationwide, are by no means large enough to erase long-standing achievement gaps.
The Role of Textbooks
There are numerous contributors to the weak implementation of standards in the classroom. One contributor is the design of state assessments and accountability policies, which has narrowed teachers’ focus to that which is tested and has undermined the content messages of the standards (Hamilton et al., 2007). For instance, state assessments under NCLB were not well aligned with standards, sending teachers conflicting messages about what to teach (Polikoff et al., 2011). In particular, these tests narrowly sampled from the domain of the standards, leaving vast swaths of standards material untested (Jennings & Bearak, 2014). Furthermore, they tended to concentrate on the lowest levels of cognitive demand (memorization, procedures), neglecting more conceptual skills (Polikoff et al., 2011).
A second likely contributor to weak standards implementation is the poor quality and alignment of textbooks and curriculum materials (Hill, 2001; Spillane, 2004). A number of studies have commented on the overwhelming size and repetition of U.S. textbooks (Flanders, 1987), especially when compared with texts from other (notably, East Asian) countries (Alajmi, 2012; Schmidt et al., 2001; Schmidt, McKnight, & Raizen, 1997). Textbooks are an important influence on student learning experiences. Textbooks affect students directly through daily use and indirectly through teachers’ use of texts to guide instruction (Ball & Cohen, 1996). Large majorities of teachers and students use textbooks frequently (Chingos & Whitehurst, 2012), even in an era when curriculum materials are increasingly available online. To be sure, teachers do not passively implement textbooks as written—rather, teachers adapt them as they are implemented (e.g., Barr, 1988; Freeman & Porter, 1989; Remillard, 2005; Sosniak & Stodolsky, 1993). Nevertheless, textbooks clearly affect what is taught and what students learn (Schmidt et al., 2001; Schmidt, Houang, & Cogan, 2002). In mathematics, in particular, topics that are not included in textbooks are unlikely to be taught (Stein, Remillard, & Smith, 2007).
The study of textbooks is based on the concept of student opportunity to learn (OTL), which has been a guiding principle in standards-based reform (McDonnell, 1995). OTL theories argue that the content, form, quality, and duration of instruction are primary drivers of student learning. While the intended curriculum corresponds to content standards and the enacted curriculum corresponds to the content teachers teach (Porter, 2002), textbooks mediate the standards-to-practice continuum. That is, textbook authors interpret the standards, and these interpretations influence what teachers implement in the classroom (Ball & Cohen, 1996). Of course, textbooks are far from the only influences on teachers’ instruction, and there are no recent reliable data on the prevalence of textbook use (a 2002 California survey found 92% of teachers reporting they used textbooks; Oakes & Saunders, 2002). Recent years have seen a proliferation of technological developments in the area of curriculum that have expanded teachers’ implementation options. Despite this, textbooks are still widely used in U.S. classrooms (Chingos & Whitehurst, 2012).
Textbook Alignment
There has never been a systematic examination of the alignment of textbooks to state standards. A recent review of alignment methodologies (Martone & Sireci, 2009) concluded that only one of the three primary alignment methodologies used today can allow for comparisons among standards, assessments, textbooks, and instruction: the Surveys of Enacted Curriculum (SEC) approach. The other two main approaches have been used only for test-to-standards alignment analyses. However, there have been no large-scale, published analyses of textbook alignment using any methodology. Still, there are several relevant studies of textbook content that bear on the issue of textbook alignment with state content standards.
Dating back to the 1980s, analyses of elementary mathematics textbooks revealed that their content varied substantially. One study used content analyses of several texts based on a three-dimensional content taxonomy, concluding that the texts differed enough that they did not define a de facto national curriculum (Freeman et al., 1983). Nevertheless, reviews of popular mathematics textbook series showed similar levels of repetitiveness from grade to grade, such that students being taught from any of the books could be expected to repeat the same content across years (Flanders, 1987).
Several studies have used data from international assessments to compare the content of U.S. textbooks with those tests. For instance, an early study found that U.S. textbooks were poorly aligned with the tests used in the Second International Mathematics Study (SIMS; Flanders, 1994). In that study, alignment was defined by mapping each SIMS item to the textbooks. However, perhaps contradicting Freeman et al. (1983), the study also noted that the textbooks tended to agree with one another on topic inclusion. Later studies (Schmidt et al., 1997, 2001) found similar results based on the Trends in International Math and Science Study. Another study compared textbooks to the National Council of Teachers of Mathematics standards, finding that the content of the texts was weakly aligned to the standards both before and after their 2000 revision (Jitendra, Deatline-Buchman, & Sczesniak, 2005). However, after the 2000 revision, the texts seemed to be better aligned to the standards’ instructional design criteria.
Despite the lack of specific evidence about textbook alignment, it seems reasonable to infer that alignment can be no better than moderate, if for no other reason than that instruction remains modestly aligned with standards (Polikoff, 2012b). Until the recent creation of EdReports, there had never been an external body rating the alignment of textbooks to standards in the United States. Thus, teachers and curriculum coordinators have had to either take textbook companies’ claims of alignment at face value or determine alignment themselves (Zeringue et al., 2010). Unfortunately, there is little evidence that teachers and district curriculum coordinators have been able to simply read the standards and create aligned curricula. For instance, Hill (2001) found that district curriculum committee members interpreted words like construct and concept (words that had precise instructional meanings when they were included in standards documents) in ways not intended by their authors. These inaccurate interpretations resulted in written curricula that differed substantially from the authors’ intended approach to reform. Spillane (2004) corroborated these findings, arguing that district personnel often misunderstood the content messages of standards and passed down these misunderstandings to teachers and others through poor-quality and poorly aligned curriculum materials.
Methods
The SEC
To analyze alignment, this paper uses data from the SEC in mathematics. The SEC in mathematics is a content taxonomy that was developed over time with input from mathematics and mathematics education experts. For a complete history of the SEC, see Porter (2002). The current version of the SEC defines content at the intersection of two dimensions—183 topics and five levels of cognitive demand. Thus, there are 915 total “cells” in the SEC mathematics language. The cognitive demand levels are based on a modified Bloom’s taxonomy (similar but not identical to Anderson et al., 2001) and are defined in the Appendix in the online version of the journal as they are on the content analysis forms. The topics are intended to be exhaustive as to the topics covered in K–12 mathematics; they are grouped under 16 coarse-grained topics, also listed in the Appendix. The SEC languages can be used to content analyze standards, assessments, and curriculum materials, and they can also be used in survey studies to measure the content of teachers’ instruction (Martone & Sireci, 2009). Data from the SEC have been used in dozens of published studies over the past decade.
Textbook Selection
This research uses content analysis data on both content standards and textbooks. As a proof of concept for a much larger study, fourth-grade mathematics was chosen for this investigation. Mathematics was chosen because it is one of the two CCSS subjects and because there is a larger prior research base on mathematics textbooks and their effectiveness, as mentioned above. The main version of the CCSS—that posted on the corestandards.org website—was analyzed, rather than any of the state-specific versions of the CCSS that may include up to 15% additional content. The CCSS content analysis data are those that have been used in previous published research on the standards (e.g., Porter, McMaken, Hwang, & Yang, 2011).
Florida was selected as a comparison state because it is one of only two states (along with Indiana) that routinely collect data on textbook adoptions by districts; those textbook adoption data are being used for a separate study estimating the achievement effects of textbooks in Florida. Florida is also a statewide textbook adoption state and a large state that likely influences the content in textbooks nationwide. In 2009, to align with the NGSSS, the state adopted three elementary mathematics textbooks: (a) Houghton Mifflin Harcourt Go Math! Florida, (b) Pearson/Scott Foresman–Addison Wesley enVision Math Florida, and (c) Macmillan/McGraw-Hill Math Connects Florida. Each of these three textbooks was content analyzed, as were the new “Common Core–aligned” versions of the same books: (a) Houghton Mifflin Harcourt Go Math! (Common Core edition), (b) Pearson/Scott Foresman–Addison Wesley enVision Math Common Core, and (c) Macmillan/McGraw-Hill Math Connects and Math Connects to the Common Core. Finally, as an example of a textbook not explicitly aligned to any standards, Saxon Math Intermediate 4 was content analyzed, as this book has featured prominently in recent textbook studies (e.g., Agodini et al., 2010). Go Math!, enVision Math, and Math Connects are all still marketed and sold as Common Core aligned as of January 2015, though the publishers also sell other Common Core–aligned textbooks (for instance, McGraw-Hill sells My Math, which was created in 2013 specifically for Common Core). Table 1 provides additional details about the seven analyzed books.
Table 1. Information on the Seven Textbooks Reviewed
Note. Page counts include total number of pages analyzed. Lesson counts include investigations (Saxon) but not lesson tests or pretests, though these are included in alignment calculations. CCSS = Common Core State Standards; NGSSS = Next-Generation Sunshine State Standards.
Includes Math Connects Grade 4 (published 2009) and Math Connects to the Common Core, Grade 4 (published 2012).
Content Analysis Procedures
Content Analyzing the Common Core Standards
Teams of trained analysts who have conducted SEC analyses for a decade or more conduct all content analyses. The textbook content analyses and those of the Florida NGSSS used three analysts, while those of the CCSS used four analysts. Mathematics content analysts are former K–12 mathematics curriculum leaders, state officials in mathematics education, and university professors of mathematics education. Content analysis is conducted at the finest-grained level of detail possible. For standards, each objective is analyzed. For those sets of objectives where a series of subobjectives fall underneath a broader objective, the standard practice is that only the subobjectives are analyzed. For example, Objective 4.NF.3 reads, “Understand a fraction a/b with a > 1 as a sum of fractions 1/b.” This has four subobjectives, 4.NF.3A through 4.NF.3D, one of which is “Understand addition and subtraction of fractions as joining and separating parts referring to the same whole.” The analysts code the four subobjectives but not the broader objective, because it is assumed that the four subobjectives describe what is meant by the broader objective. In the case of fourth-grade math CCSS, this means three of the 37 listed objectives are not included in the content analysis. Reanalyzing the standards so that all 37 objectives are included does not substantively change the results of this research; however, estimated alignment indices are typically somewhat lower because publishers never indicate spending more than one or two lessons on these broader objectives (much less than their 1/37th representation in the standards). These alternate results based on all 37 objectives are provided in Appendix Tables S1 and S2 in the online version of the journal.
The content analysis results for the standards are, themselves, an illuminating way to understand what is called for by the CCSS. As shown in previous analyses, the CCSS calls for more coverage of higher levels of cognitive demand than previous state standards (Porter et al., 2011). In fact, on average across grades there is almost an even split between lower-level (memorization, perform procedures; 53% of total K–12 content) and higher-level (demonstrate understanding, conjecture, generalize, prove; solve nonroutine problems; 47% of total K–12 content) skills in the standards (Porter et al., 2011). The fourth-grade standards are therefore relatively more procedural than other grades’ standards, since 60% of fourth-grade CCSS content is on memorization or procedures. The large majority of CCSS objectives at fourth grade are coded as having some elements of lower-level and some elements of higher-level cognitive demand. Just five of 37 objectives are coded by all raters as being solely focused on memorization or procedures, and only three are coded by all raters as focusing solely on the top three levels of cognitive demand. In short, the standards at fourth grade call for both procedural fluency and also conceptual understanding.
Content Analyzing Textbooks
For textbooks, text, examples, and exercises are analyzed as well as supplementary material found in each chapter (e.g., chapter tests, checkpoints, introductions). For this study, every chunk of text (these are located in boxes in sections defined by headers; each box or section counts as one chunk), every worked example, and every problem in each lesson was analyzed. Thus, for each textbook lesson—each textbook had 100 or more such lessons—there were an average of three to four chunks of text or examples and 30 to 40 exercises analyzed. Within a particular document, all content-analyzed objectives, items, or text are equally weighted. While it would be possible to develop an alternative weighting scheme, such as weighting by importance, equal weighting is the most logical and defensible approach. Lessons at the beginning or end of each book that indicated they were focused on third- or fifth-grade content were not analyzed.
Content analysts are allowed to place each objective, item, or text chunk into between one and six SEC cells—multiple cells are allowed because many objectives tap multiple topics or levels of cognitive demand. Regardless, the point value for the item is equally allocated across the number of identified cells. Content analysts code all the items independently, after which the team meets in person or virtually to discuss the codes. Final coding decisions are made independently, and the results from each coder are averaged to arrive at the final content analysis. Figures 1 and 2 show sample content codes for a page of the Math Connects Common Core textbook, and two of the CCSS objectives the book indicates are tapped by the lesson in question. The content codes here highlight the strong consistency across raters. The full content codes for all analyzed objectives and textbook content are available on request.
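The equal-allocation and rater-averaging steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SEC software: the function names, the (topic, cognitive demand) tuples used as cell labels, and the toy codes are all hypothetical.

```python
from collections import defaultdict

def rater_proportions(coded_items):
    """Turn one rater's codes into cell proportions that sum to 1.

    coded_items: list of lists; each inner list holds the 1 to 6 cells
    (hypothetical (topic, demand) tuples) assigned to a single item.
    """
    totals = defaultdict(float)
    for cells in coded_items:
        if not 1 <= len(cells) <= 6:
            raise ValueError("each item must be placed in 1 to 6 cells")
        share = 1.0 / len(cells)          # item's point value split equally
        for cell in cells:
            totals[cell] += share
    n_items = len(coded_items)            # every item carries equal weight
    return {cell: value / n_items for cell, value in totals.items()}

def average_raters(per_rater):
    """Average cell proportions across raters (missing cells count as 0)."""
    cells = set().union(*per_rater)
    return {c: sum(r.get(c, 0.0) for r in per_rater) / len(per_rater)
            for c in cells}

# Toy codes invented for illustration: two raters, two items each.
rater_a = [[("fractions", "procedures")],
           [("fractions", "procedures"), ("fractions", "understanding")]]
rater_b = [[("fractions", "procedures")],
           [("fractions", "understanding")]]
final = average_raters([rater_proportions(rater_a),
                        rater_proportions(rater_b)])
```

Because each rater's proportions sum to 1, the averaged proportions also sum to 1, which is the property the alignment calculations rely on.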

Figure 1. Sample textbook page (Math Connects, 2009) with content codes.

Figure 2. Sample Common Core State Standards objectives with content codes.
To analyze the reliability of content analyses, it is possible to use a generalizability theory D study. Prior investigations of the reliability of content analyses of standards and assessments found typical G coefficients around 0.75 (Porter, 2002; Porter, Polikoff, Zeidner, & Smithson, 2008). Applying the same technique to the CCSS content analysis data produces a coefficient of 0.96 for four raters. Applying the technique to textbooks produces generalizability coefficients of 0.99 for three raters on all seven textbooks. There are several potential explanations for the dramatically more reliable results here as compared to previous studies of standards or assessments. First, research shows that the larger the number of items, the greater the alignment tends to be (Polikoff & Fulmer, 2013); this would also be true at the level of the individual rater, and it could certainly explain the high reliability for textbooks that have many thousands of items. Second, it may be that textbook items or the Common Core objectives are easier to rate more consistently than prior standards or assessment items, leading to more reliable results. Certainly, textbook exercises tend to cover fewer SEC cells than the typical test item or objective. Third, it may be that there is something different about the coding process for this study than how it was used previously. For instance, the content analysts are more experienced now than they were when doing prior analyses, so perhaps that experience has resulted in more reliable ratings. Regardless, these results indicate that content analyses of textbooks and the CCSS are almost perfectly reliable.
The result of the content analysis of any document is a matrix of 915 proportions, one for each SEC cell. The proportion in each cell indicates the percentage of the total standards or textbook content on that particular topic/cognitive-demand combination. The cell proportions for a particular document sum to 1. They can be used for a number of descriptive analyses. For instance, summing the proportions across cognitive demand levels within a topic yields a marginal proportion for topic—the percentage of total content on that particular topic. Similarly, summing the proportions across topics within a cognitive-demand level yields a marginal proportion for cognitive demand—the percentage of total content on that level of cognitive demand.
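As a brief sketch of the marginal calculations just described, the following Python function sums cell proportions over cognitive demand to obtain topic marginals and over topics to obtain cognitive-demand marginals. The function name, the dictionary layout, and the toy proportions are assumptions made for illustration only.

```python
from collections import defaultdict

def marginals(cell_props):
    """Return (topic_marginals, demand_marginals).

    cell_props maps hypothetical (topic, demand) tuples to proportions
    that sum to 1 across the whole document.
    """
    by_topic = defaultdict(float)
    by_demand = defaultdict(float)
    for (topic, demand), p in cell_props.items():
        by_topic[topic] += p      # sum across demand levels within a topic
        by_demand[demand] += p    # sum across topics within a demand level
    return dict(by_topic), dict(by_demand)

# Toy document invented for illustration.
doc = {
    ("fractions", "procedures"): 0.5,
    ("fractions", "understanding"): 0.25,
    ("place value", "procedures"): 0.25,
}
topic_m, demand_m = marginals(doc)
```

In this toy document, 75% of the content falls on fractions and 75% on procedures, and each set of marginals sums to 1.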
Alignment
Perhaps the most common use of the matrices of proportions is for calculating one or more indices of alignment. The main alignment index is calculated by comparing any two matrices using the following formula:
$$\text{Alignment} = 1 - \frac{\sum_{i} |x_i - y_i|}{2}$$
Here, $x_i$ is the proportion of content in cell $i$ of document $x$ (e.g., the CCSS for fourth-grade mathematics), and $y_i$ is the proportion of content in cell $i$ of document $y$ (e.g., one of the CCSS-aligned textbooks). Mathematically, this is equivalent to the sum of the cell-by-cell minima, $\sum_{i} \min(x_i, y_i)$, because the proportions in each document sum to 1. The resulting index ranges from 0 to 1 and indicates the proportion of content in exact agreement at the cell level.
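The main index can be illustrated with a short Python sketch. It uses the standard form of the index from Porter (2002), one minus half the summed absolute cell differences, and checks it against the sum of cell-by-cell minima. The function names, cell labels, and toy proportions are hypothetical and are not drawn from the study data.

```python
def alignment_index(x, y):
    """Main index: 1 minus half the summed absolute cell differences.

    x and y map cell labels to proportions; each document sums to 1.
    Cells absent from a document are treated as proportion 0.
    """
    cells = set(x) | set(y)
    return 1.0 - sum(abs(x.get(c, 0.0) - y.get(c, 0.0)) for c in cells) / 2.0

def sum_of_minima(x, y):
    """Equivalent form: the sum of cell-by-cell minima."""
    return sum(min(x.get(c, 0.0), y.get(c, 0.0)) for c in set(x) | set(y))

# Toy proportions invented for illustration.
standards = {"A": 0.5, "B": 0.25, "C": 0.25}
textbook = {"A": 0.75, "B": 0.25}

main = alignment_index(standards, textbook)
minima = sum_of_minima(standards, textbook)
```

The two forms agree because $\min(x_i, y_i) = (x_i + y_i - |x_i - y_i|)/2$ and the proportions in each document sum to 1. In the toy case, the textbook's extra content in cell A earns no credit, and cell C of the standards is missed entirely.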
Because the main alignment index requires exact proportional agreement, it does not count as aligned any “extra” content in a given cell in a textbook, even if that cell is included in the standards. That is, if a particular cell represents 2% of the standards content and 5% of the textbook content, the alignment index would count only 2% of that textbook content as aligned. This choice for the alignment index was by design: previous studies indicated that the form of the alignment index that best predicted student achievement gains was the main index requiring proportional agreement (Gamoran, Porter, Smithson, & White, 1997). Nevertheless, many would consider all 5% of the content in this example to be aligned, insofar as the textbook is covering content from the standards, so an alternate alignment index has been used in some analyses (Polikoff, 2012b; Polikoff et al., 2011):
$$\text{Alignment}_{\text{alt}} = \sum_{i} y_i \cdot \mathbb{1}[x_i > 0]$$
This alternative alignment index counts content in cell $i$ of document $y$ as aligned if that cell is included at all in document $x$ (i.e., $x_i > 0$); the indicator $\mathbb{1}[x_i > 0]$ equals 1 in that case and 0 otherwise. That is, if a particular cell represents 2% of the standards content and 5% of the textbook content, this index would count 5% of that textbook content as aligned. This index can be calculated in both directions: with the standards as the reference and with the textbook as the reference. With the standards as the reference, the interpretation is the proportion of the textbook on content that is included in the standards. With the textbook as the reference, the interpretation is the proportion of standards content that is covered at all in the textbook.
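A minimal Python sketch of the alternative index, computed in both directions, follows. The function name, argument order, cell labels, and toy proportions are assumptions for illustration rather than the code used in the study.

```python
def alt_alignment(reference, target):
    """Share of target content falling in cells the reference includes at all.

    reference and target map cell labels to proportions summing to 1.
    A target cell counts in full whenever the reference proportion is > 0.
    """
    return sum(p for cell, p in target.items() if reference.get(cell, 0.0) > 0)

# Toy proportions invented for illustration.
standards = {"A": 0.5, "B": 0.25, "C": 0.25}
textbook = {"A": 0.625, "B": 0.25, "D": 0.125}

# Standards as reference: proportion of the textbook on standards content.
textbook_on_standards = alt_alignment(standards, textbook)

# Textbook as reference: proportion of standards content covered at all.
standards_covered = alt_alignment(textbook, standards)
```

In this toy case the two directions differ: 87.5% of the textbook is on content included in the standards (cell D is outside them), while 75% of the standards content is covered at all in the textbook (cell C is never addressed).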
Recent methodological work has shown that the maximum possible alignment is, in fact, almost always less than 1 for typical alignment analyses involving standards and assessments, necessitating a correction to be applied (Porter, Polikoff, Barghaus, & Yang, 2013). However, given the length of the textbooks in the study, including many thousands of score points, the maximum possible alignment indices for all analyses of textbooks and standards here can be shown by simulation to be equal to 1. The simulation code supporting this result is available from the author. Therefore, no correction needs to be applied. Other recent methodological work using simulation studies allows for the estimation of alignment distributions due to chance given key attributes of the content analysis, such as the length of the documents, the number of content analysts, their agreement rate, and the rate at which they split their ratings in multiple cells (Fulmer, 2011; Polikoff & Fulmer, 2013). Where appropriate, I compare obtained alignment indices to these simulated chance distributions to explore whether the results obtained show evidence of greater-than-chance alignment.
Analysis
To answer the first question, I present the main and alternative alignment indices for each of the three CCSS-aligned books with the CCSS. I also present the alignment of Saxon Math Intermediate 4 with the CCSS, to explore how the results differ for a textbook that does not claim standards alignment. With the alignment indices calculated, I turn to exploring the content analysis data to uncover the sources of misalignment. I describe each textbook and the standards in terms of marginal proportions for cognitive-demand levels and topics, and I indicate the most common areas of alignment and misalignment.
To answer the second question, I conduct similar analyses comparing the three matched pairs of CCSS- and Florida-aligned textbooks. I first calculate an alignment index for the CCSS Grade 4 with the Florida NGSSS Grade 4 and indicate the major areas of misalignment. Finally, I calculate the alignment index for each of the three pairs of books and describe differences with the standard-to-standard comparison.
Results
Textbook Alignment to the Common Core
How Well Aligned Are Textbooks to the Standards?
The results of the main alignment analysis are presented in Table 2. This table contains all pairwise alignment indices among the nine documents, though the indices relevant to the first research question are bolded. The three alignment indices for the Common Core–aligned textbooks range from .294 (Math Connects) to .396 (Go Math!). For Saxon Math Intermediate 4, which is not intended to be aligned to the CCSS, the index is .282. These figures indicate that 28% to 40% of the textbooks’ content is in perfect proportional agreement with the CCSS.
Table 2. Main Alignment Indices for Pairwise Comparisons of Standards and Textbooks
Note. The results of three alignment analyses are presented in bold.
Another way to interpret these values is to compare them to typical alignment indices from other analyses—for instance, these values are slightly higher than the average mathematics test–standards alignment (Polikoff et al., 2011) and instruction–standards alignment (Polikoff, 2012b) in the NCLB era. A third way to interpret these values is to compare them to simulated distributions of what alignment values would be expected by chance using properties of the coding procedure (number of raters, agreement rate of raters, etc.; Polikoff & Fulmer, 2013). All of the alignment indices presented here far exceed the expected average alignment indices due to chance, which are approximately .16. Indeed, the 99th percentiles of the chance alignment distributions are approximately .17, so the obtained values show strong evidence of greater-than-chance alignment. Thus, a first conclusion about the alignment of CCSS-aligned textbooks with the CCSS is that the alignment indices are modest, but they clearly show some degree of alignment effort by publishers given the indices far exceed that which would be expected by chance.
The alternative alignment index is based on a less stringent definition of alignment, in that it does not require exact proportional agreement. With the standards as the reference, this index describes the proportion of textbook content on SEC cells that are also in the standards. This is useful to know given the common criticism of the U.S. math curriculum’s overwhelming breadth and shallowness (e.g., Schmidt et al., 2001). The values of this index are shown in the second column of the top panel of Table 3. The alignment values range from .647 for Saxon to .796 for Go Math!. Thus, for each of the four textbooks, between 64% and 80% of the content in the textbooks is on SEC cells that are also included in the standards. Based on this alignment definition, a second conclusion about CCSS-aligned textbooks is that the large majority of the content they emphasize is indeed content found in the CCSS. However, given the disparity between the main alignment index and the alternative index, it is clear that the content in the CCSS-aligned textbooks is not evenly allocated across objectives (as is assumed for the standards). Indeed, a glance through the textbooks indicates that they are heavily focused on certain content areas (e.g., multiplication of multidigit numbers) and much less so on others (e.g., understanding additive properties of angles).
Table 3. Sources of Misalignment in Textbook–Standards Alignment Analysis
Note. Three rightmost columns add to 1, except for rounding.
It is possible to flip the alternative alignment index so that the textbooks are the reference, in which case the interpretation is the proportion of standards content that is included at all in the textbooks. In other words, this is a measure of CCSS coverage, giving credit to the books if they cover CCSS content in any amount. These values are shown in the second column of the bottom panel of Table 3, and they range from .757 for Go Math! to .838 for Math Connects. In other words, depending on the textbook, between 75% and 84% of standards content is covered by the textbook (i.e., 16% to 25% is not covered at all). Interestingly, the rank ordering of the books on this measure is nearly the opposite of the rank ordering on the previously mentioned indices. That is, textbooks in which a larger proportion of the content comes from the standards are not necessarily the textbooks that cover the largest proportion of standards content.
What Are the Sources of Misalignment?
Given these alignment values, the analysis next turns to an exploration of sources of misalignment. The first five columns in Table 4 present the marginal proportions for cognitive demand for the two standards documents and the seven textbooks. To offer a more global assessment of the differences, the last column treats the cognitive demand variable as a continuous 1 (memorize) to 5 (solve nonroutine problems) variable. The cognitive-demand analysis indicates that each of the three CCSS-aligned textbooks and Saxon Math Intermediate 4 emphasizes memorization and procedures overwhelmingly (between 87% and 93% of total textbook content). In contrast, content analyses indicate just 60% of the CCSS content is on these two levels. Furthermore, while almost one third of the standards content calls for students to demonstrate understanding, just 7% to 13% of the textbooks’ content is at this level. Finally, the four textbooks have essentially zero coverage of either of the top two levels of cognitive demand, as compared to 11% in the standards. Clearly, there is a good deal of misalignment at the cognitive-demand level in the textbooks—all of them systematically fail to cover the more conceptual skills called for by the standards.
Table 4. Proportional Emphasis on Cognitive-Demand Levels in Textbooks and Standards
Note. First five columns are proportions of total content. Proportions may not add to 1 due to rounding. Index treats cognitive demand as a continuous variable, with memorize equal to 1 and solve nonroutine problems equal to 5.
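The marginal proportions and the continuous cognitive-demand index can be computed along the following lines. This is a sketch under stated assumptions: each SEC cell is represented as a hypothetical (topic, demand level) pair with demand coded 1 (memorize) through 5 (solve nonroutine problems), and the topic names and proportions are invented for illustration rather than taken from the study's data.

```python
from collections import defaultdict

def demand_marginals(cells):
    """Sum cell proportions within each cognitive-demand level (1 through 5)."""
    marginals = defaultdict(float)
    for (_topic, demand), proportion in cells.items():
        marginals[demand] += proportion
    return {level: marginals.get(level, 0.0) for level in range(1, 6)}

def demand_index(cells):
    """Proportion-weighted mean demand level, treating 1-5 as a continuous scale."""
    marginals = demand_marginals(cells)
    total = sum(marginals.values())
    return sum(level * p for level, p in marginals.items()) / total

# Hypothetical document: proportions of content per (topic, demand level) cell.
document = {
    ("multiplication", 1): 0.30,
    ("multiplication", 2): 0.40,
    ("fractions", 2): 0.20,
    ("fractions", 3): 0.10,
}

marginals = demand_marginals(document)
index = demand_index(document)
print(marginals, index)
```

A document concentrated in the two lowest levels, like this toy one, yields an index below 2, whereas a document with more content at the higher levels would score higher on the same scale.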
Another way to describe misalignment is to partition the misaligned content in both the textbooks and the standards. Of the textbook content that does not count as aligned under the alternative alignment index, there are two remaining types of misalignment—misalignment on cognitive demand only and misalignment on topic. Misalignment on cognitive demand only describes content where the topic is included in the CCSS but the textbook covers that topic at a cognitive-demand level that is not in the standards. Misalignment on topic describes content where the textbook covers a topic that is not in the standards at all. The values for these two indices are shown in the rightmost two columns of the top panel of Table 3. The proportion of content that is misaligned on cognitive demand is roughly .150 for all four books. The proportion of content that is misaligned on topic ranges from .049 for Go Math! to .196 for Saxon Math Intermediate 4 (i.e., 5% to 20% of textbook content is on topics not included in the standards at all). Among the misaligned topics, the two most prevalent are adding and subtracting decimals, a fifth-grade CCSS skill, and number sense/patterns, a third-grade CCSS skill, each representing about 2% of the textbooks’ total content, on average. In short, the misaligned textbook content is mostly on cognitive demand for enVision Math Common Core and Go Math! and evenly split between cognitive demand and topic misalignment for Saxon Math Intermediate 4 and Math Connects.
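The three-way partition described above (aligned, misaligned on cognitive demand only, misaligned on topic) can be sketched in Python. The representation of a cell as a (topic, demand level) pair, the function name, and the toy proportions are all assumptions made for illustration.

```python
def partition(reference, target):
    """Split `target` content into (aligned, demand-only misaligned, topic misaligned)."""
    ref_cells = {cell for cell, p in reference.items() if p > 0}
    ref_topics = {topic for topic, _demand in ref_cells}
    aligned = demand_only = topic_missing = 0.0
    for (topic, demand), p in target.items():
        if (topic, demand) in ref_cells:
            aligned += p          # cell appears in the reference at all
        elif topic in ref_topics:
            demand_only += p      # topic is in the reference, but not at this demand level
        else:
            topic_missing += p    # topic is absent from the reference
    return aligned, demand_only, topic_missing

# Hypothetical proportions per (topic, demand level) cell.
standards = {("multiplication", 2): 0.40, ("multiplication", 3): 0.30, ("angles", 3): 0.30}
textbook = {("multiplication", 1): 0.15, ("multiplication", 2): 0.65, ("decimals", 2): 0.20}

textbook_split = partition(standards, textbook)   # standards as the reference
standards_split = partition(textbook, standards)  # textbook as the reference
print(textbook_split, standards_split)
```

The first element of each result is the alternative alignment index, and the three elements sum to 1, matching the structure of the three rightmost columns of Table 3. Swapping the arguments gives the flipped analysis with the textbook as the reference.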
Again, the reference can also be flipped, as shown in the bottom panel of Table 3. This analysis partitions the standards content that is not covered by the textbooks into two types: the proportion of standards content that is misaligned on cognitive demand only (i.e., the textbook covers the topic that is included in the standards but at a different level of cognitive demand) and the proportion that is misaligned on topic (i.e., the standards cover the topic but the topic is not covered at all in the textbook). The proportion of standards content that is misaligned on cognitive demand ranges from .108 for Saxon Math Intermediate 4 to .147 for Go Math!. The proportion of standards content that is misaligned on topic ranges from .050 for Math Connects to .096 for Go Math!. Thus, approximately 1/20th to 1/10th of the standards content is on topics that are not covered at all in these four textbooks.
Comparing Common Core–Aligned Texts to Florida-Aligned Texts
The second research question asks to what extent the CCSS-aligned textbooks represent meaningful changes from prior Florida NGSSS-aligned texts. The results of three alignment analyses are presented in bold in Table 2. The three alignment indices are, for Math Connects, .705; for enVision Math, .674; and for Go Math!, .645. These alignment indices indicate that the three matched pairs of books contain 64% to 71% agreement on content—a much higher value than any of the alignment indices for text–standards or instruction–standards comparisons in this or any other previous SEC analysis. This is despite the fact that the alignment of the CCSS Grade 4 with the Florida Grade 4 NGSSS is only .281, as shown in the top left corner of Table 2.
It is also possible to calculate an alternative alignment index for the three textbook pairs. In this case, the Florida-aligned version is chosen as the reference, and the question is what proportion of the CCSS-aligned textbook is on content that is also included in the Florida text. The values for the three pairs are as follows: for Math Connects, .939; for enVision Math, .964; and for Go Math!, .931. The same calculation for the CCSS as compared to the Florida NGSSS gives a value of .391, indicating just 39% of Common Core Grade 4 content is on SEC cells covered in Florida’s NGSSS Grade 4. Given these values, it is clear that the new, CCSS-aligned textbooks introduce very little content that was not already covered in the previous NGSSS-aligned versions of the texts. This may explain why the alignment of the CCSS-aligned texts with the CCSS is on average no better than the alignment of the Florida-aligned texts with the NGSSS.
Discussion
Implementation of the CCSS is well under way, with aligned assessments to be fully rolled out in 2014–2015. Arguably one of the strongest influences on the successful implementation of the standards will be the quality and alignment of the curriculum materials used by teachers. Publishers have created Common Core–aligned texts and marketed them to schools and districts, which are currently adopting them. The purpose of this analysis was to provide a first look at the degree to which Common Core–aligned texts truly are aligned with the standards, using the most rigorous alignment analysis yet applied to the textbook–standards comparison. Seven fourth-grade mathematics textbooks were analyzed as a proof of concept of what could be a much larger exercise.
The results of the analysis are not clear-cut. On the positive side, the large majority of content in the standards is covered in the three “Common Core–aligned” mathematics texts analyzed here. Also, the large majority of the textbooks’ content is on content found in the standards. These findings are both also true of a fourth book, Saxon Math Intermediate 4, which was not intended to be aligned with the standards. Another positive conclusion is that very few topics covered in the CCSS are not covered at all in the textbooks—less than 10%, on average, across books.
On the other hand, there are a number of problems with textbook alignment to the standards. Overall, in the main alignment index, all three of the Common Core–aligned books have alignment indices with the standards that are .4 or lower. There appear to be three primary sources of misalignment. First, the textbooks systematically overemphasize procedures and memorization and underemphasize more conceptual skills relative to their emphasis in the standards. This is true across all topics. Second, the textbooks each have their own content emphases that do not correspond with the uniform emphasis assumed for the standards. While of course it may be the case that some CCSS objectives indeed merit more coverage than others, this finding raises the question of whether teachers, publishers, or someone else (school boards, administrators, etc.) should decide where to put the relative emphasis across a given set of objectives. And third, the textbooks do not cover some portion—one fourth to one sixth—of the SEC cells covered in the standards. Finally, this analysis highlighted that the new CCSS-aligned textbooks overlapped substantially with the old Florida NGSSS-aligned textbooks, despite the fact that the two sets of standards call for substantially different content.
Taken together, these results indicate that for these three textbooks produced by major publishers and marketed as Common Core aligned, there are meaningful alignment problems. Teachers relying on these materials to help them teach the Common Core for fourth-grade mathematics will practice misaligned instruction in several ways. For instance, they will systematically fail to teach the advanced cognitive-demand levels called for by the standards. They will also overemphasize some standards topics and neglect others. Unless teachers are made aware of these shortcomings of existing curriculum materials and assisted to address them by supplementing their curricula, implementation of the standards may be weaker than is desired.
Of course, one of the major limitations of this work is that it applies only to the limited set of textbooks studied. Unfortunately, the task of content-analyzing entire textbooks, with thousands of exercises apiece, is both time-intensive and expensive. Ongoing research attempts to find a way to drive down the time and financial costs associated with the content analysis by exploring the extent to which the results presented here are sensitive to various approaches to simplifying the content analysis task (e.g., by analyzing only the text or only some random selection of exercises). That work (Polikoff, Zhou, & Campbell, in press) suggests that the results obtained here can be replicated by analyzing only a small portion of the textbook, such as every fifth item. It also illustrates that the text in the lessons is typically somewhat more standards aligned than are the exercises and that cognitive demand is especially low for textbook exercises. Further work to simplify the content analysis task and improve its feasibility is important.
There is ample opportunity for other fruitful lines of research on curriculum alignment. The work can certainly be expanded to other textbooks, grades, and even subject areas; the SEC has languages in English language arts, science, and social studies. Content analyses of textbooks can be supplemented by content analyses of other curriculum materials, such as freely available online curriculum materials that are increasingly widely used by teachers. Given that research shows that textbooks vary in their effectiveness (e.g., Agodini et al., 2010), it would be worthwhile to investigate the extent to which textbook content may be associated with effectiveness. Finally, it is essential to move from the textbook into the classroom to understand how curriculum materials influence teachers’ instructional responses to the standards.
Despite the proliferation of online curriculum materials, textbooks remain an important mediating factor separating the policy of content standards from the outcome of improved or aligned instruction. Unfortunately, the results of this work indicate alignment problems that may impede teachers’ standards implementation. Research has shown for some time that there is a great need for more and better information on curriculum quality and alignment (Chingos & Whitehurst, 2012), and researchers will certainly have to play a substantial role in the provision of these data. This article presents a first attempt to apply a research-based alignment tool to this problem, a tool that could be applied relatively easily at scale, given sufficient resources, to improve the quality of information available to educators choosing texts. It is hoped that this detailed accounting of the shortcomings of existing curriculum materials will lead publishers to improve their products and educators to address the limitations of the materials they are using. Without these efforts, the Common Core’s vision of instructional improvement may fall into the dustbin of failed instructional reforms of the past.