Abstract
The commentaries in this special issue as well as the articles they address contribute to the knowledge base about advanced learners, how they are identified, and how they are best served. In our Editors’ commentary, we have organized our thoughts following the same order in which the commentaries appear in the remainder of this issue. As the other commentary authors have done, in our discussion we endeavor to provide our own perspective as well as a synthesis of the ideas addressed in each commentary.
Card and Giuliano (2014)—Does Gifted Education Work? For Which Students?
A question that is too rarely considered in gifted education research is whether or not “effective” programs or interventions would have been effective for only those students who participated, or if they also might have been effective for a much larger population. In our work with schools, it seems this question usually is avoided lest it open the floodgates to even broader participation in already-underfunded and understaffed gifted programs and related interventions. In this particular study, Card and Giuliano made use of a natural experiment to determine which students benefitted most from a particular gifted intervention: (a) non-disadvantaged students who met traditional 130+ IQ criteria; (b) gifted students classified as English language learner (ELL) or low-income, who met modified 116+ IQ criteria; or (c) high-achieving students who were not identified as gifted using either of these IQ criteria but were placed in the gifted classrooms anyway to fill spaces. In this particular setting, from the standpoint of best practices in gifted education, the intervention was not ideal. All students in the gifted classrooms completed the same curricula, but because the identified students were able to complete it faster, they also engaged in additional enrichment activities. Little other information regarding the nature of the “gifted” program was provided.
The greatest take-away from this study is that both groups of students who were identified using intelligence tests showed gains that ranged from small to zero in practical terms, even where they reached significance, as measured by reading, math, writing, or science achievement test scores. However, those students who were placed in the programs based on past achievement showed increased reading and math achievement test scores of 0.4 to 0.5 standard deviations. This points to two important lessons for gifted education practice. First, the way in which students are identified (the tool or process) and the unmet needs of those students must match as closely as possible to the program to be provided. For example, it is possible that the high IQ students had already mastered all there was to learn from the program and that is why they showed little gain. It is also possible that the students with the most relevant needs, that is, those who were most suited for the program, were those identified on the basis of achievement (as opposed to intelligence). In either case, the first lesson is that any intervention should be based on student need, and therefore the students should be selected using a content-valid measure of the curriculum to be provided. Second, the field of gifted education needs to consider the potential harm that might come to students, to schools, and to the field as a whole when students who might benefit from a program (such as the high achievers who did not also receive high IQ scores) are prevented from participating. In our experience, educators of K-12 gifted students often seem extraordinarily concerned with false positives, that is, the potential for a student to get into the program who might not be successful. But what about false negatives, or the students who are kept out even though they might have benefitted?
In a different setting, the high-achieving students in this study easily could have been told they were not gifted and, therefore, not been allowed to participate in a program that did in fact result in substantial learning gains. We believe this state of affairs is all too common in the schools, especially in states where the criteria for giftedness are set by the state, and funds are only allocated for students formally identified as gifted. We are concerned that gifted programs often are needlessly exclusive in their otherwise well-intentioned use of identification methods that cannot be supported from the standpoint of their ability to predict success in the program. In general, it is likely that more students could benefit from gifted services, beyond those who are typically provided with them (see also the discussion of identification error rates in Peters, Matthews, McBee, & McCoach, 2014). Findings by Card and Giuliano (2014) seem clear: Students who did not quite meet the identification criteria actually benefitted the most from the program provided. This seems a compelling example of inappropriate identification (i.e., eligibility) criteria being used for this particular service. Instead, schools should consider the potential harm (or lack thereof) that would result if their identification procedures were far less exclusive.
To return to the point regarding the match between identification and interventions, we want to point out an interesting finding that was also noted by the commentary authors in this issue. At least in this specific study by Card and Giuliano, general measures of ability or intelligence were not successful in locating students who would benefit from the particular intervention in which they were placed (though exactly why is not clear). In light of the ongoing debate in the field of gifted education about the best ways to identify students, this particular study reinforces that the most important characteristics of a “good” identification process are that it aligns well with the programs and services to be provided, and that the students selected are those who will benefit. This alignment can be judged logically by comparing the content of the assessment against the content of the intervention, and it can also be checked empirically by examining what proportion of the identified students do well in the program. If, as in the present study, the answer is that the identified students performed no better than those who did not participate, then something is wrong; either the identification protocol or the programming provided needs to be revised.
We were encouraged to see Warne as well as Makel and Wai focus so heavily on the purpose of “gifted education” services in the particular district studied by Card and Giuliano. Far too often the questions of “why do we have gifted education?” or “what’s the goal of this particular intervention?” are never considered. After all, if the point of the program is (for example) to foster creative thinking, then criticizing the program for not resulting in increased math or reading achievement scores would be inappropriate. Makel and Wai made this point eloquently by analogy, stating that “if a prescription does not treat a symptom it was never intended to treat, it does not make the prescription useless” (p. 76). Although our commentary authors noted that the “program” provided in this particular district was far from being ideal or best practice in gifted education, we still believe there is much to be learned from this study, and we think it serves as an excellent real-world case study of gifted education in practice. In designing programs, identifying students, or evaluating systems, scholars and K-12 practitioners need to consider the larger goals of the program. After all, not all interventions in K-12 schools are intended solely to increase test scores, nor should they be!
Card and Giuliano (2015)—Can Universal Screening Increase the Representation of Low-Income and Minority Students in Gifted Education?
The same authors, Card and Giuliano, conducted a second study (2015) on the same large district’s gifted education program, this time focusing on the identification system and its effect on identification rates of underrepresented learners. We believe this article holds vital implications for the field of gifted education.
First, the district policies evaluated by Card and Giuliano (2015) showed that underrepresentation can be greatly reduced and the number of low-income, ELL, and racially or ethnically diverse students can be increased through the application of universal screening and the adoption of group-specific norms. This supports a point highlighted by McBee, Peters, and Miller (in press) that universal screening will always result in the maximum identification system sensitivity (or in other words, it will miss the fewest students, in comparison with all other possible approaches). For the K-12 practitioner, this means that whenever possible, the assessments used to identify gifted students should be administered to as many of the students in the potentially eligible population as possible. Of course, doing so will also increase the time and cost needed for identification. However, the greater the extent to which practice departs from this recommendation (such as through the addition of a screening phase), the less sensitive the identification system becomes. For example, this implies that a district should test all second-grade students for participation in gifted programming, rather than only testing those who are first nominated by a teacher. When this proves too costly or time-consuming, a two-phase screening system can be utilized to lower costs, but it must be implemented with care lest such a system drastically inflate false negatives.
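The trade-off described above can be made concrete with a little arithmetic. What follows is a minimal sketch in Python, not an analysis taken from Card and Giuliano (2015) or from McBee, Peters, and Miller (in press); the function names and every rate in it are hypothetical values chosen only for illustration. It treats the sensitivity of each phase as the share of truly qualifying students who pass that phase among those who reach it, so the sensitivity of the whole system is the product of the phase sensitivities.

```python
def system_sensitivity(phase_sensitivities):
    """Overall sensitivity of a sequential identification system.

    Each entry is the proportion of truly qualifying students who pass that
    phase, among those who reached it. A student must pass every phase to be
    identified, so the overall sensitivity is the product of the entries.
    """
    overall = 1.0
    for sensitivity in phase_sensitivities:
        if not 0.0 <= sensitivity <= 1.0:
            raise ValueError("each sensitivity must lie between 0 and 1")
        overall *= sensitivity
    return overall


def students_missed(phase_sensitivities, qualifying_students):
    """Expected number of qualifying students the system fails to identify."""
    miss_rate = 1.0 - system_sensitivity(phase_sensitivities)
    return round(qualifying_students * miss_rate)


# Hypothetical values, for illustration only.
TEST_SENSITIVITY = 0.90        # assumed sensitivity of the confirmation test
NOMINATION_SENSITIVITY = 0.60  # assumed sensitivity of teacher nomination

# Universal screening: every student takes the test directly.
universal = system_sensitivity([TEST_SENSITIVITY])

# Two-phase system: nomination first, then the same test.
two_phase = system_sensitivity([NOMINATION_SENSITIVITY, TEST_SENSITIVITY])

missed_universal = students_missed([TEST_SENSITIVITY], 100)
missed_two_phase = students_missed(
    [NOMINATION_SENSITIVITY, TEST_SENSITIVITY], 100
)

print(f"universal screening sensitivity: {universal:.2f}")
print(f"two-phase sensitivity:           {two_phase:.2f}")
print(f"missed per 100 qualifying students: "
      f"{missed_universal} (universal) vs. {missed_two_phase} (two-phase)")
```

Under these assumed rates, the two-phase system would miss 46 of every 100 qualifying students, compared with 10 under universal testing. Because every factor in the product is at most 1, adding a phase can lower overall sensitivity but can never raise it, which is why a screening phase must be designed with care.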
A second major implication demonstrated by Card and Giuliano (2015) is that a district that consciously chooses to allocate resources for this purpose can make a difference in gifted underrepresentation. Universal screening is a good first step, but it alone will never fully eliminate underrepresentation. In the case of their study district, the addition of universal screening was accompanied by the use of group-specific preferences. Students classified as ELLs and those from low-income families were identified via different cut scores compared with their more advantaged peers (i.e., 1 SD above the mean as opposed to 2 SDs). This provided a major avenue through which underrepresentation was mitigated. Such a change not only carries clear consequences with regard to the identified population’s size and diversity, but it also demonstrably decreases underrepresentation rates. Lakin in her commentary noted that the students who benefitted most from the modified identification system were primarily African American, Hispanic, low-income, and ELL, and this finding is consistent with other research on Florida’s alternative gifted identification pathway (McBee, Shaunessy, & Matthews, 2012). This finding carries a clear message for scholars and practitioners of gifted education: The field as a whole tends to spend too much time debating the best tools or processes to identify underrepresented learners, but there is a much simpler first step to take—the implementation of both universal screening and group-specific norms or preferences. It is not always about using a different test or a different process to locate underrepresented learners. Sometimes schools simply need to use existing assessments in a different way (Matthews & Peters, in press) to best locate talent among diverse populations.
Reading the Card and Giuliano (2015) article also makes us think about the question, by what criteria should identification systems be judged? To us, the primary criterion for the quality of an identification system seems to be the degree to which it correctly identifies every student who needs and will benefit from the program to be provided. This is similar to psychometric sensitivity. Despite its fairly straightforward nature, we wonder how often this goal is considered by schools. Do schools actually take proactive efforts to make sure they are locating all students who might benefit? Is the intervention’s goal to challenge everyone? To assure that no student is placed in an intervention unless the district can be 100% confident that the student will benefit? In an era still heavily influenced by No Child Left Behind, we worry that higher priority goals in practice may be to keep the program size as low as possible so gifted budgets can be diverted to fund other causes, or to upset as few non-involved people as possible by keeping under wraps the visibility of gifted education programming.
To combine these points, if the goal is correctly locating all students who might benefit from or have a need for a particular program, then two-stage identification systems (often made up of a screener or nomination phase followed by a confirmation phase) should be far less common than they presently are, or at the very least, these efforts should be much more carefully designed. As Lakin has observed, teacher nominations are still the most common starting point in gifted identification in the United States. This is not just a problem for sensitivity in general; it presents an even more severe issue for underrepresented learners, a group that the Card and Giuliano (2015) study showed to be the students most harmed by the use of two-stage identification systems. The overall message seems clear: Whenever possible, districts should avoid two-phase systems and instead should strive for universal assessment. This position does have cost implications, but districts at the very least should consider the balance between increased cost and decreased identification sensitivity if universal assessment is not used. Every educator should be concerned about how many students are being missed because of the application of poorly designed two-stage assessment systems.
Bui, Craig, and Imberman (2011)—Is Gifted Education a Bright Idea? Assessing the Impact of Gifted and Talented Programs on Achievement
This study’s authors sought to evaluate two practices and programs: (a) participation in a middle school gifted program and (b) participation in gifted magnet programming on the basis of random lotteries among eligible applicants. Moon, in her commentary, makes the excellent point that program evaluations of any kind (experimental or not) are far too rare in gifted education. We would go even farther to argue that the lack of rigorous program evaluation data showing any level of effectiveness is actively hindering support for advanced educational programming. Despite the historical lack of data, we are encouraged by recent studies such as Bui et al. and Davis et al. (both commented on in this issue) and by Adelson, McCoach, and Gavin (2012), all of which have sought to examine the actual effects of gifted programming.
The Bui et al. study is important for several reasons. First, as we stated earlier with regard to Card and Giuliano, districts and studies only rarely consider whether or not students who have just missed the identification cutoff would have benefitted from the program. In implementing a Regression Discontinuity Design (RDD), Bui and colleagues looked at this very question when evaluating participation in a middle school gifted program. Unfortunately, after 1.5 years of participation, those who participated showed no differences in achievement (aside from small effects in science) compared with students who did not participate. The same outcome was found with the gifted magnet school programs: Aside from some increased achievement in science, no significant increases in achievement were observed among participating students.
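For readers less familiar with the logic of an RDD, the following minimal Python sketch may help. It is not the specification used by Bui and colleagues; it illustrates a simple "sharp" design on simulated data, and the cutoff, bandwidth, effect size, noise level, and function names are all hypothetical. The idea is to fit a separate straight line to students just below and just above the eligibility cutoff and to read the program effect as the gap between the two lines at the cutoff itself.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rdd_estimate(scores, outcomes, cutoff, bandwidth):
    """Estimate the jump in outcomes at the cutoff (sharp design).

    Scores are centered on the cutoff so that each fitted intercept is the
    predicted outcome exactly at the cutoff. Only students within
    `bandwidth` points of the cutoff are used.
    """
    below = [(s - cutoff, y) for s, y in zip(scores, outcomes)
             if -bandwidth <= s - cutoff < 0]
    above = [(s - cutoff, y) for s, y in zip(scores, outcomes)
             if 0 <= s - cutoff <= bandwidth]
    _, intercept_below = fit_line(*zip(*below))
    _, intercept_above = fit_line(*zip(*above))
    return intercept_above - intercept_below


# Simulated data; every number here is a hypothetical illustration.
rng = random.Random(0)
CUTOFF = 130.0      # assumed eligibility cutoff on the identification score
TRUE_EFFECT = 0.30  # assumed program effect, in outcome units

scores = [rng.uniform(100.0, 160.0) for _ in range(5000)]
outcomes = [
    0.02 * s                                  # outcomes rise with the score
    + (TRUE_EFFECT if s >= CUTOFF else 0.0)   # participation adds a jump
    + rng.gauss(0.0, 0.3)                     # unexplained variation
    for s in scores
]

estimate = rdd_estimate(scores, outcomes, CUTOFF, bandwidth=10.0)
print(f"estimated jump at the cutoff: {estimate:.2f}")
```

In this framing, a null result of the kind reported by Bui et al. would correspond to an estimated jump that cannot be distinguished from zero. The estimate speaks only to students near the cutoff, which is precisely the group of "just missed" students discussed above.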
As Moon noted, some of the null finding may be ascribed to the wrong students being placed in the two interventions (and this possibility also may help explain results in the Card and Giuliano [2014] study). This is because the matrices used for identification eligibility gave more points for reading and math than for similarly high scores in science or social studies. In addition, Moon brought up a key question that is also at issue in the second Card and Giuliano article: “To what degree does the awarding of obstacle points (e.g., socioeconomic status, English language proficiency) create a situation where there may be a misalignment between students’ needs and the services provided?” (p. 95). In both of these studies (Bui et al., 2011, and Card & Giuliano, 2015), students from certain populations were given admissions preference in one form or another in order to increase their participation rates in gifted programs. Although the field of gifted education has long grappled with the issue of underrepresentation and how it can be addressed, to date there have been few empirical examples of schools actually implementing proposed solutions. In both the Card and Giuliano (2015) and Bui et al. studies, the districts made proactive efforts to address underrepresentation. As a result, both districts achieved some level of success. What remain unanswered are questions of what political barriers these districts faced in the process of implementing these policies, and how the programs dealt with the broader range of educational needs now present in the newly identified populations because of the changed identification preferences. These questions offer productive directions for future study. Despite these limitations, we are happy to see districts trying to address underrepresentation, even if doing so may make uncomfortable those people who previously have been satisfied with the status quo.
Collins and Gan (2013)—Does Sorting Students Improve Scores? An Analysis of Class Composition
Instructional grouping and its less popular older cousin (tracking) have been debated and researched for decades. Often, as was the case with the analysis by Collins and Gan, the concern is with whether or not grouping high-performing or gifted students “harms” low-achieving or non-gifted-identified students. Right away we find this focus should be unpacked. Are special education inclusion interventions ever evaluated for what educational, social, or affective influence they have on students who are already grade-level proficient? Is the allocation of resources to programs such as Head Start or Reading Recovery ever analyzed with regard to how it influences students who are not in those programs? We think not. And, we find it troubling that the burden is assumed to fall only on advanced learners to prove that the services they need do not negatively influence other students. This is a worrisome double standard. If the U.S. educational system is founded on a belief that all students should learn and benefit from their time in school, then this goal cannot be selectively applied. Either everyone deserves to learn, or we need to concede that the system’s purpose is not to foster growth among all learners, but rather simply to move most students to some minimal level of proficiency.
Overall, Collins and Gan found that “sorting” students on the basis of their prior scores had a significant effect on later math achievement. Both high- and low-performing students benefitted more from homogeneous classes in comparison with classes grouped solely by students’ age. As Gentry points out in her commentary, however, the researchers had no idea what was actually done in the clustered classrooms. All they were able to assess was the relative diversity of the students in each classroom and the effect that this diversity likely had on these students’ learning outcomes. This limitation carries two implications. First, it suggests that grouping methods benefit advanced learners while not harming lower achieving students, regardless of the specific details of how this grouping was accomplished. Not only did instructional grouping not harm the average or low-performing students, but all students appeared to benefit in math and, to a lesser degree, in reading.
A second implication was not addressed directly by Collins and Gan, but we find it extremely relevant nonetheless, and that is that their findings support the idea that teachers are better able to reach a larger number of their students when they have a narrower range of learning needs to address. Although this would seem a common-sense observation, there has been relatively little empirical study to support it. Consider that the Response to Intervention model (RtI) expects that roughly 80% of students can be challenged in a Tier I or general classroom (Peters et al., 2014). If this is the expectation, then we must ensure that teachers are not being saddled with unreasonable expectations in terms of their ability to meet widely varying student readiness levels. One straightforward way this expectation can be made more reasonable, thus facilitating the success of initiatives such as RtI, is to use instructional grouping at the schoolwide or grade level to decrease the range of student readiness within each classroom that any given teacher is responsible for addressing in her or his instruction.
Too often the debate about instructional grouping is framed as being about equity versus excellence. But why must these be seen as mutually exclusive? The Collins and Gan study supports the idea that a district can achieve both goals. Cluster grouping, for example, offers a well-established method through which a larger proportion of students can be challenged during a greater portion of the instructional day without a need for undue or excessive restructuring of the existing school schedule (e.g., Gentry & Mann, 2008). It also appears that such methods can help challenge a wider range of students such that fewer students need more costly or invasive individualized interventions.
We agree wholeheartedly with Gentry in her call in this issue for a better understanding of what makes for successful versus unsuccessful grouping. What needs to be changed in the grouped classrooms for everyone to learn? How diverse is too diverse when it comes to students’ instructional readiness in a given classroom? And, what variables mitigate the effects of grouping on learning outcomes? There have been many studies on grouping with just as many findings. It is time to sort the signal from the noise to better understand what the key factors are that make clustering a positive intervention, rather than a potentially inequitable one, for students at all ability levels.
Davis, Engberg, Epple, Sieg, and Zimmer (2010)—Evaluating the Gifted Program of an Urban School District Using a Modified Regression Discontinuity Design
The particularly interesting implications of the Davis et al. study are not necessarily what the authors found with regard to retention (overall they concluded that being in a gifted program made students more likely to stay in the district), though that too was interesting, but rather the other practical implications of their findings. For example, though it was not a specific research question, the authors found that test administrators had manipulated IQ scores just below the cutoffs in order to “boost” many students into eligibility. This is not a new finding (Matthews, Peters, & Housand, 2012, reported identifying this as “permissive assignment” in 2007), but it is problematic for two reasons. First, it means that students are being admitted to the program who, by definition, had less of a need for it because they did not actually meet the entrance criteria. Second, and in our minds more importantly, low-income students are much less likely to be able to afford to seek out a private psychologist to get the kind of second opinion that was often necessary for reconsideration when a student had just missed the cut score on his or her first attempt. In a way, low-income students had only one chance to meet the criteria whereas higher income students had as many chances as they could afford. This type of differential access to testing resources will contribute to underrepresentation as long as it is allowed in identification policies.
The Davis et al. article was also unique in that it treated retention in the district as an outcome variable. Although this is an interesting approach for reasons we will address, we disagree with the authors’ premise that one of the reasons urban districts have gifted programs is to “retain students who might otherwise leave for suburban or private schools” (p. 33). Although there is some literature to support the idea that higher income families are the most likely to open enroll out of their home district (Welsch, Statz, & Skidmore, 2010), we know of no cases where gifted programs have been put in place explicitly to retain high-income families (though of course this may be a secondary, unpublicized rationale in some cases). Despite this minor disagreement over purpose, we do agree with the authors that schools should consider what influence gifted education programs might have on overall family satisfaction. With the increasing prevalence of charter schools and open enrollment, schools could see themselves losing substantial sums of money if families transfer out of the district because of a lack of available opportunities within the district for advanced learners. We also know of districts from around the United States that garner substantial amounts of money through open enrollment transfer into the district from surrounding areas, due directly to their gifted programs. In these cases, gifted education primarily seems to be viewed as a money maker for the district.
Overall, Davis et al. did find that being identified as gifted had a significant effect on retention, with higher effects among non-subsidized lunch students. In other words, families who did not receive subsidized meals but did have a child identified as gifted were more likely to stay within the district. This effect was non-significant for those families receiving subsidized meals, likely because low-income families are less likely to open enroll or otherwise leave the district.
Overall, the greatest take-away from this article is its own philosophical question: Should urban schools go out of their way to offer gifted education programs with the explicit goal of retaining families that might otherwise leave? From a purely financial perspective, this could make logical sense. For example, in Wisconsin a school can lose several thousand dollars per year for every student who open enrolls out of a district. The same holds true for students in North Carolina who transfer to a public charter school, pulling much of their funding with them from their local school district. If, instead, the district can keep a student by providing a service for a lesser amount of money, then that investment has paid off. The downside is that any funds expended to keep a student in the district are now no longer available for other programs and services that might go to (for example) remediation, and this presents its own political and philosophical problems. In the end, we believe that programs should be provided that ensure all students are challenged and, in fact, urban districts are likely to have far more high-ability students than they realize. The focus on finances should, ideally, be secondary to the focus on student growth and learning.
Summary
After reading several times through the original National Bureau of Economic Research (NBER) articles and the invited commentaries in this special issue, one thing we are left to ponder is whether or not the benefits realized by these gifted programs (that is, increased student retention associated with being identified as gifted, and/or increased learning due to clustering) could have been realized even without the concept of “giftedness.” For example, in the Collins and Gan study, students at all levels showed academic benefits from instructional grouping by prior achievement. Similarly, the 2014 Card and Giuliano results pointed to high-achieving students benefitting more than the students who were identified under more traditional (IQ-based) giftedness criteria. What stands out across these studies is that all the students who were previously high performing could have been served effectively without ever considering whether they were or were not identified as gifted. Makel and Wai in particular observed that gifted education research (and specifically the work by Card and Giuliano) has tended to be reported negatively in the media. It is likely that at least part of this is due to the term “gifted” itself and to the associated connotations that it carries (cf. Matthews, Ritchotte, & Jolly, 2014). Could all or some of the benefits described in these articles be achieved without the complicated process that is gifted identification? Are there other, more cost-effective ways to deliver academically advanced instruction to those students who would be most likely to benefit from it? If we could abandon the term gifted, would some of the negative side effects (e.g., challenges of elitism and concerns over differential racial representation) be mitigated? We believe these questions are worthy of further study.
A few caveats or general limitations that apply to the NBER articles as a body of research also are worth pointing out. First, all the articles related to gifted education (aside from the Collins and Gan article on sorting) involved a heavy reliance on IQ for student identification. Most authors also mentioned that alternative pathways were available, but nevertheless we think it important to point out that individual IQ testing is no longer as common in gifted identification as it once was (National Association for Gifted Children, 2015), and that the adoption of multiple-criteria identification systems is becoming more widespread in schools.
In the studies by Davis et al., McBee (2006), and by Card and Giuliano (2015), minority, ELL, and low-income students were undernominated by their teachers for gifted programs. This is not a surprise to most people familiar with gifted education, but it does call into question whether or not teacher nominations should continue to be used as the initial catalyst for identification. When a student must be nominated before any other identification procedures are put into action, that nomination stage can only harm the overall sensitivity of the identification system as a whole (McBee et al., in press). Potential costs aside, if there must be a formal identification of students with the label gifted, then universal screening is far more effective than other approaches at maximizing sensitivity. The studies discussed in this issue as well as other work of which we have been a part seem to make a clear case for the use of teacher ratings only as part of a multi-criteria system, and never as the single or even primary pathway to additional consideration.
Multiple commentary authors pointed out that the purpose of particular programs or “gifted education” more broadly was rarely discussed in these articles. We find this quite troubling. Often there is neither a specific goal or purpose offered in support of gifted education services nor even a consistent choice of educational interventions. Why then do we have such programs? To retain wealthy students? To challenge those who are underchallenged? Because gifted students are inherently at risk? As we stated in our introduction, policy makers often lament the lack of research on the effectiveness of educational interventions, but at the same time interventions must be evaluated based on whether or not they achieved their purpose or stated goal. One promising trend in the field is that generic “gifted” programs are falling out of favor and are being replaced by domain-specific talent development programming (Olszewski-Kubilius & Thomson, 2015). To be defensible, schools need to be ready to articulate the purpose of every advanced academic intervention they provide, so that these interventions can be appropriately evaluated and so students who participate in them can be assessed for progress within specific domains of learning. Failing to articulate a common, logical purpose for gifted education has hamstrung the field for decades, and this must be addressed if we are to make progress.
Overall we found these articles as well as their respective commentaries fascinating. In fact, although of course our personal views are biased in specific ways, we cannot help thinking how interesting it would be to place this special issue before students in a gifted educational doctoral seminar and then observe the ensuing discussions. These authors’ findings are neither uniformly positive about gifted education, nor unanimously focused on gifted education as we believe it should be practiced, but despite these issues we can agree that their work carries lessons for practice that are both meaningful and substantial. Sharing these with a wider audience was our goal in crafting this special issue, and we hope that by doing so we will foster further dialogue between educators and researchers in gifted education and our colleagues in economics and other social science fields of study.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
