Abstract
The authors describe a survey of graduates of a high school program for the academically gifted and argue that such data, both qualitative and quantitative, can add an important element to the evaluation of a program (in this case, a regional Governor’s School), especially when the survey includes a Comparison group of high-achieving graduates of the same systems’ regular programs. Confirming and amplifying the findings of state evaluators, the survey results show the value of the program’s elements, especially the research strand. These elements draw from recommended practices, applying them to a hybrid pullout/self-contained program that serves classes who are together for 4 years in four core subjects. The survey thus has implications for program design and evaluation and suggests directions for further research.
“Being in classes with the same people every year allowed me to be more outgoing and made us strive to push ourselves.”
The purpose of this article is to describe a survey of graduates carried out by The Commonwealth Governor’s School (CGS), an academic-year program in Virginia for academically gifted students in Grades 9 through 12 in four core subjects—English, math, science, and social studies—operated by a consortium of counties. The purpose of the survey was to help evaluate, as a supplement to other evaluation efforts, the program’s benefits as seen by its graduates. The survey, conducted in the 2011-2012 academic year, had been recommended by two state evaluation committees as a way to provide evidence of the long-term benefits of the program. The survey employed a mixed-methods design—using both closed- and open-ended question sets and comparing those sets as recommended for ethnographic research (Brown, 2013; Creswell, 2009, 2014; Lincoln & Guba, 1985; Patton, 1980, 2002; VanTassel-Baska, 2004). The following general research questions were approved for inquiry by the CGS Governing Board:
We will describe the survey’s planning, administration, and results and discuss its value as a tool for assessing program benefits, particularly through the use of a Comparison group drawn from high-performing graduates of the regular (i.e., non-CGS) programs at the base schools from which CGS students were drawn. We believe this comparative element provides important evidence in assessing the program’s effects. As the school’s program designers have considered widely recommended national standards for gifted learners (National Association for Gifted Children, 2010) and curricular models such as the Parallel Curriculum (Tomlinson et al., 2002) and the Integrated Curriculum (VanTassel-Baska & Brown, 2007) in designing and implementing its curriculum, the survey results should be of interest not only to program stakeholders but also to anyone concerned about program evaluation, development, and strategic planning. For that reason, we will describe the school’s design, compare it with other recent models, and review practices in program evaluation applicable to the present survey.
Overview of the CGS Design
During the planning and administration of the graduate survey, the CGS, one of 18 academic-year Governor’s Schools in Virginia, was operated by a three-county consortium in the Rappahannock area of north-central Virginia and housed at five sites among the counties’ 11 high schools. The program typically serves between 500 and 600 students, most of whom apply as eighth graders and are admitted according to criteria agreed upon by the consortium, but who also may apply in any subsequent year. Students spend half the day at CGS (most are bussed to their CGS site), where they are with the same classmates in the core subjects—English, math, science, and social studies—for Grades 9 to 12, often with the same teachers for more than a year. Students spend the other half-day at their base schools, where they take electives and participate in extracurricular activities. Distance-learning technology and shared field activities are used to connect the separate sites.
CGS is thus a bit of a hybrid: self-contained in that the same cohort of students (roughly 30 per grade level at each of the sites) stays together through the 4 years of high school in all four of the core strands—rather than specializing in, say, humanities or math/science. Nevertheless, there are characteristics of a pullout program: Students still spend a half-day at their base schools. Thus, they are still truly part of their base schools while building closeness with CGS classmates and teachers over a period of years.
Beyond the “hybrid” qualities of its design, the relative success of the CGS model would seem to be worth assessing, in that it seeks to implement recommendations called for by widely recognized national standards (Johnsen, 2012; National Association for Gifted Children [NAGC], 2010). Indeed, the NAGC standards are explicitly referenced on Virginia’s Department of Education website (The Commonwealth of Virginia Department of Education, 2012, “Maintaining High Standards,” para. 2).
CGS seeks to meet these standards in several ways (CGS, 2012). One is acceleration, a widely recognized feature of effective programs for gifted learners (Colangelo & Davis, 2003; VanTassel-Baska & Brown, 2007). Students have accelerated access to college-level (Advanced Placement [AP] or dual-enrollment) coursework in every grade and typically at earlier grade levels than in the regular school programs. CGS students often graduate with enough college credit to enter as sophomores or even juniors.
A second way CGS strives to meet the needs of gifted learners is by providing expert teachers and facilitating mentorship (Johnsen, 2012; NAGC, 2010; Tomlinson et al., 2002). CGS has emphasized recruiting and retaining highly qualified staff and extending their contract years to allow team planning and professional development and to provide summer enrichment for CGS students. CGS teachers all have desks in the contiguous CGS “pod” of rooms and thus are easily accessible during and after the school day. An important aspect of the program is the close association of the same group of students (30 at each site) for 4 years with the same group of teachers, who often teach the same students for more than 2 years—producing a community-of-learners effect.
A third way the program meets students’ needs is through differentiation—not just between those identified as gifted and those not so identified, but among the students in the program. Differentiation is at the heart of gifted education (Conklin & Frei, 2007; Roberts, 2012; Robinson, Shore, & Enersen, 2007; Rogers, 2007; VanTassel-Baska, 2006). In the CGS curriculum, an important vehicle for differentiation and for academic challenge and mentorship is the research strand. The “Culminating Activity” is a yearlong research project required of every student each year. The research project relates to a topic chosen by the student with faculty guidance and culminates in both a written report and an oral presentation to faculty and peers.
As the CGS program has sought to apply such recognized practices, its effectiveness is worth examining—especially as perceived by its graduates in comparison with those of regular programs.
Review of Selected Literature
The following sections consider similarities of CGS to other special schools for gifted learners and key issues in program evaluation applicable to the CGS survey.
Special Schools and The Commonwealth Governor’s School
In addition to issues of curricular scope and sequence, an important feature of a gifted program’s design is whether it operates during the regular academic school year. A related question is whether it operates as a separate school. According to data collected by the National Association for Gifted Children (2013), 10 states listed “magnet schools” or “Governor’s Schools” among their most common means of addressing the needs of gifted learners (p. 195). Such schools may be operated during a vacation period or during the regular school year. (The 2013 NAGC report does not give separate numbers for each type.) The regular-term schools may be fully separate in character, even residential, or they may pull students out of the regular programs for certain parts of the day or for certain curricular strands. The importance of such design features is highlighted by a recent special issue of Gifted Child Today devoted to “special schools,” which reviewed several successful types, none exactly like CGS. Roberts (2013b), guest editor for the issue, noted in her introduction that successful programs can take a variety of forms and that the examples treated in the issue represented three of them: “early entrance to college, differentiation, and mentoring” (p. 157). CGS reflects the last two to a significant extent and the first to a lesser degree.
The CGS does not offer early college admission as does the University of Washington Transition School (UWTS) described by Halvorsen, Hertzog, and Childers (2013)—which aims to prepare students selected as eighth graders to enroll as University of Washington freshmen after just 1 year (the ninth grade). Although Robinson et al. (2007) found scant research on the effectiveness of special schools outside those connected with universities, Halvorsen et al. (2013) found “no statistically significant differences in social well-being, academic achievement, or attitudes toward school” (p. 192) between the early-entering graduates of the UWTS, admitted after 1 year in high school, and high-achieving graduates (National Merit finalists) admitted as high school seniors. The result implies that the early-admission track provided no social benefit beyond that provided by programs like CGS, where students are not admitted early to college but often do enter with enough advance credit to put them years ahead.
CGS is also not a school devoted to particular branches of curriculum, as is the Gatton Academy, which emphasizes math and science (Roberts, 2013a). CGS is more like the Gary, Indiana, program for gifted learners (Gilbert, 2009), which concentrated identified students in magnet schools separate from the regular program—a separation CGS makes for only part of the day. Nor is CGS a residential school, as Gatton is. However, the effect of a select and cohesive community of learners in a “special school” for 4 years in four core subjects, even for part of the day, makes CGS similar at least in some ways to schools such as the Gatton and the UWTS (Halvorsen et al., 2013).
In sum, it is reasonable to conclude that CGS reflects principles recommended for special programs for gifted learners (as embodied in the NAGC Standards) and is similar in at least some ways to model academic-year schools. An evaluation of its relative effectiveness, in comparison with that of the regular school programs, should thus be of interest to those concerned with program models and their evaluation.
Gifted Program Evaluation and the CGS Alumni Survey
This section moves from a specific description of how the CGS was evaluated by Virginia’s Department of Education, to more general considerations of program evaluation, and finally to specific issues of data collection and analysis.
Virginia’s Evaluation Procedure and the Present Survey
Virginia’s evaluations of its Governor’s Schools reflect widely recommended case-study and ethnographic evaluation procedures. The state’s practices also reflect a concern expressed by Borland (2003) that evaluation must be ongoing and systematic to be effective in maintaining the quality of the program. Thus, Virginia requires evaluation of its academic-year Governor’s Schools at 6-year intervals, plus an ongoing program of self-evaluation. At the time of the present survey, CGS had been formally evaluated twice, in 2003 and 2009, by visiting committees from the Virginia Department of Education. Two strands of evaluation are required, one internal and one external to the program. Preparatory to both the 2003 and 2009 visits, the CGS program itself conducted a self-study process involving students, teachers, parents, and administrators. For the external component, as required by state policy, the visiting evaluation teams were composed of school officials and teachers from outside the districts served by the school. They conducted classroom observations, surveys, interviews, and focus groups, addressing all the recommended stakeholder groups. The 2009 report summarized its endorsement in its “Conclusion”: “The program shows remarkable understanding of the needs of gifted learners, and The Commonwealth Governor’s School has established a firm foundation for the continued growth and support from the area divisions” (Commonwealth of Virginia Department of Education, 2009, p. 10). “Division” is Virginia’s term for a school district.
Both the 2003 and 2009 evaluation teams urged CGS to survey its graduates for evidence of the program’s benefits: “It is recommended that the CGS survey alumni to ascertain the effectiveness of the CGS program in their post-secondary experiences” (Commonwealth of Virginia Department of Education, 2009, p. 11). The graduate survey described here, in fulfilling that suggestion, built on other types of data collected in the on-site evaluations. The survey, in effect, added the program’s graduates as a stakeholder category. Through the survey’s Comparison group, the survey also responded to concerns expressed by Borland (2003) and Boyd (2002) that students and others outside the program are in some ways stakeholders.
Expanding the Concept of “Stakeholder”
Evaluation is an important consideration for any education program but especially so for those that concentrate attention and resources on special populations. Comparing outcomes of a gifted program to those of its more traditional counterpart is important. Such concerns are rather recent in evaluation literature. Borland (2003), for example, has explicitly said his thinking about program evaluation has evolved. In his chapter on the topic for the Third Edition of the Handbook of Gifted Education (Colangelo & Davis, 2003), Borland noted that when he reread his equivalent chapter in the previous edition, he still “liked what I had written, as far as it went, but that was the problem: It did not go far enough” (p. 293). Borland now realized that evaluators needed to look not just at the program itself but also at its function and influence within a larger educational and community setting. Because of such influences, intended or unintended, Borland noted that the concept of “stakeholder” might need to be broadened. Programs for the gifted, after all, are not without cost, and resources for them might mean less for others—thus making program design an ethical matter. Borland is not alone. Adams, Mursky, and Keilty (2012) have also observed the frequent effect of such limiting factors as resource allocation.
Borland did not say so, but perhaps he would agree that such influences could also be beneficent. Infrastructure and activities required or generated by the program may be judged too expensive, but, if funded, they may benefit the larger population (e.g., by providing sophisticated videoconferencing and lab equipment, visiting experts, special events, teacher training, online resources, student projects). Whatever the cost-benefit concerns, Borland’s argument makes the evaluation of gifted programs a matter not just of academic interest but also of ethical obligation.
Recommended Data Sources and Collection Methods: A Need for Comparison
Educational researchers in general, especially those proposing new methods and models, have been urged to support their recommendations with rigorous experimental methods or, where randomized assignment is not possible or is too expensive, to use carefully designed descriptive, case-study methods (Punch, 2009; Slavin, 2002). As evaluating a program for a special population, such as gifted learners, involves understanding the perceptions of those with a stake in its success, a naturalistic/case-study approach has generally seemed advisable. Thus, the literature on program evaluation recommends gathering a variety of data on the perceptions of a variety of stakeholders. The latter are typically said to include students, faculty, parents, administrators, and members of the community (Brown, 2013; Tomlinson & Callahan, 1994; VanTassel-Baska, 2008). VanTassel-Baska (2004), in describing William and Mary’s model of program evaluation, noted that the model envisions a wide variety of data sources and types, both qualitative and quantitative, and that sound approaches to evaluation can vary. She also noted that a case-study or naturalistic approach may be particularly appropriate for evaluating gifted programs because it considers how programs naturally develop.
Evaluators responding to Borland’s challenge to broaden the definition of stakeholder might well consider graduates in that light. Virginia’s representatives evaluating the CGS thought so. Brown, the current director of the Hunter College (CUNY) Center for Gifted Studies and Education, has noted in correspondence with one of us that former students would indeed be stakeholders and a potential source of good data, as one element of evaluation (E. F. Brown, personal communication, 2013). Clearly, it matters how the graduates of programs designed for gifted learners evaluate their preparation in light of experiences beyond high school, the contexts for which the program aims to prepare them. And it should matter how their perceptions compare with those of similarly qualified graduates of the regular college-preparatory programs.
In view of such considerations, it is not surprising that there have been calls for studies comparing effects of a program to its alternative(s). Boyd (2002) has noted that evidence of success quite apparent to those in a program may not be as obvious to the larger community. As a result, says Boyd, “Evaluations that help gather the most credible evidence that a program is making a difference in the lives of its participants include a comparison group” (p. 3). And yet, according to Boyd and to others (Gilbert, 2009), such studies have been rare, and comparative studies seem even rarer. In a broadly conceived comparative study, Brody and Mills (2005) reviewed data gathered through the Center for Talented Youth, finding considerable benefit in higher education for graduates who had taken challenging coursework in high school compared with groups with less challenging work. These groups, however, were from different schools and systems. The CGS survey aimed for a closer comparison using comparable populations of graduates from the same systems.
In summary, there is a need for more research on the outcomes of special gifted-education programs operated by the public school systems, as compared with those of the regular school tracks. The CGS graduate survey thus responded to a need. By including a Comparison group from the regular schools it addressed both Borland’s and Boyd’s concerns.
Recommended Data Types
In addition to seeking data from a variety of persons’ perspectives, evaluators should also seek a variety of data types, both qualitative and quantitative. It is our experience that gifted program evaluations tend to follow this model and use methods researchers would recognize as naturalistic, the stakeholder survey being particularly common (Feng, 2004a). That is, these studies do not try to carry out experiments or otherwise manipulate the environment to isolate variables and test hypotheses. Thus, the Comparison group in the CGS graduate survey was not selected ahead of time and assigned to a different treatment to test its effect. We instead selected the Comparison group after graduation, but before the survey, to represent a group comparable with CGS graduates.
Data Analysis: Recommended Approaches
A key question is what to do with the data once they are collected. Creswell (2009, 2014) insists that what is crucial in mixed methodology is the integration of the data types using rigorous analysis. Creswell (2009) has recommended using content analysis to generate grounded theory and to make possible coding that would show, say, the relative strength of some theme across populations—an approach Creswell sees as particularly appropriate for naturalistic methods. He also insists that such analytical techniques begin during the collection of data, to allow emergence of grounded theory, a procedure followed in the present study.
The most promising approaches in evaluating gifted programs do envision an eclectic mixture of data and methods and strive for the “rigor” called for by Creswell and others. In fact, the College of William and Mary’s Center for Gifted Education has used the term eclectic in naming its model (VanTassel-Baska, 2004). The Center has used the model to carry out a meta-analysis of surveys of separate stakeholder groups across multiple programs and years. Feng (2004b), in her report on the study, noted that the stakeholder survey is a mainstay of program evaluation, despite, as Feng admitted, the oft-criticized subjectivity of self-report.
However, Feng also noted, as have Punch (2009) and others, that rigorous triangulation can produce convincing evidence. Feng recommends the disaggregation of data sets when multiple groups are surveyed because it allows evaluators to compare the results across groups of stakeholders—a process also recommended by Creswell (2009), to unify the analysis of quantitative and qualitative data sets. The goal of such triangulation is to provide a picture of the program’s operation and effects or products as full and rich as possible. In designing this survey of graduates, CGS hoped to enrich the picture produced by previous elements of evaluation.
Method
The methodology of the CGS graduate survey can be described as mixed (Creswell, 2009, 2014; Punch, 2009) because it produced two types of data—qualitative (through free-response prompts) and quantitative (through Likert-type scale questions). The instrument was developed and the data analyzed using methods described below. Every effort was made to accept Callahan and Moon’s (2007) challenge to use methods of data collection and analysis rigorous enough to be worthy of a larger audience.
Participants
The CGS group
The first group comprised all CGS graduates for whom the school had email contacts. One of us was still on the CGS faculty and was able to coordinate acquisition of contact information from as many graduating seniors as possible. In addition, there were email addresses voluntarily posted on the alumni web page linked to the CGS website (www.cgs.k12.va.us). Some of the latter had given their years of graduation but some had not. Altogether, these sources produced 134 email contacts, all of whom received invitations to the survey. There was no sampling. Every graduate whose contact information was available was surveyed.
The Comparison group
Selecting and contacting members of the Comparison group presented by far the biggest design challenge. In designing this group, Boyd’s (2002) recommendations were kept in mind: that members of the Comparison group should not be program participants but should be as comparable as possible to participants in the qualities central to the program’s mission (in this case, both achievement and aptitude). Class rank was used as an indicator of achievement and the final SAT total score as a measure of aptitude. Naturally, a fair number of the top graduates in schools from which the program drew were CGS alumni, but many were not (in fact, typically fewer than half of the top 25). From the years 2007 to 2011 (years selected because the number seemed manageable and the data comparable), 266 non-CGS graduates met both of two criteria:
They ranked in the top 25 of their class, and
Their best SAT total score (verbal + math + writing) was 1,800 or higher.
A few were added who narrowly missed one criterion but were extremely high in the other. Emails were not available for these graduates, so it was necessary to get home addresses from the school systems and send hard copy invitation letters from the district superintendents. As with the CGS graduates, every member of the target population was surveyed.
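The two stated criteria amount to a simple filter, which can be sketched as follows. This is only an illustrative sketch and not the procedure the authors actually ran: the `Graduate` record, its field names, and the sample values are hypothetical, and the few near-miss cases added by hand are deliberately not modeled because the text gives no thresholds for them.

```python
from dataclasses import dataclass

RANK_CUTOFF = 25    # top 25 of the graduating class (stated in the text)
SAT_CUTOFF = 1800   # best SAT total, verbal + math + writing (stated in the text)


@dataclass
class Graduate:
    # Hypothetical record layout; the article does not describe its data files.
    name: str
    class_rank: int
    best_sat_total: int
    attended_cgs: bool


def in_comparison_pool(g: Graduate) -> bool:
    """Return True if a non-CGS graduate meets both stated criteria."""
    return (
        not g.attended_cgs
        and g.class_rank <= RANK_CUTOFF
        and g.best_sat_total >= SAT_CUTOFF
    )


# Invented sample data, for illustration only.
sample = [
    Graduate("A", 3, 2010, attended_cgs=True),    # excluded: CGS participant
    Graduate("B", 12, 1850, attended_cgs=False),  # included
    Graduate("C", 30, 2100, attended_cgs=False),  # excluded: rank outside top 25
    Graduate("D", 25, 1800, attended_cgs=False),  # included: meets both cutoffs exactly
]
pool = [g.name for g in sample if in_comparison_pool(g)]
```

Treating both cutoffs as inclusive follows the wording "top 25" and "1,800 or higher"; any other boundary handling would be an additional assumption.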
Design of the Survey
Decisions in designing the survey flowed generally from three goals: (a) to address specific questions of interest to stakeholders; (b) to produce valuable comparative data, which implied the use of a Comparison group; and (c) to enhance credibility of results by triangulation of open-ended (free-response) and closed (multiple-choice) questions.
Development of instrument questions. (See Appendix A for the survey text.)
Free-response question component
Questions were developed from elements of the school’s mission statement, which was developed from NAGC (2010) recommendations and others.
Care was taken to make the questions as open-ended as possible and to avoid establishing preordinate categories (Lincoln & Guba, 1985; Patton, 1980, 2002).
Stakeholders were asked for feedback as the instrument was developed. These included the director, faculty members, and selected students, graduates, and their parents.
Clarity and face validity of free-response questions were tested using available seniors and visiting graduates, with supplemental interviews to clarify responses.
Multiple-choice component
Multiple-choice questions using a Likert-type scale were added to the survey to be sure certain issues of key interest to stakeholders were addressed whether or not they were raised in the open-ended responses. The Likert-type scale allowed for ranking levels of agreement—and for comparing them between the two groups, because of what has been called a “semantic differential” (Punch, 2009, p. 249), a scale that can measure degrees between polarized words like “agree” and “disagree.” The two types of data—free-response and multiple-choice—would allow triangulation on key issues using different data types.
Before the survey was posted online, additional small groups of seniors and graduates were asked to take the full survey, including multiple-choice questions, as a way to pilot the validity and reliability of the instrument. On the basis of discussions with these test subjects, respondents were allowed to omit questions of either the multiple-choice or free-response type.
Demographic data component
Respondents were asked for information about their final SAT scores and class rank because those were elements used to gain an equivalency between the groups. They were also asked to identify their gender, race, and high school of graduation, each of which was of interest to stakeholders. For all questions, care was taken in following Feng’s (2004a) recommendations to make questions as one-dimensional as possible.
Administering the Survey
In December 2011, invitations and survey links were sent to both groups: to 134 CGS graduates by email from the director and to 266 members of the Comparison group by hard copy letter from the respective district superintendents. Both groups were given until January 31, 2012, to take the survey, with a reminder sent shortly after January 1.
Data Analysis
Multiple-choice. (Respondents were allowed to omit questions.)
Frequencies and percentages were calculated for each response.
General themes were discussed, considering the multiple-choice data as a whole.
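The frequency and percentage step for a single Likert-type item can be sketched as below. This is a minimal sketch under stated assumptions: the four scale labels are hypothetical (the instrument's exact wording is in Appendix A, not reproduced here), the sample responses are invented, and percentages are computed over those who answered the item, which is one plausible way to handle omitted questions rather than a documented choice of the authors.

```python
from collections import Counter

# Hypothetical scale labels; the survey's actual wording may differ.
SCALE = [
    "Strongly agree",
    "Somewhat agree",
    "Somewhat disagree",
    "Strongly disagree",
]


def tabulate(responses):
    """Return {option: (count, percent)} for one item.

    None marks an omitted question and is excluded from the base.
    """
    answered = [r for r in responses if r is not None]
    counts = Counter(answered)
    n = len(answered)
    return {
        option: (counts[option], round(100 * counts[option] / n, 1) if n else 0.0)
        for option in SCALE
    }


# Invented responses for illustration: 10 answers and 2 omissions.
responses = (
    ["Strongly agree"] * 6
    + ["Somewhat agree"] * 3
    + ["Somewhat disagree"] * 1
    + [None] * 2
)
table = tabulate(responses)
```

Running the same function separately on the CGS and Comparison responses gives the side-by-side percentages that the between-group comparison relies on.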
Free-response. (Respondents were allowed to omit questions.)
Content analysis was performed using constant-comparative methods recommended by Lincoln and Guba (1985) and axial coding techniques recommended by Strauss and Corbin (1998) as well as Creswell (2009, 2014).
Each response was coded, the coding unit being the theme. For each theme, a typical quotation was selected and reported as an anchor example. In some cases, more than one anchor was provided to show a large or bimodal range of responses within the category (theme).
Close calls and outliers were adjudicated by member checking with students and teachers in the program.
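Once responses have been coded by hand, the bookkeeping behind theme counts and anchor examples can be sketched as follows. The theme labels and response texts are invented placeholders, and picking the first coded response as an anchor is only a stand-in: in the study, choosing a typical quotation was a judgment made by the researchers, not a mechanical rule.

```python
from collections import defaultdict

# Each tuple: (group, theme code, response text). All values are placeholders.
coded = [
    ("CGS", "community", "response text 1"),
    ("CGS", "research", "response text 2"),
    ("CGS", "community", "response text 3"),
    ("Comparison", "teacher support", "response text 4"),
]


def theme_counts(coded_responses):
    """Count how often each theme was coded within each group."""
    counts = defaultdict(lambda: defaultdict(int))
    for group, theme, _text in coded_responses:
        counts[group][theme] += 1
    return {group: dict(themes) for group, themes in counts.items()}


def first_anchor(coded_responses, group, theme):
    """Return the first response coded to a theme, or None if there is none."""
    for g, t, text in coded_responses:
        if g == group and t == theme:
            return text
    return None


counts = theme_counts(coded)
anchor = first_anchor(coded, "CGS", "community")
```

Keeping the group label on every coded unit is what makes it possible to compare the relative strength of a theme across the two populations.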
Data integration/member-checking
To develop general conclusions for the study, results for the free-response questions were compared with those for the Likert-type and examined for prominent themes cutting across both types of data, as recommended by Creswell (2009, 2014). Selected stakeholders, including study participants, were then shown the draft conclusions and the selected example quotations, and asked to comment on specific issues. The following case is an example.
Follow-up email on testing
Use of a follow-up email (see Appendix B) to CGS participants shows how member checking was used as an analytical tool after initial data analysis. The purpose was to check a tentative conclusion: that a significant number of survey respondents saw a weakness in their preparation for high-stakes college testing.
Figure 1 shows a simplified flowchart of the process. Survey results are described in the following sections.

Figure 1. Overview of the method.
Results
Demographic Data Component
The response rate for the CGS group was 57.5% versus 34% for the Comparison group. As the latter group received hard copy invitations, not emails, and had to copy the URL for the survey manually, we considered the rate substantial. It suggests these high performers’ interest in the issues.
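The reported rates can be related back to the numbers of invitations (134 and 266) with simple arithmetic. The sketch below back-calculates approximate respondent counts; these counts are an inference from rounded percentages, not figures stated in the article, so they should be read as roughly 77 and roughly 90 rather than as exact values.

```python
def implied_respondents(invited: int, response_rate_pct: float) -> int:
    """Approximate respondent count implied by a reported response rate."""
    return round(invited * response_rate_pct / 100)


# Invitation counts and rates are from the text; the results are estimates.
cgs_n = implied_respondents(134, 57.5)
comp_n = implied_respondents(266, 34)
```

Because the Comparison rate is reported to the nearest whole percent, a neighboring count such as 91 would be equally consistent with it.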
Comparability of the two groups
As hoped, the Comparison group reported measures of academic aptitude (SAT scores) and performance (class rank) at least equal, as a group, to those of the CGS group: Of CGS respondents, 78% reported SAT total scores of 1,800 or above and 65% reported class ranks in the top 10, versus 78% and 72%, respectively, for the Comparison group.
Accelerated courses taken
Of CGS respondents, 69% reported having taken 10 or more AP or dual-enrollment courses versus 9.5% of the Comparison group. This difference itself was a goal of the CGS program.
Multiple-Choice Component
Both groups approved their high school (HS) preparation—CGS more “strongly.”
Large majorities of both the CGS and Comparison groups expressed satisfaction with their HS experience, especially in terms of preparation for college. In fact, as Table 1 shows, 97% of CGS graduates agreed either “strongly” or “somewhat” that their programs provided adequate challenge, as did 86% of the Comparison group. CGS graduates, however, were far more likely to agree “strongly” with the proposition, and only two graduates expressed any level of disagreement. These ratios were repeated in other responses.
Table 1. Responses to Survey Question 8 (First of Likert-Type Section)
Note. CGS = Commonwealth Governor’s School; Comp G = Comparison group.
“Culminating” as the crown jewel at CGS
The difference in perceptions between CGS graduates and those in the Comparison group also comes through strongly in responses to Questions 12 and 13 (see Tables 2 and 3), which asked about development of individual abilities. The extent of differentiation and the development of skills in research, presentation, and collaboration are exactly those areas developed most strongly by the yearlong culminating project and also in the program’s interdisciplinary units—two major differences between CGS and the regular programs.
Table 2. Responses to Survey Question 12
Note. CGS = Commonwealth Governor’s School; Comp G = Comparison group.
Table 3. Responses to Survey Question 13
Note. CGS = Commonwealth Governor’s School; Comp G = Comparison group.
Importance of teachers as mentors to both groups
Both the CGS and Comparison groups gave much credit to the support and mentorship of teachers. In fact, of all the Likert-type questions, this one produced the least difference between the groups, as Table 4 shows, if the “agree” and “strongly agree” are combined.
Table 4. Responses to Survey Question 16
Note. CGS = Commonwealth Governor’s School; Comp G = Comparison group.
The results reflect credit on the schools and their districts both in program design and in teacher recruitment and assignment. Students in the regular programs could take challenging courses, including AP and even International Baccalaureate courses—if they made those choices. The results, however, indicate the strength of the CGS program. There is a consistent differential in favor of the CGS group versus the Comparison group in strength of approval. For example, even though both groups tended to approve of support they received from faculty, 78% of CGS graduates strongly agreed with the statement “My high school program helped me gain higher-level conceptual understanding of content” (Question 9), versus 34% of the Comparison group.
As CGS teachers typically have additional assignments in the regular program, the results imply that the relative strength of CGS is due to qualities of the program itself, such as the fact that CGS students, unlike those in the regular school programs, benefit from interaction through videoconferencing and combined field experiences. It is true that students in the Comparison group, all top performers, may (like the CGS students) have taken more than one course under a particular teacher at increasingly advanced levels and thus benefited from some of the long-term mentoring experienced at CGS. It is reasonable to conclude that teachers and administrators in the regular program have worked hard (formally and informally) to see that gifted learners there have had as much close faculty support as possible but that the limits of the regular programs have occasionally frustrated the best students. Such comparative data are quite valuable to schools seeking to evaluate their programs.
Free-Response Question Component Content Analysis and Meta-Themes
The free-response data from Questions 18 to 29 reinforce the themes emerging from the Likert-type questions, especially the levels of challenge, individualization, and support from both peers and teachers at CGS. The sense of community is often mentioned: “I still remember the closeness of the small group” (CGS graduate). The following discussion draws from both the free-response results and those of the multiple-choice section.
Power of the CGS program as a whole
Similar to the multiple-choice responses, the open-ended responses strongly support the benefits of the CGS program. What the study adds to the on-site evaluation is the evidence of overwhelming approval after exposure to the challenges of college, in some cases long after (one graduate reported being a university professor of entomology). As Tables 5 and 6 show, more than two thirds of CGS graduates mentioned CGS or an activity available only through CGS as an important positive choice they made in high school—and this without prompting. As one put it, “CGS was the best choice I ever made” (CGS graduate). The following themes stand out in the free-response results.
Responses to Free-Response Question 19
Note. CGS = Commonwealth Governor’s School; EMT = Emergency Medical Technician; AP = Advanced Placement; DE = Dual Enrollment.
DE courses offered credit both in high school and through a regional community college.
International Baccalaureate courses required for the International Baccalaureate diploma. These are not offered at CGS.
Responses to Free-Response Question 25 (CGS Only; Total Responses = 71)
Note. CGS = Commonwealth Governor’s School; AP = Advanced Placement.
Academic challenge
Tables 5 and 6 illustrate the importance of challenge to both groups. CGS alumni frequently praised the academic challenge in their coursework at CGS, sometimes framing it in terms of higher-order thinking and communicating: “My writing skills and critical thinking skills are much more developed than other students at my university” (CGS graduate). They also appreciated the advanced course options made available by the program. Graduates from the Comparison group also approved of their advanced options but were more likely to single out particular courses and teachers—as 20% of them did, versus only two of the CGS graduates. The implication from the data is that challenge had mattered to all these graduates and that the CGS program itself was a choice from which other good ones flowed, whereas in the regular programs, students might or might not make challenging choices. Indeed, in a question about dissatisfactions with their preparation, CGS graduates were far less likely than their counterparts in the Comparison group (27% vs. 60%) to wish for even more challenge.
Graduates in the Comparison group were much more likely than CGS graduates to single out particular programs and teachers, praising the level of challenge there: “It all depended on the teacher” (non-CGS graduate). Nevertheless, in the free responses as in the Likert-type items, non-CGS graduates were much more likely than CGS graduates (30% vs. 9%, respectively) to complain of a lack of challenge in high school. Both groups noted the lack of challenge at their base schools in the non-AP and elective coursework. A common complaint was that even a supposedly “advanced” class emphasized rote learning for a test, instead of “real understanding” (non-CGS graduate). Like the CGS group, Comparison group graduates appreciated differentiation, but in their cases, it was the upshot of choices they had made among courses and sequences. At least in core subjects, the choices CGS students could make, say between AP Calculus and AP Statistics, were challenging no matter what.
Particular power of the research strand
CGS alumni were especially glad of the research and presentation skills acquired, which they often noted had put them ahead of their peers in college. Although the Comparison group only occasionally complained about preparation in research and presentation skills (12% vs. no such complaints from the CGS group), neither did they mention these as a strength, whereas CGS graduates routinely did so: “Governor’s School is basically the reason I’m so much more well prepared than my peers at college” (CGS graduate), a common sentiment. Indeed, the yearly “Culminating” project is the element of the program most often appreciated by CGS graduates—even by respondents who began by saying, “Much as I hate to admit it . . .” (CGS graduate).
The Culminating project’s key elements are the very ones that most distinguish CGS from the regular program in students’ eyes: individualization, close faculty coaching, depth of learning, and challenges with real relevance to students’ future lives as learners. Responses to questions other than those depicted in Tables 5 and 6 show similar themes. To Question 21, for example, about what graduates had found most beneficial in high school, 20% of CGS respondents mentioned either research or presentation skills, or a combination of both. No respondents in the Comparison group mentioned those skills in response to this prompt. Again, this is a perspective not available to evaluators without surveying alumni.
Sense of community
In addition to the program’s academic challenge, CGS graduates appreciated the sense of community among students and teachers that developed over 4 years together. In fact, that sense itself was seen as a contributor to the level of challenge: “Being in classes with the same people every year allowed me to be more outgoing and made us strive to push ourselves. It was in CGS that I decided to major in chemistry” (CGS graduate). The following comments are typical: “The atmosphere was so amazing, the teachers were awesome” (CGS graduate). “I still miss the CGS teachers, friends, and environment” (CGS graduate). “My teachers, my peers, and my daily environment in CGS cultivated in me a love for learning” (CGS graduate). Graduates praised the collaboration and mentorship they experienced, often specifically mentioning “group projects” (CGS graduate) and being “surrounded by students of a high caliber” (CGS graduate), which some noted was not always the case outside of CGS. The word “family” often came up. It is possible that the group is what mattered to these graduates: that a particular set of teachers working together for years, in some cases more than a decade, had developed a unique chemistry that would be difficult to reproduce.
Power of non-academic activities for academic development
In addition to the level of challenge in coursework, CGS graduates also appreciated the individualization and interdisciplinary connections allowed by the field experiences and research assignments, particularly the Culminating project: “Culminating gave wide variety of topics, followed own interests” (CGS graduate). It is not surprising, therefore, to see that in open-ended responses, both the CGS and Comparison groups reported substantial benefit from highly differentiated non-academic extracurricular activities such as sports, clubs, and the arts. They saw these as fostering personal attributes like self-discipline, self-expression, and leadership. One graduate focused on Navy Junior Reserve Officer Training Corps (NJROTC), which “developed me as a leader and successful student” (CGS graduate). In some cases, they even saw them as contributing to their development as learners—in, for example, responses to Question 21 about “lifelong learning,” in which 5% of CGS graduates and 15% of the Comparison group credited extracurricular activities. Although some “extracurricular” activities are quasi-academic—Robotics, Envirothon, Academic Team, theater, music, and so on—still, there is a lesson for programs like CGS: Ensuring that students can participate in auxiliary activities seems important to their academic development, or at least some of them see it that way.
A mixed picture on testing—and the follow-up query
The survey results are a resounding endorsement of the Culminating project, but less so for other kinds of assessment. Some graduates in each of the two groups (10% to 15%, depending on how responses are aggregated and interpreted) mentioned feeling unready for the kind of high-stakes exams they faced in college, where there might be only a midterm or term paper, then a final. AP exams are probably the best analog to college tests (as several respondents noted), but not everyone takes the actual AP exam. Some graduates wished they had taken more of them. Also, some graduates noted that they had been exempted from many exams in high school, through high averages or certain test scores, such as those on the state Standards of Learning (SOL). The SOL tests are not much of a challenge for gifted learners. In view of the results and stakeholder interest in the question, a follow-up email was sent asking specifically about testing. However, the results of the query (see Appendix B for its text) showed a consensus (18 of 23) in favor of the present CGS approach to assessment, wherein the student’s Culminating project counts as the final (second semester) exam grade in all four subjects.
Lack of interest in racial or ethnic diversity as a problem
Question 27 asked CGS graduates to suggest ways to improve diversity, the “participation of ethnic and cultural minorities.” It is hard to see the data as showing anything but weak interest in the issue, or at least in commenting on it. The question was avoided by 25% of respondents. A quarter of those who did respond (15) rejected the importance of the issue. One CGS graduate said, “I don’t see this as an issue, and I am a minority” (CGS graduate). Indeed, according to one respondent, the real “diversity” issue for programs like CGS may be related to economics:
Forced racial diversity causes real problems and its benefits are either coincidental or imaginary. If CGS wants genuinely to improve its “diversity,” it should focus on economic diversity. That tends to be more meaningful. (CGS graduate)
Still, one respondent had this to report:
It felt isolating to be the only Black person in a classroom for 4 years, and the few times topics about any other culture came I had to be THE representative. That’s awkward at 16, and 16 is awkward enough. (CGS graduate)
Among those who made suggestions, the results were a mixture of what might be expected: community outreach, visits by minority students and teachers to middle schools, and early identification and recruitment for gifted programs.
Implications for Practice and Further Research
The Value of Surveying Graduates
As part of ongoing evaluation processes, high school programs of all kinds, but especially those for special populations, should make an effort to keep up with their graduates. The results of graduate surveys may well confirm the conclusions of site-based evaluations, as they generally did in the present study, but they represent the perspective of students now in college or beyond, the contexts for which they were explicitly being prepared. CGS graduates in the study repeatedly contrasted their preparation with that of their peers. Such information is important, perhaps vital, in evaluating the success of a program for gifted learners.
Responses from graduates differed from those of graduating seniors. In a random sample of 54 responses by seniors to an “exit” survey question about suggested program changes, 15 respondents (28%) had major suggestions about changing or eliminating various elements of “Culminating,” ranging from weakening requirements to eliminating the project. In contrast, CGS graduates had nearly unanimous praise for the experience. Commenting on the “exit” survey, one staff member blamed the usual “whiny-ness” of seniors. Whatever the case, the graduate survey produced a different perspective.
The Value of a Comparison Group
Evaluators should consider using carefully designed Comparison groups as an element of evaluation. The use of such a group in this survey responds to calls (Borland, 2003; Boyd, 2002; Callahan & Moon, 2007; Finn & Hockett, 2012) for more convincing data in support of special schools for the gifted and for the broadening of “stakeholder” categories beyond the program itself. With adequate contact data and the commitment of administrators, such a Comparison group increases the value of the study. Even if experimental controls are not possible, as they often are not, mixed methodologies like those suggested by VanTassel-Baska (2004) and Creswell (2009, 2014) may yield valuable insights. Indeed, the present survey provides support for the total school programs in these counties. The relative strength of approval by the CGS group indicates the program’s unique benefits for the most gifted.
Support for Key Elements of the CGS School-Within-a-School Model
The results support the importance of key program and curricular elements: acceleration of coursework, intellectual challenge, interdisciplinary content, differentiation (notably in long- and shorter-term research projects), the centrality of the research strand (Culminating), faculty teamwork across grade levels as well as disciplines, mentoring, and full access (through the half-day feature) to extracurricular opportunities at the base schools.
Centrality of the research strand
Three elements in the above list (acceleration, challenge, and differentiation) are closely related, with the Culminating strand bringing them together. Indeed, the research-and-presentation strand is important enough that the School is petitioning the state to allow it to award independent credit for this strand as a separate course sequence.
Limitations of the Study
The researchers’ own characteristics provide both strengths and limitations. Recently retired from the CGS faculty, one of us was in effect a participant-observer, knowing the faculty well, plus many of the program’s graduates and current students, and having participated in both state evaluations of the program. Local experience and credibility also helped to gain administrative support for the study and to ground its design. Such a status also carried risks. Because many of the participants knew the researcher, their responses to the survey may have been influenced by such knowledge. That one of us was the program director during the study carried similar potential benefits and risks.
Further limitations arose from disparities in the makeup of the two survey groups. Unlike the CGS group, the Comparison group did not include graduates of classes prior to 2007. Attitudes toward one’s high school preparation may vary over time, and more recent graduates may be more or less likely to respond to such a survey. This study does not address such issues. Another complicating factor is that response rates for CGS graduates differed greatly among the individual schools, mostly because some schools did not have CGS sites, and, of those that did, some sites had been in existence longer and thus had more graduates. Variations in results among sites are thus less revealing than they might have been with more purposive sampling.
A related design complication is that some unknown number of the “high-achieving graduates” of the regular programs no doubt applied to CGS and were not accepted. Others, of course, may have considered and rejected that alternative. Both groups would know the CGS consortium was conducting the study (a few in the Comparison group actually commented on CGS). Such awareness could have influenced their responses.
Another limiting factor of the survey is that, despite the large numbers taking the survey, the numbers of African American respondents were quite low, about 3% in each group (yet more reason for the concern about diversity), so that generalizing from what they said is difficult. And as the survey did not seek to identify economic status, its influence on the results is not measurable.
Recommendations for Further Research
Gather Data Beyond Self-Report, Targeting Measurable Outcomes
More research is needed comparing the development of students in gifted programs of various kinds to that of students in the more traditional high school programs, especially using measurable results beyond self-report—such as college GPA, scores on graduate exams, admissions to graduate programs, publications and other performances, or achievement in universities outside the United States. Longitudinal studies are needed comparing cohorts in various types of special programs for gifted learners to those in more traditional college-preparatory programs.
Study Effects of Various Kinds of Diversity
More study is needed on the influence of student and faculty diversity on student perceptions and outcomes. Such studies should target a variety of types of diversity and their interactions: not only of race but also of economic status, nationality, birth language, immigration status, religion, and so on. What is the role of gender in gifted programs (among students or teachers)? Females typically outnumber males in the CGS student body by about three to two. Is it possible that girls and boys respond differently to some aspects of programs like CGS or to certain teacher traits? The data in this study did not clearly show such an effect (comparison of the disaggregated male and female results did not seem to show systematic differences), but that does not mean that there were none or that more sophisticated statistical methods could not discover them.
Study Effects of Preparation for Assessment Regimes in Higher Education
Concerns about preparation for high-stakes testing in college could be addressed through well-designed Comparison groups differing on the extent to which they were given specific training for such regimes, for example, through sample tests, syllabi, and other information provided by colleges to which a program’s students typically apply.
Footnotes
Appendix A
Appendix B
Authors’ Note
The Governing Board of The Commonwealth Governor’s School provided access to data and online survey hosting, plus necessary postage and stationery. No compensation was paid to the researchers.
Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Bios
L. D. (Dan) Walker has taught at the high school and college levels for 42 years, 13 of those at The Commonwealth Governor’s School, from which he retired in 2011. He is currently an adjunct instructor at the University of Mary Washington’s College of Education and in the Virginia Community College System’s Career-Switcher program, EducateVA.
Merri Kae VanderPloeg has been working in the field of gifted education for 22 years. She was a gifted resource teacher in two states and has led gifted programs in Anchorage, AK, and Virginia. She currently holds the Vice-President position of the Northern VA Gifted Council. Ms. VanderPloeg is a doctoral candidate in Interdisciplinary Leadership at Creighton University.
