Background
The Common Educational Proficiency Assessment (CEPA) is a large-scale, high-stakes English language proficiency/placement test administered in the United Arab Emirates to Emirati nationals in their final year of secondary education (Grade 12). The purpose of the CEPA is to place students into English classes at the appropriate government institution. The CEPA was first administered in 2002 as a joint venture between the National Admissions and Placement Office (NAPO) in the Ministry of Higher Education and Scientific Research and the three federal higher education institutions in the UAE, namely the Higher Colleges of Technology (HCT), the United Arab Emirates University (UAEU), and Zayed University (ZU). Currently, the CEPA office has full-time staff dedicated to the development, scoring, and administration of the exam. In addition to CEPA English, students can also sit CEPA Mathematics upon request.
Test description
Test specifications
CEPA-English is a two-hour written exam consisting of three sections: Grammar and Vocabulary, Reading, and Writing. There is no break between the three parts of the test. Instructions are in English and Arabic. The test is administered in two formats: paper-and-pencil and computer-based. There is a recommended time for each section: 45 minutes for Grammar and Vocabulary, 45 minutes for Reading, and 30 minutes for Writing. There are 45 grammar items and 40 vocabulary questions. All the questions in the grammar, vocabulary, and reading sections are multiple-choice. The grammar items measure a candidate’s ability to recognize common grammatical patterns in English, and the vocabulary items measure knowledge of common English vocabulary. The Reading section consists of two descriptive or narrative texts of around 400 words in length, and one non-prose text, such as a web page or a brochure, with a total of 25 multiple-choice questions across the three texts. The Writing section consists of an essay task of between 150 and 200 words. The quality of a student’s writing is assessed in terms of grammar, vocabulary, spelling, and content. Students record their answers and write their essay on an Optical Mark Reader (OMR) sheet, and these sheets are later scanned and processed by NAPO. All CEPA test specifications are public and can be obtained at http://ws2.mohesr.ae/CEPA/.
Test purpose and use
CEPA-English, which is the primary subject of this review, was initially designed to place students into English classes in the preparatory or first-year program at the tertiary institution that each student planned to attend. Table 1 specifies how students are placed into different programs within each of the three federal institutions based on their CEPA overall score (i.e., Grammar and Vocabulary, Reading) and CEPA writing scores. Each of the three institutions sets its own CEPA score ranges corresponding to its different programs; these score ranges have been finely tuned over time and are open to the public. Students with a high CEPA-English score may be eligible to bypass the preparatory program and enter the baccalaureate program directly. At ZU, for example, students with a CEPA overall score of 185+, a writing score of 4.75+, and an IELTS score of 5.0 are eligible to sit an additional direct entry test, which they must pass in order to enter the baccalaureate program.
CEPA score required to enter directly into the baccalaureate program.
UAEU does not use the writing section of CEPA for placement.
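The ZU example above amounts to a simple conjunction of three score conditions. A minimal sketch of that rule follows; the names are ours, and we assume that each quoted figure, including the IELTS score of 5.0, is an inclusive minimum. It covers only eligibility to sit the direct entry test, not the outcome of that test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    cepa_overall: float  # CEPA overall score (Grammar and Vocabulary, Reading)
    cepa_writing: float  # CEPA writing score
    ielts: float         # IELTS band score


# Figures quoted in the text for ZU; treating them as inclusive
# minimums is an assumption made for this illustration.
ZU_MIN_OVERALL = 185
ZU_MIN_WRITING = 4.75
ZU_MIN_IELTS = 5.0


def may_sit_zu_direct_entry_test(s: Scores) -> bool:
    """Return True if all three quoted ZU minimums are met."""
    return (
        s.cepa_overall >= ZU_MIN_OVERALL
        and s.cepa_writing >= ZU_MIN_WRITING
        and s.ielts >= ZU_MIN_IELTS
    )
```

Because the three conditions are joined by "and", a very high score on one measure cannot compensate for falling short on another.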
Administration
There are six large (over 5,000 students) official administrations of the CEPA-English exam per year and one official large make-up day; certain sites also offer smaller administrations. Students can take CEPA more than once during their final year of high school, with their highest score being used to determine eligibility and for placement. In the 2011–12 academic year, the CEPA office handled 52 administrations that were held in 23 official CEPA test sites. Not counting special administrations, CEPA tested 18,500 students in 2011–12. In total, 8,100 students were tested via the computer-based CEPA and 17,600 were given paper-based tests (pers. comm., Lange, 2012). CEPA is administered free of charge to all students. The technical reports, however, are neither published nor released to the public.
Invigilators are employees of the institutions wishing to use the CEPA test. Each institution determines how invigilators will be compensated. They are either paid for their time or given additional vacation days. Invigilators are trained using a “cascade model” on exam day. The CEPA Head and Supervisor personally visit each test site to train the Site Supervisor. The Site Supervisor, backups, and anyone who may take a leadership role on exam day are invited to attend the training session, which includes general information about best practices in test administration, as well as information specific to CEPA. The Site Supervisor then provides training to invigilators and other test administration staff before the test date. Soft copies of the presentation, the Administration Guide, and the Registration, Invigilation and Authentication Guide are provided to the Site Supervisor.
Cornerstones of testing
To frame this review, we will look at CEPA-English and how it measures up to what Coombe, Folse, and Hubley (2007) call the cornerstones of testing, the guiding principles that underlie all good assessments.
Validity
The validity of an assessment is the degree to which it measures what it is supposed to measure. It is vital for a test to be valid in order for the results to be accurately applied and interpreted. On a test with high validity, the items will be closely linked to the test’s intended focus. The validity of a test is critical because, without sufficient validity, test scores have no meaning. In this respect, the CEPA has good construct validity, as the range of test task types is sufficient to ensure good coverage of the constructs outlined in the CEPA test specifications. CEPA scores very highly on content validity, as the test specifications were originally developed by representatives of the three federal institutions who used their respective curricula as a basis for exam content (Lange, 2012). In order to ensure continued content validity, the CEPA team maintains close contact with the three federal institutions to discuss how effectively the test places students into levels, and with the foundation programs to learn of any relevant curricular changes (Lange, 2012). Criterion-related validity is a statistical concept whereby a test’s results are correlated with those of already established tests of the same skill or domain to see the degree of correlation between test takers’ scores. Large-scale studies correlating CEPA scores with those of other major internationally recognized benchmarks like TOEFL and IELTS have been conducted, and score comparisons and equivalencies are made available online at http://ws1.mohesr.ae/cepa/Files/Interpreting_CEPA_Scores.pdf.
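To make the notion of criterion-related validity concrete, the sketch below computes a Pearson correlation coefficient between two sets of scores for the same test takers. The coefficient is our choice for illustration; the source does not state which statistic the CEPA studies used. The scores are invented for five hypothetical test takers and are not CEPA data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


# Invented scores for illustration only (not real CEPA or IELTS results).
new_test_scores = [150, 160, 170, 180, 190]
established_test_scores = [4.0, 4.5, 5.0, 5.0, 6.0]
r = pearson_r(new_test_scores, established_test_scores)
```

A value of r close to 1 would indicate that the two tests rank the same test takers in much the same order, which is the kind of evidence a criterion-related validity study looks for.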
Reliability
Evidence of CEPA’s reliability is documented in numerous reports, which are available on the website. For example, in the ‘Interpreting CEPA Scores: FAQ’, stakeholders are provided with a wide range of information on the various statistical analyses run on the CEPA. This includes information on scaled scores, how the three federal institutions use CEPA scores, performance of the student population over the years on all parts of the CEPA, and empirical data on how much a student can expect to improve in a semester. Equivalencies between CEPA and several internationally recognized assessments (e.g. TOEFL and IELTS) are also claimed (see Table 2).
Table 2. Equivalencies between CEPA and TOEFL PBT, iBT, CBT, and IELTS.
There are a number of quality control measures present in the writing section of CEPA that contribute to its reliability. NAPO requires double-blind marking of all writing scripts. To ensure inter-rater reliability, all scripts are double marked online, and in the case of discrepancies or out-of-range marks a third marker is assigned. Although CEPA does not report any inter-rater reliability statistics for any given exam/year, the team looks at measures of consistency and severity for each rater and makes decisions based on these data. All markers must have an MA in a relevant area and must be currently employed by one of the three federal institutions. Before marking begins, markers must go through an online accreditation process, which includes practice scripts to familiarize themselves with the standard and accreditation scripts to apply the standards. The CEPA team uses Facets, a software program that adjusts scores according to a marker’s severity and reliability profile. This protects students from being penalized by two severe markers from the marking pool, and also serves to identify any markers who are marking erratically. All markers receive feedback on their marking, including their marker severity and reliability profile, shortly after the exam period. Once all the sections of the CEPA are marked, the scores of all parts are correlated. If a student’s score on the objectively scored sections (Grammar and Vocabulary, Reading) and score on the Writing section disagree markedly, the test is reviewed and marks are adjusted if necessary.
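The double-marking step described above can be pictured as a routing decision: a script goes to a third marker when the first two marks are too far apart or when either falls outside the valid range. The sketch below shows only that decision. The tolerance and the score range are hypothetical values chosen for illustration, as the source states neither, and the sketch does not model how the final mark is then resolved or how Facets adjusts scores.

```python
# Hypothetical values for illustration only; the source does not state
# the tolerance between markers or the valid range of a writing mark.
MAX_GAP = 1.0
SCORE_RANGE = (0.0, 6.0)


def needs_third_marker(first, second, max_gap=MAX_GAP, score_range=SCORE_RANGE):
    """Return True if a script should be sent to a third marker."""
    low, high = score_range
    # Either mark falling outside the valid range triggers a third marking
    out_of_range = not (low <= first <= high and low <= second <= high)
    # So does a gap between the two marks that exceeds the tolerance
    discrepant = abs(first - second) > max_gap
    return out_of_range or discrepant
```

With these assumed values, marks of 4.0 and 4.5 would stand, whereas marks of 2.0 and 4.0 would send the script to a third marker.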
Usefulness
The CEPA has proved to be a very useful test for all three federal institutions in the UAE. As can be seen from Table 1, all three federal institutions use the results not only to determine whether they will accept students, but also to place students once they have been accepted. It was noted that in the first two years that CEPA was administered, the placement of students was not always accurate (Gjovig & Jaquith, 2012). This was largely because some students did not take the exam seriously, and many failed to complete the writing section of the test. However, students are now more aware of the importance of the CEPA overall and the writing section in particular and take it more seriously (Gjovig & Jaquith, 2012).
Washback
It is noted on the CEPA website that in 2003, the average CEPA score was 150, and in 2010 it was 161. There are a number of possible explanations for what we consider to be positive washback. First, the stakes for CEPA have increased significantly since 2003. Initially, CEPA was used as a placement test only, whereas now it is also used as an admissions test. As such, teachers and students take it much more seriously. Second, it is common for test scores to increase as teachers and students become more familiar with a particular test. In 2008, NAPO published CEPA Challenge (Lange & Brown, 2008), a test-preparation book that was widely used in schools. In addition, the CEPA website (www.napo.ae/cepapractice) contains useful practice materials. All of these no doubt contribute to students achieving higher scores.
Authenticity
Communicative language testing theory tells us that tests should mirror authentic, real world tasks (Morrow, 2012). If we combine this notion with the fact that most educators in the UAE are communicative language teaching methodologists, the authenticity of CEPA must be taken into account. If we consider the format used – MCQ – the CEPA does not get high marks on authenticity. However, the texts found on the reading section of the CEPA are adapted from authentic texts and the writing prompt asks students to write an opinion essay, which is a relevant, purposeful task for students at this level entering their tertiary studies. Moreover, CEPA was designed specifically to meet the needs of Emirati students. As such, the test is targeted to the average Grade 12 student in the UAE. Because all of the test writers have lived and worked in the region, there is little chance of cultural bias affecting test scores (Lange, 2012).
Transparency
Transparency is a particular strength of CEPA. Test developers make every attempt to make the CEPA transparent without compromising exam security. Transparency is targeted at several stakeholder groups. First, NAPO sends representatives to most high schools in the UAE to answer questions about CEPA and the NAPO application process. For students, CEPA provides materials explaining the test, and practice exams are available so that teachers and students can acquaint themselves with the rubrics, test formats, and recommended timings. From a teacher professional development standpoint, the CEPA team has a presence at relevant local conferences, and gives professional development workshops and presentations at sites and prospective sites upon request. For the wider educational community, public specifications, FAQs, general information about scores and quality checks are all available at: http://ws2.mohesr.ae/cepa/Teachers_EN.aspx.
Security
Several new versions of the CEPA are written each year and strict security measures are in place to ensure that CEPA remains a secure exam. For example, invigilators are trained to observe student behavior very closely. Students who bring materials into the exam room or try to use mobile devices are disqualified from the exam and lose their right to free education for life. The CEPA team also uses special software to detect collusion among students. If past papers are made available to the public, they are withdrawn from use. Although test takers can ask for their scores to be reviewed, they cannot be given test copies.
Positive aspects of CEPA
Universal buy-in from stakeholders
The CEPA team has worked tirelessly to ensure buy-in from stakeholder groups in the UAE. There is both academic and financial support from the Ministry of Education via the creation of the National Admissions and Placement Office (NAPO), which oversees the development and administration of the CEPA. The three federal institutions that draw students based on CEPA scores cooperate extensively with CEPA/NAPO by providing faculty who serve as item writers for alternative versions of the exam as well as members who serve on the academic committee. Teachers recognize the importance of preparing their students effectively for CEPA and do so using teacher-produced materials and those made available by the CEPA team, including the test preparation package entitled CEPA Challenge (Lange & Brown, 2008).
Use of local expertise for test administration and development
The CEPA was established and is now run by knowledgeable, experienced test administrators who are well respected in the region. The CEPA team offers training sessions to faculty members of the three federal institutions and teaching staff of the Ministry of Education who want to write test materials for CEPA. Once faculty members have passed an item-writing exercise, they are commissioned to develop test materials for live versions of the exam. The use of experienced faculty in developing test materials has ensured to the greatest extent possible that the materials are free of cultural biases and more closely reflect the culture and religion of the country. These item writers and developers have extensive experience in the cultural/educational context, which reduces the potential for bias. The staff in the NAPO/CEPA office includes several Emiratis, many of whom participate in the vetting and development process, which also helps minimize cultural bias.
Effective use of technology
CEPA/NAPO has used technology since the present version of the exam was launched in 2002. Students complete both the objectively scored section and the writing sample on an OMR sheet, which is then sent to the CEPA offices for computer grading and analysis. The writing section of the exam is scanned and disseminated across the country to writing markers, who then engage in online writing-marking training and calibration. Once markers have completed training and calibration, they mark an allocated sample of scripts online. In 2007, a computerized version of the CEPA was developed and launched, and this version now accounts for 50% of all CEPA test administrations.
Areas for development
Addition of listening and speaking sections
To date, CEPA assesses only the skills of reading and writing, in addition to the sub-skills of grammar and vocabulary. Listening was excluded in the early days of CEPA for purely practical reasons, as developers were unsure of the technology available for administering a listening section of the test. Speaking was not part of the original specifications, as it was considered too difficult to assess given the numbers of students to be tested. Now, however, the CEPA is an established and internationally recognized benchmark with well-developed administrative procedures. We feel that the CEPA team should investigate the feasibility of adding a listening and speaking component. This would increase the content validity of the CEPA, as the three federal institutions that draw students based on CEPA scores all run integrated four-skills English language programs.
Inclusion of additional test task types and item formats
In its present form, the CEPA assesses through one item format: multiple-choice questions (MCQs). Although there are a number of advantages to using MCQs, the general assessment literature recommends that we assess students through multiple-measures assessment, as no single type of assessment, or in this case item format, can tell us all we need to know about a student’s level. As such, we encourage the CEPA team to consider using a wider variety of item formats, both fixed-choice and open-ended.
Development of a research agenda and research funding mechanism
As previously stated, the CEPA is an extremely transparent instrument. Despite this transparency, there is a dearth of journal articles in the literature that research various aspects of the CEPA and its impact on the UAE educational community. To this end, we recommend that CEPA develop a research agenda and allocate funding to locally based ELT professionals who have an interest in conducting research on CEPA.
Footnotes
Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
