Student views of peer assessment at the International School of Lausanne

Abstract

This article explores student attitudes and perceptions relating to peer assessment, as observed at the International School of Lausanne, where the case study was restricted to students in the Economics course of the International Baccalaureate (IB) Diploma Programme. Informed by a review of literature on the relative merits of peer assessment, this article highlights its specific strengths and weaknesses as assessment for learning and identifies specific gaps in the literature, before investigating students’ perceptions of its value and social dynamics within small student groups. The article concludes by considering the clear preference expressed by the majority of students participating in the study for anonymity in the peer assessment process. This adds a more complex and not yet thoroughly explored set of considerations to the debate about the merits of peer assessment. Students’ preference for anonymity concurs with Falchikov’s findings that such concern was more likely to be found in relatively small, well-established groups, exactly like the one that is the focus of this study.

Keywords

Assessment, assessment for learning, Economics, International Baccalaureate Diploma programme, peer assessment

Introduction

Peer assessment (PA), initially defined in this article as ‘an arrangement for learners to consider and specify the level, value, or quality of a product or performance of other equal status learners’ (Topping, 2009: 20), is a common practice among teachers both in the United Kingdom and within schools following Western-style curricula and programmes, such as those of the International Baccalaureate (IB). IB programmes, including the Primary Years Programme (PYP: ages 4–11), Middle Years Programme (MYP: 11–16) and Diploma Programme (DP: 16–19), are offered worldwide in both international schools and schools in national systems (IB, 2013). As an IB Diploma Coordinator and teacher of IB Diploma Economics, I have found PA to be a valuable and effective classroom practice over the past 15 years. It is considered a key component of pedagogical strategy at my present school to the extent that its use is one criterion, among many, for effective teacher appraisal. Furthermore, the IB requires PA of all IB Diploma schools/colleges via its Standards and Practices document for authorised schools, which states that ‘learning at the school involves students in both peer and self assessment’ (IB, 2004: 13).

PA is considered by many teachers to be a valuable formative assessment tool, but one not without its limitations. Students have provided me with various comments and feedback about its use, some positive and some negative, provoking a desire to explore the value of PA in my everyday teaching: more specifically, the issues around the complex social relationships relating to PA among adolescents, and the extent to which PA is ‘accurate’. It has also left me with a question, in the context of the IB Economics course, as to who garners more benefit from the process: the peer being assessed or the peer assessor.

This article investigates the perceived value to IB Diploma Economics students of PA in the context of internal assessment coursework. The article also explores the thought processes and emotions students experience when making a judgement about the work of a classmate – someone with whom they share an everyday existence. Can students make an objective assessment of their peer’s work when they will not only be acutely aware of the other student’s overall ability but may also wish to protect their feelings and avoid destabilising the specific social dynamics of the group?

The context of the case study

The case study took place over an 11-month period at the International School of Lausanne (ISL), Switzerland. This is a private, non-profit, non-selective, co-educational school, catering largely for expatriate families in the Lac Léman region. The school offers the IB PYP, MYP and DP. The study related to one DP Economics class of 18 students and 1 teacher (the author of this article) and involved a specific component of internal assessment. At ISL, Economics falls within the social studies department, which consists of seven staff members. DP Economics is taught by three teachers, all of whom also have other responsibilities in the school. It is also relevant that I am the Assistant Principal and IB Diploma Coordinator in the school, and therefore I have responsibility for curriculum and academic progress of all students in Year 12 and Year 13, the last 2 years of the school.

Within the IB DP, Economics students are required to produce a portfolio of four economic ‘commentaries’, each of which entails writing a 750-word economic analysis and evaluation of a newspaper article of their choice. This is normally completed over a 2-week period and is assessed using specific criteria, with a maximum of 20 marks. The internal assessment constitutes 20% of students’ final grade at higher level (HL) and 25% at standard level (SL). The present procedure within the class is for students to complete one ‘practice’ commentary, which is assessed formatively, followed by four ‘real’ commentaries which constitute the final portfolio.

Peer Assessment

Peer Assessment within the context of formative assessment

A detailed consideration of the value and role of formative assessment is beyond the scope of this article. However, many readers will find themselves making links with the Assessment for Learning movement that grew out of Black and Wiliam’s (1998a) original work, in which they summarise the key ingredient of formative assessment by saying that ‘For assessment to function formatively, the results have to be used to adjust teaching and learning’ (p. 4), while offering a definition of formative assessment as ‘all those activities undertaken by teachers, and/or by their students, which provide information to be used as feedback to modify the teaching and learning activities in which they are engaged’ (p. 10). It is a movement away from the managerial role of assessment, towards one where students take more responsibility for, and a role in, their own learning. It is usually associated with the ideas of constructivism, with its primary role being to inform the teacher of students’ progress rather than seeking to assign grades or make final judgements. It is defined by Sadler (1998) as ‘assessment that is specifically intended to provide feedback on performance to improve and accelerate learning’ (p. 77). This is in contrast to summative assessment, where assessment is used to make a judgement about progress or ‘certify’ levels of achievement. As the name suggests, summative assessment normally comes at the end of the process of learning and does not have improving learning as its primary goal.

Black et al. (2004) describe PA as an important ‘complement to self-assessment’ (p. 14). Nevertheless, they argue that the role that self-assessment plays within formative assessment is highly valued, even going so far as to argue that ‘assessment by pupils, far from being a luxury, is in fact an essential component of formative assessment’ (Black and Wiliam, 1998b: 8). However, ample literature exists on PA in its own right, particularly at the higher education level. Sadler and Good (2006) and Topping (2009) provide a detailed summary of PA and its perceived value. Topping’s (1998) earlier article provided an exhaustive insight into the academic work undertaken in this area prior to that time, and is considered here in detail as it provides a useful summary of all major prior works. A significant proportion of the PA literature, including that of Topping, is based on higher education, although Noonan and Duncan (2005) present a detailed work on PA at the high-school level. Helpfully, a major step forward in the literature was taken in 2010 with a number of articles published in the Journal of Learning and Instruction that year (Cho and MacArthur, 2010; Gielen et al., 2010; Kollar and Fischer, 2010; Strijbos et al., 2010; Strijbos and Sluijsmans, 2010; Topping, 2009; Van Gennip et al., 2010; Van Steendam et al., 2010; Van Zundert et al., 2010).

Defining Peer Assessment

There is little debate in the literature over how to define PA, with the definition offered in the introduction by Topping being recognisable within all articles on the topic. In addition, although highlighted by Black and others, PA is not a new phenomenon; as Gaillet points out, George Jardine, a professor at the University of Glasgow from 1774 to 1826, described the methods and advantages of PA of writing (1992, in Topping, 2009). PA is often seen, however, as being linked to self-assessment. Black (1998) was certainly of this opinion, although in 2004 he commented that ‘some experts view PA as a strategy on its own, but more often it is seen to be complementary to self-assessment’ (Black et al., 2004). It should also be noted that PA can take place within a bewildering range of different contexts. It is recognised by Noonan and Duncan (2005) that the majority of PA takes place early in the assessment cycle of a course and is used formatively. Topping (1998) goes further, developing a useful typology of PA.

Benefits of Peer Assessment

The claims made for PA are extensive and wide ranging, although only in the most recent literature do studies provide more detailed, research-backed evidence about which specific components of PA bring which benefits and to what extent. Black (1998), Black et al. (2004), Sadler (1998) and Topping (1998, 2009) all highlight a range of benefits. These are mostly related to students becoming more engaged with and taking more responsibility for their own learning, coupled with the view that students are unlikely to reach a goal if they are unable to understand what is required. Topping (2009) neatly summarises a number of specific advantages of PA, while arguing that ‘the overriding goal of peer assessment is to provide feedback to learners’ and that the ‘most significant quality of peer assessment is that it is plentiful’ (pp. 22–23). This view builds upon ideas from his previous article where Topping, in much the same way as Black and Wiliam (1998a), views the main benefits of PA in terms of ‘promoting a sense of ownership, motivation and personal responsibility’ (p. 256). Johnson (2004), while mainly agreeing with other authors, also considers the concept of trust among students and ensuring students are focussed on the process as well as the product.

The major strengths of PA as seen by Black et al. (2003) are that it improves motivation, promotes the use of the language of the assessment and helps to improve communication about learning between teacher and student. Less convincingly, Black et al. also propose that students are more likely to accept criticism of their work if it comes from a peer, rather than from a teacher. I would question this argument within my own educational context and will explore it in more detail later.

One obvious benefit of PA, particularly when undertaken in the context of generic external assessment criteria such as those in the ISL case study, is a deeper understanding for the assessor of the assessment criteria themselves. Despite a teacher’s best efforts, students can find reference to assessment criteria something all too easy to overlook. It is possible that this benefit accrues more from assessing a peer’s work than from the feedback obtained regarding one’s own. This is perhaps particularly pertinent in the context of this study, where all students are required to produce work that will be assessed using the same criteria. By having to make objective decisions in relation to specific criterion levels to be awarded for each commentary, students should gain wider overall familiarity with the assessment criteria and greater understanding of their application by teacher experts, having undertaken the same exercise themselves. Topping (1998) is not blind to this benefit as he describes PA as involving ‘cognitively demanding activities which could help to consolidate, reinforce, and deepen understanding of the assessor’ (p. 254). He also argues, building upon the work of Fry (1990), that PA can provide insights into the thought processes of the teacher, particularly with respect to the margin between specific grades. It is worth remembering here that Topping’s work is based around PA in higher education, although there is no reason to believe that the same arguments are not transferable to the upper end of secondary education.

Concerns relating to Peer Assessment

No single author has summarised the main concerns relating to PA and its effectiveness, although a neat overview is provided by Strijbos et al. (2010) in their introduction to a consideration of detailed research relating to the role of feedback in PA. The concerns they highlight are as follows:

Feedback does not automatically lead to an improvement in student performance.

Students are not experts in their area, and therefore the quality of the feedback is variable at best.

Students doubt their own knowledge, and that of their peers, and therefore significant doubt about the reliability and validity of the feedback exists.

Students’ emotional state, linked to complex social relationships and psychological safety (extent to which they feel psychologically secure in those relationships), can affect the outcome of PA.

Students feel that assessment is the role of the teacher.

The perceived actual ability of the peer assessor has a significant impact on both the acceptance of feedback and its application to future student work.

These concerns strike a particular chord with me as some of them were alluded to by students in this specific study. Strijbos et al. (2010) summarise nicely the idea that students ‘value peer feedback, but prefer teacher feedback because they consider them more competent’ (p. 293), something also seen in this study.

The role of feedback in Peer Assessment

The main focus of the article by Strijbos et al. (2010) is whether the nature and extent of feedback during PA has a tangible impact on the benefits of PA for learning. The study makes what seem to be very obvious assumptions in commenting that ‘feedback from a person with a high level of expertise is assumed to be more positive than from a person with low expertise’ and that ‘satisfaction with feedback is influenced by the perception of the source’s familiarity with the work unit: feedback from a person with low familiarity was perceived as less positive than from a person with high familiarity’ (Strijbos et al., 2010: 293). However, their results provide a rather perplexing outcome in terms of student performance. In their study, the group of students whose peer assessors had low familiarity with the task outperformed the group whose assessors had high familiarity, and the group with concise general feedback outperformed the group given detailed and elaborate feedback. Furthermore, there was no correlation between either of these factors and students’ perceptions of PA. The study’s conclusions leave the reader with little in the way of useful information when attempting to ascertain the potential benefits of PA.

The role of psychological safety in Peer Assessment

Van Gennip et al. (2010) provide a fascinating insight into this area through an article which facilitated a more academic context in which to place collected responses from students regarding social dynamics in the PA process. Van Gennip et al. (2010) also recognised that previously there had been a gap in the literature in relation to the role of complex relationships, what is referred to as ‘psychological safety’: the extent to which individuals feel secure in their relationships with peers. Falchikov and Goldfinch (2000) and Topping (1998) had recognised that research in this area needed to consider perceptions of peers in the process. They were in agreement that the phenomenon of ‘friendship marking’ (Van Gennip et al., 2010: 282), where peers would tend to award good marks regardless of the quality of the work, was much more likely to take place if certain interpersonal values were not in place. These included ‘a sense of confidence that the team will not embarrass, reject, or punish someone for speaking up’ (Edmonson, in Van Gennip et al., 2010: 282).

The present study

The lack of clear evidence in the literature created some specific obstacles relating to this study, which was concerned with the extent to which social pressures and group dynamics had an influence on the peer assessor’s feedback and grade, as well as preconceived ideas of the academic ability of both the assessor and assessed. It also hypothesised that the majority of perceived benefits accrued to the assessor.

Research design

The main research method for this case study was an in-class survey with one class of 18 students, followed by targeted interviews. I was interested to explore specifically the perceptions of this small group, rather than those of all ISL classes where PA is used or of all students studying IB DP Economics. The choice of my own IB Diploma Economics class arose from a desire to critically examine and thus improve my own practice. The issues arising from familiarity between students and researcher would, I felt, be balanced by the in-depth knowledge of individuals.

Students undertook PA using the same criteria on four occasions over the course of a calendar year. After each of the four commentaries had been peer assessed, students’ views were collected via a questionnaire. The purpose of the questionnaire was to consider students’ opinions, ideas and thoughts throughout the PA process, particularly relating to their perceptions of its effectiveness and social dynamics. The questionnaire was followed by in-depth interviews with a subgroup of four students. In such situations, questionnaire respondents would often be guaranteed anonymity. Cohen et al. (2007) argue that anonymity leads to greater honesty. This may be true, but for this study I wished to investigate students’ views in the context of their academic ability and the relationship this might have with the perceived value of PA. Anonymity therefore needed to be sacrificed, since only if the author of each questionnaire response was known could I match their response with my prior knowledge of each student. Removing anonymity creates interesting issues. However, the clear explanations provided to these students, together with the fact that they had been taught by me for at least 18 months and were used to giving me work each week to ‘judge’, reduced the concern that might sometimes arise in such situations. Naturally, in this article the students’ anonymity has been maintained by removing real names and other identifying characteristics in line with standard ethical guidelines.

Although the PA grades arising from this study did not count towards any class credit, I expected that the input students received could be taken on board and lead to improvements in their work which would have a direct impact on their final IB Diploma Economics grade. Using Topping’s (2009) typology of PA, the study could be summarised as a 2-year study, qualitative, in-class, compulsory (in that students were required to participate), mutual PA system of written commentaries that does not directly contribute to official grades, with no anonymity.

Findings

Student attitudes to Peer Assessment

The main areas considered in this study relate to the value students place upon the PA process, including an understanding of its purpose and an exploration of the suspicion that PA holds at least equal value for the peer assessor as it does for the person being assessed.

Value of the process for students, including an understanding of its purpose

Results of responses, based on a Likert scale, are summarised in the following tables. The results shown in Table 1 are noteworthy in that overall only 11 of the 18 students agreed or strongly agreed that PA was valuable for the person whose work is being assessed. Partly for triangulation purposes but also to address the issue of opportunity costs in terms of lost time in an examination-driven, content-heavy programme, a further question was asked. The results of this further question are consistent with this first set of responses as shown in Table 2.

Table 1.

Student views as to whether PA is valuable in the IB Diploma.

PA is valuable within the context of the IB Diploma
	Strongly agree	Agree	No opinion	Disagree	Strongly disagree
PA is valuable for a person whose work is being assessed by a peer	6	5	2	3	2
PA is valuable for a person who is assessing a peer	8	5	1	2	2

Table 2.

Student views as to whether time should be dedicated to the PA of internal assessment.

Should time be dedicated to the PA of IA in IB Diploma Economics?
Yes	No	Unsure
11	2	5

Within the context of the IB Diploma, a content-driven, 2-year examination-based course, some students felt that one must be very careful not to devote too much time to PA, although a majority clearly were in favour of its use. It is worth remembering here that the survey was undertaken at the end of four separate pieces of internal assessment being peer assessed, and it seemed that some PA fatigue was beginning to set in towards the fourth commentary. A comment from the last survey seemed to summarise a feeling of some students that PA ‘should be used at the beginning of the course only’. Another comment read that ‘the process becomes repetitive’. The opportunity cost in terms of lost ‘syllabus time’ was also confirmed by a small minority of comments within the surveys and interviews, such as ‘it can be useful but I would prefer to get along with the syllabus’, and ‘it is wasting precious learning time’. This point was then elaborated upon further in an interview by one of the less enthusiastic PA students, who said ‘It can be distracting, as there is so much content that even though it does not take too much time, it can make the course feel rushed’. However, the majority of comments were positive, and the five students who were unsure whether time should be dedicated to PA gave comments suggesting that ‘it depends’, with a common dependent variable being who was their peer assessor and how often the PA was undertaken. Overall, these results coincided with suggestions in the literature that PA is considered to be a valuable tool by most students (Black, 1998, 2004; Sadler, 1998; Topping, 1998, 2009).

A belief that PA held equal or more value for the peer assessor than for the person being assessed

In questionnaires and subsequent interviews, a clear message to come through was that although PA could be seen as valuable for both the assessor and the person being assessed, the main beneficiary was the person providing the feedback. This was indicated not only by the previous question responses outlined above, but also by responses to a more direct question summarised in Table 3.

Table 3.

Student views as to whom the process is more valuable.

For whom was the process more valuable?
The assessor	The assessed
16	2

This is quite an astonishing result, and something not anticipated after reviewing the literature, which suggests only that PA may be considered of ‘equal value’ to both groups. There are some specific aspects of the nature of PA in this context that may have led to this result, and these will be discussed later. It does raise a number of questions. Returning to the earlier review of the benefits of PA, and Topping’s (2009) assertion that ‘the overriding goal of peer assessment is to provide feedback to learners’, it would seem this was not the perception of the students who were the focus of this particular study. Though neither Topping (2009) nor Fry (1990) was blind to the benefits to the assessor (as noted earlier), the strength of view from students regarding the principal beneficiary does not concur with the general view of the literature. It must be noted that increased familiarity with the standard internal assessment criteria and an associated in-depth understanding of their application was an area where students saw clear benefits. Students becoming more familiar with the criteria, and then applying them to their own commentaries, became an advantage. This could also have been partly influenced by my own role in the class, and my repeated reminders of the need to be really aware of the criteria in order to be successful in meeting them. Fairly typical comments from the students in response to the question ‘What do you find most valuable about the PA process and why?’ were ‘Helps me understand the criteria better’, ‘Understanding the criteria better as I read them more times to grade the other students’ work’ and

I felt that the most valuable thing I learned from the peer assessment process was getting some useful ideas for my own internal assessment. Things like how they structured it, the diagrams they used, and the Economics that was contained in the internal assessment could be useful for my own internal assessment.

In fact, of the 18 responses to this open-ended question, only two students considered there to be any value to having a peer assess their work. One described a benefit to be ‘To get feedback from people other than my teachers because I get different suggestions of progression’ and another, more altruistic spirit, looked at the benefit for the whole class in considering ‘The fact that people are told about what areas they need to improve in and also you are able to give in a good first draft if it is peer assessed before handing in the draft’.

This unexpected outcome was explored during the subsequent interviews. When pressed on this issue, one student responded:

When another student provides me with feedback, I take some of it on board, but I know that they may also just be wrong. If the teacher does it, I trust them. So because I don’t really believe a student can make a true judgement, I only see the value to me of looking through another student’s work and thinking of my own.

Although it is just one comment, it is an enlightening one, as it points to the idea of trust and Topping’s (2009) critical success factors. Without trust in the other student’s ability to make a solid judgement about one’s own work, students tended to disregard the input, or at least some of it, although they could clearly see the benefits to themselves of applying specific criteria to someone else’s work, and then coming back to reflect on their own, using the same criteria.

As this result was somewhat unexpected, the questionnaire had not included any questions relating to reasons why students may not have valued other student input highly, or as highly as they did that of the teachers. In each of the follow-up interviews, however, the question was asked: ‘What could I have done as the teacher to improve the quality of the PA process?’ All four students interviewed responded regarding social dynamics. This will be discussed later, but it is of interest here that two students also commented on the lack of training for PA before the process started. One commented that ‘An explanation of the criteria in more detail would have helped’, and the other that ‘I was quite unsure what to do as I did not understand exactly what we were looking for in each criteria’. I was not surprised to hear these comments as, on reflection, PA was introduced to the group before sufficient consideration had been given to how it should be introduced; a closer prior review would have informed this.

Social dynamics and peer assessment

This was the most complex aspect of PA to explore as it involved engaging with people’s views, ideas and feelings about complex social interactions. The specific interest in this area was to explore my hypotheses that students ‘friendship marked’ for fear of hurting the feelings of peers whom they perceived to have produced low-quality work, and that advice from peers would not be held to be valuable, compared to that from the teacher (regarded by students as the expert).

Student ‘friendship marking’

In an attempt to ascertain whether students ‘friendship marked’, the survey had one clear question, as shown in Table 4.

Table 4.

Student views about assessing work of a friend.

If you had to return a piece of work which was very low quality, and it was to a friend, would you be:
Exactly accurate and put what you thought?	Slightly more generous?
17	1

With hindsight, this was a poor question, as it asked individuals to admit to being influenced in their assessment by social dynamics. In my experience, the majority of 17-year-olds are reluctant to admit to being influenced by peer pressure, and they seem even less likely to have admitted it when they knew that for PA to be valid as a process, they would have needed to remain impartial. Some comments from the interview were ‘if I felt that they had done poorly, I would give them a low mark’, and ‘I would definitely give them the mark they deserved even though it might be very low’. However, one of the participants then went on to say that if they were choosing between two specific levels within a criterion, they would always err on the positive side. When asked whether they thought other students might do this there was little disagreement, with all students saying they thought it unlikely, but possible. When asked in the interview why someone might do this, all students responded that it would be to avoid hurting the feelings of the weak student. Two of the three girls interviewed took time to explain that they had erred on the positive side but had couched the feedback ‘very kindly’.

It became evident here that students were clear that they would give the mark they thought the work deserved regardless of whose the work was, even if it was that of a close friend. Considering the way in which the question was framed, this is perhaps not surprising.

Preconceptions of student ability and marks awarded

Although students felt they would give a piece of work the grade it deserved, they responded slightly differently to three questions designed to elicit responses linked to whether preconception of a student’s ability had an influence on the grade they would award to that student’s work. Table 5 summarises responses to the first of these questions.

Table 5.

Student views about standard of work of peers.

Was the internal assessment the standard you were expecting when you saw the name of the person it belonged to (marked out of 20)? Please expand upon your answer.
Better	At the same level	Worse
1	15	2

This is quite a subtle question. It may have been that students simply had a clear understanding of the ability level of the student whose work they were assessing, or that their preconception about the standard of work they were going to assess meant that they only found what they were looking for. Although 15 out of 18 finding what they expected is a high proportion, it does not necessarily follow that students were influenced by preconceived ideas. It does, however, hint at this possibility, particularly when responses to the next two questions are considered. The first was open-ended as follows:

When you received the piece of work to be peer assessed and saw whose it was, did you expect anything in terms of quality?

Responses provided some valuable insights, including ‘I had x, who is a massive slacker’, ‘I was marking x’s and so I knew it would be almost perfect as she is super smart’ and perhaps the most notable: ‘I cringed when I saw whose I had to mark because I can’t understand a word they say and I knew it was going to be painful’. Almost all students made some comment about what they had expected. Only one wrote ‘I did not really know much about x so just got started’. The insight provided by responses to this question was expanded upon in the follow-up interviews. One socially challenged student perceived that she had been a victim of the preconceptions of others. In interview, she said, ‘Someone was marking my work and they thought it was someone else’s. Had they known it was my work they would have given me a lower score’. When questioned on why she thought that this was the case, she replied ‘because of the relationship I have with that person. It is pretty much a hate kind of thing and they thought they were marking a friend’s’.

In each of the four interviews, participants believed that they would not be influenced by knowing whose work was being assessed, yet all four believed that other students would be so influenced. Two also went on to contradict themselves in further questioning; one student, for example, said ‘I knew x would have done it at the last minute so kind of expected it to be bad’. This does not provide evidence that students were influenced in PA by their preconceived ideas, but it does suggest that students may not be entirely objective about the individual whose work is being peer assessed.

A final question, relevant to both preconceptions and friendship marking, concerned the issue of anonymity (see Table 6). Did students feel that anonymity of the work being assessed would lead to a preferable outcome?

Table 6.

Student views as to effect of anonymity.

How would the process change if students’ work was made anonymous?
It would improve the process	It would devalue the process	It would have no effect on the process
14	2	2

The surprisingly high number of students who believed that anonymity would improve the process (14 of 18) was worthy of further consideration. Comments related to this question provided further indications that students are indeed influenced by the identity of those whose work they are peer assessing. Some interesting comments included:

People would try to do their best as they don’t know whether it belongs to a close friend of him/her and thus not want to risk giving him a bad assessed piece of work.

It doesn’t really affect me … it probably does but I don’t want to admit it does, I try to remain objective.

I don’t think peer assessment is a good idea. Anonymous peer assessment may be, but to certain students there is pride and shame involved.

It’s helpful knowing who is assessing your work and who you are assessing because you know their strengths and weaknesses.

Perhaps the most interesting response was ‘Personally I did not feel comfortable giving a mark lower than what I gave, even though it was deserved’, which provided clear evidence of friendship marking despite responses to an earlier direct question suggesting that this was not an issue.

Another student argued that anonymity should not have been an issue:

Personally, I think that we judge work by what we have written in front of us, independent from the name at the top of the page. If the IA is good, it is good and it does not depend on the person.

This response was, however, an exception to the belief, apparently widely held among students, that anonymity was better. That belief lends support to the hypothesis that students are influenced by knowing whose work they are marking and by the associated complex social relationships. It is worth noting here that other studies have also highlighted issues relating to what might be described as ‘peer embarrassment’. Falchikov (1995), when discussing student perceptions analysed in a study set in higher education, noted that this was an issue ‘prone to be a more of a problem in small groups of long standing than in large newly constituted groups’ (p. 253). The fact that this set of IB Diploma students was made up of small and long-standing groups may thus explain this particular outcome.

An interesting side question concerned exactly why students may have been influenced by knowing whose work they were assessing. Although this is particularly difficult to ascertain and largely outside the scope of this study, some hints were provided in responses to the question of what could be done to improve the process. This question was asked in the interviews and elicited some informative responses, including the following four:

Giving the work of those who work hard and understand Economics to others who also care will put the work into properly evaluating the assessment. I know students who are not very successful often benefit from the help of more successful students, but I don’t believe that those who do not care about their success actually exert effort into evaluating a peer’s piece of work. Therefore, no benefit is derived by a good student when a poor student evaluates his or her assessment. … [P]oor students should evaluate other poor students’ work.

Maybe we can practice marking more often because I don’t really feel comfortable giving people marks when I don’t have a lot of experience doing it.

I think that a way to improve the quality of peer assessment would be to just have a short session on what things to look out for. Reading the criteria is helpful, but maybe just 10 minutes of going through an example as a class would show everyone what they’re looking for.

Maybe give more specific requirements other than the criteria. It is vague.

All four interviewees reported not being fully confident in their own judgements, and it was also clear that none of them believed the peer judgements to be as valid as those of the experts (teachers).

Advice from peers would not be held to be valuable, compared to the teacher (expert)

It was noted earlier that students in this study were not confident in their ability to make judgements based upon the specific IB Diploma Economics criteria. It was also clear that they felt the expert advice was more valuable. The idea that assessment is the responsibility of the teacher is noted in much of the literature as being a common belief at the start of the PA process (Topping, 1998, 2009; Van Gennip et al., 2010), although most authors note a reduction in this view with familiarity. That was not the case in this study. Relevant responses in the questionnaire include the following: ‘peer assessment is valuable but it is your (the teacher’s) marks which count’, and ‘I like peer assessment but only if the teacher then re-marks the work’. Remembering that these responses were provided at the end of four instances of peer assessment over an 11-month period, some interesting issues are raised that challenge the claims of Black et al. (2003) that students are more likely to accept feedback from peers than from teachers. This may be the case in certain areas, but in this particular context a lack of genuine understanding of the exact requirements of the IB Diploma criteria for internal assessment held students back both from being able to assess peers’ work with confidence and from taking on board a peer’s advice and judgement. A repeated theme in comments from students was a lack of understanding of the criteria or that the criteria were ‘vague’. Apart from this lack of understanding of the criteria, it is difficult to suggest reasons why findings here should have been different from those of other studies.

Conclusion

This study set out to address students’ views regarding the validity of peer assessment at the International School of Lausanne, whether they believed complex social dynamics had an influence on marks awarded and the value of peer assessment, and whether it was the peer assessor or the peer assessed who benefited more from the process. Overall, attitudes to peer assessment within the study were positive, although not overwhelmingly so, and some students, while seeing the benefit, had reservations regarding the curriculum time lost to peer assessing. The wide range of perceived benefits anticipated by Sadler (1998) and Topping (1998, 2009) did not materialise in this study, with the most surprising outcome being that students found the major benefits to accrue to the person acting as the assessor. This perception may be linked to the repeated use of the same IB published criteria rather than student-developed and jointly agreed upon criteria. It also needs to be noted that, as Falchikov (1995) highlights, establishing good-quality peer assessment requires time for organisation, training and monitoring – something that is lacking in this study.

Responses from students provided some real insight into social dynamics and friendship marking, and despite their protestations to the contrary, there is evidence that at least some students felt they had been the victim of preconceptions or bias. Given the clear preference of the majority of students for anonymity, a complex and not yet thoroughly explored set of variables needs to be considered. This concurs with Falchikov’s (1995) findings that social relationships are more likely to affect the implementation of peer assessment in smaller, more established groups, exactly like this one.

As a result of undertaking this study, I plan in future to continue to use peer assessment but to spend longer introducing it to students, and to provide much more guidance and training on how criteria should be applied and what to look for. In such a small group, where the students know each other very well and may have complex relationships, the extent of dynamics unknown to the teacher is so great that it is simply safer and probably more effective to remove the possibility of knowing who the peer assessor is and whose work is being assessed until this dimension is better understood. Peer assessment can clearly have benefits, but in such contexts it may well be that anonymous peer assessment will be more likely to lead to outcomes in which all involved can have confidence.

Author biography

Simon Foley has taught the International Baccalaureate (IB) Diploma since 1997, first at Bangkok Patana School in Thailand and then at the United World College of the Adriatic in Italy. Since 2005, he has been at the International School of Lausanne in Switzerland, where he is Assistant Principal–IB Diploma Coordinator. A former principal moderator for IB Diploma Economics, he is an IB Economics workshop leader and school authorisation/verification team member. He recently completed his MA in International Education at the University of Bath.

References

Black (1998) Testing: Friend or Foe? London: Falmer Press.

Black, Wiliam (1998a) Assessment and classroom learning. Assessment in Education: Principles, Policy & Practice 5(1): 7–74.

Black, Wiliam (1998b) Inside the Black Box: Raising Standards through Classroom Assessment. London: School of Education, King’s College London.

Black, Harrison, Lee (2003) Assessment for Learning: Putting It into Practice. Maidenhead: Open University Press.

Black, Harrison, Lee (2004) Working inside the black box: assessment for learning in the classroom. Phi Delta Kappan 86(1): 9–21.

Cho, MacArthur (2010) Student revision with peer and expert reviewing. Learning and Instruction 20(4): 328–338.

Cohen, Manion, Morrison (2007) Research Methods in Education, 6th edn. Abingdon: Routledge.

Edmonson (1999) Psychological safety and learning behavior in work teams. Administrative Science Quarterly 44: 350–383. In: Van Gennip NAE, Segers MSR, Tillema (2010) Peer assessment as a collaborative learning activity: the role of interpersonal factors and conceptions. Learning and Instruction 20(4): 280–290.

Falchikov (1995) Peer feedback marking – developing peer assessment. In: Topping (1998) Peer assessment between students in colleges and universities. Review of Educational Research 68(3): 249–276.

Falchikov, Goldfinch (2000) Student peer assessment in higher education: a meta analysis comparing peer and teacher marks. Review of Educational Research 70(3): 287–322.

Fry (1990) Implementation and evaluation of peer marking in higher education. In: Topping (1998) Peer assessment between students in colleges and universities. Review of Educational Research 68(3): 249–276.

Gaillet (1992) A foreshadowing of modern theories and practices of collaborative learning: the work of the Scottish rhetorician George Jardine. In: Topping (2009) Peer assessment. Theory into Practice 48(1): 20–27.

Gielen, Peeters, Dochy (2010) Improving the effectiveness of peer feedback for learning. Learning and Instruction 20(4): 304–315.

International Baccalaureate (IB) (2004) Diploma programme assessment: principles and practice. Available at: http://www.ibo.org/diploma/assessment/documents/DPAssessmentPrinciplesandPractice.pdf (accessed 12 July 2009).

International Baccalaureate (IB) (2013) http://www.ibo.org (accessed 29 September 2013).

Johnson (2004) Peer assessments in physical education. Journal of Physical Education, Recreation and Dance 75(8): 33–41.

Kollar, Fischer (2010) Commentary: Peer Assessment as collaborative learning: a cognitive perspective. Learning and Instruction 20(4): 344–348.

Noonan, Duncan (2005) Peer and Self-assessment in the High Schools. Practical Assessment Research and Evaluation 10(17): 1–8.

Sadler (1998) Formative assessment: revisiting the territory. Assessment in Education 5(1): 77–84.

Sadler, Good (2006) The impact of peer-grading on student learning. Educational Assessment 11: 1–31.

Strijbos, Narciss, Dünnebier (2010) Peer feedback content and sender’s competence level in academic writing revision tasks: are they critical for feedback perceptions and efficiency? Learning and Instruction 20(4): 291–303.

Strijbos, Sluijsmans DMA (2010) Unravelling peer assessment: methodological, functional and conceptual developments. Learning and Instruction 20(4): 265–269.

Topping (1998) Peer assessment between students in colleges and universities. Review of Educational Research 68(3): 249–276.

Topping (2009) Peer assessment. Theory into Practice 48(1): 20–27.

Van Gennip NAE, Segers MSR, Tillema (2010) Peer assessment as a collaborative learning activity: the role of interpersonal factors and conceptions. Learning and Instruction 20(4): 280–290.

Van Steendam, Rijlaarsdam, Sercu (2010) The effect of instruction type and dyadic or individual emulation on the quality of higher-order peer feedback in EFL. Learning and Instruction 20(4): 316–327.

Van Zundert, Sluijsmans DMA, Van Merriënboer JJG (2010) Effective peer assessment processes: research findings and future directions. Learning and Instruction 20(4): 270–279.