Abstract
Gaining insights from domain experts into how they view communication in real world settings is recognized as an important authenticity consideration in the development of criteria to assess language proficiency for specific academic or occupational purposes. These “indigenous” criteria represent an articulation of the test construct and should therefore reflect what is germane to the particular domain of language use rather than general language-focused criteria familiar from other language tests. The methodological question of how to elicit such insights is, however, complex and has been addressed by various researchers using different methodological and theoretical frameworks.
The paper draws on data from a larger research project to explore the affordances and constraints of more or less direct approaches to eliciting domain experts’ perspectives on what matters for effective communication in the workplace. The domain experts in this case were physiotherapy educators and supervisors. The study offers a qualitative comparison of expert feedback gathered from three different sites. Two were in the workplace where the communication skills of physiotherapy students in training were assessed routinely and the feedback given to them was naturally occurring rather than elicited. The third was a more artificial workshop setting in which video-recorded interactions between students and patients or simulated patients (i.e., actors role-playing a patient) were shown to two groups of expert informants who were then asked by the researcher to comment on the strengths and weaknesses of each performance.
A qualitative analysis revealed that the nature of expert feedback differed significantly at each site, with the routinely occurring feedback containing scant and vague reference to language and communication aspects. The workshop setting, although it was less authentic, yielded much richer insights into the physiotherapists’ views about workplace communication. The implications of our findings for the development of relevant language test criteria are considered.
Introduction and literature review
The fields of languages for specific purposes (LSP) and the testing of language for specific purposes are by their very nature interdisciplinary. Conceptualizing the specific context of test use, whether professional or academic, requires collaboration between content experts who have unique understandings of the context of interest, and applied linguists, who need to interpret these understandings within their own frame of reference for language teaching or testing purposes. The need for such interdisciplinary collaboration has been recognized from early in the conceptual development of LSP, for example in Selinker (1979), and in the work of Douglas and Selinker (1985) on applying the theory of discourse domains to language testing, and more recently in the collection of papers in Long (2005) on ways in which applied linguists and content experts may collaborate in defining the domain. There are a number of difficulties in this enterprise: in the words of Basturkmen and Elder (2004, p. 677), “the problem of reconciling the different perspectives of language and non-language professionals is a perennial concern for LSP practitioners.”
One issue for which this reconciling of different perspectives is particularly important is determining the criteria by which performance in specific language domains is to be evaluated. These criteria represent an articulation of the test construct and should therefore reflect what is germane to the particular professional or academic context rather than general language-focused criteria familiar from other language tests. Research (e.g., Brown, 1995; Elder, 1993; Plough, Briggs, & Van Bonn, 2010) has shown that applied linguists and content experts may not always be oriented to the same criteria in evaluations of oral performance in specific purpose contexts.
In order to address this issue, Jacoby and McNamara (1999) proposed the use of criteria indigenous to the specific communicative context within specific purpose language tests. Jacoby first developed the idea of ‘indigenous assessment’ in her PhD study of feedback on rehearsals of conference presentations by members of a research team of physicists (Jacoby, 1998). The presentations were by a mixed group of native and non-native speakers, including novices and veterans. A lengthy and detailed ethnography, and close Conversation Analysis of stretches of discourse, formed the basis for Jacoby’s conceptualization of criteria indigenous to this particular communicative context. These criteria were activity specific (i.e., concerned in this case with issues related to the particular task of presenting a multi-media report to a live audience) and tacitly known to insiders in the group. Sustained engagement with the context was required to render them meaningful for and accessible to an outsider (the applied linguist). This paper addresses this question of access from a methodological perspective: how can localized indigenous criteria be accessed in a manner that is both authentic to the context of concern and at the same time interpretable by applied linguists? Researchers have tackled this question of how to access authentic data from content experts in various ways.
Erdősy (2005, 2009), following Jacoby’s ethnographic approach, produced a thick description of a Canadian university academic’s practice of framing, grading, and giving feedback on student essays written by a group of 12 native and non-native undergraduates taking his course on Chinese history. To do so, he used a case study methodology, involving grounded theory, longitudinal observation and triangulation of multiple data sources (e.g., class discussions, conversations after class at the lectern, verbal protocols gathered during the grading process, annotated assessments of students’ written assignments and interviews). Through a meticulous process of cross-referencing these data sources, Erdősy established fundamental principles behind the professor’s scoring criteria for test answers. He, like Jacoby, emphasizes the context-specificity of these criteria, which can only be understood with reference to substantive content within the specific context of the professor’s course.
Abdul Raof (2011), departing from the ethnographic approach, investigated the criteria oriented to by Malaysian civil engineers in evaluating conference presentations by means of semi-structured interviews taking place after the informants had viewed such presentations. The post hoc elicitation of these criteria raises doubts as to whether what the informants said in the presence of the interviewer reflected the actual basis for the decisions made as they watched the presentations.
Although all the above studies have used naturally occurring events as the stimulus for investigating domain experts’ indigenous assessment criteria, only Jacoby restricted herself to these; the latter two studies supplemented this evidence through elicitation techniques, such as verbal protocols (Erdősy) and semi-structured interviews (Abdul Raof). Other studies have relied more heavily, or exclusively, on less direct methods. Douglas and Myers (2000) sought to identify appropriate criteria for assessing the communication skills of veterinary students by inviting three groups of informants (veterinary students, veterinary professionals, and applied linguists) to comment on role-play interviews between veterinary students and simulated clients. While these role-plays were conducted live as a routine part of students’ communication training, the samples used for the research were video-recorded and the commentary was elicited for the research rather than for formative assessment purposes. Likewise, Kim (2013), in an attempt to identify domain experts’ perspectives on what mattered for effective communication in the aviation airspace, convened focus groups of aviation personnel to comment on audio-recorded episodes of actual radiotelephony discourse between pilots and air-traffic controllers. Her findings were used to interrogate the construct validity of the International Civil Aviation Organization (ICAO) policy and associated test of aviation English in Korea. Again, while the discourse stimuli provided for her participants were sampled from recordings of actual interactions in the aviation airspace, the expert commentary was elicited retrospectively rather than naturally occurring.
Sato (2014) explored the criteria oriented to by linguistic laypersons (those without a background in language study or applied linguistics) in evaluating the spoken language performance of candidates of varying levels of proficiency on major international standardized tests of spoken proficiency, using an analysis of the discourse produced in think-aloud discussions following viewing of the test performances on videotape. Although Sato’s work is not a study of assessment criteria indigenous to a particular context of use, but rather of performance on general proficiency tests, it is interesting for its choice of methodology. The verbal protocols elicited from lay informants as they were rating each performance were thematically coded and used to identify what values or criteria underpinned their judgements. While Sato argues that rating scales should be informed by the criteria important to lay persons, the process of eliciting such criteria in relation to performances in artificial test conditions may have affected the findings.
Fulcher, Davidson, and Kemp (2011) sought to develop a rating scale for judging the successfulness of service encounters based on current theories and empirical descriptions which capture the richness of actual performance in such contexts. However, their data, drawn both from the research literature and from original discourse data, are largely based on native speaker interactions, with the assumption that these represent competent exchanges. The resultant rating categories, while derived from a close empirical analysis, cannot strictly be defined as indigenous, since they do not directly capture the perspectives of insiders on what matters for effective performance.
It can be seen from the above studies that while the idea of indigenous assessment criteria is attractive and has been invoked by a number of different researchers in our field, its actualization is in fact complex and challenging. Only in Jacoby’s (1998) study were the indigenous assessment criteria construed by a participant observer from the actual performances of assessors in a setting that occurred routinely, and without recourse to reflections of informants or other indirect means. The subsequent studies by other researchers, while sometimes involving ethnographic methods and careful discourse analysis, have tended to rely instead, or in addition, on a variety of less direct methods, such as the use of simulations, and interventions such as interviews, focus groups and think-alouds, or recourse to the literature. These methods may well be the only practical option in many cases. The question arises as to how much is lost, gained, or altered by the use of these indirect methods. The present study considers this question by comparing three methods of investigating indigenous assessment criteria, of varying degrees of authenticity, in the context of the evaluation of clinical communication with patients by health professionals. Authenticity in this context means the degree to which the criteria were observed in implementation in routinely occurring assessment practices rather than in activities which did not occur routinely in the work setting but had been specifically designed for the study.
Research context
The data discussed in this paper are drawn from a larger Australian Research Council Linkage project (Elder et al., 2013) designed to interrogate current language proficiency standards for non-native English speaking health professionals applying to practise their profession in Australia. The study’s aim was to investigate the indigenous assessment practices of health professionals when evaluating the clinical communication skills of trainee clinicians. These insights were used to determine the extent to which the criteria oriented to in such practices were aligned with current linguistic criteria used to assess speaking performance on the Occupational English Test (OET), a specific purpose test of language proficiency for healthcare. The OET, as noted in Elder (this issue), is recognized by 12 different health professions in Australia as one means of establishing whether an applicant’s language proficiency is adequate for entry to clinical retraining programmes prior to professional registration. Here the focus is specifically on physiotherapy, one of the three health professions included in the larger study. (The data from nursing and medicine are presented in Pill, this issue.) Data collection was conducted in two phases, with each phase differing in its methodology for studying such practices, and in degree of directness or authenticity, as defined above. In the initial workshop phase, physiotherapy educators from a large university in Melbourne, Australia, gave feedback on video recordings of students interacting with patients or simulated patients in a clinical setting. The feedback took the form of commentary on what the educators considered to be effective and ineffective aspects of performance by the trainee in the video stimulus. 
The subsequent phase, the hospital phase, was conducted in a metropolitan teaching hospital in Melbourne and focused on the routinely occurring spontaneous feedback given by physiotherapy clinical supervisors to individual students while undertaking the clinical component of their university degree. The feedback was given twice: once for formative purposes during or immediately after their interactions with patients, and later, for both formative and summative purposes, on completion of their clinical practicum. In both phases the aim was to capture the physiotherapists’ “indigenous criteria” relating to communication, that is, what they valued about communication in healthcare. The differences in the approaches taken – commentary in the workshop phase and the immediate and delayed feedback in the hospital phase – allowed us to consider the relative authenticity (or inauthenticity) of the data gathered in each case and how the approach might have affected the nature of the feedback. It also allowed a comparison between the kinds of insights that could be gained from each context.
Ethics approval was granted by the relevant Human Ethics Advisory group at the University of Melbourne for the workshop phase of the study and, for the hospital phase, by the Austin Health Non-Drug Study Ethics Committee. Further details on each phase are provided under “Methodology” below.
Research questions
The aim of the paper, as already noted, is primarily methodological: to consider the constraints and affordances of each dataset as a means of accessing the views of health professionals on what constitutes successful communication in the clinical context. The research question is therefore:
What is the relative value of different possible data sources for capturing indigenous communication criteria from physiotherapists evaluating the performance of trainees in interaction with patients?
Two sub-questions guiding the analysis and presentation of results are as follows:
Is there evidence of authenticity constraints in the datasets from each phase?
How do the datasets differ with regard to the insights they offer into the construct of effective healthcare communication?
Methodology
The workshop phase: Educators’ feedback on video-recorded stimuli
Two workshops of one hour’s duration were held: one in the department of physiotherapy at the selected university in Melbourne and the other in the associated metropolitan teaching hospital. Twelve participants were recruited by email for the study. All were clinical supervisors or academics affiliated with either the university or the hospital and had at least five years’ experience of clinical practice. Seven attended the university-based workshop and five attended the hospital-based workshop.
The video stimuli used to trigger participant feedback involved three physiotherapy students (one of whom was a non-native English speaker (NNES)) in the final year of their physiotherapy degree interacting with patients or simulated patients in the hospital setting for a period of 4–8 minutes. Two stimuli were played at each workshop. The participants and focus of each interaction are shown in Table 1.
Table 1. Workshop stimuli.
Notes: ESB = English-speaking background; non-ESB = non-English-speaking background. Table footnotes mark a simulated patient (actor) and a stimulus used at both workshops.
Workshop participants were asked by a member of the research team acting as facilitator to consider what aspects of the viewed performance they might comment on in a post-observation feedback session with the student and were given a simple worksheet with two columns headed Stronger/Weaker aspects of performance on which to take notes. These headings were deliberately general, as the researchers did not want to skew the participants’ feedback towards communication issues if these were not indeed the features of interaction that concerned them. After observing each interaction the participants were invited by the facilitator one at a time to report their comments to the group. The workshop ended with a general discussion amongst participants. Dataset 1, the dataset for the workshop phase, comprises both the individual commentaries and the ensuing general discussion.
The hospital phase: Observation and feedback in the clinical setting
Two further datasets were generated in the hospital phase during routinely occurring interactions between 14 supervisors and 16 trainee students. All but two of the students were native English speakers (NES). The two NNES were from South-east Asia and had high levels of fluency.
In one of these hospital datasets (Dataset 2), 11 supervisors observed and provided feedback to physiotherapy students on their interactions with physiotherapy patients while on clinical placement in the final year of their Bachelor course.¹ Supervisor feedback was given to the students either individually or in small groups at different times during the students’ four-week clinical placement in a variety of settings including the neurology and cardiothoracic wards, the outpatient clinic, the rehabilitation clinic and the hospital gymnasium. Feedback usually occurred shortly after the student had seen a patient, but sometimes during the course of the clinical encounter with the patient present. Audio-recordings of the feedback were made via a lapel microphone attached to the supervisor, without the researcher being present. The other dataset (Dataset 3) consisted of audio recordings of end-of-term feedback sessions in which one to three supervisors, using a pro-forma marking sheet (see Appendix A), gave individualized feedback on student performance throughout their placement in the intensive care unit (four students) and the cardiothoracic and neurology wards (three students). The feedback was given inside a private office in the hospital, again without the researcher present. The feedback addresses the four overarching categories of performance on the marking sheet: Assessment, which is about history-taking; Analysis, based on the assessment findings; Action, which involves selecting and implementing appropriate forms of intervention; and Planning and Education, which is about monitoring the success of the intervention and giving advice to the patient.
Communication (with the patient and professional peers) is mentioned explicitly under the subcategory Affective, which is assessed three times on a scale of 0–5 as part of the Assessment, Action and Planning and Education categories respectively, giving it an overall weighting of 15/75 or 20% of the total score for the clinical placement.
The three datasets used for the research thus differ significantly in terms of their naturalness in relation to the work place setting as summarized in Table 2.
Table 2. Comparison of datasets.
Analysis
After the workshops, the feedback sheets were collected and the individual commentary and group discussion were transcribed. The transcripts were analysed qualitatively to identify salient themes emerging from participant feedback, which were coded and cross-checked according to procedures developed and described by Pill (this issue). Participants’ notes were not coded as it was found that the audio transcripts yielded a fuller record for analysis. Two overarching themes were identified in the Physiotherapy workshop data: Generic communication skills and Clinical skills for information gathering and management. Each of these themes was divided into various subthemes (see Woodward-Kron et al., 2012, pp. 9–13, for further detail on these categories, since only those relating to communication are treated here). In addition, the data were scrutinized for evidence of any effect of the facilitator’s presence on participant responses.
The recorded feedback sessions in the clinical hospital setting were also transcribed. A thematic content analysis was undertaken using the same categories established for the workshop data. The transcripts were again scanned for any evidence of a research(er) effect on what the physiotherapy supervisors were saying.
An overview of participants, data and the analytical process is provided in Table 3, together with the codes used for data identification purposes when reporting the study’s findings below.
Table 3. Overview of participants, data, and analytical procedure (adapted from Woodward-Kron et al., 2012, p. 169).
Findings
Findings are reported in relation to each of the research sub-questions posed above. Note that any names cited in the extracts are pseudonyms to protect the identity of the participants.
Sub-question 1: Is there evidence of authenticity constraints in the datasets from each phase?
Workshop feedback
Recall that the workshop feedback was in response to video-recorded stimuli involving interactions between students and an actual or simulated patient. While such stimuli are a routine component of physiotherapy training, the use made of them in our workshops does not reflect the routine use of such materials for training purposes. Analysis of the workshop commentary revealed a number of instances where this unnaturalness of the research approach appears to have affected participant behaviours. Some examples of these possible effects are set out below.
In the first workshop, one of the participants (P3) commented after viewing the second video-taped stimulus that being video-recorded while interacting with the patient or simulated patient may have affected the student’s behaviour:
yeah-exactly, and he’s going “oh shit, I’ve got a camera on me: um”, so he looked quite nervous to me. (PHY WK1 0:40:01.3)
Perhaps more relevant to the problem of eliciting indigenous criteria is a comment made by one of the physiotherapy educator participants on the difference between what was required for the workshop and what would occur in everyday practice. The comment is made to one of the workshop facilitators (R2), who is setting up the first of the two workshop sessions. After viewing the first video stimulus P1 seeks clarification of the procedure and then notes the difference between how she would give feedback to a student in the clinical situation and what she is being asked to do in this session:
P1: So, so are we saying this is the sort of feedback conversation we’d have with [the student], or are we talking about what we thought was strengths in his appearance or weakness in his interaction?
R2: Yeah, yes. Um, so, if you could point out what you think are the strengths and weaknesses, but as you would give feedback to, the student.
P1: OK. Tricky, because I’d be going through a conversation
R2: Interactive, yes
P1: So I’d have to respond to what he had to say, so …
R2: Okay
P1: Probably starting off by asking him what his priorities had been now he’d talked with them
R2: Mmhmm
P1: So that’s a little tricky to do it in this format. But I can certainly tell you what I thought he’d done well. (PHY WK 1, 0:14:33 – 48.30)²
The discourse structure of the workshop data, with each participant commenting in turn in monologic fashion on the issues arising from the video stimuli, contrasts with the way feedback is routinely provided in the clinical setting. This issue will be discussed further under “Research Sub-question 2” below.
One reason for this difference in feedback may be that the workshop participants are tailoring their responses to what they perceive to be the expectations of their peers (the other clinical educators) and also to those of the researcher. In the following extract, for example, one of the physiotherapy educator participants, when reflecting on the workshop process, admits to having confined her comments to whether the student is communicating in an appropriate manner with the patient rather than considering whether the student has covered the necessary ground in eliciting clinical information.
I was trying really hard not to look at the “Has he asked all the questions?”, and just, “Is he trying to use some of the strategies of communication?” (PHYS WK 1, 0:24:57)
Presumably this is because she is aware of the applied linguistic focus of the research project. Whether the same issues of communication would be raised with a student in the clinical context therefore remains open to doubt.
The active role played by the researcher/facilitator in eliciting commentary from participants is a further departure from what might happen in a regular feedback setting. There were numerous instances in the workshop transcripts of the researcher asking the educator to clarify or elaborate, as can be seen in the following questions:
“So they were assumptive type … leading statements?” (PHYS WK 1, 0:34:30)
“I think you said the ‘wording could be better’. Do you have any [notes] there?” (PHYS WK 1, 0:47:20)
“And when you say ‘specifics of the situation,’ what do you mean exactly?” (PHYS WK 2, 0:18:44)
This prompting may have resulted in the participants mentioning or giving greater emphasis to certain issues that would not otherwise have been articulated. There was, however, also an instance in the hospital data (discussed below) where the student requested a specific example of the generic behaviour the supervisor was commenting on, so these requests for clarification by the researchers may not be entirely at odds with what sometimes occurs in the clinical training situation.
Hospital feedback
The hospital setting is, as already noted, less contrived than the workshop in that it involves students interacting with real patients or with their supervisors, either on the ward or in a private office, as routinely occurs during the clinical practice component of their training programme. In contrast to what is evident in the workshop data, the hospital feedback shows no direct evidence of any observer effect on the data, apart from an isolated comment from a student (“Jackie”) who seems concerned about the confidentiality implications of her comments to her supervisors being audio-recorded:
Jackie: Okay, yeah, I mean, I think this is all being recorded which is weird because you can play that back to all the other supervisors and everything.
S2: This is purely for the lady who is recording it. I don’t even listen to them. [laugh] (HOSP 2 00:16.33 – 17.2)
The response of her supervisor (S2), however, suggests that the fact of being audio-recorded by an unknown researcher (“the lady”) is of no consequence from her point of view. We may surmise from her comment that the presence of the audio-recorder on the table has not affected the nature of the feedback given to the student. Nevertheless we can never be entirely certain that this is the case either for this or other interactions in the dataset. The audio-recordings, after all, do not capture gestures or other non-verbal cues that might indicate discomfort or self-consciousness at the idea of the interaction being reviewed by an outsider and might cause supervisors to deviate from their normal feedback routines.
Sub-question 2: How do the datasets differ with regard to the insights they offer into the construct of effective healthcare communication?
While it is clear from what has been presented above that the workshop setting yields a commentary that is different in nature from what routinely occurs in the context of clinical training, our comparison of which data source best illuminates the construct of speaking proficiency for health professionals suggests, paradoxically, that the less natural workshop setting is the more useful and informative.
Analysis of transcripts of the three datasets revealed stark differences in how feedback was framed in each context as well as how and how often issues of language and communication were raised. The main features of each type of feedback (manner of delivery, focus of the feedback, who gives the feedback, content of the feedback) for each dataset are summarized in Table 4.
Table 4. Comparison of structure, content and nature of feedback in each dataset.
Space constraints preclude our reporting extensively on different instances within the data. The extracts in Table 5 have, however, been chosen because they are broadly representative of the different character of feedback in each dataset.
Table 5. Extracts of physiotherapist feedback drawn from the three datasets.
In the workshop extract (Table 5, Dataset 1), the participant evaluates a number of aspects of the interaction between the student “Sam” and a simulated patient “Tony”, listing for the benefit of the researcher and possibly the other participant educators who have watched the same video stimulus features such as information gathering, listening to the patient, framing and sequencing of questions, manner of responding. The discourse is monologic and the informational content is highly condensed compared with what occurs in the second extract from the hospital setting (Dataset 2). Here, the supervisor directly addresses the student (“Jason”) and limits the feedback to one aspect of his recent interaction with a patient in the cardiothoracic ward. The exchange begins with a positive comment followed by a call-response-evaluation sequence typical of the teaching situation. The supervisor asks the student about the clinical reasoning underpinning his questioning of the patient (“Why do we normally get people to do a supported cough?”) and it is the student’s response (“because of pain”) that triggers the negative evaluation of the recent interaction with the patient (“I didn’t hear you asking about any pain”) and subsequent explanation by the supervisor as to why this was a significant omission from a clinical perspective. The formative feedback presented to the student in this context is less direct and more narrowly focused than that offered to the researcher and fellow educators in the workshop context. The summative end-of-placement feedback to the student (Dataset 3) tends to be more general in nature than that which occurs in the other two contexts. Comments refer not to any specific interaction but to the quality of the students’ overall performance demonstrated across the range of interactions that have occurred during the clinical placement. 
The feedback is directly evaluative and often explicitly linked to the categories in the pro-forma assessment sheet (Appendix A). Although there is sometimes more than one supervisor giving feedback, each tends to deal with a different category. In this extract there is only one supervisor present and she has moved from an overall evaluation through the various categories on the assessment sheet to the comments on communication presented above.
As well as differences in the way feedback is framed or structured in each dataset, there are also differences in the attention given to issues of communication. Note the repeated and detailed reference to language and communication issues in the extract taken from the workshop data (Dataset 1) in Table 5 – listening, making empathetic statements, simple questions, paraphrasing, acknowledging – and its absence from the feedback from Dataset 2 which is concerned with the relevance and comprehensiveness of the clinical information gathered rather than with the manner of questioning and responding to the patient. Scrutiny of all instances of feedback in this second dataset revealed that this approach was typical. The brief extract below contains the sole mention of a communication issue (i.e., the importance of self-introduction) in ongoing feedback to students while on their clinical placement (Dataset 2):
Yeah. Kay. So … um … (long pause) I thought what you did was good. Um, the only thing I would change, probably, um, just introducing yourself to the family members. Yeah.
Okay. (PHY HOSP 2/46, 00:02.52)
No further direct mentions of communication style or strategies (present or absent in the trainees’ interactions) were found in the ongoing feedback from physiotherapists in the hospital setting, although there were occasional mentions of non-verbal behaviours such as physical positioning in relation to the patient that could perhaps be seen as communication related, albeit not named as such. By contrast, in the end-of-placement feedback sessions (Dataset 3) communication was regularly mentioned. However, comments to students, whether approving or not, were very general, as shown in Table 5, Column 3 and in the following examples.
I think you are very clear in the way you communicate. You know you’ve got a confidence and also a bit of a presence about you with the patients who aren’t intubated and sedated. I think your rapport and your ability to interact with them, you’re performing really nice in that area. (PHY HOSP 2/74, 00:01.31)
And I think if you are as timid and as quiet as you are at the moment, you’re gonna really struggle to be heard. Particularly, if you choose work in a hospital like this, or even anywhere where you’re part of a team, you’re gonna find it hard to develop your, get your viewpoint heard. You know, in team meetings or talking to the doctors, or talking to the rest of the allied health team. (PHY HOSP 2/73, 00:00.38)
The word “communicate” appears in the first example and is linked to clarity and rapport with patients. In the second example communication is not named as such but the mention of “talking to the doctors, or talking to the rest of the allied health team” makes it clear that quietness and timidity are seen as impacting negatively on communication with peers. It is also clear that the feedback extends to how the individual is likely to perform in a variety of contexts, and not just the one in which the student is assessed. In neither case, however, is there any reference to what is actually said by the student, or to any non-verbal indicators of confidence or timidity. For the language testing researcher seeking to understand which particular features of interaction are seen as contributing to effective workplace communication, this kind of generalized feedback has little to offer.
The only instance in the end-of-placement feedback where any specific interaction is mentioned is when an assertive student asks for examples of where she fell short. Reference is then made to an episode in which she has responded defensively, in the presence of the patient, to her supervisor’s advice about how to use a particular piece of equipment. Although this is judged to be inappropriate behaviour, the feedback has more to do with the student’s management of the training situation than with her manner of communicating with patients.
The data in the further examples below from the first workshop shows how the physiotherapy educators comment on particular features of communication displayed by one trainee physiotherapist. This type of commentary is very different from that which occurs in the hospital datasets.
Um, his ability to … show that he’d actually engaged fully with the patient was a little bit, ah, problematic. I didn’t think he responded to the patient’s joke. Um, and, he wasn’t able to clarify, or establish that understanding between you know what the patient was trying to achieve. He was a little bit straight-faced through the entire communication, a little bit stiff, um, which meant that I didn’t know if he was establishing enough rapport, although he had a lot of strategies in place to show that he was actually paying attention. He probably overused the “so” at the beginning of every question and the “okay”, just too many times to show that he’d heard. And, the wording occasionally wasn’t as good as I thought, okay. He he was – the seriousness of it all, I wasn’t sure how comfortable the patient was but the patient gave him a lot of information and I don’t think he stopped the patient, from sharing information. But overall I thought it was a good job. (PHY WK 1, 0:16.65)
This participant’s commentary on the video-recorded interaction he has observed is far richer and more detailed than that appearing in the naturally occurring hospital feedback, either during or at the end of the placement. It becomes obvious from the workshop commentary that patient-centred communication is a core value for the physiotherapy educators and that engaging with and responding to the patient’s concerns is critical for both getting information and giving treatment and advice. More importantly, we are offered instances of which features of communication are seen as helpful or less than ideal (and this is certainly made easier by the fact that all parties, including the researcher, have just viewed the same video-recorded interaction). P1’s verdict is that the student has not established sufficient rapport with the patient. Details are given about language features (a somewhat mechanical overuse of so and OK), failure to respond appropriately or clarify what the patient has said, body language (straight-faced, stiff) and the general demeanour of the trainee (seriousness), all of which are seen as detracting from the goal of achieving empathy.
In sum, the sequential and cumulative nature of the workshop commentary, with each participant building on what the previous ones have said, results in a rich and nuanced picture of what kinds of communicative behaviours are valued (or said to be valued) in physiotherapy practice. Workshop participants also have the opportunity, in a final discussion with the facilitator/researcher, to elaborate on issues such as the extent to which language use matters in the context of the overall clinical performance, as illustrated below in an extract from the final stages of the second workshop:
Can I also … on that point, I thought the first guy’s communication, his English is quite good. He just missed on the slang. But to me, missing on the slang affects your rapport et cetera, but if you still do all the appropriate questioning and you’re still a good clinician, it doesn’t ultimately change too much. (WK 2, 0:33.09)
Here it is useful to learn that, at least for some educators, issues such as mastery of colloquial English, often problematic for the second language user, while seen as useful for bonding with the patient, are not considered critical to effective functioning as a physiotherapist. The absence of such reflective opportunities in the hospital setting places limits on what can be inferred from those data.
Discussion and conclusion
This study has addressed the methodological problem in indigenous assessment of how to establish the criteria that are oriented to by professionals in routine workplace and training assessments which could potentially form the basis of criteria to be used in a specific purpose language test. It is notable in the literature that the complex ethnography and discourse analysis which was the basis of Jacoby’s original study (Jacoby, 1998) has not been replicated in any of the subsequent studies in language testing inspired by her work, although Erdősy (2005, 2009) has used similar methods with necessary adaptations for an academic writing context. Generally, however, researchers have favoured a range of indirect methods at one or more removes from the contexts in which assessment occurs indigenously. This study is perhaps an exception, in that it has identified sites where naturally occurring feedback on clinical communication takes place, and has gathered data from those sites along with other evidence from a less natural setting. This has enabled a comparison of methods of establishing what content experts (in this case, health professionals, and specifically physiotherapists) orient to in workplace training.
As indicated in the “Methodology” section above, the approaches chosen can be distinguished in terms of authenticity. On the one hand, we located two workplace sites involving physiotherapists in training where assessment of communication skills potentially takes place as a matter of routine: one-on-one feedback to trainees immediately following observation of a specific interaction with an individual patient; and a formal summative assessment discussion at the end of the clinical placement, where generalization across occasions of interaction occurs. Both of these data sources we consider authentic in the sense that they conform to what normally happens in workplace training of physiotherapy students. On the other hand, we created an additional, contrived research site by convening groups of physiotherapy educators, presenting them with video-recordings of interactions between trainees and patients or simulated patients and asking them to comment on features of the performances which they regarded as stronger or weaker, as if they were about to provide feedback to the trainees in question. While video-recording of interaction with simulated patients is often undertaken in the context of physiotherapy clinical education and assessment, the method of eliciting commentary was inauthentic. In our study (like that of Douglas & Myers, 2000), the video recordings were not used for feedback to trainees but as stimuli for individual and group discussion, in a purposely established research workshop, of the quality of the communication in each. Scrutiny of the data from these different sites showed that the contrived workshop procedure yielded feedback that differed in style and content from what emerged in the workplace settings.
Frustratingly and paradoxically, however, we were able to gain little insight into the orientation to communication skills of participants in the more authentic workplace sites. In the one-on-one feedback to individual trainees there was virtually no mention of features of communication at all. It seems likely that the busy hospital setting allowed the supervisor limited time to do anything but ensure that the information gathered was sufficient to make an accurate assessment of the patient’s condition and that the physical safety of the patient was not compromised. Note also that the patient was sometimes present when feedback was given, which may have constrained what the supervisor felt was relevant and appropriate to comment on. The more formal summative assessment discussion in a private room at the end of the clinical placement focused more usefully on communication skills, partly because the pro-forma guiding the supervisors’ evaluation specifically mentioned communication. However, the articulation of the issues was rather vague – perhaps because of the wording on the assessment pro-forma where communication is subsumed under the Affective category and described in very general terms. The fact that communication was not particularly problematic for the students being observed may also have contributed to its receiving such limited attention. In contrast, the less authentic workshop context yielded far richer data on the orientation of clinical supervisors to communication skills: a rich vocabulary was used and many specific features were isolated for comment, some of them proffered in response to prompting from the workshop facilitators. There was also general agreement among these clinical educators about the centrality of communication to the success of the clinical consultation.
Strictly speaking, though, the workshop context fails to meet the in-principle requirements of indigenous assessment, stated by Jacoby and McNamara (1999, p. 214) to be “the close analysis of members’ own socialization and assessment practices in particular professional cultures, since in such activities communicative competence is a problem for the participants themselves to display, monitor, and comment upon.”
Overall, this study has confirmed the difficulty of establishing indigenous assessment criteria for this context, that is, the assessment of clinical communication skills in health professionals in training, specifically physiotherapists. Gaining access to the sites was difficult and the criteria being sought proved elusive, given how little explicit mention was made of them. This is in stark contrast to the evident centrality of communication in the evaluation of clinical interaction in the eyes of clinical educators who, in the research-focused setting, articulated a complex awareness of features of communication in clinical interaction and had a ready language to describe them. Why were criteria so readily acknowledged in one setting elusive in the other? One difference is that the workshop setting encouraged meta-awareness of criteria, particularly as the workshop participants were often senior clinical educators or academics whose role it is to explicate ideas and values, and who together as a group convened by researchers were oriented jointly to the articulation of issues of relevance to the research. It is unlikely that the presence of audio recording equipment in the routinely occurring sites compromised their potential as sources of insight; only one comment in the dataset showed that some of the participants were aware of the recording, but there is little to suggest that their behaviour was substantially affected by it. Moreover, if there were such effects, supervisors, in the knowledge that their practices were being observed, might be expected to deliver more detailed and explicit feedback to students than was evident in much of the hospital data.
Another issue is that the question of the adequacy of language and communication did not arise in the trainees presenting in the authentic contexts, perhaps because they were mostly from English-speaking backgrounds and any who were not were either immigrant students schooled in Australia or international students who had already met or surpassed the University’s minimum language requirements (namely IELTS Band 7 or a recognized equivalent) on commencement of their four-year study programme. This is a limitation of our study, given that its ultimate aim was to identify appropriate criteria for assessing performance on the Occupational English Test, designed for overseas trainees from NNES backgrounds, many of whom have completed their professional training in non-English medium institutions. Since one key function of the Occupational English Test is screening for entry into clinical retraining programmes, where communicative competence is at issue, a clear direction for further research is a study which includes data from assessment of those reported to be experiencing greater difficulty in communication, especially overseas-trained NNES health professionals. At this point, however, we are forced to conclude that the criteria articulated in the artificial workshop setting provided more material for the development of relevant language test criteria, despite its relative inauthenticity. It remains to be seen whether these criteria can be substantiated in settings that more authentically represent routine workplace clinical training events.
This study has underlined both the potential for involving content experts in the definition of relevant criteria for evaluating workplace communication, and the difficulty of doing so in a way which emerges from contexts routinely and naturally occurring in the workplace setting. Even if such a context for eliciting the indigenous criteria could be established, of course, the question of reformulating these in a manner that could be applied to the very different context of a stand-alone language test remains to be addressed (see Pill, this issue, and O’Hagan, Pill, & Zhang, this issue, for discussion of how this might be done).
Appendix A: Sample pro-forma summative assessment sheet
Acknowledgements
Our thanks to other project team members, especially John Pill, Robyn Woodward-Kron and Diana van Die for assistance with preparing the relevant Human Research Ethics applications and gathering data for this project. We are also indebted to Cathy Nall and Gillian Webb for assisting us with access to the research sites. Finally, we thank the physiotherapy students and their supervisors as well as the workshop participants whose participation was essential for this research.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
