Adoption of ‘the intention-to-treat principle’ in designing and analysing controlled trials
By the late 1950s, the key role played by the Medical Research Council (MRC) in the development of controlled clinical trials – and of Hill’s leadership specifically – had become widely recognised. The example that had been set by the MRC during the 1950s led the Council for International Organizations of Medical Sciences (CIOMS) to invite Hill to plan and chair a conference on ‘Controlled Clinical Trials’. The conference was convened under the joint auspices of the UN Educational, Scientific and Cultural Organisation (UNESCO) and the World Health Organization (WHO) ‘to discuss the principles, organization and scope of “controlled clinical trials”, which must be carried out if new methods or preparations used for the treatment of disease are to be accurately assessed clinically’. 1
The conference was held over five days in Vienna between 23 and 27 March 1959. 1 Hill had arranged for 23 papers to be presented by British statisticians, physicians and a surgeon.1,2 The programme of the conference in Vienna covered a wide range of issues relevant to developing expertise in the planning, conduct, analysis and reporting of controlled clinical trials, and it attracted international interest. The organisers of the conference had not envisaged formal publication of the proceedings, but such was the demand for copies of the mimeographed papers made available to the hundred or so participants in the conference that these papers were published in a 177-page book the following year. 2
The presentations covered ethics, criteria for diagnosis and assessment, definition and measurement, clinical trials using group comparisons, within-patient crossover comparisons, factorial designs, statistical requirements and methods, monitoring using sequential analysis, organisation of clinical trials, design of records and follow-up, and the analysis and presentation of results. The methodological presentations were brought to life with illustrative examples of trials in tuberculosis, upper respiratory tract infections, acute rheumatoid arthritis, coronary thrombosis and cancer. 2
The CIOMS also asked Daniel Schwartz, Professor of Medical Statistics at the University of Paris, to prepare a French ‘rapport interprétatif’ of the conference. Schwartz co-authored the report with Robert Flamant, Joseph Lellouch and Claude Roquette, all of whom were colleagues at the Unité de Recherches Statistiques de l’Institut National d’Hygiène à l’Institut Gustave-Roussy in Paris. 3 The report opens with a seven-page introductory chapter entitled Buts et Méthodes with the following sections: Le jugement de signification; Le jugement de causalité; Constitution de groupes comparables; Le tirage au sort; La suggestion du malade; L’essai anonyme; La suggestion chez le médecin; L’essai complètement anonyme; Des principes à l’application. Chapter II provides some examples of controlled clinical trials; Chapter III concerns estimation of the number of participants needed in an experiment; Chapter IV discusses the importance of defining criteria; and Chapters V and VI conclude the report by referring to additional methodological and ethical considerations.
The Vienna conference1,2 was not the only gathering considering developments in testing treatments, but it may well have been the first conference in which several speakers had considered how to deal with biased losses from unbiased treatment comparison groups assembled using random allocation, and to have begun to develop a terminology for discussing the issue.
Peter Armitage, a statistician colleague of Hill’s, presented a paper on ‘The construction of comparable groups’. Armitage asked how one should deal with non-random losses of trial participants after they had been allocated at random to comparison groups to ensure that any differences between them reflected only the play of chance. The penultimate paragraph of Armitage’s 4 presentation reads as follows (pp. 17–18):
However carefully a trial has been planned, occasionally things will go wrong. Perhaps, after a patient has been entered into the trial, a more accurate diagnosis shows that she should have been excluded. Perhaps the wrong treatment is given, or for some reason the selected treatment cannot be given in the prescribed way. Perhaps the results of treatment have been inadequately recorded. Can these subjects be confidently excluded from the analysis? The main rule is that exclusion is safe only if one is quite certain that the mishap can apply equally easily within each treatment group. In a trial to compare the effects of radiotherapy and surgery in some form of malignant disease it might be decided to include only operable patients for whom either form of treatment could be carried out. It is likely that some of the patients allocated to the surgical group would be found, at the start of surgery, to be inoperable, but there would be less opportunity to make this discovery if the patient had been allocated to radiotherapy. Exclusion of these patients (who would be the most severely affected) would tend to favour the surgical group. For comparative purposes, therefore, they must be left in, although one would naturally wish to examine the results in the smaller group which is left when the unsuitable subjects are omitted.
In response to the questions he had posed in the previous paragraph, Armitage enunciated the concept of If a particular treatment cannot be performed correctly on all patients to whom it was allocated, the difficulty will usually apply in any future widespread use of the treatment just as forcibly as in the clinical trial. It could, then, be argued that the complete group of patients for whom the treatment was
After Armitage’s presentation at the Vienna conference, ‘the intention-to-treat (ITT) principle’ was mentioned in three other presentations. Rheumatologist Eric Bywaters 5 described how he and his colleagues had dealt with withdrawals from randomised cohorts in trials needing longer than usual follow-up (pp. 77–78).
In any long-term trial carried on, as we have done, over a period of years, some patients will emigrate, others will dislike their doctors and go elsewhere, some may die, others will recover and refuse treatment, still others may have their treatment changed to something else on the grounds that the evil we know is better than that we know not. How should these withdrawals be treated? The ideal is to have none, but that is only possible in a trial lasting a few hours. We have used two methods. 1. We have tried to define carefully in advance under what circumstances a patient should be withdrawn from consideration and have confined our analysis to those still remaining in the trial and on the specified treatment at each annual point. If this is done it is essential that comparisons should be made of the starting state of each residual group at each point of time. Careful consideration must be given to the reasons for withdrawal within each treatment group. Thus an equal number in each group could be withdrawn, but drug A withdrawals could all be because having got better they failed to attend, and all drug B withdrawals could have been changed to drug A. A high completion rate should be built into the design of the trial, but this is never completely achieved. 2. In the second method we have analysed the group as a whole at each point in time, excluding only those who could not be assessed due to death or non-attendance, irrespective of whether they have remained on the specified therapy. There are objections to both these methods. (Bywaters, 5 pp. 77–78)
In a presentation on clinical trials in malignant disease, radiotherapist Ralston Paterson 6 (pp. 125–133) commented as follows: Once admitted to the randomized group the case must not be extracted therefrom whatsoever happens and regardless of what is actually done to the patient. What in fact is being measured is the result of an
Finally, the medical epidemiologist John Knowelden, 7 also a colleague of Hill’s, drew attention to relevant aspects of the analysis of randomised trials (pp. 155–159):
The analysis [of trial results] proper should begin with a statement of the number of patients who entered the trial and who satisfied the diagnostic criteria. An account should then be given of those patients who withdrew from the trial at different stages and the reasons for their withdrawal. Sometimes the withdrawal may be coincidental and unrelated to the treatments being given; the diagnosis may be revised and found to fall outside the category specified for the trial, an intercurrent infection or distinct additional illness may occur, or the patient may be uncooperative or be moved elsewhere. With this type of exclusion it is usually sufficient to show in the report that it occurred with equal frequency in Treated and Control groups and cannot have disturbed the group comparisons. A more difficult problem arises with exclusions which may be related to the treatments given, for example, a patient who is found to be sensitive to penicillin, who develops salicism on the agreed dosage of aspirin, or haemorrhagic complications when given anti-coagulants. There will always be a group of patients who exhibit side-effects, and while with some it may be possible to continue treatment, with others it may be necessary to stop. Exclusions of this kind operate unequally in Treated and Control groups, so that those who continue the full course are not necessarily alike as those originally allocated. Here are two alternatives: 1. The exclusions can be counted as failures to the selected treatment, and the further analysis made on a comparison between the remainder who completed Treated and Control régimes. 2. The groups, as originally allocated, can be compared in their progress, although some members failed to keep to the treatment, making here a comparison between those One or other or both methods of analysis may be presented, the choice depending on whether it is important to emphasize the disadvantages of a particular therapy. (Knowelden, 7 pp. 156–157)
The ‘leitmotif’ of ‘
Hill did not make any reference to the ITT principle in his opening and closing remarks as chair of the Vienna conference (or in any of the first six editions of his Principles of Medical Statistics). He had noted in the sixth edition of his book that ‘every departure from the design of the experiment lowers its efficiency to some extent’ (Hill, 8 p. 245). However, the seventh edition of Principles of Medical Statistics, published two years after the Vienna conference, was different. The chapter on clinical trials contained an important new section entitled ‘Differential Exclusions’: ‘Unless the losses are very few and therefore unimportant, we may inevitably have to keep such patients in the comparison and thus measure the
This left little room for uncertainty about Hill’s 9 view of ‘the intention-to-treat principle’ (p. 259). The following year, at the invitation of the Institute of Actuaries, Hill 10 gave the Alfred Watson Memorial Lecture (pp. 178–191), in which he said: In many trials the original careful randomization of patients to treatment and control can be later disturbed by selective withdrawals of patients who cease to take a treatment or are proved sensitive to it so that they have to be withdrawn. The experiment is necessarily weakened – indeed we may on occasions have to assess the value of an
The term ‘intent-to-treat’ was eventually ‘canonised’ in 1977 when it was added to the index in the 10th (Hill 1977) and subsequent editions of Principles of Medical Statistics.
Five editions of the book later, the section on ‘Differential Exclusions’ in the 12th and final edition of the book (co-authored with Hill’s statistician son) reads as follows: In the protocols of a proposed trial, specifications for any exclusions from it, e.g. of the old and severely ill, should be laid down explicitly in advance. They should not be determined after the entry of a patient and the allotment of a treatment. However, some exclusions at this latter point of time are usually inevitable. In analysing the results of a trial we have, therefore, a vital question to consider – have any patients after admission to the treated or control group been excluded from further observation? Such exclusions may affect the validity of the comparisons that it is sought to make; for they may differentially affect the two groups. For instance, suppose that certain patients cannot be retained on a drug – perhaps through toxic side effects. No such exclusion may occur on the placebo or other contrasting treatment, and the careful balance, originally secured by randomisation, may thereby be disturbed. Another specific example might lie in a trial of pneumonectomy versus radiation in the treatment of cancer of the lung. At operation there is no doubt that pneumonectomy would sometimes be found impossible to perform and it would seem only sensible to exclude these patients. But we must observe that no such exclusions can take place in the group treated by radiation. If we exclude such patients on the one side and inevitably retain them on the other, can we any longer be sure that we have two comparable groups differentiated only by treatment? Unless the losses are very few and therefore unimportant, we may inevitably have to keep such patients in the comparison and thus measure the
The following year (1992), in a helpfully illustrated article on the implications of ‘intention-to-treat (ITT) analysis’ for quantitative and qualitative research, David Newell 12 used data from the Coronary Artery Bypass Surgery (CABS) trial to illustrate the consequences of failing to analyse trial results using ‘the ITT principle’. He showed the results of an analysis of 2-year mortality rates using three methods – ‘ITT’, ‘compliers only’ and ‘as treated’ (Table 1).
Table 1. Estimates of differences between medical and surgical treatments after analysis by ‘intention to treat’, ‘compliers only’ and ‘as treated’.
‘ITT analysis’ (which had been used correctly by the CABS trialists) yielded a 7.8% mortality rate among patients allocated to medical treatment, and a 5.3% mortality rate in those allocated to surgery, a difference that could easily have reflected the play of chance (p = 0.17). If the analysis had been restricted to those who had actually received the treatment to which they had been allocated (‘compliers’), the two mortality rates would have been statistically significantly different (p = 0.018), and an analysis comparing participants ‘as treated’ would have wildly exaggerated the apparent value of surgery (p = 0.003).
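The three analyses differ only in which patients are counted under each treatment. As a rough illustration, the following Python sketch applies the three selection rules to an invented cohort in which some patients cross over. The patient counts, the `Patient` record and the function names are assumptions made for this example; they are not the CABS trial data and not Newell’s calculations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    allocated: str  # arm assigned at randomisation
    received: str   # treatment actually received
    died: bool      # died during follow-up


def make_group(allocated, received, n, deaths):
    """Build n hypothetical patients, the first `deaths` of whom died."""
    return [Patient(allocated, received, i < deaths) for i in range(n)]


# Invented cohort (NOT the CABS trial data): some patients cross over.
cohort = (
    make_group("medical", "medical", 300, 24)
    + make_group("medical", "surgical", 50, 1)
    + make_group("surgical", "surgical", 320, 16)
    + make_group("surgical", "medical", 30, 6)
)


def analysis_group(patients, arm, method):
    """Select the patients counted under `arm` for a given analysis method."""
    if method == "itt":
        # Everyone allocated to the arm, whatever they received.
        return [p for p in patients if p.allocated == arm]
    if method == "compliers_only":
        # Only those who received the treatment they were allocated.
        return [p for p in patients if p.allocated == arm and p.received == arm]
    if method == "as_treated":
        # Everyone who received the treatment, whatever their allocation.
        return [p for p in patients if p.received == arm]
    raise ValueError(f"unknown method: {method}")


def mortality(patients):
    """Proportion of patients who died."""
    if not patients:
        raise ValueError("empty group")
    return sum(p.died for p in patients) / len(patients)


for method in ("itt", "compliers_only", "as_treated"):
    medical = mortality(analysis_group(cohort, "medical", method))
    surgical = mortality(analysis_group(cohort, "surgical", method))
    print(f"{method:15s} medical {medical:.1%}  surgical {surgical:.1%}")
```

In this invented cohort the apparent advantage of surgery grows from about one percentage point under ‘ITT’ to three points under ‘compliers only’ and more than four points under ‘as treated’, although only the ‘ITT’ comparison preserves the groups created by random allocation.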
Adoption of ‘the ITT principle’ in designing and analysing randomised trials
The 7th to 12th editions of Hill’s popular Principles of Medical Statistics, published between 1961 and 1991, seem likely to have been important in promoting awareness of ‘the ITT principle’ in the English-speaking world. This has led to widespread endorsement of the principle.12–22
However, there have been some detractors, for example, Sheiner and Rubin 23 and Richard D Feinman. 24 Feinman suggested that ‘At first hearing the idea of ITT is counter-intuitive if not completely irrational – why would you include in your data people who are not in the experiment?’ 24
Lack of support for the ITT principle was evident when the principle was first enunciated. No explicit reference was made to ‘the ITT principle’ (or to ‘l’intention de traiter’) in the ‘rapport interprétatif’ prepared by Schwartz and his colleagues for the Vienna conference, 3 although their report did imply a lack of acceptance of the rationale for ‘ITT’ analysis in clinical trials requiring prolonged follow-up: ‘Si on prend tous les sujets au départ, sans clause d’exclusion, il se produit un grand nombre de defections, qui rend difficile ou impossible l’analyse des résultats’ [‘If all subjects are taken at the outset, with no exclusion clause, a large number of defections occur, which makes analysis of the results difficult or impossible’]. Schwartz and his colleagues provided three examples of trials to highlight the issue with withdrawals, but they were not clear about how they had analysed data relating to the withdrawn patients.
There was no reference to the ITT principle (or to ‘l’intention de traiter’) in an article on controlled trials by Schwartz published the following year. 25 Furthermore, we have not been able to find any mention of the principle either in L’Essai Thérapeutique chez L’Homme, a substantial book co-authored by Daniel Schwartz, Robert Flamant and Joseph Lellouch and published 10 years after the Vienna meeting, 26 or in the English translation of the book by the British statistician Michael Healy, which appeared a decade after publication of the original.
Although ‘the ITT principle’ is not mentioned in these books,3,26 they illustrate the problems created by withdrawals. In correspondence during the late 1990s, Daniel Schwartz made clear to Peter Armitage that he would only support the adoption of the ‘ITT approach’ if this was authorised in the trial protocol, as defining the strategy under study. In other circumstances, he would regard protocol deviations as regrettable, and as a mark of unsatisfactory research methods (Armitage, 27 p. 2677). Armitage commented on Schwartz’s and Lellouch’s position as follows:
The pragmatic attitude is often taken to be exemplified par excellence by the intention-to-treat approach to the analysis of results, whereby comparisons are made between the outcomes for the complete groups assigned to different treatment regimens, irrespective of the extent of departure from the treatment schedules laid down in the protocol. Professors Schwartz and Lellouch have pointed out (in a personal communication) that this identification would go beyond their intentions. They would support the intention-to-treat approach only if this was authorized in the trial protocol, as defining the strategy under study. In other circumstances they would, I think regard protocol deviations as regrettable, and as a mark of unsatisfactory research methods. In almost any clinical trial, departures from protocol are likely to occur to some extent. A patient may experience unwanted side effects; deterioration of the patient’s condition may lead the physician to prescribe alternative treatments, or the patient may decline to co-operate for any number of reasons. The trial statistician, therefore, is inevitably faced with the problem of how to take them into account in the analysis. In a trial conceived of as essentially pragmatic in nature, the investigators are likely to lean towards an intention-to-treat approach. (Armitage, 27 pp. 2677–2678)
Armitage goes on to list the arguments usually adduced for the ITT approach:
The benefits of randomization are maintained; differences in outcome between the groups cannot be ascribed to systematic differences in the pre-treatment characteristics of the patients. The comparison is essentially one of different strategies of treatment, defined by ideal schedules but with the recognition that, in the trial just as in clinical practice, departures from these schedules will occasionally occur. In this sense, the trial may be said to simulate routine practice. Any attempt to measure relative efficacy, by comparing groups of patients with 100% compliance to protocol, is deeply suspect because of selection biases. (Armitage, 27 p. 2678)
Quite apart from the differences between the Hill and Armitage position and the Schwartz and Lellouch position, surveys of clinical trials reported in major general medical journals 40 years after the Vienna meeting have shown that only half of the trial reports assessed had observed ‘the ITT principle’.16,28 The surveys made clear that the phrase ‘intent-to-treat’ seemed to have different meanings for different authors; that ITT analyses were often inadequately described and inadequately applied; and that there was no consensus about how to handle deviations from randomised allocation. Despite this, trial reports that made no mention of the ITT principle were judged to have been of lower quality than those that did. In 1999, in response to this situation, the International Conference on Harmonisation published statistical principles for clinical trials. These principles emphasised that ‘primary’ analyses should be based on an application of the ITT principle.
In 2009, after considering evidence of the importance of taking account of adherence to treatment and how one should analyse and interpret clinical trials in which patients do not take the treatments assigned to them, Curt Furberg listed the lessons learned as follows: 29
Good and poor medication adherers seem to have different prognoses. Good adherence to harmful drugs is associated with worse prognosis. Good adherence to beneficial drugs is associated with better prognosis. Specific reasons that could account for the relationship between good adherers and favourable outcomes and poor adherers and unfavourable outcomes remain unclear. There is no established method to adjust for adherence-related participant factors. There is no guarantee that subsets of participants with high or low adherence within two study groups are comparable in terms of risk. Analysis of clinical trial data by treatment administered can be misleading. The intention-to-treat approach to analysis remains the safest or least biased way of analysing clinical trial data. This is the reason why reputable medical journals and regulatory agencies adhere to the intention-to-treat approach.
Acceptance of the desirability of ‘ITT’ analysis has prompted a series of analyses by Iosief Abraha, Alessandro Montedori and their colleagues to audit the extent to which the principle has been applied in practice by researchers. They have drawn attention to researchers’ increasing tendency to adopt the usually undefined term ‘modified intention-to-treat (mITT)’ analysis. They found that the definition of mITT was irregular and arbitrary and open to manipulation and consequently to bias. 30 When they compared the characteristics of trials that had reported having used a ‘mITT approach’ with trials reporting having used ‘unmodified ITT’ they found that the mITT trials were significantly more likely to have made post-randomisation exclusions and were strongly associated with industry funding and authors’ conflicts of interest. 31 In a third study, Abraha et al. 32 used 43 systematic reviews of interventions and 310 randomised trials to assess whether deviation from the standard ITT analysis influenced estimates of treatment effects. They found that ‘Trials that deviate from the ITT approach overestimate the treatment effect of meta-analyses compared with those trials that report a standard approach’. 32
In 2019, the International Conference on Harmonisation published an addendum to its 1999 report in which it observed that the term ITT had been gradually degraded by applying it to cases where the data were missing but had been imputed. 33
Despite the undoubted challenges and compromises entailed in applying ‘the intention-to-treat principle’ in practice, research funders, journals and readers should require trialists who state that they have used a ‘
Though commonly confused, the issues of ‘deviation from allocated treatment’ and ‘missing outcomes’ are separate, and each requires its own reporting and analysis. There are several options for missing outcomes (such as ‘last observation carried forward’ and imputation in various guises), but for deviations from allocated treatment, ‘the intention-to-treat principle’ remains the mainstay. In a secondary analysis, it may be useful to adjust for non-adherence to allocated treatment to estimate the ‘explanatory’ effect. For example, Robert Newcombe 34 has suggested a simple adjustment to the ITT estimate.
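To give a sense of what an adherence adjustment of this general kind can look like, the sketch below rescales an ITT risk difference by the difference between arms in the proportion of patients who actually received the treatment of interest. This is the familiar instrumental-variable (complier average causal effect) estimator; it is offered only as an illustration and is not claimed to be the adjustment Newcombe proposed. It assumes that allocation affects outcome only through the treatment received, and the function name and all numbers are invented for the example.

```python
def adherence_adjusted_difference(risk_a, risk_b, uptake_a, uptake_b):
    """Rescale an ITT risk difference by the between-arm difference in uptake.

    risk_a, risk_b     -- outcome risks in the groups as randomised (ITT)
    uptake_a, uptake_b -- proportion in each group who received treatment A
    """
    uptake_gap = uptake_a - uptake_b
    if uptake_gap == 0:
        raise ValueError("uptake is identical in both arms; adjustment undefined")
    return (risk_a - risk_b) / uptake_gap


# Invented figures: 350 patients per arm, A = surgery, B = medical treatment.
itt_difference = 22 / 350 - 25 / 350          # ITT risk difference (A minus B)
adjusted = adherence_adjusted_difference(
    risk_a=22 / 350, risk_b=25 / 350,         # deaths by allocated arm
    uptake_a=320 / 350, uptake_b=50 / 350,    # proportion who had surgery
)
print(f"ITT difference:      {itt_difference:+.4f}")
print(f"Adjusted difference: {adjusted:+.4f}")
```

Because the adjustment starts from the groups as randomised, it leaves the ITT comparison intact and would be reported alongside it, not in place of it.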
The conclusion reached by White et al. 22 is worth repeating:
Clinical trials should employ an intention-to-treat analysis strategy, comprising a design that attempts to follow up all randomised individuals, a main analysis that is valid under a stated plausible assumption about the missing data, and sensitivity analyses which include all randomised individuals to explore the impact of departures from the assumption underlying the main analysis. Following this strategy recognises the extra uncertainty arising from missing outcomes and increases the incentive for researchers to minimise the extent of missing data (see Box 1). 22
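One crude way to see how much a result could depend on the assumption made about missing outcomes is to compare an analysis of observed outcomes with the two most extreme possibilities for the missing ones. The Python sketch below does this for a binary outcome. It is a simple bounding exercise under invented numbers, not the specific sensitivity analysis recommended by White et al., and the function and argument names are assumptions made for the example.

```python
def risk_difference_scenarios(events_t, observed_t, missing_t,
                              events_c, observed_c, missing_c):
    """Return risk differences (treated minus control) under three scenarios.

    'observed'  -- only participants with a recorded outcome are analysed
    'best'      -- no missing treated participant had the event, every
                   missing control did (most favourable to treatment)
    'worst'     -- every missing treated participant had the event, no
                   missing control did (least favourable to treatment)
    """
    n_t = observed_t + missing_t  # all randomised to treatment
    n_c = observed_c + missing_c  # all randomised to control
    return {
        "observed": events_t / observed_t - events_c / observed_c,
        "best": events_t / n_t - (events_c + missing_c) / n_c,
        "worst": (events_t + missing_t) / n_t - events_c / n_c,
    }


# Invented trial: 200 randomised per arm, with unequal loss to follow-up.
scenarios = risk_difference_scenarios(
    events_t=30, observed_t=180, missing_t=20,
    events_c=45, observed_c=190, missing_c=10,
)
for name, difference in scenarios.items():
    print(f"{name:9s} {difference:+.3f}")
```

In this invented trial the observed outcomes favour treatment, but the extreme scenarios span both benefit and harm, which is the kind of extra uncertainty from missing outcomes that the quoted strategy asks trialists to acknowledge.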
Concluding reflections
Concealed random allocation is a key feature of fair treatment comparisons. It ensures that –
From trial planning onwards, obsessional efforts are needed to protect the unbiased status of these randomised comparison groups. The longer the duration of follow-up after random allocation the more likely it will be that there will be loss of participants, missing outcome data and non-random withdrawals, with the result that treatment comparisons will become biased.
A variety of strategies have been used in attempts to minimise and take account of biased loss of trial participants from the groups to which they have been allocated. Missing outcomes and deviation from allocated treatment are separate issues, and each requires its own reporting and analysis.
Strenuous efforts are needed to minimise missing outcomes and other important data. Greater use of record linkage to mortality registers and hospital admission statistics may help to identify missing outcomes.
Application of the intention-to-treat principle remains the mainstay for dealing with deviations from allocated treatment. 16 A secondary analysis that adjusts the estimate of treatment effect to take account of the extent of non-adherence to allocated treatment may also be useful. 34
The current lack of research transparency jeopardises the research efforts needed to identify research design features that minimise losses when randomised cohorts are to be followed for many years. Explicit statements about post-randomisation exclusions should replace the ambiguous terminology of ‘modified intention to treat’. 30
Despite the undoubted challenges and compromises entailed in applying ‘the intention-to-treat principle’ in practice, research funders, drug licensing authorities, journal editors and readers should require trialists who state that they have used a ‘modified intention-to-treat analysis’ to make clear how they have handled ineligible inclusions, missing outcomes, and deviations from random allocation.17,33 However, as observed by Austin Bradford Hill and his son David Hill more than 40 years ago: There can be no hard and fast rules for there is no correct answer to all situations. (Hill and Hill,11 1981)
Dedication
We are indebted to the late and much missed Tony Johnson (1943–2022) and his colleague Vern Farewell for creating an invaluable annotated bibliography of early textbooks and other publications on controlled clinical trials.
Author's note
This paper is the second part of a two-part series.
