This paper presents a subjects × raters partially nested factorial analysis of variance model for estimating coefficients of interrater reliability. Procedures and formulae are described for computing unbiased estimates from the model's between-subjects and error mean squares. Whether rater variance should be included in or excluded from the estimates is also discussed. Nine specific advantages of the analysis of variance approach over existing methods are listed, and a direction for future research on reliability is suggested.
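As a rough illustration of how a reliability coefficient can be built from between-subjects and error mean squares, the sketch below uses the widely known intraclass correlation formulas for a fully crossed subjects × raters table, in which every rater scores every subject. This is an assumption made for the example: the paper's own partially nested model and its formulae are not reproduced in this abstract and may differ. All function names are hypothetical.

```python
def anova_mean_squares(ratings):
    """Mean squares for a fully crossed subjects x raters table.

    `ratings` is a list of rows, one row per subject, one column per rater.
    Assumption: every rater scores every subject (no nesting, no missing cells).
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)

    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return ms_subjects, ms_raters, ms_error, n, k


def icc_excluding_rater_variance(ratings):
    """Single-rating coefficient that ignores differences in rater means."""
    ms_s, _, ms_e, _, k = anova_mean_squares(ratings)
    return (ms_s - ms_e) / (ms_s + (k - 1) * ms_e)


def icc_including_rater_variance(ratings):
    """Single-rating coefficient that counts rater mean differences as error."""
    ms_s, ms_r, ms_e, n, k = anova_mean_squares(ratings)
    return (ms_s - ms_e) / (ms_s + (k - 1) * ms_e + k * (ms_r - ms_e) / n)


# Illustrative data only: 6 subjects rated by 4 raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_without = icc_excluding_rater_variance(example)
icc_with = icc_including_rater_variance(example)
```

In this example the raters differ markedly in their mean ratings, so the coefficient that includes rater variance comes out much lower than the one that excludes it. That gap is the practical consequence of the inclusion or exclusion choice.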