Abstract
This article charts the history of mental testing in the context of the rise and fall of Russian child science between the 1890s and the 1930s. Tracing the genealogy of testing in scientific experimentation, scholastic assessment, medical diagnostics and bureaucratic accounting, it follows the displacements of this technology along and across the boundaries of the child science movement. The article focuses on three domains of expertise – psychology, pedagogy and psychiatry – examining the key guises that mental testing assumed in them: namely, the experiment, the exam and the diagnosis. It then analyses the failed state-bureaucratic harnessing of mental testing in early Soviet attempts to manage mass education, discussing the peculiar dynamics of the (de)legitimation of testing, as it swung between black-boxing and instrumentalization, on the one hand, and scandal and controversy, on the other. The article argues that mental testing thrived in Russia as a strategically ambiguous and flexibly interpreted ‘boundary object’, which interconnected a highly heterogeneous field, enabling the coexistence and cooperation of diverse occupational agendas and normative regimes.
Introduction
On 4 July 1936 the Central Committee of the Soviet Communist Party issued the decree ‘On the Paedological Distortions in the System of the People’s Commissariats of Education’ (KPSS, 1985: 364–7). In it the party accused the state’s educational administration of committing the gravest of errors by introducing, in the late 1920s, the network of ‘so-called “paedologists”’ into Soviet schools and of relying excessively on this service in the delivery of the state’s educational policies. In the course of the 1920s, ‘paedology’ or ‘child science’ – a multidisciplinary field devoted to the bio-psycho-social study of child development and socialization – had become the principal framework of Soviet educational research (Shvartsman and Kuznetsova, 1994; Etkind, 1997). The Bolshevik political elite had conceived it as the principal scientific basis for transforming child welfare, education and health, and a means of fulfilling their revolutionary social engineering ambitions (Bauer, 1952; Balashov, 2012; Trombetta, 2013). Paedology’s peak, in terms of official state support, came in 1927–8, with the staging of a high-profile conference, the launching of a specialist journal, and the setting-up of a ‘paedology service’ across Soviet schools (Baranov, 1991).
After Stalin’s ‘Great Break’ in 1928–9 and the push to complete the First Five-Year Plan (1928–32), significant changes took place in Soviet educational policy as part of the drive to speed up the forging of a disciplined, loyal and effective workforce. Changes included the final implementation of universal primary schooling [vseobuch] in 1930, and a return, in 1931, to more traditional and disciplinarian school programmes and teaching paradigms. Among paedology’s core technologies, developed as a tool for streaming an expanding mass of schoolchildren, were various kinds of psychometric and other tests (Kurek, 2004). These tests became one of the principal means of evaluating poorly performing and disruptive children, who would then be referred to a fast-growing number of special schools. This ‘sorting of the wheat from the chaff’ was increasingly linked to the pressures put on schools to deliver on the new programmes and targets, with many teachers being more than happy to see the back of pupils who slowed things down (Ewing, 2001).
However, by 1936 – the year of the so-called ‘Stalin constitution’, which declared that the USSR had successfully ‘achieved socialism’, and which was also the year when the propaganda motto ‘Thank You Comrade Stalin for Our Happy Childhood’ was launched (Kelly, 2005) – the Soviet political elite judged that the work of the school paedology service had resulted in a catastrophic (and in the atmosphere of Stalinist conspiratorialism, supposedly malicious) over-diagnosis of ‘backwardness’ in the Soviet child population, particularly, and disproportionately, among children from the ideologically valorized labouring classes and ethnic minorities (Kurek, 2004). Thus, paedology’s most direct contribution to the state’s management of mass education had become irreversibly entangled in the ideologically wrong kind of class bias, raising serious concerns among key figures in the party leadership (Rodin, 1998), and eventually prompting a radical response in the form of the 1936 anti-paedology decree (Petrovskii, 1991).
The immediate outcome of the 1936 document was the dismantlement or renaming of the entire network of institutes, research groups, laboratories, training courses and school posts previously associated with paedology, the banning and censoring of textbooks and monographs, the reassignment of ground-level staff and trainees to new jobs and degree courses, a spate of vitriolic denunciations in the press, and the forcing of a number of high-profile former ‘paedologists’ into public declarations of repentant self-criticism, with many being subjected to further repressive measures (Kurek, 2004). In the campaign to stigmatize paedology as a ‘pseudo-science’, the mental test was singled out as the paedologists’ trademark tool – an instrument that had, in this context, inflicted the greatest harm on the Soviet working class and its revolutionary cause (Figure 1). From 1936 onwards all forms of ‘testing’ were officially banned from Soviet research and education, and remained so until the collapse of the regime in 1991 (Kadnevskii, 2004).

Figure 1. ‘The paedologist at work’ (by Kukriniksy), Pravda, 31 August 1936.
In order to fully appreciate how and why mental testing became such a focus of attention in the demise of Soviet paedology in the 1930s, it is vital to understand the unique part that it played in Russia’s child science movement more generally, as the latter arose in the late tsarist period and then blossomed in the early USSR. The prominent role in which mental testing was cast in the anti-paedology campaign is in large part due to the functions it served as a very distinct kind of ‘support’ in the making of child science itself. This is not to say that mental testing was the primary technology of Soviet child science. As was the case in other countries, child science included a raft of technologies for inscribing, recording, measuring, categorizing and diagnosing children, and mental testing was but one of them (Turmel, 2008).
The child science movement developed at the turn of the 20th century as a highly heterogeneous field of professional and scientific work, carried out through collaborative interactions between actors belonging to a range of different (themselves complex, and at that time often only emergent) disciplinary, occupational and administrative structures and environments – above all, those associated with teaching, psychology, particular areas of medicine (hygiene, psychiatry and neuroscience), and criminology. The movement also involved members of the general public, especially parents (as both amateurs and clients), civic associations devoted to public health, education and welfare, as well as, later on, state-bureaucratic structures, including, in the 1920s USSR, the highest echelons of the Bolshevik political elite. This entailed continuous negotiations across a multiplicity of professional territories, disciplinary boundaries, institutional structures and communities of practice. It involved a considerable labour of ‘translation’ between diverse stakeholder interests, languages and social worlds. Given this heterogeneity, articulating child science as a ‘joint enterprise’ was far from straightforward – both at the time when this movement was taking shape and subsequently, in historical reflection. 1
Star and Griesemer (1989) have argued that to explain collaborative scientific work that takes place across diverse groups of actors (as was the case with child science), one should not only look for the establishment of a ‘consensus’ – the effort of creating, however provisionally and imperfectly, a collective (conscious or unconscious) paradigm, a platform of integrated disciplinary definitions, shared meanings, common aims and agreed-upon methods. Nor should one limit one’s analysis to particular formal arrangements of ‘collaboration’: for example, the institutionalization of a negotiated division of labour or the carving-out of respective territories of action between participants – both commonly accompanied by the rhetoric of ‘interdisciplinarity’. Indeed, insofar as such arrangements (and demands for them) are invariably underpinned by continued misunderstandings and conflicts between purported collaborators, these divergences run the risk of remaining unaccounted for as a vital part of the explanation of ‘consensus’ and ‘collaboration’ themselves.
Instead, as Star and Griesemer argue, one should also look for the generation of what they have dubbed ‘boundary objects’ – strategically ‘ill-structured’ social artefacts (material and symbolic at the same time), produced in, through and for particular scientific work (thereby becoming ‘instrumental’ to it, both in the literal and the figurative senses of the word), while at the same time allowing flexible interpretations and mobile uses across distinct areas of a heterogeneous field. The principal property of a ‘boundary object’ is its ambiguity, which emerges in and through the object’s displacement, and which, in turn, enables scientific work itself, as well as its effects, to be displaced across boundaries and between environments, despite the continued and maintained divergence of perspectives, priorities and conventions of practice of the various participants involved in such work (cf. also Star, 2010).
This article will explore the ways in which ‘mental testing’ operated as such a ‘boundary object’ in the rise and fall of Russian child science. This approach differs from and complements established historiographies of mental testing, as developed with reference to the United States, England and France in particular. Earlier research has contextualized ‘tests’ as distinctive methodologies instrumental to the formation of certain scientific movements or fields, such as ‘experimental educational research’ (Depaepe, 1992) or ‘testology’ (Kadnevskii, 2004). Others have analysed them as key supports in the formation of particular professional and scientific identities, especially those of psychologists (Brown, 1992; Wooldridge, 1994). Others still have placed them in the context of (broadly) Foucauldian-inspired analyses of social technologies of normative knowledge/power – especially different ways of constructing forms of ‘normality’ and ‘subnormality’ (Zenderland, 1998; Thomson, 1998; Turmel, 2008). Finally, mental testing has been discussed in the context of the rise and fall of particular political cultures, above all the early-20th-century attempts to reconcile democracy and meritocracy (Sutherland, 1984; Thomson, 1998; Carson, 2004).
In what follows, the definition of the ‘mental test’ remains open, the term essentially a placeholder, since its meanings, in their full complexity and ambiguity, are expected to emerge as an outcome, rather than an a priori condition, of the analysis. 2 What the ‘mental test’ might have been, as a boundary object, in early-20th-century Russia, is precisely what is at stake in the analysis: it will crystallize only through the tracing of its genealogy in scientific experimentation, scholastic assessment, medical diagnostics and bureaucratic accounting. Put another way, our understanding of the mental test as a boundary object emerges through following its strategic displacements (in rhetoric and practice) along and across the fuzzy internal and external boundaries of the Russian child science movement – specifically those between its key domains of expertise: psychology, pedagogy and psychiatry. I shall do this by focusing on some of the key ‘hypostases’ that mental testing assumed in this context – namely, ‘the experiment’, ‘the exam’ and ‘the diagnosis’. I shall then discuss the peculiar dynamic of the (de)legitimation of testing between the 1890s and the 1930s, as it swung between black-boxing and instrumentalization, on the one hand, and scandal and controversy, on the other. I shall conclude with an analysis of the ultimately failed state-bureaucratic harnessing of mental testing in the early Soviet attempts to manage the rapid expansion of mass education.
The experiment
The primary framework within which forms of mental testing were elaborated in Russia at the turn of the 20th century was that of psychological experimentation. In Russia, as elsewhere, the notion of ‘experiment’ was crucial to debates over the identity of psychology as a positivist human science (Lomov, Budilova and Kol’tsova, 1990; Sirotkina and Smith, 2012). The issue of ‘mental testing’ became heavily embroiled in these debates in the course of the 1900s–10s. By the end of the 19th century, psychology still lacked the status of an independent academic discipline at Russian universities (Joravsky, 1989; Petrovskii, 2000). It was taught as a component of philosophy and had the reputation of the latter’s ‘handmaiden’. Its scientific credentials were often under attack from physiologists, neurologists and psychiatrists, who sought to redefine it from their own, biological point of view, with some of them even denying psychology the right to a legitimate existence as a science. Laboratories in experimental psychology were being established at a number of Russia’s university psychiatric clinics, but here they retained a strong physiological and medical bias (Budilova, 1961; Sirotkina and Smith, 2012). Although a number of Russian psychologists with a background in philosophy perceived the introduction of the experiment into psychology (in emulation of Wundt and his followers in Germany) as vital to warding off criticisms of psychology’s lack of scientificity (Grot, 1896), university philosophy departments were slow to follow this up, with a number of professors expressing considerable epistemological ambivalence about psychology’s development in the direction of natural-science-like positivism (Budilova, 1960).
However, the appearance, towards the end of the century, of non-lab experimental methods – i.e. the early forms of mental testing – started to complicate matters. This development seemed to open up the possibility of experimental methodologies entering Russia’s philosophy departments by the back door as it were, even in the absence of fully equipped laboratories. A. P. Nechaev, then an up-and-coming young psychologist, privatdocent at St Petersburg University, who had spent a portion of his postgraduate training in Germany in 1898–1900, became enthusiastic about experimental psychology and wrote his dissertation on its role in the field of education, based largely on experiments that he conducted on schoolchildren in several St Petersburg schools in 1899 (Romanov, 1997; Anshakova, 1999, 2002). The dissertation met with considerable hostility from a number of professors in philosophy, including Nechaev’s own mentor, the neo-Kantian A. I. Vvedenskii, who controversially failed the thesis at the viva (Lomov, Budilova and Kol’tsova, 1990: 200–13). Nechaev was thereby effectively barred from a university career, leading him to establish his institutional base elsewhere – namely, in the then growing field of teacher-training (Byford, 2008a: 273–6; 2008b: 64–6). Thanks to support from the Army Ministry’s Department of Education, Nechaev set up a fully equipped laboratory in educational psychology at the latter’s Pedagogical Museum near St Petersburg and started mobilizing support for experimental psychology among student-teachers.
Nechaev and his group emphatically foregrounded the term ‘experiment’ when legitimizing their research practice (e.g. Nechaev, 1902), giving it a radical ring, arguing that psychology could build its legitimacy only on a positivist, natural-scientific-like methodology. They placed this in explicit contrast to what was current practice at the philosophy departments of Russian universities, where psychology continued to be taught primarily as a theoretical discipline, where ‘introspection’ [samonabliudenie (literally ‘self-observation’); cf. Levchenko, 2007] was still privileged as the epistemological foundation of psychological knowledge, and where experimentation seemed merely to be paid lip-service. 3
The militancy of the term ‘experiment’ was maintained and strengthened in Nechaev’s group’s insertion of it into the realm of educational research, which targeted teachers first and foremost. The experiment was presented as a way of revolutionizing not just psychology, but also pedagogy, by way of creating a new educational science. Nechaev and his followers thus promoted the experiment simultaneously and ambiguously as the defining method of both ‘experimental psychology’ and ‘experimental pedagogy’, so much so that the two often appeared interchangeable (Byford, 2008b: 73–7). 4 Indeed, when added to ‘pedagogy’, the adjective ‘experimental’ implied, in this group’s framing of it, not experimentation with different teaching and learning methods, but the educationally pertinent study of children’s developing minds by means of psychological experiments.
The distinction between experiment for the purposes of research [issledovanie; Forschung], associated with general psychology, as taught at university, and experiment as ‘test’ or ‘examination’ [ispytanie; Prüfung], associated with the emerging field of individual/differential and applied/diagnostic psychology, especially in domains such as education and psychiatry, was certainly something Russia’s up-and-coming experimental psychologists were aware of (Zinov’ev, 1912). However, Russian debates about the experiment in psychology usually blurred the two insofar as both were practised by the same people in overlapping contexts. It was assumed that an applied test, the experiment as ispytanie, had to be based on experiment as fundamental research or issledovanie; but it was simultaneously argued that the urgent demands faced by Russian education made it impossible for application to wait for research, and thus the two had to go hand in hand.
Nechaev’s principal opponent in the 1900s–10s was G. I. Chelpanov, professor of psychology at Kiev University from 1897, and at Moscow University from 1907 (Kozulin, 1985). Chelpanov, who had also received some training in German labs, accepted and advocated, as a matter of principle, Wundtian notions of experimental psychology as crucial to psychology’s disciplinary autonomy and scientific status, but he insisted on the careful philosophical grounding of psychology’s epistemology. He remained sceptical of simplistic positivism, continued to argue for the primacy of introspection, and was hugely irritated by the overblown scientistic rhetoric that his rivals deployed in their promotion of ‘experimentation’ among teachers. Chelpanov worried greatly about what he perceived as the slippery slope of psychology’s profanation in their hands (Chelpanov, 1910). He was especially concerned about the non-lab experimentation that was increasingly being carried out in schools in the course of the 1900s, and on which his opponents, being embedded in educationalist institutions, focused in particular. Chelpanov was able to exploit the ‘softness’ of non-lab experimentation as a methodological weak spot, mounting, almost singlehandedly, a vociferous and effective critique of the younger experimenters’ agenda. However, he himself was vulnerable to their counter-attacks since he did not establish a psychology lab of his own until the 1910s, whereas Nechaev had founded his in 1901 (Chelpanov, 1912a). It was only after securing substantial private funding in 1912 to establish the Moscow Psychology Institute (the MPI, affiliated to Moscow University), that Chelpanov set up what became the best-equipped psychology lab in Russia (Psikhologicheskii Institut, 1914; Afanas’eva, Vorchenko and Iasnova, 1914).
The creation of the MPI was fundamental to Chelpanov’s efforts to protect the claim of university-based and philosophically grounded psychology to define the discipline’s legitimate methodology – above all, what counted as a proper psychological experiment. Nonetheless, the creation of the MPI’s otherwise superior lab came too late to prevent Chelpanov’s rivals from successfully mobilizing wider support for their own paradigms at his expense – especially among portions of the teaching profession, but also, by 1916, even some key officials in the Ministry of Public Education.
The laboratory
Although a substantial amount of the mental testing research done by the young ‘psychologist-experimenters’ [psikhologi-eksperimentatory] was conducted outside the laboratory – in kindergartens, in schools and in homes for abandoned, disabled, or delinquent children – the scientistic rhetoric of Nechaev and his colleagues led them to emphasize ‘the laboratory’ as a strategic base. Nechaev’s lab, in particular, equipped with most of the core psychological apparatuses of this era, specially imported from Leipzig, served this purpose in the early 1900s (Grin, 1910). The fact that this group developed its science outside the traditional ‘temples’ – the universities – meant that the laboratory became even more important to the members as a legitimizing framework, which they presented as the true temple of modern, positivist science. In this context, Nechaev and his associates regularly displayed laboratory apparatuses as key exhibits at Russia’s major educationalist and childcare exhibitions and conferences. And yet, this group’s position in the field of teacher-training and education research demanded a strategic juxtaposition of ‘the psychology lab’ with ‘the school’.
The first and most basic way in which this juxtaposing was achieved involved a metaphorical redescription of schools as ‘laboratories’ and teachers as ‘psychologists’. This was a widespread strategy used by Nechaev’s group to promote experimental psychology among the education profession. Teaching was considered inferior to the work of other established professions, such as law or medicine, and efforts to transform it involved constructing the new teacher as a researcher of schoolchildren’s souls (Byford, 2008b: 69). The experimenters regularly cited the preface to the Russian translation of Wundt’s treatise, penned in the mid-1890s by one of the founders of the Moscow Psychological Society, N. Ia. Grot. In it Grot had proclaimed enthusiastically that ‘every teacher, strictly speaking, [was] obliged to be a psychologist experimenter’ (Grot, 1896; cited in Rumiantsev, 1908b: 118) and that ‘schools [would] become wide-ranging laboratories for all kinds of psychological experiments, which [would], of course, firstly have a practical objective, namely that of making the art of education more conscious and rational’ (Grot, 1896; cited in Rumiantsev, 1909: 212). Grot also argued that
… in the hands of an experienced teacher every class work could become a psychological experiment: pupils would not even guess that they were subjected to an experiment and consequently would not be nervous about it. It [was] always desirable for the psychological experiment to be as close as possible to ordinary school work and for it to be carried out by the teachers themselves. (Grot, 1896; cited in Rumiantsev, 1908b: 119)
The second strategy of juxtaposing ‘lab’ and ‘school’ involved the setting-up of mini psychology labs and cabinets in schools. In 1906–7 Nechaev and his group designed a special kit that consisted of a collection of mental test cards and simplified experimental apparatuses (produced more cheaply and designed to work without electricity) that enthusiast teachers could purchase for their schools (Konorov and Nechaev, 1907; Rumiantsev, 1908a; Grin, 1910). The exact purpose of these kits remained vague. Officially, they were to serve as aids in the teaching of psychology, which had been introduced as a new subject in the Russian high-school curriculum in 1906 (Byford, 2008a: 277–97). However, these kits were also promoted to teachers as part of their own initiation into experimental psychology in the context of their occupational training, and, relatedly, as something that was meant to enhance their professional understanding of their pupils’ mental development (Feoktistov, 1909a; Nechaev, 1911a). In fact, a number of teachers – to the horror of university professors, such as Chelpanov – gave papers at teachers’ conferences, presenting the results of ‘psychological experiments’ which they had performed on their pupils using these kits (Chelpanov, 1909; Feoktistov, 1909b; Volyntsevich, 1910).
The third way of transposing the lab into the school took place through the creation of new experimental methodologies in which school activities were to be transformed into virtual experimental apparatuses. This was done especially in what was dubbed ‘the natural experiment’ – a method developed in the 1910s by one of Nechaev’s friends and collaborators, A. F. Lazurskii, who at that time taught principally at the Pedagogical Faculty of the Psycho-Neurological Institute in St Petersburg (Lazurskii, 1911). As in Nechaev’s case, Lazurskii’s methodology emerged in the context of teacher-training: the research that went into devising the method was carried out by student-teachers under Lazurskii’s supervision (Val’vat’eva, 1913; Kenigsberg-Kovarskaia, 1913; Korenblit and Nadol’skaia, 1914; Kovarskaia, 1916; Lazurskii and Filosofova, 1916).
Lazurskii initially construed the natural experiment as a method that could be placed midway between systematic diary-based observation and an artificially mounted lab experiment. However, the development of this method in the direction of increasing precision, standardization and quantification led Lazurskii and his students to define it as an experimental method proper. They designed a number of special ‘experimental lessons’ in different subjects (and eventually an entire ‘experimental school day’) as carefully crafted frameworks for performing personality and mental ability tests. The lessons, and the specific tasks within them, were to appear entirely ‘natural’, resembling the most regular of school activities, while in fact assessing specific cognitive functions and personality traits in a precise and even quantifiable way. The behaviour and performance of individual children were observed, registered and measured following established experimental protocols, which specified which psychological characteristic was being observed at what point, and how it was to be measured and quantified.
The output of the experiment was a standardized verbal profile of the child, known as a kharakteristika [character description]. Significantly, though, the latter was accompanied by a graph based on a quantification of experimental results. Even before developing the natural experiment Lazurskii had devised a graphic, map-like representation of a child’s personality, dubbed ‘schema’ [skhema], which was the outcome of his systematic programme of objective observation of child behaviour (Lazurskii, 1908). However, this ‘schema’ was criticized in the pedagogical press as insufficiently precise, unnecessarily complex and generally impractical (Iakovenko, 1909). This prompted Lazurskii to devise a much simpler ‘star diagram’ [zvezdochka] as the graphic output of the natural experiment. The diagram was devised in such a way that the size of each vertex corresponded to the degree of development of a particular set of mental faculties, based on the data obtained in the experiment. The zvezdochka had the virtue of representing the outcome of the experiment much more elegantly, as a harmonious whole that could be grasped at a glance (Figure 2). Similarly to Nechaev with his school lab kit, Lazurskii and his team promoted the natural experiment simultaneously and ambiguously as: (1) a method intended for more general experimental research in psychology (especially in the sphere of individual/differential psychology or ‘characterology’); (2) a ready-made tool for performing the psycho-pedagogical profiling of individual schoolchildren; and (3) a methodology for improving teaching and learning by providing psychologically more meaningful lesson designs (Kovarskaia, 1916; Lazurskii, 1918).

Figure 2. A. F. Lazurskii’s ‘star diagram’ [zvezdochka].
Finally, the fourth and most comprehensive way in which ‘the school’ was juxtaposed with ‘the lab’ was through the creation of the experimental or laboratory school [eksperimental’naia shkola or shkola-laboratoriia]. Here the adjective ‘experimental’ did not refer to the development of non-standard (e.g. progressive, free-educational) teaching methods, as was the case with a number of private schools appearing in Russia at this time, which also marketed themselves as ‘experimental’ (Durylin, 1907–8). Instead, the term pointed to the deployment of systematic experimental-psychological and school-hygiene research and a continuous monitoring of schoolchildren and their learning conditions – something that the promoters of these schools considered essential to establishing scientifically grounded educational best practice (Faddeev, 1907).
This kind of experimental school was created in 1910 at St Petersburg’s Pedagogical Academy (PA) – a non-government-funded institution, established in 1907–8, where Nechaev played a key role as co-founder and professor. PA provided postgraduate training for future researchers and managers in education, recruited from the teaching profession (Nechaev, 1910). Thanks to a generous private donation by a philanthropist from Samara, PA opened its experimental school on the above model, with Nechaev as school principal and with PA’s trainees as teachers and school managers (Nechaev, 1911b; NA RAO f. 47, op. 1, d. 95). This school had its own psychology lab and school hygiene cabinet. Entrance assessments included medical exams, anthropometric and psychometric tests, and detailed interviews with parents. Psychological experimentation and observation were carried out continuously during and outside lessons. Again, this was framed simultaneously as general psycho-pedagogical research, the systematic evaluation and monitoring of individual schoolchildren, and a way of scientifically developing teaching itself; and all this as part of training high-level professionals in education who were to be empowered with ‘science’.
The exam
Although many Russian teachers were inspired by the promises of making their own professional expertise more ‘scientific’ through systematically engaging in psychology, and psychological experimentation in particular, what appeared more directly relevant to their regular work, and was therefore of interest to a far larger contingent of them, was the potential of using testing as an instrument of pupil evaluation, profiling and classification. Techniques of measuring psychological functions, establishing general levels of intelligence and charting personality traits were here interpreted as close homologues of the standard school practices of examining, marking and reporting (Rumiantsev, 1911: 67).
The reason why purportedly objective, ‘scientific’ methods of assessing schoolchildren’s abilities and personalities seemed so important to the Russian education profession was that the legitimacy of more traditional forms of evaluation and classification, and especially of exams, had become widely disputed around this time (Lebedintsev, 1913–14, 1915–16). Exams, and related forms of assessment, were entangled in a complex web of power-relations in tsarist Russia’s education system – both literally, as technologies of power in their own right, and symbolically, as emblems of the negatively connoted ‘bureaucratic’ power that extended beyond education itself. ‘Exams’ were vital not only to relations between teachers and pupils (whose fates were being decided in this way), but also to those between teachers and parents (Byford, 2013a). Teachers and parents fought regular battles at the end of the school year, with parents challenging teachers’ assessment methods, especially if their children were forced to repeat a year (Litvinskii, 1893). ‘Exams’ also played a part in the teaching profession’s rather poor public image, particularly among the educated: they were regularly portrayed in the press as exemplary of the arbitrary authoritarianism supposedly typical of the Russian teacher as a servant of the state. The practice of exams was thereby identified with the tyranny of tsarist autocracy, especially in the politicized rhetoric surrounding the 1905 revolution. Schoolchildren were often cast in the press as woeful victims of ‘exam torture’ (Krainskii, 1912), which was cited as one of the major causes of the child suicide ‘epidemic’ that rocked Russia in the 1900s–10s (Morrissey, 2007: 312–45; Liarskii, 2010). Exams were also one of the major targets of medical professionals heading Russia’s school hygiene movement.
Some of the earliest experimental research in Russian schools involving forms of testing was developed precisely in the context of a critique of school assessments by doctors studying exhaustion levels in schoolchildren (Sikorskii, 1900; Vysotskii, 1894; Kaminskii, 1911). Finally, pupils’ exam results were used as a vital measure in the evaluation of teachers’ performance by their immediate bureaucratic superiors – school principals and inspectors (Ekzamenator, 1913). Consequently, many teachers recognized in the ‘scientific’ forms of assessment a solution to their problems of professional authority and autonomy – a way of legitimating as ‘objective’ the evaluation of their pupils and any consequent decisions, be it to make the child repeat a year, or expel from school, refer to a special class, or award the child a scholarship.
Early interest in testing among Russian schoolteachers focused on the problem of evaluating the basic abilities and knowledge of the ‘unknown quantities’ entering the primary school system – a growing issue from the 1890s in the context of imperial Russia’s industrialization drive that led to rising numbers of schoolchildren from the labouring classes, coming largely from the illiterate migrant peasantry – groups that most urban teachers had previously not been dealing with and whose pre-school upbringing was perceived as badly wanting. Teachers were concerned about what seemed like an epidemic of ‘pathological’ cases of mental deficiency in the schoolchild population (Odesskii, 1898); they were uncertain about how to identify and best deal with ‘borderline’ cases of still salvageable low ability [malosposobnost’] resulting from early pedagogical neglect (N, 1900: 99); and they were frustrated about how to adapt educational practice to what appeared to be a surge in the diversity of abilities, leading to calls for the (somewhat misleadingly phrased) ‘individualization’ of teaching. At the same time, many were keen to sponsor talented children from the peasantry, yet believed that they themselves lacked the means of objectively ascertaining such children’s intellectual potential and justifying sponsorship on grounds of merit rather than disadvantage (Odesskii, 1898). Inspiration to improvise systematic testing of new school entrants was drawn especially from German initiatives dating back to the 1860s (Kapterev, 1892), but the practice took some time to be implemented in Russia and was carried out only through relatively isolated local initiatives, mostly in Moscow (Rybnikov, 1912).
Significantly, anxieties that the Russian professional intelligentsia expressed about the difficulties of socializing new generations in the context of rapidly changing ‘modern times’ – whether these referred to imperfect parenting, to the exposure of children’s delicate nervous systems to the stresses of modern life, or to worry about ‘degeneration’ (Beer, 2008) – were by no means confined to the problem of managing the offspring of the lower classes. These issues seemed even more urgent when it came to ensuring the ‘healthy’ (socio-cultural as well as biological) self-reproduction of the vanguard of Russia’s body politic – the educated elite, i.e. the professional middle classes and the bourgeoisie more generally, which, though still small, was itself expanding and differentiating at increasing speed at this historical juncture (Liarskii, 2010). The secondary schools in which these classes were being formed occupied a strategic position in this process of socio-cultural self-reproduction. A number of these schools’ representatives became keen to take on greater responsibilities in accumulating essential empirical knowledge about the child population in their care. And initially, given that Russia’s own child science had only started to develop at this point, some of them reached for cutting-edge western models.
For instance, in the late 1890s, certain parts of the Russian education profession became inspired by developments in France, led by the rising star of French psychology, Alfred Binet. The principal of one of Warsaw’s Realschulen, N. Agapitov (1900), adapted some of the early tests of Binet and Albert Leclère, whose work had been published in L’Année psychologique in 1897–8 and then summarized in Russia’s Vestnik vospitaniia [Education Herald] (cf. also Matveev, 1900; Nechaev, 1901; Noveishie opyty, 1901). These tests involved asking schoolchildren of different ages to describe in writing a range of objects presented to them (a painting, a stuffed animal, a plant, a watch, and some slightly unusual object, such as a magnet). This was done in time-limited sessions resembling exam conditions. Agapitov analysed the pupils’ essay-like answers in terms of a typology of descriptive approaches, measuring accuracy and detail of observation, degrees of imagination, aesthetic sensibility and the like, and even attempted to quantify the results as percentages of dominant traits.
Agapitov’s initiative was welcomed in the Russian pedagogical press and others sought to emulate it. In 1902 some of the principals of the empire’s network of commercial schools [kommercheskie uchilishcha] – general-educational secondary schools under the control of the Ministry of Finance, expensive and mostly attracting the wealthier bourgeoisie – became interested in deploying similar forms of psychological monitoring and assessment, citing precedents in America, France and Germany (N[echaev], 1903a). S. L. Stepanov, director of the commercial school in Baku, wanted to see the setting-up of a central, government-sponsored psychology lab which would provide instructions on how to carry out the necessary tests and process the data collected by different schools. Another principal, A. Fon-Ern, suggested that a detailed programme of tests measuring such qualities as the pupils’ basic understanding of the world, strength of different types of memory, accuracy of reproducing information, ability to concentrate, power of imagination, and mental endurance, should instead be designed by the schools’ pedagogical committees, based on their own standards. The principal of a girls’ commercial school in Kiev, N. N. Volodkevich, developed his own programme of testing, mostly copying Agapitov, although he admitted that he found the proper processing of the data he had collected beyond his capabilities. In the end nothing came of these proposals, not least because a number of other principals thought that the whole enterprise was likely to amount to little more than ‘a superfluous, if scholarly, amusement’, quite unnecessary, given that children’s abilities could be assessed perfectly adequately through the usual means of attentive observation and evaluation by experienced teachers. A number of others stressed that existing testing methodologies were far too undeveloped to be applied with any confidence, especially en masse and by non-specialists.
These initiatives were emerging at exactly the time when Nechaev was establishing his leadership role on this territory. Indeed, as soon as he set up his laboratory at the Pedagogical Museum and started carrying out his own programme of research on memory, attention, exhaustion and the like – mostly in St Petersburg’s military cadet corps schools [kadetskie korpusa] – he launched a reprimand to those school principals, like Agapitov and Volodkevich, who had attempted to go it alone. He emphasized that ‘the most valuable school research was always based on … preliminary lab work’ and that school ‘experiments’ [opyty] needed to be initially prepared in the lab and carried out by qualified specialists on a small number of subjects, before being applied more widely in schools by those lacking the requisite expertise (Nechaev, 1902: 154; Nechaev, 1903b). Nechaev and his group still included in their promotional rhetoric the idea of substituting for the discredited exams (which they dismissed as judging all pupils by a single ‘bureaucratic’ yardstick) the new, supposedly more differentiating and accurate, as well as more objective and psychologically meaningful, forms of evaluation; however, at the same time they sought to ensure that as trained psychologists they controlled the use of these techniques, based on expertise associated with scientific ‘experimentation’ (R[umiantsev], 1910; Tikhomirov, 1911).
This issue was hotly debated at five successive conferences in St Petersburg, organized by Nechaev and his allies, but where secondary schoolteachers from across the empire formed the bulk of the audience. The first two of these events, in 1906 and 1909, were dubbed conferences in ‘pedagogical psychology’, while the next three, in 1910, 1913 and 1916, were renamed as conferences in ‘experimental pedagogy’ (Sokolov, 1956a, 1956b; Bogoiavlenskii, 1977; Budilova, 1990). At the first two conferences, the imperative of mobilizing wider support for their enterprise prompted Nechaev and his fellow experimenters to actively encourage teachers to train and engage in ‘experimentation’ (albeit always under the guidance, if only distant, of specialists like Nechaev), presenting it as something decisive to improving the teachers’ expertise and practice. However, in the 1910s, in response to growing criticism – both by people like Chelpanov, who insisted that teachers were not qualified to carry out any kind of psychological experiments in schools, and by some teachers themselves, who argued that these ‘experiments’ did not, in fact, respond directly enough to their specific professional needs and did not empower them with expertise as such – the experimental psychologists’ rhetoric shifted. They gradually ceased to present the education profession as ‘disciples’, and instead envisaged them as ‘users’ of a ready-made technology that was produced, black-boxed and trademarked by an increasingly specialized group of researchers in experimental pedagogy, based in labs and institutes (Byford, 2008b: 76–9).
Although it was assumed that educational establishments could use tests for a variety of purposes (entrance evaluation, profiling and streaming, medical and pedagogical monitoring), the question of who was suitably qualified to carry out testing in schools remained unresolved. By the early 1910s ordinary practising teachers were certainly no longer trusted with using the technology independently. Ideally, from the perspective of Nechaev and his allies, testing was to be carried out by, or at least under the direct supervision of, trained educational psychologists versed in experimental techniques. Yet such figures were only beginning to be trained at establishments such as the Pedagogical Academy or the Psycho-Neurological Institute, and even their expertise in psychology continued to be challenged by university psychologists, such as Chelpanov.
In this context, the professional figure viewed as the only one trustworthy enough to adequately perform testing on schoolchildren was the school doctor. In the school doctor’s hands, mental testing appeared to be the natural extension of the school’s medical exam – i.e. part of the regular monitoring of schoolchildren’s health, which already included anthropometric measurements (of children’s height, weight, chest and cranium size, physical constitution, and the like) as well as the statistical processing of data. Such monitoring was already mandated by the tsarist Ministry of Public Education as part of its school hygiene programme, especially once it established its medical-sanitary division in 1904, but even before then (Bekariukov, 1910; Byford, 2006; Liarskii, 2010). It was also encouraged by local authorities and by non-government organizations headed by leading professionals in the field, such as the Society for the Protection of Public Health, which developed its own ‘Programme for the Study of the Sanitary Conditions of Educational Establishments, School Programmes and Pupils’ (Nechaev, 1902). And yet, although school doctors were deemed to have adequate professional training to be relied upon to carry out psychometric measurements in schools methodically (as they did anthropometric ones), their relationship to mental testing technologies was not dissimilar to that of the teachers: school doctors were entrusted merely to apply the tests in specific prophylactic or clinical situations.
The diagnosis
In the hands of the medical professionals, a mental test was no longer a research ‘experiment’, or a substitute for a scholastic ‘exam’, but became primarily a diagnostic instrument. Though developed as a supposedly objective, positivist technology (a means of establishing, experimentally and statistically, the prevalent ‘averages’ of mental capacity, and then quantifying deviations from them), in the hands of the medical professionals testing became a clinical tool, to be used to identify ‘pathologies’, i.e. to recognize deviations that would be understood not purely mathematically, but normatively, as deviations from ‘the healthy’ or ‘the optimal’ (Canguilhem, 1991). 5
Some of the rhetoric promoting mental testing among the education profession (as part of the already mentioned efforts to raise its status) presented teachers not only as experimental psychologists, but also as kinds of diagnosticians (e.g. Solov’eva, 1911: 22). This rhetoric was, in fact, also fostered by psychologists and overlapped with some of their own constructions of psychological expertise, insofar as they too often resorted to medical metaphors and used discourse marked by a medical style. For instance, the verbal profile that resulted from Lazurskii’s above-discussed ‘natural experiment’, while being called a kharakteristika [character description] – the term most commonly used for teachers’ reports on their pupils (Matveev, 1893; Rokov, 1904) – was articulated by Lazurskii (whose background was in medicine) in a language that emulated that of medical aetiology (Lazurskii and Filosofova, 1916; cf. also Nekliudova, 1906).
Defined broadly, a ‘diagnosis’ is by no means restricted to identifying medical pathology. It refers to a much more general power to recognize deviation from an established (and valued) norm or standard, including those of psychological development or school performance. Having the power to form a diagnosis is crucial to professionalism as a mode of existence of some occupations, but not of others (Johnson, 1972). Medicine remains an exemplar of professionalism in this sense, given its rootedness in the diagnostic (power) relationship between doctors and patients, created around particular normative forms of knowledge about the patient’s body and later mind. Psychology too has historically been successful in acquiring diagnostic powers of its own, but teaching rather less so.
In early-20th-century Russia, teachers could assess a pupil as ‘failing’, but this did not in itself amount to their patrolling an educational normative boundary resembling that between ‘health’ and ‘disease’ in medicine. Children who fell outside the established norms of educational progress were, from the perspective of the education profession, considered, in the extreme, effectively ‘unteachable’, and hence to be excluded from regular schooling (usually after repeating a year several times). However, the boundary of ‘unteachability’ was construed as one of ‘pathology’, which, as such, became the jurisdiction of medical professionals (Byford, 2006). In this zone, on the outer margins of the field of education, teachers in tsarist Russia were happy to devolve the powers of ‘diagnosis’ as well as ‘therapy’ to doctors, who at this time led the way in establishing Russia’s first special schools (Zamskii, 1980). This area of education was turned into a fuzzy border between ‘normality’ and ‘pathology’ – a boundary where educational problems merged inextricably with medical ones (Byford, 2006). Initially this field was dubbed ‘curative’, ‘medical’, or ‘pathological’ pedagogics [lechebnaia, meditsinskaia, or patologicheskaia pedagogika]. In the 1920s–30s so-called ‘defectology’ [defektologiia] was to rise out of these as a distinct specialism, and leading positions in it were invariably occupied by doctors (Zamskii, 1995). It is primarily on this boundary that the mental test, deployed as a diagnostic tool, became equivalent less to a school exam than a medical one.
In the 1900s–10s, a number of Russian psychiatrists and neurologists engaged in developing mental tests specifically as diagnostic tools of this kind, and these became widely used in Russian child science, alongside the ‘experiments’ of Nechaev and Lazurskii. Key figures here were A. N. Bernshtein and G. I. Rossolimo, both of whom were based in Moscow and active in some of Russia’s first clinics assessing children with behavioural or developmental problems. Both were also among the main advocates in Russia of the use of mental testing as an innovation in neurological and psychiatric diagnostics more generally.
Medicine’s borrowing of methods from psychology was sometimes justified through analogy: just as somatic pathology made use of physiology, so psychopathology was said to need to make use of psychology (Zinov’ev, 1912). What was at stake in both cases was the deployment of a positivist methodology to define (experimentally and mathematically) an objective, value-free norm, which a clinical methodology could then refer to in order to diagnose pathology as deviation from it. Experimental psychological methods were thus promoted as simply new, more objective instruments in the clinician’s existing diagnostic toolbox. Bernshtein, in particular, promoted mental tests as giving greater precision to the classification and symptomatology of mental illness than was possible through the mostly intuitive and subjective method of classical clinical observation (Bernshtein, 1907, 1908). 6 In his view, mental testing technologies, at the very least, reinforced the diagnostic intuition of experienced clinicians. Yet such technologies were of particular benefit to trainee psychiatrists, whose diagnostic hunches were all too fallible. Furthermore, given that mental tests did not require a lab, that they were quick and simple to administer, and could therefore be applied at the patient’s bedside, they were meant to be particularly useful to those medics who were not specialists in psychiatry – regular hospital, family, forensic and school doctors, who lacked the time and the infrastructure to perform prolonged clinical observations of a neurological and psychiatric kind. So, yet again, it was on the margins of an established professional field (here that of clinical psychiatry and neurology) that mental testing, as a new diagnostic technology, seemed least contentious and could be promoted more easily.
Bernshtein also stressed another key advantage of mental tests as diagnostic methods – namely that they supposedly allowed the clinician to identify forms of mental dysfunction not only in clear-cut cases of mental pathology but in borderline situations where mental deficiency or psychopathology were by no means unequivocally established (Bernshtein, 1908: 193). This ambiguity between ‘the normal’ and ‘the pathological’ lay at the very core of Bernshtein’s methodology, which did not discriminate or differentiate between, for instance, mental dysfunction, on the one hand, and mental development, on the other. Indeed, as a diagnostic tool, mental testing was not designed to explain mental disease or even, in itself, to recognize pathology. In Bernshtein’s framing, what mental tests identified were particular mental structures that could then be attributed both to mental dysfunction (e.g. in mentally ill adults) and to particular stages of mental development (in growing children).
While Bernshtein worked on designing tests as diagnostic methods at the Central Reception Ward for the Mentally Ill in Moscow (emulating similar work done at labs attached to psychiatric clinics in the West), 7 he, at the same time, developed these same tests in his work on child mental development at the lab affiliated to the Moscow Pedagogical Assembly (Bernshtein, 1909). He presented his research not only to fellow psychiatrists, but also at the above-mentioned conferences in pedagogical psychology and experimental pedagogy (Bernshtein, 1907: 299–300). His tests were included in Nechaev’s kit of experimental apparatuses and test cards promoted to schoolteachers; but these same tests could also be ordered from the Central Reception Ward for the Mentally Ill, where the target clientele were Bernshtein’s fellow doctors.
Indeed, the promotion of mental testing in Russia at this time had no problem in placing its medical and pedagogical uses side by side. For example, F. E. Rybakov’s Atlas for the Experimental-Psychological Study of Personality bore the subtitle Compiled for the Purposes of Pedagogical and Medico-Diagnostic Study (Rybakov, 1910). While Rybakov himself was a medic (assistant at the lab affiliated to the Psychiatric Clinic of Moscow University, where Bernshtein worked too; cf. Rybakov, 1908), his Atlas was received especially warmly in the pedagogical press (Iakovenko, 1910). Thus, through the use of mental testing technologies, which were supposedly neutral on the matter of pathology, clinical diagnostics extended quite naturally across and beyond the normative boundaries of medicine. However, they thereby only enhanced the ambiguity of the boundary between ‘normality’ and ‘pathology’, as well as the ambiguity of any particular ‘borderline case’ – such as, for instance, that of a child whose relatively low scores in a mental test could not be unequivocally attributed to either mental defect or pedagogical neglect.
The method
It was above all as broadly conceived ‘diagnostic’ instruments that mental tests came to be packaged (both figuratively and literally) into sets containing systematically organized collections of questions and tasks, designed to measure different mental properties in a supposedly integrated way, providing a quantifiable representation of an individual’s overall cognitive abilities or personality structure. The creation of these sets was prompted not only by the pragmatically driven instrumentalization of testing for the purposes of efficient diagnostic application, but also by the growing criticism of experimental psychology for its artificial fragmentation of the psyche into seemingly disconnected cognitive functions (memory, attention span, observation, suggestibility, etc.). While this line of criticism eventually gave rise to the opposing, holistic paradigm of Gestalt psychology, it also prompted an alternative response, internal to experimental psychology itself – namely, the effort to bring together existing tools of experimental measurement devised for discrete mental functions into organized ensembles that would measure the intellect or the personality as integral wholes. This went hand in hand with growing interest in the concept of ‘general intelligence’ (the Russian term for which was odarennost’, meaning ‘giftedness’), as well as the foregrounding of the methodological problem of ‘correlation’ (of data obtained in the measurement of individual mental processes).
The solution to the above was the creation of black-box-like test-packages, the outputs of which would be supposedly synthetic representations of their subjects’ overall mental profiles. In early-20th-century Russia, test-packages of this kind were referred to as ‘methods’ [metody] and usually bore the name of their creator. The internationally famous Binet-Simon ‘method’ was the one most used and discussed in Russia, although there were a number of others. It was recognized from the start, though, that foreign tests posed the significant problem of adaptation to Russian social and cultural conditions. Indeed, it seemed rather scandalous when, in the early years, some results of research carried out using the Binet-Simon test appeared to indicate that the mental age of Russian children was on average two years lower than that of their French counterparts (Chelpanov, 1912b: 182; Sokolov, 1956b: 25). While some researchers (notably A. M. Shubert, who initially worked together with Bernshtein) devoted much of their energies to adapting foreign tests (Shubert, 1912, 1913), a number of Russian psychologists and psychiatrists preferred to develop and market original ‘methods’ of their own.
The outputs of these mental testing ‘black boxes’ usually took a numerical as well as a graphic form; alternatively, they could be articulated verbally as a ‘diagnosis’ or a ‘character description’. As elsewhere in the world, with increased standardization and mathematization, numerical representations of the IQ kind eventually proved to be the most efficient and compact synthetic representation. However, in tsarist Russia, especially in the initial stages of the development of these mental testing ‘methods’, the visual immediacy of a graphic representation seemed rather more important, and was certainly more effective in promoting mental tests to wider groups of users. 8 Indeed, in Lazurskii’s natural experiment discussed above, the star diagram was developed precisely as a way of turning the natural experiment into something resembling a black-boxed ‘method’, on a par with those developed at that same time by Lazurskii’s colleagues, friends and rivals, such as A. P. Nechaev and G. I. Rossolimo (Figure 3).

A. P. Nechaev, A. F. Lazurskii and G. I. Rossolimo (1911); NA RAO f. 86, d. 46.
Arguably the most popular, as well as most controversial, native ‘method’ in Russia was the one developed by the Moscow-based neurologist G. I. Rossolimo, 9 dubbed ‘the psychological profile method’ [metod psikhologicheskogo profilia] (Rossolimo, 1910a, 1910b; Sh[ubert], 1911; Rossolimo, 1930). Like most ‘methods’, Rossolimo’s was made up of questions and tasks measuring different cognitive abilities (memory, observation, attention, etc.), many of which were borrowed and adapted from the ‘experiments’ of others. Crucially, though, Rossolimo organized the tasks in such a way that each discrete mental function was measured using 10 questions. The scores for each function could then be easily projected onto a graph, which allowed for quick (visual as well as mathematical) comparison of their (supposedly relative) levels of strength or weakness. Thus, (simplifying, of course) a subject could be assessed as having, for example, middling attention span, excellent memory and poor capacity for observation.
The term ‘profile’ [profil’], featured in the method’s name, referred to the shape of the curve that connected the peaks for the different mental functions as displayed on the graph (Figure 4). This curve (or more commonly a zigzag) was not a statistical entity – it did not, in principle, refer to variations across a population. Instead, it was conceived of as a snapshot of the totality of an individual’s mental abilities. The Russian press was duly impressed, portraying it as tantamount to a ‘photograph of the soul’ (RO RGB f. 326, p. 31, d. 12, l. 24).

G. I. Rossolimo’s ‘psychological profile’ [psikhologicheskii profil’].
Rossolimo devised several versions of his test: a ‘complete’ one, intended for more considered prolonged clinical evaluation in specialist institutions and usually reserved for individual assessments in medical contexts; and a ‘short’ one, for quick administration by non-specialists, in mass environments, such as schools. In the latter case it was also possible to work out the average profile shape of a given group of children, such as a particular stream or class, or even an entire school. Thus, Rossolimo was able to impress both by his method’s clinical thoroughness and exhaustiveness at the individual level and by its potential for speed and efficiency when applied to larger populations. Nonetheless, university professors, such as Chelpanov, were extremely critical of the method, arguing persuasively that the way that it ‘correlated’ the scores for the different mental functions was not based on any kind of psychological theory: the connection between the values obtained was just a visual effect of the profile curve itself and, ultimately, the product of an arbitrary numbers game, rather than of some measurable psychological relationship between the different mental processes featured in the profile (RO RGB f. 326, p. 31, d. 12).
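The mechanics of the profile, as described above, can be made concrete in a short sketch. It is illustrative only: the text states that each mental function was measured using 10 questions, but the pass/fail scoring of each question, the function names (`profile`, `group_profile`, `draw`) and all sample values are assumptions introduced here, not details of Rossolimo’s published procedure.

```python
from statistics import mean

QUESTIONS_PER_FUNCTION = 10  # from the text: ten questions per mental function


def profile(answers):
    """Reduce pass/fail answers to one score (0-10) per mental function.

    Assumption: each question is simply passed or failed.
    """
    scores = {}
    for function, results in answers.items():
        if len(results) != QUESTIONS_PER_FUNCTION:
            raise ValueError(f"{function}: expected {QUESTIONS_PER_FUNCTION} answers")
        scores[function] = sum(1 for passed in results if passed)
    return scores


def group_profile(profiles):
    """Average each function's score across a class, stream or school."""
    functions = profiles[0].keys()
    return {f: mean(p[f] for p in profiles) for f in functions}


def draw(scores):
    """Crude text rendering of the 'zigzag': one bar per function."""
    return "\n".join(f"{f:<12}{'#' * round(s)}" for f, s in scores.items())


# Hypothetical subjects; the first mirrors the example in the text:
# middling attention, excellent memory, poor observation.
child_a = profile({
    "attention": [True] * 5 + [False] * 5,
    "memory": [True] * 9 + [False] * 1,
    "observation": [True] * 2 + [False] * 8,
})
child_b = profile({
    "attention": [True] * 7 + [False] * 3,
    "memory": [True] * 5 + [False] * 5,
    "observation": [True] * 6 + [False] * 4,
})
class_average = group_profile([child_a, child_b])
print(draw(child_a))
```

Note that each function’s score is computed independently of the others and the ‘curve’ arises only when the scores are laid side by side. Nothing in the calculation models a relationship between attention, memory and observation, which is the substance of Chelpanov’s objection.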
Rossolimo himself was vague about what his ‘method’ was really measuring. When pinned down he, as a neurologist and psychiatrist, usually claimed that his test was ultimately designed to establish particular ‘profile’ types associated with specific mental diseases or forms of profound mental deficiency (Chelpanov, 1911; Rossolimo, 1911b). He would, for example, present and discuss different kinds of ‘profiles’ found in the mentally ill, to which he would attribute medical labels, such as ‘hypotonic’, ‘amnestic’, ‘demented’, ‘asthenic’, etc. (Govseev, 1912). Yet Rossolimo simultaneously ‘plugged’ this same ‘method’ as the most comprehensive, as well as the most practical, general mental capacity test on the market, to be used in schools or other types of children’s institutions. In this context, his test allegedly allowed teachers to make far more precise and objective their intuitive explanations of particular children’s poor educational performance, which they otherwise tended to attribute impressionistically to such vague characteristics as ‘laziness’, ‘dullness of mind’, ‘distractedness’, etc. (Rossolimo, 1910a, 1911a). Thus, just like Bernshtein, Rossolimo positioned his diagnostics strategically in the niche between and across the ‘pathological’ (i.e. psychiatric) and the ‘normal’ (i.e. scholastic), promoting his ‘psychological profile’ simultaneously both to medical and to pedagogical audiences. 10
The audit
In pre-revolutionary Russia mental testing practices did not and could not target the empire’s child population as such in any meaningful way, since the bulk of this population remained outside the professional intelligentsia’s purview. Given the general lack of interest and input in this domain from the tsarist state, the professionals themselves were able to develop their norms and form their diagnoses mostly only locally and piecemeal, as part of relatively small-scale civic and private initiatives. Estimates for percentages of ‘abnormal’ children in the population were certainly being proposed, but figures diverged widely, usually fluctuating between 2% and 10%, amounting to little more than guesswork (Troshin, 1915). At this stage, the primary purpose of mental testing in schools and other types of children’s institutions was to develop the technology itself and to legitimize it as one way of establishing the relevant norms and forming the relevant diagnoses. While debates around mental testing were certainly heated, its actual use (e.g. in the assessment of children who were to be referred to special schools) was cautious and remained, as a matter of principle, intermixed with other types of diagnostics, namely pedagogical evaluations by teachers and clinical checks by doctors.
The first major sign of official recognition of mental testing by the state in pre-revolutionary Russia occurred only in 1916 at the third conference in experimental pedagogy in St Petersburg, which was partially funded by the Ministry of Public Education (Vserossiiskii s”ezd, 1916). The newly appointed Minister of Education, Count P. N. Ignat’ev, supported the idea of improving the state’s running of the empire’s education system by using cutting-edge technologies in the human sciences. As part of a new programme for the more systematic monitoring of an expanding schoolchild population he announced the setting-up of the so-called School Hygiene Laboratory, which was an official organ of the ministry, albeit entrusted to experts, mostly doctors, but also some psychologists (NA RAO f. 85 op. 1 d. 63; RO RGB f. 326, p. 30, d. 37). 11 This body was erected upon the existing structure of the ministry’s medical-sanitary division, but a significant new addition was the introduction of a section responsible for the mass psychological study of schoolchildren, to be directed by Nechaev, with Rossolimo and others taking active part in it. The School Hygiene Lab’s overall function was to establish the norms of physical and psychological development of Russia’s schoolchildren by age, gender, class, geographical region and ethnicity. Its experts were expected to devise standardized monitoring programmes in the form of surveys and tests, to be used by local school doctors, teachers, parents and the lab’s own branches across the empire.
The 1917 revolutions and the ensuing civil war put a stop to these plans, but the new socialist state that emerged out of these upheavals was keen to expand and diversify this kind of monitoring work as part of its own ambitious programme of social transformation through radically reformed and scientifically informed universal welfare and education. From very early on, the new Soviet educational authorities, central as well as local, faced the significant problem of both enabling and coping with the rapid expansion of the schoolchild population coming from social constituencies characterized by extremely low pre-existing literacy and educational levels. Moreover, as a result of years of revolutionary and wartime violence and displacement, considerable numbers of Soviet children in the early 1920s were homeless and ‘delinquent’, posing additional problems of both evaluation and socialization, especially in conditions where schools and teachers were scarce (Ball, 1994).
Throughout the Soviet 1920s, testing of various kinds was developed and promoted by a growing network of state-sponsored research institutes in educational science and deployed in schools, clinics, consultancies and children’s homes (Kadnevskii, 2004: 295–379). The number and variety of tests increased considerably over this period. It was estimated that by the end of the decade there were around 70 different kinds of testing ‘methods’ used in the USSR. Around 75% of these were imported, but they were then thoroughly reworked and adapted to Soviet purposes. The greatest number came from the United States, although tests were also introduced from France, Britain, Germany, Belgium and elsewhere. Russia’s own pioneers of mental testing, such as Nechaev, Rossolimo, Shubert and others, continued to actively develop and promote their own ‘methods’, with many followers joining their ranks, primarily under the rising banner of paedology, which from the mid-1920s was construed by the educational authorities as the way of ‘Sovietizing’ educational science (Balashov, 2012; Trombetta, 2013; cf. also Garreta, 2013).
Paradoxically, though, the proliferation of tests, both in number and variety, was due less to utopian faith in the potentials of this technology (particularly in emulation of key rival countries, such as the United States), and more to a certain scepticism regarding the trustworthiness and the meaningfulness of its outputs (NA RAO, f. 4, op. 1, dd. 67–8). 12 Indeed, irrespective of its remarkable rise as a promising new instrument for managing mass education, the methodological underpinnings of testing technologies continued to be vigorously critiqued in the USSR throughout the 1920s–30s, from various quarters – scientific, professional and political – just as had been the case before the revolution, and just as was the case in the West at this same time.
Even those Soviet researchers who were at the forefront of the testing movement were conceding that no single testing ‘method’ was without reproach. However, they also argued that this was alleviated by the simultaneous deployment of different ‘methods’, which, while all equally partial and imperfect, mutually complemented and corrected each other’s results. Furthermore, it was commonly argued that, wherever possible, the outputs of tests, as objective data, needed to be correlated with and confirmed by data obtained through pedagogical, psychological and medical observation (which was itself imperfect, of course, because inherently subjective, but which seemed reassuringly familiar as a natural part of existing professional practices).
The fact that (1) no single testing ‘method’ was deemed in any way ‘definitive’ for the wide range of purposes required of an education system in revolutionary flux, and (2) any such ‘method’ was, in principle, ‘correctible’ by other ‘methods’ and methodologies, meant that new tests appearing on the market were adopted extremely readily and applied remarkably quickly, without much concern about how (in technical parlance) ‘reliable’ and ‘valid’ they might be. As one researcher, V. Ia. Vainberg, put it: ‘We believe that in the absence of exact knowledge about the very mechanisms of intelligence, every method that assists in revealing the latter in one form or another has a right to existence’ (Vainberg, 1929b: 6). Of course, researchers such as Vainberg, based at the various institutes devoted to child and educational science, saw as their principal task the rigorous evaluation and improvement of testing methods (cf. also Vainberg, 1927, 1929a). However, this itself only contributed to a proliferation of tests, each of which was being developed and promoted as only a partial improvement on or complement to its rivals and predecessors.
The methodological pluralism of early-Soviet paedology more generally (in which the proliferation of testing methodologies played a significant part) was the consequence not just of a form of epistemological ‘bet spreading’, but also, crucially, of the instrumentalization of child science and its methodology by the Soviet educational administration. The latter sponsored and commissioned this research as a means of acquiring an arsenal of auditing instruments for rationally managing the organization of mass education as a field of labour in its own right – the one responsible for forging the country’s future labour force.
A vital consequence of such instrumentalization was the blurring between different types of testing, all of which became components of a single bureaucratic toolbox. For sure, tests were still being used for a whole variety of purposes – to diagnose mental deficiency and refer children to special schools, to evaluate new school entrants and stream classes, to monitor school performance more generally and profile teenagers for particular industries (Basov et al., 1930). However, as managerial instruments, ‘tests’ became valued primarily for their ability to audit an entire schoolchild population (as opposed to assessing, diagnosing, or profiling individual cases). Indeed, the prime object of paedology, created precisely by means of mass testing, became ‘the mass child’ [massovyi rebenok] – a normativized figure of the Soviet child population (Levinskii, 1927). This also meant that testing methodologies, devised to quantify very different kinds of norms in this population – e.g. those of (mental) ‘development’, those of (scholastic) ‘achievement’, and those of ‘health’ (vs. ‘pathology’) – ended up fused into tools in which distinctions between particular normative regimes were minimized to the point of irrelevance (NA RAO, f. 13 op. 1, dd. 408–33). 13
Crucially, the ultimate purpose of ‘testing’ as a bureaucratic instrument was the auditing of the achievements of the revolutionary state itself. What appeared as the mass testing of pupils – e.g. the so-called ‘end-of-year audit’ [zakliuchitel’nyi uchet], introduced in 1925–6 in lieu of exams (Mikhailychev, Karpova and Leonova, 2005a, 2005b, 2005c, 2005d, 2005e) – was actually a technology for auditing pedagogical labour. Its meaning lay in the bureaucratic measurement of the ‘production’ successes and failures of particular schools or whole educational districts, as well as in providing an account of managerial efficiency at the highest levels of the Soviet educational administration.
By 1927, the state-led instrumentalization of ‘testing’ resulted in the move, by the researchers themselves, to integrate testing technology in a more formal way. What they hoped for was to take charge of two simultaneous, if seemingly contradictory, drives: that of the ‘centrifugal’ (and seemingly chaotic) proliferation and diversification of testing methodologies; and that of the ‘centripetal’ fusion or levelling out of very different kinds of tests into a single, standardized type of auditing technology. With this in mind the leaders of Soviet testing founded the Moscow Testological Association, the first meeting of which took place in May 1927 (Kadnevskii, 2004: 370–3). 14 The association’s tasks, as articulated by P. P. Blonskii, were: to defend testing against ongoing attacks; to enable different institutes and researchers engaged in developing tests to communicate better with one another; and finally, to plan collectively future research and application agendas. The organization’s key vehicle was a non-periodical publication, entitled Tests: Theory and Practice.
These developments paralleled and were directly linked to the contemporaneous drive to integrate the methodologically even more pluralist paedology as such. Indeed, debates around testing also raged at the First All-Russian Paedology Congress, which took place in Moscow at the end of 1927 and the beginning of 1928. Views on tests expressed at the congress were very mixed (Bernshtein, 1928; Rybnikov, 1928; Zankov, 1928): while stringent criticism was voiced by a number of authoritative figures (e.g. K. N. Kornilov, at that point director of the Moscow Psychology Institute), there was a simultaneous call to continue with mental testing research, primarily in view of further standardizing its methodology. Yet the overall conclusion of the congress was that paedology had to rely on different methodologies, of which testing was, and had to be, only one.
This ambivalent attitude, which intermixed concern and criticism about the liabilities of testing, on the one hand, and the promotion and expansion of testing as an essential mass accounting tool, on the other, is also visible on the pages of the journal Paedology, which started to come out in 1928 (Leopoldoff, 2013). 15 Despite the fact that political criticism of testing ratcheted up in the course of the early 1930s (Kurek, 2004), the deployment of mass testing of various kinds remained widespread in Soviet schools, above all in the context of the schools’ efforts to cope with the implementation of universal primary education (Avanesov, 2004).
Yet what became important for the Stalinist state in this period was less the evaluation of pupils as part of streamlining the pedagogical process itself, and more the search for effective tools of bureaucratic, as well as broadly disciplinary, control over the work of teachers and educational bureaucracies, with a focus on the delivery of ambitious labour targets, determined by Stalinist ‘socialist construction’ (Anan’ev, 1935). Increasingly, though, statistics on ‘subnormal’ children came to serve as a bureaucratic measure of educational and managerial inefficiency pointing to the highest levels of the Soviet educational administration – the Commissariat of Education (Ewing, 2001).
The outcome was the aforementioned 1936 party decree that officially abolished paedology as a recognized discipline, while imposing a blanket ban on all forms of mass research in the field, and above all a ban on testing. What is important here is that Stalin was abolishing not a scientific method but a bureaucratic auditing tool. And what was radical about this measure was that the party thereby effectively abolished the Soviet child population norm itself (‘the mass child’), with whose establishment paedology had become associated, thanks primarily to its development of mass mental testing.
Conclusion
Despite the 1936 anti-paedology decree, the ban on testing, the dismantlement of institutes, the withdrawal of funding, and the purging of a number of key scientists and bureaucrats, testing was not completely eradicated in the Soviet Union, even if ‘tests’ could no longer be mentioned without qualifying them as pseudo-scientific and harmful, as a technology of the capitalist exploitation of the working classes (Kadnevskii, 2004). Although general intelligence tests based on population-wide statistical norms of mental development remained taboo, educational audits of one kind or another continued to be practised, albeit on a limited scale and in the guise of so-called ‘control assignments’ [kontrol’nye zadaniia]. Scholastic testing of this kind expanded especially from the 1960s, in the context of de-Stalinization, Khrushchev’s educational reforms and the cold war competition with the United States. Occasional discussions of testing methods in major psychology journals, such as Voprosy psikhologii [Questions of Psychology], tended to frame them primarily in educational terms (for example, as psychologically informed methods designed to improve the acquisition of mathematical concepts) and used substitute terminology, such as ‘experimental problems’ [eksperimental’nye zadachi]. It was only in the 1980s that psychometrics started to be revived more explicitly, and even then cautiously (e.g. at conferences in other eastern bloc countries). In the New Russia, though, since the collapse of the Soviet regime, psychological and scholastic testing, while both still contentious in many quarters, has seen a remarkable resurgence. This has resulted in a veritable ‘testology’ movement, especially under the banner of psychological and pedagogical ‘diagnostics’. Its promotion invariably includes a form of self-historicization that claims early-20th-century developments as a significant, if still somewhat controversial, point of origin (Avanesov, 2004; Kadnevskii, 2004).
However, as I have argued in this article, ‘mental testing’ as an historical phenomenon is not reducible to the methodology or technology of any particular scientific movement, be it testology or paedology. Crucial to the dynamics of the development, deployment and debating of mental testing in early-20th-century Russia is the fact that, as a practice-in-the-making, it was never restricted to some more or less narrowly circumscribed community of fellow specialists who could claim it as their practice (whether the latter were understood as experimental, pedagogical, clinical, or bureaucratic). There was no single science to which the mental test belonged, even though it might have been claimed forcefully by such ‘disciplinary’ entities as psychology, experimental pedagogy, or paedology. Instead, claimants and stakeholders were spread across a disparate network within a far from clearly demarcated multi-professional arena where different groups, with very different stakes and agendas, were constantly renegotiating the frameworks of debate in mutually discordant terms. In this context, ‘mental testing’ acquired different forms and guises. It thrived as a strategically ambiguous and flexibly interpreted ‘boundary object’ that interconnected a highly heterogeneous field, enabling a coexistence and cooperation of divergent occupational agendas and normative regimes.
Both the promotion and the criticism of mental testing were characterized by a certain ‘muddying of the waters’ which resulted from the displacement of the practice across the boundaries of psychological experimentation, scholastic assessment, medical diagnostics and bureaucratic accounting. In this context there could be no ‘consensus’ about what kinds of norms tests referred to and what types of deviations from them they were expected to establish. It was never fixed whether these norms and deviations referred to stages of psychological development or to standards of scholastic performance; whether they spoke of mental deficiency or bureaucratic efficiency. Most often they encapsulated all of these at the same time. Equally, there was no ‘consensus’ about how universal, accurate, or pertinent the norms established through particular ‘methods’ might be: they were accepted as dynamic – as provisional, temporary and variable from one child population group to the next and from one application to the other.
The cross-boundary mobility and ambiguity of mental testing practices were conducive to the spread of the practice and certainly helped its wider claims to legitimacy, even in the face of hostile criticism from authoritative figures within the scientific institution. Indeed, this ambiguity, characteristic of a boundary object, undoubtedly contributed to the fact that continuing denunciations of mental testing (which came both from within and without the wider child science movement) so often ended up in confusion. At the same time, this very same mobility and ambiguity (which always seemed beneficial more on the margins than at the cores of the better-established occupational and disciplinary fields) ensured that mental testing’s legitimacy claims at all times remained perilous, constantly teetering on the borderline between ‘science’ and ‘pseudo-science’. This was the case even (and perhaps especially) at the point where the state started to take an interest in this technology, investing in it and bestowing upon it a seal of legitimacy that it could just as easily remove, as happened with the 1936 anti-paedology decree. The ‘boundary object’ character of mental testing does not, of course, explain why mental testing was banned in the USSR in 1936, but it does show why it occupied such a pivotal, yet at the same time highly precarious, position in the complex phenomenon that was early-20th-century ‘child science’.
