Abstract
The influences of schooling and everyday experiences on cognitive development are typically confounded. In the present study, we unraveled the influence of chronological age and years of schooling on the development of general cognitive competency in a two-wave longitudinal design with a three-year interval among 181 Kharwar children in India, aged 6 to 12 years. Effects of chronological age and years of schooling on cognitive development could be estimated independently because of their weak correlation among the Kharwar and because of the many shared background characteristics of school drop-outs, children without schooling, and children with schooling. The same five cognitive measures, each with parallel school and everyday testing modes, were administered to all children on both occasions. The internal structures of both the school and the everyday tests were equivalent across time and to each other. In line with our expectations, analyses of the net development per year revealed a decrement of the effect of chronological age, which was stronger for everyday tests than school tests, and an increment of the effect of years of schooling, which was stronger for school tests than everyday tests. Schooling ought to be considered in all theories of cognitive development, with genuine attention toward the real cognitive advantages it has at each developmental level.
The near-perfect correlation between age and years of schooling in affluent countries, together with the considerable difficulty of implementing longitudinal assessment in studies of schooling effects, has seriously hampered research on the cognitive consequences of schooling. Over the decades, this combination of methodological complications has led to a situation in which the full scope of potential explanations of schooling effects has been contested and yet left untested. The debate is easily illustrated by two coexisting extreme positions on the cognitive consequences of schooling. On the one hand, years of schooling have been shown to have broad cognitive benefits, as evidenced in positive correlations with adult income, socioeconomic status, and health (Gottfredson, 1997; Strenze, 2007); on the other hand, having been taught concrete school skills, such as solving arithmetic problems, usually does not lead to better performance on analogous problems in everyday life (Schliemann, Carraher, & Ceci, 1997). In the present paper, we report a natural experiment in which we were able to implement a longitudinal assessment of cognitive test performance in an isolated sociocultural setting in Northeast India where chronological and educational age show a much weaker correlation than in affluent countries. We use parallel tasks with school-like and everyday item content to get a more direct understanding of the potential change mechanisms through which schooling affects cognitive test performance.
Disentangling the Effects of Schooling and Everyday Experiences
Children and adults who have attended formal schooling consistently score higher on standard cognitive performance tests than children and adults with little or no formal schooling, and this observed difference is often interpreted as evidence that formal schooling stimulates intellectual development (Ceci, 1990, 1991; Christian, Bachman, & Morrison, 2001; Stelzl, Merz, Ehlers, & Remer, 1995). Whether these general increases in test performance are really linked to intellectual development, and to the enhancement of higher order thinking and general cognitive competency as claimed, is difficult to verify (Brouwers, Mishra, & Van de Vijver, 2006; Cunningham & Carroll, 2011; Morrison, Smith, & Dow-Ehrensberger, 1995). In numerous studies researchers have tried to avoid the confounding of chronological and educational age (Ceci, 1991), but the relative influence of a single year of schooling at different chronological ages cannot be properly addressed in Western contexts. Schooling’s long-term effects are often studied by comparing schooled and unschooled children in various non-Western countries where school attendance is low (e.g., Serpell, 1993). However, in many countries with low percentages of children attending schools or with high percentages of dropouts, schooled children may have a richer and more stimulating family background that exposes them more to written language prior to schooling.
Early Hypotheses about Generalization
Most important among the early hypotheses on the generality of schooling were literacy (Vygotsky, 1978) and out-of-context instruction. Olson (1994) argues that people who have learned to read and write accept the premises, arguments, and conclusions presented in written text as self-sufficient and true, and do not look for evidence outside the written text to support or refute the claims made. Still, research has failed to find broad generalizations in the cognitive system because of literacy. In two settings where unschooled literates, unschooled illiterates, and formally schooled literates could be examined independently, it was found that the effects of literacy did not generalize beyond tasks that were closely associated with syllabic characteristics of language (Berry & Bennett, 1991; Scribner & Cole, 1981).
Proponents of general effects of schooling argue that the out-of-context instruction in school fills this gap and provides pupils with cognitive challenges not found in the everyday context, thereby enhancing the transfer of skills to new contexts (Bruner, 1966). There is no solid empirical support for this position either. For example, despite being skilled in decontextualized geometry, children schooled in the United States performed more poorly on visual tasks consisting of Mayan weaving patterns than Mayan children without Western schooling (Greenfield & Childs, 1977; Maynard & Greenfield, 2003). Interestingly, even with considerable skills in weaving complex visual patterns, Mayan children did not develop superior visual spatial skills that generalized to situations not involving weaving. From these findings, we conclude that theories that expect broad schooling effects on higher order thinking and general cognitive competency are not well supported.
Performance and Change Mechanisms
A major problem in understanding schooling effects is that we know very little about how children get from one level of performance to another (Siegler, 2000). Only in the last two decades have theories started to address more detailed aspects of learning that could enhance our understanding of schooling effects. In this paper we adopt an information processing perspective. The information processing account of learning consists of two unifying parts (Simon, 1962). The first part is a performance model; it describes the processes and strategies people use to solve problems and how they know solutions. The second part is a change model; it describes how performance at one stage moves to the next stage. Successful performance within an educational setting would somehow relate to the mindful coupling of school and various daily contexts. Socioculturally, learning implies that children apply skills they already possess to new contexts and that teachers help them to achieve it, that is “the context-specific approach seeks to understand how cognitive achievements, which are initially context specific, come to exert more general control over people’s behavior as they grow older” (LCHC, 1983, p. 299). The information processing perspective highlights how information processes and strategies adapt to their contexts of usage.
A growing body of evidence suggests that primary schools teach a variety of processing skills that increase the probability and strength of associations between information units from different sources (Fischer & Immordino-Yang, 2002). Reading may be the most obvious example. The ability to read and write provides words with a more complex meaning, but also fosters the composition of propositions (Elman et al., 1996). This is underscored by research on metaphors (Fischer & Immordino-Yang, 2002). Metaphors often capture conceptual structures related to, for example, movement, causality, or time (Lakoff & Johnson, 1980). In a similar way, schooling teaches a broad range of skills that create, facilitate, and strengthen associations between information from different contexts. In contrast to learning in school, everyday learning is geared toward participation in a unique setting (Bruner, 1966). An everyday setting can be a work setting, such as the father’s occupation that the child learns about during socialization, or the familiar home context where a child learns aspects of social relations. Learning in an everyday environment (e.g., scaffolding) is typically confined to optimal performance in a specific setting.
The second part of the information processing account of learning is the change model. It deals with how learners move from one stage of cognitive performance to the next and what this movement looks like. Change might consist of the generalization of processes and strategies from one context to another, and also the discrimination between contexts, thus enhancing the recognition that some processes or strategies are less effective in another context, and the proceduralization of reasoning and thinking, providing quick and reliable schemes to arrive at a solution (Klahr & MacWhinney, 1998). Long exposure to many different situations is necessary for generalization and discrimination. Each everyday context tends to have specialized strategies and children prefer to stick to those, even when provided with strategies that have proven to be effective across many different contexts. A good illustration of this resilience is the mathematical processing of Brazilian street vendors (Saxe, 1988). Ten-to-twelve-year-old candy sellers on the streets of Brazil’s cities solve arithmetical and ratio problems with large numbers in a way that is hardly affected by the formal math strategies taught in school. In school, these children continue to rely on the strategies they use while selling candy on the streets.
It takes children and adults much exposure and practice to accept those strategies they are not already familiar with (Blanchette & Dunbar, 2000). Even within a narrow domain like reading, it takes a long time before children link their decoding skills to their comprehension skills across different contexts. If a performance domain is open and not well defined, such as settings that involve the generalization or discrimination of reasoning strategies, even more training may be needed. Acknowledging that generalization and discrimination become gradually easier if a richer network is available to recognize where strategies apply, we expect a gradual acceleration of the effect of schooling on learning. Skills, frameworks, and concepts from a school context may not have been favored over those from various everyday contexts, but once a few contexts have been mastered, others will follow more easily. We expect the opposite pattern (a gradual deceleration of the increase) of the effect of everyday skills on learning. Optimal situation-specific performance may reach an asymptote relatively quickly. Therefore, we expect that everyday skills will boost learning in the beginning of the process and that their impact on general cognitive competency will gradually decline, notably because skills get overlearned and their application saturated.
Study Setting
The confounding of chronological and educational age is not very strong among the Kharwar, a society indigenous to southeast Uttar Pradesh and western Jharkhand, thereby allowing us to disentangle school effects from everyday cognitive development. The Kharwar live at a subsistence level of economy. Food patterns depend on the season: forest produce is available, whereas agricultural produce becomes scarce in the months following the monsoon season. The Kharwar grow rice, but as the soil has little water retention capacity and facilities for irrigation are limited, rice is available for only three or four months after the monsoon season. For the rest of the year the Kharwar depend on forest resources and animals. Malnutrition is common among children and, due to the lack of safe drinking water, diarrhea is an important health hazard.
The Kharwar were introduced to formal education in the late 1990s. Because of their strict reliance on agricultural and gathering occupations, there was no strong demand for jobs that require school skills. Schools were met with considerable skepticism at their introduction because having formal education did not guarantee better living conditions. Interest has grown lately. This increase reflects hopes for economic development, as schooling is perceived to help when setting up a small business. Moreover, the Indian government provides parents and the community with annual financial incentives to send children to school. As a result, poverty and status (which show only a small variation in the population) are not related to school enrollment and attendance.
The sample of the study was drawn from nine villages located on a forested plateau south of the Ganges valley. The villages are contained in the Naugarh Block area, which is the most remote region of Chandauli district. Situated at a distance of about 100 km south of the city of Varanasi, the whole region is quite underdeveloped. The region creates harsh conditions for human habitation and the villages are isolated and inaccessible. Naugarh is a market place that caters to the day-to-day needs of villages situated at distances of 8 to 25 km. Roads from Naugarh to neighboring villages are poor or even absent. Due to this isolation, people have only very little exposure to the outside world. The predominant belief system among the Kharwar is animism, but Hindu rituals and festivals are also observed and celebrated.
The everyday setting of the Kharwar provides children a natural setting for learning about the physical and social world, a type of learning that is described in the developmental literature as quite universal: Piagetian operations (Piaget, 2001) and personal memories related to the people they know, events they were part of, and objects they handled (Nelson & Fivush, 2004). Life in the villages is deemed sufficient to develop these universal skills. First, Kharwar children are expected and even encouraged to participate in a variety of chores and activities from the moment they are able to walk. Within the village surroundings, children take care of babies, clean the house, run errands, and purchase little items from local shops. Outside the village, major activities are collecting firewood, leaves (for making plates), and an indigenous nut from the forest. Herding of livestock is another important activity, which children first carry out with adults and later on their own. Therefore, all children learn to count animals that belong to them and those that belong to other families. Since the villages border immediately on the forests and fields, children also have first-hand experience of natural surroundings, the prevailing weather, and the effects of rain and drought. Together, these experiences lead to an understanding of transitivity, conservation, rotation, prediction and postdiction, and many other cognitive operations that are fundamental to abstract reasoning and related tasks that are measured in a standard cognitive test battery. Second, the development of autobiographical memory during childhood supports the development of temporal understanding, as well as the development of language skills, propositional thought, and the general storage in memory of names, facts, and categories, all skills related to tasks provided on standard cognitive test batteries.
The quality of schooling is in line with the overall low socioeconomic development of the region. Not all villages have a school, so children from a village without a school have to walk to the nearest village that has one. The only schools available in the area were primary schools, teaching grades one to six. The teachers (all males) completed secondary school, which makes them the best-educated people in the entire area, but they did not have additional teacher training. It is very common to find children of different ages within the same grade level.
The curriculum and teaching methodology are very similar to those practiced in other rural schools throughout developing countries, where schooling is made relevant to local conditions and teachers are seen as change agents. In addition to the conventional subjects of math, science, social science, and Hindi, which are taught in a traditional teacher-centered method, teaching also includes recitation and substantial time spent on ethics, physical health, and moralistic perseverance. The agricultural background of the Kharwar features extensively in teaching, with lessons not restricted to the classroom, but conducted outside as well, including nature walks, where natural science concepts are explained through hands-on activities and field experiences.
The Present Study
In the present study we employ the particular setting of the Kharwar, where there is little natural confounding of chronological and educational age, as the general context of a natural experiment (e.g., Scheier, 1959). Earlier, natural experiments have been used to examine literacy effects among adults (Scribner & Cole, 1981; also see Berry & Bennett, 1991). In the present natural experiment we focus on middle childhood. The age of enrolment in primary education varies considerably in this society and the children who attend school do this for different lengths of time. This natural variation allows us to include in our study children at varying ages with different amounts of schooling, including children without any schooling (Mishra & Dasen, 2004). Natural variation of chronological and educational age in our study is studied in a two-wave longitudinal design. In this way we can address developmental increments at different levels of chronological and educational age. Many selection biases that typically challenge studies that compare chronological and educational age, such as lower dropout rates among boys and children of lower socioeconomic strata, could be neglected, because all children came from the same, lowest caste.
Ecological validity is known to have a large influence on test scores, with more valid tasks usually yielding higher performance. Bronfenbrenner defined ecological validity as “the extent to which the environment experienced by the subjects in a scientific investigation has the properties it is supposed or assumed to have by the experimenter” (Bronfenbrenner, 1977, p. 516). Ecological validity thus refers to the “potential utility of various cues for organisms in their ecology” (Hammond, 1978, p. 8). In the context of our study, ecological validity refers to the link between task contents and everyday life; tasks with a higher ecological validity use materials that are closer to everyday objects and experiences of participants. Schooling has been said to have an influence on stimulus familiarity and many cognitive tests employ school-related tasks. We can expect schooled children to perform better on such tasks. Another issue is whether ecological validity would influence the underlying structure. There are no indications in the cross-cultural psychological literature that stimulus familiarity is associated with a different cognitive architecture (Berry, Poortinga, Segall, & Dasen, 2002; Ferguson, 1956; Malda, Van de Vijver, & Temane, 2010; Schliemann, Carraher, & Ceci, 1997; Van de Vijver, 2002). However, we intended to examine stimulus familiarity more systematically, even if stimulus familiarity is not expected to influence the factor structure. Therefore, we factored familiarity into our design by constructing two parallel batteries, each of which assesses different but comparable general factors: one battery that deals with school-like cognitive development (lower ecological validity) and one that deals with everyday cognitive development (higher ecological validity).
We employed features of the natural environment of the Kharwar to design two sets of parallel tasks in which related processes are required for the successful completion of parallel measures, but in which the domain of application of these processes is either more similar to typical school tasks or more similar to local tasks specific to the Kharwar. Such a combination of universal processes from the literature with local information ensures a culture-informed test battery (Brouwers & Van de Vijver, 2015). Differences between tasks from the same domain reside in the stimuli used to access those information processes. Reasoning that consists of transforming a few stimuli into a new conclusion, through an analogy, for example, can thus be accessed in a school-like fashion by using abstract stimuli in a matrix design, while for a local Kharwar task having to draw an analogy can be operationalized by using objects from the immediate environment, such as rain puddles and drought, to design an item about future events. The underlying information processes targeted are thus identical for both school and everyday tasks drawn from the same processing domain, but the layout or operationalization of the tasks is quite different.
Overall, for the Kharwar tasks we used objects and small three-dimensional models and avoided the use of language as much as possible. These different modes of assessment have a bearing on our two hypotheses. First, we expect a decrease of the influence of everyday context with age. This pattern should be visible in the decrement of the development per year across chronological age for everyday tests. Second, we expect an increase in the influence of school context with age. This pattern should be visible in the increment of the development across educational age and be stronger for the school tests than for the everyday tests. The parallel test versions act as replication. A similar pattern of development of everyday- and school-related skills (the opposite of our prediction) would suggest that both types of environments would have the same influence and reinforce each other in cognitive development.
Method
Participants
One hundred and eighty-one children took part in this study; the youngest was six years old at the first test administration and the oldest twelve years old at the second test administration. Table 1 shows the composition of the sample in terms of chronological age and differences in years of schooling. All children were recruited with the help of an experienced recruiter who is well connected with the community and school teachers through his work for a non-governmental organization (NGO) that was active in the region; the parents of the participating children provided their consent. One selection criterion for inclusion in the study was that each child was typically developing; all children in the study were deemed typically developing and fit for inclusion. Getting exact birthdates in a community where most people cannot read or write is difficult. For this reason, age in years was mostly reconstructed from other evidence, in discussion between test assistants, who were themselves from the community, and the children’s larger family. These estimates are dependable concerning the age in years but do not provide sufficient detail at the level of the day or even the month. The community from which the children are drawn is culturally and linguistically very homogeneous: they are at the same lowest socioeconomic level and all speak a dialect of Hindi that is often referred to as Kharwar.
Table 1. Numbers of Children by Chronological and Educational Age.
Note. Percentages of boys are given in brackets.
Cognitive Measures
In the present study we assess intellectual performance in line with extant literature, which amounts to using a test battery that consists of several processing domains so as to determine the general factor later in the analysis. We deviate from the traditional test assessment procedure by having two parallel batteries, each of which assesses different but comparable general factors. The overall procedure we followed to arrive at our two final test batteries was based on a mixture of bottom-up and top-down decisions. We started from Carroll’s (1993) structural model of cognitive abilities and chose five domains from this model that could be assessed in an everyday and school context: Reasoning, Vocabulary, Figure Discrimination, Memory, and Arithmetic. Based on cross-cultural literature (e.g., Cole, 1996; Lave, 1997; Posner, 1982) and Carroll’s task descriptions, we selected two parallel tasks for each of the five cognitive domains and subsequently used pilot testing to examine the ease of administration of our parallel tasks with children from the local community. The initial task selection showed that it was not easy to find parallel everyday counterparts for all cognitive domains. For example, for reasoning very few everyday tests exist in the literature that are also based on inductive processes, which is the hallmark of reasoning and also the basis for our school-like reasoning test. Similarly, finding good vocabulary tests for children that grew up in a community with very low literacy rates proved difficult.
While the pilot testing of nearly 20 children led to adjustments of the total length of the initial test battery as a whole, very few findings about alternative test strategies could be gathered. The combination of a universal model from the literature with local information derived through pilot testing and interviews with local experts led to a culture-informed test battery that measured cognitive performance in two different, yet comparable ways. For each domain we discuss the two subtest versions and their administration.
Reasoning
We employed the Raven’s Colored Progressive Matrices (Raven, 2000) to assess Reasoning in a school fashion. The instrument was chosen for its appropriateness in research with participants who are unfamiliar with the more abstract format of the Standard Progressive Matrices. The Colored version includes 36 matrices arranged in three sets of twelve matrices in increasing level of difficulty. For our purpose, a shortened version with two example matrices and 18 scored matrices was employed. The number of items was reduced because the pilot study had shown that the administration of more than 20 matrices decreased the motivation of the children and that this would carry over to subsequent tests. Everyday reasoning was tested with four items that we adapted from Sternberg and Kalmar’s (1998) set of computerized items that deal with state changes in everyday objects. Sternberg and Kalmar’s items involved objects in some state and their subjects had to predict a possible future state of that object or postdict its past state. The four items that we adapted used small but life-like models of objects familiar to children in this cultural setting: (1) How long will this tree be in ten years? (2) How high will the level of water in this bottle be in one month? (3) What was the color of this yellow leaf three days ago? (4) How tall will this tree be in ten years? Children could choose from three alternatives. Items on both test versions were scored as either correct or incorrect. Combined across T1 and T2, the Raven items (36 in total) showed a reliability (Cronbach’s alpha) of .52. This rather low value may be explained by the foreignness of the items to the children of the Kharwar; the items were perceived as tiresome, and even when shortened to 18 items per occasion, the test as a whole was viewed as rather long. Combined across T1 and T2, the everyday reasoning items (eight items in total) showed a reliability of .72.
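As an illustration of the reliability coefficient reported for each test, the following minimal sketch computes Cronbach's alpha from a children-by-items score matrix using the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The function name and the small data set are hypothetical and serve only to show the computation; they are not the study's data or analysis code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per child, one column per item)."""
    k = len(scores[0])  # number of items
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across children
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the children's total scores
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical dichotomous item scores (1 = correct, 0 = incorrect)
demo_scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]
alpha = cronbach_alpha(demo_scores)
print(round(alpha, 2))
```

With few items, as for the four everyday reasoning items per occasion, alpha is sensitive to each individual item, which is one reason short scales can yield modest coefficients.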
Vocabulary
Formal vocabulary was measured with a free association task. Children were asked to enumerate as many objects as they could from three representative categories: things to eat, animals, and boys’ and girls’ names. The test score was the number of correctly mentioned names. For the local, everyday naming test a series of 14 pictures from locally available children’s posters were used. The pictures were shown to the children one by one and they had to provide the correct name of the object. When an object was named correctly, two points were given; when the response was incorrect but fell in the same category (e.g., calling a melon an orange), a single point was given; in other cases no points were given. Combined across T1 and T2, the formal items (28 items) showed a reliability of .63 and the local items (6) one of .40. The local items might be interpreted as a measure of comfort and feeling at ease with a school-like testing situation and for this reason could show a lower than expected level of replicability across children in the sample.
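The partial-credit rule of the everyday naming test (two points for the correct name, one point for an incorrect name from the same category, zero otherwise) can be sketched as follows. The category lookup table and the function name are hypothetical; in the study the judgment was made by the experimenters, and the actual item categories are not listed here.

```python
def score_naming_item(response, target, categories):
    """Return 2 for the correct name, 1 for a same-category error, 0 otherwise."""
    if response == target:
        return 2
    response_category = categories.get(response)
    # A wrong name earns one point only if it belongs to the target's category
    if response_category is not None and response_category == categories.get(target):
        return 1
    return 0


# Hypothetical category lookup used only for this illustration
demo_categories = {"melon": "fruit", "orange": "fruit", "parrot": "animal"}

print(score_naming_item("melon", "melon", demo_categories))   # correct name
print(score_naming_item("orange", "melon", demo_categories))  # same category
print(score_naming_item("parrot", "melon", demo_categories))  # other category
```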
Figure Discrimination
We used six search items to assess figure discrimination. The layout of these items involved a large square drawn on paper, with 20 figures depicted within it. For the first three items the figures consisted of geometric figures such as squares and triangles; these three items were used to measure school-like cognitive functioning. The second set of three items consisted of concrete figures such as parrots and umbrellas; these were used to measure everyday cognitive functioning. For all items children had to find and point out the figures in the square that matched the four figures in the row beneath the square. The first item of each set was used as an example item. The items on both versions were scored as either correct or incorrect. Combined across T1 and T2 and the two test modes (as four scores), Figure Discrimination showed a reliability of .50. The mixture of multiple tasks into a single whole may have reduced the reliability. In retrospect, a larger set of independent item scores that could have been subjected to item-level analysis later on would have been preferable in a novel study area.
Memory
Memory was assessed in a school-like fashion by reading 12 words slowly in a fixed random order to the children, after which they were asked to recall these words from memory in any order. Everyday memory was assessed in a similar way, but now the children were shown 12 objects, taken in random order from a bag and then placed in front of them. After the last object was shown, the objects were covered with a towel and the child was asked to recall the objects in any order. The words in the school condition and the objects in the everyday condition came from four categories (i.e., food, beauty products, tools, and animals), with three words or objects per category. Each correctly recalled word or object was scored as one and each missing word or object as zero. Combined across T1 and T2 and the two test modes (as four scores), Memory on the whole showed a reliability of .49. While the underlying tasks were generally perceived as fun and doable by the children, the combination of categories into a small number of items at each occasion may have hampered reliability.
Arithmetic
All children were asked eighteen mathematical questions. Nine questions were asked in a formal school style (e.g., “How much is two plus two?”). These nine questions were also embedded in stories (e.g., “Sawita saw two birds this morning and two this afternoon. How many birds did she see today?”). The eighteen test items consisted of six additions, six subtractions, and six multiplications. All items were open-ended and scored as either correct or incorrect. Combined across T1 and T2, the formal items (18 items) showed a reliability of .87 and the local items (18) a reliability of .83. These reliabilities are good.
Demographic Information
A biographical questionnaire was administered to assess six socioeconomic and family characteristics: father’s education in years, amount of land owned by the family (in bigha; 1 bigha = 2468 m²), months per year the father is in wage employment, number of cows owned by the family, total size of household, and whether the family owns a radio (yes/no). These variables were chosen to reflect, on the one hand, contact with the world outside their own village and, on the other hand, economic status and material wealth in the rural area where the study was conducted. The variables are treated separately in the analysis to address contextual variation that might exist in the sample.
Procedure
All 181 children were tested twice. Test administration took place in two periods of five weeks: the first in November and December 2001 and the second in January and February 2005. The tests were administered individually and in an order that was held constant across children and both time points: Reasoning, Vocabulary, Figure Discrimination, Memory, and Arithmetic. Within each test domain, the everyday version immediately followed the school version. A single administration of the test battery took approximately one hour. Children were tested in a secluded area (mostly a courtyard) in their village that was especially assigned for the purpose. Each child was tested on both occasions by the same experimenter. All three experimenters belonged to the local community and had previous experience with test administration. All three had ten years of schooling and worked as primary school teachers in their own village. Thus, the experimenters were well known to the children and familiar with their cultural practices. At both T1 and T2, the experimenters received training prior to testing. Children were tested in their mother tongue, the local unofficial dialect of Hindi that the three testers also had as their mother tongue.
Results
We first report an operationalization check to see whether the natural experiment of our study was successful in disentangling chronological and educational age. We then present the descriptive statistics of the measures. This is followed by an examination of the internal structure of the test battery at T1 and at T2 using multigroup analysis. Finally, the impact of both chronological age and educational age on the differences between T1 and T2 is examined by analyzing annual score increments separately for the school-based tests and everyday tests in a series of analyses of covariance.
Operationalization Check
Our natural experiment set out to disentangle chronological and educational age. In the analysis of the data, chronological age was defined as the number of years since birth. Educational age was defined as the number of years since the child started attending school; children in the first grade were assigned an educational age of one year, children in the second grade an educational age of two years, and so on. Children who had never attended school were assigned an educational age of zero years. Table 2 shows the correlations between the age-related and family background variables. Relations with Contact and Affluence were not significant. The correlation between age and school attendance at T1 disappeared at T2. The negative correlation between chronological age at T1 and years of schooling received between T1 and T2 revealed that older children tended to have less schooling than younger children in the same period, which can be expected in an environment with school dropout. In our longitudinal analysis, chronological age always had the same within-subject value of three years, but this three-year interval was analyzed for its impact on six-, seven-, eight-, and nine-year-olds; educational age was operationalized as the number of years of schooling in between the two assessment points and could vary from zero to three years. It can be concluded that the context of our study was adequate for the disentanglement of educational and chronological age.
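The coding of educational age and the check on its correlation with chronological age can be sketched as follows. This is a minimal illustration and not the analysis script of the study: the function names are hypothetical, the example children are invented, and the coding covers only the two cases stated above (a grade number, or no schooling at all).

```python
from math import sqrt


def educational_age(grade):
    """Educational age in years: grade 1 -> 1, grade 2 -> 2, and so on.

    grade is None for a child who never attended school (educational age 0).
    """
    return 0 if grade is None else grade


def pearson_r(x, y):
    """Pearson product-moment correlation between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented example: chronological age and grade (None = never attended school).
ages = [6, 7, 8, 9, 6, 7, 8, 9]
grades = [1, None, 2, None, None, 2, 1, 3]
edu_ages = [educational_age(g) for g in grades]
r_age_schooling = pearson_r(ages, edu_ages)
```

A value of r close to 1 would indicate the confound that is typical of affluent countries; a weak correlation, as in the present sample, is what allows the two age variables to be entered as separate factors.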
Correlations between Family Background and Chronological and Educational Age.
Note. N = 181. *p < .05. **p < .01.
Descriptives
Mean scores for each of the five domains were constructed by averaging the item scores for each cognitive domain separately for T1 and T2, and separately for the school-related and the parallel everyday version of the tests. The resulting 20 scores (5 domains × 2 times × 2 versions) were standardized by subtracting the domain-specific mean of T1 (also for scores at T2) and then dividing this number by the domain-specific pooled standard deviation across T1 and T2. Table 3 presents the 20 unstandardized means and their standard deviations for the cognitive domains at the first and second time point, along with values for skewness and kurtosis. A preliminary analysis supported the adequacy of the measures; for the nine scales with a maximum attainable performance, the scores were well above chance level, while the one scale with an open-ended score showed strong individual variation. Furthermore, mean scores at T2 are higher than those at T1, suggesting that development took place, whereas the standard deviations follow the opposite pattern, decreasing from T1 to T2. Both skewness and kurtosis for school-like Vocabulary were strong at T1 and T2 (skewness being –2.75 and –9.16, respectively, kurtosis 11.51 and 88.13). This may be due to the way in which the items were scored. Children could continue naming objects as long as they wanted, and with a few exceptions, most appeared to have done that.
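The standardization described above can be written out as a small sketch. The text does not specify how the standard deviation was pooled; the code assumes the square root of the average of the T1 and T2 variances, which is appropriate when the same children are tested on both occasions. The function name and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardize_domain(t1_scores, t2_scores):
    """Standardize the T1 and T2 scores of one domain and test mode.

    Both waves are centered on the T1 mean and divided by a pooled SD.
    Assumption: pooled SD = sqrt of the mean of the two sample variances.
    """
    t1_mean = mean(t1_scores)
    pooled_sd = sqrt((variance(t1_scores) + variance(t2_scores)) / 2)
    z_t1 = [(x - t1_mean) / pooled_sd for x in t1_scores]
    z_t2 = [(x - t1_mean) / pooled_sd for x in t2_scores]
    return z_t1, z_t2


# Invented scores for three children on one domain.
z_t1, z_t2 = standardize_domain([1, 2, 3], [3, 4, 5])
growth_in_sd_units = mean(z_t2) - mean(z_t1)
```

Because both waves are centered on the T1 mean, the standardized T1 mean is zero by construction and the standardized T2 mean expresses growth in pooled standard deviation units, which makes the domains comparable despite their different score ranges.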
Unstandardized Means and Standard Deviations Per Domain and Year.
Note. N = 181. School-related Reasoning consisted of 18 items that ranged from 0 (incorrect) to 1 (correct); school-related Vocabulary consisted of 3 items, each reflecting the children’s personal free recall; school-related Figure Discrimination consisted of 3 items that ranged from 0 (incorrect) to 1 (correct); school-related Memory consisted of 12 items that ranged from 0 (incorrect) to 1 (correct); and school-related Arithmetic consisted of 9 items that ranged from 0 (incorrect) to 1 (correct). Everyday Reasoning consisted of 4 items that ranged from 0 (incorrect) to 1 (correct); everyday Vocabulary consisted of 14 items that ranged from 0 (incorrect) to 2 (correct); everyday Figure Discrimination consisted of 3 items that ranged from 0 (incorrect) to 1 (correct); everyday Memory consisted of 12 items that ranged from 0 (incorrect) to 1 (correct); and everyday Arithmetic consisted of 9 items that ranged from 0 (incorrect) to 1 (correct).
aThe number does not reflect the total number of items, but the highest score observed.
Structural Analysis
In line with current thinking about intelligence (e.g., Carroll, 1993) and the rationale behind our assessment battery as described before, we assumed that a single ability would underlie performance on the tests in the school mode and a single ability would underlie performance on the everyday tests. Using multigroup confirmatory factor analyses (CFAs), we examined whether the school and everyday tests exhibited invariant factorial structures at T1 and T2, so that mean scores at the two time points can be compared (scalar invariance). One multigroup CFA examined invariance for the five school tests, with T1 as the first group and T2 as the second group; a second multigroup CFA examined invariance for the five everyday tests, again with T1 as one group and T2 as the other. Each of the two CFAs consisted of the five domains and one latent variable (see Figure 1), and we examined to what degree the models at T1 and T2 were equivalent. Invariance is tested at four nested levels. The highest level of equivalence (3) is reached when all parameters in the model have the same value at T1 and T2, that is, the same pattern of zero and non-zero loadings, the same measurement weights, and the same measurement intercepts. At the next level (2), the models have the same configuration and the same measurement weights at T1 and T2, but the measurement intercepts differ (i.e., at T1 the intercepts can be larger or smaller than at T2). At level (1), the weights as well as the intercepts differ between T1 and T2, so that only the configuration of test loadings (i.e., the same pattern of zero and non-zero loadings) is the same at both times. Finally, the lowest level (0) applies when even the configuration of the models differs between the two times. For each CFA, a single statistical test shows which level of equivalence is the most fitting.

Internal Structures of the School-like and Everyday Tests Batteries at T1 and T2.
The resulting fit indices of the two multigroup analyses are presented in Table 4. The most restrictive model with an adequate fit for both the school and everyday modes was the measurement residuals model. The fit of the CFA for the school battery was good, χ²(25, N = 181) = 40.47, χ²/df = 1.62, p < .05, 95% CI [14.61–37.65], CFI = .94, and RMSEA = .04; and the fit of the CFA for the parallel everyday battery was also good, χ²(25, N = 181) = 39.08, χ²/df = 1.56, p < .01, 95% CI [14.61–37.65], CFI = .95, and RMSEA = .04. These analyses corroborate that the means at T1 and T2 show scalar invariance and can be meaningfully compared. Figure 1 shows the factor loadings for each model. The internal consistency at T1 was .73 averaged for the five school tests and .78 averaged for the everyday tests, which is adequate. The standard deviations of the cognitive domains decreased from .71 to .54 for the five school tests and from .81 to .40 for the everyday tests.
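Two of the reported fit indices can be recomputed from the χ² values as a plausibility check. The sketch below uses the common point-estimate formula for the RMSEA, sqrt(max(χ² − df, 0) / (df × (N − 1))). It assumes that N is the total number of observations across the two groups in each multigroup model (2 × 181 = 362); this is an assumption, since the text does not state how the RMSEA was computed. The function names are hypothetical.

```python
from math import sqrt


def chi2_df_ratio(chi2, df):
    """Relative chi-square: the chi-square statistic divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """Common point-estimate formula for the RMSEA.

    n is the number of observations entering the model. Assumption here:
    the two waves are pooled, so n = 2 * 181 = 362 in the examples below.
    """
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


POOLED_N = 2 * 181  # assumed: T1 and T2 treated as two groups of 181 children

school_ratio = round(chi2_df_ratio(40.47, 25), 2)    # 1.62
everyday_ratio = round(chi2_df_ratio(39.08, 25), 2)  # 1.56
school_rmsea = round(rmsea(40.47, 25, POOLED_N), 2)      # 0.04
everyday_rmsea = round(rmsea(39.08, 25, POOLED_N), 2)    # 0.04
```

Under this assumption both ratios and both RMSEA values agree with the reported figures. With N = 181 the same formula yields approximately .06 for both batteries, so the reported value of .04 appears to rest on the pooled number of observations.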
Fit Indices of the Four CFA Multigroup Models by School-like and Everyday Test Batteries.
Note. N =181.
Analysis of Change
Table 5 reports the correlations between school and everyday tests at T1 as well as the correlations between school and everyday tests at T2. Each correlation between two matching constructs is positive and significant (p < .01), except for Figure Discrimination at T2, which bordered on significance (r = .13, p = .072). Overall, the correlations at T2 are smaller than those at T1.
Correlations between School and Everyday Tests at T1 and T2.
Note. N = 181.
Significance of the expected effects was examined by a Repeated Measures Analysis of Covariance (ANCOVA). For each test mode (school, everyday), the within-subject factor is defined by the test scores at T1 and T2. Chronological age at T1 (6, 7, 8, 9 years) and years of schooling in between T1 and T2 (0, 1, 2, 3 years) make up the between-subject factors. The covariates were gender (girls = 0; boys = 1), six separate context variables (father’s education in years, amount of land, months of wage employment, number of cows owned, total size of household, and ownership of radio), and the difference between observed grade at T1 and expected grade at T1 with a normal age of enrollment (3, 2, 1, 0); the last covariate was included to compensate for the skewed distribution of years of schooling across chronological age.
The course of development from T1 to T2 can be deduced from Table 6, which shows Cohen’s d effect sizes of the estimated marginal means derived from the ANCOVA. The analyses underscore our expectation of non-linearity: F(3, 163) = 10.06, p < .01, 95% CI [0.12–2.66], (partial) η² = .16 for the interaction with chronological age, F(3, 163) = 2.98, p < .05, 95% CI [0.12–2.66], η² = .05 for the interaction with years of schooling. Increase in the performance on the school test also interacted significantly with the everyday test: F(3, 163) = 4.09, p < .01, 95% CI [0.12–2.66], η² = .07 for chronological age, F(3, 163) = 3.52, p < .05, 95% CI [0.12–2.66], η² = .06 for schooling. Context did affect performance in a small number of ways, despite our efforts to avoid those effects. One way was through gender, with boys performing better than girls: F(1, 163) = 5.45, p < .05, 95% CI [0.00–3.90], η² = .03 for the everyday test, F(1, 163) = 9.04, p < .01, 95% CI [0.00–3.90], η² = .05 for the school test. Amount of land owned by the family showed an impact on performance: there was a significant difference between subjects, F(1, 157) = 6.24, p < .05, 95% CI [0.00–3.90], η² = .05, but amount of land affected the interaction of the school with everyday tests as well, F(1, 163) = 4.44, p < .05, 95% CI [0.00–3.90], η² = .03. Owning a radio affected the interaction of school with everyday tests in a similar way, F(1, 163) = 4.80, p < .05, 95% CI [0.00–3.90], η² = .03. There were no other significant effects of the context variables.
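The partial η² values can be recovered from the F ratios and their degrees of freedom through the identity η²p = (F × df_effect) / (F × df_effect + df_error). The sketch below applies this identity to four of the effects reported above; the function name is hypothetical and the code is meant as a reading aid, not as a reproduction of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# Interactions of growth with chronological age and with years of schooling.
eta_age = round(partial_eta_squared(10.06, 3, 163), 2)        # 0.16
eta_schooling = round(partial_eta_squared(2.98, 3, 163), 2)   # 0.05

# Gender effects for the everyday and the school tests.
eta_gender_everyday = round(partial_eta_squared(5.45, 1, 163), 2)  # 0.03
eta_gender_school = round(partial_eta_squared(9.04, 1, 163), 2)    # 0.05
```

Read this way, the interaction of growth with chronological age accounts for roughly three times as much variance as the interaction with years of schooling, once the other factors and covariates are partialled out.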
Score Increments by Chronological and Educational Age.ᵃ
Note. N = 181.
ᵃ The numbers in the cells show the Cohen’s d effect sizes of the estimated marginal means of the growth from T1 to T2 that followed from the Repeated Measures Analysis of Covariance.
Effects of chronological age and years of schooling were stronger on the tests pertaining to their own domain. More specifically, for chronological age the score increments for the school domain decreased from d = .90 to d = .46, whereas the score increments for the everyday domain showed a decrement from d = 1.30 to d = .18. So, it appears that the decline was steeper for the everyday domain than for the school domain, which supports the significant interaction of chronological age with test mode. The opposite pattern is shown by the effect sizes for years of schooling. Here, children without schooling between T1 and T2 showed an increase of d = .61 for the school domain, whereas children with the maximum of three years of schooling showed an increase of d = .97. Increases in the everyday domain appeared to be slightly smaller, with d = .34 for children without schooling between T1 and T2 and d = .80 for those children with three years of schooling.
Discussion
We conducted a longitudinal study to examine the association between schooling and everyday cognitive development on the one hand and general cognitive competency on the other. The natural setting of the Kharwar enabled us to disentangle the effects of schooling and everyday experiences. Effect sizes of chronological age demonstrated a pattern of diminished growth with age in middle childhood. Chronological age was thus a more important predictor of cognitive development for younger than for older children (diminishing return). The reverse pattern was found for educational age: the influence of schooling increased with years of education (increasing return). Test material played a crucial role in the strengths of these effects in both testing modes. The effect of chronological age was strongest for the (ecologically more valid) everyday tests, which is plausible because school content is usually not the focus of everyday life. Analogously, effects of educational age were strongest for the school tests, presumably because these better resemble what children are taught in school. Despite these smaller effects of chronological and educational age on the non-matching test version, similar developmental patterns for each predictor across versions point to an increase of general cognitive competency. Development is not just limited to familiarization with stimulus materials in one context, but also consists of acquiring representations and principles that enhance performance in many different settings. The contrasting patterns in generalization demonstrate the relevance of disentangling chronological and educational age for the study of cognitive development as well as the presence of mode-specific effects in cognitive development. These findings have implications for the specificity of developmental mechanisms and the caution required when interpreting cognitive ability scores. We discuss each area in turn.
Specificity and Generalization
The problem of the specificity and generality of the cognitive system still looms large. Our study provides new data. On the one hand, we found evidence for the general cognitive factor that is commonly found in intelligence research and supports the generality argument. On the other hand, we also found evidence for mode-specificity in performance. In addition, we found evidence that in middle childhood, everyday experiences tend to contribute less and less to cognitive development, whereas the opposite was found for years of schooling. We argue that this differential influence is related to the specificity and generality aspects of everyday and school learning. Learning based on everyday life is typically not geared to generalizability of solution strategies. Schooling, however, is typically more focused on the generalizability of learned skills; much instruction explicitly focuses on the transfer of learned skills to new contexts. Humans are often not very capable of recognizing when new problems require the use of old solution strategies. This transfer training requires much intellectual effort. Not surprisingly, the effects of schooling become more salient with mass practice, which requires much time to build up. Schools provide multicontext instruction; these multiple contexts are essential for transfer of skills. It may be noted that transfer is a topic not only in academic matters in school curricula, but also in life skills taught in class. Thus, Durlak, Weissberg, Dymnicki, Taylor, and Schellinger (2011) describe that in interventions aimed at social health, “skills may be taught, modeled, practiced, and applied to diverse situations so that students use them as part of their daily repertoire of behaviors” (p. 475). So, generalization does not come indiscriminately and automatically. The breadth of application of each new representation or principle has to be taught and trained. Attention to context in the teaching process is essential.
Specificity in Assessment
In accordance with much cross-cultural psychological research, we found that performance is closely related to familiarity with the expectations and strategies specific to the different contexts; the ecological validity of the tasks is a vital factor for the performance levels to be found (Berry et al., 2002; Ferguson, 1956; Malda et al., 2010; Schliemann et al., 1997; Van de Vijver, 2002). Tests of cognitive ability typically overlap with school tasks (Mishra, 1997; Rogoff, 1981; Schliemann et al., 1997), and familiarity with these tasks depends on children’s degree of exposure to Western culture (Mishra, Sinha, & Berry, 1996). Prior exposure enhances children’s performance on mental tests (Helms-Lorenz, Van de Vijver, & Poortinga, 2003; Van de Vijver, 1997), but environmental constraints like child labor, malnutrition, and pandemics reduce exposure and put children in developmental jeopardy (Fagan & Holland, 2002; Feuerstein, 1979).
Ecological validity is related to generalizability: children perform better when tasks given to them are derived directly from their own immediate everyday context. Sternberg and associates (2002) observed among rural Tanzanian school-attending children that children who were familiarized, during an intermediate training session, with the skills and strategies contributing to success on tests of general cognitive ability increased their performance from pretest to posttest significantly more than a group without any specialized training. Moreover, pretest–posttest correlations were weak and posttest scores were better predictors of reference measures of general cognitive ability. When a test is unfamiliar to a child, the child has to pay attention to identifying the test, its underlying elements, and the principles and mental representations required for an efficient execution of processes. With ongoing practice, children will learn procedures for executing the relevant processes (cf. Ackerman, 1986). Our findings demonstrate that this practice results in an improved performance on tests of cognitive ability already in the earliest stages of schooling.
Finally, we mention some limitations of our study. The combination of rather low reliabilities for some individual tests and higher overall structural equivalence for the entire test battery leaves room for improvement. It is important not only that future studies focus on stimulus familiarity within the targeted age group alone (as we did), but also that additional ethnographic observations of everyday chores and operations are conducted so as to gain a more detailed understanding of the cognitive processes that underlie the tasks. This more detailed understanding of the cognitive processes might have provided a larger item pool, thereby increasing the reliabilities of instruments without challenging the ecological validity. Stimulus familiarity by itself may not be sufficient to guarantee high psychometric accuracy. Another limitation is the rather limited sample size. More specifically, a larger sample size would have enabled a more detailed analysis of gender differences. There was some evidence that boys scored higher than girls on our test items. The nature of this difference is not well understood. In developing countries this gender difference is common, more so than in developed countries. To our knowledge no formal examination of this difference has been conducted, but some researchers have offered possible explanations. For example, Berry and associates (1986) reasoned that girls are given everyday chores that keep them inside the family home, while boys are given chores that allow them to move further away, thus acquiring a much wider and more diverse range of experiences of the social and natural world.
Conclusion
Our study is relevant for the debate on the role of schooling in child development and for ways to address the topic in school settings (Fischer & Immordino-Yang, 2002). Theoretically, the most important implication is that schooling ought to be considered in all theories of cognitive development and not only in theories on the role of culture in cognitive development, as typically is the case now. Development over time is often understood as linear, with researchers assuming that linearity points to a single underlying driving force, such as literacy, maturation, or experience (Schaie, 1965; Vygotsky, 1978). However, our study suggests that cognitive development in childhood is based on at least two developments (based on schooling and everyday life) in an invariant general structure of intelligence (Mackintosh, 1998). Practically, this distinction between developmental pathways implies that the general effects of schooling only seem to count with longer periods in school. A community benefits from schooling only by keeping children in school for longer periods (probably longer than one or two years) and by having teachers who gradually include more and more everyday information in their lessons. This last implication in particular would have been more compelling if we had had the opportunity to replicate the study in an area with an overall higher level of socioeconomic development, but there would be a large chance that such an environment would show the common very high correlation between education and chronological age that precludes the study of their separate effects. In the present study we were only able to include one economically homogeneous group.
Despite this limitation, we would still argue that cognitive development is typically the net result of the increase of everyday skills grounded in daily participation and the increase of skills that are associated with school practice and instruction (Karmiloff-Smith, 1992): for children who do not go to school, everyday development remains relatively important, whereas for children who attend school, skills related to school become increasingly important and will start to dominate cognitive development.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
