Abstract
Cognitive and neural research over the past few decades has produced sophisticated models of the representations and algorithms underlying numerical reasoning in humans and other animals. These models make precise predictions for how humans and other animals should behave when faced with quantitative decisions, yet primarily have been tested only in laboratory tasks. We used data from wild baboons’ troop movements recently reported by Strandburg-Peshkin, Farine, Couzin, and Crofoot (2015) to compare a variety of models of quantitative decision making. We found that the decisions made by these naturally behaving wild animals rely specifically on numerical representations that have key homologies with the psychophysics of human number representations. These findings provide important new data on the types of problems human numerical cognition was designed to solve and constitute the first robust evidence of true numerical reasoning in wild animals.
Over the past half century, laboratory work on numerical processing has advanced considerably in psychology, neuroscience, and comparative cognition. Research has characterized both the shared and distinct representational systems that animals and humans use to judge numerosity (Dehaene, 2009; Feigenson, Dehaene, & Spelke, 2004; Gallistel, 1990; Nieder & Dehaene, 2009). However, little work has examined numerical processing outside the lab or what role, if any, number plays in animals’ natural decisions. The problems that animals solve in the wild are certain to help reveal the adaptive pressures on and functions of their cognitive systems, and thus provide insights into the origins of human numerical thought.
One domain in which numerical cognition could serve a natural function is social decision making. Prior research has shown that some species make decisions about group movements or actions collectively; that is, group members choose between two or more exclusive options by reaching some kind of consensus (e.g., Boinski & Campbell, 1995; Byrne, 2000b; Seeley & Buhrman, 1999; Stewart & Harcourt, 1994). This is termed collective decision making. In species that use collective decision making, cooperation enhances fitness (Conradt & Roper, 2003). Under a large range of circumstances, animals who are able to follow a consensus group vote rather than an experienced leader or despot will face fewer fitness costs than those who cannot. Thus, the ability to track and tally votes quantitatively is consequential for survival.
In an exciting recent article, Strandburg-Peshkin, Farine, Couzin, and Crofoot (2015) used GPS collars to measure the troop movements of wild baboons in Kenya. They showed that the baboons made decisions about the direction of troop movement democratically, on the basis of the relative quantity of animals heading in different directions. One interesting question left open is whether the animals’ underlying representations of the vote tallies were truly numerical. There are several alternatives, including size-based representations (i.e., total animal mass moving in one direction), that would give similar answers. Democratic decisions could even be made without any underlying quantitative representation: Randomly picking another individual to follow would yield choices that track the total proportion of votes. Evidence that the baboons’ decision processes were numerical in nature would be important for psychology, as researchers have sometimes assumed that number is a difficult or unnatural concept for nonhuman animals and preverbal human children (Cantlon & Brannon, 2007; Cantrell & Smith, 2013; Church & Broadbent, 1990; Davis & Memmott, 1982; Mix, Huttenlocher, & Levine, 2002; Newcombe, Levine, & Mix, 2015; Simon, 1997). Some have argued that size-based quantities, such as surface area or mass, are more readily used by animals in their decision making. In the study reported here, we formally tested the basis of animals’ quantitative decisions using the natural experiment provided by wild baboons’ troop movements.
Method
We used data from the natural behavior of wild baboons to test the natural statistics and psychological models of quantitative judgments in wild primates. Our data were made available by Strandburg-Peshkin et al. (2015).
Subjects
Twenty-six baboons (14 adults, 10 subadults, and 2 juveniles) from the Mpala Research Center in Kenya were fit with GPS collars for the study by Strandburg-Peshkin et al. (2015). One adult’s collar failed, so the total sample size was reduced to 25 individuals. Each animal was weighed during the collaring procedure.
Data collection and preprocessing (from Strandburg-Peshkin et al., 2015)
After the animals were fit with collars, they were returned to their social group in the reserve. The GPS collars were used to continuously follow the animals for 14 consecutive days, recording their coordinates at a rate of 1 Hz. In all analyses, we retained the data preprocessing performed by Strandburg-Peshkin et al. and reported in detail in their article. To summarize, the GPS track from all 25 animals was searched for instances of troop movement decisions. Decision events were extracted by Strandburg-Peshkin et al. (see their Supplemental Material) using dyads’ sequences of movements apart and together. These dyadic interactions were grouped into “events,” in which one animal, a “follower,” was potentially pulled in different directions by two simultaneous movements of groups of other individuals in the troop. These procedures resulted in 9,376 events that each involved two subgroups. Strandburg-Peshkin et al. found that at low angular disparities (< 90°), the animals followed the vector of average direction between the subgroups, so we restricted our analyses to events in which the angular disparity between subgroups was greater than 90° and the number of animals in the two subgroups differed. This resulted in a final sample of 1,773 events. For each event, we analyzed the follower’s subgroup choice as a function of subgroup number and mass.
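The event-selection rule described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline used by Strandburg-Peshkin et al. or in the present analysis: the `Event` record, its field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Event:
    # Hypothetical record for one decision event; field names are assumptions.
    angle_deg: float  # angular disparity between the two subgroups' headings
    n_a: int          # number of animals in subgroup A
    n_b: int          # number of animals in subgroup B

def keep_event(ev: Event) -> bool:
    """Retain events with disparity > 90 degrees and unequal subgroup sizes."""
    return ev.angle_deg > 90.0 and ev.n_a != ev.n_b

def filter_events(events):
    """Apply the retention rule to a list of events."""
    return [ev for ev in events if keep_event(ev)]

# Made-up events, for illustration only.
sample = [Event(45.0, 3, 1), Event(120.0, 2, 2), Event(150.0, 4, 1)]
kept = filter_events(sample)
```

In this toy input, the first event is dropped for low angular disparity and the second for equal subgroup sizes, so only the third is retained.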
Analysis
We analyzed the data according to the original study’s parsing of the continuous GPS track into discrete decision events in which a baboon determined which of two groups to follow (Strandburg-Peshkin et al., 2015). We used the measured numerosity of each subgroup and, as a proxy for spatial extent, each subgroup’s total weight. These variables were used as predictors of individual baboons’ decisions. We then fit a variety of psychophysical models, each motivated by existing literature, to the animals’ aggregate choice behavior. These models (see Table 1) were of six general classes: proportional voting models (a, b), a linear-scale model (c), compressed-scale models (d–f), models with scalar variability only for large numbers (g–j), a model with noisy 1-to-1 correspondence (k), and baseline choice models (l–n).
Table 1. Description of the Models Tested and Their Akaike’s Information Criterion (AIC) Values for Predicting the Probability of Choosing the Subgroup With More Members (Number AIC) and the Subgroup With Greater Total Mass (Mass AIC)
Note: In the formulas, x1 refers to the numerosity or total mass of the subgroup with the greater numerosity or mass, x2 refers to the numerosity or total mass of the subgroup with the smaller numerosity or mass, Φ denotes the standard normal cumulative distribution function, C denotes constants fit to the data, and δ is the delta function. Lower AIC scores indicate better fit. Similar results were found using the Bayesian information criterion (BIC) and log likelihood as measures of fit. AIC values for mass predictors were not computed for Models g through j, which depended on specific cardinalities.
In the proportional voting models, the animal was hypothesized to pick an individual to follow at random. This would result in choice probabilities that tracked the proportion of individuals in each subgroup. Model a used raw proportions, and Model b included a smoothing term. In the linear-scale model (c), the animals were hypothesized to represent each subgroup as a linear quantity with normal noise and constant variability. In this model, the probability of choosing a subgroup was based on comparison of absolute values, as might be expected in a uniformly noisy counting model. The compressed-scale models included standard models from psychophysics that assign greater acuity or representational resources to the smallest values on the scale relative to larger values. Model d used a linear scale with linearly increasing (Weber) noise, Model e used a logarithmic scale with Gaussian noise, and Model f used a power-law scale with Gaussian noise. Models g through j assumed subitizing, or two systems (Feigenson & Carey, 2005; Feigenson et al., 2004): fixed performance for subgroups numbering less than or equal to 4 (Models g and h) or 3 (Models i and j), and scalar variability when either (Models h and j) or both (Models g and i) numbers exceeded this bound. Model k implemented a noisy 1-to-1 correspondence in which the winning group was assumed to be computed through 1-1 pairing of votes, but each pairing failed with some probability. The animals responded to movement at chance (50-50) if a pairing failed and perfectly otherwise. The three baseline models included a model in which the greater cardinality was chosen with some fixed accuracy (Model l), a model with a softmax choice rule popular in decision theory (Model m), and a model based on a logistic function (Model n).
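To make the contrast between model classes concrete, the following Python sketch writes out choice probabilities for four of the models. The exact formulas belong to Table 1 and are not reproduced in this text, so the functional forms below are assumed textbook versions of each idea rather than the paper's own parameterization. Here x1 is the larger quantity, x2 the smaller, and c a fitted noise constant; the function names are hypothetical.

```python
import math

def phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_proportional(x1, x2):
    # Model a style (assumed form): follow one randomly chosen individual.
    return x1 / (x1 + x2)

def p_linear(x1, x2, c):
    # Model c style (assumed form): linear scale, constant Gaussian noise.
    return phi((x1 - x2) / c)

def p_weber(x1, x2, c):
    # Model d style (assumed form): linear scale, noise SD proportional
    # to the magnitude being represented.
    return phi((x1 - x2) / (c * math.sqrt(x1 ** 2 + x2 ** 2)))

def p_log(x1, x2, c):
    # Model e style (assumed form): logarithmic scale, constant Gaussian noise.
    return phi((math.log(x1) - math.log(x2)) / c)
```

Under these assumed forms, the compressed-scale models depend only on the ratio of the two quantities, so 2 versus 1 and 10 versus 5 are equally discriminable, whereas the linear-scale model treats 10 versus 5 as the easier comparison because the absolute difference is larger.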
Each model predicted the probability of choosing each of the subgroups as a function of the number of individuals or total mass (animal weight) in each subgroup. Each model included no, one, or two free parameters that quantified unknown aspects of the animals’ choices (e.g., the noise in representations). These parameters were fitted to the observed choices using R’s optimize function (Brent, 1973). Akaike’s information criterion (AIC) was used to compare models because significance testing cannot be used to compare nonnested models (Akaike, 1974). A lower AIC indicates stronger support for a model, and the magnitude of difference between AIC values indicates the amount of evidence. Note that AIC incorporates a principled penalty on the number of free parameters. When the AICs for two models differ by at least 10, this is typically considered strong evidence for the model with the lower AIC over the alternative (Burnham & Anderson, 2002).
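The fitting and comparison procedure can be illustrated with a short Python sketch. Several things here are assumptions: the counts are invented, the Weber-style formula is an assumed textbook form rather than the entry in Table 1, and a plain golden-section search stands in for R's optimize function (which combines golden-section search with parabolic interpolation).

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_weber(x1, x2, c):
    # Assumed Weber-noise form; x1 is the larger quantity.
    return phi((x1 - x2) / (c * math.sqrt(x1 ** 2 + x2 ** 2)))

def neg_log_lik(c, data):
    """Binomial negative log likelihood; rows are (x1, x2, k, n)."""
    total = 0.0
    for x1, x2, k, n in data:
        p = min(max(p_weber(x1, x2, c), 1e-12), 1.0 - 1e-12)
        total -= k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total

def golden_section(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on the interval [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while abs(b - a) > tol:
        left = b - ratio * (b - a)
        right = a + ratio * (b - a)
        if f(left) < f(right):
            b = right
        else:
            a = left
    return (a + b) / 2.0

def aic(neg_ll, n_params):
    """AIC = 2k - 2 log L, written in terms of the negative log likelihood."""
    return 2.0 * n_params + 2.0 * neg_ll

# Invented counts: (larger size, smaller size, times larger chosen, events).
data = [(2, 1, 80, 100), (3, 2, 68, 100), (4, 3, 62, 100), (6, 2, 88, 100)]

c_hat = golden_section(lambda c: neg_log_lik(c, data), 0.05, 5.0)
aic_weber = aic(neg_log_lik(c_hat, data), n_params=1)

# Zero-parameter proportional-voting comparison on the same invented counts.
nll_prop = -sum(k * math.log(x1 / (x1 + x2)) + (n - k) * math.log(x2 / (x1 + x2))
                for x1, x2, k, n in data)
aic_prop = aic(nll_prop, n_params=0)
```

The one-parameter model pays a penalty of 2 for its free constant, so it is preferred only if its log likelihood improves on the parameter-free model by more than 1.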
Results
Our analyses yielded five main findings. First, the natural correlation between number and mass in the environment was high. Second, despite this high natural correlation, number was a better predictor of choice than was mass or simple transformations of size (e.g., surface area or linear extent). Third, number was represented on a compressed scale. Fourth, the wild baboons’ natural numerical acuity was consistent with that observed in previous laboratory studies. And finally, the natural frequency with which the animals encountered each numerosity was strongly skewed toward low numbers. Taken together, the results provide strong evidence that numerical reasoning has a natural function in wild primates, and provide new insights into the distribution of quantitative variables in the natural environment.
An untested assumption in research on numerical and spatial cognition is that humans and animals encounter highly correlated quantitative stimuli in the natural environment (e.g., Newcombe et al., 2015). For example, a large number of individuals typically takes up more space and has greater mass than a small number of individuals. Although this is a reasonable assumption, the relation between quantitative dimensions has never been tested in the natural environment. We found that for the wild baboons in this study, the correlation between the number and total mass of animals in each subgroup was quite high (R² = .92; see Fig. 1 for a scatterplot showing total mass as a function of number across decision events).

Fig. 1. Scatterplot illustrating the natural correlation between number and mass in the subgroups of the wild baboons.
The implication of this finding is that animals receive substantial input from the environment indicating that there is a tight relation between number and mass. However, they also experience instances in which number and mass are anticorrelated. For example, we found substantial overlap between the distributions of mass values for groups of 5 and 10 individuals. If the animals in our analysis represented number as an independent dimension, then they should have been able to distinguish subgroups even when number and mass were anticorrelated. In fact, we found that they used number to identify the larger of two subgroups when number and mass were anticorrelated. In particular, when number and mass predicted different choices, the animals selected the numerically larger subgroup (the subgroup with smaller mass) 71% of the time, 95% confidence interval = [62%, 78%]. This percentage was significantly greater than 50%, p < .001 (most available comparisons involve low cardinalities; see Table S2 in the Supplemental Material available online). Thus, despite the high natural correlation between number and mass, the baboons represented number as an independent dimension in their decision making.
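A comparison of an observed proportion against chance, such as the one above, can be sketched as follows. The event counts behind the 71% figure are not given in this text, so the counts below (71 of 100) are purely hypothetical, and the choice of an exact binomial test with a Wilson score interval is an assumption rather than the procedure reported in the article.

```python
import math

def binom_two_sided_p(k, n, p0=0.5):
    """Exact two-sided p-value: total probability of outcomes no more
    likely than the observed count under the null proportion p0."""
    pmf = [math.comb(n, i) * p0 ** i * (1.0 - p0) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1.0 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(p for p in pmf if p <= cutoff))

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for 95%)."""
    p_hat = k / n
    denom = 1.0 + z ** 2 / n
    center = (p_hat + z ** 2 / (2.0 * n)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z ** 2 / (4.0 * n ** 2)) / denom
    return center - half, center + half

# Hypothetical counts, chosen only to give a 71% proportion.
k_larger, n_events = 71, 100
p_value = binom_two_sided_p(k_larger, n_events)
ci_low, ci_high = wilson_interval(k_larger, n_events)
```

The width of the interval depends on the number of events, so these hypothetical counts should not be expected to reproduce the reported interval.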
We tested specific hypotheses about the animals’ underlying representations by comparing the mathematical models (see Table S1 in the Supplemental Material for the specific cardinalities in the data set). The results in Table 1 show that the models that performed best used number instead of mass as the predictor variable. Strikingly, the best mass-based model performed worse than nearly all of the number-based models. The superiority of the number-based models provides strong evidence that the animals used numerical value as the basis of their behavior. The best number-based models (d–f, g, and i) all used a compressed scale (Gallistel, 1990) for number. Model c, which fit constant noise across the range of numbers, fared particularly poorly, demonstrating that the precision of representations is not constant across numbers. This superiority of models with compressed scales is consistent with data from neural recordings of nonhuman laboratory primates (Nieder & Dehaene, 2009; Nieder & Miller, 2003).
The model comparison also allowed us to test whether, alternatively, the animals’ behavior appeared to track the number in each subgroup because they chose another individual at random to follow. If the animals had followed that strategy, they could have appeared to be reasoning on the basis of cardinality comparison, but in fact would not have needed to represent any numerosities. Decisions based on picking a single individual to follow predict that choice probability should simply match the proportion of individuals in each group. We formalized this strategy simply in Model a and with smoothing in Model b. Neither of these models performed well. Thus, it is unlikely that the animals’ behavior was driven by randomly picking another individual to follow. Model k, which implemented noisy 1-to-1 correspondence, also performed poorly, which suggests that the algorithm we modeled was not the basis of these decisions. In addition, the models that free-fit accuracy for small numbers (g–j) did not perform as well as the best compressed-scale models (e and f). This suggests that a single representational system was used for low and high cardinalities. The baseline models (l–n) also performed substantially worse than the models with compressed scales. The animals likely did not make choices with noise that was independent of numerosity or use choice rules common in decision theory and statistics. Together, these comparisons strongly support the interpretation that the baboons used compressed scales commonly found in numerical cognition.
Next, we found that the precision of the baboons’ numerical discrimination, as quantified by their Weber fraction (the constant C in Model d), was 0.63 (bootstrapped 95% confidence interval = [0.59, 0.68]; see Fig. 2), a level similar to that reported in previous laboratory studies of baboons (Barnard et al., 2013; Cantlon, Piantadosi, Ferrigno, Hughes, & Barnard, 2015; Ferrigno, Hughes, & Cantlon, 2015). Thus, there is now evidence from both controlled laboratory studies and studies of behavior in the wild that monkeys naturally discriminate numerical values at about a 2:3 ratio, which is comparable to the performance of 3-year-old human children (Ferrigno et al., 2015; Halberda & Feigenson, 2008).

Fig. 2. The baboons’ accuracy in choosing the larger subgroup as a function of the numerical ratio of the minimum cardinality of the two groups to the maximum cardinality, min(x1, x2)/max(x1, x2), rounded to the nearest 10th. The plotted points show the means, and the error bars indicate bootstrapped 95% confidence intervals. The line shows the fit of a generalized linear model, with the 95% confidence interval shaded.
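The binning and bootstrapping summarized in the caption above can be sketched in Python. This is an illustrative reconstruction under stated assumptions: the trial tuples are invented, the function names are hypothetical, and a percentile bootstrap is assumed because the article does not specify the bootstrap variant.

```python
import random
from collections import defaultdict

def ratio_bin(x1, x2):
    """Ratio of smaller to larger cardinality, rounded to the nearest tenth."""
    return round(min(x1, x2) / max(x1, x2), 1)

def bootstrap_ci(outcomes, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of 0/1 outcomes."""
    rng = random.Random(seed)
    n = len(outcomes)
    means = sorted(sum(rng.choices(outcomes, k=n)) / n for _ in range(n_boot))
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

def accuracy_by_ratio(trials):
    """Map each ratio bin to (mean accuracy, bootstrap interval)."""
    bins = defaultdict(list)
    for x1, x2, chose_larger in trials:
        bins[ratio_bin(x1, x2)].append(chose_larger)
    return {r: (sum(v) / len(v), bootstrap_ci(v)) for r, v in sorted(bins.items())}

# Invented trials: (size of one subgroup, size of the other, 1 if larger chosen).
trials = ([(2, 1, 1)] * 40 + [(2, 1, 0)] * 10 +
          [(3, 2, 1)] * 35 + [(3, 2, 0)] * 15)
summary = accuracy_by_ratio(trials)
```

Resampling whole trials with replacement within each bin keeps the interval tied to the number of events observed at that ratio, which is why sparsely populated bins produce wider error bars.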
Finally, this data set provides a unique opportunity to understand the frequency distribution of number use among wild animals. This need probability of each number is important because efficient systems of representing and processing numbers would be tuned to how often different numerosities must be used. It has been argued that the frequencies of number words in human language reflect number-need probabilities in human cognition, and it has been found that the probability that people must represent a number n scales as 1/n² (Dehaene & Mehler, 1992; Dorogovtsev, Mendes, & Oliveira, 2006; Jansen & Pollmann, 2001; Piantadosi, 2016). This distribution can explain the use of compressed scales (Piantadosi, 2016), and the general trend matches the overrepresentation of lower numbers that has been found in the primate brain (Nieder & Merten, 2007). Figure 3 shows how frequently each subgroup cardinality was encountered, according to Strandburg-Peshkin et al.’s (2015) data. The distribution tracks the same distribution that has been found for humans, showing a strong skew toward the lowest numbers and a falloff that scales as 1/n^1.95 (fit line).

Fig. 3. Natural frequency with which the baboons encountered subgroups of each numerosity in the wild, with a power-law fit (α = −1.95) to the data.
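One common way to estimate a power-law exponent of this kind is a least-squares line through the log-log frequencies, sketched below in Python. The fitting method used for the figure is not stated in this text, so log-log regression is an assumption, and the frequencies are synthetic values generated to follow 1/n² exactly.

```python
import math

def fit_power_law(ns, freqs):
    """Least-squares fit of log(freq) = intercept + slope * log(n).
    Returns (slope, intercept); the slope is the power-law exponent."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic frequencies that follow 1/n^2 exactly (not the baboon data).
ns = list(range(1, 11))
freqs = [1000.0 / n ** 2 for n in ns]
alpha, log_scale = fit_power_law(ns, freqs)
```

With real count data, log-log regression gives equal weight to sparse high-numerosity bins, so a maximum-likelihood fit may be preferable when those bins contain few events.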
Discussion
Although many studies have emphasized the role of quantitative representation in animal behavior, no studies have formalized the representation that accounts for animals’ quantitative judgments in the wild. Our study compared several formal models of representation that the baboons in the study by Strandburg-Peshkin et al. (2015) could have used during their natural quantitative decisions about troop movement. We found that the models that best accounted for the animals’ decisions used approximate-number comparisons with a compressed scale. These models were superior to models that tested subitizing, or two-systems, numerical representations (e.g., Feigenson & Carey, 2005; Feigenson et al., 2004) and models that represented heuristic decision rules. The approximate-number model also was superior to alternative models that relied on approximations of total mass instead of number. Few prior studies distinguished number from area or mass in animals’ natural quantitative behavior (see Gallistel, 1989; Hunt, Low, & Burns, 2008). Some prior field studies explored the role of quantitative cognition in brood parasitism and intergroup encounters, but left ambiguous whether subjects represented number or alternative dimensions (McComb, Packer, & Pusey, 1994; White, Ho, & Freed-Brown, 2009; Wilson, Hauser, & Wrangham, 2001). In most studies, number has been confounded with an alternative dimension, such as mass (for visual objects, such as food, offspring, or conspecifics) and duration (for auditory stimuli, such as vocalizations). Our study is the first detailed and formal analysis of the role of number representation in natural animal behavior, that is, behavior that has not been specifically trained or elicited in a testing environment.
Researchers have often questioned whether number is a natural dimension to represent (Cantrell & Smith, 2013; Newcombe et al., 2015). Some research has shown, for example, that pigeons and rats attend to spatial and temporal cues instead of numerosity in laboratory experiments (Davis & Memmott, 1982). It has been argued that spatiotemporal properties such as size and mass may be more natural concepts than numerosity in humans and animals (Cantrell & Smith, 2013; Church & Broadbent, 1990; Davis & Memmott, 1982; Mix et al., 2002; Simon, 1997). The data from the Mpala baboons’ troop movements suggest otherwise—that number is fundamental to the cognition of wild primates.
We found not only that numerical perception is a natural ability in wild baboons, but also that the monkeys’ sensitivity to numerical value is comparable to humans’. Similarities in the functioning and sensitivity of numerical cognition between humans and nonhuman primates can point to shared cognitive mechanisms. Substantial prior evidence shows that humans and nonhuman primates engage similar cognitive and neural mechanisms during nonverbal numerical estimation (Cantlon & Brannon, 2007; Nieder & Dehaene, 2009; Nieder & Miller, 2003). We found that the natural numerical sensitivity of wild baboons (Weber fraction = 0.63) is comparable to that of 3-year-old human children (Weber fraction = 0.53; Halberda & Feigenson, 2008). The implication is that wild baboons’ numerical cognition is similar to the raw, preverbal numerical cognition of humans—that is, human numerical cognition that emerges prior to the acquisition of number words. Our results provide novel evidence of evolutionary continuity in the primitive numerical reasoning abilities of humans and wild nonhuman primates.
Another novel piece of evidence for continuity is our finding that the frequency distribution of the numerical set sizes encountered by the baboons tracks a power law, with a strong skew toward the lowest numbers (i.e., < 6). A similar distribution has been reported for human language (Dehaene & Mehler, 1992; Dorogovtsev et al., 2006; Jansen & Pollmann, 2001; Piantadosi, 2016). Our findings are the first to suggest that humans and nonhuman animals naturally experience similar environmental pressures for representing small numerical values. This is important because environmental pressure to represent small numerosities is a possible causal factor in the evolution of cognitive systems for numerical representation (Piantadosi, 2016).
Numerical representation in nonhuman animals has been shown to relate to humans’ nonverbal numerical estimation abilities (e.g., Beran, 2007; Brannon & Terrace, 1998; Cantlon & Brannon, 2006). Monkeys and adult humans tested in laboratory experiments show similar accuracies and response times when asked to rapidly estimate the numerically larger of two visual arrays (Cantlon & Brannon, 2006). Those prior data provide important evidence of a common capacity for learning about numerical relations in humans and nonhuman primates. Research on the natural functions of numerical reasoning adds a new dimension to understanding those prior findings because it provides information about what could have caused numerical cognition to emerge in the first place during evolution.
Our results provide novel evidence that one of the natural functions of numerical reasoning in primates is to monitor social behavior during collective movements. Collective decision making is widespread among primate species, which suggests that the behavior existed in a distant common ancestor of human and nonhuman primates (Byrne, 2000a; Conradt & Roper, 2003). Empirical examples of collective decision making in wild primates are diverse, and such behavior has been observed in apes (Stewart & Harcourt, 1994), Old World monkeys (Byrne, 2000b), and New World monkeys (Boinski & Campbell, 1995). Our analyses show that numerical representation is the mechanism by which baboons cognitively track and tally votes during social decision making. Democratic decision making is thus one utility of numerical representation in the primate lineage. This finding moves researchers closer to understanding the types of problems human numerical cognition was designed to solve.
Footnotes
Acknowledgements
This research was supported by grants from the National Science Foundation (DRL1459625, to J. F. Cantlon), the National Institutes of Health (R01 HD085996, to J. F. Cantlon and S. T. Piantadosi), and the James S. McDonnell Foundation Understanding Human Cognition Program.
Action Editor
Steven W. Gangestad served as action editor for this article.
Declaration of Conflicting Interests
The authors declared that they had no conflicts of interest with respect to their authorship or the publication of this article.
Open Practices
The analyses reported here were based on independent data collected by Strandburg-Peshkin, Farine, Couzin, and Crofoot (2015). There are no materials to share. The complete Open Practices Disclosure for this article can be found at
.
References
