Abstract
Languages carve up conceptual space in varying ways—for example, English uses the verb cut both for cutting with a knife and for cutting with scissors, but other languages use distinct verbs for these events. We asked whether, despite this variability, there are universal constraints on how languages categorize events involving tools (e.g., knife-cutting). We analyzed descriptions of tool events from two groups: (a) 43 hearing adult speakers of English, Spanish, and Chinese and (b) 10 deaf child homesigners ages 3 to 11 (each of whom has created a gestural language without input from a conventional language model) in five different countries (Guatemala, Nicaragua, United States, Taiwan, Turkey). We found alignment across these two groups—events that elicited tool-prominent language among the spoken-language users also elicited tool-prominent language among the homesigners. These results suggest ways of conceptualizing tool events that are so prominent as to constitute a universal constraint on how events are categorized in language.
Language imposes category structure on the dynamic world of events around us. Any given event can be categorized in multiple ways (DeLancey, 1991; Gleitman, 1990). For example, the activity shown in Figure 1a could be described as “painting” but also as “creating art,” “holding a paintbrush,” or “having fun.” Verbs in different languages often impose different categories. For example, English speakers commonly describe the events in Figures 1a and 1b using two different verbs (paint, draw). In contrast, Mandarin Chinese speakers use the same verb for both events (huà, “create a pictorial representation”). In this study, we asked whether, despite this variability, there are cross-linguistically universal ways of using language to categorize events. If so, what gives rise to these common patterns?

Fig. 1. Tool event pictures from Richard Scarry’s Best Word Book Ever (Scarry, 1980). English glosses for these pictures are (a) a pig painting, (b) a rabbit drawing, and (c) a rabbit slicing oranges.
We drew on a unique source of data to address these questions—event descriptions from 10 child homesigners spanning five countries (Guatemala, Nicaragua, United States, Taiwan, Turkey). Homesigners are deaf individuals who do not have access to a language model. Their hearing losses prevent them from acquiring the spoken language in their environment, and they have not been exposed to a sign language. Despite their lack of a usable model for language, these individuals communicate using self-created gestures that function like natural human language and have many of the properties of language (Coppola & Newport, 2005; Goldin-Meadow, 2003, 2020). Homesign therefore provides a window onto categories that people use to communicate that are not shaped by conventional language (Goldin-Meadow & Mylander, 1998; Morford & Kegl, 2000). If homesigners from distinct cultures, and speakers of different spoken languages, describe an event in the same way, we have evidence for a universal constraint on how the event is categorized in human communication.
Cross-linguistic differences in event descriptions, as in the contrast between how Figures 1a and 1b are described in English versus Mandarin Chinese, are the norm rather than the exception (Kemmerer, 2019; Malt & Majid, 2013). One well-studied example of this phenomenon involves events of material destruction (“cutting and breaking” events; Majid et al., 2008). There is diversity, as well as commonality, in how languages categorize events within this domain. A common dimension along which languages tend to distinguish events is “predictability of the locus of separation”—whether the change undergone by the object can be predicted from how the object was acted on (Majid et al., 2008, p. 242). For example, if someone slices carrots with a knife, where the carrot will come apart into pieces is highly predictable. To the extent that common dimensions have been observed in cross-linguistic categorization, these commonalities have been attributed to many factors, including shared conceptual/perceptual structure, the need for efficient communication, and/or common sociocultural priorities (see Goddard & Wierzbicka, 2014; Kemp et al., 2018; Malt & Majid, 2013).
In this study, we investigated cross-linguistic category structure in events involving tools, where a tool is an object acted on by an agent in order to affect another object (e.g., in Fig. 1a, the brush is a tool). We used verbs as indicators of how a language user conceptualizes an event (Fisher et al., 1994), specifically analyzing whether a verb highlights the role of a tool. The action in Figure 1c, for example, can be described as slicing oranges or as making orange juice. Slice is a more tool-oriented verb than make, exemplifying the broader phenomenon that verbs such as chop, poke, and stir imply the presence of a tool more than verbs such as separate, tease, and serve do (Koenig et al., 2003, 2008; Rissman et al., 2015, 2019). If people using different languages all describe an event with a tool-prominent verb, the assumption is that they all understand the tool to be a conceptually prominent part of the event.
Statement of Relevance
Across the globe, thousands of distinct languages are spoken by people from a diverse array of cultures. We investigated whether there are, nonetheless, fundamental ways of thinking about events that are shared by all people. We asked two groups of people with different language and cultural backgrounds to describe everyday actions (such as cutting bread with a knife): (a) adult speakers of English, Spanish, and Chinese and (b) deaf children, from five countries, who have learned neither a signed nor a spoken language. We found broad similarities across these groups, indicating ways of thinking about the world—and describing the world through language—that are common to all people. As cultures and languages of the world become more interconnected, these results help us understand which aspects of our thinking and communication are a product of being human rather than a product of our particular cultures and experiences.
Tool events are an important domain of study because tool use is universal across cultures (Brown, 2004). It is also shared with a variety of primate and nonprimate species, indicating a long evolutionary history (Haidle, 2014; Seed & Byrne, 2010). Although humans’ ability to manipulate and reason about tools is robust even in childhood (Neldner et al., 2020), it is not inevitable that tool knowledge strongly constrains how event components are packaged into language. Linguistically encoding a tool appears to have lower priority relative to other event components—for example, English- and Turkish-learning children frequently fail to mention tools when describing tool events (Grigoroglou & Papafragou, 2019a, 2019b; Ünal et al., 2021), and tools are visually identified and remembered less robustly than the goals of events (Ünal et al., 2021; Wilson et al., 2011). Tools may be peripheral enough in event representation (relative to agents, for example) that there are only weak constraints on whether and how they are encoded in language (see Rissman & Majid, 2019).
We explored this possibility by comparing how child homesigners from different cultures, and users of different spoken languages, describe tool events. Homesigners describe events in structured ways—they create word orders that distinguish agents from patients (Goldin-Meadow & Mylander, 1998), and they modulate the shape of their hands to distinguish agentive from nonagentive events (Coppola & Brentari, 2014; Rissman et al., 2020; Rissman & Goldin-Meadow, 2017). They also impose abstract structure on events—for example, two people kissing is construed as an inherently symmetrical event, unlike two people punching each other, which is construed as two reciprocal events (Gleitman et al., 2019). We observed homesigners’ descriptions of tool events and measured how often they included information about the tool; that is, how often the shape of the hand in a description represented the tool itself (“instrumental handshape”) as opposed to the hand holding the tool (“handling handshape”; see Fig. 2). When hearing people from the United States, Mexico, and The Netherlands describe tool events using only their hands and not their voice, they favor handling over instrumental handshapes (Ortega & Özyürek, 2020; Padden et al., 2015)—this preference reflects the direct iconic mapping found for handling handshape between the shape of the hand and how a person actually holds a tool. By contrast, instrumental handshapes highlight properties of the tool itself, such as its shape or size.

Fig. 2. Photos of homesigner Guatemala 1 describing Figure 1a using (a) a gesture with a handling handshape (how a hand would hold a paintbrush) and (b) a gesture with an instrumental handshape (the shape of the paintbrush). Schematic images are shown beneath each still image to highlight the shape of the child’s hand.
We compared child homesigners’ propensity to use instrumental handshapes with adult speakers’ propensity to use tool-prominent verbs (such as slice) when describing tool events. Adult speakers and homesigners were different in many ways: They used a spoken versus signed language, they learned versus created their language, and they were adults versus children. The individuals in our study also live in a range of cultural contexts. If these populations nonetheless have common ways of describing tool events, we would have strong evidence of universal constraints on how concepts are encoded in language in this domain. If we found evidence for such universals, we could then ask about the conceptual constraints that give rise to the universals. The spoken languages we selected for this study were English, Spanish, and Mandarin Chinese, building on prior work on tool-prominent verbs in these three languages (Rissman et al., 2015, 2019).
Open Practices Statement
Coded data, analysis scripts, stimuli, and supplementary materials are publicly available at https://osf.io/jmv94. Video recordings have not been made publicly available because participants did not consent to the release of identifying data. None of the studies reported in this article were preregistered. All the homesign data that were coded and analyzed for this study have been reported.
Method
Participants
Six Guatemalan homesigners (ranging in age from 6 years, 11 months, to 11 years, 0 months) participated in the study. Four of the participants (1, 2, 3, 4) have contact with at least one other deaf signer. For example, Participant 1 lives as part of a family of four, including her mother, who is also deaf. Participant 2 has a younger brother who is deaf, as is her grandfather. Two of the participants (3 and 4) attend a local school for special education together, whereas three participants (1, 5, 6) attend the schools that are closest to their homes with other hearing children. At the time of this study, Participant 2 did not attend her local school but began attending the same school for special education after these data were collected. Although four of the Guatemalan participants have contact with another deaf signer, none of the participants have learned the national sign language used in Guatemala City, Lensegua, because it is not commonly used where they live and is not taught in their local schools. All participants are active social members of their families and have siblings, cousins, and neighbors with whom they interact and play. The children also contribute around the home, helping out with chores and caregiving of younger relatives (see Horton, 2020, for more detailed description of participants from Guatemala). Hearing status was determined through parental report, as audiological assessments are rare in Guatemala.
To put the data from the Guatemalan homesigners in a cross-cultural context, we also include data from four homesigners from four other cultures: the United States, Taiwan, Nicaragua, and Turkey. Aspects of these four homesigners’ systems have been previously reported (Flaherty et al., 2021; Goldin-Meadow et al., 1994; Goldin-Meadow & Mylander, 1998; Özyürek et al., 2015), but not the data described here. The 10 children in this study were selected because they produced a sufficiently large number of descriptions of the target stimuli (see Materials and Procedure). The ages and nationalities of the 10 homesigners, as well as the number of descriptions they produced, are listed in Table 1. This research was reviewed and approved by our universities’ institutional review boards, and informed consent was obtained for all participants.
Table 1. Participant Information
Note: “Max” indicates the total number of different events/objects that were described across all children.
Materials and procedure
Participants described illustrations from Richard Scarry’s Best Word Book Ever (Scarry, 1980). This storybook contains detailed, full-page scenes of cartoon animals engaged in various everyday actions such as cooking in a kitchen, studying at school, or playing in a playground. Participants were free to leaf through the book and describe whatever illustrations they chose. Participants described the book to a researcher, family member, or close friend.
We analyzed descriptions of two types of pictures: tool events (pictures in which an individual is acting on an object with the goal of affecting a second object) and tool objects (pictures of objects presented in isolation that have a conventional function). Examples of tool events are shown in Figures 1 and 3. Example tool objects were pictures of a fork, a knife, a pencil, a paintbrush, and a saw. Sixty-five event pictures and 88 object pictures were described by at least one homesigner in our sample. All event and object pictures are listed in Tables S2 and S3 in the supplementary materials.

Fig. 3. Tool-prominence scores in each spoken-language pair (Spanish–English, Chinese–English, Chinese–Spanish) for each event picture. Each dot in the three graphs represents a single event. Examples of nine of the event pictures are shown on the right; their data are marked with capital letters on the graphs.
Homesign coding
Videos of homesigners describing the Scarry word book were coded by the first and second authors and by a research assistant trained to reliably code homesign. We annotated homesigners’ descriptions of event or object pictures using the software ELAN (Crasborn & Sloetjes, 2008). For each sign, we coded whether the shape of the hand depicted the shape of the tool itself (instrumental handshape) or the shape of a hand holding the tool (handling handshape). Handshapes that were ambiguous or difficult to discern from the angle of the video were coded as ambiguous and not included in the analyses (6.4% of descriptions were excluded for this reason). For example, in one excluded sign, the Nicaraguan homesigner described a bear washing its face with a hand towel (image G in Fig. 3). The homesigner raises two flattened palms near his face and produces a rotating motion indicating scrubbing—this handshape is ambiguous because it could represent how someone holds a towel (handling handshape) or the towel itself (instrumental handshape).
Spoken-language norms
We collected norms indicating the tool prominence, as encoded in English, Spanish, and Mandarin Chinese, of the events that the homesigners described. If users of these three diverse languages chose to describe an event such as the one in Figure 1c using a tool-prominent verb (e.g., slice), we concluded that this event was conceptualized by these speakers as tool oriented. We first collated the verbs used to describe each event (see the following section) and then determined how tool-prominent each of these verbs was (see Verb-Meaning Judgments below). We collected data from speakers of English, Spanish, and Mandarin Chinese because these languages represent two different language families (Indo-European and Sino-Tibetan) and previous studies of the instrumental semantics of verbs show that the three languages align in some ways but diverge in others (Koenig et al., 2008; Rissman et al., 2015, 2019). To distinguish the tool prominence of an event from the function associated with the tool when it is not being used, we also collected data on verbs in each language used in naming the typical function of the tool objects described by the homesigners.
Event and object descriptions
Native speakers of English (n = 16), Spanish (n = 15), and Mandarin Chinese (n = 12) described the 65 events and 88 object pictures that at least one homesigner in our sample described. Previous research from our lab suggests that a sample size of 12 is sufficient to approximate the range and distribution of verbs used in description tasks of this type. English and Spanish speakers (located in the United States and South America, respectively) were recruited on Amazon Mechanical Turk; Chinese speakers were recruited through The University of Chicago psychology subject pool. English speakers described the events given the prompt, “What is happening in this picture?” and described the objects given the prompt, “What action is commonly performed with this tool?” For the Spanish and Chinese description tasks, these prompts were translated by native speakers of these respective languages. The events and the objects were presented in separate blocks, with the event block preceding the object block.
We coded the event descriptions according to the main verb phrase that characterized the action in the event. For example, for the description “the cat is writing,” the main verb phrase was “write,” and for “the cat is taking notes as reporter,” the main verb phrase was “take notes.” For the object prompt, if participants provided multiple verbs (e.g., scooping or shoveling to describe the typical function of a scoop), we included the first verb in the description. Across both event and object pictures, we elicited 242 English verbs, 244 Spanish verbs, and 183 Chinese verbs.
Verb-meaning judgments
We collected judgments about verb meanings from 33 English speakers, 18 Spanish speakers, and 20 Chinese speakers. English and Spanish speakers (located in the United States and South America, respectively) were recruited on Amazon Mechanical Turk; Chinese speakers were recruited through the University of Wisconsin–Madison psychology subject pool. Participants were instructed to judge whether a verb “highlights” a tool. English speakers, for example, were told that verbs such as shred and prod highlight a tool that is used for shredding or prodding, respectively. Participants were also told that some verbs (such as rotate or defrost, for English-speaking participants) are compatible with tools but do not highlight them. Participants completed practice trials (with feedback) in which they decided whether or not a verb highlights a tool. In total, participants encountered 18 seed verbs over the course of the instruction and practice trials, nine of which highlighted tools and nine of which did not. These verbs were selected on the basis of prior linguistic classification by Koenig et al. (2008) and Rissman et al. (2015). The 18 Spanish and 18 Chinese seed verbs were selected on the basis of work by Rissman et al. (2019) and through native-speaker intuitions. The Spanish and Chinese instructions were translated from English by native-speaker consultants. The seed verbs for each language are listed in Table S1 in the supplementary materials.
After completing the training, participants judged whether or not the verbs elicited in the description task highlighted a tool. In each language, we tested a subset of the total number of verbs that were elicited across all descriptions: 110 English verbs produced more than three times, 108 Spanish verbs produced more than three times, and 99 Chinese verbs produced more than twice.¹ As an attention/comprehension check, we also tested the 18 seed verbs that participants encountered during the training phase. Participants needed to answer correctly on at least 13 out of 18 seed-verb trials for their data to be included (p = .03 on a binomial test).² Participants viewed the verbs in a random order.
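The chance level implied by a pass threshold of this kind can be checked with an exact binomial tail. The sketch below assumes a guessing probability of .5 per trial, which the text does not state explicitly, and returns both the upper-tail probability and its doubled, two-sided counterpart, because the text does not say which convention the reported p value follows. The function name `binomial_upper_tail` is illustrative.

```python
from math import comb

def binomial_upper_tail(k_min: int, n: int, p: float = 0.5) -> float:
    """Probability of at least k_min successes in n independent trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

# Threshold and number of seed-verb trials as given in the text.
n_seed_trials = 18
pass_threshold = 13

# ASSUMPTION: chance responding means p = .5 on each yes/no trial.
one_sided = binomial_upper_tail(pass_threshold, n_seed_trials)
# Doubling is a valid two-sided p value here because the p = .5 null is symmetric.
two_sided = min(1.0, 2 * one_sided)

print(f"one-sided: {one_sided:.3f}, two-sided: {two_sided:.3f}")
```

The same function can be called with other thresholds to see how strict a given inclusion criterion is relative to guessing.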
The verbs were presented in sentences with example tools such as “Jan CAUGHT something [with a net].” Example tools were given to help clarify the intended sense of the verb (e.g., physical catching rather than catching a cold). The verb was emphasized through capital letters in English and Spanish and through quotation marks in Chinese. The tool was introduced through the preposition with in English, the preposition con (“with”) in Spanish, and the periphrastic verb yòng (“use”) in Chinese.
We calculated the prominence of the tool in each event in each language by weighting the judgment mean for each verb by how often that verb was produced for that event. For example, for the event in Figure 1c, the most common verbs produced by English speakers were, in descending order, cut, slice, juice, and make. These verbs were judged to highlight a tool on 88%, 97%, 52%, and 3% of trials, respectively. Weighting the judgment means for each verb by how often that verb was produced leads this event to have a tool-prominence score of .70 in English. The prominence score for this event was .70 in Spanish and .80 in Chinese. For verbs that were produced infrequently in the description study, we did not collect judgment data and they did not figure into the tool-prominence calculation for each event or any other analyses. We calculated prominence scores for tool objects in a parallel manner.
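The weighting described above amounts to a frequency-weighted mean of the verb judgment means. The following is a minimal sketch of that calculation, not the authors' analysis script. The judgment means are the ones reported for the event in Figure 1c; the production counts are hypothetical, because the text gives only the rank order of the verbs, and the verb "squeeze" is an invented example of an infrequent verb with no judgment data.

```python
def tool_prominence(verb_counts: dict[str, int], judgment_means: dict[str, float]) -> float:
    """Frequency-weighted mean of verb judgment means for one event.

    Verbs without judgment data are skipped, mirroring the exclusion of
    infrequently produced verbs described in the text.
    """
    judged = {verb: n for verb, n in verb_counts.items() if verb in judgment_means}
    total = sum(judged.values())
    if total == 0:
        raise ValueError("no judged verbs for this event")
    return sum(n * judgment_means[verb] for verb, n in judged.items()) / total

# Proportion of trials on which each verb was judged to highlight a tool
# (values reported in the text for English).
judgment_means = {"cut": 0.88, "slice": 0.97, "juice": 0.52, "make": 0.03}

# HYPOTHETICAL production counts; only their rank order comes from the text.
verb_counts = {"cut": 5, "slice": 3, "juice": 2, "make": 2, "squeeze": 1}

score = tool_prominence(verb_counts, judgment_means)
print(round(score, 2))
```

With these invented counts the score comes out at about .70; other counts with the same rank order would give somewhat different values.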
Handshape affordance ratings
As described in the Homesign Coding section, we coded how often homesigners described tool events using instrumental handshapes. The human hand affords a wide range of physical configurations, and some objects are better represented through the hands than others. For example, the shape of a pencil is fairly well represented by an extended index finger, whereas the shape of a ruler is less well represented by this handshape. Across the 65 events in the data set, the shape of the tool could be faithfully represented in the shape of the hands to a greater or lesser degree. To quantify how much the shape of the tool in each event afforded an instrumental handshape, we collected ratings from 24 English-speaking adults on Amazon Mechanical Turk. In each trial of this rating task, participants were shown one of the event pictures alongside a photograph of a hand in a particular shape (e.g., a fist with the index finger extended; Fig. S1 in the supplementary materials shows an example trial). Participants were asked to rate, on a 5-point scale, “How well does the shape of the hand match the shape of the [tool label]?” Each event was paired with a single handshape photograph, selected on the basis of which instrumental handshape the homesigners produced most often for that event. For example, for the event in Figure 1c, the most common handshape was a flat hand without gaps between the fingers and thumb (a “B” handshape). For events in which homesigners never produced instrumental handshapes, we consulted established sign-language dictionaries to identify a handshape that could reasonably be used to represent the shape of the tool. For example, the homesigners described a picture of a pig rolling out dough with a rolling pin using only handling handshapes. In German Sign Language, the sign for a rolling pin includes a fist handshape (an “S” handshape), so we paired the picture of the pig using a rolling pin with a photograph of the “S” handshape.³ In total, each of the 65 event pictures was paired with one of 13 different handshape photographs.
Results
Tool events
Events that were likely to be described with tool-prominent verbs in one language were also likely to be described with tool-prominent verbs in another language. We found significant positive correlations for each spoken-language pair—English vs. Spanish: r(62) = .80, p < .001 (Fig. 3a); English vs. Chinese: r(63) = .70, p < .001 (Fig. 3b); Spanish vs. Chinese: r(62) = .65, p < .001 (Fig. 3c). Each dot in Figures 3a to 3c represents data for a single event; examples of nine of the events are displayed on the right, and the data for those events are marked in the graphs with capital letters.
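The degrees of freedom in r(df) equal the number of paired events minus 2, so r(62) corresponds to 64 events with a score in both languages and r(63) to all 65. The sketch below shows one way such a correlation could be computed when some events lack a score in one language. The pairwise-deletion rule and the example values are assumptions for illustration, not details taken from the article.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation over events scored in both languages.

    Pairs with a missing score (None) are dropped, so df = n_pairs - 2.
    Returns (r, df).
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least three complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y), n - 2

# HYPOTHETICAL tool-prominence scores for five events; None marks an event
# with no score in that language.
english = [0.70, 0.10, 0.95, 0.40, 0.55]
spanish = [0.70, 0.20, 0.90, None, 0.50]

r, df = pearson_r(english, spanish)
print(f"r({df}) = {r:.2f}")
```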
We next asked whether the tool-prominence scores for spoken language predicted instrumental handshapes in homesigners. Across all 10 homesigners, we observed 474 event descriptions: 302 of these signs had handling handshapes, 143 had instrumental handshapes, and 29 had ambiguous or nondiscernible handshapes. Signs with ambiguous or nondiscernible handshapes were excluded from analysis. The ratios between handling and instrumental handshapes differed across the 65 event pictures in our sample—handling handshapes were always used for 32 pictures; in contrast, instrumental handshapes were always used for only three pictures. Figure 4 shows the average rate of instrumental handshapes (i.e., instrumental/(instrumental + handling)) for each event picture across all homesigners. These differences suggest that homesigners represent the tool as more prominent for some events than for others.
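The rate defined above, instrumental/(instrumental + handling), can be sketched as a small function that ignores ambiguous codes. The function name and the code labels are illustrative. The pooled totals are the ones reported in the text; a per-picture rate would apply the same function to the codes for a single picture.

```python
from collections import Counter

def instrumental_rate(codes):
    """instrumental / (instrumental + handling), ignoring ambiguous codes.

    Returns None when no codable description remains.
    """
    counts = Counter(codes)
    codable = counts["instrumental"] + counts["handling"]
    return counts["instrumental"] / codable if codable else None

# Totals reported in the text for the 474 event descriptions.
all_codes = ["handling"] * 302 + ["instrumental"] * 143 + ["ambiguous"] * 29

overall_rate = instrumental_rate(all_codes)
print(f"{overall_rate:.3f}")
```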

Fig. 4. Average rate of producing instrumental handshapes for each tool event picture. Each tick mark on the y-axis corresponds to an individual picture and is labeled with an English gloss of the picture. Note that each gloss represents only one possible construal of each picture. Letter labels for individual tick marks correspond to the pictures displayed in Figure 3. Darker bars indicate a larger number of descriptions of a picture across all homesigners than lighter bars.
To situate tool prominence in homesign in relation to tool prominence in our three spoken languages, we calculated the mean spoken-language tool-prominence score for each event picture (equally weighting English, Spanish, and Chinese). The average tool-prominence scores across the three spoken languages were significantly positively correlated with average rates of producing instrumental handshapes for the homesigners, r(62) = .42, p < .001 (see Fig. 5). We modeled homesigners’ production of instrumental versus handling handshapes in event-picture descriptions using mixed-effects logistic regression and the lme4 package for R (Bates et al., 2014; R Core Team, 2022). We included random intercepts for homesigner and picture. We also included by-subject random slopes for each independent variable, except in cases where adding random slopes led to failed model convergence. All continuous predictors were standardized. Before we included the spoken-language tool-prominence scores in our model, we first tested two variables that did not reflect event conceptualization but may have predicted use of instrumental handshapes. The first variable was handshape affordance ratings (i.e., how well the shape of the human hand matched the shape of the tool in the picture). We did not find that this measure predicted instrumental handshape, b = −0.52, 95% confidence interval (CI) = [−1.37, 0.32], z = −1.21, p > .1. The relationship between handshape affordance ratings and homesigners’ use of instrumental handshapes is shown in Figure S2 in the supplementary materials. We next tested whether the relative size of the tool in the picture predicted instrumental handshape use; it may be that tools that are more visually prominent are more likely to elicit instrumental handshapes in homesigners. We approximated this value for each event picture by calculating the ratio between the area of the tool and the area of the agent (e.g., the rabbit) using the website SketchAndCalc (https://www.sketchandcalc.com/). Again, we did not find that this measure predicted use of instrumental handshapes, b = −0.41, 95% CI = [−1.23, 0.41], z = −0.96, p > .1, indicating that relative size of the tool did not influence homesigners’ handshape choices.⁴
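Two of the preprocessing steps mentioned above, the tool-to-agent area ratio and the standardization of continuous predictors, can be sketched as follows. The sketch assumes standardization with the sample standard deviation, which is what R's `scale()` does by default; the article does not say which convention was used. The area values are hypothetical.

```python
from statistics import mean, stdev

def standardize(values):
    """Center on the mean and scale by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    if s == 0:
        raise ValueError("cannot standardize a constant predictor")
    return [(v - m) / s for v in values]

# HYPOTHETICAL tool and agent areas (arbitrary units) for four pictures.
tool_area = [12.0, 30.0, 8.0, 50.0]
agent_area = [120.0, 100.0, 160.0, 125.0]

# Relative tool size: area of the tool divided by area of the agent.
relative_size = [t / a for t, a in zip(tool_area, agent_area)]

# Standardized predictor with mean 0 and sample SD 1.
z_relative_size = standardize(relative_size)
print([round(z, 2) for z in z_relative_size])
```

Standardizing puts coefficients for predictors on different scales, such as a 5-point rating and an area ratio, into comparable units of one standard deviation.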

Fig. 5. Tool-prominence scores averaged across English, Spanish, and Mandarin Chinese versus rates of instrumental handshape in homesign for each event picture. Point size indicates the total number of descriptions of that picture across all homesigners. Letter labels for individual points correspond to the pictures displayed in Figure 3.
Mean spoken-language tool prominence averaged over all three languages significantly predicted use of instrumental handshape in homesign, b = 1.34, 95% CI = [0.54, 2.14], z = 3.29, p = .001. The tool-prominence scores for the individual spoken languages also significantly predicted instrumental handshape (English: z = 2.74, Spanish: z = 2.94, Chinese: z = 3.05), as might be expected because all three spoken languages were correlated. These findings demonstrate shared patterns in how homesigners and speakers of three spoken languages convey the role of a tool in an event.
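The reported coefficient, interval, and test statistic can be cross-checked against one another. The sketch below assumes the interval is a Wald interval (estimate ± 1.96 standard errors) and that the p value comes from a standard normal reference distribution; both are assumptions, since the article does not describe how the intervals were computed. The helper name `wald_from_ci` is illustrative.

```python
from math import erfc, sqrt

def wald_from_ci(b, ci_low, ci_high, z_crit=1.96):
    """Recover SE, z, and a two-sided normal p value from a symmetric 95% CI."""
    se = (ci_high - ci_low) / (2 * z_crit)
    z = b / se
    p = erfc(abs(z) / sqrt(2))  # two-sided tail area of the standard normal
    return se, z, p

# Values reported in the text for mean spoken-language tool prominence.
se, z, p = wald_from_ci(b=1.34, ci_low=0.54, ci_high=2.14)
print(f"SE = {se:.2f}, z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the recovered z agrees with the reported value to within rounding.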
What factors led the homesigners and users of the three spoken languages to describe the event pictures in similar ways? Events of cutting and sawing were highly likely to elicit both instrumental handshapes and tool-prominent verbs. Events of brushing, digging, stirring, and painting also frequently elicited instrumental handshapes and received fairly high tool-prominence scores. We suggest the following generalization about what these events have in common: To the extent that the shape of the tool corresponds to the shape that the patient assumes after the tool is used on it, the event will elicit tool-oriented descriptions across participants. We refer to this spatial dimension of events as “tool/patient molding.” For example, in slicing a piece of meat, the shape of the knife resembles the resulting slice made in the meat. This generalization is consistent with the “predictability of the locus of separation” dimension proposed by Majid et al. (2008) in their study of cutting and breaking events. However, our generalization is broader, as it also includes molding configurations found in events such as brushing, digging, and stirring. For digging, for example, the shape of the tool (e.g., a shovel) causes a similarly shaped indentation in the patient (e.g., soil). For these events, the molding configuration is less prominent than for cutting/sawing events, leading to intermediate use of instrumental handshapes and tool-prominent verbs.
Although the spoken languages and homesigners converged along the dimension of tool/patient molding, these two groups were not in perfect alignment across the entire sample of events. For a few events, the spoken languages avoided tool-prominent verbs, but the homesigners used instrumental handshapes (e.g., playing a drum with a drumstick, hitting a hoop with a stick, a dentist examining teeth with a magnifying glass). We also observed events in which the spoken languages used tool-prominent verbs but the homesigners avoided instrumental handshapes (e.g., hammering a nail, rowing a boat, writing on a blackboard). We will return to this latter set of events in the Discussion.
Thus far, we have presented results aggregated across the 10 homesigners. Because homesigners have each developed their own gesture systems, we need to evaluate whether spoken-language verbs predict instrumental handshape for individual homesigners. To the extent that conceptual knowledge about tool/patient molding constrains linguistic encoding of these events, we expected that this constraint would influence individual homesigners’ descriptions. We tested this prediction by computing a median split over the mean spoken-language tool-prominence scores, dividing the 65 event pictures into “high” and “low” categories. For each homesigner, we then calculated rates of instrumental handshape for pictures in these categories. We chose this analytic strategy because data for individual event pictures are relatively sparse at the individual homesigner level. Figure 6 displays these values for each homesigner, showing that across individuals, instrumental handshape is numerically more common for pictures with high tool-prominence scores than for pictures with low tool-prominence scores (note that the size of this difference is relatively small for the U.S. homesigner).⁵ This result suggests relatively strong constraints on the relationship between event concepts and event language, both within a single culture (the Guatemalan homesigners) and across cultures.
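The median-split analysis described above can be sketched in two steps: label each picture as high or low relative to the median score, then tally instrumental handshapes per homesigner within each category. All scores, signer labels, and coded descriptions below are hypothetical. Assigning a score exactly at the median to the "low" category is also an assumption, since the article does not say how ties were handled.

```python
from statistics import median

def median_split(scores):
    """Label each picture 'high' if its score exceeds the median, else 'low'."""
    cut = median(scores.values())
    return {pic: ("high" if s > cut else "low") for pic, s in scores.items()}

def rates_by_category(descriptions, categories):
    """Instrumental-handshape rate per (homesigner, category).

    descriptions: iterable of (homesigner, picture, code) with code in
    {'instrumental', 'handling'}; ambiguous codes are assumed already removed.
    """
    tallies = {}
    for signer, picture, code in descriptions:
        key = (signer, categories[picture])
        inst, total = tallies.get(key, (0, 0))
        tallies[key] = (inst + (code == "instrumental"), total + 1)
    return {key: inst / total for key, (inst, total) in tallies.items()}

# HYPOTHETICAL mean tool-prominence scores and coded descriptions.
scores = {"slice": 0.78, "saw": 0.90, "serve": 0.10, "wash": 0.25}
descriptions = [
    ("A", "slice", "instrumental"), ("A", "saw", "instrumental"),
    ("A", "slice", "handling"), ("A", "serve", "handling"),
    ("B", "saw", "instrumental"), ("B", "wash", "handling"),
    ("B", "wash", "instrumental"),
]

categories = median_split(scores)
rates = rates_by_category(descriptions, categories)
for key in sorted(rates):
    print(key, round(rates[key], 2))
```

Pooling pictures into two categories trades per-picture detail for more descriptions per cell, which is the motivation the text gives for this strategy.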

Rate of producing instrumental handshapes for event pictures with high versus low tool-prominence scores, separately for each homesigner. The number over each bar indicates how many descriptions that homesigner produced in that category. Darker bars indicate a higher number of event descriptions. The homesigner Guatemala 2 did not describe any events in the low tool-prominence category.
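The median-split analysis described above can be illustrated with a minimal sketch. This is not the authors' analysis code: the picture names, scores, and handshape codes are invented toy data, the function names are hypothetical, and two choices are assumptions rather than details stated in the text (scores equal to the median are placed in the "low" category, and ambiguous handshapes are skipped).

```python
from collections import defaultdict
from statistics import median


def median_split(scores):
    """Label each picture 'high' or 'low' relative to the median score."""
    cutoff = median(scores.values())
    # Assumption: a score exactly at the median counts as "low".
    return {pic: ("high" if s > cutoff else "low") for pic, s in scores.items()}


def instrumental_rates(descriptions, categories):
    """Rate of instrumental handshape per (homesigner, category).

    descriptions: iterable of (homesigner, picture, handshape) tuples.
    """
    counts = defaultdict(lambda: [0, 0])  # [instrumental count, total count]
    for signer, picture, handshape in descriptions:
        if handshape not in ("instrumental", "handling"):
            continue  # Assumption: ambiguous handshapes are not counted.
        cell = counts[(signer, categories[picture])]
        if handshape == "instrumental":
            cell[0] += 1
        cell[1] += 1
    return {key: inst / total for key, (inst, total) in counts.items()}


# Toy data (hypothetical): mean tool-prominence score per event picture.
scores = {"slice": 0.9, "paint": 0.7, "push": 0.2, "carry": 0.1}
descriptions = [
    ("A", "slice", "instrumental"),
    ("A", "paint", "handling"),
    ("A", "push", "handling"),
    ("A", "carry", "handling"),
    ("B", "slice", "instrumental"),
    ("B", "push", "instrumental"),
    ("B", "carry", "ambiguous"),
]

categories = median_split(scores)
rates = instrumental_rates(descriptions, categories)
for key in sorted(rates):
    print(key, rates[key])
```

Because each homesigner contributes only a handful of descriptions per picture, pooling pictures into two bins in this way trades picture-level detail for more stable per-child rates, which is the motivation the text gives for the strategy.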
Tool objects
We found that event pictures described with tool-prominent verbs by English, Spanish, and Chinese speakers are likely to be described using instrumental handshapes by homesigners. One possibility is that this result does not reflect event conceptualization but rather reflects functional knowledge of the tool objects themselves. For example, knives are frequently used for cutting; is it the cutting event or the knife itself that elicits tool-prominent descriptions? To explore this possibility, we asked whether tool-prominent verb use would also predict handshape in homesigners’ descriptions of tools presented in isolation (e.g., a picture of a knife on its own). Recall from the Event and Object Descriptions section that for each of the 88 tool object pictures that homesigners described, we asked speakers of English, Spanish, and Chinese to name an action that is commonly performed with this tool.
We observed 267 object descriptions across all 10 homesigners: 174 of these signs had handling handshapes, 74 had instrumental handshapes, and 19 had ambiguous or nondiscernible handshapes. Signs with ambiguous or nondiscernible handshapes were excluded from analysis. Was instrumental handshape for object descriptions predicted by spoken-language tool-prominence scores for object pictures? Across the three spoken languages, tool-prominence scores for object pictures were significantly positively correlated—English vs. Spanish: r(86) = .78, p < .001; English vs. Chinese: r(85) = .64, p < .001; Spanish vs. Chinese: r(85) = .70, p < .001. Turning to the relationship between tool-prominence scores for objects and instrumental handshape in homesign, we found a relatively weak and only marginally significant correlation, r(80) = .20, p = .071 (Fig. 7). In our regression model of instrumental handshape, we found that mean tool-prominence score for object pictures (weighted equally across English, Spanish, and Chinese) did not significantly predict instrumental handshape in homesign, b = 1.21, 95% CI = [−1.02, 2.03], z = 1.06, p > .1. In addition, none of the individual spoken languages predicted instrumental handshape for object pictures (English: z = 1.35, Spanish: z = .91, Chinese: z = .17).

Tool-prominence scores averaged across English, Spanish, and Mandarin Chinese versus rates of instrumental handshapes for each object picture. Point size indicates the total number of descriptions of that picture across all homesigners. Example object pictures are labeled.
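As a rough check on how a correlation of this size relates to its reported significance, the sketch below converts a Pearson r and its degrees of freedom into a t statistic and a two-tailed p value. It assumes the standard t test on a correlation coefficient, which the text does not explicitly name as the method used; the function names are hypothetical, and the p value is obtained by simple numerical integration rather than a statistics library.

```python
import math


def t_from_r(r, df):
    """t statistic for a Pearson correlation r with df = n - 2."""
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # area between 0 and |t|
    return 1.0 - 2.0 * central_half


# Values taken from the correlation reported for object pictures.
t_value = t_from_r(0.20, 80)
p_value = two_tailed_p(t_value, 80)
print(round(t_value, 3), round(p_value, 3))
```

Under these assumptions, r = .20 with 80 degrees of freedom gives a t statistic of about 1.83 and a p value of roughly .07, in line with the marginal result reported in the text.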
We examined patterns across individual homesigners by splitting the mean tool-prominence scores for object pictures into high and low categories, as described earlier for event pictures. We did not observe a clear pattern across individual children: Six children used a greater proportion of instrumental handshapes for object pictures in the high category than for those in the low category, three children showed the reverse pattern, and one child used instrumental handshapes at equal rates across the two categories of pictures (Fig. S3 in the supplementary materials displays these results). In summary, we did not find a clear relationship between spoken-language verbs and homesigners’ handshapes for static objects. The positive relationship we observed between tool prominence for spoken language and for homesign in event pictures is therefore not likely to reflect knowledge of objects per se but rather knowledge about events and how to communicate about them.
Discussion
Use of tool-prominent verbs by speakers of English, Spanish, and Chinese predicted use of instrumental handshapes in 10 homesigners from around the globe. This correspondence suggests universal constraints on the interface between conceptual event knowledge and language because we observed this common pattern across children and adults, across different cultures, and across individuals who had learned a conventional language from birth and individuals who had not. We ruled out several alternate explanations for our findings: The relative size of the tool did not predict rates of instrumental handshape, nor did handshape affordance ratings. In addition, we observed a positive relationship between spoken language and homesign descriptions only for tool event pictures, not for tool objects on their own. The constraints we have isolated are thus specific to event representation.
In the domain of event cognition, it has been particularly difficult to describe the structure of the concepts, or “sortal” categories (Shukla & de Villiers, 2021), that interface with language. Consider, by contrast, the extensive literature on the language of perception (see Majid et al., 2018). Researchers have made significant progress in understanding the causes and consequences of linguistic variability in the domain of color, for example, because perceptual color space is easily definable independent of language. Conceptual representations of events are less easily defined, limiting researchers’ ability to probe crucial components of how human cognition and language interact.
Our understanding of tool cognition and language is far less developed. At a coarse-grained level, the distinction between artifacts (e.g., knives) and natural kinds (e.g., gorillas) is robustly encoded in both language (Hwang et al., 2016; Levin et al., 2019) and conceptual representation (see Ishibashi et al., 2016; Kemmerer, 2019; Mahon & Caramazza, 2009, for reviews). At a more fine-grained level, however, it is unclear whether the categories that are important for tool language are also important for tool concepts. For example, in English, Dutch, and German, tools and nontool body parts can belong to the same linguistic category because they can be marked morphologically in the same way (e.g., “she cut the bread with a knife,” “she pressed the button with her foot”; Rissman et al., 2022). This equivalence is not necessarily conceptually universal given the strong distinction between artifacts and natural kinds. Our results point to a specific dimension of spatial event representation, tool/patient molding, that may be particularly prominent in the concept/language interface, with conceptual and linguistic categories in close alignment. Our findings are also significant because they suggest that universal constraints on the concept/language interface for events extend to tool categories and are not limited to the more prominent, well-studied thematic roles of agent, patient, recipient, and goal (see Rissman & Majid, 2019).
Despite the similarities that we observed across the three spoken languages and the homesigners’ descriptions, we also saw the type of variability that is the norm in studies of cross-linguistic semantics. The spoken languages were far from perfectly correlated with each other, and the correlations between pairs of spoken languages were higher than the correlation between the spoken languages and homesign. Notably, pictures of writing and drawing received high tool-prominence scores but were never described by homesigners with instrumental handshape. The homesigners in our sample were familiar with writing/drawing implements such as pencils. One interpretation of these findings is that the tool in writing/drawing events is less conceptually prominent than the tool in molding events, leading to greater cross-linguistic variability in how writing/drawing events are encoded. An alternate possibility is that homesigners avoided instrumental handshapes for these events (specifically, an extended index finger, or “1” handshape) because the “1” handshape in homesign has other functions—an indexical function and a tracing function (in which the finger outlines the shape of an object or path of a motion in the air). These functions may compete with the instrumental function, leading homesigners to favor handling handshapes for writing/drawing events. The study on novel-verb learning that we sketch below can shed light on whether tools in writing/drawing events have lower conceptual prominence than tools in molding events.
These findings allow us to make multiple predictions about how children and adults interpret events and learn language. First, we expect that conventional languages are strongly biased to use tool-prominent verbs to describe the event pictures in the upper right-hand corner of Figure 5 (where both spoken-language speakers and homesigners used tool-oriented language). Research on established sign languages shows variability in use of instrumental versus handling handshapes, but the reasons for this variability are not well understood (see Nyst et al., 2021). We predict that event semantics (specifically events with tool/patient molding) is one factor predicting the instrumental/handling alternation in sign languages.
If the tool is especially prominent when it molds the patient, then this event concept should be robust independent of language. Four-month-old infants better individuate objects when these objects are seen to function as tools (Stavans & Baillargeon, 2018). If our interpretation is correct, infants ought to better individuate tools that are in patient-molding configurations than tools that are not. Because infant biases to interpret events are reflected in how adults remember events (Lakusta et al., 2007; Lakusta & Landau, 2012), we predict adults will have more robust memory for tools that mold the patient than tools that do not.
Finally, our findings are relevant to theories of child verb learning. Children demonstrate biases in the relation between event concepts and language: For example, children often construe causative events (e.g., one person pushing another) as an agent acting on a patient, corresponding to a transitive syntactic frame (e.g., “Big Bird is pushing Cookie Monster”; see Rissman & Majid, 2019, for a review). This correspondence establishes the conditions for “syntactic bootstrapping,” a learning mechanism by which children use a verb’s syntactic distribution to infer the verb’s meaning (Fisher et al., 2020; Lidz et al., 2003). For instance, children may assume a one-to-one mapping between participants in conceptual structure and arguments in syntactic structure. Our results suggest that when children perceive molding events, they represent the tool as a prominent participant in the conceptual structure of the event. However, in sentences with tool-prominent verbs such as slice, a noun for the tool (e.g., knife) is often omitted (Rissman, 2022; Rissman et al., 2015). These data suggest that children cannot use syntactic bootstrapping to learn that verbs such as slice are tool prominent. Our findings thus call for a more nuanced account of how children relate conceptual representations to noun phrases to verb arguments.
We also predict that some tool-prominent verbs are easier to learn than others. If children are biased to view a tool in a molding event as more prominent than a tool in a nonmolding event, they should be more likely to interpret a novel verb as conveying tool information for events in which the tool molds the patient. For this reason, it may be more difficult to learn that write and draw have instrumental semantics than to learn that slice and cut do, as only the latter set of verbs encode tool/patient molding. This prediction could be tested by creating events in which tools are more or less likely to mold the patient and then teaching children verbs for these events. If our hypothesis is correct, children should interpret the verb as conveying tool information more often for molding events.
In conclusion, we investigated the interface between concepts and language using a unique source of evidence, gestures from child homesigners across five countries. Through this evidence, we identified a sortal category (tool/patient molding) that is particularly important for language. This result allows us to make a range of predictions for how people process events and event language, opening up new lines of inquiry across multiple disciplines.
Supplemental Material
Supplemental material, sj-pdf-1-pss-10.1177_09567976221140328 for Universal Constraints on Linguistic Event Categories: A Cross-Cultural Study of Child Homesign by Lilia Rissman, Laura Horton and Susan Goldin-Meadow in Psychological Science
Footnotes
Transparency
Action Editor: Sachiko Kinoshita
Editor: Patricia J. Bauer
