Abstract
Drawing from construal level theory, we test the hypothesis that words promote thinking of events in terms of their abstract and central features (i.e., high-level construal), whereas pictures promote thinking in terms of more concrete and idiosyncratic features (i.e., low-level construal). In Experiments 1a and 1b, we found that verbal (vs. pictorial) presentation of objects led to broader, more inclusive categorization of those objects. In Experiment 2, we found that word (vs. picture) priming led to greater global (vs. local) processing of subsequent perceptual information. Finally, in Experiments 3 and 4, we tested the opposite direction of causality. Thinking about high-level “why” versus relatively low-level “how” (Experiment 3) and thinking about high-level categories versus relatively low-level exemplars (Experiment 4) led to more verbal versus pictorial thought. These findings provide converging evidence that medium (word, picture) is associated with level of construal.
People use words and pictures to represent the world around them in order to create meaning and to communicate with others. For example, you are exposed to verbal information when you see a billboard that says, “Help victims of Hurricane Sandy,” when you hear your friend describe her trip to Rome, or when you read a book. Information can also be presented in pictorial form. The billboard might depict a family next to their dilapidated home, your friend could send you pictures of her trip, or you could watch a film adaptation of the book. How does medium (visual, verbal) affect the way people think about a piece of information and process that information? We propose that pictures promote immersion in the specifics of situations via low-level construal, whereas words allow one to transcend the particulars of situations and consider their more abstract and essential properties via high-level construal. For example, reading the slogan about helping hurricane victims might lead people to consider the issue in a more global and abstract way (i.e., about disaster relief in the United States). A pictorial presentation of the same problem might lead individuals to process the information in a more local and concrete way (e.g., think about the specific family in the picture or the particular instance of Hurricane Sandy). In this work, we empirically test this proposed association between medium and construal level.
Levels of Construal
A given object, event, or person can be represented in various ways that emphasize different aspects of the target. For instance, “traveling” can be described as “creating memories,” which emphasizes the purpose of the activity, or as “taking a plane,” which highlights the means by which one might travel. Construal level theory (CLT; see Trope & Liberman, 2010, for a review) proposes that people represent or construe events differently as a function of psychological distance—the degree to which events are removed from direct experience. Detailed specifics about events are typically unavailable or unreliable when those events take place in a distant time, in a distant location, to dissimilar others, and are less likely to occur. CLT proposes that people’s response to psychological distance is to construct representations that highlight the more central properties of events (i.e., high-level construal) rather than the relatively more peripheral and secondary aspects (i.e., low-level construal). This is functional because the central and most defining aspects of events are less likely to change across different contexts (e.g., here vs. there; now vs. later), whereas superficial details are more variable and dependent on the particular situation. For example, “why” one travels is less likely to depend on the specific situation compared to “how” one travels, which is more likely to change depending on the context. In short, high-level construal promotes transcendence of the particulars, while low-level construal promotes immersion in them.
Accordingly, CLT research shows that psychological distance prompts high-level construal of information, while proximity prompts relatively low-level construal. This link between psychological distance and level of construal has been demonstrated in various ways. For example, people are more likely to think in terms of broad, inclusive categories when considering objects that are associated with a psychologically distant event (e.g., Amit, Algom, & Trope, 2009a; Liberman, Sagristano, & Trope, 2002). Furthermore, psychological distance facilitates processing of the gestalt, while psychological proximity facilitates processing of details (e.g., Amit, Algom, Trope, & Liberman, 2009b; Wakslak, Trope, Liberman, & Alony, 2006). Distance also promotes thinking about others more in terms of their central and abstract dispositional traits (e.g., Nussbaum, Trope, & Liberman, 2003; Rim, Uleman, & Trope, 2009). Creating broader versus narrower categories, focusing on the whole versus the constituent parts, and focusing on abstract traits versus concrete behaviors are representational processes that map onto high-level construal and low-level construal, respectively.
Pictures Versus Words
In the present research, we argue that pictures and words are cognitively associated with relatively different levels of construal. We propose that while pictures are associated with low-level construal, words are associated with high-level construal. Although there may be exceptions (like onomatopoeia and abstract art), in general, most pictures are highly concrete, whereas most words are relatively more abstract. Pictures are icons, analogue representations of specific objects in a definite time and place. Visually, they resemble the object depicted and provide a snapshot of what direct experience with that specific object is like. This is a representational process similar to low-level construal. Words, by contrast, do not visually resemble their referent objects. They are symbolic, abstract representations that capture the essential and categorical features of objects. The word “CHAIR” represents many possible exemplars (e.g., office chair, kitchen stool, armchair, rocking chair), whereas a picture of that same chair is a singular exemplar. Even a more specific version of the word (e.g., “ROCKING CHAIR”) is more abstract than a picture because the picture still contains the idiosyncratic and concrete details that render the depicted object unique (e.g., color, size, scratches, logos). Thus, representing events linguistically (i.e., in words) reflects a representational process similar to high-level construal. We propose that these similarities lead pictures to be associated with low-level construal and words to be associated with high-level construal.
Suggestive evidence for the link between medium and construal level comes from research which shows that medium is associated with psychological distance. For example, Amit, Algom, and Trope (2009a) demonstrated that psychological distance (vs. proximity) is associated with greater verbal (vs. pictorial) thinking on object identification, categorization, and selective attention tasks. Extending this link between psychological distance and medium of presentation to the domain of communication, recent research found that people increasingly prefer to use pictures when communicating with temporally, spatially, and socially proximal others, but they increasingly prefer to use words when communicating with temporally, spatially, and socially distant others (Amit, Wakslak, & Trope, 2013). People thus appear to link pictures and words with psychological proximity and distance, respectively.
Although the effect of distance on medium is consistent with our hypothesis that medium and construal are associated, it is important to note that distance may affect medium for reasons other than construal. For example, a perceptual explanation would also predict such a relationship between distance and medium because the perception of an object usually implies proximity to it (e.g., if one sees it, it is usually close by). Thus, Amit et al. (2009a) does not provide unequivocal evidence that medium should be associated with construal. In the present research, we systematically tested this hypothesis. Specifically, we examined whether presenting information in verbal versus pictorial form would promote high-level construal versus low-level construal, respectively (Experiments 1 and 2). Additionally, we examined whether this relationship is bidirectional in nature by priming construal (i.e., thinking in terms of high-level ends vs. low-level means) and assessing medium of thought (Experiments 3 and 4).
Experiment 1a
One characteristic of high-level relative to low-level construal is breadth of categorization. High-level construal promotes broader, more inclusive categories (e.g., furniture), whereas low-level construal is reflected in more narrow, less inclusive categories (e.g., wooden chairs). In Experiment 1a, we examined the effect of medium (picture vs. word) on the breadth of categories into which items would be classified. We presented participants with a series of events (e.g., a camping trip) and for each event, participants were given a set of items (e.g., brush, tent), either in pictorial or in verbal form, and were asked to group them into as many categories as they thought appropriate. We predicted that pictorial presentation of items would lead to a greater number of categories than verbal presentation of items.
Method
Participants and Design
Twenty-four undergraduates from Tel Aviv University participated in this study for course credit. Participants were randomly assigned to one of two between-subject conditions in a 2 (Medium: Pictures vs. Words) × 3 (Scenario: Camping, Yard Sale, and Moving) mixed design with the second factor within subject. Sample size was determined by recruiting as many participants as possible before the end of the semester (approximately 1 week), and data were not analyzed until recruitment was completed.
Procedure
Participants considered three events (adapted from Liberman et al., 2002): a camping trip, moving an apartment, and a yard sale. For each scenario, participants were presented with a set of 38 items. Half of the participants saw these items as words, while the other half saw them as pictures (i.e., color photographs of items; see Figure 1). As in Liberman, Sagristano, and Trope (2002), participants were asked to group the items into as many different groups as they saw fit (see Table 1 for stimuli and instructions). The total number of categories that participants grouped the items into for each of the three scenarios served as the measure of construal. After completing the tasks, participants were thanked and debriefed.

Figure 1. Example of stimuli used in Experiments 1a (left) and 1b (right): Item 30 (ice skates) in the yard sale scenario.
Table 1. Stimuli and Instructions Used in Experiment 1 (Only Camping and Yard Sale Used in 1b).
Results and Discussion
We conducted an independent samples t-test and found the predicted main effect of medium such that participants presented with the items in verbal form (M = 5.94, SD = 1.14) generated fewer categories than those who were presented with the same items in pictorial form (M = 8.33, SD = 2.15), t(22) = 3.41, p = .003, d = 1.39. This provides preliminary support for our hypothesis that words prompt high-level construal, whereas pictures prompt low-level construal.
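As a rough arithmetic check on the statistics reported above, the effect size and test statistic can be approximately recovered from the summary values alone. The sketch below assumes the 24 participants were split evenly (12 per condition), which the text does not state explicitly, and uses the standard pooled-variance formulas; the function names are illustrative and not taken from the original analysis.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference using the pooled SD."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def student_t(m1, sd1, n1, m2, sd2, n2):
    """Equal-variance independent-samples t statistic from summary values."""
    se = pooled_sd(sd1, n1, sd2, n2) * math.sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / se


# Assumption: equal group sizes of 12 (24 participants, two conditions).
N_PER_GROUP = 12

# Pictures: M = 8.33, SD = 2.15; words: M = 5.94, SD = 1.14 (as reported).
d = cohens_d(8.33, 2.15, N_PER_GROUP, 5.94, 1.14, N_PER_GROUP)
t = student_t(8.33, 2.15, N_PER_GROUP, 5.94, 1.14, N_PER_GROUP)
df = 2 * N_PER_GROUP - 2

print(f"d = {d:.2f}, t({df}) = {t:.2f}")
```

Under these assumptions the sketch gives d of about 1.39 and t(22) of about 3.40, in line with the reported values; the small gap in t plausibly reflects rounding of the reported means and standard deviations.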
Experiment 1b
In Experiment 1a, we found that participants placed items into more discrete categories when they were presented in the form of pictures than in the form of words. Experiment 1b attempted to replicate Experiment 1a using more abstract, black-and-white line drawings to address an alternative interpretation that it was the richness in detail contained in the color photographs when compared to the words that produced the previous finding.
Method
Participants and Design
Forty-eight undergraduates (25 male, M_age = 19.48, SD = 1.81) from William Paterson University participated in this study for course credit. Participants were randomly assigned to one of four between-subjects conditions in a 2 (Medium: Words vs. Pictures) × 2 (Order of Scenario: Camping First vs. Yard Sale First) × 2 (Scenario: Camping Trip and Yard Sale) mixed design with the last factor within subjects. Sample size was determined by recruiting as many participants as possible within a 1-week window, with a minimum requirement of 20 participants per medium condition (words vs. pictures).
Procedure
Participants were asked to think about two events (adapted from Liberman et al., 2002): a camping trip and a yard sale. The selection of these two events was guided by practical considerations: At the time we conducted the study, we were readily able to obtain black-and-white line drawings (see, e.g., Figure 1) for these two scenarios and not the third (moving an apartment) that we used in Experiment 1a. The procedure was the same as in Experiment 1a.
Results and Discussion
Given the centrality of verbal comprehension, we excluded 12 nonnative speakers. We conducted an independent samples t-test and found the predicted effect of medium on number of categories generated such that participants presented with the items in verbal form (M = 4.67, SD = 0.90) generated fewer categories than those who were presented with the same items in pictorial form (M = 6.12, SD = 1.83), t(34) = 2.83, p = .008, d = 1.01. In combination with Experiment 1a, this provides evidence that words prompt high-level construal and therefore lead to the generation of broader, and hence fewer, categories when compared to pictures, which prompt low-level construal.
Experiment 2
Experiment 2 builds on Experiment 1 by demonstrating that medium (pictures vs. words) can influence how subsequent, unrelated information is processed. That is, presenting stimuli in pictorial versus verbal format may induce different procedural mind-sets (e.g., Freitas, Gollwitzer, & Trope, 2004; Fujita, Trope, Liberman, & Levin-Sagi, 2006), invoking a general tendency not only to represent focal stimuli via low-level construal versus high-level construal but also extending the tendency to unrelated stimuli. To test this, we examined the effect of exposing people to words versus pictures on global versus local processing. High-level construal can be conceptualized as seeing the “gestalt,” while low-level construal is exemplified by a relatively greater focus on the constituent details. To measure the extent of global versus local processing, we adopted a task that has been used in past CLT research (e.g., Förster, Liberman, & Shapira, 2009). Our prediction was that word priming would lead to more global, high-level processing when compared to picture priming.
Method
Participants and Design
Forty students from Harvard University’s community sample (28 female) took part in the experiment for a payment of 5 dollars. Age ranged from 18 to 36 years (M = 23.4, SD = 4.26). Medium was primed within participants. Sample size was determined by recruiting as many participants as possible within a 1-week window, with a minimum requirement of 20 participants per medium condition (words vs. pictures).
Procedure
The experiment was divided into two main blocks with a filler task separating the blocks. Each block began with a medium priming task that was introduced as a task on spatial perception. In one block, the items on this priming task appeared in pictures, while in the other block the items appeared in words. The order of medium prime was counterbalanced across participants. On each trial of the priming task, participants were presented with two pictorial or verbal items on the screen, one on the right side and the other on the left side. One item (a picture of an apple on picture prime trials and the word “tomato” on word prime trials; see Figure 2) served as the target across all trials, and the other item (e.g., a picture of broccoli or the word “broccoli”) changed from one trial to the next. The filler items were the same across the two priming conditions and only varied in the medium in which they were presented. The location of the target item (left or right) was randomized across the trials. On each trial, participants were asked to indicate the location of the target item by pressing the “S” or “L” key on the keyboard to indicate left or right, respectively. The priming task in each of the two blocks consisted of 80 trials.

Figure 2. Sample picture prime trial (top) and word prime trial (bottom) in the medium priming task in Experiment 2.
After completing the medium priming task, participants performed the Kimchi–Palmer task (Kimchi & Palmer, 1986), which was presented as an ostensibly unrelated visual matching task. There were 12 trials in each block. On each trial, three figures were shown: one at the top (the focal figure) and two at the bottom, on the left and right sides of the screen (the comparison figures). Each figure was a geometric shape made up of smaller geometric shapes (e.g., a triangle made up of circles). Importantly, one of the comparison figures always matched the focal figure at the feature level (local match), whereas the other comparison figure always matched the focal figure at the configural level (global match). For example, if the focal figure was a triangle made up of circles, the local match might be a square made up of circles and the global match might be a triangle made up of squares (see Figure 3). The task was to indicate which of the two comparison figures seemed more similar to the focal figure. On half of the 12 trials within each block, the global match was on the right side of the screen and on the other half, on the left. We randomized the order of the trials within each block.

Figure 3. Example of stimuli used in the Kimchi–Palmer task in Experiment 2. The focal figure is presented at the top and two comparison figures (global match left; local match right) at the bottom.
Between the two blocks, participants performed a buffer task to prevent cross-block contamination. Participants were asked to complete 10 trials (10 s each) of simple math problems (e.g., 3 + 12 – 6 =). After completing the experiment, participants were thanked and debriefed.
Results and Discussion
As before, we excluded three nonnative English speakers. The main dependent variable was the average number of global matches chosen on the Kimchi–Palmer task. A paired samples t-test showed that, as predicted, participants chose the global match a greater number of times after exposure to the word primes (M = 6.54, SD = 2.17) versus picture primes (M = 5.59, SD = 1.82), t(36) = 2.22, p = .033, d = .37. Words and pictures therefore appear not only to impact people’s construal of focal targets but also to be capable of inducing a general tendency (i.e., a procedural mind-set) to construe objects in high-level versus low-level terms.
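For a within-participants comparison such as this one, a common convention (assumed here; the text does not say which formula was used) is to express the effect size as the t statistic divided by the square root of the number of paired observations. The short sketch below applies that convention to the values reported above; the function name is illustrative.

```python
import math


def paired_effect_size(t_value: float, n_pairs: int) -> float:
    """Effect size for a paired design, computed as t / sqrt(n).

    Assumption: this is the convention behind the reported d; the
    source does not name the formula.
    """
    return t_value / math.sqrt(n_pairs)


n_recruited = 40   # participants who took part
n_excluded = 3     # nonnative English speakers removed from analysis
n_pairs = n_recruited - n_excluded
df = n_pairs - 1   # degrees of freedom for a paired samples t-test

d_z = paired_effect_size(2.22, n_pairs)
print(f"df = {df}, d_z = {d_z:.3f}")
```

With 37 analyzed participants, this yields 36 degrees of freedom and an effect size of about 0.365, which agrees with the reported t(36) and d = .37 to within the rounding of the reported t value.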
Experiment 3
We demonstrated in Experiments 1 and 2 that pictures and words evoke low-level construal and high-level construal, respectively. Experiment 3 examines the association between medium and level of construal in the reverse direction. Specifically, invoking low-level construal versus high-level construal should facilitate thinking in terms of pictures versus words, respectively. To manipulate construal, we presented participants with a focal behavior and asked them to either generate the superordinate ends achieved by the action (why) or subordinate means by which the action is implemented (how). Past research has indicated that ends-focused thought is a characteristic of high-level construal, whereas means-focused thought is a characteristic of low-level construal (e.g., Freitas et al., 2004). We predicted that considering “why” one would engage in various behaviors (e.g., maintain good physical health) should involve more verbal versus pictorial thinking when compared to considering “how” one would engage in the same behaviors.
Method
Participants and Design
Sixteen undergraduates at New York University participated in this study for course credit. Whether participants thought of “why” or “how” of a series of behaviors was manipulated within participants. Sample size was determined by recruiting as many participants as possible before the end of the semester (approximately 1 week), with a minimum requirement of 12 total participants for this within-subject design.
Procedure
Participants were presented with a series of 50 behaviors (e.g., maintain good physical health; taking your dog out). For half of the behaviors, they were asked to think about “why” they would engage in those behaviors and for the other half “how” they would engage in those behaviors. We counterbalanced the order of the task. After thinking about “why” or “how,” participants were asked to report whether they thought about it in words or pictures. Specifically, they were told, “People can sometimes think about behaviors in different ways. Sometimes, they ‘see a picture’ or an image in their mind. At other times, they ‘hear words’ or use inner speech to think about it.” Then they were asked, “When prompted to think about why [how] you would engage in the behavior on the previous screen, did you see a picture or an image in your mind, or did you hear words or use inner speech to think about it?” Finally, participants completed a demographic questionnaire and were thanked and debriefed.
Results and Discussion
Choice of “picture” was coded as 1 and choice of “word” was coded as 2, so higher scores indicate more verbal thinking. For each participant, we computed the average choice score for “why” and “how” and submitted these two scores to a paired samples t-test. As predicted, participants were more likely to think in terms of words than pictures when thinking of reasons why they would perform behaviors (M = 1.53, SD = .20) than when thinking of how they would perform those behaviors (M = 1.31, SD = .21), t(15) = 3.15, p = .007, d = .79. This is consistent with our hypothesis that pictures are associated with low-level construal and words are associated with high-level construal.
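The scoring and test described above can be sketched in a few lines. The response data below are entirely hypothetical (four participants, five trials per condition) and serve only to show how the 1/2 coding turns into per-participant averages and then into a paired samples t statistic; the function names are illustrative rather than drawn from the original analysis.

```python
import math
from statistics import mean, stdev

PICTURE, WORD = 1, 2  # coding scheme described in the text


def condition_score(choices):
    """Average coded choice for one participant in one condition.

    Ranges from 1 (all pictures) to 2 (all words); subtracting 1
    gives the proportion of "word" responses.
    """
    return mean(choices)


def paired_t(xs, ys):
    """Return (t, df, d_z) for paired scores xs and ys."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    sd = stdev(diffs)  # sample SD of the difference scores
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1, mean(diffs) / sd


# Hypothetical responses, NOT data from the study.
why_choices = [
    [WORD, WORD, PICTURE, WORD, PICTURE],
    [WORD, PICTURE, WORD, WORD, WORD],
    [PICTURE, WORD, WORD, PICTURE, WORD],
    [WORD, WORD, WORD, PICTURE, PICTURE],
]
how_choices = [
    [PICTURE, PICTURE, WORD, PICTURE, PICTURE],
    [WORD, PICTURE, PICTURE, WORD, PICTURE],
    [PICTURE, PICTURE, PICTURE, WORD, PICTURE],
    [PICTURE, WORD, PICTURE, PICTURE, WORD],
]

why_scores = [condition_score(c) for c in why_choices]
how_scores = [condition_score(c) for c in how_choices]
t, df, d_z = paired_t(why_scores, how_scores)
print(f"t({df}) = {t:.2f}, d_z = {d_z:.2f}")
```

Because the score is simply 1 plus the proportion of "word" responses, the reported means can be read directly as proportions: a mean of 1.53 corresponds to choosing "word" on 53% of trials, and a mean of 1.31 to 31%.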
Experiment 4
In Experiment 4, we operationalized level of construal in a different way: manipulating whether participants generated abstract, high-level categories or relatively more concrete, low-level exemplars. We presented participants with a focal item (e.g., car) and asked them to either think of a category that this item belongs to or to think of an example of this item. Past research has shown that thinking of superordinate, broad categories is a characteristic of high-level construal, whereas thinking of subordinate, concrete exemplars is a characteristic of low-level construal (e.g., Fujita et al., 2006). We predicted that thinking about categories should involve more verbal versus pictorial thinking when compared to thinking about exemplars of the same items.
Method
Participants and Design
One hundred participants (43 male, 57 female) were recruited on Amazon’s Mechanical Turk (Buhrmester, Kwang, & Gosling, 2011). Age ranged from 18 to 82 years (M = 38.36, SD = 14.75; 5 participants did not report their age). Whether participants thought of categories or examples of a series of items was manipulated within participants. To assuage concerns that the results of Experiment 3 were spurious due to a relatively small sample size, and because online participants are readily available through Amazon’s Mechanical Turk, we recruited a total of 100 workers. No data were analyzed until data from 100 participants were collected.
Procedure
Participants were presented with a series of 30 items (e.g., movie). For half of the items, they were asked to think about a category that the items belonged to (e.g., entertainment) and for the other half, examples of the items (e.g., Gone with the Wind). We counterbalanced the order of the task. After generating a category or an exemplar, participants were asked to report whether they thought about it in words or pictures as they had in Experiment 3. Finally, participants completed a demographic questionnaire and were thanked and debriefed.
Results and Discussion
We excluded five nonnative speakers from the analysis. Choice of “picture” was coded as 1 and choice of “word” was coded as 2, so higher scores indicate more verbal thinking. For each participant, we computed the average choice score for “category” and “exemplar” and submitted these two scores to a paired samples t-test. As predicted, participants were more likely to think in terms of words than pictures when thinking of categories of items (M = 1.48, SD = .28) than when thinking of examples of items (M = 1.36, SD = .23), t(94) = 4.52, p < .001, d = .46. This is consistent with our hypothesis that pictures are associated with low-level construal, and words are associated with high-level construal.
General Discussion
We have shown that pictorial versus verbal representation leads to greater low-level construal versus high-level construal and that low-level construal versus high-level construal leads to a greater tendency to think in terms of pictures versus words. Although words are more abstract than pictures in general, it is important to note that pictures and words vary in their abstractness or concreteness. Some pictures are extremely concrete, depicting many minute details, while others are relatively more abstract. In the same way, words can vary in their level of abstractness. Thus, it is possible that more abstract versus concrete pictures will elicit greater high-level construal versus low-level construal and more abstract versus concrete words will elicit greater high-level construal versus low-level construal. This remains to be tested in future research. Nonetheless, we would like to point out that it is usually the case that even the most abstract picture is more concrete than a word to represent the same object. Indeed, we demonstrated in Experiment 1b that even abstract line drawings lead to a greater number of categories than words to represent the same objects (see also Amit et al., 2009a, 2013).
The present work has several potential implications for “downstream” construal-dependent judgments. To the extent that pictures and words evoke different levels of construal, they may also prompt systematic differences in prediction, judgment, decision making, and behavior. For example, research suggests that while high-level construal promotes attention to causes, low-level construal promotes attention to consequences (Rim, Hansen, & Trope, 2013). We might thus anticipate parallel findings with words and pictures, with important consequences for persuasion and policy: Words would heighten concern for causes, whereas pictures would heighten concern for consequences. Research also suggests that the weighting of desirability (ends) and feasibility (means) shifts as a function of high-level construal and low-level construal, respectively (e.g., Liberman & Trope, 1998). This suggests that pictures too, relative to words, might lead to a greater tendency to focus on feasibility considerations (e.g., What time does the lecture take place?) rather than desirability considerations (e.g., How interesting is the lecture?) when making a decision (e.g., about whether or not to attend a guest lecture).
The association between medium and level of construal may also speak to the effectiveness of specific versus general appeals for help. The identifiable victim effect describes the tendency for people to be more willing to donate money to a single, identifiable victim rather than to a disparate group of victims who cannot be individually identified (e.g., Kogut & Ritov, 2005; Sherman, Beike, & Ryalls, 1999; Small & Loewenstein, 2003). However, research also indicates that psychological distance moderates this effect: Distance increased the persuasiveness of general versus more specific appeals in driving donation intentions (Fujita, Eyal, Chaiken, & Trope, 2008). The medium by which critical information is presented may likewise affect whether people prefer to donate to specific versus general causes. For example, a picture of a family next to their home destroyed by Hurricane Sandy should lead to increased willingness to donate to that specific family or to Hurricane Sandy victims, in particular. On the other hand, the slogan, “Help victims of Hurricane Sandy,” despite the specificity of the request, should increase relative concern for the broader problem of natural disasters and willingness to donate to disaster relief, more generally.
The present research may also have implications for understanding individual differences in judgment, decision, and behavior. Past research shows that there are individual differences in the propensity toward verbal versus visual cognition (see Ernest, 1977; Marks, 1977, for reviews). Verbalizers are better at processing words while visualizers are better at processing pictures (Richardson, 1977). Based on our findings, verbalizers would be more likely to engage in high-level (vs. low-level) construal than would visualizers. This, in turn, may affect a host of construal-dependent judgments in the context of self-control, attitudes, and preferences.
Finally, it may be worth considering common assumptions about the differences between pictures and words. The idiom, “a picture is worth a thousand words,” suggests that pictures may more effectively convey information than words. The present research suggests that whether this is true may depend on what message is being communicated. Although pictures may help immerse individuals in the idiosyncrasies of direct experience, words may help them transcend the immediate here and now to appreciate universals. By understanding the connection between medium and levels of construal, we may be able to appreciate the psychological functions of pictures and words.
Acknowledgment
We thank Jessica Düsing and the Harvard Decision Science Lab for their help in this research.
Authors’ Note
The first two authors contributed equally to this work.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
