Abstract
An important characteristic of knowledge is that it exists at multiple levels of abstraction. This article illustrates how different levels of abstraction influence perception, comprehension, categorization, memory, and thought. Theories exist for how abstraction influences each of these cognitive processes, but there are few unifying principles for discussing these theories within a common conceptual framework. My proposed taxonomy examines three senses of abstraction: (a) an abstract entity is a concept that has no material referent, (b) abstraction focuses on only some attributes of multicomponent stimuli, and (c) an abstract idea applies to many particular instances of a category. The first refers to instances, the second to attributes of instances, and the third to classes of instances. Concrete mental representations consist of modal images for instances, equivalent attributes, and exemplars or episodes for categories. Abstract mental representations consist of amodal propositions for instances, distinctive attributes, and rules, prototypes, or schemas for categories. I first apply the taxonomy to words, pictures, and problems. The next section shows how categorization strategies combine with abstraction at the attribute, instance, and category levels. The subsequent section applies the taxonomy to hierarchical (subordinate, basic, superordinate) levels. A concluding section proposes directions for further development.
A vivid illustration of the power of abstraction is revealed by an engineer describing his weekly meeting with his supervisor:
Each question was precisely the best one based on the information he had uncovered so far. His logic was faultless—he never asked a question that was irrelevant or erroneous. His questions came in rapid-fire order, revealing a mind that was lightning-fast and error-free. In about an hour he led each of us to understand what we had done, what we had encountered, and where to search for the problem’s cause. It was like looking into a very accurate mirror with all unnecessary images eliminated, only the important details left. (Dyson, 2012, p. 127)
The year was 1947, and the project was to build a computer. The engineer had graduated at the top of his class at MIT with degrees in electrical engineering and mathematics. His supervisor was John von Neumann.
Abstraction, however, is a double-edged sword. Its consequences are both good and bad. It is easy to find the virtues of abstract thinking in psychology research as illustrated by the following examples. Seventh graders who were doing well in math classes were able to classify word problems based on common solution procedures and were not misled by the specific story content (Silver, 1981). Experts in solving physics problems classified problems based on physics principles such as conservation of energy while novices classified problems based on physical objects such as pulleys (Chi, Glaser, & Rees, 1982). People can determine the consequences of seeking permission or fulfilling an obligation in an unfamiliar situation by using pragmatic reasoning schemas that capture the generic aspects of these situations (Cheng, Holyoak, Nisbett, & Oliver, 1986).
However, there are also negative consequences of abstraction. Paivio, Smythe, and Yuille (1968) discovered that people were better at recalling responses in a paired-associates paradigm for concrete words (such as juggler, dress, letter, hotel) that were easy to image than for abstract words (such as effort, duty, quality, necessity) that were difficult to image. Bransford and Johnson (1973) found that people had difficulty recalling ideas from an abstract text. One of their passages began with the sentences,
“The procedure is actually quite simple. First you arrange things into different groups. Of course, one pile may be sufficient depending on how much there is to do.” (p. 400)
The inclusion of generic terms makes this passage difficult to simulate as mental images—we do not know the content of the procedure, the groups, or the pile.
The finding that abstraction can have both positive and negative consequences indicates that the type of abstraction differs in the two cases. The negative consequences of learning abstract words (Paivio et al., 1968) or of recalling abstract text (Bransford & Johnson, 1973) suggest that the abstract characteristic of certain words constrains our ability to form images of the words, which makes it more difficult to remember them. In contrast, the abstract nature of schematic knowledge in reasoning (Cheng et al., 1986) or in solving problems (Chi et al., 1982; Silver, 1981) suggests that classifying problems into categories based on shared principles and solutions is a learned skill that makes people better problem solvers.
These two perspectives are not the only ones that psychologists have taken on the concept of abstraction. A third perspective emphasizes shared properties, as in a definition that formed the basis for an extensive methodological review of research on abstraction: “we define abstraction as a process of identifying a set of invariant central characteristics of a thing” (Burgoon, Henderson, & Markman, 2013, p. 502). A fourth perspective is Rosch’s formulation of hierarchical organization. She referred to increasing levels of abstraction when moving from the subordinate (living-room chair) to the basic (chair) to the superordinate (furniture) level (Rosch, Mervis, Gray, Johnsen, & Boyes-Braem, 1976). The challenge is to integrate these four perspectives into a unified theoretical framework.
Three of the four perspectives are included in the APA Dictionary of Psychology (VandenBos, 2006). Its three definitions of abstraction are:
the formation of general ideas or concepts, such as “fish” or “hypocrisy,” from particular instances.
such a concept, especially a wholly intangible one, such as “goodness” or “beauty.”
in conditioning, discrimination based on a single property of multicomponent stimuli. (p. 4)
All three definitions contribute to my proposed taxonomy. The first refers to the formation of general ideas or concepts (such as “fish”) from particular instances. The second refers to intangible ideas (such as “goodness”) that lack physical referents. The last refers to discrimination based on a single property. However, classifying instances on the basis of shared properties typically includes more than a single property (Burgoon et al., 2013).
The three characteristics are complementary. The opposite of “abstract” is “concrete” for the first characteristic that an abstract entity has no material referent. This aspect is illustrated by Paivio’s (1969, 1971) research on abstract versus concrete words. Paivio and his collaborators found that recall in a paired-associates task was best when both words of the pair were high-imagery words and worst when both words of the pair were low-imagery words (Paivio et al., 1968). Paivio (1969) argued that the concrete-abstract dimension is the most important determinant of the ease of forming an image. High-imagery words (juggler, dress) are concrete words that represent material objects, whereas low-imagery words (effort, duty) are abstract words that do not represent material objects. Pictures are the most concrete stimuli, and I include previously seen pictures as material referents. People should therefore be able to form images of imaginary concepts (unicorn, leprechaun) and depicted models (atom) based on recalled pictures.
The opposite of “abstract” is “equivalent” for the second characteristic that an abstract entity includes only some attributes of multicomponent stimuli. Abstraction does not treat all attributes as equivalent but emphasizes those attributes that will be useful in performing a task. An example comes from Eleanor Gibson’s (1969) theory of perceptual learning. She proposed that to make difficult discriminations, young children must learn to identify the distinctive feature that enables them to discriminate between two nearly identical patterns. Examples of distinctive features are the diagonal line in the pair RP, the vertical line in the pair YV, and the horizontal line in the pair GC. Focusing on the essential attributes also occurs for more complex situations as stated in the opening quote: “It was like looking into a very accurate mirror with all unnecessary images eliminated, only the important details left” (Dyson, 2012).
The opposite of “abstract” is “particular” for the third characteristic that an abstract entity applies to many particular instances of a category. An example of an abstract entity is a category prototype, which is an average of the category members. A study in which participants classified dot patterns into categories supported the interpretation that the classifiers had created a prototype to represent each category but were also influenced by the particular instances within the category (Posner & Keele, 1968). The degree of variability of the instances from the prototype influenced learning the category members.
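The idea of a prototype as an average can be made concrete with a minimal sketch. It assumes that each dot pattern is a list of (x, y) coordinates and that corresponding dots are aligned across patterns; the three patterns and the function names are hypothetical illustrations, not the stimuli or the analysis used by Posner and Keele (1968).

```python
from statistics import mean

def prototype(patterns):
    """Average each dot's (x, y) position across the category members."""
    n_dots = len(patterns[0])
    return [
        (mean(p[i][0] for p in patterns), mean(p[i][1] for p in patterns))
        for i in range(n_dots)
    ]

def distance(a, b):
    """Mean Euclidean distance between corresponding dots of two patterns."""
    return mean(
        ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
        for (ax, ay), (bx, by) in zip(a, b)
    )

# Hypothetical category: three distortions of the same three-dot layout.
members = [
    [(1, 1), (4, 2), (2, 5)],
    [(2, 1), (5, 2), (3, 5)],
    [(3, 1), (6, 2), (4, 5)],
]

proto = prototype(members)

# Variability of the instances around the prototype (assumed index: mean distance).
variability = mean(distance(m, proto) for m in members)
```

In this toy category the second pattern coincides with the average, whereas the other two lie one unit from it. The mean of these distances is one simple index of how much the instances vary around the prototype.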
These three characteristics of abstraction refer to different grain sizes. The first applies to instances, the second to the attributes of instances, and the third to categories of instances. The three characteristics form the basis for the proposed taxonomy in the next section.
A Proposed Taxonomy
My proposed taxonomy is an analysis of how different taxonomic levels of abstraction influence perception, categorization, memory, comprehension, and thought. Theories exist for how abstraction influences each of these cognitive processes, but there is a lack of unifying principles for embedding these theories within a common conceptual framework.
The goal of the taxonomy is to identify the mental representations and processes that support abstraction at the attribute, instance, and category levels. Table 1 lists concrete and abstract representations for each of these levels. The following definitions of the terms in Table 1 are from the APA Dictionary of Psychology (VandenBos, 2006).
Table 1. A Taxonomy of Abstraction
A concrete representation of an instance is a modal representation (Barsalou, 1999) that stores sensory images. An image is a likeness to an earlier sensory experience recalled without external stimulation. An abstract representation of an instance is an amodal (Barsalou, 1999) linguistic representation that includes names. Amodal representations also include propositions to represent meaning. A proposition is anything that can be asserted or denied, such as “my car runs on gas,” and is capable of being either true or false. Propositions can therefore be evaluated for their truth value and images for their similarity.
A concrete representation of attributes is equivalent in the sense that the attributes have approximately equal influence on task performance. In contrast, abstraction occurs through emphasizing those distinctive attributes (features, properties) that are helpful for carrying out the task. The terms feature, property, and attribute are all widely used: feature by Chun, Golomb, and Turk-Browne (2011), property by Collins and Quillian (1969), and attribute by Rosch and Mervis (1975). I consider these three terms to be equivalent. They all refer to the components of multicomponent stimuli.
Concrete representations of a category consist of exemplars and episodes. Exemplars are equivalent to instances; they are members of categories such as different kinds of birds. An episode is a noteworthy isolated event or series of events. Prototypes, rules, and schemas are abstract representations of categories. A prototype is an average of category members that is used for making judgments about category membership. A prototype theory is typically contrasted with an instance or exemplar theory that proposes categorization depends on specific remembered instances of the category. A rule is a guideline or standard that is used to guide responses or behavior or communicate situational norms. A schema is a collection of basic knowledge about a concept that serves as a guide to perception, interpretation, imagination, and problem solving.
The taxonomy in Table 1 provides a theoretical foundation that should support searching for answers to the following questions:
Which taxonomic level of abstraction is most important for different cognitive tasks that include words, pictures, and problems?
How does abstraction at different taxonomic levels combine? Are the levels orthogonal?
How do taxonomic levels of abstraction apply to the subordinate, basic, and superordinate categories identified by Rosch?
What are the implications of the taxonomy for future research?
The four remaining sections of the article attempt to provide answers to each of the four questions. We begin our search for answers by applying the taxonomy to words, pictures, and problems.
Application to Words, Pictures, and Problems
The first question is which taxonomic level of abstraction is most important for different tasks involving words, pictures, and problems. Table 2 shows the primary taxonomic level that distinguishes between concrete and abstract representations for these three referents and six processes.
Table 2. Application to Words, Pictures, and Problems
Note: The last column indicates whether the primary difference(s) between a concrete and abstract representation occurs at the instance (I), attribute (A), or category (C) level.
Words
Application of the taxonomy begins with words, in part because of the historical importance of Paivio’s research that distinguished between concrete and abstract words. This distinction was ignored for many years because Watson’s (1924) book Behaviorism had banished mental constructs, including images, from study by American psychologists. A landmark study published in 1968 helped to bring visual imagery back into mainstream American psychology by distinguishing between concrete and abstract words based on whether the word could be easily represented by a visual image (Paivio et al., 1968).
Subsequent work extended imagery to reading (Sadoski & Paivio, 2001), and further study of modal representations of text was encouraged by the popularity of perceptual symbol systems (Barsalou, 1999). Propositional representations, however, are also needed to model abstract words in text comprehension and account for how people integrate the various parts of a text to create a coherent understanding (Kintsch, 1998). Remembering words and comprehending text both utilize concrete and abstract coding at the instance level of analysis.
Remember words
A possible explanation for the better recall of concrete words is that people form visual images of concrete words and these visual codes are more memorable than verbal codes. However, Paivio (1986) argued that it is the number, and not the quality, of memory codes that explained his results. His dual coding theory proposed that a verbal code supports recall of both abstract and concrete words but a concrete word provides an additional memory code that supports recall when the verbal code is forgotten.
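The claim that the number of codes matters can be illustrated with a small calculation. The sketch assumes that the verbal and image codes are forgotten independently and uses hypothetical retention probabilities of .5; both the independence assumption and the numbers are illustrative and are not taken from Paivio’s data.

```python
def recall_probability(p_verbal, p_image=None):
    """Chance that at least one stored code survives (codes assumed independent)."""
    p_all_forgotten = 1 - p_verbal
    if p_image is not None:
        # A concrete word adds a second code that must also be lost for recall to fail.
        p_all_forgotten *= 1 - p_image
    return 1 - p_all_forgotten

abstract_word = recall_probability(0.5)        # verbal code only
concrete_word = recall_probability(0.5, 0.5)   # verbal code plus image code
```

Under these assumptions the concrete word is recalled more often (.75 versus .50) even though its image code is no more durable than the verbal code, which is the sense in which the number of codes, rather than their quality, carries the advantage.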
Images form the foundation of many mnemonic techniques to improve memory. One example that uses both verbal and visual codes is the mnemonic keyword method for learning a foreign vocabulary. The keyword is a concrete word that sounds like the foreign word and can be used to form an interactive visual image with the English word. For example, students taught this method received the keyword “garage” to connect the Russian word “gora” to the English translation “mountain.” These students recalled the English translation for 72% of the Russian words on a delayed test compared with 46% recall for a control group who selected their own learning strategies (Atkinson & Raugh, 1975).
The distinction between concrete and abstract words is also supported by neuroimaging studies using positron emission tomography (PET) and functional magnetic resonance imaging (fMRI). A meta-analysis of 19 studies found that abstract concepts elicited greater activity in the inferior frontal gyrus and middle temporal gyrus compared with concrete concepts (Wang, Conder, Blitzer, & Shinkareva, 2010). Concrete concepts elicited greater activity in the posterior cingulate, precuneus, fusiform gyrus, and parahippocampal gyrus compared with abstract concepts. The authors interpreted these findings as suggesting a greater engagement of the perceptual system for processing concrete concepts and greater engagement of the verbal system for processing abstract concepts.
There are still many unanswered questions, however, that are the focus of ongoing research. As stated by McRae and Jones (2013):
In summary, understanding the organization and content of abstract concepts is a major challenge for all current theories of semantic memory. Addressing the relevant issues will require a deeper appreciation of the similarities and differences among types of abstract concepts; how abstract concepts depend differentially on sensory, motor and internally generated cognitive and emotional information; and the manner to which they are tied to situations and contexts in which they are important. (p. 217)
Some of these issues are being addressed within the larger context of text comprehension.
Comprehend text
Words generally occur in sentences rather than in lists. I previously mentioned the Bransford and Johnson (1973) study to illustrate a negative consequence of abstraction when attempting to comprehend text. There were 18 ideas in the abstract description of washing clothes, but participants could only recall an average of 2.8 of them. Providing a hint at the end that the passage was about washing clothes did not increase recall. Bransford and Johnson proposed that the material was so difficult to comprehend that there were few ideas in memory to recall; therefore, a retrieval hint was wasted. However, providing the hint at the beginning of a passage did increase recall because readers could create concrete referents for the generic terms. For instance, the phrase “arrange things into different groups” likely refers to the color of the clothes.
The effectiveness of a hint at the beginning of a passage is consistent with Schwanenflugel and Shoben’s (1983) results that abstract material is easier to comprehend when there is a suitable context. Schwanenflugel and Shoben found that, when there was no context, abstract sentences (“The standard procedure was full of mistakes”) took longer to read than concrete sentences (“The imposing cathedral was full of tourists”). However, there was no difference when the abstract and concrete sentences were at the end of a paragraph that provided a context. They also found that people could recognize isolated concrete words faster than isolated abstract words but this difference disappeared when the words appeared within a sentence that provided a context.
Schwanenflugel and Shoben’s context theory argued against a visual imagery account of text comprehension, but there is also evidence that context can influence the construction of images. Zwaan (2004) proposed that language “serves as a set of cues to the comprehender to construct an experiential (perception plus action) simulation of the described situation” (p. 36). He illustrates his claim with the contrast between the two sentences, “The ranger saw the eagle in the sky” and “The ranger saw the eagle in the nest.” The first sentence should evoke an image of an eagle with outstretched wings, and the second sentence should evoke an image of an eagle with drawn-in wings. Stanfield and Zwaan (2001) had found supporting evidence for statements such as “She pounded the nail into the floor” and “She pounded the nail into the wall.” When participants had to verify whether a picture of an object was mentioned in the sentence, their verification times were faster for a picture of a vertical nail if given the first sentence and a horizontal nail if given the second sentence.
An embodied approach for teaching young children to read used commercially available toys consisting of a farm scene (animals, tractor) or a garage scene (gas pumps, tow truck). Children in a manipulation condition either manipulated the toys or imagined manipulating the toys after reading each sentence of a short story. Both actual and imagined manipulations greatly increased comprehension and memory of the text compared with a control group who read the text twice without manipulation (Glenberg, Gutierrez, Levin, Japuntich, & Kaschak, 2004). Extensive research on instructional manipulatives continues to explore their effectiveness across a variety of tasks (Pouw, van Gog, & Paas, 2014).
Zwaan’s (2004) immersion theory reflects his effort to create an embodied theory of language comprehension to replace a more abstract representation. There is now extensive evidence that simulations are highly flexible and, like sensory-motor interactions, can focus on specific modalities, change according to context, and take perspective into account (Pecher, 2013). However, there is also evidence “that the current state of the literature favors a moderate embodied approach, suggesting that the role of linguistic factors should not, at the very least, be underplayed” (Horchak, Giger, Cabral, & Pochwatko, 2014, p. 78).
An alternative to the concrete conceptualization of language based on perception-action simulations is an abstract conceptualization based on propositions. The best example is the Kintsch and van Dijk (1978) model of text comprehension that continues to be developed by Kintsch and his collaborators (Kintsch, 1998; Kintsch & Mangalath, 2011). According to the theory, propositions are composed of concepts that include a predicate or relational concept and one or more arguments. For instance, the sentence “The Swazi tribe was at war with a neighboring tribe because of a dispute over cattle” included as propositions “the Swazi tribe”, “was at war with”, and “a neighboring tribe.” The continued development of this model has led to many impressive accomplishments, including the organization of propositions into semantic networks (coherence graphs), predictions of readability, incorporation of prior knowledge, and a construction-integration model that partitions comprehension into construction and integration stages (Kintsch, 1998). However, Kintsch and Mangalath (2011) admitted that the model works only at the symbolic level and there is a need to integrate these symbolic representations with perceptual and action-based representations.
One study cited by Kintsch and Mangalath (2011) as a step in this direction is the computational model of concepts developed by Goldstone, Feng, and Rogosky (2005). Their neural network model supports simulations in which semantic-network accounts of meaning are supported by externally grounded features such as hoof, mane, and tail for the concept of a horse. Other features of horse, such as domesticated and strong, are more abstract. In this respect, abstract and concrete attributes of meaning are complementary and mutually reinforcing (Wiemer-Hastings & Xu, 2005).
Summary
Much of the research on words and text has focused on distinctions made at the instance level of analysis, as recorded in the last column of Table 2. Paivio’s (1986) dual-coding theory proposed that concrete words are better remembered than abstract words because a visual image provides an additional memory code. The distinction between concrete and abstract representations of text is also made at the instance level. Concrete phrases support visual simulations of action (Glenberg & Kaschak, 2002; Pecher, 2013; Stanfield & Zwaan, 2001) that supplement static images of individual words. Zwaan’s (2004) immersion theory argues that an embodied theory of language is required to replace more abstract representations. However, embodied representations appear to be constructed only under restricted conditions. These include simple stimuli, tasks that are directly aligned with actions, existing visual/spatial grounding, and time to construct the simulations (Graesser & Forsyth, 2013).
Although embodied theories work well for simulating individual action phrases, they cannot account for the more global integration of text included in the propositional, construction-integration model proposed by Kintsch and his colleagues (Kintsch, 1998; Kintsch & Mangalath, 2011). Representation of phrases as propositions provides the building blocks for integrating both concrete and abstract words into larger semantic structures that enable the integration of ideas within and across paragraphs (Kintsch & van Dijk, 1978). For instance, the representation of text about a lecturer at Cal State, Los Angeles includes the propositions (teach, speaker, student) and (Location: at Cal State College, Los Angeles). Propositions provide a structure that is language independent; French words could be substituted for English if the passage were in French. The ability of the construction-integration model to account for a large body of psychological data currently makes it the most comprehensive model of comprehension (Graesser & Forsyth, 2013).
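The predicate-plus-arguments structure of a proposition, and its independence from any particular language, can be sketched as a simple data structure. This is only an illustration under stated assumptions: the class name, the helper function, and the small French lexicon are hypothetical and do not reproduce the notation or implementation of the construction-integration model.

```python
from typing import Dict, NamedTuple, Tuple

class Proposition(NamedTuple):
    """A predicate (relational concept) together with one or more arguments."""
    predicate: str
    arguments: Tuple[str, ...]

# The two propositions mentioned in the text for the lecturer passage.
teach = Proposition("teach", ("speaker", "student"))
location = Proposition("location", ("Cal State College, Los Angeles",))

def substitute_words(prop: Proposition, lexicon: Dict[str, str]) -> Proposition:
    """Swap surface words while leaving the predicate-argument structure intact."""
    return Proposition(
        lexicon.get(prop.predicate, prop.predicate),
        tuple(lexicon.get(arg, arg) for arg in prop.arguments),
    )

# Hypothetical lexicon, for illustration only.
french = {"teach": "enseigner", "speaker": "conférencier", "student": "étudiant"}
teach_fr = substitute_words(teach, french)
```

The substituted proposition has the same predicate slot and the same number of arguments as the original; only the words filling those slots differ, which is the sense in which the structure is language independent.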
Pictures
Evaluation of the memory implications of Paivio’s (1986) dual-coding theory focused on the distinction between concrete and abstract words. However, Paivio also proposed that pictures are more concrete than concrete words because people could use the picture as an image rather than create their own image. Subsequent work by cognitive scientists examined the explanatory value of encoding pictures as visual images. The primary debate occurred between Kosslyn (1981) and Pylyshyn (1973, 1981) regarding whether images consisted of a visual representation or were encoded into a propositional representation. The debate between iconic and propositional encoding of pictures occurs at the instance level of analysis (Table 2).
A distinction, however, between pictures and images at the attribute level is their amount of detail. Images are typically less detailed than pictures because of the loss of attributes (Chambers & Reisberg, 1992). Diagrams are also schematic versions of pictures that strip away unnecessary detail (Table 2). Diagrams can either be physically constructed on paper (Hegarty & Kozhevnikov, 1999) or mentally constructed as images (Johnson-Laird, 2012). This section examines how abstraction at the instance and attribute levels influences performance on a variety of tasks involving pictures, images, and diagrams.
Create images
Abstraction of pictures at the instance level distinguishes between whether pictures are stored as concrete images as proposed by Kosslyn (Kosslyn, Thompson, & Ganis, 2002) or as abstract propositions as proposed by Pylyshyn. As argued by Pylyshyn (1973):
For the present, it suffices to point out that any representation having the properties mentioned above is much closer to being a description of the scene than a picture of it. A description is propositional, it contains a finite amount of information, it may contain abstract as well as concrete aspects and, especially relevant to the present discussion, it contains terms (symbols for objects, attributes, and relations) that are the results of—not inputs to—perceptual processes. (p. 11)
Such descriptions can be expressed in computer-based languages such as “(Sides Color Red)” and “((Top Bottom) (Color Blue))” to describe a cube with red sides and a blue top and bottom. Pylyshyn (2002) subsequently proposed that “the primary form in which representations are expressed consist of discrete sentence-like symbolic expressions” (p. 194) that he had labeled propositions. This use of the term has much in common with usage by Kintsch and van Dijk (1978) in their model of text comprehension. Pylyshyn (1981) further argued that, although images are experienced by people, the phenomenal experience of imaging is unimportant as a theoretical construct for cognition. Images can be expressed as propositions, and propositions are sufficient for constructing a theory of cognition.
The main adversary to this view has been Kosslyn. The decades-long debate between Pylyshyn (2002) and Kosslyn (Kosslyn et al., 2002) has not centered on the descriptive character of images. Rather, it centers on the argument that people process images as pictures. Images according to Kosslyn (1994) rely in part on a qualitatively distinct type of (depictive) representation in which “distances among portions of the representation correspond to distances among the corresponding portions of the object” (p. 198). In his book, Image and Brain: The Resolution of the Imagery Debate, Kosslyn (1994) explained how a number of different cognitive operations, many of which are shared with visual perception, influence performance on visual imagery tasks. Examples are a visual buffer that maintains an image in short-term memory, an attention window that focuses on part of the visual buffer, and operations such as visual scanning. Support for these operations has been provided by advances in cognitive neuroscience (Kosslyn et al., 2002).
One consequence of storing images as descriptions is that it is difficult to find patterns in an image that are inconsistent with the description. Reed (1974) modified the standard embedded figures task by sequentially showing two patterns and asking participants to judge whether the second pattern was a part of the first pattern. Figure 1 shows examples of two patterns that occurred as the first pattern. The bold lines (which were not bold in the experiment) depict different second patterns that were parts of the first pattern. Differences in accuracy across parts supported the hypothesis that parts that matched a structural description would be easy to confirm and parts that did not match a description of the pattern would be difficult to confirm. Most of the participants confirmed that a diamond (1A) was a part of pattern 1 and a triangle (2A) was a part of pattern 2. In contrast, relatively few confirmed that a parallelogram was a part of pattern 1 (1D) or a part of pattern 2 (2C).

Figure 1. The bold lines illustrate different embedded figures contained within the figure. Based on Reed (1974).
This limitation is likely caused by difficulty in performing the cognitive operations described by Kosslyn (1994), such as maintaining an image in the visual buffer and scanning the image for parts that were not described in the initial encoding of the visual pattern. These operations might include reconnecting the horizontal and diagonal lines to form a new description or mentally moving the part across the image to determine whether it fits within a bounded set of lines. Limitations in processing images result from both maintaining the image in the visual buffer of short-term memory (STM) and performing analog operations on it. Both storage and cognitive operations compete for the limited capacity of STM (Barrouillet, Portrat, & Camos, 2011).
Ambiguous figures provide another demonstration of how the initial descriptions of visual patterns constrain subsequent performance. For instance, some people interpret the ambiguous duck/rabbit figure as a duck and others as a rabbit (Chambers & Reisberg, 1985). None of the 15 participants in their experiment could use their visual image to reinterpret the figure, but all 15 could reinterpret a drawing of the figure. Chambers and Reisberg (1992) subsequently hypothesized that people precisely encoded and maintained only the more important attributes such as the face of the animal. Those who perceived the figure as a duck should therefore have a detailed description of only the left facial side of the duck, and those who perceived the pattern as a rabbit should have a detailed description of only the right facial side of the rabbit. The results from a recognition-memory test confirmed this hypothesis. People who saw a rabbit encoded the slight indentation that depicted the rabbit’s mouth, but people who saw a duck did not encode this same slight indentation on the back of the duck’s head. Discouraging verbal descriptions of the duck/rabbit figure helped people reinterpret the figure from an image (Brandimonte & Gerbino, 1993), presumably because they now maintained details that were difficult to verbally describe.
Although both finding an embedded part and reinterpreting an ambiguous figure demonstrate limitations of visual images, the limitations occur at different levels of abstraction. Reed’s (1974) interpretation is that people have a complete structural description of the pattern that would enable them to accurately recognize the pattern and the parts included in the description. Limitations in finding a part occur at the instance level because of limitations of performing analog operations on the image while maintaining it in the visual buffer (Kosslyn, 1981). In contrast, limitations in redescribing a pattern occur at the attribute level in the Chambers and Reisberg (1992) studies because of limitations of encoding and maintaining detailed features of the image.
Reason spatially
Although the previous studies revealed how schematic descriptions of pictures constrained the reinterpretation of images, other studies revealed that lack of detail can be an advantage in solving problems (Reed, 2010). The loss of detail occurs through abstraction at the attribute level. I will use the term diagram for these more schematic visual displays and images.
Diagrammatic representations are effective when details are irrelevant to solving a problem, as illustrated in the following example (Hegarty & Kozhevnikov, 1999):
At each of the two ends of a straight path, a man planted a tree, and then every 5 meters along the path he planted another tree. The length of the path is 15 meters. How many trees were planted?
Students who created schematic representations focused on the spatial relations between objects, such as the distances between the trees. Students who created pictorial representations focused on the objects themselves. Use of schematic representations had a positive effect on solving problems, and use of pictorial representations had a negative effect. Hegarty and Kozhevnikov (1999) proposed that instructing students to “visualize” mathematical problems would not be successful unless it is clear that the visual representations should not contain irrelevant pictorial details.
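A schematic representation reduces the tree problem to counting evenly spaced positions along a line, with both endpoints included. The following is a minimal sketch of that count. The function name `trees_planted` and the restriction to spacings that divide the path length evenly are assumptions made for illustration and are not part of the materials used by Hegarty and Kozhevnikov (1999).

```python
def trees_planted(path_length_m: int, spacing_m: int) -> int:
    """Count trees at both ends of a path and at every `spacing_m` meters between them."""
    if spacing_m <= 0 or path_length_m < 0:
        raise ValueError("spacing must be positive and path length non-negative")
    if path_length_m % spacing_m != 0:
        # Assumption: this sketch handles only paths that the spacing divides evenly.
        raise ValueError("spacing must divide the path length evenly")
    # Tree positions are 0, spacing, 2 * spacing, ..., path_length.
    positions = list(range(0, path_length_m + 1, spacing_m))
    return len(positions)


print(trees_planted(15, 5))  # positions 0, 5, 10, 15
```

The count equals the number of 5-meter intervals plus one. This relation between distances and endpoints is what a schematic drawing makes explicit and what a picture of the trees themselves does not.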
Other studies have investigated the mental, rather than the physical, construction of diagrams. Johnson-Laird’s (2012) extensive research has demonstrated that much spatial and logical reasoning is based on mental models that differ from linguistic structures and semantic networks. According to Schnotz (2002), mental models represent objects based on a structural or functional analogy.
In his book Space to Reason: A Spatial Theory of Human Thought, Knauff (2013) provided extensive evidence to support his claim that logical reasoning is based on “spatial representations that are more abstract than visual images and more concrete than propositional representations” (p. 16). He referred to these representations as spatial layout models, but they satisfy the WordNet definition of a diagram. Support for spatial layout models comes from findings that detailed visual images can impede reasoning and that concurrent spatial tasks, but not concurrent visual tasks, can interfere with reasoning.
Particularly impressive is the computational model discussed in Chapter 7 of his book. PRISM (preferred inferences in reasoning with spatial mental models) predicts deductive inferences from indeterminate problems such as the following (p. 156):
The Ferrari is parked to the left of the Porsche
The Beetle is parked to the right of the Porsche
The Porsche is parked to the left of the Hummer
The Hummer is parked to the left of the Dodge
The problem is indeterminate because there are three possible spatial layout models for this set of premises: FPBHD, FPHBD, and FPHDB, in which the letters represent the initial letter of each car. PRISM proposes that the first of these is the preferred model because it is the easiest to construct. It also predicts the ordering of the nonpreferred models based on the ease of transforming the preferred model.
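The claim that exactly three layouts satisfy the four premises can be checked by brute force. The sketch below enumerates every left-to-right ordering of the five cars and keeps those consistent with the premises. It is not an implementation of PRISM: it does not model how a preferred layout is constructed or how alternatives are derived from it. The names `PREMISES` and `consistent_layouts` are hypothetical.

```python
from itertools import permutations

CARS = "FPBHD"  # Ferrari, Porsche, Beetle, Hummer, Dodge

# Each pair (a, b) encodes the premise "a is parked to the left of b".
PREMISES = [("F", "P"), ("P", "B"), ("P", "H"), ("H", "D")]


def consistent_layouts(cars, premises):
    """Return every left-to-right ordering of `cars` that satisfies all premises."""
    layouts = []
    for order in permutations(cars):
        position = {car: index for index, car in enumerate(order)}
        if all(position[a] < position[b] for a, b in premises):
            layouts.append("".join(order))
    return layouts


print(consistent_layouts(CARS, PREMISES))
```

The enumeration treats all consistent layouts as equally available, whereas PRISM's contribution is to predict which of them people construct first.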
In addition to these static images, dynamic images can support reasoning through mental simulations (Schwartz & Black, 1996). An example problem is “Five gears are arranged in a horizontal line; if you try to turn the gear on the far left clockwise, what will the gear on the far right do?” Graduate students at Columbia University worked on a series of problems in which the number of gears in the chain varied. The experimenters used response times, error data, hand motions, and verbal reports of strategies to develop the model shown in Figure 2. Students initially simulated the task by using a detailed depiction of the gears. The fading phase created a more schematic image by eliminating unnecessary pictorial details. The codifying stage used language to describe the rule that gears alternate between moving clockwise and moving counterclockwise. The final quantitative reasoning stage related the alternate rotation of the gears to a more general rule: The last gear in the chain will rotate in the same direction as the first gear if the number of gears is odd and will rotate in the opposite direction if the number of gears is even. This study is particularly informative because it illustrates the elimination of unnecessary detail to form more schematic images that can then be examined to formulate a verbal rule.

Figure 2. A model of the transition from simulations to rules. From Schwartz and Black (1996).
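The shift from simulation to rule can be illustrated by contrasting two ways of answering the gear problem. The first function below steps through the chain one gear at a time, loosely analogous to a mental simulation. The second applies the parity rule directly. Both function names are hypothetical, and the code is only an illustrative sketch, not the model developed by Schwartz and Black (1996).

```python
OPPOSITE = {"clockwise": "counterclockwise", "counterclockwise": "clockwise"}


def simulate_last_gear(n_gears, first_direction="clockwise"):
    """Step through the chain gear by gear, flipping direction at each contact."""
    if n_gears < 1:
        raise ValueError("need at least one gear")
    direction = first_direction
    for _ in range(n_gears - 1):
        direction = OPPOSITE[direction]
    return direction


def parity_rule_last_gear(n_gears, first_direction="clockwise"):
    """Apply the abstract rule: odd chains preserve direction, even chains reverse it."""
    if n_gears < 1:
        raise ValueError("need at least one gear")
    return first_direction if n_gears % 2 == 1 else OPPOSITE[first_direction]


# The five-gear problem from the text, answered both ways.
print(simulate_last_gear(5), parity_rule_last_gear(5))
```

The simulation requires work proportional to the length of the chain, whereas the rule requires only the parity of the number of gears, which is one way to express what is gained by abstraction in this task.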
Summary
One of the longest debates in cognitive science has concerned the propositional versus sensory characteristics of images. In the propositional perspective (Pylyshyn, 2002), it is argued that it is the structure of images that matters and propositions describe this structure. In the sensory perspective (Kosslyn et al., 2002), it is argued that imagery shares many characteristics with perception. A difference between perception and imagery is that cognitive operations on percepts are often more successful than cognitive operations on their images. People can more readily find an embedded figure in a percept than in an image (Reed & Johnsen, 1975) and can more readily reinterpret a percept than an image of an ambiguous figure (Chambers & Reisberg, 1985). Failure to find new information in imagery is consistent with a propositional perspective that encoded structure constrains inferences. Success in finding new information in imagery is consistent with a sensory perspective that perceptual operations can be performed on images.
Details are not needed for many spatial reasoning tasks, so a diagram that preserves only spatial relations has advantages over more detailed visual images. Diagrams can be helpful for solving both arithmetic word problems (Hegarty & Kozhevnikov, 1999) and deductive reasoning problems (Knauff, 2013; Knauff & Johnson-Laird, 2002). For some problems, the elimination of details occurs with practice as learners gradually discover which details are relevant (Schwartz & Black, 1996).
Problems
The discussion of pictures, images, and diagrams in the previous section emphasized the instance and attribute levels of analysis. In contrast, the discussion of problems in this section focuses on the category level of abstraction. Solving a problem may be aided either by retrieving a specific episode or case from memory or by recognizing the problem as an example of a more general schema (Table 2). Schemas support solving a variety of structurally related problems, including those that require conditional and analogical reasoning. Research on conditional reasoning shows that association to a familiar schema supports difficult deductions. Research on analogical reasoning shows how instruction promotes schema abstraction.
Reason conditionally
Reasoning emphasizes drawing inferences (conclusions) from some initial information (premises) and has a foundation in logic. The four-card selection problem has been one of the most widely studied tasks in the reasoning literature and illustrates how abstraction at the category level influences conditional reasoning.
The problem requires deciding which of four cards should be turned over to evaluate a conditional rule such as if there is a D on one side of the card then there is a 3 on the other side. The four cards in this example either displayed the letter D, the letter K, the number 3, or the number 7. The experimenter informed participants that each of the cards contains a letter on one side and a number on the other side. The answer is that it is necessary to turn over the D card and the 7 card but only 5 of 128 participants turned over only the two correct cards.
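The logic behind the correct answer can be made explicit: a card needs to be turned over only if some possible hidden face would make that card violate the rule. The sketch below applies this test to the four cards. It assumes, for illustration only, that the hidden faces are drawn from the same two letters and two numbers shown in the example. The function names are hypothetical.

```python
LETTERS = ("D", "K")
NUMBERS = ("3", "7")


def violates_rule(letter, number):
    """The rule 'if D then 3' is violated only by a D paired with a number other than 3."""
    return letter == "D" and number != "3"


def must_turn_over(visible):
    """A card needs checking only if some hidden face could make it violate the rule."""
    if visible in LETTERS:
        return any(violates_rule(visible, hidden) for hidden in NUMBERS)
    return any(violates_rule(hidden, visible) for hidden in LETTERS)


cards = ["D", "K", "3", "7"]
print([card for card in cards if must_turn_over(card)])
```

The 3 card is not selected because the rule does not require that a 3 be paired with a D, so nothing on its hidden side could falsify the rule.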
Wason and Shapiro (1975) hypothesized that performance would dramatically improve if the conditional rules had realistic, rather than arbitrary, content, a prediction that was confirmed in a letter-sorting task (Johnson-Laird, Legrenzi, & Legrenzi, 1972). The task consisted of four envelopes. Two were face up, revealing either a 50-lira or a 40-lira stamp. Two were face down, revealing either a sealed or an unsealed envelope. Participants were told to imagine that they worked in a post office and had to enforce the rule, “If a letter is sealed then it has a 50-lira stamp on it.” Most participants accurately selected the two envelopes required to enforce the rule.
Although Wason and Shapiro (1975) argued that conditional reasoning is vastly improved with realistic content, Griggs and Cox (1982) questioned whether the letter task required conditional reasoning. Their memory-retrieval explanation proposed that the British participants did the task by recalling their experience in placing more postage on sealed envelopes. Griggs and Cox therefore predicted that their American students, who lacked such experience, would do poorly on the task. As predicted, students at the University of Florida did poorly on the unfamiliar letter task but excelled in evaluating a familiar drinking-age rule: “If a person is drinking beer then the person must be over 19 years of age” (the age limit in Florida at that time). These findings are discouraging because they support the conclusion that people are very limited in evaluating conditional rules unless the rules contain familiar content, in which case episodic retrieval (Rubin, 2007; Tulving, 1972) replaces reasoning.
A more optimistic view of reasoning is that people do well at conditional reasoning if the content is familiar at a general, schematic level. For instance, pragmatic reasoning schemas are organized knowledge structures that enable people to evaluate practical situations such as seeking permission or fulfilling an obligation. Research supports the hypothesis that people do much better in evaluating conditional statements when they involve seeking permission or fulfilling an obligation, even if the content is unfamiliar (Cheng et al., 1986).
A related research topic that has attracted much recent interest is causal reasoning. Goodman, Ullman, and Tenenbaum (2011) evaluated a hierarchical Bayesian model that can learn an abstract theory of causality through induction as it learns about specific causal systems. They used the phrase “the blessing of abstraction” to refer to their finding that a correct theory of causality can be learned relatively quickly and can sometimes precede the learning of specific causal theories. More collaborative projects between developmental and computational cognitive scientists should establish how “the blessing of abstraction” applies to children (Gopnik & Wellman, 2012).
Reason analogically
Analogical reasoning also can occur at multiple levels of abstraction that range from using a single instance as an analogy to using a more general schema based on shared relations among instances (Gentner & Smith, 2013). Instructional techniques to create more generic schema depend on whether the instances are visual or verbal. Belenky and Schalk (2014) included both visual and verbal instruction in their review of the effects of concrete and abstract materials on learning and transfer. They suggested that concrete materials can help capture students’ interest and enable them to link real-world knowledge with learning. Abstract materials are best when the goal is transfer but may be more difficult to initially learn.
One instructional technique labeled concreteness fading begins with concrete visual stimuli and gradually makes them more abstract (Fyfe, McNeil, Son, & Goldstone, 2014). Concreteness fading is based on the principle that both concrete and abstract materials have advantages and disadvantages; therefore, instruction attempts to take advantage of their combined strengths. The advantages of concrete materials are that they can activate real-world knowledge during learning, induce physical or imagined action, enable learners to create their own knowledge of abstract concepts, and activate brain regions associated with perceptual processing. The advantages of abstract materials are that they can focus attention on more useful functional features rather than superficial features and increase generalization across multiple contexts.
Goldstone and Son (2005) developed a computer simulation task to take advantage of the relative advantages of concrete and abstract representations. The concrete simulation depicted ants foraging for food without competing with each other for food resources. The abstract version of the task replaced the ants with small black dots and replaced the food with green patches. The results demonstrated that students were most successful in learning generic principles that would transfer to other tasks when they began learning with the concrete simulations and then switched to the more abstract simulations.
Another instructional technique that proceeds from the concrete to the abstract involves comparing worked examples. Worked examples provide effective solutions for particular problems (Renkl, 2014), but their limitation is that, like concrete simulations, they explain the particular, thereby limiting generalization. A verbal-based method for enhancing transfer is to create more general solutions by comparing analogous examples. For instance, comparing two different examples of a contingent contract in a class at Northwestern University helped management consultants learn to recognize principles of a contingent contract and recall other examples of contingent contracts from their own experiences (Gentner, Loewenstein, Thompson, & Forbus, 2009).
A caveat is that comparing analogous examples does not always result in schema abstraction. Designing effective instruction therefore requires a better understanding of what should be compared (Rittle-Johnson, Star, & Durkin, 2009). A meta-analysis of learning through case comparisons evaluated 57 experiments with 336 tests (Alfieri, Nokes-Malach, & Schunn, 2013). Four of the 15 evaluated variables were most predictive of learning. Greater learning occurred when learners judged only similarities rather than similarities and differences, compared perceptual rather than procedural content, were tested immediately rather than on a later day, and received feedback on principles after making comparisons.
An ongoing debate concerns whether abstract schema can be taught directly rather than by beginning with concrete examples. A series of articles by Kaminski, Sloutsky, and Heckler, particularly one published in Science (Kaminski, Sloutsky, & Heckler, 2008), has generated both praise and criticism (De Bock, Deprez, Van Dooren, Roelens, & Verschaffel, 2011) for the claim that students may benefit more from learning mathematics from a single, abstract symbol system than from exposure to multiple concrete examples. The debate concerning the effectiveness of teaching with abstract material is illustrated by the letters and reply in the December 12, 2008, issue of Science.
Summary
In this section, I discussed research on conditional and analogical reasoning at the category level of abstraction (Table 2). Schema abstraction occurs at the category level and contrasts with the use of an individual case or analogy. Research on conditional reasoning using the four-card selection problem found that people turned over correct cards if the conditional rule described familiar, rather than arbitrary, relations (Wason & Shapiro, 1975). The memory retrieval interpretation of these findings argued that correct answers were based on retrieval of related episodes so conditional reasoning was not required. An intermediate position is that conditional reasoning is possible for unfamiliar events when those events can be interpreted within the context of a familiar schema (Cheng et al., 1986).
Instruction to generalize analogical reasoning encourages students to think at higher (more schematic) levels of abstraction to increase transfer across similar problems (Belenky & Schalk, 2014). Concreteness fading is a visual-based method for achieving this goal by making visual simulations more generic (Fyfe et al., 2014). An example is replacing a simulation of ants foraging for food with a simulation of black dots approaching green patches to illustrate general principles (Goldstone & Son, 2005). Comparing worked examples is a verbal-based method that attempts to enhance transfer by creating a more generic schema for the analogous problems (Alfieri, Nokes-Malach, & Schunn, 2013). The effectiveness of teaching abstract principles directly (Kaminski et al., 2008), rather than through generalization of concrete material, requires further evaluation (De Bock et al., 2011). The goal of the next section is to show how the three levels of abstraction combine by analyzing how different categorization strategies differ on all three levels.
Application to Categories
The second question asks, how does abstraction at the different taxonomic levels combine? I provide examples from research on categorization strategies. There now appears to be a general consensus that people use a variety of categorization strategies. As stated by Hampton (1997):
There is a temptation for theorists to wish to apply their own approach to all conceptual representations. It is however most unlikely that all concepts are defined or represented in the same way. What is needed for the advance of the field is for a principled account to be given of the range of representational powers that people possess, and for a matching up of different kinds of representation with different conceptual domains. (pp. 105–106)
Applying the taxonomy requires understanding how representations combine at the instance, attribute, and category levels. Table 3 shows two cases for each of three strategies based on rules, prototypes, and exemplars. The strategies combine with the two types of instances (modal, amodal) and the two types of attributes (equivalent, distinctive) to produce the variations. A caveat is that coding of instances (modal vs. amodal) is typically not discussed in research on categorization strategies and therefore requires speculation based on the type of stimuli used in the task. In contrast, the role of attributes is usually stated specifically in the models.
Table 3. Application to Category Learning
Rules
Rules typically involve an explicit reasoning strategy that can be stated verbally. Several conditions are required to base a decision on a verbally formulated rule (Ashby & Maddox, 2005). First, a semantic label must correspond to each of the attributes involved in the decision. Second, the decision maker must selectively attend to each relevant attribute. Third, the rule for combining information from the relevant attributes must also be verbalized. We should therefore expect that rule-based strategies involve amodal (linguistic) coding of instances based on distinctive (selected) attributes.
An example based on many studies is concept identification (Bruner, Goodnow, & Austin, 1956) in which participants have to learn the relevant attributes and the rule that distinguishes between two categories. Typical stimuli are geometric forms that differ in attributes such as color (red, black, white) and shape (square, triangle, circle). Typical rules are conjunctive (red and triangle), disjunctive (red or triangle), conditional (if red then triangle), and biconditional (if red then triangle, if triangle then red). Successful learning occurs when learners either verbally state the rule or make a series of correct classifications based on the rule.
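The four rule types can be written as truth functions over the two relevant attributes. The sketch below uses the red and triangle example from the text. Treating the conditional rule as material implication (every stimulus that is not red counts as a category member) is an assumption of this sketch, and the names `RULES` and `is_category_member` are hypothetical.

```python
# Each rule maps (is_red, is_triangle) to category membership.
RULES = {
    "conjunctive": lambda red, triangle: red and triangle,
    "disjunctive": lambda red, triangle: red or triangle,
    # Assumption: the conditional is treated as material implication.
    "conditional": lambda red, triangle: (not red) or triangle,
    "biconditional": lambda red, triangle: red == triangle,
}


def is_category_member(rule_name, color, shape):
    """Classify a stimulus under the named rule using only the relevant attributes."""
    rule = RULES[rule_name]
    return rule(color == "red", shape == "triangle")


stimuli = [("red", "triangle"), ("red", "square"), ("black", "triangle"), ("white", "circle")]
for color, shape in stimuli:
    memberships = {name: is_category_member(name, color, shape) for name in RULES}
    print(color, shape, memberships)
```

Under these definitions, the rules differ only in how they treat stimuli that have exactly one or neither of the two relevant attributes, which is why learners need such instances to distinguish among them.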
Two variations of the concept identification task illustrate equivalent attributes in one case and distinctive attributes in the other case. Equivalent attributes occur in a variation referred to as rule learning, in which learners are told the relevant attributes (such as red, triangle) but have to discover the correct rule (Bourne, 1970). The two attributes are equivalent because both are necessary components of the rule and the experimenter selected the relevant attributes. Distinctive attributes occur in a variation referred to as attribute learning, in which learners are told the rule (such as a disjunctive rule) but have to discover the relevant attributes (Haygood & Bourne, 1965). The attributes are not equivalent in this variation because the learner has to discover the two that result in correct classifications.
Rules can also be based on spatial, rather than logical, relations as illustrated in a developmental study of 4- to 8-year-old children (Kotovsky & Gentner, 1996). In one variation a standard configuration consisted of a large circle flanked by two small circles, one on each side. A relational match involving the same dimensions (shape) consisted of a large square flanked by two small squares. An opposite polarity match consisted of a small square flanked by two large squares. Other relational matches occurred across dimensions (color) such as a black square flanked by two white squares.
Learning more complex spatial relations among natural objects also requires summarizing visual relations in a rule. Learners had to classify rock arrangements into one of three categories (Kurtz, Boukrina, & Gentner, 2013). One category included two vertically stacked rocks of the same color and shape. Another category included one rock supported by two others. A third category consisted of decreasing height from the left to the right of the arrangement. A comparison procedure that presented pairs of exemplars from either the same or different categories was helpful in promoting learning and transfer. All these examples of relational matches are classified as rule abstraction in my taxonomy.
Prototypes
A prototype model, in which people create an average pattern to represent the category, is another example of an abstraction at the category level of analysis. The prototype is an instance, but one that is created rather than experienced (Posner & Keele, 1968; Reed, 1972). As an average of category exemplars, the prototype would have the same (modal or amodal) representation as the category members.
Kuhl (1993) found that infants as young as 6 months formed prototypes to represent the basic speech sounds (phonemes) in their language. Evidence for prototype formation came from research demonstrating that infants could more easily discriminate between two nonprototypical sounds than between a prototypical and a nonprototypical sound. Kuhl (1991) used the metaphor of a “perceptual magnet” to describe the effect. The prototypic long-e sound draws similar long-e sounds closer to it, making these variations sound more like the prototype. A phonemic prototype is a modal representation with equivalent (and unspecified) attributes.
This perceptual magnet effect has several interesting implications. First, infants become worse at discriminating sounds within a phonemic category as they grow older. Forming prototypes reduces discrimination because variations of the prototype begin to sound more like the prototype. This should improve recognition by making variations of a phoneme sound more alike. Second, infants are worse at discriminating among familiar phonemes in their own language than among unfamiliar phonemes from a different language. For example, 6-month-old Swedish infants were better than U.S. infants at discriminating between a prototypic long-e sound and other long-e sounds (Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992). The explanation is that the U.S. infants had formed a prototypic long-e sound and were therefore victims of the magnet effect, whereas the Swedish infants had not formed a prototypic long-e sound because this sound did not occur in their language.
Figure 3 shows another categorization task that likely encourages modal memory representations (visual in this case). The five faces in the upper row belong in one category and the five faces below belong in another category. The three faces at the bottom of the figure show a typical test pattern on the left, followed by the prototype for each category. The best-predicting of four general models for this task proposed that people compare each test pattern to a category prototype and select the category with the most similar prototype (Reed, 1972). Predictions of both the prototype and exemplar models improved when the features were differentially weighted by using a class-separating transformation that clusters patterns within a category and separates those that belong to different categories (Sebestyen, 1962). The normative weights for this example are .46 for forehead (eye height), .24 for eye separation, .24 for nose length, and .06 for mouth height. The improved predictions indicate that classifiers emphasize those attributes that are more informative when they judge the similarity between two patterns. The weighted features prototype model represents abstraction at the category (prototype) and attribute (distinctive features) levels but not at the instance (visual image) level.

Figure 3. Category 1 faces (row 1), Category 2 faces (row 2), and a test pattern, Category 1 prototype, Category 2 prototype (row 3). Based on Reed (1972).
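A weighted-features prototype model of this kind can be sketched in a few lines. The sketch averages each feature across a category's exemplars to form a prototype and assigns a test face to the category with the nearest prototype. The four weights are the normative weights reported above. Everything else is an assumption made for illustration: the feature values are hypothetical rather than the stimuli from Reed (1972), only two exemplars per category are used, and the weighted Euclidean distance is one plausible choice of metric.

```python
NORMATIVE_WEIGHTS = {"eye_height": 0.46, "eye_separation": 0.24,
                     "nose_length": 0.24, "mouth_height": 0.06}
EQUAL_WEIGHTS = {feature: 0.25 for feature in NORMATIVE_WEIGHTS}

# Hypothetical feature values in arbitrary units (not the original stimuli).
CATEGORIES = {
    "category_1": [
        {"eye_height": 10, "eye_separation": 5, "nose_length": 6, "mouth_height": 4},
        {"eye_height": 12, "eye_separation": 5, "nose_length": 6, "mouth_height": 4},
    ],
    "category_2": [
        {"eye_height": 16, "eye_separation": 5, "nose_length": 6, "mouth_height": 8},
        {"eye_height": 18, "eye_separation": 5, "nose_length": 6, "mouth_height": 8},
    ],
}
TEST_FACE = {"eye_height": 13, "eye_separation": 5, "nose_length": 6, "mouth_height": 8}


def prototype(exemplars):
    """Average each feature across a category's exemplars."""
    return {f: sum(e[f] for e in exemplars) / len(exemplars) for f in exemplars[0]}


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance (the metric is an assumption of this sketch)."""
    return sum(w * (a[f] - b[f]) ** 2 for f, w in weights.items()) ** 0.5


def classify(face, categories, weights):
    """Assign the face to the category whose prototype is nearest."""
    prototypes = {name: prototype(exs) for name, exs in categories.items()}
    return min(prototypes, key=lambda name: weighted_distance(face, prototypes[name], weights))


print(classify(TEST_FACE, CATEGORIES, NORMATIVE_WEIGHTS))
print(classify(TEST_FACE, CATEGORIES, EQUAL_WEIGHTS))
```

In this hypothetical data set, the test face resembles one category in eye height and the other in mouth height, so the classification depends on how heavily each feature is weighted. The prototype keeps every feature, and abstraction at the attribute level enters only through the weights.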
My hypothesis that the prototypes and exemplars are stored as visual images is based on the difficulty of producing verbal descriptions of the spatial relations between facial features. Producing verbal descriptions of complex visual stimuli can impair subsequent recognition—a finding labeled verbal overshadowing (Schooler & Engstler-Schooler, 1990). For example, people who were asked to verbally describe pictures of faces were subsequently less able to recognize those faces than people who did not produce verbal descriptions.
Exemplars
In contrast to the abstract representations of categories by rules and prototypes, exemplar theories propose that new patterns are categorized by comparing their similarity to category members. Medin and Ross (1989) argued that (a) reasoning often relies on specific examples rather than on more abstract knowledge, (b) abstraction often appears to occur from using and comparing examples, and (c) induction is conservative in the sense that it preserves information about examples.
Early support for an exemplar theory came from Medin and Schaffer’s (1978) context theory of categorization. Their theory proposed that the item to be categorized acts as a retrieval cue that provides access to memory representations of one or more known examples. The greater the similarity between the unknown item and an example, the greater the probability that the example will be retrieved.
Nosofsky (1986) later formulated a generalized context model that generalizes the exemplar theory of categorization proposed by Medin and Schaffer (1978). His model assumes that both classification and identification are based on the similarity of patterns to stored exemplars but similarity is represented by a multidimensional scaling solution of the patterns. As in the weighted attribute models investigated by Reed (1972), feature attributes are differentially weighted to represent the influence of selective attention. A difference between the two formulations is that Nosofsky’s model uses estimated weights that can be compared to normative weights such as those calculated by Reed (1972).
Nosofsky (1986) evaluated the generalized context model on two observers who categorized stimuli composed of semicircles that varied in four levels of size and four levels of the angle of a radial line. The observers learned a sequence of different category structures consisting of four exemplars in each of two categories. Parameter estimates revealed some support for the hypothesis that the classifiers distributed their attention across the size and angle attributes so as to optimize classifications. The variation of these attributes across a relatively small range (52° to 59° for angle) would suggest that visual memory was influential for this task.
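The core idea of an exemplar model with attention weights can be sketched as follows: similarity to each stored exemplar decreases with attention-weighted distance, and the probability of a category response is that category's share of the summed similarity. This is a simplified illustration rather than the model as fitted by Nosofsky (1986). The city-block distance, the exponential similarity function, the omission of response bias, the sensitivity value, and the stimulus coordinates (size level, angle level) are all assumptions made for this sketch.

```python
import math

# Hypothetical exemplars coded as (size_level, angle_level), each level from 1 to 4.
EXEMPLARS = {
    "A": [(1, 1), (1, 3), (2, 2), (2, 4)],
    "B": [(3, 1), (3, 3), (4, 2), (4, 4)],
}


def similarity(x, y, attention, sensitivity):
    """Exponentially decaying similarity over attention-weighted city-block distance."""
    distance = sum(w * abs(a - b) for w, a, b in zip(attention, x, y))
    return math.exp(-sensitivity * distance)


def category_probabilities(item, exemplars, attention, sensitivity=1.0):
    """Share of summed similarity that each category receives for the item."""
    summed = {
        category: sum(similarity(item, ex, attention, sensitivity) for ex in members)
        for category, members in exemplars.items()
    }
    total = sum(summed.values())
    return {category: value / total for category, value in summed.items()}


item = (2, 3)
print(category_probabilities(item, EXEMPLARS, attention=(0.5, 0.5)))
print(category_probabilities(item, EXEMPLARS, attention=(1.0, 0.0)))
```

In this hypothetical structure the two categories differ only in size, so shifting attention entirely to size raises the probability of the correct response. This illustrates what it means for attention to be distributed so as to optimize classification.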
A different task also had distinctive attributes but was more amenable to a linguistic representation (Medin, Altom, Edelson, & Freko, 1982). Participants studied exemplars to learn to classify cases of burlosis. The cases varied on four binary attributes that indicated the presence or absence of swollen eyelids, ear splotches, discolored gums, and nosebleed. Two symptoms were perfectly correlated, such as discolored gums and nosebleeds, so a patient had either both or neither. When participants had to decide which member of pairs of patients had burlosis, they selected the patient with correlated features even when that patient had fewer typical symptoms. These findings are consistent with the context model (Medin & Schaffer, 1978) in which similar correlational information increases the probability of retrieving a category exemplar from memory.
SUSTAIN, a clustering model, has greater flexibility by combining exemplar and prototype models (Love, Medin, & Gureckis, 2004). SUSTAIN functions like prototype models when categories are very regular. When categories are very irregular—there is no obvious pattern linking members—the model functions like exemplar models. SUSTAIN is similar to both exemplar and prototype models in emphasizing critical features. In learning to classify car types it weights shape more than color because shape is a better predictor.
Summary
Rules, prototypes, and exemplars have all played prominent roles in theories of categorization. This section examines differences among categorization strategies at the category, instance, and attribute levels (Table 3). Rules typically require selection of attributes, but attributes become equivalent when selection is not required (Haygood & Bourne, 1965). Rules are often stated verbally (Ashby & Maddox, 2005), but distinctive features can be discovered through perceptual discrimination learning (Egeland, 1975). Prototypes consist of the average values of the category exemplars (Posner & Keele, 1968; Reed, 1972) and help 6-month-old infants recognize the phonemes of their language (Kuhl, 1991). Prototypes do not eliminate attributes because they have the same number of attributes as the exemplars. However, weighted-feature models differentially weight the attributes based on their ability to discriminate between categories (Nosofsky, 1991; Reed, 1972).
Exemplars are concrete representations of categories in which classification decisions are based on category members (Medin & Ross, 1989). Two contrasting exemplar tasks require the categorization of visual stimuli that vary along continuous dimensions (Nosofsky, 1986) and the categorization of verbal stimuli that vary on binary dimensions (Medin et al., 1982). Both tasks showed evidence for a differential emphasis on attributes but in different ways. The visual task revealed a differential emphasis on two independent features (size, angle), whereas the verbal task revealed a differential emphasis on correlated features.
Application to Hierarchies
The third question asks how the taxonomic levels apply to the subordinate, basic, and superordinate categories identified by Rosch. The theoretical and empirical work of Rosch et al. (1976) provides insights into how different levels within a hierarchy influence abstraction. Of particular importance for cognitive processing is the intermediate (basic) level consisting of categories such as piano, apple, hammer, shirt, lamp, and car. Basic categories can be partitioned into more specific, subordinate categories such as grand piano, delicious apple, claw hammer, dress shirt, floor lamp, and four-door sedan. Basic categories are also members of more general, superordinate categories such as musical instruments, fruit, tools, clothing, furniture, and vehicles.
This section examines how my proposed taxonomic levels at the attribute, instance, and category levels relate to subordinate, basic, and superordinate hierarchical levels. It would be reasonable to include Rosch’s hierarchy as a fourth level in my taxonomy based on grain size. Attributes are components of instances, instances belong to categories, and categories form hierarchies. Rosch, in fact, refers to subordinate, basic, and superordinate categories as occupying different levels of abstraction (Rosch et al., 1976). However, I did not include hierarchical levels in my proposed taxonomy for two reasons.
The first concerns definitions. Definitions are a prerequisite for formulating a taxonomy, and the APA Dictionary of Psychology does not mention hierarchies in its definitions of abstraction. The second reason concerns parsimony. Adding a fourth level composed of subordinate, basic, and superordinate categories to Table 3 would increase the number of potential cells from 12 to 36. Taxonomies lose their utility if they become too cumbersome to apply. This section, however, does apply the taxonomy to some key experiments performed by Rosch to examine how her three hierarchical levels influence abstraction at the attribute, instance, and category levels.
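The parsimony argument rests on simple counting, which can be checked with a short sketch. The variable names are hypothetical; the level labels are those used in the taxonomy and in Rosch's hierarchy.

```python
from itertools import product

# The three taxonomic levels and their values.
category = ["rule", "prototype", "exemplar"]
instance = ["modal", "amodal"]
attribute = ["equivalent", "distinct"]

# The candidate fourth level based on Rosch's hierarchy.
hierarchy = ["subordinate", "basic", "superordinate"]

cells = list(product(category, instance, attribute))
cells_with_hierarchy = list(product(category, instance, attribute, hierarchy))

print(len(cells), len(cells_with_hierarchy))
```

The first product yields 3 × 2 × 2 = 12 cells, and crossing it with the three hierarchical levels triples that number to 36.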
Perceptual recognition
Previous research had established that people can sometimes form an average pattern or prototype to represent a category, but a prototype strategy might not apply to all levels of Rosch’s hierarchy. Rosch et al. (1976) found that people could not identify the correct superordinate category (animal, building, clothing, furniture, human body part, plant, tool, vehicle) from an average shape of two superordinate members. For instance, the average shape of a table and a chair would not be recognizable as furniture. The basic level (two tables) was the most inclusive level at which an average shape could be recognized. Average shapes at the subordinate level (two kitchen tables) were not more recognizable than average shapes at the basic level.
The consequences were revealed in an object detection experiment in which observers had to indicate whether a briefly exposed object was on the left or right side of a card (Rosch et al., 1976). The experiment tested the hypothesis that the basic level is the highest level in the hierarchy that can be represented by a code concrete enough to be called an image. Each exposure was preceded either by no hint, a superordinate hint (musical instrument), a basic hint (piano), or a subordinate hint (grand piano). There was no difference in accuracy between the no-hint and superordinate-hint conditions, demonstrating that superordinate hints are ineffective. A hint that the object was a musical instrument, a fruit, a tool, clothing, furniture, or a vehicle was too general to enable observers to form an image of a generic object. One could, of course, form an image of a particular tool such as a saw, but this should be counterproductive if one guessed the wrong tool. A hint at the basic level, such as a hammer, did increase accuracy and was as effective as a hint at the subordinate level.
Rosch’s emphasis on the primacy of the basic level led to her hypothesis that objects would be recognized most quickly at the basic level. She tested this hypothesis in a verification task that required participants to verify a picture of an object at one of the three hierarchical levels. The group who verified objects at the basic level had the fastest response times, and the group who verified objects at the subordinate level had the slowest response times. Rosch speculated that all objects were initially identified at the basic level. She suggested that superordinate verification then required an inference that the basic level object (apple) belonged to the superordinate category (fruit). Subordinate verification (delicious apple) required processing more perceptual attributes to discriminate among the members of the basic level category. Rosch’s research on shared attributes at different levels in the hierarchy supported explanations of people’s decisions both across (Rosch et al., 1976) and within (Rosch & Mervis, 1975) the levels, as we will see in the next section.
Attributes
Difficulty in forming an image at the superordinate level is related to another characteristic of abstraction—abstraction eliminates attributes. Rosch evaluated this characteristic by asking people to list the attributes of instances for each hierarchical level (Rosch et al., 1976). The data confirmed the hypothesis that instances of superordinate categories have few attributes in common. For example, there were three shared attributes (make things, fix things, metal) for tool at the superordinate level, seven additional attributes (handle, teeth, blade, sharp, cuts, edge, wooden handle) for saw at the basic level, and one additional attribute (used in construction) for cross-cutting handsaw at the subordinate level. The average across six hierarchies was approximately two shared attributes at the superordinate level, eight shared attributes at the basic level, and nine shared attributes at the subordinate level.
The smaller number of attributes at the superordinate level has implications for the loss of category knowledge from semantic dementia (Rogers & Patterson, 2007). Ability to match words with pictures rapidly declined with increasing semantic dementia at the basic (dogs, birds, cars, and boats) and subordinate (Labrador, robin, BMW, and ferry) levels. However, ability did not decline for superordinate categories (animal, vehicle). The investigators proposed that the more specific attributes within a semantic network become distorted with increasing semantic dementia while general attributes remain intact. A general attribute such as eat would therefore be less affected than a more specific attribute such as beak. Rogers and Patterson applied a previously developed neural network model (McClelland & Rogers, 2003) to successfully implement their hypothesis.
Shared attributes were also instrumental in Rosch and Mervis’s (1975) theory of why members of superordinate categories differ in typicality. One theory of typicality is a prototype theory in which more typical members are more similar to the category prototype (Love, 2013). However, prototypes do not exist for superordinate categories (Rosch et al., 1976); therefore, an alternative theory is required. Rosch and Mervis (1975) proposed a theory based on family resemblance in which exemplars that have more attributes in common with other category exemplars are more typical. Participants in their study rank ordered 20 provided exemplars for each of six superordinate categories: furniture, vehicle, fruit, weapon, vegetable, and clothing. The correlations between predicted and obtained typicality ranged from .84 for vegetable to .94 for weapon. The three most typical members for weapon were gun, knife, and sword. The three least typical members were words, foot, and screwdriver. The implication is that the attributes of gun, knife, and sword are shared with other weapons whereas the attributes of words, foot, and screwdriver are not shared with other weapons.
It should be noted that family resemblance would likely not be a good predictor of typicality within basic categories. For the superordinate category tool, people listed an average of 8.7 shared attributes at the basic level (hammer, saw, screwdriver) and 9.2 shared attributes at the subordinate level (claw hammer, hack handsaw, Phillips screwdriver). The small gain in shared attributes at the subordinate level implies that subordinate members are very similar to each other and would therefore all have high family resemblance scores. Uniformly low family resemblance scores for goal-derived categories also fail to predict typicality (Barsalou, 1991). The items useful for a camping trip are so varied that they share few attributes with each other.
Summary
One contribution of Rosch’s research was identifying those hierarchical levels that support prototype formation. Prototypes can be formed at the subordinate and basic levels but not at the superordinate level. There are also fewer shared attributes at the superordinate level (Rosch et al., 1976). Shared attributes declined from approximately nine at the subordinate level (claw hammer, floor lamp) to approximately two at the superordinate level (tools, furniture). In addition, shared attributes influenced the judged typicality of instances within superordinate categories. Typicality was a function of family resemblance—the number of shared attributes with other category members (Rosch & Mervis, 1975). Neither instances nor attributes are abstracted in determining family resemblance. Instances are treated as exemplars, and attributes are equivalent. A family resemblance score for an instance is simply the sum of all shared attributes across all other category instances. The power of Rosch’s empirical and theoretical contributions is that they demonstrated how hierarchical levels influence shared attributes, exemplars, and prototypes.
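The family resemblance score described above can be sketched as a sum of shared attributes across all other category members. The attribute lists below are invented for illustration and are not data from Rosch and Mervis (1975); the function name is likewise hypothetical.

```python
def family_resemblance(attributes_by_member):
    """Score each member by the attributes it shares with every other member."""
    scores = {}
    for member, attrs in attributes_by_member.items():
        scores[member] = sum(
            len(attrs & other_attrs)
            for other, other_attrs in attributes_by_member.items()
            if other != member
        )
    return scores

# Invented attribute lists for a toy category (not empirical data).
toy_category = {
    "gun": {"dangerous", "handheld", "metal"},
    "knife": {"dangerous", "handheld", "metal", "sharp"},
    "sword": {"dangerous", "handheld", "metal", "sharp"},
    "foot": {"body part"},
}

scores = family_resemblance(toy_category)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Every attribute counts equally and every member is treated as an exemplar, so neither attributes nor instances are abstracted. In this toy category the member that shares nothing with the others receives a score of zero and falls to the bottom of the ranking.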
Implications for Future Research
The fourth question asks, what are the implications of the taxonomy for future research? My hope is that the taxonomy will serve as a helpful theoretical framework for further development and possible modification. Further development can take the form of new empirical studies or new theoretical organization of existing studies.
One topic for future research is to determine whether there is a theoretical advantage for representing abstraction as a continuum. As stated by Burgoon et al. (2013), “given that abstraction operates on a continuum, we use the term levels of abstraction throughout the article to reflect this point” (p. 503). The taxonomy in the present article instead uses categorical information. As one example, the high- and low-imagery words investigated by Paivio were the words with very low and very high ratings on an imagery scale. Is there any theoretical advantage to also including words with intermediate values to establish a continuum rather than a dichotomy?
A second topic is to investigate constraints among the three taxonomic levels. Table 3 presents 6 of the 12 combinations of abstraction at the 3 category (rule, prototype, exemplar), 2 instance (modal, amodal), and 2 attribute (equivalent, distinct) levels. The greatest challenge to claiming that the three levels are orthogonal is to find modal representations of rule learning because rules are typically verbal (Ashby & Maddox, 2005). Even rules learned in visual category learning are often based on language:
In principle, an experimenter could teach a subject a collection of visual categories in a wide variety of ways. If the categories can be described by verbal rules, even imperfect verbal rules, explicit instruction in those rules is one approach. (Richler & Palmeri, 2014, p. 80)
The authors review exemplar and prototype theories as alternatives to rule learning, and both of these can be based on modal representations, as illustrated in Table 3.
A third topic for further investigation is the influence of context in abstraction. One example is role-governed categories, which are defined by the role that an object plays in a situation rather than by category membership alone (Goldwater, Markman, & Stilwell, 2011). Role-governed categories, such as guest and host, are typically components of a general schema such as visit. Relations are important in defining role-governed categories, but the relations are defined within a larger knowledge structure in which roles interact. Ideals are also prevalent in role-governed categories, as in “We would like our guests to be clean, courteous, fun, and unobtrusive.”
A fourth topic is to link the taxonomy to cognitive architectures to investigate how concrete and abstract representations at different grain sizes can be integrated within the same information-processing system. Newell (1990) argued that such systems are necessary to build unified theories of cognition. However, his own development of a cognitive architecture—Soar—has been limited until recently by its emphasis on amodal propositional information. The addition of capabilities to process visual–spatial information in Soar/SVS (Lathrop, Wintermute, & Laird, 2011) now enables it to combine both propositional and visual information. I have proposed a modification of Soar/SVS to apply it to a variety of psychological paradigms including the progression from pictures to diagrams to rules depicted in Figure 2 (Reed, 2016).
A fifth topic is to extend application of the taxonomy to other domains of psychology. I intentionally omitted its application to actions because of the thorough review of this topic by Burgoon et al. (2013). They proposed that abstraction of actions is grounded in goal hierarchies in which actions can be represented more specifically in terms of how they are performed and more abstractly in terms of why they are performed. Identification of actions at lower levels focuses on movements, whereas identification at higher levels focuses on comprehensive understanding (Vallacher & Wegner, 2012).
A taxonomy composed of terms such as modal, amodal, equivalent, distinct, exemplar, rule, prototype, and schema is itself abstract; therefore, I conclude with a short biographical summary of one person’s incredible display of abstract thinking in action.
A Postscript
Howard Gardner’s (1993) Creating Minds analyzed the construct of creativity as demonstrated through the lives of Freud, Einstein, Picasso, Stravinsky, Eliot, Graham, and Gandhi. For readers who remember my opening quote, it will come as no surprise that my choice for demonstrating the power of abstract thinking is John von Neumann. The opening quote is taken from Turing’s Cathedral: The Origins of the Digital Universe (Dyson, 2012). The book documents Turing’s contributions to computing, but the real star is von Neumann, whose story is scattered throughout the book. The following paragraphs summarize a few parts of Dyson’s documentation.
John von Neumann, born in Budapest in 1903, received an appointment in the Mathematics Department at Princeton University in 1931 followed 2 years later by a professorship at the Institute for Advanced Study. His remarkable mathematical abilities had already been demonstrated in his 1928 article “Die Axiomatisierung der Mengenlehre” (The Axiomatization of Set Theory). Axioms reduce a subject to a minimal set of initial assumptions that are sufficient to fully develop the subject without further assumptions. The axiomatization of set theory provided the foundation for the rest of mathematics. A previous attempt by Bertrand Russell and Alfred North Whitehead (Principia Mathematica published between 1910 and 1913) still left fundamental questions unanswered after 1,928 pages and three volumes. A surprising aspect of von Neumann’s approach was its conciseness; his axioms occupied approximately one page of print. Von Neumann’s interest in brevity would later serve him well in developing the first computing machines.
Von Neumann’s approach to problems enabled him to make contributions to many aspects of mathematics. As described by mathematician Paul Halmos:
It was his genius at synthesizing and analyzing things. He could take large units, rings of operators, measures, continuous geometry, direct integrals, and express the unit in terms of infinitesimal little bits. And he could take infinitesimal little bits and put together large units with arbitrarily prescribed properties. (Dyson, 2012, p. 50)
Another landmark publication, Theory of Games and Economic Behavior, was written during the war years with Oskar Morgenstern (von Neumann & Morgenstern, 1944). The premise of the book was that a reliable economy could be constructed out of unreliable parts. Military strategists were the first to adopt the principles of game theory, followed by economists. As stated later by the economist Paul Samuelson, von Neumann “darted briefly into our domain and it has never been the same since” (Dyson, 2012, p. 45).
A third major contribution, indicated by the opening quote, was constructing the architectures of the first computers. The functional elements of a computer consisting of a hierarchical memory, a control system, a central arithmetic unit, and input/output channels are still known as the von Neumann architecture. These are only some of von Neumann’s remarkable achievements, any one of which would serve as a foundation for a brilliant career.
If von Neumann’s life shows us the power of abstraction, his death at age 53 from cancer reminds us of its fragility. In an interview with Dyson, 21-year-old Marina von Neumann recalled a visit to her father’s hospital room shortly before his death:
Her father “clearly realizes that the illness had gone to his brain and that he could no longer think, and he asked me to test him on really simple arithmetic problems, like seven plus four, and I did this for a few minutes, and then I couldn’t take it anymore; I left the room,” she remembers, overcome by “the mental anguish of recognizing that that by which he defined himself had slipped away.” (Dyson, 2012, p. 272)
Footnotes
Acknowledgements
I thank the editors and reviewers for their many helpful comments. Work on this manuscript occurred while I was a visiting scholar at the Center for the Study of Language and Information, Stanford University, and at the Department of Psychology, University of California, San Diego.
Declaration of Conflicting Interests
The author declared no conflicts of interest with respect to the authorship or the publication of this article.
