Abstract
There is a growing interest in the use of visual thinking techniques for promoting conceptual thinking in problem solving tasks as well as for reducing the complexity of ideas expressed in scientific and technical formats. The products of visual thinking, such as sketchnotes, graphics and diagrams, consist of ‘multimodal complexes’ that combine language, images, mathematical symbolism and various other semiotic resources. This article adopts a social semiotic perspective, more specifically a Systemic Functional Multimodal Discourse Analysis approach, to study the underlying semiotic mechanisms through which visual thinking makes complex scientific content accessible. To illustrate the approach, the authors analyse the roles of language, images, and mathematical graphs and symbolism in four sketchnotes based on scientific literature in physics. The analysis reveals that through the process of resemiotization, where meanings are transformed from one semiotic system to another, the abstractness of specialized discourses such as physics and mathematics is reduced by multimodal strategies which include reformulating the content in terms of entities which participate in observable (i.e. tangible) processes and enhancing the reader/viewer’s engagement with the text. Moreover, the compositional arrangement creates clear stages in the development of the ideas and arguments that are presented. In this regard, visual thinking is a form of cultural communication through which abstract ideas are translated and explained using a multimodal outline or summary of essential parts by adapting resources (e.g. linguistic resources and mathematical graphs), using new resources (e.g. stick figures and other simple schematic drawings) and maintaining others from the original text (e.g. mathematical symbolic notation), resulting in a congruent (or concrete) depiction of abstract concepts and ideas for a non-specialist audience.
Introduction
The methodology of visual thinking rests on the intertwined relation between visual perception and cognition. It assembles various resources, such as written language (via handwriting, hand-drawn typography), basic visual shapes (e.g. dot, line, circle, arrow) and graphs (e.g. charts and timelines), to serve a variety of purposes (e.g. make complex ideas clear, support and generate deep thinking in problem-solving tasks and summarize main ideas) through a diverse range of techniques (e.g. graphic recording, visual storytelling, sketching, clay modelling, use of sticky notes or index cards) in a variety of genres, such as infographics, sketchnotes and storyboards.
Visual thinking practices, as defined in this article, are informed by knowledge generated in different disciplines. Some of the most relevant contributions to visual thinking are found in the works of Horn (1998) and Tufte (1983) in the field of information design, in Tversky’s (2000, 2004, 2005, 2011) research in the field of cognitive psychology on spatial language and thinking, and diagram production and comprehension, and in the field of knowledge visualization (Bertschi et al., 2013; Eppler, 2013). Theories on diagrammatic reasoning, introduced by Charles S. Peirce, have also contributed to visual thinking (Hoffman, 2003; Shin, 2002). Visual thinking is also grounded in the work of Arnheim (1954, 1969) who, influenced by the principles of Gestalt, challenged the traditional dominance of reasoning and language by developing a theory of visual perception and thinking.
The past few years have seen an upsurge of interest in visual thinking techniques in professional fields such as marketing, business and journalism, with several handbooks promoting the practice of visual thinking (e.g. Brown, 2014; Gray et al., 2010; Roam, 2008, 2009, 2011; Rohde, 2013, 2015; Sibbet, 2010, 2011, 2013). However, the potential value of visual thinking is still underexplored in academic practice and research. There has been some research which explores the effects of different formats of visual thinking on different aspects of learning (Berry and Chew, 2008; Dexter and Hughes, 2011; Nesbit and Adesope, 2006), as well as an interest in issues related to visual thinking processes within artificial intelligence and computer science, such as effective information and diagram design (Chabris and Kosslyn, 2005; Tufte, 1983), analysis of spontaneous visuals via whiteboards (Walny et al., 2011) and the capacity of machines to solve visual problems during thinking processes (Les and Les, 2008). Despite these studies, little is known so far about how meaning is made in visual thinking through the various semiotic resources it deploys since, to our knowledge, consistent and systematic identification of the semiotic principles underlying visual thinking has not taken place in the scientific literature.
The Sketchnote: A Visual Thinking Genre
Visual thinking is present in a wide array of formats or techniques, such as doodling, graphic recording, graphic facilitation and scribing. The result of these techniques can be one, several, or a mix of the following genres: mind maps, concept maps, storyboards, maps, storymaps and visual journals, amongst others. This article focuses on the analysis of the scientific sketchnote, a visual thinking form that integrates notes and sketches to explain scientific topics. In design studies, a sketch is defined as ‘a quick rough drawing or outline by hand in simple strokes’ (Kurz, 2008: 360). Rohde (2013: 2) defines sketchnotes as ‘rich visual notes created from a mix of handwriting, drawings, hand-drawn typography, shapes, and visual elements like arrows, boxes & lines’. Sketches have received increasing attention in various research areas in recent years, including cognitive science, computer graphics, human–computer interaction and machine learning, among others (Buxton, 2007; Eitz et al., 2012; Forbus et al., 2011; Li et al., 2015; Wang et al., 2016; Zhang et al., 2011), while extensive research on sketches and (visual) thinking has been conducted by Tversky and her collaborators (e.g. Suwa, Tversky et al., 2001; Tversky, 2002; Tversky and Suwa, 2009).
Sketchnotes range from technical to more creative and artistic designs. Recently, sketchnoting has been gaining popularity at scientific and technical events such as conferences and workshops, where the content of an event is captured visually, later edited and shared via social networks and blogs. 1 In this type of event, as well as in business meetings and other professional activities, other visual methods are also adopted, which result in visual notes broadly similar to sketchnotes: for example, in graphic recording, the content of an event is recorded visually and displayed in real time for an audience and, in graphic facilitation, the audience is visually led towards a goal.
Paralleling the use of genres traditionally considered simplistic or frivolous, such as comics, for communicating and teaching in science (Bahr et al., 2016; Gonick and Criddle, 2005; Gonick and Huffman, 1990) and in medicine or healthcare (Green and Myers, 2010), a style of sketchnote is emerging in the scientific domain which manipulates the complexity of original (written or oral) sources in ways that the original discourse cannot.
In this article, we attempt to decipher the meaning-making procedures through which the scientific sketchnote reduces the complexity of highly specialized scientific knowledge in various academic papers in physics. For this purpose, we analyse four sketchnotes 2 based on scientific papers in physics produced by Dr Robert Dimeo. 3 The sketchnotes are illustrated in Figure 1 (a–d).

Figure 1. (a) Sketchnote 1, (b) Sketchnote 2, (c) Sketchnote 3, and (d) Sketchnote 4. © Robert Dimeo. Reproduced with permission.
Dimeo (2015) explains that he started to use the sketchnoting technique while he was looking for a way to improve his understanding of technical seminars. A sketchnote of an article would allow him ‘to review it later and understand the main points’. The combination of technical wording and other meaning-making strategies found in scientific articles often results in intricate descriptions which are enigmatic to the uninformed reader. Due to the mathematization of physics concepts such as motion, time or distance, the language of physics adopts the discourse of mathematics, inclusive of mathematical symbolisms, and visual images such as diagrams and graphs. 4
The multimodal nature of the scientific sketchnote, where various semiotic resources are deployed for meaning making, requires a multimodal approach. Hence, using Systemic Functional Multimodal Discourse Analysis (SF-MDA), this article seeks to explore the behaviour of the various semiotic resources at stake in the four sketchnotes. The analysis intends to serve two main purposes: (1) to investigate strategies that visual thinking uses to facilitate understanding of complex topics; and (2) to establish the validity of the SF-MDA model in providing a robust theoretical approach for analysis, evaluation and monitoring of visual thinking texts.
Systemic Functional Multimodal Discourse Analysis (SF-MDA)
The rise of digital technologies has helped widen the spectrum of tools that allow for a richer integration of forms of communication. Accordingly, the focus of study in language-related fields has been extended from language as a stand-alone resource to other semiotic resources that contribute to meaning-making, resulting in Multimodal Discourse Analysis (MDA). MDA draws on the social semiotics tradition, a branch of semiotics which focuses on ‘the way people use semiotic “resources” both to produce communicative artefacts and events and to interpret them … in the context of specific social situations and practices’ (Van Leeuwen, 2005: Preface). Social semiotics builds largely upon Halliday’s theory of Systemic Functional Linguistics (SFL) (e.g. Halliday, 2009; Halliday and Matthiessen, 2014). According to Systemic Functional Theory (SFT), language is functional as it serves a variety of functions in the social reality or culture in which it is used, and it is systemic as it consists of networks of systems with sets of semantic options available to speakers. SF-MDA broadens this scope by including semiotic resources other than language. The approach has been adapted to the analysis of images (Kress and Van Leeuwen, 2001), art and architecture (O’Toole, 2011), music (Van Leeuwen, 1999) and mathematical images and symbolism (O’Halloran, 2008) in artefacts such as websites and infographics (O’Halloran et al., 2016) and films (Bateman, 2014; Bateman and Schmidt, 2012), among others. SF-MDA is informed by a number of defining principles from SFT (O’Halloran et al., 2016). The two most relevant here are: (1) the metafunctional hypothesis; and (2) resemiotization.
Metafunctions
Linguistic texts typically make not just one, but a number of meanings simultaneously. In SFT, language and other semiotic systems interact to realize three kinds of meanings (metafunctions) simultaneously: (a) ideational meaning, for making sense of and construing human experience (i.e. experiential meaning) and making logical connections (i.e. logical meaning); (b) interpersonal meaning, for enacting social relations; and (c) textual meaning, for arranging meanings in coherent text (Halliday, 1978, 2009; Halliday and Matthiessen, 2014).
In the grammar of language, experiential meaning is realized through the Transitivity system. The grammar of Transitivity accounts for the world of human experience by realizing it, bit by bit, in clauses. At the centre of a clause is a Process: some type of action or relationship, realized by a verb. Associated with the Process are the Participants involved in it (who is doing what to whom) and the Circumstances in which it takes place (for example, when, where, how, why). There are six fundamental types of processes. One example is Material Processes, which are processes of doing, typically concrete actions such as creating something. Logical meaning addresses logical relations among clauses in clause complexes (more or less equivalent to sentences) and is typically realized by conjunctions.
Interpersonal meaning is realized in systems of Speech Function, Mood, and Modality. Speech Function looks at language as an exchange of either information or action. Exchanges of information are realized by statements and questions, and exchanges involving action are realized by offers and commands. Each of these speech functions is realized by choices from the Mood system as clauses in either declarative, interrogative, or imperative Mood. Modality accounts for degrees of likelihood, usuality, obligation and inclination. It operates to fill in the space between ‘Yes’ and ‘No’.
Textual meaning is realized in how information is organized. In language, the system of Theme addresses the organization of messages based on the position of information in a clause. The starting point of the message, which connects the current message to preceding discourse and to context as well as introducing the topic of the clause, is called the Theme. The rest of the clause, which presents new information (what is added to the Theme), is known as the Rheme. For example, in the sentence ‘Babies cry for many reasons such as hunger and discomfort’, ‘Babies’ is the Theme and the rest of the sentence is the Rheme.
Resemiotization
Resemiotization is the process through which meanings are transformed from one semiotic system to another as social processes unfold (e.g. Iedema, 2003). Inspired by Jakobson’s notion of intersemioticity and the Actor-Network-Theory concept of translation (Callon, 1986; Latour, 1987; Law and Hassard, 1999), Iedema (2003: 41) introduces the term resemiotization to refer to ‘meaning-making shifts from context to context, from practice to practice, or from one stage of a practice to the next’. Viewed in terms of Activity Theory (Engeström, 2000, 2009; Engeström et al., 1999), resemiotization involves transformations ‘triggered by disturbances and concrete innovative actions’ (Engeström, 2000: 309). This, in turn, can generate expansive learning which can lead to a reconstrual of an activity, or, as is the case in this article, a reconstrual of knowledge. The concept of resemiotization is crucial in visual thinking, which involves translating and explaining ideas from a source text, interaction or event into another semiotic form using language, images and other resources. In this regard, resemiotization is the basis for understanding the procedures involved in visual thinking, as illustrated in the analysis below.
Analysis
Given the multimodal nature of visual thinking, the four sketchnotes are analysed from an SF-MDA perspective, by adapting and applying the principles of metafunctions and resemiotization. The analysis examines how meaning (ideational, interpersonal and textual) is made through each system (language, image, mathematical graphs and mathematical symbolism). For instance, using this approach, the organization and position of the elements in language, image, mathematical image or mathematical symbolism can be explained with reference to textual meaning as the textual metafunction deals with the organization and positioning of different elements, such as images, by means of the use of different resources, such as framing, colour and perspective.
We use SFL (Halliday and Matthiessen, 2014; Martin and Rose, 2007) for the analysis of language; O’Halloran et al.’s (2016) adaptation of O’Toole’s (2011) framework for the analysis of images; and O’Halloran’s (1999) framework for analysing mathematical images and symbolism.
The sketchnote as a multimodal complex
Sketchnotes are ‘multimodal complexes’ that consist of multi-layered structures made up of further multimodal complexes (MC), utilizing combinations of language, image, graphs, mathematical symbolic notation and other visual resources. The ways in which the different semiotic resources interact to make meanings comply with certain visual principles that are different from those that govern language. For instance, the distribution of linguistic, visual and mathematical selections in scientific sketchnotes is not necessarily sequential, as found in linguistic texts. The different elements in visual thinking do not necessarily unfold in time and space as words do in written language. 5 Visual thinking permits certain options in spatial organization that cannot be found in language. However, in order to meet the instructive purpose of the scientific sketchnote, this wider set of affordances must be narrowed by compositionally organizing the text in specific ways, for example, by following a layout and by using vectors and frames around multimodal complexes, which mark them as semantic units. In Sketchnote 1, the different MCs are arranged in a two-column structure (e.g. see Figure 2); the MCs in Sketchnote 2 are organized via a shaded path; the MCs in Sketchnote 3 correspond to framed boxes; while Sketchnote 4 follows a bubble-path layout. In all cases, the logical order in the visualization is achieved through the use of arrows.

Figure 2. Constituency-based analysis of Sketchnote 1. © Robert Dimeo. Reproduced with permission.
In the next section we explore how the different metafunctions unfold in scientific sketchnotes both globally and in their MCs.
Metafunctions
Textual meaning
Textual meaning makes use of different defining features to address the organization of information in each system (language, image, mathematical image and mathematical symbolism). The sketchnotes interpret the generic structure of the scientific paper in different ways. In general, the multimodal complexes follow the original structure of the paper. However, on some occasions, the original information is rearranged and moved forward or backward in the sketchnote. Sketchnote 1, for example, follows the typical generic structure of the scientific paper, but Sketchnote 3 integrates in the first multimodal complex, ‘What is a supramolecule?’, certain information that appears later in the paper.
Differences also apply to headings. Some multimodal complexes are introduced by means of headings. Some of these headings coincide with headings in the papers (e.g. ‘Analysis’, ‘Approach’); others are less academic headings not found in the paper (e.g. ‘What is a megasupramolecule?’ or ‘Results – Made it!’ in Sketchnote 3). Some sketchnotes or MCs do not include headings (e.g. Sketchnote 2); yet we can easily identify the purpose of their parts. For example, in Sketchnote 1, we know the purpose of MC1 and MC4, first, by means of their position in the sketchnote, i.e. the title (and reference) opens the sketchnote, and the conclusions close it, and, second, by their contents. In all cases, the title and headings of the sketchnotes provide the Theme for what follows in the multimodal complex, which could be considered the Rheme.
Textual meaning in MC1 in Sketchnote 1 is explored as an example of interaction between different semiotic modes. MC1 opens the sketchnote. It seems quite reasonable to think that MC1 is the first part to focus upon, although different readers/viewers might approach this MC differently. If we first look at the stick figures that configure the image work in the upper part of the MC, the next move is to focus on the text ‘HOW DO PEDESTRIANS MOVING IN CROWDS INTERACT WITH EACH OTHER’, which is necessary to make meaning out of the stick figures. We could also begin with the text and in that case the next move could be the figures. The use of the same colours in the image and text establishes the link between the two elements. On the other hand, the reader can move to the reference of the paper that the sketchnote summarizes, i.e. ‘from “Universal Power…”’, which, we assume, the experienced reader/viewer will easily recognize due to its use of reference citation format. In any case, a drawing of a paper is also found next to the textual reference. The linguistic and visual elements interact constantly to contextualize their meanings.
Finally, the use of colour and other devices, such as letter size, numbered lists, bold type, capitalization, or framing in different semiotic systems (text, images, diagrams) contributes to organizing the message and establishing various levels of attention for different pieces of information.
Ideational meaning
In order to make sense of and construe our experience of the world, processes, participants and circumstances, along with the logical structure behind the message, are expressed through different descriptors in each system. For example, ideational meaning will cover the use of mathematical symbols as logical connectors. Scientific English is characterized by a reformulation of certain linguistic processes via grammatical metaphor. Following Halliday’s (1993) analysis of scientific English, O’Halloran (2015) points to the reformulation of linguistic processes congruently expressed with verbs (e.g. ‘α happens’) into nouns (e.g. ‘happening α’) as one of the most common defining features of the language of mathematics, whereby the location of the meaning ‘happens’ is transferred from one place in the grammar to another. This phenomenon is called grammatical metaphor and produces various complex features of scientific English such as nominalization, lexical density and syntactic ambiguity (for a discussion of grammatical metaphor, see Halliday and Matthiessen, 2014: ch. 10). For instance, a process such as ‘cracking’, which could be expressed congruently through two clauses, ‘When glass cracks, how fast do the cracks grow?’, could be reconstrued in scientific English through grammatical metaphor into a long noun phrase such as ‘glass crack growth rate’ (adapted from Halliday, 1993: 87). In turn, the visual thinking strategies employed in the sketchnote can serve to unpack this kind of content through more congruent language structures via resemiotization. The result of this unpacking procedure is a more readily understandable text.
In terms of experiential meaning, all four sketchnotes contain instances of resemiotization. In Sketchnote 3, for instance, resemiotization occurs under the section ‘“GOLDILOCKS FORMULATION”’ (Figure 3a). Here, four ‘if-then’ statements expressed in language in the article (Figure 3b) are conveyed in the sketchnote by a mixture of language and arrows. Resemiotization is carried out from metaphorical language in the original text to congruent language in the sketchnote. The metaphorical forms that express processes through nouns in the original text (e.g. ‘formation of small cyclics’) undergo resemiotization in the sketchnote whereby the processes are now expressed via verbs (e.g. ‘cyclics form’). The same process applies to ‘degradation’, which is resemiotized as ‘break apart’; and ‘formation of supramolecules’, which is resemiotized as ‘MSMs made’. The fourth ‘if-then’ statement does not undergo resemiotization (i.e. ‘too few linear species form’ (original text) – ‘can’t form enough linear species’ (sketchnote)) since it is already found in a congruent form in the original source.

Figure 3. (a) Example of resemiotization in Sketchnote 3; (b) If-then arguments in Wei et al.’s (2015: 73) paper.
Sketchnote 1 offers an interesting case: MC1 contains several instances of resemiotization that address varying degrees of difficulty (see Figure 4a). First, resemiotization from metaphorical linguistic mode to congruent linguistic mode takes place. ‘HOW DO PEDESTRIANS …’ is a resemiotization of the title of the original article ‘Universal Power Law Governing Pedestrian Interactions’. In this case, the title of the article includes an example of grammatical metaphor whereby the noun phrase ‘pedestrian interactions’ re-encodes the process ‘to interact’. In the sketchnote, the noun phrase is reinstated as the clause ‘pedestrians interact’, a more congruent form that makes the text more accessible to a non-specialist reader. Second, resemiotization from congruent linguistic mode to visual mode occurs. The accessible title ‘HOW DO PEDESTRIANS MOVING IN CROWDS INTERACT WITH EACH OTHER’ is resemiotized in the episode of the highlighted stick figures approaching each other in crowds, i.e. the meaning is transformed from the language system to the image system. By means of the image system, resemiotization resolves the possible ambiguity among the related meanings of the verb ‘interact’ (e.g. to act together or towards others or with others or upon others).

Figure 4. (a) Example of resemiotization in Sketchnote 1; (b) Instances of logical and mathematical symbols in Sketchnote 1; (c) Line graph in Karamouzas et al.’s (2014: 2) article; (d) Line graph in Sketchnote 1.
Experientially, in both types of resemiotization, participants (pedestrians – stick figures), processes (interact/moving – stick figures approaching each other) and circumstances (in crowds) are shared in the systems involved as they have the same meaning. In order to convey the same experiential meaning in both systems and guarantee effective resemiotization, visualizations in the visual system must conform to the Congruence Principle (Tversky et al., 2002); that is, their structure and content should match the structure and content of the desired representation.
Ideationally, the scientific sketchnotes also utilize some features typical of mathematics discourse (see O’Halloran, 2008, 2015), such as relational identifying processes (processes where something is being identified as being equivalent to something else, e.g. Beijing is the capital of China, which could be expressed in mathematical terms as x = y) and logical meaning resources. In both cases, logical and mathematical symbols are used with the aim of displaying the same information in the scientific paper in a condensed but unambiguous form.
For example, as shown in Figure 4(b), relational identifying processes in Sketchnote 1 are found in the definition of functions mainly via the equal symbol (=), e.g. ‘g(x) = pair distribution function’, or ‘E0 = scene dependent char. energy’. Further relational identifying processes occur in the form of mathematical formulae, which are identical in the paper and sketchnote, as the information compressed in them cannot be transmitted in a more compact way. The sketchnote also deploys an array of mathematical symbols as logical connectors that condense information in a highly practical manner. Among them, we find the proportionality symbol (∝), which resembles the Greek letter alpha, to mean ‘is proportional to’, and the ‘therefore’ sign (∴) placed before the conclusion of the study.
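How much clause-level meaning these symbols compress can be seen in a schematic LaTeX rendering. This is only an illustrative sketch: A, B and C are placeholder symbols rather than quantities from the article, and only the first line reproduces a definition quoted from the sketchnote.

```latex
% Requires the amsmath and amssymb packages.
% A, B and C are placeholders, not quantities from the article.
\[ g(x) = \text{pair distribution function} \]           % relational identifying process
\[ A \propto B \qquad \text{(``A is proportional to B'')} \]  % logical connector
\[ \therefore\; C \qquad \text{(``therefore C'')} \]          % logical connector
```

In each line a single symbol stands in for what language would express as a full relational or conjunctive clause.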
Within the ideational metafunction, graphs are also displayed in particular ways in the sketchnotes. Not all graphs in the scientific papers are displayed in the sketchnotes. Only those that the sketchnoter considers relevant for understanding and recalling are selected. Most of them represent simplified resemiotized versions of the graphs of the scientific article. This idea aligns with Tversky et al.’s (2002) Principle of Apprehension, whereby the structure and content of effective visualizations should be readily perceived and comprehended. According to Tversky (2005: 37), graphs and diagrams, as schematic visualizations, ‘preprocess the actual information, extracting what is needed, even distorting it for emphasis, and eliminating what is non-informative’. Therefore, two degrees of schematization are implied in the graphs in the scientific sketchnote: the schematization of reality in the original graph and the schematization of the original graph in the sketchnote graph.
In Sketchnote 1, various graphs are drawn as adapted versions of the graphs in the scientific article, yet they manage to convey the essential data of the original diagrams. Compare, for instance, the line graph in Figure 4(c) to the line graph in Figure 4(d). In this example, each graph makes its meaning differently in the article and in the sketchnote. In the framed line graph in the scientific article, the x axis represents the distance r and the y axis represents g(r) (written g(x) in the sketchnote), i.e. the probability of finding two people within some distance r as they approach each other. Different relational processes take place. On the one hand, the function g(r) and the distance r stand in a relational process materialized by the increase and decrease of three lines of different colours. On the other hand, by means of another relational process, these colours are identified with the colours of three straight lines included in a key or legend in the upper-right corner of the framed graph. Furthermore, these three straight lines are related to a range of pedestrians’ rates of approach. This relational process within the legend is based on the adjacency of each line to each formula. In this graph, g(r) exhibits two marked patterns of behaviour when rate of approach is considered: (1) if rate v is lower than or equal to 1 m/s, and (2) if rate v is higher than 1 m/s. Based on this, the most interesting finding is that at low rates of approach g(r) exhibits higher values than at high rates.
The ideational meaning of the graph is contextualized using language in the caption below the graph and in the text of the scientific article. The numbered caption, ‘Figure or Fig. 1(c)’, describes the processes displayed in the graph. The identifier in the caption allows for referencing the diagram in the text (Karamouzas et al., 2014: 2). The linguistic mode in the caption summarizes the information in the graph, as follows: ‘Fig.1(c) The pair distribution function g as a function of interpedestrian separation r shows very different behaviour when plotted for pedestrian pairs with different rate of approach v = −dr/dt. Units of v are m/s’, and the linguistic information in the text itself repeats the information in the following lines:
However, as can be seen in Fig. 1(c), g(r) has large, qualitative differences when the data are binned by the rate at which the two pedestrians are approaching each other, v = −dr/dt. In particular, pedestrians with a small rate of approach are more likely to be found close together than those that are approaching each other quickly (as evidenced by the separation between the curves at small r).
This explanation serves as a reference point for explaining how semiotic resources work in the graph of the sketchnote, a naïve resemiotized representation of the essential data depicted in the original graph. At first glance, this second graph does not provide as much detailed information as the graph in the scientific article, a tendency observed in most graphs in these scientific sketchnotes. Sketchnote graphs are characterized by simplicity, in comparison with those in the scientific articles. The overall visual pattern of relations prevails but adjustments have been made in different parts of the graphs. For instance, the mathematical symbolism in the original legend has been resemiotized by using language. Thus, the two patterns of behaviour under ‘rate of approach’ are simply labelled here ‘slow approach’ and ‘fast approach’. These labels or captions are further explained by two episodes (visual mode) in which two stick figures (participants) take part in each of the slow and fast practices. Each of the episodes is linked to each caption in a double manner. On the one hand, the episode is placed contiguously to its corresponding caption. On the other hand, running and walking (material processes) are represented by the figures’ stance and accompanying visual devices (circumstances), i.e. two vertical lines behind the figure indicating slow walking, and three horizontal lines behind it implying running.
This part can be understood as the result of a resemiotization from the caption text in the scientific article ‘(Fig. 1(c))’ and the part in the text (Karamouzas et al., 2014: 2) which repeats the information, i.e. ‘However, as can be seen in Fig. 1(c), g(r) has large, qualitative differences when the data … (as evidenced by the separation between the curves at small r)’.
Two findings are implied in the linguistic component of the article: (1) there are differences in g(r) when the rate of approach is considered, and (2) these differences are further made explicit in the idea that pedestrians approaching each other slowly are more likely to be found close together than those approaching each other quickly. The second finding is made explicit in the sketchnote through a combination of language and visual mode in ‘g(x) small’ linked by means of an arrow to ‘collision avoidance’, as displayed in Figure 4(d). Here the explanation of the findings has been reduced to the minimum: whole findings or clauses, such as ‘pedestrians with a small rate of approach are more likely to be found close together than those that are approaching each other quickly’ (Karamouzas et al., 2014: 2), are condensed into an adjective (a quality of a function: small), a mathematical symbol (the function g(x)) and a noun phrase (collision avoidance). Therefore, unlike the scientific paper, the sketchnote eschews repetition in favour of the selection of essential information.
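For readers less familiar with the quantity behind the graph, the grouping by rate of approach described in the quoted passage can be sketched in a few lines of code. This is only an illustration on invented numbers, not the procedure followed by Karamouzas et al. (2014): the function names, the threshold separating ‘slow’ from ‘fast’, the decision to ignore pairs that are moving apart and the sample values are all assumptions made here for the sake of the example.

```python
# Illustrative sketch only: hypothetical names, threshold and data.

def rate_of_approach(r_before, r_after, dt):
    """Finite-difference estimate of v = -dr/dt (positive when the pair closes in)."""
    return -(r_after - r_before) / dt


def label_by_rate(samples, dt, threshold):
    """Group separations under the sketchnote's two labels."""
    groups = {"slow approach": [], "fast approach": []}
    for r_before, r_after in samples:
        v = rate_of_approach(r_before, r_after, dt)
        if v <= 0:
            # Assumption: pairs that are not approaching fall outside both labels.
            continue
        key = "slow approach" if v < threshold else "fast approach"
        groups[key].append(r_after)
    return groups


# (separation at time t, separation at time t + dt), in metres; invented values.
samples = [(2.0, 1.9), (3.0, 2.0), (1.0, 0.95), (4.0, 2.5), (1.5, 1.6)]
groups = label_by_rate(samples, dt=1.0, threshold=0.5)
print(groups)
```

The two lists returned correspond to the two curves that the sketchnote labels in words; in the article, g(r) would then be estimated separately for each group.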
Differences in semiotic processes also take place in Sketchnote 2 and its reference article. In the caption for the graph in the article (see Figure 5(a)), the identification of the different participants in the graph (i.e. microspheres, microgels and depletion layers) is made first from language mode (e.g. ‘microsphere’) to language mode (e.g. ‘blue circles’), and second from language mode (e.g. ‘blue circles’) to visual mode (e.g. the blue circle depicted in the graph) by means of relational identifying processes. These processes occur differently in the corresponding graph in the sketchnote (Figure 5(b)).

Figure 5. (a) Diagram in Luo et al.’s (2015: 2497) article; (b) Diagram in Sketchnote 2; (c) Identifying relational processes in Sketchnote 2.
In the sketchnote there is no need to establish the identification of the participants in the visualization via the linguistic mode, as the sketchnoter has decided to advance this identification in a previous multimodal complex (Figure 5(c)). In this multimodal complex, the participants are identified through the relation between the visual mode (the big grey circle) and the language mode (e.g. POLYSTYRENE MICROSPHERES), which occur side by side. Together with the formulae retained from the original graph, the only information that is included in language in the diagrams in Figure 5(b) is the series of descriptive captions (e.g. ‘HOMOGENEOUS DISPERSION OF MS’, ‘STABILIZED’) that summarize information taken from the main body of the article. The result of these processes is a resemiotized graph that facilitates understanding by increasing the emphasis on interpersonal meaning.
Interpersonal meaning
In scientific English, especially in mathematics, the modality (i.e. truth value) is consistently high and there is little margin for variations of interpersonal meanings in terms of the ways in which social relations are established and maintained as the reader/viewer engages with the text. O’Halloran (2015: 66) addresses this aspect of scientific writing, also found in the physics scientific article under consideration here, as follows:
The grammatical strategies in scientific English refashioned the world of experiential meaning (i.e. physical happenings and events) into a logical discourse of argumentation. In this realm, choices in interpersonal meaning, for enacting social relations, including modality (e.g. probability, usuality and potentiality) (Halliday, 1994b; Halliday and Matthiessen, 2004) became largely invariant, providing a backdrop (i.e. a blank canvas) for the expansions of experiential and logical meanings which occurred. That is, as the experiential and logical meaning expanded in scientific English, the interpersonal realm contracted into formal, hierarchical relations with a high truth-value.
However, in order to establish a non-threatening rapport with the reader, interpersonal meaning is significantly intensified in sketchnotes through instances of resemiotization, for example in graphs and diagrams, and through the interaction of a series of devices (e.g. hand-drawn typography and imagery, anthropomorphized characters, humour, visual metaphors or the use of certain speech functions). In the scientific sketchnote, the result is a visually appealing and user-friendly product that contrasts with the gravity of the scientific topic it depicts.
One of the most recurrent devices in visual thinking is the stick figure, a schematic human figure that can serve a variety of purposes. Among others, stick figures can be the participants of a process already described in a different semiotic resource, they can be projections of the reader/viewer-as-participant onto the visual thinking product, or they can serve to anthropomorphize non-human elements. These figures resemiotize the information encapsulated in the linguistic mode into more readily understandable meanings by playing the role of the interacting participants in the linguistic mode. For example, the interaction of stick figures in Sketchnote 1 represents circumstantial meanings, such as distance or speed, which are better grasped, and recalled, visually than textually.
Figure 6(a) shows an example of resemiotization through visual metaphor based on stick figures, in which the undertaking of a complex chemical process, i.e. the ‘gas separation challenge’, expressed in written form in the paper, is metaphorically represented by a pole-vaulting stick-figure athlete in the sketchnote. In many instances, these anthropomorphized cartoon-like characters provide humour, which enhances the readers’/viewers’ engagement with the text. Figure 6(b), for example, displays an ultra-long polymer which by means of synecdoche is represented as two frightened cartoon faces. Humour is produced when the sketchnoter takes a situation of conflict in physics, as described in the paper, and humanizes it by visually dramatizing the characters’ reactions and emotions.

Figure 6. (a) Visual metaphor in Sketchnote 4; (b) Instance of humour in Sketchnote 3.
Another mechanism for engaging and guiding the reader/viewer is the use of a variety of speech functions, such as questions and emphasized statements. The interrogative mood directly addresses the reader/viewer. For example, in Sketchnote 1 the question in MC1 (‘HOW DO PEDESTRIANS MOVING IN CROWDS INTERACT WITH EACH OTHER?’) not only works as an introduction to the topic but also brings the topic closer to the reader/viewer, as if it were his or her own question. In MC3.1, the question ‘How close is he?’ is inserted in a comic-like thought bubble that links to a stick figure. The question and the visual strategy of the thought bubble plus the stick figure help make the meanings accessible and explicit via verbalization. Also, the information given through statements is emphasized in some cases: for example, ‘curves are the same for fast
Resemiotization types: summary
Table 1 provides a summary of the resemiotization types found in the analysis of sketchnotes. At least five types of resemiotization are identified, based on the transferral of meanings between scientific papers and sketchnotes. Types 1, 2, 4 and 5 address shifts that take place from the scientific paper to the sketchnote; Type 3 includes resemiotization within the sketchnote itself. Each type results in a series of meanings (ideational, interpersonal and textual) that converge in making scientific content accessible to the reader/viewer of the sketchnote. Reference to instances of each resemiotization type in the analysis above is provided.
Table 1. Resemiotization types in the sketchnotes.
Conclusions and Future Research
One of the major objectives of visual thinking is to enhance understanding of difficult, abstract, complex or ambiguous content by means of a compendium of resources and through a variety of multimodal genres. A representative example of a visual thinking genre is the sketchnote. Sketchnotes render a series of key features of visual thinking methodology via a combination of semiotic resources. This study has examined the characteristics of four sketchnotes based on scientific articles from physics. An SF-MDA model has been used to offer some insight into the meaning-making mechanisms underlying visual thinking methodology. This model has proved a useful means to explore the behaviour of the various semiotic modes both in the sketchnote and the paper.
Although the sketchnote might be regarded simply as a resource for communicating informal content, it has been shown to be a useful instrument for making complex scientific information accessible. Built upon a structure of multimodal complexes, the sketchnote achieves ideational, textual and interpersonal meanings in ways that engage the reader/viewer and make the ideational content accessible. By means of a deliberate mix of semiotic resources, the sketchnote reduces the complexity of specialized discourse, such as physics or mathematics, while providing access to key ideas. In the examples addressed here, through the interaction of linguistic and visual resources, the sketchnotes manage to unpack the condensed, abstract meanings in the scientific article into more concrete forms. Along with these ideational mechanisms, the efficient arrangement of the contents in different meaning-making layers contributes to the clarification of the message, taking into account the reader/viewer’s engagement with the materials presented.
A key strategy that visual thinking exploits is resemiotization. This strategy is particularly important in the scientific sketchnote, where different types of resemiotization have been identified in the systems at stake (language, image, mathematical image, mathematical symbolism), and which act on various degrees of complexity. One of these types manages to unravel the complexity of language instances generated by grammatical metaphor, thereby achieving more readable discourse forms.
It is apparent from this research that a close examination of meaning-making processes requires a systematic approach such as the one offered by social semiotics and systemic functional theory. Through theoretical concepts and systematic analysis, it is envisaged that the full potential of visual thinking can be explored in the vast number of contexts that require information to be accurately and succinctly summarized in accessible and memorable forms. Contexts such as business management, marketing, administrative services and journalism are already using visual thinking as a means of facilitating content and stimulating thinking. In other fields, such as education, visual thinking is an emerging practice. Motivated by a lack of research on visual thinking texts from a multimodal perspective, the present study contributes to providing a starting point for the exploration of the underlying functions of the communicative mechanisms in visual thinking products in these contexts. The identification of meaning-making processes from a social semiotics and systemic functional perspective may contribute useful knowledge to experimental research on cognitive and learning abilities, which could inform more effective visual practices in the above-mentioned domains. For example, research on the attainment of cognitive tasks, such as perception, comprehension and reasoning, as ultimate goals of visual thinking, could benefit from the identification and manipulation of semiotic mechanisms ideationally, interpersonally and textually.
Although the study has successfully identified the underlying semiotic processes in a specific type of sketchnote, several limitations need to be acknowledged that may lead to further investigation. The present study has included four sketchnotes by the same author as he is one of the few practitioners doing scientific sketchnoting. Further research on sketchnotes done by others might complement the findings reported here. Similarly, future lines of research could extend to scientific sketchnoting based on oral discourse.
Footnotes
Acknowledgements
We are grateful to Dr Robert Dimeo for providing us with the sketchnotes that are the focus of our study and for his valuable assistance throughout our work. Any remaining errors are our own.
Funding
This research was supported by the Spanish Ministerio de Educación, Cultura y Deporte via the ‘José Castillejo’ programme (CAS15/00205) within the framework of Programa Estatal de Promoción del Talento y su Empleabilidad en I+D+i, Subprograma Estatal de Movilidad, del Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016. There is no conflict of interest.
Notes
Biographical Notes
ALMUDENA FERNÁNDEZ-FONTECHA is a Lecturer in the Department of Modern Languages at the University of La Rioja (UR), Spain. She is a member of the UR Applied Linguistics Group (GLAUR). Her research interests include different dimensions of Content and Language Integrated Learning (CLIL), such as the study of multimodality for scaffolding language learning, foreign vocabulary learning and the study of learners’ factors in foreign language acquisition, such as motivation and creativity. Address: Department of Modern Languages, University of La Rioja, C/San Jose de Calasanz 33, Logrono, La Rioja 26004, Spain.
KAY O’HALLORAN is Professor in the School of Education, Faculty of Humanities at Curtin University. Her areas of research include multimodal analysis, social semiotics, mathematics discourse, and the development of interactive digital media technologies and visualization techniques for multimodal and sociocultural analytics. Address: School of Education, Curtin University, GPO Box U1987, Perth, Western Australia 6845, Australia.
SABINE TAN is a Senior Research Fellow in the School of Education, Faculty of Humanities at Curtin University. Her research interests include critical multimodal discourse analysis, social semiotics, and visual communication. She is particularly interested in the application of multidisciplinary perspectives within social semiotic theory to the analysis of institutional discourses involving traditional and new media. Address: as Kay O’Halloran.
PETER WIGNELL is a Senior Research Fellow in the School of Education, Faculty of Humanities at Curtin University. Peter’s current research interests are in Systemic Functional Linguistics, especially in its application to the analysis of multimodal texts. His research has also focused on the role of language in the construction of specialized knowledge. Address: as Kay O’Halloran.
