Abstract
In the field of data visualization, there has been a recent trend of using a complex type of visualization with a multidimensional structure or using several visualizations in parallel when summarizing the results of sentiment analysis. Although this trend may be useful for sophisticated sentiment analysis, such analysis is difficult for the general public and novice researchers. To address this issue, there has recently been a trend of visualizing sentiments using visual metaphors. To facilitate the understanding of related cases, it is necessary to have a systematic means of grasping the sentiment target, the purpose and motivation of research, and the representations used as substitutes in visual metaphors. Therefore, the objective of the present study was to develop an exploration system that can analyze the visual metaphors used in sentiment visualization cases. For this study, (1) sentiment visualization cases in which visual metaphors are used were collected. (2) After a taxonomy composed of the categories of “target, intermediation, representation, visual variable, and visualization technique” was constructed, it was used to analyze sentences describing visual metaphors in sentiment visualization cases. (3) An exploration system capable of grasping the semantic relationships of sub-elements within the five categories of the taxonomy and intuitively interpreting visual metaphors was developed so that appropriate cases can be recommended to sentiment visualization researchers. (4) The approach and usefulness of the exploration system were explained using user scenarios. (5) A case study was conducted to show that visual metaphors can be analyzed from various perspectives using the system. (6) The usability of the exploration system was demonstrated through a verification targeting experts.
The proposed system allows researchers and analysts to intuitively grasp “what visual metaphor methods and ideas are needed to visualize sentiment data in a way that is easier to understand.”
Introduction
Research background
Sentiment analysis involves arranging the opinions, emotions, positions, attitudes, evaluations, and arguments that people have toward a specific object, content, or information to identify their characteristics. 1
For example, there are cases in which people report their overall evaluation of a movie, opinions about the characters, and emotions felt while watching the movie in the form of comments. Comments are examples of sentiment data, and sentiment analysis is the act of collecting data, refining it in the form of text, and then identifying and quantifying the frequency of sentiment information that frequently appears in it or classifying opinion holders who exhibit similar sentiments.
After conducting a sentiment analysis, it is necessary to summarize and present the opinion or emotion information reported by people. In the field of data visualization, there have been numerous visualization studies using sentiment analysis. Sentiment visualization analyzes the polarity, frequency, keywords, and relationships of sentiment information by type in text containing specific objects, content, and information, and expresses them in the form of a simple chart or diagram or as a complex multidimensional structure using a visualization technique. 2
In recent studies, characteristic vectors have been constructed from unstructured text using machine learning and deep learning, after which the polarity of sentiments appearing in it has been subdivided and classified, and the collected sentiment information has extended beyond positive and negative characteristics to various types of polarity.3,4 Moreover, owing to a tendency toward conducting “aspect-based sentiment analysis” which focuses on a detailed element of the subject rather than “the entire” subject, the amount of data being collected is increasing.5,6
In this context, data visualization researchers express sentiment information by using complex visualizations with multidimensional structures or by combining several visualization techniques. The results can serve as a useful analysis tool for more in-depth and sophisticated sentiment analysis, but the analysis is difficult for the general public as well as novice researchers who wish to explore sentiment information. In addition to visualizing “opinions, emotions, attitudes, positions, etc.,” it is important to comprehensively represent information such as “subjects expressing sentiments, holders of opinions with sentiment information, and the time when sentiments are recorded” for sentiment visualization; however, few visualization techniques have been introduced in the field of data visualization that can display all of this information at once.
Therefore, when producing sentiment visualization, there is a tendency to use a combination of several visualization techniques. Further, the concept of “sentiment” is subjective in that each person can have different feelings even when viewing the same object, and different analyses can be performed even for the same sentiment visualization analysis results presented by the researcher. To address these issues, there has been a gradual increase in the number of cases in which sentiment visualizations are created using “visual metaphors.” Lakoff and Johnson 7 stated that a “metaphor” is used to explain an object that is difficult to understand by using an already known object, while Richards 8 considered a “metaphor” as comparing the originally intended expression, that is, the tenor, to a vehicle. Here, the tenor refers to what was originally intended to be expressed, and the vehicle refers to the object that replaces what is intended to appear.
Visual metaphors are based on the principle of “metaphor” described above, which refers to “facilitating cognition by visually expressing an object or phenomenon that is difficult to understand in general with an object or phenomenon that is already known.”9,10 In visual metaphors using sentiment analysis data, the target of sentiment analysis can be considered the tenor whereas the representation that can replace the target of sentiment analysis can be considered the vehicle. The corresponding two elements are essential elements for explaining the visual metaphor.
Further, to understand metaphors, it can be useful to consider the goal or the intermediation of visual metaphors. 11 As representations used for visual metaphors can explain the tenor in its original form, they can be transformed (visual variables)—for example, by applying colors or changing the size—to reflect the information contained in the tenor in various ways. 12 Further, because this study deals with sentiment analysis visualization, it is necessary to consider visualization techniques that can be used in harmony with representations to help better interpret metaphors. 13
In this study, the five categories of visual metaphors were “target, intermediation, representation, visual variable, and visualization technique.” Figure 1 shows the definition for each category.

Definition of five categories of visual metaphors: organizing with a “what-why-how” approach. 33
In the present work, a taxonomy was utilized as a preliminary study to analyze the process of visual metaphor in sentiment visualization. 14 A taxonomy is the result of categorizing certain data according to specific criteria. 15 Figure 2 shows the structure of the taxonomy. The elements that constitute the taxonomy are largely composed of the main categories of “target, intermediation, representation, visual variable, and visualization technique,” each of which also has sub-elements. By using a taxonomy to analyze visual metaphors, it is possible to identify the types of representations that replace the targets in sentiment visualization, the purpose of the metaphor, the visual variables that are applied to the representation, and the visualization techniques with which the representation can be mixed.

However, among the collected visual metaphors, some are easily understood in terms of the overall approach of the metaphor, but they become more complex when applied to diverse targets and representations. For instance, when the target is divided into various aspects rather than representing the entire target, multiple types of representations may be employed, or the structure of representations may be further subdivided to convey the metaphor effectively. Therefore, if there is a method to systematically organize these various visual metaphor cases, it can significantly aid in intuitively understanding the multitude of visual metaphors present in sentiment visualization cases.
Additionally, aggregating various visual metaphors as data enables the creation of tools to “analyze frequently used metaphors, identify cases with similar metaphors,” or “compare metaphors using different representations for the same target and intermediation.”
With this background, the objective of this study was to develop an exploration system that can intuitively grasp visual metaphors in the case of sentiment analysis. This exploration system should be designed to connect and interpret the visual metaphors appearing in the cases of the five categories of the taxonomy according to the context. Having an understanding of the relationships between the sub-elements included in the main category constituting the taxonomy should provide the ability to semantically connect and interpret all the categories of visual metaphors.
Research purpose and method
This study aims to create an exploration system that assists in the step-by-step understanding of visual metaphors in sentiment visualization cases. Through this system, it intends to provide researchers (developers) and analysts planning sentiment visualization with an intuitive understanding of “what visual metaphor methods and ideas are necessary for making sentiment data more comprehensible.” We present the general process flow of this study in Figure 3.

General process flow.
First, sentiment visualization cases were collected by referring to literature data in the field of sentiment analysis and data visualization, and elements corresponding to the five categories utilized in the preliminary study were extracted from the cases.
Second, the visual metaphor elements extracted from the cases were organized by type according to the characteristics of the targets or representations, and then used for system production based on this organization. The exploration system provides a function to show semantic relationships between sub-elements needed to interpret visual metaphors. This provides the ability to connect and observe the sub-elements that constitute the visual metaphor.
Third, a user scenario was crafted to elucidate the methodology for accessing the created exploration system. Within these user scenarios, users can understand how to use the exploration system and how a fictional character in the scenario manipulates the system to achieve their objectives. These scenarios are broadly divided into two categories: “after selecting a theme suitable for a given target from the data, identifying the visual metaphor frequently used in that theme” and “exploring the purpose of visualization and the target based on the representation to be used.”
Fourth, to demonstrate the system’s contributions and validate its utility, case studies were created to present the characteristics of visual metaphors that can be found in the system and the analytical results from various perspectives. The case study involved three types: “deriving visual metaphors that frequently appear in the exploration system by type,” “selecting frequently appearing intermediations and interpreting related visual metaphors,” and “analyzing visual metaphors using system-provided themes.”
Fifth, a usability evaluation was conducted for the exploration system, and the results were analyzed. The usability verification was conducted in two parts: measuring the level of understanding of the major visual metaphors in the system and examining the level of satisfaction with the function configuration and interaction of the system.
Here, we present the study results, discuss the significance of the exploration system, and outline future research directions.
Contributions
The exploration system in this study verifies whether the taxonomy from the preliminary study aids in analyzing visual metaphors within sentiment visualization cases. 14 It can also help analyze whether there is a series of regularities in the visual metaphors used in sentiment visualization. Users can sequentially explore individual metaphors or analyze usage patterns of representations, visual variables, and visualization techniques based on the most frequently appearing purpose and motivation. The contributions of this study can be summarized as follows:
(1) As preliminary research for the production of the exploration system, a taxonomy composed of the categories of “target, intermediation, representation, visual variable, and visualization technique” was established according to the specific analysis tasks that appear when visual metaphors are created. This provides indicators that researchers can use to clearly identify the purpose and background for producing sentiment visualizations and to select representations that are suitable for specific purposes.
(2) To show the semantic relationships of sub-elements (keywords) in the categories of the taxonomy needed to interpret visual metaphors, the exploration system provides a network that shows the “target–intermediation relationship,” “intermediation–representation relationship,” “representation–visual variable relationship,” and “representation–visualization technique relationship.” Moreover, nodes of sub-elements that can be used to contextually interpret visual metaphors are clustered and displayed as colored ellipses, and the links constituting the axes of the Sankey diagram are then interlocked to enable multi-selection. This helps users easily select the trend of the use of representations according to the purpose or motivation as well as the combination of frequently appearing targets, thus facilitating the understanding of the context of the visual metaphor with the sub-elements in each category of the taxonomy.
(3) When visual metaphors are employed, the elements within each category are not acting independently; instead, they have causal relationships with each other. Our exploration system shows the frequency of visual metaphors appearing in sentiment visualization cases and utilizes a Sankey diagram to connect and interpret each category constituting visual metaphors through causal relationships. To easily organize the various visual metaphors that appear in sentiment analysis visualization, the five categories constituting the taxonomy of the preliminary study are shown as axes of the Sankey diagram. This allows the user to freely analyze visual metaphors with various targets and purposes by examining the categories the user has selected. The Sankey diagram can connect sub-elements in the five categories one-by-one to evaluate which visual metaphors appear mainly in the exploration system, thus allowing for easy determination of whether there is a series of regularities in interpreting visual metaphors. Furthermore, it offers the advantage of allowing users to effectively compare the frequency of each element used in the preceding and succeeding categories based on a specific category of interest, using the thickness of links.
(4) In the exploration system, representative images of collected studies are inserted so that, even if they have the same target or purpose, differences that appear when different representations are used can be identified, or similarities between cases involving the same visual metaphor can be observed briefly.
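The link-thickness idea described in contribution (3) can be illustrated with a minimal sketch. It assumes that each analyzed visual metaphor is stored as a record with one sub-element per category and that the Sankey axes follow the order in which the categories are listed in this paper; the record values and the function name `sankey_links` are hypothetical and are not taken from the system itself.

```python
from collections import Counter

# Assumed axis order: the order in which the paper lists the five categories.
AXES = ["target", "intermediation", "representation",
        "visual variable", "visualization technique"]


def sankey_links(records):
    """Count how often sub-elements on adjacent axes appear together.

    Each count would be drawn as the thickness of one Sankey link.
    """
    links = Counter()
    for record in records:
        for left, right in zip(AXES, AXES[1:]):
            links[(left, record[left], right, record[right])] += 1
    return links


# Hypothetical records, for illustration only.
records = [
    {"target": "sentiment", "intermediation": "exploration",
     "representation": "plants", "visual variable": "color",
     "visualization technique": "time-oriented visualization"},
    {"target": "sentiment", "intermediation": "exploration",
     "representation": "map", "visual variable": "size",
     "visualization technique": "node-link diagram"},
]

links = sankey_links(records)
for (left_axis, left, right_axis, right), weight in sorted(links.items()):
    print(f"{left_axis}:{left} -> {right_axis}:{right} (weight {weight})")
```

In this sketch, the two records share the same target and intermediation, so that link receives a weight of 2 while every other link has a weight of 1.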
Related work
Visual metaphor for sentiment visualization
To create an exploration system, it is necessary to begin by examining and analyzing the visual metaphors that appear in sentiment visualization cases. This section elucidates the visual metaphors within sentiment visualization, utilizing the five categories established in the preliminary study’s taxonomy. 14
Six cases are presented, each highlighting diverse applications and combinations of sub-elements within the taxonomy. Figure 4 displays a representative diagram for each case.

Cases of sentiment visualization for which a visual metaphor is created.
In the study of Cao et al., 16 a metaphor was created using the “structure of a sunflower” to examine how original tweets spread over time on social media. The content and polarity of the tweet were expressed as the seeds (disk floret) and the color of the seeds, respectively; the shape of the seed changed when a tweet was activated and retweeted, while the diffusion pathway of seeds was marked using sunflower petals (ray floret) when the retweeted sentiment information was spread. Further, the appearance of a Voronoi icon at the tip of the petal indicated that the tweet spread to the place, institution, or community where the tweet was active. Moreover, time-oriented visualization and space-based visualization were applied, where sunflower seeds gradually moved and spread over time and the location of tweeting activity was indicated.
In the study of Chen et al., 17 tweets with similar sentiment information were clustered and expressed as areas on a map with different colors, contrasts, sizes, and shapes, and a process metaphor was used whereby the area of the map expanded to represent the spread of sentiment information. To demonstrate the relationship between sentiment keywords in the map, the representation of the bridge and the node-link diagram were expressed together.
In the study of Xu et al., 18 to evaluate the various aspects of a product or service, sentiment text for each aspect was divided into two categories—positive and negative—and then visualized as food, specifically pizza pies. If there was a target called “camera,” the aspects of this target were “the image quality of the camera, the exterior design of the camera, and the model that advertised the camera.” The size of the pie increased with the amount of sentiment information collected for each aspect, and pies of different colors were replaced according to the polarity ratio of the sentiment. Moreover, a tree map visualization technique was used to hierarchically organize the aspects, and positive and negative keywords for each aspect appeared in the form of a word cloud in the pie representation.
In the study of Ha et al., 19 after the amounts of emotional vocabulary felt when watching a movie were organized by frequency and after the similarity of emotional vocabulary was calculated and displayed as an MDS map, the relationship between movie contents based on the frequency of emotional vocabulary as similarity was represented using a node-link diagram. Then, movies that stimulated major emotions were clustered, and the cluster was metaphorized as a celestial graphic. The metaphors helped to compare movies that stimulated different emotions or easily explore movies that stimulated similar emotions as a group.
In a study conducted by El-Assady et al., 20 the topics of discussions were represented in the shape of a circle while using representations of the discussion floor to analyze the opinions, positions, attitudes, etc. of speakers engaging in discussions; compare the amounts of utterances in these engagements; and detect areas where disputes occurred between speakers. Then, each utterance was converted into a bubble, and a metaphor was used to express the utterance by accumulating the bubbles. The color of the bubble represented the identity of the speaker, and the size of the bubble represented the amount of speech. The width of the discussion floor varied according to how long a particular discussion topic by various speakers continued. Moreover, time-oriented visualization was mixed with all the processes in which bubbles accumulated or discussion floors were formed.
To collect the contents of tweets posted by people who watched the French presidential election TV debate and analyze the collected tweets by section, Huron et al. 21 used bubbles decorated with the profile pictures of each tweet opinion holder as representations. The bubbles were then accumulated in the form of a bar chart.
The visualization cases presented in Figure 4 can be easily comprehended by interpreting each result through the lens of the five specified categories (see Table 1). This study aimed to amass additional instances of sentiment visualization encompassing all five categories of visual metaphor and treat them as data for the exploration system.
Examples of extracting the five categories from sentiment visualization cases, using the six visualizations presented in Figure 4.
Exploration system
In the field of data visualization, when various visualization cases are collected, an exploration system is sometimes created to provide an environment in which they can be analyzed more efficiently. An exploration system is a platform that collects various cases to be analyzed and organizes the characteristics into keywords or categories. It helps users quickly understand the characteristics of the main topic dealt with in each case without needing to directly analyze the characteristics of the target cases.
The exploration system can also group and show similar cases, thus helping the user easily discover similar research cases that are not yet known when conducting an exploration based on cases that are already known to the user. 22
Aigner et al.23,24 proposed an exploration system that can classify cases of time-oriented visualization techniques using variables such as data properties, temporal properties, and visual representations such as dimension and mapping.
This system uses a taxonomy for analyzing time-oriented data, and when a user selects a variable, suitable cases are filtered. The system also provides representative images and summaries of the studies, as shown in Figure 5.

Schulz25,26 used variables such as dimensionality or tree-based representations along with layout alignment methods to classify tree-oriented visualization cases. Moreover, the cases analyzed using the corresponding variables were presented in a graphical user interface (GUI) format, similar to the exploration system proposed by Aigner et al.
In the study of Kucher et al., 2 cases of sentiment visualization from the last 15 years were presented, and the characteristics of sentiment visualization research found in each case were categorized and organized. 27 Further, the system handled cases of sentiment visualization in which analysis tasks such as polarity analysis, subjectivity detection, opinion mining, sentiment analysis, and stance analysis were performed. The correlation between categories was examined, and a method of recommending appropriate visualization cases to sentiment visualization researchers was proposed.
Certain systems employ a GUI with configurations distinct from those utilized in the previously introduced exploration system.
Beck et al.28,29 proposed an exploration system that could survey dynamic graph cases, as shown in Figure 5, and which expressed the keywords, authors, and publication series that were mainly dealt with in the case to be analyzed in a word cloud while expressing the frequency of studies and the number of citations of studies over time as bar charts. Cases with similar research subjects were also clustered and presented.
As shown in Figure 6, Kerren et al. 30 surveyed biological data visualization techniques using classification criteria such as types of data, properties of data, and tasks, and then expressed the relationships between cases with similar research topics as a network. 31

In research investigating exploration systems in the field of data visualization, taxonomies have been used to analyze the collected cases. Moreover, a faceted search system has been introduced in the exploration system, which presents a list of representative images from the introduced cases and highlights the characteristics of each study in a keyword format. It also displays the similarity relationships and frequency counts among the research.
In the exploration system for sentiment visualization, correlations between data analysis and work categories are identified, and similar examples of sentiment visualization are selected and displayed. However, an exploration system that allows for a step-by-step examination of the overall research process involved in data visualization has not yet been developed. Furthermore, an effective method for comparing the frequency of appearance of various research keywords (constituting elements of the taxonomy) in sentiment visualization cases where visual metaphors are used, broken down by the process, has not yet emerged.
MetaphorVis explorer: design process and functions
Construction of taxonomy
In this study, a taxonomy was utilized to organize the visual metaphors that appear in sentiment visualization cases. 14 Tables 2–6 present the compositions and definitions of the sub-elements for the five categories (major categories) constituting the taxonomy.
Taxonomy of “Target.” 14
Taxonomy of “Intermediation.” 14
Taxonomy of “Representation.” 14
Taxonomy of “Visual Variable.” 14
Taxonomy of “Visualization Techniques.” 14
The method used to establish the taxonomy in the previous research can be summarized as follows. 14 Initially, collaboration with four experts specializing in sentiment visualization, data analysis, linguistics, and language engineering was undertaken. These experts presented diverse opinions and perspectives regarding the concept of visual metaphors and engaged in discussions. Subsequently, a consensus was reached, establishing the concept as the process of “replacing the data intended for analysis with readily understandable representations, guided by specific context and motivations.”
Next, sentiment visualization cases utilizing visual metaphors were gathered from scholarly papers, technical reports, posters, infographics, and other sources. In-depth analysis of the sentences related to visual metaphors employed in each study was conducted. During this process, core words and phrases related to five key elements were identified: “Target (the sentiment data),” “Intermediation (the case’s design goals, challenges, strategies, tasks, motivations, and background),” “Representation (explanation of the principles underlying the elements or processes replaced for visual metaphor),” “Visual Variable (shapes, sizes, colors, etc., employed in the representation),” and “Visualization Techniques (bar charts, pie charts, networks, etc., used for understanding visual metaphors and adding meaning to the representation).”
Based on the extracted information, five categories were derived. To define and extract sub-elements for each category, a comprehensive examination of literature related to each category was conducted.
Initially, the study of Liu 32 was referenced to create a taxonomy related to the target. In that study, the opinion quintuples constituting the subject of analysis were defined by sentiment analysis, and they consisted of: “entity, aspect, sentiment, opinion holder, and time”; these were used as sub-elements in the present study.
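The opinion quintuple can be pictured as a simple record type. The following is a minimal sketch, assuming one field per component; the class name, field types, and example values are hypothetical and serve only to show how the five sub-elements of the target fit together.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass(frozen=True)
class OpinionQuintuple:
    """One opinion, with the five components used as target sub-elements."""
    entity: str          # the object being evaluated, e.g. a product
    aspect: str          # the part or attribute of the entity under discussion
    sentiment: str       # polarity or emotion label (label set is an assumption)
    opinion_holder: str  # who expressed the opinion
    time: date           # when the opinion was recorded


# Hypothetical values, loosely following the "camera" illustration
# given for the study of Xu et al.
opinion = OpinionQuintuple(
    entity="camera",
    aspect="image quality",
    sentiment="positive",
    opinion_holder="reviewer_01",
    time=date(2020, 1, 1),
)

print([f.name for f in fields(opinion)])
```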
Next, to create a taxonomy for intermediation, the purpose and motivation were constructed around the main analysis tasks necessary for sentiment analysis. The analysis tasks classified at this time were “detection, summarization, classification, comparison, and exploration”; these were used as sub-elements in the present study. 33
Representations can be classified into two broad categories: those appearing as a specific individual or object and those appearing as a process, action, or phenomenon. In this study, the former were referred to as element metaphors whereas the latter were referred to as process metaphors; both were used as sub-elements of the taxonomy. 34
Moreover, referring to the classification of the Vienna Convention Code, the sub-elements belonging to each sub-element were subdivided into “natural objects/artificial objects” and “natural processes/artificial processes.” 35
To create a taxonomy for visual variables, Bertin’s 36 visual variable system was referenced for the concepts of “contrast, position, size, and length.” Wong’s 37 study was referenced for the concepts of “contrast and distance,” while Carpendale’s 38 study was referenced for the concepts of “color and location.” Moreover, Roth’s 39 study was referenced for the concepts of “contrast, size, shape, and direction.”
After synthesizing the relevant studies, we selected “contrast, color, size (width, height, length), shape, location, direction, and distance” as sub-elements. The corresponding sub-elements can be subdivided to assess which sentiment information is represented by each changed element when the appearance of the representation is changed.
Among the various visualization techniques introduced in the studies of Chi (21 techniques) and Chengzi et al. (68 techniques), there are 17 representative visualization techniques that are frequently used in the field of sentiment visualization.13,40
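The taxonomy described in this section can be summarized as a small lookup structure. This is a sketch under stated assumptions: it lists only the sub-elements named in the text above (the full definitions are in Tables 2–6), the 17 visualization techniques are left unlisted because they are not enumerated here, and the helper `is_valid_label` is hypothetical.

```python
# Sub-elements named in this section; not a substitute for Tables 2-6.
TAXONOMY = {
    "target": ["entity", "aspect", "sentiment", "opinion holder", "time"],
    "intermediation": ["detection", "summarization", "classification",
                       "comparison", "exploration"],
    # Each representation type is further split into natural and artificial.
    "representation": ["element metaphor", "process metaphor"],
    "visual variable": ["contrast", "color", "size", "shape",
                        "location", "direction", "distance"],
    # The 17 techniques are not enumerated in this section, so the list
    # is left empty rather than guessed.
    "visualization technique": [],
}


def is_valid_label(category, sub_element):
    """Return True if sub_element is listed under category in TAXONOMY."""
    return sub_element in TAXONOMY.get(category, [])


print(is_valid_label("target", "aspect"))
print(is_valid_label("target", "color"))
```

A check of this kind could be used when assigning labels to extracted keywords, so that every label refers to a sub-element that actually exists in the chosen category.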
Data collection and analysis of visual metaphors
To collect examples of sentiment visualization using visual metaphors, we used IEEE TVCG, which is an international academic journal in the field of data visualization and computer graphics; the Computer Graphics Forum; and conference proceedings from conferences such as IEEE-VIS, EuroVis, PacificVis, ACM-CHI, and Siggraph. We searched for studies using the keywords of “visual metaphor,” “sentiment analysis,” and “sentiment visualization,” and we reviewed “whether studies used sentiment analysis,” “whether the design strategy, research purpose, and specific grounds for the formation of visualization are presented when data is visualized,” and “whether there is an object expressed as a metaphor.”
As a result, 60 cases were found in total (see Figure 7). Subsequently, sentences containing expressions encompassing the various sub-elements within the five categories of the taxonomy were extracted by crawling through papers, presentation materials, and video subtitles from these 60 cases, resulting in the collection of 433,350 sentences.

Representation elements of visual metaphors appearing in 60 cases (Enlarged ver.: https://image-indol-alpha.vercel.app/).
These collected sentences served as the basis for the analysis of the research processes underpinning sentiment visualization employing visual metaphors in each case. For the analysis of visual metaphors, we again collaborated with the four experts, specializing in sentiment visualization, data analysis, linguistics, and language engineering, who had been invited during the preliminary development of the taxonomy. These experts curated, evaluated, and discussed in depth the narrative aspects of the visual metaphors in each study.
Keywords and phrases linked to visual metaphors were culled from the collected sentences, and particular words or phrases were isolated and assigned labels corresponding to the sub-elements within the taxonomy’s five categories.
In analyzing the collected sentences using the taxonomy, we referred to the studies of Liu et al. 41 and Federico et al. 42
An example of visual metaphor analysis based on the five categories is presented in Table 1. Table 1 lists, for each category, the words and phrases mentioned in the actual papers together with the keywords related to the sub-elements (indicated in parentheses).
On average, approximately eight visual metaphors were analyzed for each case, resulting in a final total of 511 visual metaphors being analyzed and utilized in the exploration system.
Visualization
The exploration system used in this study provides network visualization43–46 and Sankey diagram visualization47–51 to facilitate the overall interpretation of visual metaphors in sentiment visualization cases.
The network was added to help users understand the hierarchical structure and relationships of the sub-elements belonging to the five categories constituting the visual metaphor, and the Sankey diagram was added to let users freely follow and interpret the visual metaphor method in sentiment visualization according to the context. Details are presented in Sections “Network” and “Sankey Diagram.”
Network visualization
The network visualization provided in this study helps interpret semantic relationships and clusters among sub-elements belonging to the five categories, which are the main criteria of the taxonomy.
The network consists of four types, with consideration of the order of interpretation of visual metaphors and types of sentences in sentiment visualization cases: “target–intermediation relationship,” “intermediation–representation relationship,” “representation–visual variable relationship,” and “representation–visualization technique relationship.”
The data for network visualization were generated as follows. First, to capture the similarity relationships that determine connectivity between sub-elements, the co-occurrence frequency between sub-elements of the taxonomy was counted and used to fill a similarity matrix.
For example, in the visual metaphor sentence, “Sunflower petals are used to analyze how the original tweet left by the tweeter is retweeted and the diffusion process over time,” 16 the targets are “tweet users, original tweets, and retweets”; “timeline analysis to show the diffusion process” is an intermediation with the task of exploration; and “sunflower petals” are representations. According to the information in the sentence, the targets—such as tweet users, original tweets, and retweets—are used for time series analysis, and the frequencies of “tweet user-time series analysis” and “emotion-time series analysis of original tweets, retweets, etc.” are determined. According to the information that sunflower petals are used for time series analysis, the frequency is determined for “time series analysis-plants.”
Thus, a similarity matrix was constructed by quantifying the frequency between all sub-elements found in the 511 visual metaphors. Because four networks were provided in this study, four 2D matrices were created, one per network, from the frequency values of the data.
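The counting step described above can be sketched as follows. This is a minimal illustration rather than the system's implementation: the record layout, the function name `cooccurrence_matrix`, and the two sample metaphors are assumptions made for the example, with sub-element labels borrowed from the sunflower example.

```python
from collections import Counter

# Hypothetical annotated metaphors: each lists the sub-elements found per category.
metaphors = [
    {"target": ["tweet user", "original tweet", "retweet"],
     "intermediation": ["timeline analysis"],
     "representation": ["plant"]},
    {"target": ["opinion"],
     "intermediation": ["timeline analysis"],
     "representation": ["plant"]},
]

def cooccurrence_matrix(metaphors, row_category, col_category):
    """Count how often each pair of sub-elements appears in the same metaphor."""
    counts = Counter()
    for m in metaphors:
        for r in m.get(row_category, []):
            for c in m.get(col_category, []):
                counts[(r, c)] += 1
    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})
    # A Counter returns 0 for pairs that never co-occur.
    matrix = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, matrix

# One 2D matrix per network, e.g. the target-intermediation network.
rows, cols, matrix = cooccurrence_matrix(metaphors, "target", "intermediation")
```

Calling the same function with the other category pairs would yield the remaining three matrices.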
The resulting data in the four two-dimensional matrices primarily consisted of values of 0 or relatively low weights ranging from 1 to 6. However, a few outliers with higher values, such as 18–20, were observed. These outliers had the potential to disrupt the overall trends in the data. To address this issue and provide robust handling of outliers, z-score normalization was performed. Robustness, in this context, refers to resilience against abnormal values, so that they do not distort the overall trends in the data. After this normalization process, the final similarity values were calculated, and the network visualizations were created.
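As a rough sketch of this step, the standard z-score z = (x − μ) / σ can be applied to a frequency matrix as shown below. The text does not state whether the mean and standard deviation were taken over the whole matrix or per row, so normalizing over all cells is an assumption here, as are the function name `zscore_matrix` and the sample weights.

```python
from statistics import mean, pstdev

def zscore_matrix(matrix):
    """Standardize every cell using the mean and population std of the whole matrix."""
    values = [v for row in matrix for v in row]
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All cells are equal, so there is no spread to normalize.
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - mu) / sigma for v in row] for row in matrix]

# Hypothetical weights: mostly 0 or small values, plus one outlier (20).
weights = [[0, 1, 0],
           [2, 20, 1],
           [0, 6, 0]]
normalized = zscore_matrix(weights)
```

After normalization the cells have zero mean and unit variance, so the outlier is expressed relative to the spread of the other weights rather than as a raw count.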
Overviews of the four networks are presented in Figures 8–11, which respectively show the “target–intermediation relationship,” “intermediation–representation relationship,” “representation–visual variable relationship,” and “representation–visualization technique relationship.”

Network visualization provided by the exploration system: target-intermediation network.

Network visualization provided by the exploration system: intermediation-representation network.

Network visualization provided by the exploration system: representation-visual variable network.

Network visualization provided by the exploration system: representation-visualization technique network.
The nodes in the network represent sub-elements in the main category of the taxonomy. The unique colors of the nodes represent the five categories: green for the target, pink for the intermediation, blue for the representation, yellow for the visual variable, and orange for the visualization technique.
The relationship between the sub-elements included in each category was expressed as a network link, while the thickness of the link reflected the frequency of metaphors that appeared. If a research method using similar visual metaphors appeared in several cases, the nodes of the sub-elements constituting the related metaphors were configured to form a network so that they could be clustered together.
Specifically, the colored ellipses provided by the network were created using the “contextual clustering process,” 52 which groups sub-elements into the same cluster when two or more of them appear in a sentence representing one visual metaphor.
If a colored ellipse is created using the corresponding method, then contextual interpretation can be achieved by using the nodes of the network included in the area of the ellipse.
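One simplified reading of this grouping rule is sketched below: sub-elements that appear in the same metaphor sentence are merged into one cluster with a union-find structure. This is not the cited clustering process itself. The function name `cluster_by_cooccurrence`, the `min_count` threshold, and the sample sentences are all assumptions introduced for illustration.

```python
from collections import Counter
from itertools import combinations

def cluster_by_cooccurrence(sentences, min_count=1):
    """Group sub-elements that co-occur in at least `min_count` metaphor sentences."""
    pair_counts = Counter()
    for elements in sentences:
        for a, b in combinations(sorted(set(elements)), 2):
            pair_counts[(a, b)] += 1

    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for elements in sentences:
        for e in elements:
            find(e)  # register every sub-element, including isolated ones
    for (a, b), n in pair_counts.items():
        if n >= min_count:
            parent[find(a)] = find(b)

    clusters = {}
    for e in list(parent):
        clusters.setdefault(find(e), set()).add(e)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])

# Hypothetical sub-elements extracted from three metaphor sentences.
sentences = [
    ["key player", "opinion", "argument detection"],
    ["stance", "argument detection"],
    ["emotion", "polarity classification"],
]
clusters = cluster_by_cooccurrence(sentences)
```

Each returned cluster corresponds to the set of nodes that one colored ellipse would enclose. Raising `min_count` keeps weakly connected sub-elements apart.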
In the “target-intermediation network,” the colored ellipse indicates “a combination of targets that become metaphorical materials to meet a specific purpose.” In the “intermediation-representation network,” the colored ellipse shows the “pattern of frequently used representations according to a specific intermediation.” In the “representation-visual variable network,” the colored ellipse shows “to which representations the visual variables showing similarity in close proximity are mainly applied.” In the “representation-visualization technique network,” the colored ellipse shows “what kinds of representations are used together with which visualization techniques that show similarity in close proximity to convey sentiment information more meaningfully.”
Figure 8 shows the network of targets and intermediations. In the main clusters of the network, the targets of “key player, psychology, stance, and opinion” are connected to “argument expression detection and analysis of debates comments and argumentation” (see T-I1). This means that, to achieve the purpose of detecting and analyzing arguments, the cases in our system need to have data on the opinions, positions, and psychology of key players. The target “emotion” is connected to the intermediations of “emotion detection and classification, polarity classification” (see T-I2). Targets such as “composite attributes, derived attributes” are connected to the intermediations related to “aspect-based sentiment analysis,” which shows that the features of the target are applied comprehensively in aspect-based sentiment analysis (see T-I3).
Various targets related to “time (continuity, start/destruction, accumulation, repeat, growth/contraction, peak/valley)” are connected to “timeline analysis,” which is a time-related exploration task. Also, targets such as “public, attitude, service, community, thought, continuity, growth, and contraction” are connected to the intermediations related to “finding significant, easy exploration of sentiment information, and comparison of different sentiments” (see T-I4).
Figure 9 introduces the intermediation-representation network—which plays a key role in the visual metaphor among the four networks—as an example.
The main clusters in the network are as follows. First, various representations are used for “time series analysis and easy exploration of sentiment information,” and in particular, representations such as “geometric figures, natural landscapes, plants, and natural phenomena” are used the most (see I-R1). For the “aspect-based sentiment analysis,” each attribute of the target is metaphorized using representations such as “food, celestial bodies, and cells” (see I-R2). For the “classification of individual sentiments and emotions,” opinion holders are expressed as “fabrics (sewing patterns)” or “animals,” or emotional words are expressed as “pictures” (see I-R3). Lastly, for “sentiment analysis related to debate,” the representations “compound (bubble), building, and structure” are used frequently (see I-R4).
Figure 10 shows the network of representations and visual variables. The main clusters in the network are outlined below. First, for most of the representations, the visual variables that change most frequently are “color and size” (see R-VV1).
Also, representations such as “patterns, buildings and structures, non-metal, maps, and celestial” sometimes change their “position, value, and shape” (see R-VV2). Furthermore, representations such as “geometry, machine, and plant” are sometimes placed with “orientation and distance” (see R-VV3).
Figure 11 shows the network of representations and visualization techniques. The main clusters that appear in the network are as follows.
First, “radar charts and MDS maps” are often mixed with representations of “buildings and structures” (see R-VT1).
“Box plots and tag clouds” tend to be used with representations such as “maps, patterns, and letters” (see R-VT2). “Node-link diagrams, bubble charts, space-based visualizations, and area charts” are used with “maps, patterns, geometry, cells, and compounds” (see R-VT3), and “pie charts, line charts, and scatter plots” are used with “maps, geometry, and pictures” (see R-VT4). This shows that “maps, geometry, and patterns” are often used together with the majority of the visualization techniques in our taxonomy (9 out of 17).
Finally, “time-oriented visualization” is used with representations of process metaphors such as “space-time movement, natural phenomena, creation and destruction, and software,” as well as with representations such as “metals, machine, and pictures” (see R-VT5).
Sankey diagram
The Sankey diagram provided in this study can be helpful for sequentially analyzing visual metaphors in sentiment visualization using five categories or identifying the most frequently used visual metaphors. In the exploration system, the corresponding types are labeled “target theme” and “representation theme,” respectively. To manage the visual metaphors appearing in each case independently of each other, process variables are added to configure the data. Each process variable is of the format “study title-number of visual metaphors appearing.”
Figure 12 shows the result of visualizing the Sankey diagram according to the node and link attribute data described above. The Sankey diagram has five axes: The study case axis is located on the far left, while the other four axes are located in the five categories of visual metaphors.

Overview of the Sankey diagram. 53
The nodes constituting each axis represent sub-elements of the taxonomy. To explicitly express the sub-elements of the taxonomy, the colors of the links constituting the Sankey diagram are meant to be distinguishable according to the sub-element. The height of each node depends on the number of metaphors passing through it.
A link in the Sankey diagram is a line that connects the axis of the sentiment visualization cases and the category axes that constitute the visual metaphor. Moreover, when visual metaphors are classified as different types because their representations differ but share the same target or intermediation, the shared link is weighted by the number of metaphors passing through it. Further, for a given axis, the link from the previous axis (for example, the target axis) and the link to the next axis (for example, the representation axis) have the same color, so that the visual metaphors appearing in a case can be understood at a glance according to their color.
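The node heights and link weights described above can be derived from per-metaphor records, as in the following sketch. It is an illustration under assumptions: the record layout, the function name `build_sankey`, the reduced set of axes, and the process-variable labels such as "StudyA-1" (study title followed by a metaphor number) are hypothetical.

```python
from collections import Counter

# A subset of the axes, for brevity; the remaining category axes work the same way.
AXES = ["case", "target", "intermediation", "representation"]

# Hypothetical records; "case" holds the process variable of each metaphor.
metaphors = [
    {"case": "StudyA-1", "target": "opinion",
     "intermediation": "timeline analysis", "representation": "plant"},
    {"case": "StudyA-2", "target": "opinion",
     "intermediation": "timeline analysis", "representation": "pattern"},
    {"case": "StudyB-1", "target": "emotion",
     "intermediation": "timeline analysis", "representation": "landscape"},
]

def build_sankey(metaphors, axes):
    """Return node heights and link weights counted over all metaphors."""
    node_heights = Counter()
    link_weights = Counter()
    for m in metaphors:
        for axis in axes:
            node_heights[(axis, m[axis])] += 1
        # Links only connect adjacent axes.
        for src, dst in zip(axes, axes[1:]):
            link_weights[((src, m[src]), (dst, m[dst]))] += 1
    return node_heights, link_weights

node_heights, link_weights = build_sankey(metaphors, AXES)
```

Because each metaphor has its own process variable, the first two records stay separate on the case axis even though they share the same target and intermediation, while the link they share between those two axes receives a weight of 2.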
Figure 13(a) shows all the links connected to the corresponding node when the “opinion summarization” node is clicked on each axis of the Sankey diagram. Here, the colored links represent visual metaphors that have the intermediation of “summary of opinions.” In this study, to make the visual metaphor appear naturally connected, as shown in Figure 13(a), when a node or link is selected, all links related to the selected part are located at the top of the node. Moreover, the coloring interaction is improved by using the “traceable multi-level feature” 54 to display the entire link of metaphors related to the selected option when the user selects the desired link.

(a) Links that are colored when the “opinion summarization” node is clicked on each axis & Zoomed-in view, links are arranged in ascending order so that visual metaphors are connected and interpreted and (b) View when a link is clicked in the Sankey diagram.
Figure 13(b) shows how all the links in the visual metaphor related to the clicked link are colored when one link is clicked in a certain section. The same filtering interaction is applied when a node in the Sankey diagram is clicked. Applying the two methods shown in Figure 13(a) and (b) allows for the entire data flow to be traced, even in a Sankey diagram with multiple axes.
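The two selection modes can be sketched as simple filters over metaphor records. This is a hedged illustration, not the system's code: the record layout, the function names, and the sample data are assumptions.

```python
AXES = ["target", "intermediation", "representation"]

# Hypothetical metaphor records with one sub-element per axis, for brevity.
metaphors = [
    {"id": "A-1", "target": "opinion",
     "intermediation": "opinion summarization", "representation": "compound"},
    {"id": "A-2", "target": "opinion",
     "intermediation": "timeline analysis", "representation": "plant"},
    {"id": "B-1", "target": "emotion",
     "intermediation": "opinion summarization", "representation": "pattern"},
]

def select_by_node(metaphors, axis, value):
    """Metaphors that pass through the clicked node."""
    return [m for m in metaphors if m[axis] == value]

def select_by_link(metaphors, src_axis, src_value, dst_axis, dst_value):
    """Metaphors that pass through both ends of the clicked link."""
    return [m for m in metaphors
            if m[src_axis] == src_value and m[dst_axis] == dst_value]

def links_to_color(selected, axes=AXES):
    """All adjacent-axis links of the selected metaphors, so the full flow is traced."""
    links = set()
    for m in selected:
        for src, dst in zip(axes, axes[1:]):
            links.add(((src, m[src]), (dst, m[dst])))
    return links

clicked = select_by_node(metaphors, "intermediation", "opinion summarization")
colored = links_to_color(clicked)
```

Coloring every link of each selected metaphor, rather than only the links touching the clicked element, is what lets the flow be followed across all axes.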
The Sankey diagram offers a valuable feature, enabling users to efficiently compare the frequency of each sub-element connecting adjacent categories based on the specific category they intend to analyze, as indicated by the thickness of the links.
For example, as depicted in Figure 14, when considering the objective and motivation behind “comparison of different sentiments,” it becomes apparent that sentiment data (target) used for intermediation exhibits a higher frequency of categories such as “sentiment (opinion, appraisal, stance) and opinion holder (writer, key player, public),” rather than “entity (person, behavior, psychology, event) and aspect (simple, composite, derived).” Furthermore, for the purpose of comparison, natural elements like “plant (e.g. tree, sunflower), landscape (e.g. cascade, forest, river) and compound (e.g. DNA structure, bubble),” as opposed to artificial elements like “building & structure (e.g. bridge, floor) and geometry (e.g. block, circle)” are similarly used. The Sankey diagram, in this way, allows users to easily compare the frequency of sub-element keywords used in visual metaphors within each study by observing the thickness of filtered links after clicking on nodes for their desired sub-element. Moreover, by following the flow of links that connect the elements of each category, users can understand the overall research content, context, and causality of the visual metaphors used in sentiment visualization. This differentiates our system from traditional faceted search systems that list the characteristics of each case in a keyword (or independent dimension) format.

Sankey diagram facilitates comparing the frequency of sub-elements between adjacent categories using link thickness.
The data structure used in the exploration system and algorithms for implementing the Sankey diagram are presented on
Exploration system layout and interaction
This section discusses the overall layout composition and interaction of the exploration system, which was implemented using the JavaScript-based React framework. 55
In addition to using the visualization library D3.js, 56 we developed additional functions in-house to enhance the utility of the visualization elements and the quality of the design elements. Figure 15 shows an overview of the exploration system.

MetaphorVis: (a) Network View, (b) Sankey Diagram View, and (c) Paper View (System URL: https://hm00081.github.io/metaphorVis/) (Video URL: https://www.youtube.com/watch?v=XLNXB8RLWTI). 53
Figure 15(a) shows the Network View, which provides four networks through the drop-down menu and helps the user explore the relationships between the desired main categories. For each network, a legend is constructed that shows the main categories of the nodes. In the network, the parts whose semantic relationships can be interpreted are clustered, and each corresponding area is expressed as a colored ellipse. As shown in Figure 16, when a user clicks a specific clustering area in the network, the visual metaphor links in the Sankey diagram that pass through all the nodes included in the clicked area are linked and colored.

Links to related visual metaphors are linked and colored in Sankey diagrams when the color ellipse area is clicked in the Network View.
Figure 15(b) shows the Sankey Diagram View, where two types of interactions can be performed. The first is an interaction in which the corresponding link is colored when selecting an analysis type provided by the system in advance.
The types of analysis include the “target theme,” for example, “cases of using political data, evaluating products or services, and using people’s comment data on specific events,” and the “representation theme,” for example, “cases in which rivers are used as representations and wheels are used as representations.” These two themes were selected based on the high frequency of target and representation in the 511 metaphors covered in this study. This interaction has the effect of enabling users to quickly trace the desired metaphors. The second is an interaction that filters all metaphors that include the clicked part when the user directly clicks on a node or link. This interaction has the effect of easily tracking and showing the metaphors desired by the user.
Figure 15(c) shows the Paper View, where representative images of cases related to the visual metaphor that have been finally selected in the network are displayed along with the Sankey diagram.
User scenario
In this section, the study presents a systematic guide on how to access and operate the exploration system. It explains the process of achieving specific objectives through user scenarios; these scenarios can be broadly divided into two types:
Scenario 1: Identifying the intermediation, representations, visual variables, and visualization techniques frequently used in the theme after selecting a theme suitable for a given target from the data
This scenario assumes that Researcher A, who has collected and processed data from politicians’ statements in TV debates, is looking for a frequently used visual metaphor to effectively represent “the appearance of politicians’ debates in data.”
(1) First, Researcher A clicks the “Politician’s Speech” button related to the data held in the target theme. After clicking the corresponding button, the user can check which intermediation is mainly used when the politician’s speech is the target by examining the nodes through which the colored links pass on each intermediation axis of the Sankey diagram, as shown in (1) in Figure 17. Researcher A confirms that the relevant intermediations are “analysis of discussion, commentary, and debate” and “detection of argumentation expression.”
(2) Next, Researcher A selects “Int-Rep Network” in the Network View to find the representations that match the selected analysis purpose. The user clicks on the yellow ellipse area that includes the two previously mentioned intermediations, as indicated in part (2) of Figure 17. In metaphors involving the intermediations of “analysis of discussion, commentary, and debate” and “detection of argument expression,” the representations “compound, building, and structure” are often utilized.
(3) After reviewing the Network View, Researcher A proceeds to the Sankey Diagram View to identify frequently used visual metaphors. Here, Researcher A can confirm that representations of building and structure are commonly employed. Researcher A clicks on the “buildings and structure” node in the Sankey diagram, as shown in part (3) of Figure 17, to explore visual metaphors where “building and structure” are used as representations.

Interaction process in Scenario 1.
Then, the Paper View is used to view the images of the cases that have ultimately been extracted. It can be seen that “bridge (CAA20 57), pipe (LWW13 58), and wheel (ASG21 59)” are frequently used, as shown in (4) in Figure 17.
(4) Finally, Researcher A examines the links extending from the representation axis to the visual variable axis using the Sankey Diagram View. As a result, “buildings and structures” are found to use variables such as color, size, and shape. Moreover, examining the links extending from the representation axis to the visualization technique axis reveals that “buildings and structures” are used together with radar charts.
As such, frequently used representations, visual variables, and visualization techniques that are identified by examining the visual metaphors of sentiment visualization cases that have a subject theme similar to that of the data of Researcher A can be used as ideas in planning sentiment visualization.
Scenario 2: Exploring the purpose of visualization and the target based on the representation to use
The second scenario is based on the premise that Researcher B—who intends to use “decorative patterns” as representations of visual metaphors—uses the exploration system to determine what data and purpose should be visualized.
(1) First, Researcher B selects “decorative pattern,” the representation that will be used, in the representation theme, as shown in (1) in Figure 18.
(2) After clicking “Decorative Pattern” in the corresponding theme, the intermediations connected to the representation of the decorative pattern in the Sankey diagram and the sub-elements of the target are checked to determine the research direction. As indicated in (2) in Figure 18, the most frequently used intermediation is “finding important people, opinions, facts, stances, and attitudes, summarizing them, and portraying sentiment information at various levels,” while “opinion” is a commonly used target.
(3) Researcher B clicks the Rep-Var Network, which shows the relationship between representations and visual variables in the Network View, as shown in (3) in Figure 18. The Sankey Diagram View can then be used to check which visual variables are used when the decorative pattern is the representation, as shown in (4) in Figure 18.

Interaction process in Scenario 2.
As a result of this process, Researcher B finds that decorative patterns are varied through changes in “contrast, shape, and position,” and that the visual variables with the highest connectivity are “shape and position.”
To determine what type of visualization technique is employed for sentiment visualization cases using decorative patterns, representative images are checked in the Paper View. This yields the examples of “CSL12, 16 FCF09, 60 YSK14, 61 and CSL16” 62 presented in (5) in Figure 18.
(4) Finally, considering the links in the colored Sankey diagram between the representation and visualization technique axes, various visualization techniques—such as “node-link diagrams, box plots, time-oriented visualizations, and space-based visualizations”—are used together with the decorative-pattern representation, as shown in (4) in Figure 18.
(5) After performing the above steps, Researcher B mainly uses “opinion” data to plan sentiment visualization with decorative patterns. To accomplish the intended purpose of the present research, “finding important figures, opinions, facts, stances, and attitudes, summarizing opinions, and describing sentiment information at various levels” can be considered. Moreover, by clicking on the node of the representation to use in the Sankey diagram and checking the visual variables and visualization techniques that show connectivity, the researcher can easily select the visual metaphor method that they should use.
Case study
In this section, a detailed explanation is provided regarding the analysis of the various features of visual metaphors uncovered by the exploration system. The case studies are categorized into three main types.
Deriving visual metaphors by frequently appearing types in exploration system
To identify the most frequently appearing visual metaphors by type, the links in the Sankey diagram were analyzed for all cases of sentiment visualization that were included in the exploration system.
As a result, six types of visual metaphors were found to appear most frequently, and these are listed in Table 7.
Six types of visual metaphors most frequently appearing in the exploration system, for each type, analyzed in order from the target to the visualization techniques.
The following describes how the visual metaphors presented in Table 7 are interpreted by connecting them to each other according to the five categories of interest.
Taking type 1 as an example, “metaphor of opinions about products or services with compounds and decorative patterns” appears 53 times out of the 511 visual metaphors collected. When the target is “opinion on a product or sentiment on a service,” an intermediation for “summarizing opinion or sentiment data” is utilized.
For the intermediation, opinions or sentiment information are metaphorically expressed as “bubbles” or “decorative patterns.” A visual variable that changes the size of the bubble according to the number of opinions collected or changes the color of the bubble according to the polarity of sentiment is also applied. Using “bubbles” and “decorative patterns” together with tag clouds and bubble chart visualizations allows for summarized opinion information to be viewed easily. By utilizing the visualization provided by the exploration system, users can comprehend the overall content of the six most frequently used metaphor types through the analysis method described above.
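A frequency ranking of this kind can be sketched by treating the five category values of each metaphor as one signature and counting identical signatures. This is an illustrative assumption about how such a ranking could be computed: the field names, the function name `most_frequent_types`, and the sample records are hypothetical.

```python
from collections import Counter

CATEGORIES = ("target", "intermediation", "representation",
              "visual_variable", "technique")

def most_frequent_types(metaphors, n):
    """Rank metaphor types by how many metaphors share the same five-category signature."""
    signatures = Counter(tuple(m[c] for c in CATEGORIES) for m in metaphors)
    return signatures.most_common(n)

# Hypothetical records: three metaphors of one type, two of another, one of a third.
bubble = {"target": "opinion on product", "intermediation": "opinion summarization",
          "representation": "compound", "visual_variable": "size",
          "technique": "bubble chart"}
river = {"target": "opinion", "intermediation": "timeline analysis",
         "representation": "landscape", "visual_variable": "color",
         "technique": "time-oriented visualization"}
wheel = {"target": "emotion", "intermediation": "comparison of different sentiments",
         "representation": "structure", "visual_variable": "shape",
         "technique": "radar chart"}
metaphors = [bubble] * 3 + [river] * 2 + [wheel]

top = most_frequent_types(metaphors, 2)
```

In practice a metaphor may carry several sub-elements per category, in which case the signature would need to be a tuple of sorted sets or a coarser grouping. The one-value-per-category layout here is a simplification.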
Interpreting associated visual metaphors after selecting intermediation that appears most frequently in exploration system
This section introduces visual metaphors that pass through the top three frequently used intermediations according to the Sankey diagram. The three intermediations are “timeline analysis, easy exploration of sentiment information, and comparison of different sentiments.”
The visual metaphors mediated by “Timeline Analysis” are shown in Figure 19. In these metaphors, accumulation and continuity, which are time-related subjects of analysis, appear often, and opinions and evaluations collected over time are used together. The spread of opinions over time is represented by natural phenomena such as the blooming of flowers (CSL12, 16 XD99 63), the growth of trees and vines (JSM16, 64 YCC20 65), and the flow of rivers (YSK14, 61 MDD10, 66 GYS14, 67 DJM12 68). Using sewing work (GGS12TPS 69), the construction of bridges (CAA20 57), and the operation of space–time tunnels (FAKM15 70), “process metaphors” are employed as representations to visualize opinions or evaluations exchanged during a specific period. Further, the accumulation of time is shown by arranging or stacking geometric figures such as hexagons (SCS16 71), colored dots (SCS17, 72 BN11B 73), and lines by section (CSL16 62).

View when the user clicks on the “Timeline Analysis (I1)” node in the Sankey diagram.
Figure 20 shows the visual metaphors mediated by “Easy Exploration of Sentiment Information.” Here, two or more sub-elements of the target are used for one visual metaphor; for example, the object, sentiment, and time are used together, or the opinion holder and sentiment are used together.

View when the user clicks on the “Easy Exploration of Sentiment Information (I2)” node in the Sankey diagram.
Moreover, various parts of the representation are used to metaphorize the target. For example, 2D geometric figures such as squares (HYZ13, 74 FCF09 60) and spirals (FZC18 75) are used along with maps of territories (EYC15 76), movement routes (SCS16, 71 SCS17 72), and city roads (SCS19 17) to represent the sentiment information generated by key players, while a double helix such as that of DNA (LLN14 77) and bubbles (VWH13, 78 MVM17, 79 PC15PV 80) are used to organize the sentiments of opinion holders by year, date, and time. Then, place-time-sentiment keyword topics are explored using three-dimensional (3D) geometric figures, such as cubes (LJC18 81).
The visual metaphor for “Comparison of Different Sentiments” uses performance opinion holders as the target, as shown in Figure 21. In the distribution of representations, the thicknesses of rivers (MDJW07, 82 XWS16, 83 GYS14, 67 YSK14 61), bubbles (MEV16, 20 RSRY14, 21 PC15PV, 80 JX17 18), and bridges (CAA20 57) are considered to examine the differences in the amount of information contained. The shapes of the petal (XD99, 63 CSL12 16) and the wheel (a type of structure) (GGS12ST, 69 LGX16, 84 YFS10 85) are used to compare the polarities of the comments on the original post by the opinion holder. Further, by changing visual variable parts such as size or color for representations, the amounts of information or connection relationships can be compared, or the identity of the opinion holder can be determined.

View when the user clicks on the “Comparison of Different Sentiments (I3)” node in the Sankey diagram.
Analyzing visual metaphors using system-provided themes
We analyze the visual metaphor with the theme of “Target and Representation.” For this analysis, we utilize the target theme button provided by the Sankey Diagram View, the nodes and links of the Sankey diagram visualization, and the Paper View. Among the themes that were not introduced in the user scenarios, sub-elements constituting visual metaphors are selected and introduced in five categories.
Target theme: Opinion on a specific event
Figure 22 shows the results of analyzing visual metaphors with the theme of “Opinion on a Specific Event.” The main targets used in the theme were “events, opinions, evaluations, authors, and institutions.” In other words, the targets are the opinions or evaluations that authors or organizations leave about specific events. The main intermediations that appear in the theme are “detection of events or evidence that cause sentiment patterns, summary of opinions, and easy exploration of sentiment information.”

Visual metaphors and case images under the theme of “opinion on a specific event” over time.
Analyzing the usage patterns of representations according to intermediation reveals that, for “detection of events or evidence that cause sentiment patterns,” the nucleobase pair within the structure of DNA (LLN14 77) is used to indicate the event that causes division when the double helix, which represents two communities, is divided and separated; labels with textual information are used to show events that occurred during specific intervals with a high frequency of comments or evaluations; a territory (SCS17 72) represents a cluster of tweets that mention a specific event; and the growth of vines (YCC20 65) is used to represent various incidents related to school violence over time. For “summary of opinions,” the flow of a river (MDD10, 66 DJM12 68) is used to accumulate tweets that mention a specific event by summarizing them by period; the comments left on the French presidential election debate web service (SRJ13 86) and the information of the author are summarized using bubbles; and the contents of tweets with hashtags related to a specific event are displayed using a rhythm game disk (MA16 87).
For “easy exploration of sentiment information,” bubbles representing opinions are accumulated and displayed efficiently in a limited space to allow for easy understanding of the amount and flow of opinions over time; the incidents and causes of school violence can be easily explored by examining the flow of the leaves and branches of the vine; and a spiral shape is used to indicate the author or organization of the article.
Regarding the relationships between representations and visual variables, when using vines, the visual variables appear as follows. Colors are assigned for the polarity of appraisal, a vine becomes longer when there are more issues for a specific event, and past events appear closer to the bottom than recent ones. When rivers are used, past and recent events are presented in dark and light green, respectively, and the river sizes change based on the number of topics for events accumulating at the same time. Next, when using a nucleobase pair as a representation, visual variables are presented as follows: The color is changed to indicate the polarity of the appraisal left for a particular event, the size is changed to indicate the number of comments and evaluations, and the bands in nucleobase pairs are expressed more diversely when more events occur over time.
When using decorative patterns in bubbles, the shape of the bubble is changed using profile pictures to represent the identity of the author and organization, and the size of the bubble is changed to indicate the number of comments and evaluations collected. When using maps, the visual variables are established as follows: when clustering the same opinions or evaluations, they are distinguished by the color of the territory, and when more people comment on a particular event, the map becomes larger. When using a rhythm game disk, the color of the disk is changed according to the hashtag.
Regarding the relationship between representations and visualization techniques, when using vines, because the relationship between leaves and branches is expressed in the form of connecting nodes and links, it is associated with the node-link diagram. Decorative patterns and maps are associated with related visualizations, as key sentiment information is represented as a tag cloud on the decorative patterns or over the territory of the map. Geometric shapes such as squares and spirals are associated with time-oriented visualizations to represent the information of opinion holders that changes over time.
Representation theme: Structure (wheel)
Figure 23 shows the results of the visual metaphor using the “wheel (GGS12ST, 69 LGX16, 84 YFS10, 85 FAKM15, 70 FA20, 88 WHWS12, 89 ASG2159)” among buildings and structures as a representation.

Figure 23. Visual metaphors and case images in the case of using the “wheel” as the representation.
Considering the relationship between the target and the intermediation, the target of places, opinions, evaluations, and authors is associated with comparisons between different sentiments. This implies a comparison of the evaluations or opinions left by the author about a specific place. The target of the opinions, sentiments, and authors is associated with the intermediation of sentiment detection and classification. This implies detecting and classifying the author’s sentiment in the text representing the sentiment information. Further, the target of opinions, evaluations, positions, attitudes, and sentiments is associated with the intermediation of the easy exploration of sentiment. This implies that the aim is to facilitate easy exploration after refining the expression of opinions, evaluations, positions, attitudes, and sentiments in the text data.
To satisfy the three intermediations introduced above, the wheel representation is used as follows: first, the internal structure of the wheel is mainly used for comparison between different sentiments. For example, to compare the authors’ ages, places of residence, and times of leaving comments for hotel services, a triangular structure is created and expressed inside the wheel, or the brand evaluation information left by an author is divided into five types and expressed as spokes.
For sentiment detection and classification, the rim of the wheel (i.e. the circumferential, ring-shaped part of the wheel) is used. For example, after sentiments and appraisals for food are categorized and organized, they are displayed on the rim of a wheel, or news text data are expressed on the rim whereas sentiments are displayed in glyphs on top. For easy exploration of sentiments, a representation with the entire outer appearance of a wheel that has been transformed is used.
For example, after the opinion, evaluation, and sentiment information is refined by polarity and time, the wheel’s appearance is transformed by displaying the wheel in a hierarchical order from the past to the present.
The visual variables associated with the wheel include the contrast, color, size, and position. Contrast is mainly used in the section where the frequency of comments is high or a conversion of sentiment occurs. Color is used to indicate the polarity of the appraisal, the type of sentiment, the strength of the sentiment, the age group of the opinion holder, and the identity of the brand. Size is adjusted to indicate the frequency of text related to opinions, evaluations, attitudes, and sentiments. Lastly, the glyphs constituting the wheel are shifted to position them according to the similarity relationship between the sentimental vocabularies expressed on the wheel.
Visualization techniques related to the wheel include tag clouds and box plots. A tag cloud is used to indicate the context of the main sentiment keywords and the text at the center or rim of the wheel, and box plots are used to show the frequency of the collected sentiment keywords.
Evaluation
We conducted an experiment with participants to validate the usability of the exploration system, and the results are presented in this section. In the verification experiment, we measured the degree of understanding of the visual metaphor derived from the exploration system and summarized the strengths and weaknesses of the system collected during the process. After measuring satisfaction with the overall system’s functionality and user interaction, we summarized the results.
Experimental design
The data for the verification experiment were collected over a span of 8 days, from October 20 to 27, 2022. In total, there were 20 participants in the experiment, including seven experts with research experience related to rhetoric and metaphor, six experts with experience in sentiment data visualization, and six experts with experience in information visualization using text mining and natural language processing. Among them, there were 11 male participants and nine female participants, with two in the age group from 25 to 27 years old, 10 in the age group from 28 to 31 years old, and eight in the age group of ≥32 years old. Information about the participants is presented below.
The method of the verification experiment was as follows:
First, the basic concepts of the system—that is, sentiment analysis, sentiment visualization, and visual metaphors—were explained to the participants for 30 min. Then, the participants were given 15 min to freely use the exploration system.
Second, following participants’ sufficient exposure to the exploration system, we assessed their comprehension of the visual metaphors originating from the target theme of the Sankey diagram (Stage 1) and the visual metaphor process stemming from the representation theme (Stage 2). In Stages 1 and 2, the experimenter guided participants through the operation of the exploration system sequentially, granting them a 10 min window to grasp the visual metaphors under guided assistance.
In the provided time frame, participants were guided to hover over the links in the Sankey diagram one by one and view a pop-up window with all the metaphorical stages included within the respective links.
In the process of comprehending visual metaphors, if participants requested additional information, they were presented with additional images and sentences explaining the visual metaphors related to the sentiment visualization cases. Furthermore, supplementary explanations were provided for the relevant cases. After training with the system, participants were asked to assess their understanding of the learned visual metaphors using a 5-point Likert scale, and they were asked to explain why they had selected the corresponding score in a short-answer narrative item.
Third, a survey was conducted to evaluate the participants’ overall satisfaction with the exploration system. The items used were largely composed of “topics asking about the function of each view provided by the exploration system and topics asking about the association and interaction of visualization tools provided by the system.” The items constituting each topic were based on the study by Ahn et al. 90
For the topic related to function, the factors of “usefulness, feature functionality, and comprehensiveness” were measured, while for topics asking about association and interaction, the factors of “accuracy and easiness” were measured. Hereafter, the topic asking about function is referred to as Topic 1 whereas the topic asking about association and interaction is referred to as Topic 2. For the experiments described above, the average total experimental time required per participant was 85–90 min.
Experimental results
First, the participants who listened to the explanation of the experiment spent 5–9 min freely using the exploration system. This was somewhat faster than expected, as 15 min of experience was considered to be appropriate. Two of the 20 participants asked questions about the prior explanation to better understand the system. While the participants experienced the system, the experimenter asked them two to three times if they were facing any inconveniences in using each part, and all the participants answered that there were no inconveniences in understanding the contents while freely exploring the system. This suggested that the participants quickly understood the components of the exploration system and were able to use them well without any inconvenience.
Measuring understanding of visual metaphors derived from exploration system
In Stage 1, after the participants clicked the Diffusion of Sentiment button in the theme to be analyzed, they were shown a visual metaphor process using a “map” as a representation (Stage 1A) and a visual metaphor process using a “plant” as a representation (Stage 1B). The participants were given 10 min to learn each visual metaphor process. Then, they answered questions about what they had discovered. Moreover, the view that was used most often in Stage 1 (Stage 1C) was determined. To learn the visual metaphor provided in Stage 1, the participants took an average of 5–6 min using the exploration system. The longest time any participant used of the 10 min given was 7 min, including the time taken for questions and answers. The results of Stage 1 are shown in Figure 24.

Figure 24. Results for Stage 1 of the verification experiment: (a) the average value of understanding the visual metaphor of Stage 1A and the visual metaphor of Stage 1B (five points for good understanding), (b) cumulative bar graph of the numerical values for each item, and (c) histogram showing how frequently the different views were used in the system.
Results of stage 1A
For the item asking, “Is it easy to understand the visual metaphor expression in which the area of the map expands to represent the process of spreading sentiment information over time?,” 13 participants (65%) answered “strongly agree,” five participants (25%) answered “agree,” one participant (5%) answered “neutral,” one participant (5%) answered “disagree,” and no participants (0%) answered “strongly disagree,” as shown in 1A in Figure 24(b). As shown in 1A in Figure 24(a), the average score for this item was 4.5 points out of 5 points, and the standard deviation was 0.8. Participants who answered “strongly agree” made comments such as, “I think they have a lot in common in that both the concept of spreading opinions and the concept of increasing the size of the land on the map are gradually spreading to a wider area” and “It was easier to understand that the growing area of the map coincided with the growing number of opinions.”
In general, many of them thought that the “diffusion process” of opinions was conceptually similar to the “expansion” of territories, which was helpful for understanding. However, those who answered “neutral” or “disagree” made comments such as, “The metaphor is not well understood because the pictures showing the study cases do not seem to represent territorial expansion,” indicating that the two processes were not well matched in the images provided in the Paper View, which made it difficult to convey information.
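The summary statistics reported for Stage 1A can be reproduced from the response counts. The following is a minimal sketch, assuming the usual coding of 5 for “strongly agree” down to 1 for “strongly disagree” and a population standard deviation. The function name `likert_summary` is hypothetical, and the text does not state which standard deviation formula was actually used.

```python
import math


def likert_summary(counts):
    """Return (mean, population SD) for Likert counts keyed by score."""
    n = sum(counts.values())
    mean = sum(score * c for score, c in counts.items()) / n
    # Population variance is an assumption; the text does not name the formula.
    var = sum(c * (score - mean) ** 2 for score, c in counts.items()) / n
    return mean, math.sqrt(var)


# Stage 1A response counts as reported in the text (20 participants)
stage_1a = {5: 13, 4: 5, 3: 1, 2: 1, 1: 0}
mean_1a, sd_1a = likert_summary(stage_1a)
print(f"mean={mean_1a:.1f}, sd={sd_1a:.1f}")  # prints "mean=4.5, sd=0.8"
```

The same function can be applied to the response counts of the other stages to check their reported averages.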
Results of stage 1B
For the item asking, “Is it easy to understand the visual metaphor of a plant growing or a seed spreading to represent the process of spreading sentiment information over time?,” nine participants (45%) answered “strongly agree,” eight participants (40%) answered “agree,” two participants (10%) answered “neutral,” one participant (5%) answered “disagree,” and no participants (0%) answered “strongly disagree” as shown in 1B in Figure 24(b). As shown in 1B in Figure 24(a), the average score for this item was 4.3 points out of 5 points, and the standard deviation was 0.82.
Compared with the previous survey on the visual metaphor expression of expanding territories on the map, a smaller number of participants answered “strongly agree” to this item. In this regard, the participants made comments such as “The metaphor of the spreading seeds seems to represent the spread of opinions well, but I wondered how the part where plants grow upward has anything to do with the spread of opinions” and “The expression of plants growing or seeds spreading also represented the spread of sentiment information well, but the previous metaphor using a map was easier to understand,” which suggested that the visual metaphor of expanding territories on the map was more useful to the participants than the concept of vertical growth of plants as a representation for the process of sentiment information spreading over time.
Results of stage 1C
For the item asking, “In the process of participating in the system, which view did you use the most among the Network View, the Sankey Diagram View, and the Paper View for understanding the visual metaphors presented by the experimenter?,” the results indicated that four participants (20%) used the Network View the most, 20 participants (100%) used the Sankey Diagram View the most, and four participants (20%) used the Paper View the most, as shown in Figure 24(c).
This suggested that all the participants used the Sankey Diagram View the most to understand the visual metaphors. The comments paired with such responses included “The Sankey Diagram View was used the most to grasp the entire contents of the metaphor, and the Paper View was used to check the image along with it” and “I mainly used it because it is a view that can interpret the method of visual metaphor in order through the words introduced in the Sankey diagram and the colored lines connecting them.” Many participants reported using the Sankey Diagram View to check the detailed research methods that appeared when visual metaphors were created.
In Stage 2, after selecting the Bubble button in the Representation theme, the participants were shown the visual metaphor process using bubbles for the intermediation of “comparison of different sentiments” (Stage 2A) and the visual metaphor process using bubbles for the intermediation of “easy exploration of sentiment information” (Stage 2B). The participants were given 10 min to learn each visual metaphor process. Then, they answered questions about what they had learned. Moreover, the view that was used most often in Stage 2 (Stage 2C) was determined. To learn the visual metaphor provided in Stage 2, the participants took an average of 6 min using the exploration system. The longest time any participant used of the 10 min given was 7 min, including the time taken for questions and answers. The results of Stage 2 are shown in Figure 25.

Figure 25. Results of Stage 2 of the verification experiment: (a) average value of understanding the visual metaphor of Stage 2A and the visual metaphor of Stage 2B (five points for good understanding), (b) cumulative bar graph of the numerical values for each item, and (c) histogram showing how frequently the different views were used in the system.
Results of stage 2A
For the item asking, “How appropriate is it to use bubble representation for mediating ‘comparison of different sentiments’?,” seven participants (35%) answered “strongly agree,” 10 participants (50%) answered “agree,” three participants (15%) answered “neutral,” and no participants (0%) answered “disagree” or “strongly disagree,” as shown in 2A in Figure 25(b). As shown in 2A in Figure 25(a), the average score for this item was 4.2 points out of 5 points, and the standard deviation was 0.67. The reasons provided for such responses in the comments included “It is a bit difficult to understand the purpose of the comparison with only bubble representations, and it is easier to understand when viewed in conjunction with other factors such as visual variables” and “While the comparison is made by the size and color of the bubbles, there seem to be cases in which an image does not appear to be the shape of the bubble.”
Results of stage 2B
For the item asking, “How appropriate is it to use bubble representation for mediating ‘easy exploration of sentiment information’?,” seven participants (35%) answered “strongly agree,” seven participants (35%) answered “agree,” two participants (10%) answered “neutral,” two participants (10%) answered “disagree,” and no participants (0%) answered “strongly disagree,” as shown in 2B in Figure 25(b). As shown in 2B in Figure 25(a), the average score for this item was 3.9 points out of 5 points, and the standard deviation was 1.10.
Compared with the previous item, fewer participants answered “agree” and more participants answered “disagree.” The reasons for such responses reported in the comments included “The purpose of using bubbles for easy navigation didn’t make sense.” and “While the part where the bubbles grow or the bubbles accumulate makes sense to some extent for the purpose of comparing the amount, the content does not come close to achieving the purpose of facilitating exploration so I answered ‘neutral,’” suggesting that the representation method using bubbles was unsuitable for the intermediation of “easy exploration of sentiment information.” However, some positive comments were also made, such as “It would be possible to search in the direction that the user wants to understand, not only by comparison, but also by the number of bubbles, size, distribution, etc.”
Results of stage 2C
For the item asking, “In the process of experiencing the system, which view did you use the most among the Network View, the Sankey Diagram View, and the Paper View for understanding the visual metaphors presented by the experimenter?,” the results indicated that nine participants (45%) used the Network View the most, 20 participants (100%) used the Sankey Diagram View the most, and four participants (20%) used the Paper View the most, as shown in Figure 25(c). In the comments, the reasons for selecting the views included “The Network View was used to confirm the relationship between the representation and the intermediation, and the Sankey diagram was checked to see the overall appearance of the visual metaphor one by one”; “Since it seemed like a situation that connected specific intermediations and representations, network visualization was also utilized”; and “I tried to understand the overall appearance of visual metaphor through Sankey diagrams.” Similar to Stage 1, the Sankey Diagram View was used as the main view to see the overall flow of the research, while the Network View was used to check detailed relationships.
Evaluation of satisfaction with exploration system
Reliability analysis
A reliability analysis was performed on the collected response data using Cronbach’s α coefficient, which measures internal consistency between questionnaire items. The Cronbach’s α coefficient typically ranges from 0 to 1, with values closer to 1 indicating higher reliability. In general, a value of ≥0.6 is considered to indicate good reliability. 91
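For readers unfamiliar with the coefficient, the standard definition of Cronbach’s α for k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the participants’ total scores. The sketch below illustrates this definition only. The function name `cronbach_alpha` and the response matrix are hypothetical, and the data shown are not the study data.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-participant score lists."""
    k = len(responses[0])              # number of questionnaire items
    items = list(zip(*responses))      # transpose to per-item columns
    item_var_sum = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in responses])
    # Using the same variance estimator in both terms keeps the ratio valid.
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical 5-point responses (4 participants x 3 items), not study data
demo = [
    [5, 4, 5],
    [4, 4, 4],
    [3, 2, 3],
    [2, 2, 1],
]
alpha = cronbach_alpha(demo)
print(round(alpha, 3))
```

Items that rise and fall together across participants, as in this toy matrix, push the value toward 1.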
The results of analyzing the reliability using measurement data are presented in Tables 8 and 9. Specifically, Table 8 presents the questionnaire items related to the usefulness, functionality, and comprehensiveness of the system. The results for the reliability analysis of the items indicated an average Cronbach’s α value of 0.831, a minimum value of 0.805, and a maximum value of 0.838. Meanwhile, Table 9 presents the questionnaire items related to the accuracy and easiness of the system. The results for the reliability analysis indicated an overall Cronbach’s α value of 0.674, a minimum value of 0.607, and a maximum value of 0.752. Because all the items exhibited Cronbach’s α values of ≥0.6, they were considered to have high internal consistency; that is, high reliability.
Table 8. Questionnaire items about the usefulness, functionality, and comprehensiveness of the system and the Cronbach’s α value for each item.
Table 9. Questionnaire items about the accuracy and easiness of the system and the Cronbach’s α value for each item.
Satisfaction survey results
Two topics were used to evaluate the participants’ satisfaction with the exploration system. For Topic 1, an experiment was conducted to evaluate the usefulness, feature functionality, and comprehensiveness of “the function of each view provided by the system.” Figure 26 shows the average responses of the participants for each item of the satisfaction evaluation questionnaire for Topic 1, while Figure 27 presents a bar graph showing the cumulative opinions obtained from all the participants in the experiment for each item, which ranged from 1 to 5 points. As shown in Figure 26, the item with the highest average score (4.70) was Item 12, which read, “The Network View, Sankey Diagram View, and Paper View provided by the system help to comprehensively understand the visual metaphor process and representative images of the case in the sentiment visualization case.” Meanwhile, the item with the lowest average score (3.05) was Item 13, which read, “The ratio (size) of the Network View area provided is appropriate on the screen.” Both items were related to comprehensiveness, which suggested that the three views provided by the visualization exploration system contributed to a comprehensive understanding of the visual metaphors and examples. However, in the case of the Network View, there appeared to be a need for improvement. Next, as shown in Figure 27, the opinions obtained from all the participants in the experiment were accumulated for each item and scored from 1 to 5 points. For most of the items, there were no or few negative opinions.
In particular, exclusively positive responses were obtained for Item 8, which read, “The Target Theme button of the Sankey diagram sufficiently provides a function that helps to understand the metaphorical process of cases based on the characteristics of the target,” and Item 10, which read, “Each axis constituting the Sankey diagram provides enough functions to intuitively check the steps (sequence) of the visual metaphor process shown in the case of sentiment visualization.” Thus, the views provided by the system, including the Sankey Diagram View, helped the participants understand the visual metaphors.

Figure 26. Topic 1 (evaluating satisfaction with the functions of the views provided by the system): average participant responses for each item.

Figure 27. Cumulative bar graph of the numerical values for each item in Topic 1.
However, for Item 13, that is, “The ratio (size) of the Network View area provided is appropriate on the screen,” eight participants answered “disagree” by selecting a score of 2. This was larger than the numbers of participants who answered “strongly agree” and “agree,” indicating the need for improvement in the area size of the Network View, which is consistent with the previous analysis of the average response for each item.
For Topic 2, an experiment was conducted to evaluate the accuracy and easiness with regard to “the association and interaction of visualization tools provided by the system.” Figure 28 shows the average responses of the participants for each item of the satisfaction evaluation questionnaire for Topic 2. Figure 29 presents a bar graph showing the opinions obtained from all the participants in the experiment for each item.

Figure 28. Topic 2 (evaluating satisfaction with system association and interaction): average participant responses for each item.

Figure 29. Cumulative bar graph of the numerical values for each item in Topic 2.
As shown in Figure 28, the items with the highest average score (4.75) were Item 1, which read, “When a colored ellipse in the network is selected, the metaphorical process of that part can be clearly seen in the Sankey diagram and Paper Views,” and Item 6, which read, “When Show Full, Show Empty of the Sankey diagram is clicked, the corresponding view appears well in the Sankey diagram.” Meanwhile, the item with the lowest average score (3.60) was Item 2, which read, “When a node in the network is clicked, the visual metaphorical processes included in the node can be clearly seen in the Sankey diagram and Paper Views.” Thus, the association was accurate between the visualization tools, including the Network View and Sankey Diagram View. However, the system received low scores for the interworking between the nodes of the Network View and other visualization tools. This suggests that the identified shortcomings of the Network View affected the interoperability. Next, based on the results obtained by accumulating the opinions obtained from all the participants for each item and expressing each item from 1 to 5 points, as shown in Figure 29, it was confirmed that the responses were consistent with the participants’ average responses for each of the previous items, and that the level of satisfaction for most of the items was high. However, for Item 2, the scores were roughly evenly distributed from 2 to 5 points.
This indicates that the accuracy of the node function provided by the Network View for linking other visualization tools was lacking, suggesting the need for a method to allow for accurate interactions with the nodes of the Network View. In the verification experiment, most of the participants reported high levels of satisfaction with the usefulness, functionality, comprehensiveness, accuracy, and easiness of the system. In addition to participants who had background knowledge on sentiment visualization and visual metaphors, those who were working on data analysis quickly understood the research methods and visualization cases of visual metaphors by utilizing the various functions provided by the system.
Among the numerous positive opinions received, the participants who had studied sentiment visualization said, “By creating sentiment visualizations with representations that many people can easily relate to, much more efficient planning will be possible, and to get ideas for visual metaphors, it will be easier to work by freely observing cases based on the five categories provided by the system of this study and proceed with planning based on them.” The experimental results suggest that the proposed system has satisfactory usability for understanding visual metaphors in sentiment visualization.
Conclusion
In this study, we developed a system for analyzing visual metaphors in sentiment visualization cases. Sentiment visualization cases in which visual metaphors were used were collected, and sub-elements belonging to the five categories of a constructed taxonomy (target, intermediation, representation, visual variable, and visualization technique) were extracted from the sentences describing the visual metaphors in these cases. After the extraction, the sentences were divided into the themes of target and representation, and these were used to build the system.
Using this system, it was possible to see in detail the semantic relationships of the hierarchical structure that constituted the visual metaphor by using the network. Additionally, the Sankey Diagram View facilitated the connection and observation of analyzed metaphors along distinct axes. Moreover, by varying the link thickness, the system enabled the comparison of the frequency of usage for sub-elements connecting adjacent categories based on specific thematic criteria.
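The idea of link thickness encoding usage frequency between adjacent categories can be sketched as a simple co-occurrence count. This is an illustration under stated assumptions, not the system’s implementation: the axis order follows the order in which the taxonomy lists its categories, each case is simplified to one sub-element per category, and the function name `link_weights` and the case records are hypothetical.

```python
from collections import Counter

# Assumed axis order, taken from the order of the taxonomy categories
AXES = ["target", "intermediation", "representation",
        "visual_variable", "visualization_technique"]


def link_weights(cases):
    """Count how often sub-elements on adjacent axes co-occur in cases."""
    weights = Counter()
    for case in cases:
        for left, right in zip(AXES, AXES[1:]):
            weights[(left, case[left], right, case[right])] += 1
    return weights


# Hypothetical case records for illustration only, not the study data
cases = [
    {"target": "opinion", "intermediation": "summary of opinions",
     "representation": "river", "visual_variable": "color",
     "visualization_technique": "time-oriented"},
    {"target": "opinion", "intermediation": "summary of opinions",
     "representation": "bubble", "visual_variable": "size",
     "visualization_technique": "tag cloud"},
]
weights = link_weights(cases)
# Both records share this target-to-intermediation link, so its weight is 2
print(weights[("target", "opinion", "intermediation", "summary of opinions")])
```

Each count could then be passed to a plotting library as the width of the corresponding link, so that frequently used combinations appear as thicker bands.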
Then, user scenarios were employed to introduce the approach to accessing the exploration system and conducting the analysis in accordance with specific objectives. Furthermore, the case studies aimed to identify the most frequently used visual metaphors in the system and interpret visual metaphors based on the three prominent intermediations (analysis objectives and motivations).
In addition, the analysis of visual metaphors was conducted by utilizing the themes assigned to the targets and representations. These case studies demonstrate the system’s contribution, showing its capacity to interpret and analyze visual metaphors in sentiment visualization from various perspectives.
Finally, the effectiveness of the proposed system was verified, and the strengths and weaknesses of the system were summarized.
To synthesize the above research process, the advantages of the exploration system developed in this study are as follows.
First, arranging the five categories of visual metaphors into a taxonomy and constructing an exploration system helps intuitively interpret the concept of a metaphor according to the procedure. This can allow researchers—even those who are unfamiliar with the concept—to quickly understand the concept.
Second, the system user can clearly identify the purpose and background for applying the visual metaphor by using the network and Sankey diagram, and they can naturally compare and interpret the metaphors created according to the purpose and background. In addition, when developing a visual metaphor method, the researcher can select the desired category among the five categories, identify the usage pattern of the metaphor, and accordingly plan sentiment visualization.
Third, the layout of the proposed exploration system is configured for the user to intuitively grasp the research method of a specific research case. Therefore, even in fields beyond those covered in this study, this system can be used to help identify the characteristics of research cases.
This study has the following limitations, and resolving them remains a challenge for future research. First, because cases that deal with sentiment visualization and visual metaphor simultaneously represent a new research area, the number of cases handled was small compared with those for other systems. Second, the data used to create the exploration system were obtained by directly finding and analyzing the visual metaphors that appeared in sentiment visualization cases and then manually organizing their characteristics; a method to automate this data collection is therefore required. Third, the usability of the Network View requires improvement. Fourth, it is necessary to verify the usability of the system among the general public, as the participants in the verification experiment consisted entirely of experts in related research fields. Fifth, it is necessary to verify the generalizability of the system by applying it to data from other research fields. Finally, it is necessary to experimentally compare the usability of the proposed exploration system with that of previously reported exploration systems.
In the future, we plan to enhance this research by revisiting the taxonomy using survey papers related to sentiment visualization92,93 and incorporating more metaphorical examples of sentiment visualization. Subsequently, we intend to support a variety of themes based on the patterns of frequently occurring visual metaphors and examine which visual metaphors are effective in visualizing sentiment data. Additionally, we plan to provide a pop-up view that displays detailed information about the final cases, including the paper title, publication year, abstract, DOI link, and more, for a more comprehensive understanding.
For data collection, we are planning to devise a learning model for sentiment detection based on metaphors and then automate the extraction of relevant expressions from collected cases. Furthermore, we will work on improving the convenience of the key components of the exploration system: network visualization and Sankey diagrams.
In network visualization, we aim to make the patterns of usage relationships and causality in visual metaphors even clearer, and we will add semantic labels to the provided clusters (colored ellipses). To address the readability issues caused by the crossover lines in Sankey diagrams, we plan to explore methods for grouping or further segmenting nodes on each axis according to subcategories.
Moreover, we will validate the research’s scalability by introducing case studies that apply data from other research fields and present analysis approaches. We intend to conduct comparative experiments using different layout models (Sankey diagrams, network-based layouts, facet-based layouts, etc.) with the data from this research to validate the layouts’ utility. We will conduct quantitative assessments of user satisfaction among the general public and make improvements to ensure that the exploration system is user-friendly for everyone.
Footnotes
Acknowledgements
This paper is based on a main portion of Hyoji Ha’s doctoral thesis research completed at Ajou University under the direction of Kyungwon Lee.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2020S1A5A2A01043532).
