Abstract
In this article, we investigate the problem of visually representing and analyzing large dynamic directed graphs that consist of many vertices, edges, and time steps. With this work we do not primarily focus on graph details but more on achieving an overview about long graph sequences with the major focus to be scalable in vertex, edge, and time dimensions. To reach this goal, we first map each graph to a bipartite layout with vertices in the same order for each graph supporting a preservation of the viewer’s mental map. A sequence of graphs is placed in a left-to-right reading direction. To further reduce link crossings, we draw partial links with user-definable lengths and finally apply edge splatting as a concept to emphasize graph structures by color coding the generated density fields. Time-varying visual patterns can be recognized by inspecting the changes in the color coding in certain regions in the display. We illustrate the usefulness of the approach in two case studies investigating call graphs changing during software development with 21 releases which is a rather short graph sequence but contains several thousand vertices and edges. Visual scalability in the time dimension is shown with more than 1000 graphs from a dynamic social network dataset consisting of face-to-face contacts acquired during the Hypertext 2009 conference recorded by radio-frequency identification badges.
Introduction
Dynamic graphs appear in many fields of application, for example, call graphs in software engineering, metabolic networks in bioinformatics, face-to-face contacts in social networking, eye movements between areas of interest (AOI), team player behavior, or co-author relationships in digital bibliographies to mention some of a longer list of examples. The visualization of such data remains challenging due to the many data dimensions to be visually encoded. Graph structures are composed of vertices with labels and attributes, edges with weights and directions, and the time dimension which can be inherent in all of the single graph features. All of these dimensions can vary in size depending on the field of application. Consequently, not only the pure size of graphs has to be considered when talking about big graph visual analytics but also the application domain. However, the dynamic graph datasets that we are analyzing in this work are already big enough to build the basis for a serious dynamic graph visualization problem.
Generally, there are two concepts when it comes to visualizing evolving graph data. Dynamic graph visualization approaches deal with the time dimension by applying either a time-to-time mapping (animation) or a time-to-space mapping (static display) as surveyed in the work of Beck et al. 1 Graph animation typically uses node-link diagrams which are then smoothly animated from one diagram to the next using as little layout changes as possible to achieve a dynamic stability2,3 during the animation process. The preservation of the mental map4,5 is a very important aspect in this visualization strategy to keep the dynamic graph readable and explorable for the human viewer.
In our work, we rely on a static representation (apart from interaction techniques) which can be used to display long graph sequences consisting of single static directed graphs. Such a representation can serve as a good overview about long graph evolution and relies on the efficient pattern recognition of the human’s visual system. 6 Animation is more suitable for presentation, whereas static diagrams are useful for exploration, 7 a fact that speaks in favor of our visualization strategy.
For each graph in the time-to-space mapping, we exploit the traditional concept of node-link diagrams introduced in the early work of Euler. 8 Although node-link diagrams are intuitive and useful for solving path-related tasks, 9 they lead to visual clutter, 10 in particular when the graphs become large and dense. Due to the fact that we use a bipartite graph layout which maps graph vertices to one-dimensional (1D) vertical lines, the visual clutter phenomenon is getting worse. In our approach, we mitigate this situation using the concept of edge splatting 11 of partially drawn links 12 which has benefits concerning two aspects:
First, it improves visual scalability since the reduced link lengths allow to show more graphs in the same display space;
Second, the color-coded peaks of the density fields still give a hint about the target of a directed graph edge or at least about the coarse direction a link is pointing to.
The focus of this concept is consequently not on seeing and identifying graph details but more on the exploration of longer graph sequences with respect to structural changes.
We apply our interactive tool to a dynamic graph dataset from the field of software engineering with dynamic call relations consisting of thousands of vertices and edges but only 21 time steps. Moreover, to test the visual scalability issue in the time dimension, we show a social network evolution describing face-to-face contacts during the 2009 Hypertext conference with more than 1000 time steps.
In both scenarios, we can easily see that the displayed dynamic graph still reveals interesting time-varying patterns while simultaneously scaling to many vertices, edges, and long graph sequences that may also contain dense graph regions. To already start the visual analysis process with a preprocessed dynamic graph dataset, we compute some kind of improved vertex order by either using an inherent hierarchical organization among the vertices or applying a hierarchical clustering algorithm. Moreover, a graph analyst does not have to visualize the complete dynamic graph but he or she can also preprocess the graph data by filtering, clustering, or aggregation techniques.
Related work
Various techniques have been designed for visualizing static as well as dynamic graphs as surveyed by Von Landesberger et al. 13 and Beck et al. 1 Some of the techniques also focus more on a visual analytics aspect instead of just supporting a viewer by a single visualization technique. Federico et al., 14 for example, describe a visual analytics tool for exploring dynamic social networks. In their work, they used three time-based views and support the combination and linking of several node-link diagrams, also a 2.5–dimensional view. They rather focus on the linking of views than on providing a visualization technique that is scalable in the time dimension while it still shows some insights about individual graphs.
A more recent work of Van den Elzen et al. 15 reduces individual graphs to points and shows the changing of those points overtime. This reduction scales very well in the time dimension but has the drawback that now not many aspects about the original graph data are shown for comparing them overtime. Their abstraction level is much higher than the one used in our work, that is, each graph becomes a single point. Liu et al. 16 designed the EgoNetCloud system in which they combined graph data analyses with visualization techniques. Their event-based dynamic graph visualization also uses abstractions to provide an overview about the dynamics of a network. In the egoSlider tool by Wu et al., 17 the focus is on different abstraction levels, that is, a macroscopic, mesoscopic, and a microscopic level. Multiple coordinated views are given to show evolutionary graph patterns. Negatively, only a few time steps can be displayed, that is, the visual scalability aspect in the time dimension becomes the challenging issue in their work.
However, after a data preprocessing step a graph analyst oftentimes wishes to see the graph dynamics in the form of a dynamic graph visualization technique that is able to give a scalable overview about the graph evolution. To achieve such a time-based overview, two general trends can be found in the field of dynamic graph visualization, that is, time-to-time mappings which make use of graph animation and, on the other hand, time-to-space mappings which are static displays of dynamic graph data.
Static diagrams have benefits when graphs have to be compared since such a comparison can be done directly by inspecting the graphs on screen. This is difficult for animated graphs2,3 since only one graph at a time is shown and is smoothly animated into the next one in the sequence. The preservation of the mental map 4 which is typically achieved by having a high degree of dynamic stability is one of the key concepts to obtain a better graph animation for solving graph exploration tasks. 18 But displaying longer graph sequences with the goal to derive time-varying patterns is a challenging, that is, even impossible task when graph animation is applied.
We base our approach on static small multiples with graph vertices aligned to horizontal lines. This is similar to the approach of Burch et al. 11 which was inspired by the visual concept in parallel coordinates plots, 19 that is, small vertical stripes which make longer graph sequences visually scalable. Burch et al., instead, use complete links which have a higher probability to cause link crossings making a diagram unreadable which is mitigated by applying edge splatting to generate density fields for the link coverage information similar to the approach used for generating continuous parallel coordinates plots. 20
Apart from the major visual metaphor of node-link diagrams, also adjacency matrices21,22 or adjacency lists 23 are possible concepts applicable to dynamic graph data. Node-link diagrams are in favor of adjacency matrices for path-related tasks visually encoding graphs with more than 20 vertices as investigated in a user study by Ghoniem et al. 9 Node-link diagrams are also more scalable in the time dimension and do not produce that many empty gaps as adjacency matrices or lists would do. By exploiting the concept of bipartite layouts, a 1D graph layout with aligned vertices can be generated. This places vertices on horizontal lines in an aligned manner benefitting from a preservation of a viewer’s mental map by reducing cognitive efforts when inspecting the graph sequence and performing visual comparison tasks.
Although we decide to use the node-link visual metaphor in this work, we argue that additional enhancements to the link representation can become a powerful concept to represent long graph sequences. In this article, we argue that we do not need the complete link in a node-link diagram to show connectedness of objects as evaluated by Burch et al., 12 invented by Becker et al., 24 and mathematically modeled as partial edge drawing by Bruckdorfer et al. 25 Although there are many other link representation styles like curved, orthogonal, or tapered ones which are evaluated by Holten and Van Wijk, 26 we use traditional straight link drawings with equally thick lines. But, as an enhancement to the parallel edge splatting concept introduced by Burch et al., 11 we focus on space-efficiency aspects. Consequently, in our approach, we use partially drawn links which make the diagrams more visually scalable in the time dimension since they do not require the full link length and consequently, the full display space to show an individual graph. But to be still able to identify hot spots and edge directions, that is, graph patterns, we also apply edge splatting to make them more readable, in particular, when the graphs are dense and time-varying. This edge splatting concept, applied to partial links, allows to display longer graph sequences while they still allow to identify changing graph patterns.
Data model and transformations
In the context of this article, we are dealing with relational data that can be mathematically modeled as graphs. Moreover, the abstract nature of the data allows to apply a vertex ordering which can be used to generate a graph layout following certain esthetic graph drawing criteria. This order is already inherently given by the dataset or can be computed by applying a hierarchical clustering algorithm.
Static and dynamic graph data
In the context of this article, a directed graph is modeled as
A dynamic graph Γ is modeled as a sequence of
Before visualizing the dynamic graph, we compute the union of all vertex sets
Data preprocessing
Before we provide a graph analyst with a dynamic graph visualization, the graph data can be preprocessed to reduce the amount of displayed vertices, edges, or time steps:
Filtering. To reach this goal, the data can be filtered in any of the dimensions. The number of vertices can be reduced by filtering for text annotations (e.g. like file names in software systems), edges can be filtered by edge weight thresholds. More complex path filters are useful to throw away vertices and edges that lie on paths with a certain property or not (like path length). The third data dimension, that is, the inherent temporal aspect in dynamic graphs, can also be filtered either by taking graph subsequences into account or by looking at time stamps each indicating the point in time an individual graph existed. Combining vertex, edge, and time filters provides insights about interesting correlations in the time-varying graph data.
Clustering, grouping, and ordering. Apart from reducing the amount of data, it can also be restructured by means of vertex ordering and grouping techniques. If the vertices are not ordered by an inherent hierarchical organization, a hierarchical clustering algorithm can be applied to obtain a suitable 1D order. Clustering on the time axis can also be a very useful concept since it provides an overview about time-varying behavior among the relations. Similar trends and subtrends can be identified making an exploration of the graph dynamics much easier. To achieve this, we compute differences between all individual graphs by computing difference values taking the weighted edges into account. All those differences result in a weighted matrix on which a hierarchical clustering can be applied. In this clustering, we have to take into account the temporal order of the graphs, that is, the resulting hierarchy cannot be freely traversed but has to preserve the temporal order. Another ordering strategy that we take into account is based on the optimal linear arrangement problem (OLAP). 27 This non-deterministic polynomial-time (NP)-hard problem describes the overlap of graph links running in parallel when the graph vertices are equidistantly placed on a straight axis. Several approaches already exist trying to find a suitable solution to this problem28,29 that is, in particular, useful if graph vertices are laid out on a 1D axis.
Aggregation. Also, vertex aggregation into super vertices is a powerful concept. Positively, it reduces the amount of vertices to be displayed but negatively, important information about individual vertices can be hidden by this grouping principle. Consequently, we leave this decision for the user. If vertices are aggregated, the adjacent edges with their weights are aggregated too. A similar aggregation strategy can be applied to the time axis. Subsequent graphs can be aggregated into an individual graph that is a representative graph for the subsequence. In this supergraph, all weighted edges are aggregated and the new edge weights are either generated by the sum, the maximum, or the average of all weights. If a temporal clustering has been applied, all the identified time clusters can be aggregated into individual graphs.
Dynamic graph visualization
In many applications, graph data have a dynamic nature and consist of very many time steps. Moreover, if the graphs to be visualized are large and dense, that is, contain many edges compared to the vertices, visualization can result in hairball-like representations when node-link diagrams are used or, on the other hand, layout algorithms are very time-consuming leading to non-interactive visualizations.
In our work, we try to give an overview about long, large, and dense graph sequences with the goal to support an analyst to see time-varying visual patterns, to filter, group, and aggregate the data for specific patterns, and finally, get insights in the dynamic graph which would otherwise be hard to find. We base our work on the visual information-seeking mantra by Shneiderman 30 which states overview first, zoom and filter, and then details on demand.
Graph layout and visual design
The layout of the graphs is very important to obtain a visually scalable dynamic graph visualization. We first copy the vertex set and place all vertices equidistantly to vertical lines, that is, the original vertex set and its copy appear on two vertical parallel axes on which the vertex copies are aligned on imaginary horizontal lines. This procedure generates a bipartite layout (see Figure 1(a) and (b)).

The transformation step of a directed graph from a node-link diagram into a bipartite partial link diagram: (a) a small node-link diagram consisting of five vertices and six edges, (b) a bipartite layout of the same graph in a left-to-right reading direction, (c) drawing the links partially frees some display space, and (d) explicit link direction indicators in the form of arrow heads are redundant information and are not needed to understand the representation.
Complete links are useful to avoid target vertex ambiguities but as a negative aspect they waste display space. In our approach, we use partially drawn links instead to indicate the direction of the links (see Figure 1(c)). The direction of the links is clear since it follows a left-to-right reading direction. Consequently, we can also remove the arrowheads that are redundant information in this layout (see Figure 1(d)).
Our approach is technically not restricted to a special layout but if a two-dimensional (2D) node-link layout was used, it might be more problematic to display that many time steps, that is, the visual scalability would suffer from a 2D layout. In general, we define a layout of a graph as a function
which maps graph vertices to 2D positions on the display. In our work, we use a bipartite layout, that is, at first the vertex set is copied which is also a 2D layout but the x-coordinate only has two values (for the parallel axes). This fact is the reason why we would rather talk about a 1D layout than about a 2D one.
Graph sequence representation
In this work, we are merely interested in visualizing longer graph sequences in which the individual graphs can be large, dense, directed, and weighted. This means our intent is to give an overview about time-varying visual patterns in the evolution of large dynamic digraphs which can then be further analyzed by preprocessing the data and by interaction techniques working directly on the visual representation of the data. We argue that in many dynamic graph visualization scenarios, an overview is missing. The viewer definitely needs a starting point for further explorations, that is, most of the data should be displayed at once. 31 This overview helps to detect patterns in the visualization to build, refine, accept, or reject hypotheses about the data which are crucial for visual analytics32,33 techniques.
A dynamic graph, that is, a sequence of individual graphs, is also mapped in a left-to-right reading direction with the individual graphs mapped to small vertical stripes (see Figure 2). The vertex positions on the vertical axes remain the same since we are focusing on the preservation of the mental map which is crucial in our approach to derive time-varying graph patterns.

A sequence of four graphs (T1)–(T4) displayed as a traditional node-link diagram (top row) and as a sequence of graphs in a bipartite partial layout (bottom row). The time steps are shown individually in separate views.
As a novelty to the traditional edge splatting technique proposed by Burch et al., we apply the concept of partially drawn links allowing to win some display space for showing more graphs from the complete graph sequence. The partially drawn links do not show the complete link information but they still give a hint about the starting point of an edge and the direction in which it is pointing. Computing a density field from all those partial links and using color coding to visually encode the hot spots in the diagram supports an overview about the graph evolution (see Figure 3 for a bipartite partial link representation of a dynamic graph).

The graph sequence in a left-to-right reading direction mapped to one row in a combined view. Leaving away the vertex labels and scaling down to pixel-based size lead to a visually scalable dynamic graph visualization. Splatting the edges generates density fields that can be color coded to show time-varying patterns in the graph sequence.
Figure 4 shows the scalability of the technique in the time dimension. The graph sequence in Figure 4(a) is mapped to the full display space. In Figure 4(b)–(e), it is displayed at smaller and smaller becoming display areas. The time-varying visual patterns are still recognizable. Edges are drawn partially but still 50 % link lengths are used. In the case studies, we use even shorter partial links which allow to display many more graphs next to each other while the changing graph structures can still be visually explored. The length of the links can be interactively chosen by the user. Playing around with those link lengths can give different perspectives on the dynamic graph data.

Rendering a sequence of splatted graphs on changing display space: 14 graphs in a bipartite layout on (a) 100 % of display space, (b) 75 % of display space, (c) 50 % of display space, (d) 25 % of display space, and (e) 12.5 % of display space. In all, 50 % partial links are used to reduce the amount of display space to render the graphs. Time-varying patterns can be detected in all of the representations.
Interaction techniques
Our visualization tool supports several interaction techniques. The most important ones are listed below:
Partial link length. The link lengths can be varied by changing a corresponding percentage value. Each link is displayed in the requested length while it is simultaneously splatted to achieve a useful visualization technique for the graph evolution.
Graph gaps. To better visually separate the individual graphs from each other, the user can add empty display gaps between two consecutive graphs. The larger the gap, the less visual scalability is achieved.
Display space. Since the node-link representations of the graphs are mapped to real-valued density fields, we are able to visually compress the individual graphs. Although the visual patterns do not have that much display space anymore, a compressed dynamic graph visualization is still useful to see time-varying patterns.
Vertex, edge, and time interval selection. Vertex groups, corresponding edges, and time intervals are selectable to apply an aggregation or filtering technique based on the visualization and not on the data as already described in the data preprocessing step.
Time granularity. Graphs can be aggregated on different levels of time granularity. Time intervals can be selected and the contained graphs can be merged into one. A similar approach works for the set of vertices. Neighbored ones on the vertical lines can be aggregated into individual merged nodes. The edges are summed up accordingly.
Color coding. The applied color coding also plays a crucial role in this visualization technique since the generated hot spots in the splatted graphs can make smaller value differences disappear. Consequently, a logarithmic color coding might be applied.
Details-on-demand. Clicking on an individual graph can show it in a different view and in a different layout. This allows the analyst to explore the relational data for details. Also, textual information in the form of text labels are difficult to be displayed but in a detail view such additional information might be displayable at least to some extent.
Static and dynamic visual patterns
A dynamic graph visualization is useful if data patterns can be visually detected in the provided visual encoding. The human viewer is able to rapidly detect visual patterns due to the strengths of his visual system and perceptual abilities. In this section, we describe how a readable dynamic graph visualization is computed, which esthetic graph drawing criteria we follow, and how detected visual patterns can be remapped to data patterns in order to analyze dynamic graph data.
Esthetic graph drawing criteria
In the field of graph drawing, we are interested in graph layouts that follow certain esthetic graph drawing criteria.34–37 Since we are visually encoding dynamic relational data with node-link diagrams (although they are only drawn partially and not completely), we have to be aware of guaranteeing that the layouts follow those esthetics criteria. In general, not all of them can be incorporated into a graph layout simultaneously due to the fact that they stand in trade-off behaviors:
Minimization of link crossings. The number of link crossings can be reduced to a minimum by reducing the length of the partial links until no link crossings remain anymore. Negatively, this strategy would result in ambiguities when it comes to locating a target vertex in the representation.
Minimization of overlaps. Node overlaps are avoided by placing the vertices equidistantly to vertical lines. Links overlapping nodes are also not possible since the links are always drawn from left-to-right starting at a node at the left-hand side. The only overlap possibility comes from links with other links. They can be reduced by reordering the vertices in a way that takes into account the link lengths and orientations.
Minimization of link lengths. Since we are mapping graph vertices to a 1D axis, it is more difficult to reduce the link lengths. In 2D, we have more freedom to do that than when we are restricted to 1D which is required to achieve a horizontal alignment of the vertices in order to preserve a viewer’s mental map in the dynamic graph.
Minimization of aliases. If partial links are used, spatial aliases can occur meaning a certain vertex can be taken as a target vertex although it is not. This is caused by the misinterpretation of directions due to perceptual problems or nodes located next to each other.
Evenly distribution of vertices. Vertices are equidistantly placed to vertical lines. This means they are evenly distributed if we take into account spatially neighbored nodes. Looking at all pairs of nodes, the average distances are much larger in 1D layouts than in traditional 2D layouts.
Maximization of symmetries. Symmetries can be generated by applying a hierarchical clustering algorithm to the graph vertices taking into account the edge weights. Force-directed approaches that focus on symmetries in 2D layouts are difficult to apply for 1D layouts. The restriction to 1D makes the visualization of symmetries much more challenging.
Maximization of orthogonality. Angles between adjacent links or crossing links are typically much smaller than in a traditional 2D layout. The 1D layout is also to blame for this drawback.
Visualization-data pattern matching
To design a useful visualization technique, it is important that the viewer using the technique in a visual form can detect certain patterns in the visualization. These patterns must be remapped to the data again to interpret the data and to derive insights and finally, knowledge. In this section, we describe which static and dynamic patterns can be found in the dynamic graph visualization and what their meaning is with respect to the visually encoded data:
Static graph patterns. Inspecting an individual graph, we can get an impression about the density of a graph, that is, if it contains many links or not. The density property can be explored globally or locally meaning the graph is either completely dense or it contains only a certain number of weakly connected clusters. Visually, this property can be derived from the color-coded density values. Also, vertices that have many adjacent outgoing edges (star patterns) can be detected easily in the visualization by fan-like graphical shapes.
Dynamic graph patterns. The graph dynamics demands for visually exploring many neighbored vertical stripes. Visually comparing the graphical density patterns and the outlines of the individual graph, visualization can give insights about the evolution of relations. Graphs can become denser and denser, sparser and sparser, or they have a stagnating behavior, that is, they are not changing at all. Also, an oscillating behavior might be present, meaning an alternating change in color codings with a certain period. All of these phenomena can happen over fixed time intervals or they can be present over the complete history.
Case studies
We applied our dynamic graph visualization with splatted partial links to two scenarios from software development and from social networking. These scenarios are different from each other since the dynamic call graph only consists of 21 time steps, whereas the dynamic social network is a longer graph sequence with more than 1000 graphs at the finest level of temporal granularity. Visually, representing such a long sequence of relational information on one screen is challenging, hence worth investigating by a visual analytics technique focusing on big graph data.
Dynamic call relations
Software development generates evolving data since software projects are typically designed, developed, and implemented over a certain time period. Lots of data sources might be worth investigating like developer behaviors, the source code, bug databases, log messages, or email conversations among the developers. In this work, we are interested in call relations. During the development process such call relations are not static but change overtime, building the basis for a dynamic-weighted directed graph. To generate this specific kind of data, we first apply an external tool, called the Dependency Finder. The call graphs are then internally stored in our tool-specific dynamic graph format.
The positive aspect with this kind of data is that the dynamic call relations already contain some kind of structure given by the hierarchical organization of the software functions/methods, which build the vertices of the graphs. Mapping the vertices in a hierarchy-preserving linear order reduces the amount of link crossings (if the software is well structured). This is a natural phenomenon in software systems because elements closer together in the hierarchy are more likely to call each other than those that are far apart. We do not need a hierarchical clustering to preprocess these data (but we can do if we like generating another hierarchy apart from the one inherent in the software system).
By preserving this hierarchical organization in the visualization, we obtain the diagram shown in Figure 5 which shows the 21 releases of the JUnit open source software project. It may be noted that there is not a unique flat 1D hierarchy representation, that is, the hierarchy can still be traversed recursively. As a benefit, it does not matter how we traverse the hierarchy, the hierarchical organization of the vertices is always preserved.

The JUnit open source software project with its 21 releases can be explored on a call graph perspective. We can see the call graph changes although the links are only drawn partially. The graph contains 2817 vertices (program methods) and 15,339 directed edges (method calls).
In Figure 5, we can directly see that the graph structure is changing overtime, which is due to changes during the software development. Source code and functionality are added introducing many more links. But several links are also disappearing during the evolution meaning that the functionality is either moved to different modules or that it is completely removed from the project. Understanding and analyzing such details during software development require to have a look inside the source code. The benefit of the partial link splatting is obvious. Long crossing links are avoided that would clutter the display. This means that the evolution of the call graph can be explored in a better way; there is not that much overdraw as in the standard parallel edge splatting technique introduced by Burch et al. 11
In the 21 call graphs, we can identify three major phases which are annotated in the figure:
Phase 1. In the first phase, we can see that not all areas are covered by color-coded peaks. There is a large empty gap at the bottom left. This means not all vertices in the graph are connected by edges, that is, many methods are still not defined in this early stage of the project development.
Phase 2. As we can detect in the second phase, new modules are added and consequently, the number of call relations increases. But there are also changes in the call relations in those that were present from the beginning. This can mean that existing functionality might have been moved to other modules and now the call relations are changed too.
Phase 3. The third phase means another extension for the call graph. Now we can see color-coded peaks covering nearly the complete vertical stripes. The project is extended again. This time, there are nearly only added relations, only a few calls seem to be deleted or moved to other modules.
Apart from the global phases, we can discover various more fine-granular insights in the dynamic call graph. For illustrative purposes, we show some of them in the following:
I: These call relations only existed for four revisions. They never appear again. Maybe they are completely removed from the system because the system got restructured or they are just distributed over other modules.
II + III: In the second and third phases, we can detect a stability pattern in a bunch of call relations. Maybe this part of the project builds some kind of core and is consequently, not supposed to any changes in the functionality anymore.
IV: Toward the end of phase 2 only a few call relations are added. This first happens for two revisions. Then with beginning of phase 3, many more call relations come into play. Maybe only a part of the functionality is added and later on it is extended by many more features.
These are just a few examples that can be directly read from the visualization. To dig deeper in the dynamic graph data, we have to apply interaction techniques. Details on demand are very important in this application domain since a software developer has to have a look on the corresponding source code to understand the call relation changes and their meaning.
Dynamic social network
The second application example deals with much longer graph sequences. Moreover, a preprocessing of the data in the form of hierarchical clustering and time aggregation is required to provide suitable data for an overview representation in the form of a dynamic graph visualization.
The Hypertext 2009 conference was held in Torino, Italy, and 150 participants were registered. As a special experiment face-to-face contacts between the participants were recorded generating a time-varying social network that is worth analyzing with our visual analytics approach. The participants build the vertices in the graph, the face-to-face contacts, the edges, and the recording granularity that is used for the time dimension.
We can read from the conference website that
the attendees of Hypertext 2009 will also have a chance to experiment with applications mixing real-world data and online data. We will deploy active RFID tags in the badges of volunteers and run a data collection platform tracking the real-time relations of physical proximity between the attendees. The data collection and visualization systems will be provided by the SocioPatterns project and will expose API methods that allow developers to mash up real-world links between the attendees with other types of linking information from the Web.
The recorded dynamic graph data are interesting and builds a challenging problem for big data visual analytics focusing on time-varying graphs. The actors in social networks are typically unstructured, that is, every person is initially treated in a similar way. Hierarchical clustering can be used to derive a structure among the people which can be used to order the vertices of the underlying graph. If this procedure was not applied, the resulting visualization would suffer from vast amount of visual clutter. For the dynamic social network dataset, we first compute an adjacency matrix for all conference attendees, use the face-to-face contacts as relational data, and finally apply a hierarchical clustering on the time-aggregated adjacency matrix.
The face-to-face contacts recorded during a conference at 20 s intervals were leading to a long sequence of social networks. Using the formerly computed hierarchical information to map the vertices to the 1D lines gives some structure among the dynamic graph data. If we are intending to analyze this time-varying network, we first need an overview about the dynamic patterns. The dynamic data might be aggregated to hourly intervals shown in the diagram in Figure 6. In this visualization, we can see that there are three time clusters being acquired during the conference days; people are attending the conference and are talking to each other. The empty gaps in the display visually encode the face-to-face contacts over night. In these time periods, people are not talking to each other. The hierarchical clustering unhides some smaller groups of people who are talking to each other more frequently than others and also over longer time periods.

Visualization of a time-varying social network acquired during 3 days of the Hypertext 2009 conference on a per hour basis: 113 vertices, 20,818 edges, and 57 time steps are displayed. Links are drawn partially to save display space and density fields are computed to show some overall graph structures.
The same data are shown in Figure 7 on a per 3-min basis. In this figure, we do not use one single display line but we split the display space into five rows that roughly show the day–night–day–night–day pattern as already illustrated in Figure 6. Here, we can easily detect the daily patterns again but now we can also derive more fine-granular time-varying graph patterns of more than 1000 social networks. In the more fine-granular data representation, time clusters can also be found during the conference days. These refer to the coffee breaks during the conference where only small groups of conference attendees are talking to each other.

Face-to-face contacts during three conference days displayed on a per 3-min basis. The data from Figure 6 with 113 vertices (conference attendees) and 20,818 edges are now displayed on a more fine-granular time scale with more than 1000 graphs. The graph sequence is split into five rows. We can see that all graphs are displayable on screen without scrolling interaction, rapid serial visual presentation (RSVP), or graph animation in general. General trends can be seen directly.
Discussion and limitations
Although we illustrated a visual analytics approach to explore and analyze long graph sequences, we are aware of the fact that the approach has various limitations and drawbacks that will be discussed in the following:
Data preprocessing. Although a data preprocessing step is needed if we have to deal with big graph data, we primarily do not know what to search for. Consequently, the clustering, grouping, filtering, or aggregating operations can have a bad influence on the dynamic graph data that are visually represented later on. This may cause the graph analyst to derive wrong hypotheses based on misinterpretations of the data. This challenge occurs for all dimensions in the data, that is, vertices, edges, and time.
Scalability issues. Our visual analytics approach can be applied to growing dynamic graph datasets. But, on the negative side, algorithmic, visual, as well as perceptual issues can become a problem as with many other approaches dealing with dynamic graph data. From a visualization perspective, we can scale down the individual graphs to very small display regions but as a drawback the smaller the display region becomes, the lesser details about the graphs can be revealed.
Interaction techniques. Playing around with several parameters like link lengths, color codings, edge weights, and the like is very important to not leave alone the viewer with a static representation. But although interaction principles are definitely required, it becomes harder and harder to select individual graphical elements like vertices, edges, or even individual graphs when those are visually encoded in small display regions.
Visual ambiguities. Partial links are in favor of complete links focusing on space-efficiency aspects. Depending on the user-defined length, more or less graphs are displayable making the dynamic graph visualization scalable in the time dimension. But as a drawback we can identify target vertex ambiguities, that is, if the partial links are too short the user is not able to solve path-related tasks due to the fact that individual links cannot be traced by the eye.
Multiple time scales. Multiple time scales can be addressed by the visualization, but with the current version, only one scale at a time can be displayed. Multiple scales are selectable by a slider ranging from the finest temporal aggregation to the coarsest one. Several time scales might be displayable, that is, a sliding time window might be selectable in each of the time scales in the dynamic graph visualization while highlighting each of the finer scales in the coarser ones. Although this might be a valuable concept to compare the dynamics of a graph on several temporal granularities, this approach will waste display space and can hide many more dynamic patterns.
Parameter pre-computation. It might be possible to pre-compute some of the visualization features. Depending on the number of displayed graphs, the display space of each individual graph can be computed. Based on those individual graph regions, the partial link lengths can be adapted. But also the density of the graphs might be used as an indicator for algorithmically deciding about the partial link length. Negatively, those link lengths should not vary overtime, that is, applying different partial link lengths for each individual graph, since that would lead to perceptual problems and a loss of the mental map. Changing link lengths overtime would consequently lead to misinterpretations of the dynamics of the graph data. The best option is to let the user decide how parameters are adjusted.
Conclusion and future work
In this article, we illustrated how long graph sequences can be displayed with the goal to give an overview about time-varying patterns. To achieve a more scalable dynamic graph visualization, we applied the concept of partially drawn links which are still useful to show the rough directions of the graph edges. To obtain visual graph structures, those partial links are splatted which results in time-varying density fields. The densities are color coded leading to a sequence of small vertical stripes with visual patterns. Consequently, the approach can easily be used to understand an evolving graph focusing on the extraction of time-varying graph patterns. Simple interaction techniques are integrated such as the adjustment of the partial link lengths and the applied color coding.
For future work, we plan to test our technique with even longer graph sequences from other fields of application. A user study should be conducted with the goal to find out whether people are able to explore dynamic graphs using this visualization technique.
Footnotes
Funding
This work was supported by DFG under grant WE 2836/6-1.
