Abstract
Examination of temporally changing adaptive social networks has been difficult given the need for extensive and usually real-time data collection. Building from interdisciplinary advances, the authors propose a web search engine–based method (called retrospective relatedness reconstruction or 3R) for collecting approximated historical data of temporally changing adaptive social networks. As quantifying relatedness among people in social networks leads to difficulty in assigning proper weights to relationship ties, 3R offers a means for assessing relatedness between people over time. Additionally, 3R can be applied beyond people relatedness to include word associations. To illustrate these two novel contributions, the authors reconstructed the temporal evolution of a social network from 2005 to 2009 of 92 individuals (key leaders) related to the U.S. financial crisis and also examined the temporal evolution of social sentiment (i.e., fear, shame, blame, confidence) related to the same 92 individuals. We found several illustrative cases where temporal changes in centrality and/or sentiment captured actual events related to these individuals during this time period.
Research advances outside organizational science have provided relevant methods to apply to complex problems within the social sciences. Specific advances in methods can be linked to particular fields, for example, the contribution economics has made to the development of time-series analysis (Klein, 1997) and contributions from medical sciences such as the Peto brothers’ advances in meta-analytic techniques (Peto, Collins, Gray, & Horwtiz, 1995) and mathematical statistics (Peto & Peto, 1972). Although in these cases researchers were predominantly working to perfect analytic techniques within their respective fields, other fields benefited when it became apparent certain methods could be adapted and applied beyond the primary field of origin or development.
A slightly different approach to this diffusion of methods has received attention of late, where purposeful and collaborative interdisciplinary research is conducted for the advancement of all fields. One such approach occurred in the interdisciplinary field of network science, where rather than a limited view of methodological development from within one particular field, network scientists rely on integration of interdisciplinary research and contribution. Interdisciplinary research teams seek means of advancing methods with the specific purpose of being broadly applicable to multiple fields simultaneously. It may represent a return to a “science for science’s sake” mentality, with no particular field set to gain advantage over any other field. For network science, the primary interest is the advancement of understanding in any research surrounding natural or manmade networks.
Building from network science methods, we employ network analysis techniques in the current study. Moreover, in the spirit of offering advancement to social science, we propose a new means for collecting approximated historical data of temporally changing adaptive social networks, which are rather difficult to obtain experimentally with conventional research methods. Although we offer this adaptation as a preliminary investigative tool, interdisciplinary advances have informed the development and application of our method. Moreover, this method may help social scientists assess the appropriateness of employing analytic tools used successfully in other disciplines of science (e.g., statistical machine learning applied to natural language processing, see the following) for a specific research project.
Applying Interdisciplinary Science
Two key areas of science have significant advances that enable interdisciplinary application and adaptation of techniques that can be applied in our retrospective relatedness reconstruction (3R). First, the primary adaptation of our method follows contributions from network science, especially as it informs both social network and semantic network analysis. Semantic network analysis is a relational approach to text analysis, using “network analytic techniques on paired associations among shared meaning, as opposed to paired associations of behavioral or perceived communication links” (Doerfel, 1998, p. 16).
The second area is another interdisciplinary scientific approach involving computer science and linguistics, natural language processing (NLP). NLP measures semantic similarity and/or networks and plays a key role in more modern applications of statistical machine learning algorithms (Bollegala, Matsuo, & Ishizuka, 2007). As such, both areas of science offer opportunities that relate to our retrospective relatedness reconstructions and we briefly review advances in each area before introducing our network analysis techniques and applications.
Network science
The unique interdisciplinary approach within network science involves fields of study from the disciplines of mathematics, statistics, physics, biology, social science, information science, and computer science, to name only a few (Börner, Sanyal, & Vespignani, 2007). The power of this approach is evident in the vast incorporation of network science methods into research in biology (Barabási & Oltvai, 2004; Hodgman, 2000), physics (Dorogovtsev & Mendes, 2002; Pastor-Satorras & Vespignani, 2002), and sociology (Carrington, Scott, & Wasserman, 2004; Wasserman & Faust, 1994).
However, despite significant gains in biology and physics due to the capability of collecting and analyzing immense data sets, adoption of this computational approach in social science research has been significantly slower (Lazer et al., 2009). According to Lazer and colleagues (2009), several barriers need to be overcome before social science can fully embrace such computational approaches, namely, data infrastructure paradigms focused mostly on cross-sectional views of hundreds or thousands of subjects at one time. Network and computational social science enables longitudinal research on millions of people, raising the ante on new perspectives of collective human behavior (Börner et al., 2007). To accomplish this work, we need to establish new data collection, analysis, and interpretation methods (Börner et al., 2007; Lazer et al., 2009). Unfortunately, this paradigm shift may be slowed by our limited insight from snapshot views of subjects at one point in time rather than over time.
In an effort to improve incorporation of network science into social science research, network science experts noted that training and accessibility surrounding easy-to-use programs and techniques can increase the acceptance of computational methods (Lazer et al., 2009). They suggested sharing of programs and data sets can build from existing computational tools available in the hard sciences; however, they cautioned that modifications for social science research are necessary to address the unique issues that are related to the use of human beings as research subjects.
Relatedly, Lee, Kim, Ahn, and Jeong (2010) noted that one key problem in examining social relationships among people is assigning a proper weight to each tie, as objectively quantifying the relatedness among people is difficult. They proposed that search engine counts, or estimates, of the number of web pages containing the words in a search query can be used as a measure of relatedness between pairs of people, as an increasing volume of pages found indicates increasing relevance between the pair. As such, Lee et al. assert that the occurrence of two individuals together in many “hits” on the web implies a closer relationship between these two individuals than would be seen in random counterparts.
Given the interdisciplinary nature of network science, it stands to reason that there is significant variety in approaches and research interests. For example, specialties exist in the search for common laws across application domains; development of measurement, modeling, and visualization algorithms; and/or detailed focus on a particular type of network (Börner et al., 2007). The inspiration for Lee and colleagues’ (2010) methodology is based on analytic tools used in the context of statistical physics of complex systems. Moreover, viewing the web as an automatic generator of sociological or relatedness data mirrors the vast data compiled from “high-throughput” biological experiments that have introduced and incorporated network science as a means of analyses into traditional biology. Thus, a new means for constructing social relatedness networks of people grew from the fields of biology and physics.
Our contribution above and beyond Lee et al. (2010) is twofold: (a) We expand the scope of web search relatedness measures from a one-time snapshot to dynamical processes (i.e., retrospective reconstruction), and (b) we expand the technique beyond social network applications to any kind of keyword associations. Two studies illustrate these novelties, one each. The two studies are linked by use of a common method, 3R, as well as a common sample. We apply this method to retrospectively reconstruct the social network surrounding leaders and key individuals involved in the 2008 financial crisis in the United States (Study 1). We then extend this proposed technique beyond social network analysis and examine the evolution of social sentiment (i.e., fear, shame, blame, and confidence) related to these same leaders and key individuals during the same timeframe (Study 2).
Semantic similarity
Although robust measurement of semantic similarity remains challenging (Bollegala et al., 2007), several applications have appeared in social science that offer promising contributions. One such application, semantic network analysis, represents the text as a network of objects. This network combined with other useful information is then queried to answer different research questions. For example, human coded semantic network content analysis has been applied to political analysis (cf. Kleinnijenhuis et al., 2007) and followed by a comparison using the same political material to perform an automated semantic network analysis (Van Atteveldt, Kleinnijenhuis, & Ruigrok, 2008). The automated semantic network analysis produced satisfactory performance at the level of measurement and the level of analysis when compared to the hand-coded analysis.
Human coded content analysis also has been applied to other areas of organizational research. Deephouse and Carter (2005) content analyzed two local midwestern newspapers for associations and relations between organizational legitimacy and organizational reputation. Moreover, Meijer and Kleinnijenhuis (2006) used a form of relational content analysis to code issue news and their relationship with corporate reputation. Showing the progression of interest in using news articles, Rocha and Cobo (2011) offered strategies for automating classification of digital news media content. Given the media’s use of the Internet as a platform to “increase social penetration and to influence public opinion” (Rocha & Cobo, 2011, p. 418), Rocha and Cobo devised their strategies as means for helping organizations manage/classify relevant information that can inform timely planning and control of their activities. Using a vector model to represent documents in a simple matrix form (e.g., documents and queries represented as vectors), distance measurements between documents are enabled, and the model represents documents by a weighted vector related with various selected features. A feature ranking method and a particle swarm optimization (PSO)–based selection method revealed these strategies can obtain good classification results using a small feature subset, although findings favored the feature ranking strategy (Rocha & Cobo, 2011). PSO is a computational problem optimization method moving around a population of candidate solutions (i.e., particles) within a search space using mathematical formulae regarding a particle’s position and velocity (Kennedy & Eberhart, 1995).
Thus, although both Rocha and Cobo (2011) and Van Atteveldt and colleagues (2008) revealed progression of robustness of method and incorporation of automated semantic analysis into organizational science, there may be several challenging elements to incorporating such techniques for social scientists. First and foremost, lack of familiarity with such elements as vector models, particle swarm optimization methods, and parser and extractor development used in semantic analysis may reduce accessibility to these methods, as these may require interdisciplinary collaboration and considerable time to develop and implement. This does not imply one should not pursue these options; however, it may behoove social scientists to first explore less complex methods as means of clarifying needs and relationships within the specific research. Following preliminary exploration, a more informed collaboration can be instigated with researchers (i.e., interdisciplinary) targeting specific methods and desired techniques.
As such, we offer an exploration technique that enables basic examination of relatedness among concepts of interest. Our method is not meant to supplant the more complex and comprehensive techniques described previously, but rather to provide tentative evidence that more complex and comprehensive techniques from other disciplines may be beneficially applied to specific research agendas. In the spirit of attempting to overcome some of the aforementioned barriers between network science and social science applications (Lazer et al., 2009; Lee et al., 2010), we propose a modification of Lee et al.’s (2010) web search engine–based social network construction to facilitate social science research where specific temporal relatedness is a central tenet of the research, especially with a historical focus (i.e., not real time or current).
Study 1: Social Network Retrospective Reconstruction With 3R
Although a form of network analysis is already being used in social and organizational science research to examine social networks (e.g., Balkundi & Harrison, 2006; Balkundi & Kilduff, 2005; Brass, 1984; Brass, Galaskiewicz, Greve, & Tsai, 2004), as noted previously from foremost experts in network science (Lazer et al., 2009), more network-based research and a broader range of network-based computational tools are needed within the social and organizational sciences to advance knowledge. For example, the importance of temporal dynamics of network topology and their coupling with node/link state dynamics has been increasingly recognized in network science communities (Braha & Bar-Yam, 2006; Gross & Sayama, 2009). However, it is generally difficult to experimentally obtain, or gather in field research, large-scale data of real-world social network evolution over time (Doreian & Stokman, 1997; Wasserman & Faust, 1994). The exceptions to this are some well-studied electronic data sets, such as citations in scientific publications and friendship networks in social media (e.g., Facebook and YouTube), which have been producing a concentration of network analysis research on these limited sets of networks (Newman, 2001a, 2001b).
Recently, Lee and colleagues (2010) proposed a new web search engine–based data collection method by which a researcher can easily reconstruct social networks of any kind by simply using the number of web search results (“hits”) for two names as the link weight between them. They demonstrated the effectiveness of this method by applying it to the social network reconstruction for the 109th U.S. Senate members. They also considered the temporal change of this network over several months in late 2006. However, the network “snapshot” data had to be acquired during the time period under investigation, so that the entire data collection process required that data be taken over the course of several months. A remaining open question is: How could one use web searches to reconstruct the history of social network evolution retrospectively?
We attempt to answer this question by proposing a web search engine–based method that is capable of collecting approximated historical data of temporally changing adaptive social networks by adding to a search query string other keywords that specify the inclusion and exclusion of specific years to limit the search results to a particular time point (i.e., 3R). For example, one can selectively search for results that are most likely relevant only to year 2006 by adding “2006”, “–2007”, “–2008”, “–2009”, and “–2010” to the search query string. This is accomplished via a modification to Lee et al.’s (2010) web search engine–based social network construction method where we restrict and limit relatedness data by specific dates of interest.
Research Context: 2008 Global Financial Crisis
We implemented a prototype of our proposed data collection method using an API (application programming interface) of a generic web search engine. This program sends a web search query string directly to a web search engine on the Internet, receives the search results from the engine, extracts the number of web search hits for that query from the results, and then returns it to the user. This process can take place without using a web browser and therefore can be automated for a large amount of data collection. We then applied it to the reconstruction of network history from 2005 to 2009 for 92 individuals (key leaders) who are important in the U.S. economy and industries. We selected this particular timeframe and sample because the U.S. financial crisis had far-reaching implications for the entire world. Moreover, there was extensive media coverage of nearly every aspect of the financial crisis, including implications for major economies such as the United States as well as several European and Asian countries.
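The automated collection loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the prototype itself: the source does not name the search engine or its API, so `build_query` and `count_hits` are hypothetical callables supplied by the caller, and the demo versions below are placeholders that perform no web request.

```python
from itertools import combinations
from math import comb


def collect_hit_counts(entries, years, build_query, count_hits):
    """Return {(entry_a, entry_b, year): hits} for every unordered pair and year.

    build_query and count_hits are hypothetical callables (assumptions):
      build_query(entry_a, entry_b, year) -> query string
      count_hits(query) -> number of hits reported by a search engine
    """
    entries = list(entries)
    years = list(years)
    hits = {}
    for year in years:
        for entry_a, entry_b in combinations(entries, 2):
            query = build_query(entry_a, entry_b, year)
            hits[(entry_a, entry_b, year)] = count_hits(query)
    return hits


def number_of_queries(n_people, n_years):
    """Number of pairwise queries: one per unordered pair per year."""
    return comb(n_people, 2) * n_years


# Placeholder stand-ins for illustration only; a real run would call a search API.
def demo_build_query(entry_a, entry_b, year):
    return f"{entry_a} {entry_b} {year}"


def demo_count_hits(query):
    return len(query)  # dummy value, NOT a real hit count


demo = collect_hit_counts(["A", "B", "C"], [2005, 2006],
                          demo_build_query, demo_count_hits)
print(len(demo))                 # 3 pairs x 2 years
print(number_of_queries(92, 5))  # pairwise queries for 92 people over 5 years
```

With 92 individuals and 5 years, one query per pair per year amounts to 4,186 pairs and 20,930 pairwise queries, which is why automation without a web browser matters for this kind of data collection.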
The financial crisis of 2008 pushed the global economy into the deepest economic recession since the Great Depression of the 1930s. The roots of this crisis can be traced back to the credit boom that peaked in mid-2007, caused primarily by low interest rates and loose monetary policy orchestrated by the United States Federal Reserve Bank, and eventually followed by central banks in major economies around the world (Steverman & Bogoslaw, 2008). The concerns expressed by experts about the U.S. housing market from inside and outside the government were already in the news in 2005 to 2007 (Morgenson, 2008). Moreover, several measures had already been taken by the U.S. government to contain the vulnerability of the mortgage market (Congleton, 2009).
Unfortunately, asset bubbles first burst in the subprime mortgages market in the United States and subsequently in all types of securitized financial products markets across the globe. As the crisis continued to deepen throughout 2008, the meltdown of the stock market and bond market, particularly after the failure of Lehman Brothers, triggered concerns regarding the liquidity and solvency of major financial institutions such as Goldman Sachs and Citigroup (Dash, 2008). These ongoing concerns escalated into a full-blown panic on the soundness of the entire United States financial system. Consequently, corporate and financial borrowing costs rose significantly and the economy was choked to a halt in late 2008 (Wilmarth, 2009).
During this economic crisis, leaders at federal government regulation agencies such as the Federal Reserve Bank and the Treasury Department, as well as leaders of major investment banks and business conglomerates such as Goldman Sachs, Bank of America, and General Electric, became the focal point of extensive scrutiny, fury, criticisms, and political confrontations from all facets of society (Steverman & Bogoslaw, 2008). Leaders from government, banking, real estate, investment, and even industry were lumped together and “villainized” as the public, and even Congress, tried to pinpoint the responsible parties.
But just how related were these various leaders and when did relations start and/or stop? Network analysis provides one means of examining relatedness among this set of leaders. We conducted a preliminary study to test these notions with the proposed method. However, we adapted current network analysis methods to include a retrospective reconstruction to allow examination of the evolution of these relations among key leaders over the primary period surrounding the financial crisis.
Method (Study 1)
Sample
The St. Louis Federal Reserve Bank (April, 2011) produced a report chronicling the timeline of events and policy actions that, in part, identifies several key industries and organizations involved in the financial crisis. Using the timeline and supporting articles and information, the following industries and organizations were identified as major industry sectors essential for the U.S. economy: financial services, insurance, banking, investment, retail, technology, and defense contracting. All sectors were heavily affected by the slumping economy, primarily the financial sector in the middle of the housing market downturn and stock market collapse (Steverman & Bogoslaw, 2008).
From these organizations we attempted to identify 100 key leaders (individuals) within these categories with the help of published analyses and lists such as the Fortune 100 (industry), the Bloomberg 20 (investment banking; Bloomberg, 2008), the Power 30 (most influential names for the world economy; Power 30, 2010), and the Top Technology Firms in the Fortune 500 (2008) for the time period from 2005 to 2009. We chose these years in order to observe relational changes surrounding the 2008 economic crisis. Then, we identified leaders of these organizations to be included in the analysis. We selected leaders because during times of crisis, people look to leaders and/or experts for information, support, and/or assistance and tend to scrutinize every action (Bass, 2008; Madera & Smith, 2009).
Leader identification method
Three doctoral-level research assistants, all with MBA degrees, independently generated a list of key leaders involved in the financial crisis using information from the St. Louis Federal Reserve Bank (April, 2011) report and the rankings and lists highlighting major industries and organizations involved in the financial crisis. The research assistants met to compare lists and look for commonalities among leaders and organizations to produce the 100 top leaders involved in the financial crisis. Agreement on a preliminary list was reached.
Additionally, we asked an economist researching the financial crisis to produce a list of names of top leaders involved from the same categories: government, banking, investments, industry, and real estate. The economist and research assistants met to compare lists and reach agreement on the top 100 leaders. This ranking exercise produced 100% agreement on 92 prominent leaders. As such, we selected these 92 leaders as focal leaders to include in our analyses. The list of leader names and affiliations can be found in Appendix A.
Social Network Evolution
The data collection method we propose takes as an input a list of keywords to be searched. Keywords can be of any kind, but in what follows, we focused on the names of the people (i.e., key leaders) we selected for inclusion in our social network reconstruction. To each name we added some additional personally identifiable information (in this case, the leader’s affiliation). This helped reduce some of the errors associated with the fact that many people often share the same name or the name has homonyms. For example, a search on Steve Jobs produced more than 12 million hits, one of which was a website for NBC.com, announcing a story about Late Night with Jimmy Fallon’s announcer Steve Higgins, and at the bottom of the page was NBC’s link to jobs within the company. Rerunning the search with “Steve Jobs” and “Apple” produced fewer than 1 million hits, and no link to Steve Higgins and the NBC jobs page.
Search queries were generated from this list in three steps.
Step 1. Two entries (corresponding to two people) are chosen from the list described previously.
Step 2. A year is chosen (e.g., 2007). The search query is designed to examine the nature of the relationship that existed between the two people during the chosen year.
Step 3. To eliminate the influence of search results corresponding to documents created in years after the chosen year, we compose a series of partial search queries that exclude the unwanted years. This is typically done in a search query by placing a negative symbol directly to the left of the unwanted year. That is, if the chosen year is 2007, then the partial search queries –2008, –2009, and –2010 must be included in the search query. Since documents often discuss events that occurred in previous years, we did not add keywords to exclude years prior to the chosen year.
Our completed search query is composed by putting together the elements described in Steps 1, 2, and 3. For example, to examine the relationship that existed between Warren Buffett and Alan Greenspan in 2007, we used the following search query:
“Alan Greenspan” “Federal Reserve” “Warren Buffett” “Berkshire Hathaway” “2007” –2008 –2009 –2010
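The query composition just described can be sketched in a few lines. This is an illustrative sketch rather than the authors' code: the function name `build_query` and its parameters are hypothetical, the last excluded year (2010) is taken from the examples in the text, and plain ASCII quotes and hyphens are assumed in place of the typographic characters printed in the article.

```python
def build_query(person_a, affiliation_a, person_b, affiliation_b,
                year, last_year=2010):
    """Compose a year-restricted query for two people (hypothetical helper).

    Names, affiliations, and the chosen year are quoted as exact phrases;
    every year after the chosen year, up to last_year, is excluded with a
    leading minus sign. Earlier years are deliberately not excluded.
    """
    included = [person_a, affiliation_a, person_b, affiliation_b, str(year)]
    quoted = [f'"{term}"' for term in included]
    excluded = [f"-{later}" for later in range(year + 1, last_year + 1)]
    return " ".join(quoted + excluded)


query = build_query("Alan Greenspan", "Federal Reserve",
                    "Warren Buffett", "Berkshire Hathaway", 2007)
print(query)
```

For the chosen year 2007 the sketch yields the four quoted names and affiliations, the quoted year, and the exclusions -2008 -2009 -2010; for the final year in the range (2010) no exclusions are appended.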
Results (Study 1)
Network Visualizations
Using our focal leader list we constructed a social network for each year from 2005 to 2009. Figure 1 (a), (c), (e), (g), and (i) are the network visualizations corresponding to the data obtained for the years 2005, 2006, 2007, 2008, and 2009, respectively. Within large samples containing multiple connections, it may be difficult to visualize any relationships among the tangle of links and nodes, and as such, a sparser subnetwork can be developed for visualization (Lee et al., 2010) by delineating between strong and weak connections. Dotted lines indicate weak connections, for example, links whose weights are between 0.1 and 0.2. Solid lines indicate strong connections, for example, links whose weights are greater than 0.2. The same node positions are used in all network visualizations so that the changes in link structure will be visible. This means leaders were held in the same position within the network. Evidence of this practice is provided by tracking the stationary position of five nodes over the years, represented by unique symbols (square, triangle, circle, star, and diamond). Network variations were visible around those symbols each year.
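The strong/weak delineation used for the sparser visualization can be illustrated with a short sketch. The thresholds 0.1 and 0.2 come from the text; the function name, the example weights, and the treatment of values exactly at 0.1 or 0.2 as weak are assumptions made for illustration.

```python
def classify_links(weights, weak_min=0.1, strong_min=0.2):
    """Split links into strong (solid) and weak (dotted) sets by weight.

    weights maps a (node_a, node_b) pair to a link weight. Links below
    weak_min are dropped from the visualization. Boundary handling
    (weak_min <= w <= strong_min counts as weak) is an assumption.
    """
    strong, weak = [], []
    for pair, weight in weights.items():
        if weight > strong_min:
            strong.append(pair)
        elif weight >= weak_min:
            weak.append(pair)
    return strong, weak


# Hypothetical weights, not data from the study.
example_weights = {("A", "B"): 0.35, ("A", "C"): 0.15, ("B", "C"): 0.05}
strong_links, weak_links = classify_links(example_weights)
print(strong_links)  # drawn as solid lines
print(weak_links)    # drawn as dotted lines
```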

Figure 1. Network visualizations.
Symbols in the figures show five individuals of interest: Squares, triangles, circles, stars, and diamonds indicate Warren Buffett, Timothy Geithner, Alan Greenspan, Lloyd Blankfein, and Henry Paulson, respectively. These individuals were selected based on government prominence (Greenspan, Paulson, and Geithner) and investor prominence (Buffett of Berkshire Hathaway and Blankfein of Goldman Sachs) as a means of examining changes in network visualization over the 5-year period (i.e., act as anchors of sorts). Among notable results is the changing shape and size of the network visualization as time moves from 2005 to 2009, adapting from more of a tightly clustered network in 2005 with several solid lines (stronger connections) to a star in 2009, with more dotted lines (i.e., weaker connections) visible in various locations within the social network.
Although important, within the small visualization diagram it becomes difficult to examine the relations among the 92 nodes (leaders) and therefore further analyses were conducted to examine network changes.
Maximum Spanning Trees
As network visualizations with large numbers of nodes can become crowded quickly, use of a spanning tree may assist with interpretation of the original graph via construction of a sparse subgraph of the original. A spanning tree of a network is a subset of links that ties all the nodes into a single connected component without any loops or cycles included. A maximum spanning tree (MST) is a particular kind of spanning tree where the sum total of link weights is maximal so that important connections tend to be included in the tree. By obtaining an MST, one can extract a meaningful “backbone” structure from a large network data set, and therefore it is extensively used in computer science and complex network science for visualizing complex networks. Computation of an MST can proceed by considering links in order of nonincreasing weight (largest first and smallest last) and including each link unless it would close a cycle with links already included; this is Kruskal’s algorithm (Pemmaraju & Skiena, 2003) applied to negated link weights.
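As a concrete illustration of the procedure, the sketch below builds a maximum spanning tree by visiting links from heaviest to lightest and skipping any link whose endpoints are already connected. It is a minimal sketch, not the implementation used in the study: the function name, the union-find bookkeeping, and the example weights are assumptions chosen for illustration.

```python
def maximum_spanning_tree(nodes, weighted_links):
    """Return the links of a maximum spanning tree (Kruskal, heaviest first).

    weighted_links is an iterable of (node_a, node_b, weight) tuples.
    A union-find structure tracks which nodes are already connected, so a
    link is kept only if it joins two separate components (no cycles).
    """
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent pointers to the component root, halving the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for a, b, weight in sorted(weighted_links, key=lambda link: link[2],
                               reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:          # link joins two components
            parent[root_a] = root_b
            tree.append((a, b, weight))
    return tree


# Hypothetical example network, not data from the study.
example_links = [("A", "B", 0.9), ("B", "C", 0.8), ("A", "C", 0.7),
                 ("C", "D", 0.3), ("B", "D", 0.2)]
mst = maximum_spanning_tree(["A", "B", "C", "D"], example_links)
print(mst)
```

In this example the link A-C (0.7) is skipped because A and C are already joined through B, and B-D (0.2) is skipped because D is already attached through C, leaving three links for four nodes.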
Figure 1 (b), (d), (f), (h), and (j) are MSTs that are created based on the same data as the network visualizations. Unlike the network visualization graphs, we did not fix the position of the five leaders of interest in MSTs. Symbols in the figures show the changing positions of five individuals of interest: Squares, triangles, circles, stars, and diamonds indicate Warren Buffett, Timothy Geithner, Alan Greenspan, Lloyd Blankfein, and Henry Paulson, respectively. Among notable results are the general clustering of governmental leaders over time, with occasional pairings with Blankfein, and the positioning of Buffett on his own over time, although not remotely positioned.
Betweenness Centrality
This concept measures how powerful a node is within the entire network. The measure is determined by how many connections a node makes with other nodes, when these other nodes are not connected to each other (Zhu, Watts, & Chen, 2010). As noted by Zhu and colleagues (2010), betweenness centrality is a critical distinction, as a hub for a set of isolated nodes differs in power from a hub for a set of interconnected nodes.
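The contrast between a hub for isolated nodes and a hub for interconnected nodes can be checked numerically. The sketch below uses the standard shortest-path definition of betweenness (the number of shortest paths between other node pairs that pass through a node, with ties shared fractionally), computed with Brandes's breadth-first accumulation scheme, and then rescales the scores so that they sum to 1, the normalization used for the centrality plots in this study. Treating the network as unweighted and undirected is an assumption; the text does not state how link weights entered the calculation, and the function names and example graphs are illustrative.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Shortest-path betweenness for an unweighted, undirected graph.

    adjacency maps each node to an iterable of its neighbours.
    """
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        order = []                                   # nodes in BFS order
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_count[source], distance[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:                  # first visit
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:   # v lies on a shortest path
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        dependency = {node: 0.0 for node in adjacency}
        for w in reversed(order):                    # farthest nodes first
            for v in predecessors[w]:
                share = path_count[v] / path_count[w]
                dependency[v] += share * (1.0 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Each unordered pair was counted from both endpoints.
    return {node: value / 2.0 for node, value in scores.items()}


def normalize_to_unit_sum(scores):
    """Rescale scores so they sum to 1 (left unchanged if all are zero)."""
    total = sum(scores.values())
    if total == 0:
        return dict(scores)
    return {node: value / total for node, value in scores.items()}


# Hub H with three mutually unconnected neighbours (hypothetical example).
star = {"H": ["A", "B", "C"], "A": ["H"], "B": ["H"], "C": ["H"]}
# Same hub, but the neighbours are also connected to each other.
clique = {"H": ["A", "B", "C"], "A": ["H", "B", "C"],
          "B": ["H", "A", "C"], "C": ["H", "A", "B"]}

star_scores = betweenness_centrality(star)
clique_scores = betweenness_centrality(clique)
print(normalize_to_unit_sum(star_scores))
print(clique_scores)
```

In the first graph every shortest path between two neighbours runs through the hub, so the hub holds all of the betweenness; in the second graph the neighbours reach each other directly and the hub's betweenness drops to zero, even though it has the same number of connections.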
Financial sector CEOs
Figure 2 shows temporal changes in betweenness centrality of the five focal leaders from 2005 to 2009. The betweenness centrality in Figure 2 is normalized for comparison so that the sum of the betweenness centralities of all nodes is 1. These plots correctly reflect several actual events that happened in the U.S. economy. A straightforward example is that the centrality of Alan Greenspan, chairman of the Federal Reserve until January 2006, drastically decreased in 2006 and 2007. In contrast, the centrality of Timothy Geithner, who became the U.S. Treasury Secretary in 2009, increased that same year.

Figure 2. Temporal changes in betweenness centrality of five individuals of interest (2005-2009).
Nonfinancial sector CEOs
Although temporal changes in betweenness centrality were evident for the five focal leaders, with each leader heavily involved in the financial sector during the targeted 5-year period, this may not be a surprising result. However, one may expect that leaders not directly involved in the financial collapse (i.e., leading sectors other than finance, banking, and real estate) may not experience movement in betweenness centrality over the same targeted 5-year period. If, for example, we examine five nonfinancial CEOs whose business sectors were not directly embroiled in the financial collapse, we should see fewer/minimal changes in their betweenness centrality over the same targeted period (2005-2009). As such, we selected CEOs from the technology, pharmaceutical, oil/gas, and engineering/manufacturing sectors and assessed their temporal changes in betweenness centrality over time.
Results revealed that betweenness centrality of these five nonfinancial CEOs did not show significant movement over the targeted 5-year period, as compared with movement of the five focal financial CEOs (Figure 3). Note that within the technology sector, CEOs Ballmer (Microsoft) and Chambers (Cisco Systems) experienced only minimal shifts in centrality. The same applies to the CEOs from the oil/gas sector (Tillerson at Exxon Mobil), pharmaceutical sector (Weldon at Johnson & Johnson), and the engineering/manufacturing sector (Swanson at Raytheon).

Figure 3. Temporal changes in betweenness centrality of five nonfinancial individuals of interest (2005-2009).
Discussion (Study 1)
Social Network Visualization
An interesting observation on the relationship between Lloyd Blankfein, CEO of Goldman Sachs, and Henry Paulson, U.S. Treasury Secretary and former CEO of Goldman Sachs, is that their closeness in this social network varies over time. Their closeness in 2006 (Figure 1(d)) may be related to the fact that Blankfein took Paulson’s position as Goldman Sachs CEO in that year. They were apart in the following year (Figure 1(f)) but again came close to each other in 2008, when the economic crisis occurred (Figure 1(h)). This relationship was also visible in the MSTs.
Betweenness centrality results also support this relatedness among Blankfein and Paulson. Although not widely known, Goldman Sachs was AIG’s largest trading partner in 2008 (Morgenson, 2008). When Blankfein noticed AIG was having severe liquidity problems, he called Paulson for help (Morgenson & Natta, 2009). The U.K. Guardian reported this as follows (Clark, 2009): Paulson’s office calendar at the Treasury, obtained by The New York Times through a Freedom of Information request, revealed that he spoke to Blankfein two dozen times during the September week when the Treasury bailed out AIG. That was “far more frequently” than Paulson talked to any other Wall Street executive.
Regarding other notable results, neither Blankfein nor Geithner is readily evident on the MST for 2005, nor is their betweenness centrality significant. However, as Blankfein moves into the CEO position in 2006, and risky lending begins raising questions at the Federal Reserve during 2006, both men become more central. Blankfein becomes more central with Paulson, his predecessor, and Geithner, still at the Federal Reserve Bank of New York, becomes more central with Greenspan, the outgoing Federal Reserve Chairman. Moreover, Geithner’s increasingly important role in government is reflected in the 2007 and 2008 MSTs, where he clusters with other government officials (Paulson and Greenspan) and experiences increasing betweenness centrality. Geithner played a central role working with Federal Reserve and Treasury Department personnel during the Bear Stearns and AIG debacles (Corkery, 2010; Moore, 2009).
Additionally, when Greenspan retires in 2006, his betweenness centrality drops and continues to decline until 2007, when it begins to rise again. Recall that in 2008 Greenspan was challenged as to his handling of derivatives, interest rates, and issues surrounding the housing crisis. Moreover, in congressional hearing testimony in October 2008, he admitted to errors in regulation, which unleashed a media frenzy on a previously revered Federal Reserve Chairman (Andrews, 2008). This media interest may be reflected in his increased betweenness centrality despite his retirement.
Whereas the financial sector experienced highs and lows throughout the targeted 5-year period, other sectors revealed lower volatility. For example, in our sample of five nonfinancial CEOs from technology, pharmaceutical, oil/gas, and engineering/manufacturing sectors, stock prices of their affiliated companies remained fairly stable over the 5-year targeted period, with the exception of Exxon Mobil. For the most part, respective stock averages in 2009 were within approximately $10 of their 2005 average, with most enjoying an average increase (see Appendix B).
The stability in organizational performance (as gauged by stock price) may help explain the fairly stable positioning related to centrality over the targeted 5 years. As organizations were able to remain viable throughout the collapse, no significant movement of CEOs into the financial sector realm was observed. However, if bailouts were necessary, there may have been some changes as CEOs in financial trouble could conceivably become more closely connected with federal officials assisting/deciding on bailout packages.
These correspondences between the history of these focal leaders and changes observed in the MST and betweenness centrality figures offer some validity for the 3R method and suggest that the method is capable, to some extent, of illuminating changes that have occurred in real-world social networks. We have proposed this new method for acquiring historical social network data retrospectively using web search engines. Temporal changes in network topology and node centrality measures reflected several real-world events, such as shifts of power and influence and temporary formation of strong relationships. These results demonstrate the potential of our method for examining changes that have occurred in real-world social networks. Beyond social networks, however, relatedness has value in interpreting other aspects of social systems. As such, we apply 3R to a new domain, that of social sentiment analysis.
Study 2: Social Sentiment Retrospective Reconstruction With 3R
Understanding public opinion trends and/or sentiments may help organizations provide better customer service (Mullen & Collier, 2004) or detect unfavorable rumors for risk management (Nasukawa & Yi, 2003). Similarly, as our leader sample represented some of the most economically influential people in the world, understanding social sentiment surrounding a crisis may help leaders manage more effectively during such exceptional and critical times. Examining dynamic sentiments associated with a crisis and/or particular leaders may be helpful in studying how a crisis evolves and may assist in studying decisions and/or decision processes during a crisis.
Given that we were able to retrospectively construct changing adaptive social networks of leaders using our proposed method, we considered that this retrospective relational approach might be applied to other social aspects of leadership within the 2008 financial crisis. We attempt to show that a retrospective relational approach is not limited to construction of social networks but is also useful in the investigation of relatedness among other social leadership factors, such as social sentiments. Because individuals turn to leaders in times of crisis (Bass, 2008; Madera & Smith, 2009), and several leaders in our sample ranked among the most influential people regarding the world economy (cf. Power 30, 2010), one may expect the public looked to these influential leaders for information and cues on how to understand the growing financial crisis. Assessing the relatedness, retrospectively, among focal leaders and key social sentiments may inform social scientists on critical leadership factors such as emergence and crisis management.
Research in sentiment analysis has recently turned to the Internet and social media data as a means of capturing vast feeds of sentiment, mood, and emotion of the public (Bollen, Mao, & Zeng, 2011; Godbole, Srinivasaiah, & Skiena, 2007; Mullen & Collier, 2004). Although some prior research on large-scale Internet data has used the terms mood states and/or sentiment interchangeably (cf. Bollen et al., 2011), much of the computer science literature has used the term sentiment to represent a type of expression captured via electronic data (cf. Godbole et al., 2007; Mullen & Collier, 2004; Nasukawa & Yi, 2003). As our data are retrieved from large-scale Internet search engine explorations, sentiment seems an appropriate term to capture general expressions of the public and the population we explore, as we may not be privy to more personally connected emotions.
Driven in large part by advances in computer science, this type of sentiment-based research has produced evidence of links between Twitter mood and the stock market (Bollen et al., 2011) and daily sentiment analysis reports for news and blogs (Godbole et al., 2007). Prior research tends to focus not on the specific sentiment per se, but on the orientation of the sentiment as either positive or negative (Mullen & Collier, 2004; Nasukawa & Yi, 2003), or on the semantic network surrounding the sentiment (cf. Van Atteveldt et al., 2008). The 3R method is not a semantic network analysis, but rather a preliminary means for examining relatedness between keywords that represent social sentiments and focal leaders. Although semantic network analysis would be a more comprehensive investigative tool for examining relations between leaders and social sentiments, an automated semantic network analysis requires considerable time and a high-quality syntactic parser for the focal language, and “developing and testing the extraction rules is a nontrivial task” (Van Atteveldt et al., 2008, p. 445).
Because parser and extraction rule development in automated semantic network analysis likely requires significant investment and possibly advanced computer programming skills, is it possible to adapt the methodology so that a less sophisticated user could examine a dynamic process? Application of 3R enables a preliminary view of keyword association changes over time. It should be considered preliminary in that it attempts only to establish associations and, as such, can rely on simple web-based search queries that require no programming skills. If preliminary evidence from 3R reveals associations, this may warrant investment in the more sophisticated sentiment analysis methods, available within various fields, that require advanced programming skills.
The application of 3R enables organizational researchers to study phenomena of interest at a level of analysis that traditionally may have been beyond their reach. Where researchers interested in macro phenomena may have been limited or confined to micro-level research, 3R provides an accessible and manageable means for incorporating macro-level perspectives into organizational research. Although we acknowledge that more in-depth macro-level exploration is possible with plentiful resources, 3R is a starting point that enables evaluation of macro-level perspectives, even with limited personnel and financial resources.
Therefore, we applied 3R to explore social sentiment surrounding these same 92 leaders. Importantly, this investigation is not a social network analysis but rather a different application of this relational tool originally developed for constructing adaptive social networks (Lee et al., 2010). Although different in application, at the core of both investigations is an exploration of relatedness (i.e., social networks among people and sentiments associated with people). As such, we suggest 3R is also appropriate in a retrospective social sentiment application to better understand the evolution or change of sentiments toward leaders over time.
Retrospective Social Sentiment Keyword Association
The negative sentiments of interest examined in this study are related to the “villainization” of Wall Street leaders, corporate CEOs, and even high-ranking government officials for incompetence and mismanagement. Early in the crisis several articles and news programs offered plenty of finger pointing and shaming in the form of a “who’s to blame” for the financial crisis slideshow launched by Businessweek.com and in Anderson Cooper’s countdown of the “10 Most Wanted Culprits of the Collapse” on his AC360 program on CNN (Steverman & Bogoslaw, 2008). Key negative sentiments expressed throughout ongoing media coverage and during congressional hearings on “behalf of constituents” or on “behalf of the public” by congressmen and congresswomen were shame, blame, and even fear (Andrews, 2008; Lo, 2009; Steverman & Bogoslaw, 2008). As such, we explored whether these negative sentiments would be associated with our leader set, as “fear,” “shame,” and “blame” represented some of the most notable sentiments surrounding the 2008 financial crisis and this leader set represented some of the most economically influential people in the world.
Additionally, we added “confidence” as a positive sentiment because a key attribute for leaders is, in fact, confidence (Bandura, 1997; Bass, 2008; Luthans & Avolio, 2003). Moreover, in times of crisis as individuals look to leaders (Madera & Smith, 2009), confidence may help people experiencing crisis find some relief and/or optimism related to the future (Bass, 2008). As such, we explored whether this positive sentiment would be associated with our leader set along with the negative sentiments.
Method (Study 2)
Analogous to Study 1, the data collection method takes as input a list of keywords to be searched. We focused on the names of 92 top influential leaders that are crucial to the U.S. economy. To each name we added four sentiment keywords, which were fear, shame, blame, and confidence, one at a time, for each year from 2005 to 2009. Additionally, we included in the search four synonyms for each main keyword. Anxiety, apprehension, fright, and panic were the synonyms for fear; degradation, dishonor, humiliation, and mortification for shame; guilt, culpability, regret, and remorse for blame; and hope, resolution, determination, and certainty for confidence.
Similar to Study 1, search queries were generated from this list as follows:
1. A leader’s name (with affiliation) is chosen from the list described and paired up with one of four sentiment keywords and its synonyms.
2. A year is chosen (e.g., 2007). The search query is designed to examine the nature of the relationship that existed between the name and the keyword (and its four synonyms) during the chosen year.
3. To eliminate the influence of search results corresponding to documents created in years after the chosen year, we compose a series of partial search queries that exclude the unwanted years. This is typically done in a search query by placing a negative symbol directly to the left of the unwanted year. As in Study 1, if the chosen year is 2005, then the partial search queries –2006, –2007, –2008, –2009, and –2010 must be included in the search query. Since documents often discuss events that occurred in previous years, we did not add keywords to exclude years prior to the chosen year.
Our completed search query is composed by putting together the elements described in Steps 1, 2, and 3. For example, to examine the association between Lloyd Blankfein and confidence, we used the following search query:
“Lloyd Blankfein” “Goldman Sachs” “Confidence OR Hope OR Resolution OR Determination OR Certainty” “2005” –2006 –2007 –2008 –2009 –2010
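The query construction steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function name `build_query` and its parameters are hypothetical, and plain ASCII quotation marks and hyphens stand in for the typographic characters shown in the example query.

```python
# Sentiment keywords and their synonyms, as listed in the Study 2 method.
SENTIMENTS = {
    "fear": ["anxiety", "apprehension", "fright", "panic"],
    "shame": ["degradation", "dishonor", "humiliation", "mortification"],
    "blame": ["guilt", "culpability", "regret", "remorse"],
    "confidence": ["hope", "resolution", "determination", "certainty"],
}


def build_query(name, affiliation, sentiment, year, last_year=2010):
    """Assemble one search query string (hypothetical helper)."""
    # Step 1: pair the name and affiliation with a keyword and its synonyms.
    words = [sentiment] + SENTIMENTS[sentiment]
    keyword_group = " OR ".join(w.capitalize() for w in words)
    parts = [f'"{name}"', f'"{affiliation}"', f'"{keyword_group}"']
    # Step 2: add the chosen year.
    parts.append(f'"{year}"')
    # Step 3: exclude every later year up to last_year; earlier years are kept.
    parts += [f"-{y}" for y in range(year + 1, last_year + 1)]
    return " ".join(parts)


query = build_query("Lloyd Blankfein", "Goldman Sachs", "confidence", 2005)
print(query)
```

For 2005 the function reproduces the example query, with exclusions for 2006 through 2010. Looping over 92 names, four sentiments, and five years would yield 1,840 such queries.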
Relative Frequencies
We also calculated the relative frequency of each of the four social sentiment keywords aggregated over 92 focal leaders as shown in Equation 2. For example, the relative frequency of fear in our sample is the total number of search results for fear over all 92 names, divided by the total number of search results for all four keywords over all names:
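The verbal definition of relative frequency given above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the dictionary layout keyed by (leader, keyword), and all counts are hypothetical, and the counts are assumed to come from a single year.

```python
def relative_frequency(hits, keyword):
    """Share of all search results that belong to one sentiment keyword.

    hits maps (leader_name, keyword) to a number of search results for a
    single year (hypothetical layout).
    """
    keyword_total = sum(n for (_, k), n in hits.items() if k == keyword)
    grand_total = sum(hits.values())
    return keyword_total / grand_total if grand_total else 0.0


# Made-up counts for two leaders in one year (illustration only).
toy_hits = {
    ("Leader A", "fear"): 30, ("Leader A", "shame"): 10,
    ("Leader A", "blame"): 20, ("Leader A", "confidence"): 40,
    ("Leader B", "fear"): 10, ("Leader B", "shame"): 20,
    ("Leader B", "blame"): 30, ("Leader B", "confidence"): 40,
}
fear_share = relative_frequency(toy_hits, "fear")  # (30 + 10) / 200
```

With these toy counts the share for fear is 0.2, and the shares of the four keywords sum to 1. Passing only the entries for a subset of leaders gives the corresponding subset value.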
We compare the relative frequency of interest to the benchmark relative frequency given in Equation 3. To enable more appropriate comparison among sentiments, the benchmark relative frequency employs a weighting factor that factors in frequency data on the likelihood of word occurrence in the English language. Frequency data are available from the Corpus of Contemporary American English (2011). For example, since fear is four times more likely to occur in language than shame (Corpus of Contemporary American English, 2011), fear would be weighted four times more in the benchmark relative frequency. Thus, benchmark relative frequency is a weighted computation of the number of web search hits for each sentiment keyword by itself in each year of 2005 to 2009, divided by the weighted total web search hits for all four words in each year:
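One possible reading of the weighted benchmark described above is sketched below. The exact form of the weighting is an assumption: here each keyword’s stand-alone hit count is multiplied by a weight proportional to its general-language frequency before the shares are computed. The function name, the weights, and the counts are all hypothetical; only the 4:1 ratio between fear and shame follows the example in the text.

```python
def benchmark_relative_frequency(keyword_hits, weights, keyword):
    """Weighted share of one keyword among all four keywords in one year.

    keyword_hits maps each keyword to the hits obtained when it is searched
    by itself; weights maps each keyword to a language-frequency weight.
    Both layouts are assumptions made for this sketch.
    """
    weighted = {k: weights[k] * n for k, n in keyword_hits.items()}
    total = sum(weighted.values())
    return weighted[keyword] / total if total else 0.0


# Made-up inputs: equal raw hits, fear weighted four times as much as shame.
toy_keyword_hits = {"fear": 100, "shame": 100, "blame": 100, "confidence": 100}
toy_weights = {"fear": 4.0, "shame": 1.0, "blame": 2.0, "confidence": 3.0}
fear_benchmark = benchmark_relative_frequency(toy_keyword_hits, toy_weights, "fear")
```

With equal raw counts, the benchmark shares simply mirror the weights, so fear receives 0.4 and shame 0.1 in this toy case.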
Results (Study 2)
Table 1 presents descriptive statistics of the web search results. The mean reflects the average number of “hits” per leader on a focal keyword (or synonym of the keyword). For the set of 92 focal leaders, the highest negative sentiment related to leader names is in 2008 (fear) and 2009 (blame and shame). For confidence, the lowest sentiment related to leader names is in 2009.
Table 1. Descriptive Statistics for Sentiment Relatedness to 92 Focal Leaders
To understand these numbers relative to the sentiment appearing in the web search not specifically linked to any of our 92 focal leader names, we provide relative frequency charts in Figure 4. As noted in the charts for all four sentiments, the benchmark relative frequencies are fairly stable over the 5-year period. However, we see changing frequencies on all sentiments related to our 92 focal leaders. For example, confidence is sharply lower related to our 92 focal leaders than in the benchmark for 2009. Conversely, blame and shame are sharply higher related to our 92 focal leaders than in the benchmark for 2009.

Figure 4. Relative frequencies for sentiments related to 92 CEOs/leaders and subset of five focal financial leaders versus benchmark relative frequencies.
Subset Analysis of Five Key Leaders
Using this proposed method we also can take a small subset view of specific leaders and social sentiment (using Equation 2 we substitute 5 for 92). Again, turning to our five focal financial sector leaders from the prior social network analysis, we look at the relative associations between focal sentiments and Warren Buffett, Timothy Geithner, Alan Greenspan, Lloyd Blankfein, and Henry Paulson. Descriptive statistics are presented in Table 2.
Table 2. Descriptive Statistics for Sentiment Relatedness to Five Key Focal Leaders
Means reveal that during and after the financial crisis (2008 and 2009) these five leaders had a greater relatedness to social sentiment than our overall group of 92 leaders. Negative sentiments of fear, shame, and blame are considerably higher for our focal five leaders than for the overall group. Moreover, confidence in 2008 was considerably higher for this focal group of five leaders than for the overall group.
Additionally, Figure 4 presents relative frequency charts specifically for our five focal leaders as compared to the benchmark relative frequencies. Results revealed these five leaders trended similarly in social sentiment to the overall group of 92 leaders, but with some key differences. For example, the sentiment of fear associated with our overall leader set decreased in 2009, whereas for our five focal leaders the sentiment increased in 2009. Shame tracked nearly identically among the overall group and our five key focal leaders; however, blame was lower for the five focal leaders in both 2006 and 2009. Lastly, leading up to the crisis (2006 and 2007), confidence sentiments associated with our leader groups moved oppositely. Confidence was higher for the focal five leaders in 2006 and steeply dropped in 2007, whereas for our overall group of 92 leaders confidence was lower in 2006 and rose steeply in 2007.
3-D Plots for Five Key Leaders
Additionally, we parsed each focal leader’s association with the four sentiment keywords. Three-dimensional plots revealed a snapshot of the five focal leaders for the years of interest (2005-2009), number of web search hits, and the four sentiments of interest (where 1 = fear, 2 = shame, 3 = blame, and 4 = confidence).
Figure 5 results indicated higher social sentiment associated with Geithner (fear, blame, and confidence), Paulson (confidence), and Blankfein (blame and some confidence), concentrated in the crisis and post-crisis years (2008 and 2009). A lower occurrence of social sentiment was associated with Buffett (fear) and Greenspan (shame), predominantly in the post-crisis year (2009).

Figure 5. Three-dimensional graphs of five focal leaders related to sentiment, occurrence of sentiment, and year of interest.
Discussion (Study 2)
As noted previously, this retrospective sentiment association is not a social network analysis or a semantic analysis. Rather, the 3R method was used in this case to examine the relatedness of our focal leaders, not with other leaders, but with social sentiment expressed and captured on the Internet over a period of years. During crisis events, negative sentiments are viewed as common and natural (Tiedens, Ellsworth, & Mesquita, 2000). Our research showed that negative sentiments such as shame and blame were present in 2009 immediately following the crisis and were sharply more associated with our five financial sector focal leaders than in a general benchmark occurrence.
Fear
Upon further examination of results, note that general levels of occurrence for the sentiment fear remain fairly stable on the Internet, although there is a slight increase during the financial collapse of 2008. Interestingly, our five focal financial sector leaders have reduced fear associated with them from 2006 through 2007, possibly related to the increase in financial standing of Americans, where home ownership was steadily increasing (Congleton, 2009). However, as it became apparent this housing bubble could not sustain the credit strain, from 2007 to 2009 the sentiment of fear associated with our five focal financial leaders rose sharply, exceeding levels commonly found on the Internet.
Moreover, examining the individual 3-D plots revealed that four of the five financial sector focal leaders experienced at least one significant spike in fear, especially surrounding 2008 and/or 2009. The lone exception in this fear spike is Blankfein.
Notable as well is the close tracking of the overall set of 92 leaders to the path of the focal five leaders along the fear sentiment path, although the 92 leaders end 2009 slightly below average for the sentiment of fear compared to its benchmark. Again, as stock prices remained fairly stable and these leading organizations remained viable, CEO fear among these select 92 leaders may have fallen below that of the key financial players, who were still attempting to address household economic and joblessness crises in 2009.
Shame
As a sentiment, shame displayed the least volatility among our five focal leaders and the larger set of 92 leaders. From 2006 to 2007, shame was associated less with the five focal leaders and 92 CEOs than generally occurring on the Internet. It may be that as Americans were living above their means, staking large credit positions while shrinking their savings (Morgenson, 2008), the general economic euphoria of the population lessened public interest in CEOs. However, when homes, vehicles, and lifestyles crumbled, reading about massive CEO bonuses, salaries, and benefits (Anderson, 2006; “Wall Street bonuses set new record,” 2006) struck a chord with the new “have-nots” among the population. The steep shift to above average occurrences of shame associated with both our five focal leaders and larger pool of 92 CEOs indicates some displeasure with this group. The combination of outrageous bonuses and a slow economic recovery after 2 consecutive years of job losses and wage cuts may be responsible for the significant increase in the association of shame with our leaders.
Again, examining the individual 3-D plots revealed that four of the five financial sector focal leaders did not experience any significant spike in shame. The lone exception is Greenspan in 2009, who acknowledged regret over his handling of the economy leading up to the financial crisis (Andrews, 2008).
Blame
The public tended to associate blame with leaders’ names more frequently than the average blame during the 5-year period. Moreover, blame represents one of the most volatile of the four keywords, especially for our five financial sector focal leaders. Individual analyses from the 3-D plots reveal Blankfein and Geithner have a significant spike in blame beginning in 2008. This may be related to widespread frustration with the economy as the crisis and recession linger, coupled with a barrage of articles and interviews attempting to assign blame for the financial crisis, some of which centered around Goldman Sachs, which had employed both men at one time (Morgenson, 2008). Moreover, amid bailout programs for large companies and banks on Wall Street in 2008, executive bonuses in the financial sector were staggeringly large—Wall Street bonuses paid to New York City securities industry employees rose by 17% to $20.3 billion in 2009—outraging the general public (“DiNapoli: Wall Street Bonuses Rose Sharply in 2009,” 2010).
Blame is also associated with the group of 92 leaders. Although less volatile than our financial focal leaders, the 92 CEOs generally remain well above average levels of blame occurring on the Internet, except in 2008, when Americans may have taken some responsibility for living above their means and becoming embroiled in the credit crisis.
Confidence
Confidence was generally sharply lower when associated with our five focal leaders than in a general benchmark occurrence, which again may be related to negative sentiment increasing during crises. Recall that the financial crisis peaked late in 2008 after Lehman Brothers filed for Chapter 11 bankruptcy protection on September 15, 2008; this timing likely explains the significant changes in these sentiments from 2008 to 2009 as the fallout from the crisis (i.e., foreclosures, joblessness, and credit crisis) began. As indicated by the leaders’ high association with confidence in 2008, the public possibly thought the Troubled Asset Relief Program (TARP) would be the “magic bullet” for the economy and that people would keep jobs, insurance, and homes. However, TARP did not prevent the crisis crunch, which may explain the sharp drop in confidence associated with these five leaders.
Examining our larger pool of 92 leaders revealed findings similar to those for the five financial sector focal leaders; namely, the public tended to associate confidence with these leaders’ names less frequently than the average use of confidence, especially following the crisis. The individual 3-D plots for the five focal financial leaders support this finding as well. Again, the slow recovery may be behind these negative sentiments associated with leaders of large organizations.
Summary
Immediately following the crisis, media outlets and even Congress were looking to attach blame (Andrews, 2008) to leaders and government officials, and our search method appears to capture that pursuit. Increasing negative sentiments and declining positive sentiment were associated with our focal leaders, who were some of the most influential leaders in the world economy. However, a cautionary note is important regarding such interpretations: The sentiment keyword association only reveals relatedness relative to a particular search query. As such, it should not be interpreted in any causal manner, for example, by stating that the financial crisis caused these sentiments.
Discussion of the 3R Method
Thus we have modified a web search engine–based social network construction tool, in part inspired by and developed with researchers in physics and complex systems, and applied the 3R method to temporally changing adaptive social networks and sentiments. Without breakthrough research in biology, physics, mathematics, computer science, and complex systems, progress in examining adaptive social networks may be significantly slower within the social sciences. This collaborative and interdisciplinary approach to methods development has significantly reduced diffusion, adaptation, and adoption time of this method into several fields (Lazer et al., 2009); and with more recent advances in analysis of social networks coming from this interdisciplinary research, social and organizational science is poised to follow and even contribute to the ongoing development of this methodology.
We attempted to modify an existing interdisciplinary method for web search engines to enable retrospective analysis of social data and applied it two ways in social and organizational science research: by reconstructing temporally changing adaptive social networks and social sentiments. Both applications of our method were supported by factual historical events and trends and keyword association, which demonstrate some initial validity and offer promise and potential for our method. For example, marketing research may be able to use the technique for assessing temporal changes of corporate images, or linguistics research may examine how connotations of words have changed in the last decade. In the organizational sciences, beyond the applications offered here, researchers in organizational behavior, organizational theory, and human resources management, for example, might be able to employ our technique as a preliminary means to study changes, dynamics, and/or evolution of and relatedness in networks of individuals, dyads, groups/teams, organizations, strategic business units, and industries, among other entities. Hoppe and Reinelt (2010) noted several examples of how relatedness conceptualizations and analyses, such as social network analysis and leadership network development and evaluations, have revealed strategic strengthening of relationships among health care professionals in central California, influenced revitalized civic engagement among civic leaders and community members in Lawrence, Massachusetts, and developed successful field-policy leader networks to improve the quality of early education throughout the state of Massachusetts.
From a social science perspective, this computational approach may be similar to more conventional approaches such as historiometry. Historiometry has three key elements: nomothetic hypotheses, quantitative approaches, and sampling from historical figures, not student samples (Simonton, 1990). In essence, historiometry was the forerunner of data mining and usually involved teams of researchers and assistants poring through documents, speeches, and so on and coding data for quantitative analysis. Data mining, which represents a single step in the knowledge discovery process, is an intelligent method applied to extract data (Han & Kamber, 2006) and uses the power of a computational approach. The computational approach, involving data mining, is similar to historiometry in that it too represents a knowledge discovery process and can involve nomothetic hypotheses, quantitative data analytic approaches, and sampling from actual subjects engaged in a particular activity or context.
Some advantages of computational approaches to research topics include the ability to more readily incorporate macro-level perspectives into organizational research, especially given the significant time savings and expansion of breadth of material able to be evaluated by machine and technology. These advantages likely lead to cost savings in personnel resources. For example, in this study, word association searches on all 92 CEOs took approximately 20 minutes per year investigated. Thus, in approximately 2 hours using accessible technology, social scientists can produce a preliminary data set that can assess evidence of relatedness among people or sentiments. Depending on the research question, at that time, research can proceed with more sophisticated network and/or semantic analytic methods that may take advanced programming expertise and months of labor.
Through various illustrative cases mentioned previously, our proposed 3R method captured actual events related to specific focal leaders over the studied time period. This indicates the potential effectiveness of the proposed method for examining changes in real-world social settings. The 3R tool uses basic and accessible programming and technology and can operate elemental searches in a matter of hours.
For researchers interested in employing 3R, or some other search engine–based method, we recommend the following steps:
1. Explore Lee and colleagues’ (2010) work detailing the development of web search engine–based social network construction; this is an excellent starting point. They offer tremendous insights and considerations, and our method was a modification of their method.
2. Proficiency in Java on the research team, or having someone capable of implementing suitable code (e.g., Java, R, or Python), is critical for this type of research method. More often than not, computer scientists are willing to share code and offer some direction for computations.
3. Using the aforementioned steps provided in both Study 1 and Study 2 as a template, specify appropriate temporal parameters given particular research interests and/or questions.
Limitations and Future Directions
Despite strengths, our 3R method has several limitations. First, our method relies solely on simple web searches using heuristically crafted search queries, and therefore its validity is yet to be fully examined and established. Additional filtering using more contextual information may help improve the quality of search results. Our method also has a computational limitation, since the number of search queries required to construct a social network grows quadratically as the number of keywords increases.
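The quadratic growth noted above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' implementation: it assumes one search query per unordered pair of keywords per year, and the function names are hypothetical.

```python
from itertools import combinations

def pairwise_query_count(n_keywords: int) -> int:
    """Number of unordered keyword pairs: n * (n - 1) / 2."""
    return n_keywords * (n_keywords - 1) // 2

def enumerate_pairs(names):
    """List every unordered pair of names that would need its own query."""
    return list(combinations(names, 2))

# Assumption: one query per pair per year, for 92 individuals over 5 years.
n_leaders = 92
n_years = 5
per_year = pairwise_query_count(n_leaders)
total = per_year * n_years
print(f"{per_year} pairwise queries per year, {total} over {n_years} years")

# Doubling the number of keywords roughly quadruples the number of queries.
print(pairwise_query_count(2 * n_leaders) / per_year)
```

Under this assumption, adding a single keyword to a set of n keywords adds n new pairwise queries per year, which is why larger samples quickly become expensive to search.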
Yet another limitation is the potential error in search results. Repeated runs of an identical web search query can sometimes produce very different results: The number of search hits might vary by orders of magnitude given that the web is ever changing (Lewandowski, 2008). In the current study, rerunning searches 5 months later produced increases in the number of hits on sentiments in over 85% of cases. Data from 2005 and 2006 showed smaller increases in all sentiments (an average of 25 hits across sentiments, where hits ranged from 0 to 191). These smaller increases may be attributed to a trend in index freshness, where the number of older pages that no longer change has increased significantly on the web (Lewandowski, 2008). Data from 2009 had an average change of 135 hits across sentiments, although sentiment hits for that year ranged from 0 to 826.
Although differences among data runs were observed, we evaluated these differences by computing the means of normalized weights among our 92 leaders across each sentiment over the 5-year period in each of the data runs (Time 1 data run was the initial data query; Time 2 data run was the data query conducted 5 months later on the exact same leaders, sentiments, and years). For example, if Warren Buffett accounted for 15% of the Internet hits on fear among our 92 CEOs for 2005 in the Time 1 data query, he was assigned a weight of .15; and likewise, if Warren Buffett accounted for 17% of the hits on fear in 2005 in our Time 2 data query, a weight of .17 was assigned. Finally, we computed a mean weight for each leader, by sentiment, across all 5 years in both Time 1 and Time 2 data runs and correlated these means. Results indicated the Time 1 and Time 2 data runs were significantly correlated for all four sentiments among our 92 leaders (r = .71 for fear; r = .96 for shame; r = .75 for blame; r = .74 for confidence; all correlations significant at p < .01). The high correlations among data runs indicated that although there were actual differences between the number of hits obtained in Time 1 versus Time 2, the data were generally related and in the same direction.
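The comparison described above can be sketched in a few lines of code. This is an illustrative sketch only: the hit counts are hypothetical, the function names are our own, and the Pearson correlation is computed directly rather than with any particular statistics package.

```python
from math import sqrt

def normalize(hits):
    """Convert raw hit counts into shares of the total (weights sum to 1)."""
    total = sum(hits.values())
    if total == 0:
        return {name: 0.0 for name in hits}
    return {name: count / total for name, count in hits.items()}

def mean_weights(hits_by_year):
    """Average each leader's normalized weight across all years in one data run."""
    yearly = [normalize(hits) for hits in hits_by_year.values()]
    names = list(yearly[0])
    return {n: sum(w[n] for w in yearly) / len(yearly) for n in names}

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical hit counts for one sentiment: {year: {leader: hits}}
time1 = {2005: {"Leader A": 15, "Leader B": 60, "Leader C": 25},
         2006: {"Leader A": 20, "Leader B": 50, "Leader C": 30}}
time2 = {2005: {"Leader A": 17, "Leader B": 58, "Leader C": 25},
         2006: {"Leader A": 26, "Leader B": 70, "Leader C": 34}}

w1, w2 = mean_weights(time1), mean_weights(time2)
leaders = sorted(w1)
r = pearson_r([w1[n] for n in leaders], [w2[n] for n in leaders])
print(f"Time 1 vs. Time 2 correlation of mean weights: r = {r:.2f}")
```

Because the weights are shares of each year's total, a uniform rise in raw hits between runs leaves them unchanged; only shifts in the relative standing of leaders lower the correlation.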
For our purposes, as a preliminary research tool indicating “relatedness,” conducting a search over and over would be counterproductive, as the exploratory search is used to assess the appropriateness of moving forward with more sophisticated methods. However, in an attempt to gain the most accurate picture of relatedness, we would recommend collecting data multiple times during the initial search period, from a variety of machines, to compare efficacy of the search. In order to minimize the type of error associated with search engine results, we recommend comparing runs and aggregating as appropriate if returns indicate fairly consistent results across runs and machines. As basic searches can be completed within hours, searching over a single day with a few machines could produce a dozen or so data sets for efficacy evaluation.
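One way to compare and aggregate repeated runs is sketched below. It is an assumption-laden illustration rather than a prescribed procedure: the coefficient-of-variation threshold of 0.25, the use of the median as the aggregate, and all names and hit counts are hypothetical choices.

```python
from statistics import mean, median, pstdev

def aggregate_runs(hit_counts, max_cv=0.25):
    """Return (median hit count, whether the runs are consistent).

    Runs count as consistent when their coefficient of variation
    (population standard deviation / mean) is at most max_cv.
    The threshold is an illustrative assumption.
    """
    m = mean(hit_counts)
    cv = pstdev(hit_counts) / m if m > 0 else 0.0
    return median(hit_counts), cv <= max_cv

# Hypothetical hit counts from four runs of the same query on different machines.
runs = {
    ("Leader A", "fear", 2009): [120, 131, 118, 125],
    ("Leader B", "fear", 2009): [40, 400, 38, 45],
}

for query, counts in runs.items():
    value, consistent = aggregate_runs(counts)
    status = "aggregate" if consistent else "inspect before aggregating"
    print(query, value, status)
```

The median is used here because a single aberrant run, such as the 400 in the second query, would distort a mean; queries flagged as inconsistent can be rerun or examined by hand.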
An additional drawback to our proposed method is the inability of the search method to note whether the hits were related to a particular direction. For example, take confidence as a keyword that revealed lower confidence hits related to our focal leaders in 2009. What the search does not indicate is if people were stating that they have “no confidence” in these leaders or that they have “every confidence” in these leaders. The hits are confidence (and synonyms) and not the specific view (direction) of confidence. This also holds for blame. The search does not indicate if people were stating these leaders “were not to blame” or these leaders are “fully to blame” or some mix of both depending on different focal leaders.
As 3R is designed to fairly quickly show some preliminary level of relatedness by more than chance (Lee et al., 2010), direction may not need to be addressed in the preliminary search. However, once relatedness is established, use of more sophisticated semantic analysis and/or automated classification tools (cf. Deephouse & Carter, 2005; Rocha & Cobo, 2011; Van Atteveldt et al., 2008) could be employed to decipher feature, direction, and/or network of sentiments in question.
Moreover, although we used key synonyms from thesaurus entries (in English), altering the synonyms used in the search could produce different results. Using thesaurus entries as a guideline, centrally related keywords should be logically employed, as they better reflect the focal sentiment. However, additional research could examine how outcomes differ when central synonyms of the focal keyword are used versus secondary or less related synonyms.
Finally, information that is not readily available in the public data domain, or that is inaccurately displayed there, may be missed by, or incorrectly included in, computational analyses. However, with scanning capabilities and advances in data transmission, placing data accurately in digital form for appropriate inclusion in analyses now seems more readily accomplishable.
Therefore, it is critical to understand the limitations of relatedness analyses using the 3R method. Our method reveals relatedness by detecting the presence of a hit, but it cannot necessarily parse out the specific direction of that hit. This presence, however, implies the focal variables of interest are more closely related than random counterparts (Lee et al., 2010) and therefore likely worthy of further investigation. As such, we recommend our procedure act as a preliminary step, or a screening process, for more detailed and involved analyses such as historiometry, qualitative studies, and even more detailed social network analyses based on careful data searches using historiometric, data mining, and content analysis techniques.
Although we acknowledge the limitations of our method, and note that more detailed methods may provide better interpretative capabilities, the benefits of establishing preliminary relatedness before spending countless resources devoted to more traditional types of relatedness analyses would seem clear. With only a few days of preliminary analyses using 3R, one can quickly verify relatedness or lack of relatedness. Investing this relatively brief amount of time as a preliminary step in social and organizational research can provide a snapshot of key focal individuals or other entities, key periods in history, or even key focal sentiments or other variables and constructs that can be linked to times, places, and people. And for some researchers, given their interests and/or other studies in a stream of research, such 3R results per se may be sufficient for their scholarly purposes.
The social sciences have benefited from advances in a variety of other fields, but in this particular study, social science has benefited specifically from the interdisciplinary field of network science. Advances in physics, biology, mathematics, and computer science have opened possibilities for social scientists examining a social or organizational network. Understanding how to view networks in a variety of ways and, as importantly, capture their adaptive or dynamical properties can only improve our understanding of how individuals and/or collectives relate in a complex context. In future studies, we plan to conduct a more rigorous validation of the 3R data collection and analysis method and to develop more advanced temporal filtering techniques using textual contents of the documents returned by web search engines.
We support the practice of looking at advances in other scientific fields and attempting to apply adaptations of research methods to social and organizational science. Moreover, and as important, we encourage interdisciplinary research teams not only to look for ways to apply various research methods across disciplines but also to work together to develop new research methods. Although these methods may have differential impact across fields (e.g., physics may benefit more than social science or vice versa), in the process we will have advanced science for science’s sake.
Appendix A: Focal Leader Sample With Affiliations
Appendix B: Annual Corporate Stock Prices for Five Nonfinancial CEOs From 2005 to 2009
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
