Abstract
This paper considers different perspectives on indicators produced by official statistics agencies, with an emphasis on technical aspects. We discuss the statistical methods, impact, scope and operationalisation of official statistics indicators. The focus is on multivariate aspects of analysing and communicating such indicators. To illustrate the points made in the paper, we use examples from well-being indicators, from the UN sustainable development goals and a Eurobarometer example. The overall objective is to enhance the added value of official statistics indicators, as they are communicated, and thus strengthen evidence-based policy-making.
Background, purpose and scope of the paper
This paper is about how official statistics indicators are constructed and produced, with an emphasis on their statistical properties. Indicators are a specific product of statistics at the interface between the statistical production processes and the processes of interpretation and use of statistics. This makes it necessary to look at this interface in both directions linking the questions of design, production and communication of indicators with their use. Moreover, it is necessary to clarify whether these two sides, the producer and the consumer, are independent of each other and sequential, or whether they influence each other.
Our overarching goal is the production of indicators that are fit for purpose and generate information of high quality. On the one hand we consider aspects and possibilities of the statistical methodology used to condense (multifactorial) indicators, and on the other hand we consider the interaction between producers and users of indicators, which is of a more sociological nature. The next section is about indicators and the reduction of complexity to simplify the analysis. Section 2 is focused on statistical methodology of official statistics indicators. Section 3 is on the level of aggregation of indicators, Section 4 on assigning a differentiated level of importance to components used to compute indicators. Section 5 is on the integration of indicators to get an overall perspective, Section 6 on multivariate analysis and Section 7 on graphical presentation of indicators. The paper concludes with a discussion section.
Statistical methodology of statistical indicators
A common but problematic approach in official statistics reports is to view a list of indicators as independent, i.e. without accounting for mutual effects between indicators. This poses several challenges in analyzing, communicating and operationalizing such indicators:
Analyzing refers to how a multivariate data set is analyzed by statistical and analytic methods. Statistical analysis is available in a multitude of platforms that can handle numbers, text, images, recordings, social media networks etc. In particular, administrative data and questionnaire-based data can be combined to provide deeper insights and generate information of high quality. We discuss this in Section 5.
Communicating is about the visualization tools used to present such indicators. Indicators, like all outputs of statistical analysis, need to be presented in an appropriate way, to the relevant persons, at the proper time. Reporting platforms can include traditional printed reports but also interactive dashboards that can be queried for deep-dive investigations and linked graphics.
Operationalization is the consideration of such a set of indicators by decision makers and policy formulators in support of their decisions. In order to achieve results, Deming [1] posed three questions to managers: 1) What do you want to accomplish? 2) By what methods will you accomplish it? 3) How will you know when you have accomplished it? We need indicators to answer (1) and (3).
To help achieve concrete results, the reduction of complexity in statistics means using methods to separate the important from the unimportant. Clear and concise metrics are needed to convey useful information. In economic statistics, this objective is achieved by referring to indicators based on monetary currency. This approach condenses statistics on national production into a single aggregated figure: the GDP. Economists also evaluate other phenomena in monetary terms. From this perspective, the conventional economic approach can be seen as the opposite of an approach using various indicators. In the former, the challenge of aggregating multivariate variables is circumvented by monetarisation at the micro level. This approach requires monetary valuation of market transactions and needs to be justified when producing official statistics. In markets that do not yet exist, such as those for environment-related public goods, monetary valuations need to be simulated. This, in turn, is possible and justifiable only to a limited extent within the ethical and qualitatively defined standards of official statistics (42). Against this background, an OECD-based expert group led by Joseph Stiglitz called for a dashboard of indicators, rather than a single aggregated figure, to better measure societal progress: “No single metric will ever provide a good measure of the health of a country, even when the focus is limited to the functioning of the economic system. Policies need to be guided by a dashboard of indicators informing about people’s material conditions and the quality of their lives, inequalities thereof, and sustainability. This dashboard should include indicators that allow us to assess people’s conditions over the economic cycle. Arguably, policy responses to the Great Recession might have been different had such a dashboard been used.” [2]
We address these challenges by considering the multivariate structure of official statistics indicators. Developing such capabilities will provide an improved policy support platform that indicates the overall effect of local changes.
Statistical issues affecting common practice in computing and displaying indicators range from the identification of latent variables, accounting for local variability, and statistical aggregation and clustering of indicators, to providing a multivariate model that accounts for associations between indicators. We treat these aspects in the next section. Issues arising from the common approach of weighting individual indicators to derive an overall score have been discussed in [3].
Aggregation of indicators
Identifying the level of synthesis in aggregating indicators is not a trivial issue. The decision depends on the meaning that the final aggregation should have. We demonstrate this with an example.
Subjective wellbeing is observed by collecting individual-level data on both positive and negative effects representing wellbeing. The wellbeing indicator is represented by the effect balance, defined as the difference between positive effects and negative effects. Even if the difference can be obtained at a macro level, the synthesis should be performed at the individual (micro) level.
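The distinction between micro-level and macro-level synthesis can be illustrated with a minimal sketch. The respondent scores and the helper name `effect_balances` are hypothetical and serve only as an illustration. The mean balance is the same whichever level it is computed at, but only the individual-level balances retain distributional information, such as the share of respondents with a negative balance.

```python
from statistics import mean

def effect_balances(positive, negative):
    """Return the individual-level balance (positive minus negative) per respondent."""
    if len(positive) != len(negative):
        raise ValueError("each respondent needs both a positive and a negative score")
    return [p - n for p, n in zip(positive, negative)]

# Hypothetical scores for five respondents (illustrative only).
positive = [7, 6, 2, 8, 3]
negative = [3, 5, 6, 2, 7]

balances = effect_balances(positive, negative)

# Synthesis at the individual (micro) level, then averaged.
micro_mean = mean(balances)

# Difference of the two aggregate means (macro level).
macro_difference = mean(positive) - mean(negative)

# Information that is only available from the micro-level balances.
share_negative = sum(b < 0 for b in balances) / len(balances)

print(balances, micro_mean, macro_difference, share_negative)
```

In this toy data the two means agree, yet two of the five respondents have a negative balance, which the macro-level difference alone cannot reveal.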
Selecting the indicators to be included in a wellbeing aggregation represents a fundamental step since it operationally determines the concept defining the phenomena that the aggregation is supposed to measure. This step aims at (i) testing (reflective model) or investigating (formative model) the level of complexity of the concept measured by the indicators in terms of dimensionality, and (ii) refining the selection of the indicators showing the best statistical characteristics.
Dimensional analysis relies on the correlations between indicators. According to the reflective model for aggregating indicators, the selected indicators are considered to be caused by the same latent variable. They can be considered multiple measures that contribute to the measured variability. Consequently, highly correlated indicators support the assumption that the measurement model is reflective. The statistical property allowing the aggregation of reflective indicators is internal consistency. An approach to reflective models is Factor Analysis (FA), which can be applied in order to test the hypothesized dimensional structure underlying the selected indicators. In particular, it allows the indicators that better fit the latent dimensional structure to be synthesized. FA is based on the assumption that the total variance of each indicator is a linear combination of different components (additive assumption): the common variance (due to the dimensional structure), the specific variance (due to the specificity of each indicator), and the error. By estimating the amount of common variance (communality), the reflective approach is tested. Often, Principal Component Analysis (PCA) is applied to test dimensional structures. This practice is, however, criticized, as the main goal of PCA is not to test a (dimensional) model but to decompose the correlations among basic indicators by calculating linear combinations of variables. PCA describes the variation of a data set using a smaller number of dimensions than the number of original indicators.
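Internal consistency can be quantified in several ways. The sketch below uses Cronbach's alpha, a widely used coefficient that is not named in the text and is introduced here only as an assumption for illustration. The indicator scores and the function name `cronbach_alpha` are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of indicator columns.

    Each column holds the scores of the same units on one indicator.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of unit totals)
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two indicators")
    totals = [sum(unit_scores) for unit_scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical scores of five units on three indicators (illustrative only).
indicator_a = [1, 2, 3, 4, 5]
indicator_b = [2, 2, 3, 5, 5]
indicator_c = [1, 3, 3, 4, 4]

alpha = cronbach_alpha([indicator_a, indicator_b, indicator_c])
print(round(alpha, 3))
```

Values close to 1 indicate that the indicators vary together, which is what the reflective model assumes. A high alpha does not by itself establish a single underlying dimension, so it complements rather than replaces a factor analysis.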
The latent construct in FA can turn out to be multidimensional, with the variance of each indicator explained by one or more dimensions. Aggregating indicators by referring to single dimensions of the latent variable is acceptable, and the aggregated score can be easily interpreted. Creating a single score out of several dimensions (i.e., aggregating indicators referring to all dimensions of the latent variable) may also be consistent and meaningful.
In the case of formative models for aggregating indicators, the correlation informs us about the level of overlap in indicators measuring the same unobservable variable.
From the technical point of view, several instruments are available:
Correlation Analysis, which is useful in selecting indicators that are not redundant and in avoiding multicollinearity (double counting) in composite indicator construction.
Principal Component Analysis, whose main goal is to describe the variation of a data set using a number of scores that is smaller than the number of the original indicators. This approach is applied to test dimensional structures.
Multidimensional Scaling, which allows the underlying dimensionality to be tested and a geometrical multidimensional representation (map) to be created for the whole group of indicators.
Cluster Analysis, which can be useful to identify meaningful groupings among indicators.
In some cases, the approaches can be combined (e.g., tandem analysis or factorial k-means analysis) [4].
Consequently, a high level of correlation between two indicators indicates redundancy and suggests selecting only one of them. Preference should be given to the indicator that allows trend analysis and wide comparisons and that is available for a large number of cases.
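A redundancy screen of this kind can be sketched as follows. The indicator names, the data and the threshold of 0.9 are assumptions made for illustration; the text does not prescribe a cut-off.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def redundant_pairs(indicators, threshold=0.9):
    """List indicator pairs whose absolute correlation reaches the threshold."""
    flagged = []
    for name_a, name_b in combinations(indicators, 2):
        r = pearson(indicators[name_a], indicators[name_b])
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged

# Hypothetical values of three indicators on five units (illustrative only).
indicators = {
    "indicator_a": [60, 62, 58, 65, 61],
    "indicator_b": [30, 31, 29, 32.5, 30.5],  # proportional to indicator_a
    "indicator_c": [7, 5, 6, 6, 8],
}

pairs = redundant_pairs(indicators)
print(pairs)
```

The screen only flags candidate pairs. Which member of a flagged pair to keep remains a substantive decision based on the criteria stated above.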
The nature of a latent variable defined in the formative context is multidimensional. Consequently, aggregation of formative indicators raises several issues. In general, when concepts are truly multidimensional, collapsing indicators to just one synthetic indicator is very questionable. The nuances and ambiguities of the data can be forced into a conceptual model where all the features affecting one-dimension indicators are considered as noise to be removed. Moreover, synthetic scores can be biased towards a small subset of basic indicators, failing to give a faithful representation of the data.
Defining the importance of each indicator
In the case of formative models, stating the importance of indicators requires a criterion able to reproduce as accurately as possible the contribution of each indicator to the overall synthesis. From this perspective, the definition of the weighting system constitutes an improvement and refinement of the model of measurement.
While equal weighting does not necessarily imply unitary weighting, the adoption of the differential weighting procedure does not necessarily correspond to the identification of different weights but rather to the selection of the most appropriate approach in order to identify the weights [4]. Assigning differential weights can be problematic, especially when the decision is not supported by theoretical reflections that provide a meaning to each indicator or by consideration of its impact on the synthesis. In this sense, apart from the applied approach, the defined weights represent judgment values. The methodology to be adopted should be carefully evaluated and made formally explicit. From the technical point of view, the weighting procedure consists of defining and assigning a weight to each indicator. The weights are used in the successive computation of the individual aggregate score, when each weight is multiplied by the corresponding individual value of the indicator. In order to define the weighting system, some decisions need to be adopted:
The proportional size of weights (equal or differential weighting)
The approach to derive weights (objective or subjective)
The level of the weights (individual or group)
The approach is considered objective when the weights are determined through an analytic process, that is, through (i) statistical methods (correlation, Principal Component Analysis, Data Envelopment Analysis, Unobserved Components Models) or (ii) multi-attribute models, like Multi-Attribute Decision Making (in particular, Analytic Hierarchy Processes, AHP) or Multi-Attribute Compositional Models (in particular, Conjoint Analysis, CA). The adoption of statistical methods in weighting socio-economic components has to be considered carefully since, by removing any control over the weighting procedure from the analysts, it may give a false appearance of mathematical objectivity that is actually difficult to achieve in social measurement.
The approach is considered subjective when it involves individuals (experts or citizens) in the process of defining weighting systems for social indicators. These approaches aim to give more legitimacy to social indicators by taking into account the importance citizens attach to them (values) rather than, as was usually done in the past, statistical importance.
Weights can be kept constant or can be changed according to particular considerations concerning each application. In both cases, the researcher needs to rationalize the choice. The former approach can be adopted when the aim is to analyse the evolution of an examined phenomenon. The latter is useful when the aim is to define particular priorities.
In any case, we have to consider that a set of weights able to express perfectly the actual and relative contribution of each indicator does not exist.
Integrating indicators
The selection of the proper aggregation technique should consider whether it (i) admits or does not admit compensability among indicators; (ii) requires or does not require comparability (with reference to the nature of the data) among indicators; (iii) requires or does not require homogeneity in the indicators’ level of measurement.
An aggregating technique is compensatory when it allows low values in some indicators to be compensated by high values in other indicators. A compensatory technique can be useful in some contexts, especially when the purpose of applying indicators is to stimulate behaviours aimed at improving the overall performance by investing in those areas showing lower values. The selection of the aggregating technique is also an important choice in order to avoid inconsistencies between the weights previously chosen, in terms of theoretical meaning and importance, and the way these weights are actually used. In other words, in order to continue interpreting the weights as “importance coefficients”, a non-compensatory aggregating procedure has to be preferred, such as a non-compensatory multi-criteria approach like Multi-Criteria Analysis (MCA).
Comparability refers to the distributional characteristics of indicators, in particular to directionality and functional form. Directionality refers to the direction in which each indicator measures the concept (i.e., positive or negative). The aggregative approach requires uniformly oriented indicators. If an indicator needs to be re-oriented, it has to be submitted to a reflection procedure.
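The paper's own reflection formula is not available in this text. As an assumption, the sketch below uses one common form of reflection for an indicator measured on a bounded scale, where each value x is replaced by lower + upper - x. The function name `reflect` and the example scale are hypothetical.

```python
def reflect(values, lower, upper):
    """Re-orient scores on a bounded scale so that the highest score becomes the lowest.

    Assumes every value lies between `lower` and `upper` (inclusive).
    """
    for v in values:
        if not lower <= v <= upper:
            raise ValueError(f"value {v} lies outside the scale [{lower}, {upper}]")
    return [lower + upper - v for v in values]

# Hypothetical negatively oriented indicator on a 1-5 scale (5 = worst).
anxiety = [1, 2, 5, 4]
low_anxiety = reflect(anxiety, lower=1, upper=5)
print(low_anxiety)  # [5, 4, 1, 2]
```

Applying the same reflection twice returns the original values, so the transformation changes only the orientation of the indicator and not the distances between units.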
Functional forms represent how changes are valued at different levels of an indicator’s distribution. If changes are valued in the same way, regardless of level, then the functional form is linear. If changes are valued differently, according to the level, the functional form is not linear. In other words, in some cases the same absolute differences between observed values are valued differently and consequently can have different meanings (e.g., a change of …).
Homogeneity refers to the level of measurement adopted by the whole group of indicators. Almost all aggregating techniques require homogeneous scales. Some techniques exist that allow the indicators’ original scales to be transformed into an interpretable common scale. In order to select the proper approach, the data quality and properties and the objectives of the indicator should be taken into account.
The literature offers several aggregation techniques [4]. The linear aggregation approach (additive techniques) is the most widely used. By contrast, multiplicative techniques (following the geometrical approach) and the technique based upon multi-criteria analysis (following the non-compensatory approach) allow the difficulties caused by compensation among the indicators to be overcome.
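The effect of compensation can be seen in a small numerical sketch comparing a weighted arithmetic mean (additive) with a weighted geometric mean (multiplicative). The two unit profiles, the equal weights and the function names are hypothetical. Indicator values are assumed to be already normalised to a common positive scale and the weights are assumed to sum to one.

```python
from math import prod

def additive_score(values, weights):
    """Weighted arithmetic mean: each weight multiplies the indicator value."""
    return sum(w * v for v, w in zip(values, weights))

def geometric_score(values, weights):
    """Weighted geometric mean: requires strictly positive indicator values."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric aggregation needs strictly positive values")
    return prod(v ** w for v, w in zip(values, weights))

weights = [0.5, 0.5]
balanced = [0.5, 0.5]    # moderate on both indicators
unbalanced = [0.9, 0.1]  # very high on one indicator, very low on the other

additive = (additive_score(balanced, weights), additive_score(unbalanced, weights))
geometric = (geometric_score(balanced, weights), geometric_score(unbalanced, weights))
print(additive, geometric)
```

Under additive aggregation the two profiles receive the same score, because the high value fully compensates for the low one. The geometric mean scores the unbalanced profile lower, so compensation is reduced, although it is not eliminated entirely.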
Multivariate analysis of indicators
A common approach to presenting indicators is to use a wheel display or radar plots. Another approach is to use line plots (see for example [5]). These methods present a frozen perspective which only indirectly reflects the links between indicators. To overcome this, reports of indicators typically include upward or downward arrows reflecting short-term trends over time.
League tables are sometimes prepared on the basis of indicators. Such leagues are notorious for misrepresenting the complexities of reality and have been widely criticized in scientific work and in the press [3, 6]. They are strong in communication but poor in information.
An important advance has been the methodology of sensitivity analysis proposed by Saltelli and colleagues [7, 8]. This work helps decision makers evaluate the robustness of alternative decisions, for example on movement restrictions during a pandemic, through their impact on indicator measurements.
Bayesian networks [9, 10] provide a graphical dynamic perspective that can be used for diagnostic and predictive analysis. This extended capability supports decision makers in assessing the causes of given situations and predicting the impact of policy decisions.
We present in this section, with examples, four approaches to the statistical analysis of indicators that can be used in data aggregation, in comparability of indicators, and in representing homogeneity of indicators. These tools are:
Control chart analysis
Cluster analysis
Principal component and factor analysis
Bayesian network analysis
These approaches aim at leveraging the multivariate structure of indicators and account for variability in time to identify significant effects. They give us the means to account for variability in the value of indicators, identify units with similar behaviour, reduce the dimensionality of a large number of indicators and provide graphical synthesis of indicators. As mentioned, we focus on examples, providing only a general theoretical introduction to these methods. For more details see [10, 11].
A simplistic example
This simplistic example is designed to show the impact of the multivariate aspects of indicators. Specifically, we consider three indicators. The first two are presented at values (0.1, 0.4) and (0.4, 0.1). The third indicator is affected by these two indicators, and we assume that the larger the value of that third indicator, the better off we are. Figure 1 shows the path of change that the first two indicators need to follow to reach the peak value of the third indicator. Short paths mean small changes; long paths mean big changes. The paths show the changes that are needed to reach the top. The two-way landscape leads to different paths to the top, some short, some long. The multivariate landscape provides an integrated view.
Figure 1. A simplistic example with two indicators and two sets of values. The two-way landscape leads to different paths to the top, some short, some long.
A case study based on wellbeing indicators published by the Israeli Central Bureau of Statistics, by large cities in Israel, is presented below.1 The indicators cover employment-related aspects from 2016 and 2017. In this example, we apply both control charts and cluster analysis. Control charts are used in industrial statistics to monitor processes over time [11]. Their main structure is an estimate of variability in the data and a line plot, over time, with control limits typically positioned three standard deviations above and below the grand average. Observations above or below the control limits indicate an observation not compatible with the grand average. In [10], control charts are used to analyze survey data and identify survey items with unusually high or low ratings. This is used, among other things, to highlight strengths and weaknesses in comparing various units, such as cities. Here we use control charts to account for the variability of indicators in consecutive years in different cities. The control limits indicate results that exceed such variability. This highlights exceptionally high or low results. We apply this approach to analyze the employment rate indicators, by city, across two consecutive years. This allows us to distinguish cities with significantly high and low employment rates from cities with an employment rate compatible with the country’s grand average. Figure 2 shows an analysis with JMP 15 of employment rates in 16 large Israeli cities. The dots correspond to averages over 2016 and 2017. The order of the city listing is determined alphabetically using their Hebrew names. The variability between the two years is used to construct a control chart that classifies the cities into cities with a high employment rate (over 63.76) and cities with a low employment rate (below 58.64). For more on the use of control charts to analyze survey-based data, see [10].
Figure 2 shows that the average employment rate, over 2016 and 2017, in large Israeli cities was 61.2. The analysis identifies cities with significantly high employment rate and cities with significantly low employment rate, compared to the grand average.
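The construction of such limits can be sketched as follows. The text does not state which estimator JMP used, so the sketch assumes a conventional X-bar chart in which each city is a subgroup of two yearly rates and the standard deviation is estimated from the average within-city range. The city labels and rates are hypothetical and do not reproduce the published data.

```python
from math import sqrt
from statistics import mean

D2_SUBGROUP_OF_2 = 1.128  # standard control-chart constant d2 for subgroups of size 2

def control_chart(rates_by_city, n_sigma=3.0):
    """Classify city averages against limits built from year-to-year variability."""
    city_means = {city: mean(pair) for city, pair in rates_by_city.items()}
    grand_average = mean(list(city_means.values()))
    # Within-city variability: average absolute difference between the two years.
    mean_range = mean([abs(a - b) for a, b in rates_by_city.values()])
    sigma = mean_range / D2_SUBGROUP_OF_2
    # Standard error of an average of two yearly observations.
    half_width = n_sigma * sigma / sqrt(2)
    lower, upper = grand_average - half_width, grand_average + half_width
    labels = {}
    for city, city_mean in city_means.items():
        if city_mean > upper:
            labels[city] = "high"
        elif city_mean < lower:
            labels[city] = "low"
        else:
            labels[city] = "compatible"
    return grand_average, lower, upper, labels

# Hypothetical employment rates (year 1, year 2) for four cities.
rates = {
    "City A": (60.0, 61.0),
    "City B": (62.0, 63.0),
    "City C": (55.0, 56.0),
    "City D": (66.0, 67.0),
}

grand_average, lower, upper, labels = control_chart(rates)
print(round(grand_average, 2), round(lower, 2), round(upper, 2), labels)
```

Because the limits are driven by within-city variability between the two years, a city is flagged only when its average differs from the grand average by more than that year-to-year noise would explain.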
Figure 2. Control chart of average employment indicator in large Israeli cities, in 2016 and 2017.
Another analysis of the employment related indicators is shown in Fig. 3. The indicators used here are i) Employment rate, ii) Partial time employment rate, iii) Long term unemployment, iv) Overall satisfaction from work and v) Overall satisfaction from income. The analysis of these five indicators is based on Ward’s hierarchical clustering and displayed with dendrograms and constellation plots obtained with JMP v15.2
Differently from Fig. 2, which presents average employment rate across 2016 and 2017, the clustering in Fig. 3 treats simultaneously 5 indicators, by year.
The cluster of Jerusalem and Bet Shemesh is in the overall low employment group both in 2016 and 2017, and the cities of Tel Aviv and Rishon Lezion are consistently in the overall high employment cluster.
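Ward's method merges, at each step, the two clusters whose union gives the smallest increase in the within-cluster sum of squares. A minimal pure-Python sketch of that criterion is given below. The city labels and the two standardised indicator values are hypothetical, and the sketch returns only the merge order that a dendrogram would display. It does not reproduce the JMP output.

```python
def ward_merges(points):
    """Agglomerative clustering with Ward's criterion.

    `points` maps a label to a tuple of (already standardised) indicator values.
    Returns the list of merges as (members_a, members_b, cost).
    """
    # Each cluster stores its member labels and its centroid.
    clusters = {name: ([name], list(vec)) for name, vec in points.items()}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = list(clusters)
        for i in range(len(keys)):
            for j in range(i + 1, len(keys)):
                members_a, centroid_a = clusters[keys[i]]
                members_b, centroid_b = clusters[keys[j]]
                n_a, n_b = len(members_a), len(members_b)
                dist2 = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
                # Increase in within-cluster sum of squares if the two are merged.
                cost = n_a * n_b / (n_a + n_b) * dist2
                if best is None or cost < best[0]:
                    best = (cost, keys[i], keys[j])
        cost, key_a, key_b = best
        members_a, centroid_a = clusters.pop(key_a)
        members_b, centroid_b = clusters.pop(key_b)
        n_a, n_b = len(members_a), len(members_b)
        centroid = [(n_a * x + n_b * y) / (n_a + n_b) for x, y in zip(centroid_a, centroid_b)]
        clusters[key_a + "+" + key_b] = (members_a + members_b, centroid)
        merges.append((sorted(members_a), sorted(members_b), cost))
    return merges

# Hypothetical standardised values of two indicators for four cities.
cities = {
    "City A": (-1.0, -0.9),
    "City B": (-0.8, -1.1),
    "City C": (1.0, 0.9),
    "City D": (0.9, 1.1),
}

merges = ward_merges(cities)
for members_a, members_b, cost in merges:
    print(members_a, members_b, round(cost, 3))
```

In this toy example the two pairs of similar cities are merged first at low cost, and the final merge joining the two groups is far more costly, which is the pattern that identifies two clusters in a dendrogram.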
Figure 3. Cluster analysis of five employment indicators in Israeli cities, in 2016 and 2017.
Figure 4. Principal component analysis of UK well-being indicators.
The third case study presented here demonstrates the application of principal component analysis (PCA), factor analysis (FA) and Bayesian network analysis (BN) to well-being indicators.
PCA and FA have different goals: PCA is a technique for simplifying the analysis by reducing the dimensionality of multivariate data, whereas FA is a technique for identifying variables that cannot be measured directly, called latent variables. In PCA, all of the variance in the data, reflected by the correlation matrix, is used to attain a solution. The resulting principal components are a mix of the measured variables. On the other hand, in FA, not all of the variance in the data is attributed to the underlying latent variables. This feature is reflected in the FA algorithm, which “reduces” the correlation matrix with squared multiple correlation values. The squared multiple correlation is the estimate of the variance that the underlying factors explain in a given variable (also known as “communality”). As the number of variables involved in the analysis increases, results from PCA and FA become more similar. The authors of [12] argue, with simulations, that analyses with at least 40 variables lead to minor differences. Moreover, if the communality of the measured variables is high, then the results of PCA and FA are also similar. For more on applications of PCA and FA to survey data see [10].
In this subsection we apply both PCA and FA to a set of indicators published in [5]. Figure 4 displays the first two principal components of 8 well-being indicators in 9 UK geographical areas. These two principal components cover 80% of the overall variability in the data. In the middle panel we identify the outlying position of London (on the left) and the similarity between satisfaction with income and with health (lower part of right panel), and the closeness of happiness, low anxiousness, people to rely on and overall satisfaction.
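The share of variability covered by a principal component is its eigenvalue divided by the number of indicators, when the analysis is based on the correlation matrix. The sketch below computes this share for the first component only, using power iteration in plain Python. The data are hypothetical, the function names are illustrative, and the starting vector is assumed not to be orthogonal to the leading eigenvector. The published analysis used JMP and two components.

```python
from math import sqrt

def correlation(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def first_component(columns, iterations=200):
    """Return (share of variance, loadings) of the first principal component."""
    k = len(columns)
    r = [[correlation(ci, cj) for cj in columns] for ci in columns]
    v = [1.0 / sqrt(k)] * k  # starting vector (assumed not orthogonal to the solution)
    for _ in range(iterations):
        w = [sum(r[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    rv = [sum(r[i][j] * v[j] for j in range(k)) for i in range(k)]
    eigenvalue = sum(a * b for a, b in zip(v, rv))
    # The eigenvalues of a correlation matrix sum to k, so the share is eigenvalue / k.
    return eigenvalue / k, v

# Hypothetical values of three indicators in five areas (illustrative only).
columns = [
    [1, 2, 3, 4, 5],
    [2, 1, 4, 3, 5],
    [1, 3, 2, 5, 4],
]

share, loadings = first_component(columns)
print(round(share, 3), [round(x, 3) for x in loadings])
```

With these toy data the first component accounts for roughly three quarters of the total variance. Further components would require deflating the correlation matrix or a full eigendecomposition from a numerical library.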
Figure 5. Factor analysis of UK well-being indicators.
Figure 6. Bayesian network of UK wellbeing indicators.
Figure 7. a: Bayesian network of UK wellbeing indicators showing impact on overall satisfaction of difference between groups with high (top panel) and low (bottom panel) responses to question on people reliance. b: Bayesian network of UK wellbeing indicators showing impact of overall satisfaction (high: top panel and low: bottom panel) on people reliance.
Figure 5, on the other hand, is a factor analysis of the same data. Here too we rely on the first two factors, which help us identify two latent variables. The first factor combines happiness, overall satisfaction, people to rely on and a low level of anxiousness; we might call it the “feeling good” dimension. The second factor is determined by unemployment rate, feeling of loneliness, and satisfaction from income and health; we might call this the “feeling secure” dimension.
Bayesian networks belong to the family of probabilistic graphical models based on directed acyclic graphs. These graphical structures are used to represent knowledge about a domain represented by a multivariate structure. The graph structure, which represents the “qualitative” part of the model, is combined with the “quantitative” parameters of the model. The parameters are derived from the Markovian property, where the conditional probability distribution at each node depends only on its parents. For discrete random variables, this conditional probability is represented by a table, listing the local probability that a child node takes on each of the feasible values, for each combination of values of its parents. The joint distribution of a collection of variables is determined uniquely by these local conditional probability tables. The conditional dependencies in the graph are estimated by using known statistical and computational methods. Hence, Bayesian networks combine principles from graph theory, probability theory, computer science and statistics. They can be used to map association and cause-and-effect relationships between variables [9]. Bayesian networks were used to integrate survey-based official statistics with administrative data in Section 10.5 in [13]. Here, we apply Bayesian networks to the wellbeing indicators published by [5] using the GeNie software.3 Figure 6 is a Bayesian network of the 8 indicators used above. Figure 7a and b present the impact of conditioning the network on specific variables. In Fig. 7a we show the impact on overall satisfaction of high and low values of the indicator measuring whether people have someone to rely on. At the high level of the indicator 34% declare very high overall satisfaction; at the low level of the indicator we see a drop of 7 percentage points to 27%. This analysis is an example of predictive analysis using indicators. Under this model, increasing citizens’ sense that they have people they can rely on would increase the top-level satisfaction group by 7 percentage points. In Fig. 7b, we conduct a diagnostic exercise where we compare the groups with highest and lowest overall satisfaction levels in terms of responses to the question: “How much can you rely on your spouse/family member/friend if you have a serious problem?”. Citizens with high overall satisfaction have a 26% probability of being in the group with the highest response to the question. Citizens with low overall satisfaction have only a 12% probability of being in that top group, a drop of 8 percentage points.
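The difference between predictive and diagnostic use of a Bayesian network can be sketched with a deliberately reduced two-node network, reliance -> satisfaction, each with only two levels. The two conditional probabilities are the figures quoted above for Fig. 7a. The prior probability of high reliance (0.5) and the binary coding are assumptions made for illustration, so the diagnostic result below is not comparable with the figures reported for Fig. 7b.

```python
# Conditional probability table of the child node:
# P(very high satisfaction | reliance level), as quoted in the text for Fig. 7a.
p_top_satisfaction_given_reliance = {"high": 0.34, "low": 0.27}

# Prior of the parent node: an assumed value, not taken from the paper.
p_reliance_high = 0.5

def predictive_shift(cpt):
    """Change in P(top satisfaction) when reliance moves from low to high."""
    return cpt["high"] - cpt["low"]

def diagnostic_posterior(prior_high, cpt):
    """P(reliance = high | top satisfaction), obtained with Bayes' rule."""
    joint_high = prior_high * cpt["high"]
    joint_low = (1 - prior_high) * cpt["low"]
    return joint_high / (joint_high + joint_low)

shift = predictive_shift(p_top_satisfaction_given_reliance)
posterior = diagnostic_posterior(p_reliance_high, p_top_satisfaction_given_reliance)
print(round(shift, 2), round(posterior, 3))
```

Predictive analysis reads the conditional probability table in the direction of the arrow, while diagnostic analysis inverts it and therefore also depends on the prior distribution of the parent node. In a full network with 8 indicators the same operations are carried out by the inference engine of the software.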
Many critical issues affect the aggregative-compensative methodology described in Section 5. Implicitly or not, it is generally taken for granted that “evaluation implies aggregation”; thus ordinal data must be scaled to numerical values, to be aggregated and processed in a (formally) effective way. However, this often proves inconsistent with the nature of phenomena and produces results that may be largely arbitrary, poorly meaningful and hardly interpretable.
Realizing the weakness of the syntheses built through the aggregative-compensative approach, statistical research has focused on developing alternative and more sophisticated analytic procedures, but almost always assuming the existence of a cardinal latent structure behind ordinal data. The resulting models are often very complicated and still affected by the epistemological and technical issues discussed above.
The way out of the impasse can be found by realizing that synthesis does not necessarily imply aggregation. In other words, non-aggregative approaches are needed, able to (i) respect the ordinal nature of the data and the process and trends of phenomena (not always linear but more frequently monotonic), (ii) avoid any aggregation among indicators, and (iii) produce a synthetic indicator.
New challenges and perspectives are emerging aimed at developing technical tools and strategies in order to simplify the data structure, combine indicators and communicate the obtained “picture”.
In this perspective, one of the most useful references is Partial Order Theory, a branch of discrete mathematics providing concepts and tools that fit very naturally the needs of ordinal data analysis. Non-aggregative approaches are focused not on dimensions but on profiles, which are combinations of ordinal scores describing the “status” of an individual. Profiles can be mathematically described and analyzed through tools from that theory, in particular the Partially Ordered Set (POSET), which allow information to be extracted directly from the relational structure of the data and provide robust results.
Through partial order tools it is possible to give an effective representation of data and their structure and to exploit the latter to extract, directly out of it, the information needed in the synthesizing process. In particular, the computations performed to assign numerical scores to the statistical units involve only the ordinal features of data, avoiding any scaling procedure or any other transformation of the kind. Consequently, the conclusions drawn through the application of POSET methodologies are more robust and consistent than those based upon traditional statistical tools.
A partially ordered set is a set endowed with an ordering criterion, which is in general not complete, that is, such that some pairs of elements of the set cannot be ordered or compared. Partially ordered sets are the natural setting for the description and the analysis of multidimensional systems of indicators, particularly, but not exclusively, when they are of an ordinal kind. They can be used in evaluation studies, for the computation of non-aggregative synthetic indicators, for the definition of rankings, in cluster analysis, and in many other data analysis tasks. As the availability of complex ordinal data sets and multidimensional indicator systems is increasing, partially ordered sets are becoming a key tool in socio-economics, environmental and ecological studies, and multicriteria decision-making. Their use can involve heavy computations, but effective algorithms and software resources exist that make them a very useful tool in many concrete problems.
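The notion of incomparable profiles can be made concrete with a short sketch. It assumes the usual componentwise (dominance) order on profiles, in which one profile is at least as good as another only if it scores at least as high on every indicator. The four profiles of three ordinal indicators and the function names are hypothetical.

```python
from itertools import combinations

def dominates(p, q):
    """True if profile p scores at least as high as q on every indicator."""
    return all(a >= b for a, b in zip(p, q))

def comparable(p, q):
    """Two profiles are comparable if one of them dominates the other."""
    return dominates(p, q) or dominates(q, p)

# Hypothetical profiles: ordinal scores (1 = low, 3 = high) on three indicators.
profiles = {
    "u1": (3, 3, 2),
    "u2": (2, 3, 1),
    "u3": (1, 2, 3),
    "u4": (1, 1, 1),
}

# Pairs that the partial order cannot rank against each other.
incomparable = [
    (a, b) for a, b in combinations(profiles, 2)
    if not comparable(profiles[a], profiles[b])
]

# Maximal elements: profiles not dominated by any other distinct profile.
maximal = [
    a for a in profiles
    if not any(
        profiles[b] != profiles[a] and dominates(profiles[b], profiles[a])
        for b in profiles
    )
]

print(incomparable, maximal)
```

Only the ordering of the scores is used, so no scaling or weighting of the ordinal categories is needed. The incomparable pairs are exactly the cases in which an aggregated score would have imposed a ranking that the data do not support.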
As stated throughout this paper, representing a complex and multidimensional world through indicators produces a complex and multidimensional system [14].
While the numerical approach concerns the “reduction” of many values to just one (or, at least, very few), the graphical perspective concerns the rendering of many values in a visual display, which typically uses a two-dimensional representation. The main issue is: how can we represent the world of measures on a mere flatland? [15].
Synthesising a complex structure can be a very difficult task, calling for solutions that combine different instruments, including graphical devices.
A graphical approach to synthesis could represent a step forward with respect to the analytical approaches, especially in its potential to represent a system of indicators (or part of it) in ways that support users in making statements and asking questions about the indicators, and that allow a critical understanding of the reality they try to represent. This is a major challenge, especially considering that new sources of data (big data) are expected to support the construction of indicators [16]. Visual complexity represents an alternative or complementary approach to synthesis. It is positioned at the crossroads of different competences, i.e. image, word, number and art. Visual complexity represents the intersection of two technical and cultural fields, networks and visualization [16]. While networks have some tradition in the social sciences, the explosion of online social networks and the increasing availability of interlinked web information have raised the need to represent them visually.
However, as we saw for the analytical approaches, the graphical approach to synthesising indicators is not free from risks and criticisms. For patterns, relations and dimensions to be communicated efficiently and clearly, many decisions have to be taken in managing graphical instruments. For example, the relationship among indicators could be represented through a network visualization, which may emphasize different aspects (density, organic growth, instability, dynamism) and/or different structures (symmetry, top-down, stable dimensions). One of the risks is that the synthetic representation of indicators is created merely to obtain a more aesthetically interesting description. In Tufte’s words, “what is to be sought in designs for the display of information is the clear portrayal of complexity. Not the complication of the simple; rather the task of the designer is to give visual access to the subtle and the difficult – that is, the revelation of the complex.” [17].
Discussion and outlook
The paper reviews non-economic indicators used in analysing and presenting official statistics. Some of the discussion parallels the framework on information quality proposed in [18, 19]. Ultimately, indicators provide an evidence-based foundation for stakeholders interested in action operationalisation initiatives. Action operationalisation is one of the eight information quality dimensions. In other words, indicators take you from a construct, reflecting abstract conditions, to an action aiming at affecting such conditions. Indicators are designed to provide a wide coverage of perspectives. The multivariate approach to indicators presented here provides policy makers with an integrated perspective on social and economic systems such as transportation, healthcare, education and agriculture. Furthermore, we show that it is of critical importance for the production and use of high-quality indicators that both the technical-statistical and the communicative-application-oriented aspects of indicators are considered.
Compared with the statistical areas in which surveys and accounts are used and produced, the younger statistical area of indicators lacks internationally established methodological standards. The obvious recommendation, by way of summary, is therefore that the international statistical community develop such standards and guidelines. This does not refer to guidelines that relate to a specific set of indicators (such as the SDG indicators). Rather, there is a lack of useful handbooks (or rather ‘headbooks’) in which the various aspects elaborated in this article are prepared and summarised in such a generalised manner that they can serve as orientation aids and compasses for the designers, compilers and communicators of indicators in their daily work. Fortunately, some such guidelines already exist, such as OECD/JRC’s “Handbook on Constructing Composite Indicators” [20] or Eurostat’s three-part series of indicator manuals [21, 22, 23]. An international statistical standard based on these sources, updated, enhanced and complemented, and supported by standard-setting bodies, would certainly be a major step towards improving the quality of indicators in general.
Footnotes
The JMP software is available from www.jmp.com.
Acknowledgments
The inputs of W. Radermacher and two anonymous reviewers helped improve the paper and are gratefully acknowledged.
