Abstract
The paper presents a series of reflections deriving from teaching official statistics. Much of the accumulated experience derives from teaching the “Methods and tools for official statistics” and “Survey methods: traditional and new techniques in Official Statistics” courses in the European Master in Official Statistics (EMOS) at the universities of Florence and Pisa, as well as on numerous occasions at tertiary education centres. These experiences highlighted the absence of these themes from standard study plans and students’ scarce awareness of official statistics, and allowed the identification of the essential topics to be addressed. Four basic pillars (definitions, quality criteria and practices, sources, and autonomy in finding official data) need to be spread much more widely, and not only by statistical authorities. Manuals on basic statistics (data science) should always include one or more chapters dedicated to the recognition of quality data and official statistics. Teachers from any discipline which implies the transmission of the value of high quality statistical information need to be trained on this aspect. The general insight from the experience of teaching official statistics is that, while data science may remain a specialist knowledge, statistical literacy needs to become a common one.
Background concepts
Data science, statistical literacy and official statistics literacy
The world population is under a data deluge, and not everyone has the tools to treat and interpret the information received. Information, to become knowledge, needs to be processed. The skills for such processing need to become basic skills, not specialist ones.
The data deluge has a double implication: the need to improve data techniques through methodologies and data engineering, in order to manage the huge wealth of available information; and the urgent empowerment of common people in terms of statistical literacy.
The former need implies a certain overcoming of “traditional” statistical teaching. Statistics is essentially taught in university courses as mathematical theory, as a technique of data analysis and as a descriptive tool to interpret social or psychological phenomena.
In recent years, the statistical world has been called upon to face epochal challenges. The digitalization of information, sensor data, social networks and applied artificial intelligence have led to the explosion of big data and the so-called data deluge. We no longer have the problem of how to produce data, but that of how to better exploit the wealth of available information. Within universities, the focus is moving from statistics to what is now called “data science”. Data scientists appear today as one of the most needed professional figures, and one that enterprises seem not to find easily on the labour market. A recent and accurate review of the academic curricula [1] reveals five foundation courses:
Research Design and Application for Data and Analysis; Exploring and Analysing Data; Storing and Retrieving Data; Applied Machine Learning; Data Visualization and Communication.
This confirms the supremacy of methodological issues inside university courses. It is thus possible to envisage a future in which statistics will be more and more focused on engineering techniques to analyse the great quantity of available data. This is surely one of the great challenges for statistical science in the near future.
The latter need requires citizens to acquire a certain statistical literacy. “‘Statistical Literacy’ is the ability to understand and critically evaluate statistical results that permeate our daily lives - coupled with the ability to appreciate the contribution that statistical thinking can make in public and private, professional and personal decisions” [2]. This is made up of a complex set of skills. According to UNECE [3], there is first of all numeracy, that is, the ability to understand what a percentage, a mean or a variance represents. Secondly, there is the ability to read and communicate the meaning of data, which need to be interpreted and set in their thematic context. Thirdly, statistical literacy includes being able to understand how data can be used in decision making, whether by politicians, entrepreneurs or common citizens.
The low level of statistical literacy among adults and young persons has been documented by several national and international surveys [4, 5, 6, 7, 8]. The concept and definition of statistical literacy have long been discussed by scholars. Key constructs have been identified in order to describe some general competencies that adults should have in order to cope effectively with the quantitative demands of an adult world: numeracy and mathematical literacy, quantitative literacy and quantitative reasoning, statistical literacy and probability literacy. In this framework, the regional office of Istat in Tuscany conducted a survey to evaluate the level of statistical literacy among university students. The survey was conducted for two years on different target populations: first year (pilot survey) on 1
While a lack of statistical literacy has been generally observed at international, national and local level, and there are several efforts to reduce statistical illiteracy and to define adequate university programmes [10, 11, 12], the same cannot be said for Official Statistics literacy (OSL). Given that official statistics represents the core information about our societies, the one on which the most relevant public and private choices are based, the ability to correctly interpret this kind of information is a basic pillar of conscious citizenship (see also [13]).
OSL should include the analysis of the sources, the meaning of data and indicators, the quality criteria of data production processes, the ownership of official statistics production processes, the importance of metadata, and the legislative framework. Citizens should be able to distinguish information deriving from private polls, official surveys, administrative data or big data, having in mind the different characteristics and criticalities which they entail.
Knowledge of these aspects that characterize official statistics represents a basic asset for finding one's bearings in our societies, even more urgently under the current data deluge. Nevertheless, a system to evaluate OSL among the population is still missing. In tertiary education curricula, even in specialist ones such as economics, political sciences and statistics, the objective of providing some Official Statistics literacy is usually lacking. Given that “official statistics providers have been operating for decades around the world, and represent an indispensable element in the information system of a democratic society” [14], it is quite surprising that there still is not a set of principles for teaching official statistics.
Statistics and critical literacy
Official Statistics literacy is to become an essential element of a broader concept, the so-called “critical literacy”, that is spreading in the more recent literature (see [15]), particularly in times of fake news. Not being aware of the sources distorts the processes of people’s trust in the information they receive, leading them to trust the information that best suits their convictions and biasing the public debate.
The dissemination among university students (and others) of the ability to recognize the reliability of sources should be at the base of the promotion of a conscious citizenship. Through critical literacy, people are put in a position to better decide which information is to be trusted and which is not. Statistics are a relevant part of public information, and understanding which data can actually be trusted is very important. At the same time, the difference between false data and correct data is not always so clear. As noted by the president of the Royal Statistical Society, David Spiegelhalter [16], “completely fabricated, demonstrably false facts” are only one extreme of fake news. Spiegelhalter identifies, instead, a “bigger risk [in] manipulation and exaggeration through inappropriate interpretation of ‘facts’ that may be technically correct but are distorted by what we may call ‘questionable interpretation and communication practices’”.
Facebook’s ‘Tips for spotting fake news’ represents an example of basic critical literacy to avoid the most striking fake news circulating on the web, the first group described by Spiegelhalter.
The second group is possibly much larger. Into this group fall both statistical information which is not accurate (for example because it is based on samples that are too small) and information which is produced through methodologically correct processes but which is then used or disseminated in the wrong way, either through wrong interpretations (of causalities, for example, or exaggerated comments on the basis of non-relevant results) or through misleading visualization. Most of these practices were already described in the 1954 classic “How to Lie with Statistics” by Huff [17].
We have therefore two different levels of critical analysis: the former is to understand if the data themselves are reliable; the latter is to understand whether they have been used in the right way.
Here, in the context of the teaching of official statistics, we are interested in addressing both levels. To this aim, the distinction between different processes of data production, and between official and non-official data, becomes crucial.
OSL and trust
It is certain that official statistics offers a quality standard which makes it trustworthy. Yet, to put trust in official statistics it is necessary to become familiar with those quality factors if we do not want it to be a dogmatic trust, also because such dogmatic trust does not spread through society. According to Eurobarometer [18] polls, half of Europeans do not trust their countries’ official statistics² (yet large differences exist among member States, with the percentage of people trusting official statistics varying from 73% in Sweden to 27% in Spain).
The issue of trust in official statistics has been treated often (see for example [19]), since for a statistical authority the maintenance of its reputation and of the trust of citizens is vital. The trustworthiness of official statistics is fundamental not only for National Statistical Institutes (NSIs), but for the country as a whole, since it determines the correct functioning of democratic systems. In this respect, it is crucial that citizens have confidence in policy actions when they are based on official data, and no one should trust a Government making decisions on the basis of mistrusted data. For a well-functioning democracy, trust in official statistics is not, therefore, an option but a necessary condition: “trustworthy statistics sit at the very heart of the democratic contract between the public and its elected government” [20, p. 4].
Giovannini [21] proposed a conceptual measurement of the value added of statistics (VAS) based on the product of different factors: the number of people who know official statistics (N), the quantity of information produced (QSA), the role of media in its dissemination (MF), the relevance of the produced data (S), the individuals’ numeracy (NL) and, last but not least, the trust that individuals have in official statistics (TS). A key aspect of the proposed formula, VAS
Alldritt and Wilcock [19] identify three aspects of trust which have to be taken into consideration:
Being trusted to explain; Trusted to be the best available; Trusted to be the most relevant.
In order to gain citizens’ trust, national institutes of statistics need first of all to be as transparent as possible. Time planning and punctuality need to work as a demonstration of independence and of not being subject to the political timetable. Methodological strengths and weaknesses have to be made clear, not only in the “methodological notes” but also in press releases, in order to help a clear interpretation of data. Sometimes an official definition, however reasonable and motivated, can be slippery, and explaining it is fundamental to help journalists deliver the right interpretation. A typical example is that of youth unemployment: in Italy the rate during the economic and financial crisis peaked at over 40%. It was, of course, 40% of young people in the labour force, which for 15–24 year olds is a quite restricted group, and about 8% of the total population of that age. Newspapers were continuously commenting that 4 out of 10 young people were unemployed, until Istat highlighted in a press release the proportion of people in the 15–24 age group and the absolute numbers of unemployed young people. This dramatically reduced the spread of the false interpretation.
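The difference between the two readings can be made concrete with a small calculation. The following Python sketch uses hypothetical head counts (not Istat figures), chosen only so that they reproduce the two percentages quoted above, and it reads the 8% as the share of unemployed in the whole 15–24 population; all function and variable names are illustrative assumptions.

```python
def unemployment_rate(unemployed, labour_force):
    """Official definition: unemployed as a share of the labour force."""
    return unemployed / labour_force


def unemployment_ratio(unemployed, population):
    """Unemployed as a share of the whole population of the age group."""
    return unemployed / population


# Hypothetical counts for the 15-24 age group (assumed, for illustration only).
population_15_24 = 6_000_000    # everyone aged 15-24, students included
labour_force_15_24 = 1_200_000  # employed plus unemployed job seekers
unemployed_15_24 = 480_000

rate = unemployment_rate(unemployed_15_24, labour_force_15_24)
ratio = unemployment_ratio(unemployed_15_24, population_15_24)

print(f"Unemployment rate (share of labour force): {rate:.0%}")
print(f"Unemployment ratio (share of population):  {ratio:.0%}")
```

Under these assumed counts the same number of unemployed people yields a rate of 40% but a ratio of 8%, because most people of that age are outside the labour force. The headline “4 out of 10 young people” confuses the first denominator with the second.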
Secondly, citizens should be aware of the existence of the code of practice, of the quality assurance processes and of the external control exercised by European institutions, which make official production simply the best available. Statistical information is by nature non-exact, yet the best estimate is the information we have to put trust in. It is hard for other institutions to produce higher quality data, and this has to be made clear.
The third point, stating that official statistics is to be trusted because it is relevant, is the least immediate concept to deliver. Relevance is a very relative concept, and it is possible that, for a specific topic, official statistics are lacking while private institutions are producing them. Moreover, the simplification of diverse populations into aggregate measures sometimes makes it hard for people to find themselves in the statistics. “This blindness to local cultural variability is precisely what makes statistics vulgar and potentially offensive” [22], resulting in a refusal to accept the validity of the data.
Yet, the statistics produced by NSIs are chosen after a broad debate among different stakeholders, and users’ committees have the constant opportunity to intervene in order to modify the information produced. In this respect it is very useful to provide thematic reporting, so that users become aware of what information is actually available and are put in a position to decide which is the most relevant for their purposes.
In conclusion, a conscious trust in official statistics derives, first of all, from the knowledge of the basic rules for recognizing quality information. Official statistics is only a part of the data proposed in the media and on the web. Students need to learn which information can be trusted in a world full of fake news and distorted information. Knowledge of the basic functioning of official statistics should lead to increased trust in it. But, much more generally, it should activate the ability to understand which criteria are to be used for recognizing trustworthy statistics, even if produced by other authorities or private organizations.
Teaching OS in University classes of Tuscany
The reflections presented in this paper are the outcomes of two principal experiences. The main one is teaching in courses within the European Master in Official Statistics (EMOS), which also includes internships and master theses, to students who already hold a Bachelor degree. The other one derives from seminars on topics related to Official Statistics directed at younger university students, often first year students.
Several national statistical offices are becoming active in the field of teaching OS, especially through institutional collaboration: the European Master in Official Statistics (EMOS) is a key project carried out by the European Commission, universities, NSIs and central banks. It is a full academic degree related to official statistics. In 2018, it involved 20 programmes and 14 countries in Europe. Thanks to the collaboration between academia and producers of official statistics, EMOS aims to train persons to work with European official data at different levels in the fast-changing production system of the 21st century. The EMOS Master degree is based on learning outcomes which make graduates familiar with the system of official statistics, production models, statistical methods and dissemination. Universities offering EMOS Master degrees collaborate actively with the national statistical institutes to reduce the gap between theory and practice, also through the choice of pertinent topics for Master theses, internships in the area of official statistics, EMOS workshops and webinars [23]³.
Given the recognized lack of formal education in official statistics and the related lack of structured teaching materials [13, 24, 25], EMOS aims to bridge these gaps through an active cooperation between producers and academia, increasing the Official Statistics literacy (OSL) of students attending the courses.
The EMOS approach embraces the suggestions of UNECE [3] and some results of Gal and Ograjensek [13]. The European programme contains the following sections: the system of Official Statistics, production models and methods, specific themes (depending on the professional skills of the teachers involved), statistical methods, and dissemination.
The EMOS project is currently carried out in Italy in four universities: Pisa and Rome “La Sapienza” since the first edition in 2015, Florence since 2016, and Bergamo since 2018. As the project states, EMOS has been supported by a strong collaboration between academia and the Italian National Institute of Statistics (Istat). The authors, both researchers at Istat, have been involved in the EMOS project since the beginning, and in particular in the definition of the programme, the preparation of the teaching materials, the organization and tutoring of internships and the support of Master theses.
Seminars on specific topics are focused mainly on production processes, dissemination, survey design and techniques, and social and economic indicators. They are offered to first year students of Economics, Political Sciences, and Statistics. Given the difference in the audience, seminars and EMOS lessons use a different language and a different level of detail in approaching themes, but they have in common the aim of bringing official statistics into university classes and a shared starting point: the course represents the students’ first contact with OS, and is conducted in the absence of structured reference materials.
Some evidence from the EMOS experience
In Florence and Pisa EMOS curricula, the authors are teachers in “Methods and Tools for Official Statistics” and “Survey Methods: Traditional and New Techniques in Official Statistics” courses. OS teachers are tutors of a few internships (6 in total from 2015 to 2018) and provide support to three Master theses.
The EMOS classes have a specific composition: they are small (max 20 students), with a high level of education (students already have a bachelor degree), and students come from different countries (European and non-European) and from different educational backgrounds (such as Economics, Statistics, Sustainable Development, Business and Management Administration, Psychology and Political Sciences). Finally, several teachers coming from academia are involved in EMOS courses, in addition to those from NSIs and other official statistics producers (Ministries or Research centers). It is quite evident that each of these elements could represent strengths or opportunities.
A small and trained class is surely stimulating for teachers: different experiences can be exchanged, there is time to have immediate feedback on the topics, and it is possible to carry out group exercises or stimulate reflection on specific topics (for example, the differences in privacy law systems are among the topics most appreciated by the students, as are the different uses of administrative data and the integration strategies). At the same time, teachers have to manage different student backgrounds and different ways of reasoning on themes: the instructor may need considerable skill to make topics interesting to the whole class, reconciling the different personal expectations about a course on OS.
The Master courses involve a large number of teachers from both the university and the data production organizations. Their different curricula and experiences offer the students a picture of the complexity of official statistics. OS is composed of methodological contents, legal and organizational aspects, communication, and sociological and psychological skills: EMOS involves different courses, not immediately related, in order to improve awareness of the main aspects of the functioning of an NSI. Even if several teachers approach the different aspects of OS, it is essential to find a common line.
In order to reduce the complexity of the Master’s degree, it is essential to find a thematic integration among the contents of different classes, recalling the way they relate to official statistics.
The internship in the offices of OS producers is the second step of collaboration in the EMOS programme. The Istat office in Tuscany hosted 6 intern students: it was a rich experience for both parties. Students have, in most cases, their first work experience: they have the chance to experience work dynamics, the decision processes in the background, respect for deadlines and a working environment (meetings, briefing and debriefing, relational dynamics). Moreover, they have the possibility to participate in some specific parts of the data production processes, even if with some limitations. Indeed, our experience of internships shows some problematic aspects, which could, and should, be solved to improve future participation. The interns do not have a specific contract in which duties and roles are defined, but only a general objective. NSIs and academia could reinforce their collaboration also through the definition of a standardized contract for EMOS interns (as other NSIs already do). A concrete definition of the interns’ role could enlarge their scope of work: if their work were regulated by a contract, they could be given access to microdata and be allowed to process them (still under supervision and subject to the related confidentiality duties). Access to microdata would, in fact, broadly enlarge the range of activities of internships, which are now mainly focused on research and data analysis.
Starting from classroom discussions, and from the internship experience, students start working on their Master theses. In these cases, Istat supports the scientific research through the use and deep knowledge of data sources and of data production processes, and through the management of data availability. More than that, the local office of an NSI traditionally has a strong collaboration with the local producers of OS and with public and private institutions (municipalities, non-profit associations, networks of enterprises) and can put students in contact with these realities in order to carry out qualitative interviews, gather additional data, and enhance awareness of the local context.
Some evidence from seminars
Introductory seminars are devoted to 1
Evidences from approaching OS during the 1
Some general remarks
The field experience of teaching OS in university classes showed five key elements which need to be shared for improving the awareness of and the trust in OS, both for students and teachers. In particular:
The students’ (and teachers’) OS knowledge gap; The emergence of different sentiments about OS; The need to have structured and updated materials about OS (i.e. Official Handbook); The inclusion of formal OS curricula inside the university courses; The need of an evaluation system regarding OS courses (teachers, contents, materials).
The students’ (and teachers’) OS knowledge gap
Students know official statistics by hearsay (via TV or newspapers). For example, they do not know Istat’s production, the meaning of OS, or the relationship among public institutions, civil society and Istat; they do not know the methodologies behind indicators such as GDP, poverty or inflation. A few essential issues need to be faced in every course aiming at describing and recognizing official statistics.
First, the legal aspects: students must understand that official statistics, by definition, derives primarily from the law. They have to become familiar with the legal mandate, aspects of privacy, statistical confidentiality and the obligation to respond: essentially, the rights and duties of users and data producers.
Second, the principles and the codes of practice [27] are fundamental to understand the framework of action of official statistics, to learn what producing quality data means, and to be able to recognize the quality standards of official and non-official producers. This is the basis for fostering trust in official sources and, in general, for acquiring the pillars upon which to build trust.
Then, students must be able to recognize different sources (surveys, administrative sources, different kinds of big data) and the strengths and weaknesses of each of them.
Finally, in order to get in touch with official statistics, students need to browse through data and metadata autonomously, accessing the questionnaires if needed. Time has to be dedicated to the management of data warehouses and the creation of different queries exploring the potential of major official sources. This helps students to get familiar with searching for data at the source without any intermediary. They learn about the existence of different data on the same phenomenon, and different definitions and they learn to query metadata to fully understand “what data are exactly talking about”.
The emergence of different sentiments about OS, which is related to the sentiment towards public institutions and civil society
Adults are expected to know that their country has a system of statistics producers or official statistics providers that work cooperatively on the basis of fundamental principles, established by the United Nations [13]. This is the belief of NSI employees, but it is not the reality. The described experiences in teaching official statistics showed us that the system of OS is quite unknown: the meaning of OS, the international standards for data production processes and for information dissemination, and the impact of official indicators on national and local policies (e.g. poverty indicators, inflation rate, price indices). After a first introduction to OS, the attitude of the students towards it depends a lot on the characteristics of their context of origin: for example, in our experience of EMOS classes, the prevalence of a sentiment of scepticism is typical of students coming from countries of the ex-Soviet Union or Turkey or Greece, where OS does not follow the required quality criteria and plays a role in economic crises or in supporting political choices. In these cases, a better knowledge of the history and meaning of OS, of the quality criteria adopted in the data production and dissemination phases, and transparency about the adopted methodology are essential elements to develop citizens’ trust in OS.
The need to have structured and updated materials about OS (Official Handbook)
Considering that most official statisticians got their training on the job, it is not surprising that a general handbook about OS has not yet been produced. It is possible to find materials produced by different NSIs around the world about specific aspects of OS: history of OS (UN), data production processes (Statistics Canada), quality criteria, regulation and specific law, reports on specific indicators, dissemination criteria (Eurostat), methodological aspects (sampling or administrative data), surveys and census framework (European NSIs, including Istat). It is quite encouraging that these separate handbooks exist, but we believe it is possible to bring together all the main aspects and procedures of OS in a structured and harmonized handbook. Sharing teaching experience could represent the basis for the framework of the handbook. The availability of an Official Handbook would also represent a great saving of time for individual teachers in preparing lessons, reducing the risk of error while leaving the teacher a certain flexibility in managing the topics. This may certainly start from the copious EMOS material available on the European Union website.
The inclusion of formal OS curricula within university courses
The EMOS programme is the first experience to include a curriculum on OS within a formal university course. Until now, the teaching of OS depended on personal initiative based on a single collaboration between academia and official statisticians, thus creating spotty experiences, such as a cycle of seminars, within a specific university course.
A mid- to long-term project linking OS and academic statistics could instead enhance the strengths and reduce the weaknesses of OS teaching.
From our field experience, the main strengths are:
Classes composed of Master’s degree students who have already acquired basic statistical literacy and have a high level of autonomy; Classes composed of a small number of students (up to 25 students), which permit a richer interaction between students and teacher.
The main weakness is represented by the heterogeneity in students’ curricula, which requires strong efforts in the initial approach.
A collaboration lasting more than one year could instead make it possible to solve major problems and to support mid- to long-term projects regarding OS and academic statistics.
The need of an evaluation system regarding OS courses (teachers, contents, materials)
In Europe, there are several experiences of teaching OS, more or less structured. Still missing is a common evaluation system based on a survey addressed to the students and to the teachers.
Students could be interviewed concerning the topics, the structure of the course and final examination, the expected and received benefit, the communication and technical skills of the teacher, etc. They could also be a source of suggestions and solutions for critical aspects related to the course.
At the same time, teachers could gain a general view of the sentiment towards OS, pose specific questions related to the subject, and receive field feedback from a class of students potentially interested in OS.
Final recommendations
Most people do not even wonder whether they are able to distinguish between official and non-official sources, or between high and low-quality data. Yet this represents fundamental knowledge in today’s information society.
We believe the above four basic pillars of knowledge (definitions, quality criteria and practices, sources, and autonomy in finding official data) need to be spread much more than only by statistical authorities. Manuals on basic statistics should always include one or more chapters dedicated to the recognition of quality data and official statistics.
Teachers must be trained on this aspect: not only professors of statistics but, more broadly, data users from any discipline who need to transmit the value of high quality statistical information to their students. Data science may remain a specialist knowledge, but Official Statistics literacy needs to become a common one.
Footnotes
“50% “tend not to trust” them, (
For more details about EMOS project see:
