Abstract

Discussions of data have become practically ubiquitous in the Western world, with advocates in academia, government and industry touting a range of benefits to the tapping of the vast wealth of data at our collective fingertips. From the idea that emerging ‘big data’ sources have rendered conventional scientific practices useless to the notion that data provides a basis for making the best objective, ideologically neutral decisions, data in all of its many forms has increasingly become a focal point for a range of individuals and institutions. At the same time, however, the potential pitfalls of data are seen in the NSA PRISM scandal in the United States, or even in the increasing awareness of how mundane social media data is collected by companies in order to increase their profits at the expense of individuals’ privacy. It is this series of processes and events that serves as the backdrop for Rob Kitchin’s The Data Revolution.
The Data Revolution explores two central theses. First, Kitchin argues that this explosion of interest in data is the result of substantive changes in both our technical capacities for producing and analysing massive amounts of heterogeneous data and in the social apparatus that supports such technical practices. That is, this new regime of data production and analysis differs substantively from what has come before, due especially to a general shift from conditions of data scarcity to those of data abundance. Second, and more importantly, Kitchin argues that for all this talk about data, there has been insufficient critical reflection on what it is we actually mean when we talk about ‘data’. By succumbing to a kind of ‘data boosterism’, leaving data to be a kind of black box imbued with the power to revolutionize our societies, we have neglected to understand not only what data is, but also what data does in the world. And so Kitchin sets out to provide what he sees as a ‘synoptic overview’ of the relevant social and technological changes at the heart of our society’s current obsession with data, in all of its forms.
Indeed, one of the key contributions of this book is its thorough analysis of popular and prevalent discourses around big and open data, and subsequent reflections on the limitations of these conceptualizations. Notably, Kitchin takes on the notion that data are objective, politically neutral representations of the world that can be allowed to ‘speak for themselves’. Instead, he understands data as fundamentally situated in particular social, spatial, political, economic and cultural circumstances that shape not only how such data represent the world, but also how they help to produce the world as we know it. With regard to big data, Kitchin specifically attacks the now pervasive ‘three Vs’ definition of increased volume, velocity and variety by adding the equally important, but oft-ignored, dimensions of exhaustivity, granularity, relationality and flexibility. Given that so much of the relatively trite and simplistic popular discourse around data focuses on big data, it is telling that Kitchin’s redefinition of the term does not come until the book’s fourth chapter. Rather, Kitchin adopts a much broader view of data throughout the book. He unpacks ‘the data revolution’ as constituted by a range of different kinds of technical and social practices, from those of ‘open data’ shared by public institutions to the counter-valorization of ‘small data’ and the ‘data infrastructures’ that might enable such data sources to be linked effectively together.
While it is often easy to skip over any of the 21 tables full of definitions or schematic overviews that help to fill the book, it is worth noting that these tables make a key contribution to the book’s overall organization and argument. While each individual definition may not be absolutely crucial to grasping the bigger issues at hand, these tables are important insofar as they repeatedly demonstrate that data is not understood in a single, universal way, but is instead a highly complex and variegated assemblage of objects, practices and discourses. The tables outline in detail everything from the principles of open data advocated by different organizations to different concepts and methods for data mining practiced by computer scientists. In sum, these tables, and the book writ large, destabilize the notion that ‘data’ is any fixed thing that we all understand and mobilize in a common way.
Kitchin ends the book, and nearly every chapter, with a clarion call for more in-depth, empirical case studies of the dynamics he discusses, from the limitations of different technical and organizational approaches to building shared data infrastructures to the discursive regimes developed around the use of data-driven approaches to urban planning known as ‘smart cities’. While the book lacks these kinds of case studies, it instead serves as a kind of provocation, or a signpost pointing to the key routes that we ought to take moving forward, such that we might actually generate a more adequate understanding of data and ‘the data revolution’. If the book has a significant shortcoming, it lies largely in its nature as an overview: the broad survey of key issues and debates is likely to leave those already familiar with them wanting more.
As such, the book is best suited to those new to or unfamiliar with the concepts of big data and open data, or those who wish to get a better grasp on the social implications of these new ways of producing, analysing and talking about data. While it is doubtful that those unapologetic advocates of data who need this book the most will engage with it, Kitchin’s thorough unpacking of data – as a social practice, as an enabler of scholarly research, as a thing in and of itself, etc. – would be a welcome antidote to much of the unproductive hype around big and open data that persists in the popular media. Importantly, Kitchin’s arguments are just as germane to physical geographers and GIScientists as to human geographers or other social scientists, and he makes frequent reference to issues that speak specifically to these groups. From the proliferation of readily accessible, remotely sensed data and algorithms meant to automatically analyse such images to the social implications of environmental data, Kitchin makes sure to engage with the broader discipline of geography. Ultimately, this book is useful for anyone with an interest in the present and future of scholarly research and the role of new technologies in shaping the discourses and practices of such work, and is almost sure to spur considerable future research into these pressing issues.
