Abstract

Machine Learners is an urgently needed contribution to the social sciences’ understanding of the more formal or ‘mechanical’ aspects of machine learning. Adrian Mackenzie develops a Foucauldian-inspired examination of the mathematical, probabilistic, and diagrammatic specificities of this increasingly pervasive strain of artificial intelligence. Throughout, special emphasis is given to two questions: how agency is redistributed between human and machine through machine-learning processes, and how critical thought itself might be transformed by this new form of knowledge modelling.
Chapter 2, the first chapter after the introduction, offers a wide-ranging mapping of machine learning’s operational formation (understood as the totality of its fields of practice) and covers at length the diversity of elements participating in its present-day ebullition. It posits that machine learning is first and foremost a diagrammatic practice, one that cannot be understood without careful attention to its graphical and formal dimension. This dimension seems to have been thoroughly ignored elsewhere in the social sciences, and it receives its due attention in the rest of the book.
The six ensuing chapters, each devoted to a specific machine-learning operation, mobilize abundant (sometimes overly abundant) references to Foucault’s works on epistemology, archaeology and discipline. Echoing the French author’s elaborations on the epistemic effects of the transition from the Renaissance’s tabulation (quite prone to mixing multiple data categories) to the Classical Age’s grid (more ordered in that regard), Chapter 3 delves into how the vectorization of data, operating quite like pre-Classical tabulation, ‘smoothes over important fault lines of [qualitative] difference’ (p. 63) and assembles vectorized spaces where data relating, for instance, to breast cancer, soybeans, German credit ratings, or zip codes are juxtaposed. Chapter 4 presents the ‘learning’ of machine learning as a recursive and self-adjusting operation of function optimization that posits the existence of a useful (i.e. contextually effective and workable) approximation of the ‘ground truth’ of the studied phenomenon, that is, the underlying function that mathematically formalizes the process at work when said phenomenon unfolded. Chapter 5 follows by construing this recursive function-optimization process as the reinscription of a probabilistic principle: out of the realm of biological populations, where probability distributions acted as ‘models of truth’ (e.g. once your IQ results are located on the ‘wrong side’ of the normal distribution curve, such and such social outcomes await you), and into the realm of machine learners’ populations partitioned in accordance with their error rates. This process of probabilization thus ‘gives machine learning a relation to its own plurality’ (p. 122) and allows the machines’ disciplinary and self-adjusting optimization.
After Chapter 6’s investigation into how decision trees and support vector machines (two popular machine-learning techniques) recalibrate what counts as differences and patterns inside data, Chapter 7 demonstrates how the probabilistic treatment of the human genome, albeit constituting ‘a cross-validation of machine learning as a relevant knowledge practice’ (p. 158), also acted as a regularizing frame of reference for the field of genomics. The overwhelming complexity of the genome is here reduced, in line with economically oriented objectives of efficiency and field coherence, by dropping its less statistically potent features.
Presenting what registers as the main argument of the book, Chapter 8 concerns artificial neural networks and how these are specific sites of agency redistribution between humans and machines. Back-propagation is a process that optimizes the connections (so-called ‘weights’) between nodes in the network in response to features in the data. First considered (following Chapter 5) as a self-adjusting disciplinary procedure applied to nonhuman machine learners, it is reinscribed into the domain of human machine learners through Kaggle, a platform allowing practitioners to submit models for datasets proposed by various clients. Through the leaderboard, human machine learners can compare their error rates and iteratively optimize their models, thus replicating the self-adjusting optimization process followed by machines. Throughout the book, Mackenzie indeed seems to continually wonder how operational agency might be reshaped by the expanding reach of machine-learning models, especially with respect to their repercussions in the social sciences. He calls in the end for a strong if critical engagement with the practice: conscious of the risks of corporate co-optation or re-orientation of any machine-learned practice, but nonetheless conceiving of it as ‘a resource to collectively modulate experience’ (p. 213) toward greater freedom.
Machine Learners, because of the familiarity it demands with foreign domains of advanced statistical knowledge, presents as an arduous if necessary proposition. Without ever pretending that his (not quite ethnographic, but certainly ‘embedded’) method is the only valid one for studying machine learning, the author repeatedly returns to a question that also acts as a warning: what degree of proficiency should one demonstrate toward one’s object of study and, if we decide that such proficiency is required, how can we make up for the considerable gap in technical skills that one observes today between practitioners and outside observers of machine learning?
