Abstract
This article revisits Pierre Bourdieu's seminal work La Distinction and compares it with contemporary data analysis techniques used for online user profiling. Bourdieu's topological conceptualization of social space as a multidimensional space structured by forms of capital anticipated, in many ways, today's vector-based profiling tools used by major digital platforms. Both frameworks decompose co-occurrence matrices into geometric latent spaces where proximity encodes similarity, and both aim to connect individual social positions with tastes and preferences probabilistically. We argue that this convergence represents an independent validation of Bourdieu's theoretical intuitions and briefly discuss some implications of the new technical capabilities of corporate online profiling methods.
Introduction
In 1979, Pierre Bourdieu published the first French edition of his book La Distinction (Bourdieu, 1979c, 1984a), a contribution that would be listed among the ten most important sociology books of the twentieth century (ISA, 1997). The book offers an empirically grounded account of the social structuring of taste, arguing that aesthetic preferences, far from being disinterested expressions of individual freedom, are predictably patterned by an agent's position in the social space.
The ambition of the project was to offer an empirically grounded science of taste ‘in order to discover the intelligible relations which unite apparently incommensurable “choices”, such as preferences in music and food, painting and sport, literature and hairstyle’ (Bourdieu, 1984a: 6). Interestingly, Bourdieu framed his conceptual argument with multiple references to the language of marketing, such as consumption, products and consumers:
The aim is […] to move beyond the abstract relationship between consumers with interchangeable tastes and products with uniformly perceived and appreciated properties to the relationship between tastes which vary in a necessary way according to their social and economic conditions of production, and the products on which they confer their different social identities. (Bourdieu, 1984a: 101)
As we argue in this article, the underlying reference to marketing was disturbingly prescient. At the time, Bourdieu's ambition to use empirical data to build a comprehensive model of the social structuring of cultural preferences was constrained by the nature of available data collection and processing tools. However, half a century later, the landscape has profoundly changed.
Large corporations now maintain datasets on individual tastes, behaviour, and socio-economic profiles at a scale Bourdieu could scarcely have imagined. At the same time, mathematical tools rooted in high-dimensional vector geometry, an approach that shares a significant structural kinship with the topological framework Bourdieu deployed, have been developed at a massive scale to profile and predict user behaviour.
This commentary argues that there is a meaningful methodological resemblance between Bourdieu's topological conception of social space as operationalized through Multiple Correspondence Analysis, and the latent-space, vector-based models that underpin contemporary recommender and profiling systems. We start by revisiting Bourdieu's methodology in La Distinction before moving to an overview of the key technological developments in large-scale user profiling techniques. We conclude by discussing some methodological and political implications of the new technical capabilities of corporate online profiling methods.
La distinction
La Distinction exemplifies what became Bourdieu's distinctive approach to theorization. The book follows a layered structure, alternating between highly empirical chapters offering detailed analyses of specific data and others devoted to high-level conceptual reflection. A key strength of this approach lies in the dialogical interplay between data analysis and theory. For example, the book opens with a very empirical (and now somewhat dated) first chapter focused on the social distribution of preferences for various cultural goods in 1960s France, before shifting gears in the second chapter to lay deep conceptual groundwork built on the data.
Empirically, the book rests on survey-based data. The main survey was designed specifically for the project and completed by a total of 1217 respondents over two waves in 1963 and then in 1967–1968 (Bourdieu, 1979a). This data was supplemented with other publicly available surveys from the 1970s, published by France's National Institute of Statistics and Economic Studies (Bourdieu, 1979b). The analytical methods rely primarily on descriptive frequency distributions and Multiple Correspondence Analysis (Duval, 2018; Le Roux and Rouanet, 2010).
The most methodologically sophisticated components of the book are the famous two-dimensional graphs that display the association between social variables – mainly economic and cultural capital – and corresponding tastes (Bourdieu, 1984a: 128–129). Those graphs are based on Multiple Correspondence Analysis (MCA) of survey responses, positioning respondents and response categories as points in a two-dimensional space. Proximity in that space reflects similarity in profiles; distances reflect social differentiation. Bourdieu interpreted the structure of this space as a map of the underlying logic of symbolic power, revealing how cultural taste naturalizes social hierarchy.
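To make the mechanics of MCA concrete, the following is a minimal sketch in Python with NumPy. It is not Bourdieu's procedure or data: the eight respondents, the category labels and the choice of two questions are invented purely for illustration, and the computation follows the standard textbook formulation of correspondence analysis applied to an indicator matrix.

```python
import numpy as np

# Hypothetical toy survey (invented for illustration, not Bourdieu's data):
# each tuple is one respondent's answers to two questions
# (dominant form of capital, preferred music).
answers = [
    ("high_cultural", "opera"),
    ("high_cultural", "opera"),
    ("high_cultural", "jazz"),
    ("high_economic", "jazz"),
    ("high_economic", "pop"),
    ("low_overall", "pop"),
    ("low_overall", "pop"),
    ("low_overall", "pop"),
]

# Indicator (one-hot) matrix: one column per (question, answer) category.
categories = sorted({(q, a) for row in answers for q, a in enumerate(row)})
col_index = {cat: j for j, cat in enumerate(categories)}
Z = np.zeros((len(answers), len(categories)))
for i, row in enumerate(answers):
    for q, a in enumerate(row):
        Z[i, col_index[(q, a)]] = 1.0

# Correspondence matrix, row and column masses.
P = Z / Z.sum()
r = P.sum(axis=1)
c = P.sum(axis=0)

# Standardized residuals, then singular value decomposition.
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, s, Vt = np.linalg.svd(S, full_matrices=False)

# Principal coordinates on the first two axes:
# respondents (rows) and response categories (columns) share one space.
k = 2
row_coords = (U[:, :k] * s[:k]) / np.sqrt(r)[:, None]
col_coords = (Vt[:k, :].T * s[:k]) / np.sqrt(c)[:, None]

for cat, xy in zip(categories, col_coords):
    print(cat, np.round(xy, 3))
```

In this sketch, respondents who gave identical answers land on exactly the same point, and the cloud of respondents is centred on the origin, so each axis expresses a departure from the average profile. Published analyses typically add further steps, such as supplementary variables and corrections to the explained inertia, that are omitted here.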
Yet Bourdieu acknowledged a degree of frustration with the limits of the available data and analytical tools.
In the absence of a survey (perhaps impossible to carry out in practice) that would provide, with respect to the same representative sample, all the indicators of economic, cultural and social wealth and its evolution which are needed in order to construct an adequate representation of social space, a simplified model of that space has been constructed […]. (Bourdieu, 1984a: 127)
What we want to emphasize here is how topological Bourdieu's conceptualization of the social space is. By topological, we mean that it models the social space as a mathematical multidimensional space. That view is especially clear in a later publication on social classes, published in French in 1984:
To begin with, sociology presents itself as a social topology. The social world can be represented as a (multi-dimensional) space constructed on the basis of principles of differentiation or distribution, formed by the set of operative properties in the social universe under consideration – that is, properties that confer strength or power upon those who possess them within that universe. Agents and groups of agents are thus defined by their relative positions within this space. Each is confined to a specific position or a class of neighbouring positions (that is, a defined region of the space), and one cannot actually occupy, though one may conceive of it in thought, two opposing regions of the space simultaneously. Insofar as the properties used to construct this space are operative properties, it can also be described as a field of forces: that is, as a set of objective power relations that impose themselves on all who enter the field, and that cannot be reduced to the intentions of individual agents or even to direct interactions between them. (Bourdieu, 1984b: 3, authors’ translation)
Big data and new mathematical models
The first foundational development we want to mention is the construction of persistent individual identifiers through HTTP cookie technology, standardized in the late 1990s (Kristol, 2001). Cookies enabled platforms to track users’ online behaviour over time and across devices, capturing browsing histories, click patterns, dwell time, and purchase behaviour, and to link these with offline signals derived from GPS location, Bluetooth proximity, and third-party data brokers (Cyphers and Gebhart, 2019; Kwet, 2019). The exploitation of this mass of data for marketing purposes rests on its encoding into vectors that simultaneously position users and their tastes into a high-dimensional space.
The breakthrough that made this possible emerged with the publication of the paper ‘Indexing by Latent Semantic Analysis’ by Deerwester et al. (1990), combined with increasingly powerful neural network systems. This work proposed a new approach to information retrieval by embedding language-based meaning into a high-dimensional vector space. At the time, search algorithms struggled with limitations caused by synonymy and polysemy, and users often failed to retrieve relevant documents because they used different terms than those in the target content. Latent Semantic Analysis processes documents holistically, using singular value decomposition to map each term into a multidimensional space (Jurafsky and Martin, 2024). Words that appear in similar contexts end up close together, and documents whose vocabulary profiles are similar are likewise co-located.
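The synonymy problem and its latent-space solution can be illustrated with a minimal sketch in Python with NumPy. The five terms and four documents below are invented for illustration and the matrix holds raw counts; production systems typically reweight the counts and work with far larger matrices.

```python
import numpy as np

# Hypothetical toy term-document count matrix (rows = terms, columns = documents).
# "car" and "automobile" never dominate the same document, but both co-occur
# with "engine"; "flower" and "petal" appear only in the fourth document.
terms = ["car", "automobile", "engine", "flower", "petal"]
X = np.array([
    [2, 0, 1, 0],  # car
    [0, 2, 1, 0],  # automobile
    [1, 1, 2, 0],  # engine
    [0, 0, 0, 2],  # flower
    [0, 0, 0, 1],  # petal
], dtype=float)

# Singular value decomposition, truncated to the k largest singular values.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
term_vecs = U[:, :k] * s[:k]     # one k-dimensional vector per term
doc_vecs = Vt[:k, :].T * s[:k]   # one k-dimensional vector per document


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


raw_synonyms = cosine(X[0], X[1])                   # similarity on raw counts
sim_synonyms = cosine(term_vecs[0], term_vecs[1])   # similarity in latent space
sim_unrelated = cosine(term_vecs[0], term_vecs[3])  # "car" vs "flower"

print(round(raw_synonyms, 3), round(sim_synonyms, 3), round(sim_unrelated, 3))
```

On raw counts the two near-synonyms look only weakly related (cosine similarity 0.2), whereas in the two-dimensional latent space they become almost indistinguishable, while "car" and "flower" remain orthogonal. The effect depends on the toy matrix and on the choice of k, which is an assumption of this sketch.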
The two technological developments described above, cookie-based tracking and vector-based language processing, might appear unconnected, but they combined to profoundly shape the models now used for processing user tracking data. The ecosystem of persistent individual identifiers led to the creation of gigantic datasets of remarkable informational richness. At the same time, neural networks building on the foundations of latent semantic analysis were developed to encode this data at scale into multi-dimensional vector spaces that offer high-resolution profiles of users' preferences and behaviours (Bell et al., 2009; Chen et al., 2016; Li et al., 2024; Pan and Ding, 2019; Zilliz, 2024).
The main element we want to emphasize with this very brief overview is the close connection between the mathematical underpinnings of singular value decomposition and multiple correspondence analysis. Both rely on an eigendecomposition to extract latent dimensions in a geometric space where proximity encodes similarity.
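This connection can be stated compactly. The notation below is ours and follows the standard textbook formulation of correspondence analysis; it is a sketch of the shared algebra, not a description of any particular platform's implementation. Here X stands for a term-document (or user-item) matrix, P for an indicator matrix divided by its grand total, and r and c for the row and column sums of P.

```latex
% Singular value decomposition and its link to eigendecomposition
X = U \Sigma V^{\top}
\quad\Longrightarrow\quad
X^{\top} X = V \Sigma^{2} V^{\top},
\qquad
X X^{\top} = U \Sigma^{2} U^{\top}

% Correspondence analysis: the same decomposition applied to
% the matrix of standardized residuals S
S = D_r^{-1/2} \left( P - r c^{\top} \right) D_c^{-1/2} = U \Sigma V^{\top},
\qquad
D_r = \operatorname{diag}(r), \quad D_c = \operatorname{diag}(c)

% Principal coordinates of rows (respondents) and columns (categories)
F = D_r^{-1/2} U \Sigma,
\qquad
G = D_c^{-1/2} V \Sigma
```

In both cases the latent axes are eigenvectors of a cross-product matrix and the squared singular values are the corresponding eigenvalues. What differs is the preprocessing: latent semantic analysis decomposes the (possibly reweighted) counts directly, whereas correspondence analysis first centres and rescales them by the row and column masses, which amounts to using chi-square distances between profiles.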
Conceptual and practical implications
Although they evolved from different origins and for radically different reasons, the topological conception of social space at the core of Bourdieu's theory and the user profiling models currently used for marketing purposes on the Web are strikingly similar. Methodologically, they both rest on the idea of a complex, multidimensional space where all elements simultaneously interact with one another through a vector-like set of positions and forces. Practically, both models also aim to connect individual-level characteristics with tastes and preferences in a probabilistic way. To the best of our knowledge, Bourdieu's work played no role in the development of the corporate profiling models. There is thus a form of validation by triangulation that comes from the independent development of a similar way to apprehend the same underlying social reality.
That said, there are obviously significant differences between Bourdieu's theoretical contributions and current marketing tools. Contemporary machine learning systems are optimized for predictive accuracy, not interpretability. Those models predict online behaviour but tell us nothing about the social mechanisms that generate the patterns they exploit. By contrast, Bourdieu used data to build conceptual tools and theories that go much further than the narrow data they were derived from.
Our argument here is thus in no way to suggest that corporate tracking tools are breaking new ground in social science. But when Bourdieu wrote his core conceptual contributions (Bourdieu, 1979c, 1980, 1984b), nobody could have imagined that the kind of empirical data making it possible to accurately position billions of humans according to their individual characteristics in a high-dimensional vector space would ever exist. Similarly, the mathematical foundations and computational power needed to make sense of such datasets simply did not exist. Now they do, and it is worth asking what the implications are.
First, we should note that it is unlikely that the data infrastructures developed by large corporations engaged in online profiling will be readily usable for sociological analysis in the near future. Although the European Union has created a potential opening through Article 40 of the Digital Services Act (European Parliament and Council of the European Union, 2022), substantial technical and procedural challenges remain. Methodological questions around sampling and representativeness would also need to be carefully explored. Our point is thus not to suggest that corporate profiling tools are going to suddenly open new doors in sociological research. But we do think that some deep questions could, theoretically, be tackled in a new way.
For instance, when Bourdieu developed his theorization of the interdependency of capitals and fields, he framed it as both a critique and a refinement of the class-based post-Marxist models influential in France at the time. One of his core points was that the post-Marxist concept of social class should be replaced by empirically derived analyses of the structuring forces at play and the way agents cluster in similar positions within the social space (Bourdieu, 1984b). Theoretically, the empirical identification of such ‘class-like’ clusters would now be possible on a truly massive scale.
At the societal level, the potential impacts are also important, though in a slightly more concerning way. The data on user profiles is not accessible for research purposes, but it is constantly exploited and refined to run ever more effective targeted marketing campaigns. In that sense, as we wrote earlier, there was something prescient in the marketing-derived language Bourdieu used. As he predicted, it is indeed possible to offer a scientifically based and empirically anchored science of taste: ‘to discover the intelligible relations which unite apparently incommensurable “choices”, such as preferences in music and food, painting and sport, literature and hairstyle’ (Bourdieu, 1984a: 6).
However, a turbo-charged capitalist consumer market is not the only practical use of the marketing power of the new models. The other commodities being pushed using optimized individual targeting are political ideologies. Electoral processes are now deeply influenced by online communication campaigns in which messaging is finely tuned for specific audiences. This makes for another interesting parallel between Bourdieu's work and current technological operationalizations. In the 1980s, Bourdieu and Champagne (Champagne, 1988, 1990; Bourdieu, 1981, 1984b, 1984c, 1984d, 1988) had already adapted the models developed in La Distinction to the realm of political marketing and the crafting of public opinion.
Such an extension is quite natural. Political dispositions are socially structured in predictable ways, and political games are played through complex habitus-mediated processes that heavily draw on symbolic capital to objectify and sustain existing power relations. However, it is revealing to witness these same two levels reproduced in the corporate landscape. Profiling tools can provide accurate assessments of people's political preferences and opinions (Khan and Khan, 2024; Yu, 2023) in the same way they predict preferences for commercial goods, and individually optimized political messaging provides an extraordinarily powerful instrument for shaping public opinion and influencing voting behaviour. That such tools are in the hands of private corporations, with precious few guardrails in place, is deeply concerning, especially given the increasingly blurred boundaries between marketing and law enforcement (Cohen and Hongo, 2025).
Conclusion
In 1979, Bourdieu argued that it was possible to predict tastes and cultural preferences in a probabilistic way from people's social position. At the time, this claim was bold and somewhat controversial. Nearly half a century later, models that perform precisely this kind of predictive analysis are ubiquitous, drawing on gigantic datasets and near-limitless computing capacity. As we discussed, there is a genuine, non-trivial structural parallel between Bourdieu's topological conception of social space and the vector-based models that power contemporary user profiling and recommendation systems. We believe this short commentary barely scratches the surface of the topic at hand. But we want to conclude by saying that, despite the critiques at the time, Bourdieu was right on taste.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
