Social Sorting: A Research Dossier
Overview
Social sorting is one of the most important, and least publicly understood, concepts in critical surveillance studies. It reframes the problem of state and corporate data collection away from individual privacy (the dominant, liberal-legalist framing) toward a structural analysis of how classification systems create, reinforce, and amplify social hierarchies. The concept has a clear lineage running from Gandy (1993) through Lyon (2003) into contemporary scholars of algorithmic discrimination, and it applies with direct, documented force to the Canadian census–SDLE–nudge governance architecture.
Social sorting is not a conspiracy theory. It is a description of how the machinery of classification operates, and why the state’s claim that “we anonymize the data” misses the point entirely.
Part One: The Intellectual Genealogy
Foucault and the Urge to Classify
The deepest root is Michel Foucault’s concept of biopower, the modern state’s drive to govern populations through knowledge of their biological and social characteristics, organizing people into categories of normality and deviance in order to manage them. Lyon directly builds on Foucault, arguing that contemporary surveillance extends this “classifying drive” through digital infrastructure: statistical normality, operationalized through computer codes, becomes the benchmark against which every person is assessed and sorted.
The “modern urge to classify,” Lyon notes, found its ideal instrument in the computer.
Oscar Gandy: The Panoptic Sort (1993)
The most direct precursor to social sorting as a concept is Oscar H. Gandy Jr.’s The Panoptic Sort: A Political Economy of Personal Information (1993, updated 2021). Gandy defines the panoptic sort as:
“A kind of high-tech, cybernetic triage that sorts people according to their presumed economic or political value.”
His framing is explicitly about identification, classification, assessment, and distribution: the four operational steps by which data systems assign people to “their places in the array of life chances”. The panoptic sort is not simply about watching; it is about granting or denying resources, opportunities, and privileges based on how the sorting mechanism classifies you.
Critically, Gandy’s 2021 update extends his original analysis to the platform economy, where “vast amounts of transaction-generated information have facilitated the generation of descriptive, classificatory, interpretive, and predictive information that has increased the reach of surveillance beyond anything” he had imagined in 1993. His core conclusion, that technology is instrumental but not determinative and that “the imperative resides not in the machines but in the people who use them”, is essential: the machinery doesn’t produce injustice automatically; it amplifies the intentions and biases of whoever designs and deploys it.
David Lyon: Surveillance as Social Sorting (2003)
David Lyon, Professor of Sociology at Queen’s University in Kingston, Ontario, synthesized the field in his edited volume Surveillance as Social Sorting: Privacy, Risk and Automated Discrimination (2003, Routledge). His central definition:
“For surveillance today sorts people into categories, assigning worth or risk, in ways that have real effects on their life-chances. Deep discrimination occurs, thus making surveillance not merely a matter of personal privacy but of social justice.”
This is the foundational claim. Lyon proposes that:
- Surveillance is not primarily about watching individuals but about building and deploying classification systems that govern populations.
- The critical ethical and political question is how categories are constructed and who decides what counts as “disability,” “risk,” “vulnerability,” or “fraud risk”, because those categories determine differential access to resources and freedoms.
- Social sorting defuses the conspiracy reading: “It’s not a conspiracy of evil intentions or a relentless and inexorable process”. It is structural, ambient, and built into the ordinary operation of bureaucratic and administrative systems.
- The result is “coded bodies”: people whose physical movements and access to opportunities are governed by prior computational determinations about what category they belong to.
Lyon notes that surveillance “always serves some purposes at the expense of others,” and that the categories it builds are never neutral; they embed the values and interests of whoever designed the system.
Part Two: What Social Sorting Does: The Mechanisms
Classification as Governance
Social sorting operates through a four-step cycle that Lyon and Gandy both describe: identification, classification, assessment, and differential treatment.
- Identification: personal data is collected and linked to an individual or household.
- Classification: that data is processed into a category such as disability type, income bracket, housing precarity level, “vulnerable population,” or “fraud risk.”
- Assessment: a score or profile is generated, often algorithmically, predicting behaviour, need, or risk.
- Differential treatment: resources, interventions, restrictions, or nudges are applied differently based on the category assigned.
The individual is the raw input; the category is the operative unit; and the differential treatment is the outcome that shapes life chances.
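To make the cycle concrete, here is a minimal sketch of the four steps as a data pipeline, in Python. Everything in it is hypothetical: the field names, threshold, and category labels are invented for illustration and are not drawn from any real Statistics Canada or SDLE system.

```python
from dataclasses import dataclass

# Hypothetical household record; the fields are illustrative only.
@dataclass
class HouseholdRecord:
    household_id: str            # identification: data linked to a household
    monthly_income: float
    monthly_shelter_cost: float
    has_activity_limitation: bool

def classify(rec: HouseholdRecord) -> str:
    """Classification: raw attributes are collapsed into a category label."""
    ratio = rec.monthly_shelter_cost / max(rec.monthly_income, 1.0)
    return "high-shelter-cost-ratio" if ratio > 0.30 else "standard"

def assess(rec: HouseholdRecord, category: str) -> float:
    """Assessment: a score is generated from the category, not from conduct."""
    score = 0.5 if category == "high-shelter-cost-ratio" else 0.1
    if rec.has_activity_limitation:
        score += 0.2  # category membership, not behaviour, drives the number
    return score

def treat(score: float) -> str:
    """Differential treatment: the score gates what the household receives."""
    return "targeted-intervention" if score >= 0.6 else "no-action"

rec = HouseholdRecord("H-001", monthly_income=2400.0,
                      monthly_shelter_cost=900.0, has_activity_limitation=True)
print(treat(assess(rec, classify(rec))))  # -> targeted-intervention
```

The sketch makes the structural point visible: once classify runs, every downstream step operates on the category label, and the individual record survives only as raw input.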
Actuarial Logic: Predicting Rather Than Responding
A defining feature of modern social sorting is its prospective, actuarial character. Systems are designed not to respond to what you have done but to predict what you are likely to do or need, based on what your category looks like statistically. Lyon calls this “prospection”: codes promise advance vision, perceiving future events and managing populations pre-emptively.
This is precisely what Statistics Canada’s Disaggregated Data Action Plan produces: not a record of what people have done, but a model of what “women aged 25–54 with activity limitations and high shelter-cost ratios in Hamilton” are likely to need or do, so that interventions, nudges, and programs can be pre-designed and deployed on that segment.
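A hedged sketch of what this prospective logic reduces to, assuming the simplest possible form, a segment lookup table; the segment keys and base rates below are invented for illustration:

```python
# Actuarial "prospection" in miniature: the score comes entirely from
# segment-level statistics, never from anything the individual has done.
# Keys: (age_band, activity_limitation, high_shelter_cost_ratio).
SEGMENT_BASE_RATES = {
    ("25-54", True, True): 0.42,
    ("25-54", True, False): 0.18,
    ("25-54", False, True): 0.21,
    ("25-54", False, False): 0.07,
}

def prospective_need_score(age_band: str,
                           activity_limitation: bool,
                           high_shelter_cost_ratio: bool) -> float:
    """Return the modeled probability for the person's segment. Two people
    with identical histories but different segments get different scores;
    everyone in the same segment gets the same one."""
    return SEGMENT_BASE_RATES[(age_band, activity_limitation,
                               high_shelter_cost_ratio)]

# Everyone mapped to this cell is pre-assigned the same predicted need,
# before any of them has asked for anything.
print(prospective_need_score("25-54", True, True))  # -> 0.42
```

Membership in the statistical cell is the entire input; the person’s own history never enters the calculation.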
The “Veneer of Neutral Expertise”
Perhaps the most politically important feature of social sorting is its claim to neutrality. Because the mechanism is mathematical and the output is statistical, the process presents itself as objective. When an algorithm sorts a population into risk categories, decision-makers can claim they are simply “following the data”, effectively laundering political and value-laden choices about who deserves what through a technical process that appears to be beyond ideology.
This is what allows governments to describe census-derived population profiles as “evidence-based policy” rather than as the exercise of classificatory power it actually is. The numbers don’t lie, but the categories the numbers are sorted into are human constructions, built into the system by people with interests, and often invisible to those being sorted.
Part Three: The Social Justice Dimension: Who Bears the Weight
Surveillance Falls Unevenly
One of Lyon’s core arguments, extended significantly by feminist scholars, is that social sorting is not experienced uniformly. The populations most intensively sorted, classified, profiled, and subjected to algorithmic governance are those already marginalized: people with disabilities, low-income households, racialized communities, migrants, and those with precarious housing.
This creates a feedback loop: the groups most subjected to state surveillance generate the most data about themselves; that data feeds classification systems; those systems produce profiles that attract more surveillance and more intervention; the groups remain permanently in the view of the state’s sorting apparatus.
Simone Browne: Racializing Surveillance
Simone Browne’s Dark Matters: On the Surveillance of Blackness (2015, Duke University Press) extends Lyon’s framework to show that surveillance is not simply discriminatory in its effects but constitutively racial in its genealogy. She demonstrates that contemporary surveillance technologies and practices are informed by the long history of racial formation (slave ship manifests, branding, runaway slave notices, lantern laws), and that the mechanisms of “sorting, counting, and surveilling of human beings” were as central to early capitalism as they are to the information economy.
Her concept of “racializing surveillance” names the power to define “what is in or out of place”, and she shows that this power has never been exercised neutrally; it has consistently defined bodies of colour, deviant bodies, and non-normative bodies as “out of place” and in need of monitoring.
Virginia Eubanks: Automating Inequality (2018)
Virginia Eubanks’ Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor (2018, St. Martin’s Press) brings the analysis to its most concrete applied form. Her key argument:
“We all live under this new regime of data analytics, but we don’t all experience it in the same way. Most people are targeted for digital scrutiny as members of social groups, not as individuals.”
She documents cases across the United States: Indiana’s automated welfare eligibility system, which denied one million applications by interpreting any administrative mistake as “failure to cooperate”; Los Angeles’s coordinated-entry algorithm, which ranks tens of thousands of homeless people for access to inadequate housing resources; and Pittsburgh’s child-welfare predictive model, which targets families fitting a statistical profile. Her conclusion is stark:
“The U.S. has always used its most cutting-edge science and technology to contain, investigate, discipline and punish the destitute.”
The digital poorhouse, in her framing, performs the same function as the Victorian poorhouse: it hides poverty from the middle class while giving the state “the ethical distance it needs to make inhumane choices”.
Part Four: Social Sorting in the Wild: Real-World Cases
The Netherlands: The Childcare Benefits Scandal
The most devastating documented case of social sorting gone catastrophically wrong is the Dutch childcare benefits scandal. The Dutch Tax Authority deployed a self-learning algorithm to classify benefit claims by fraud risk. The algorithm systematically flagged applications from parents with dual citizenship as high-risk; officials then treated this classification as near-definitive proof of fraud. Over 26,000 families were falsely accused of fraud, forced to repay tens of thousands of euros, and driven into poverty, divorce, and homelessness. The algorithm could not explain its own outputs; the bias was structurally embedded but invisible. The scandal forced the resignation of the Dutch government in January 2021.
Separately, Rotterdam’s welfare fraud detection algorithm ranked recipients based on their clothing and their fluency in Dutch, and disproportionately targeted single mothers with migrant backgrounds, without any disclosure to those affected.
France: The Social Security Algorithm
In October 2024, Amnesty International and 14 coalition partners filed a complaint against the French social security agency’s national family allowance fund (CNAF) over a risk-scoring algorithm used to detect benefit overpayments. Amnesty described the system as one that “highlights, sustains and enshrines the bureaucracy’s prejudices and discrimination,” specifically targeting the most precarious families (those with disabilities, migrants, single parents) for intensive fraud investigation, while embedding existing social hierarchies into what presented itself as neutral mathematics.
Canada: Job Bank and Employment Algorithms
Within Canada, Employment and Social Development Canada (ESDC) already uses a machine-learning-based statistical algorithm in Job Bank Canada to suggest job matches to job seekers, including those with disabilities, and uses AI to identify “transferable skills”. This system, built on census- and survey-derived population models, operates on people whose disability status, income, and employment history were collected, in part, through compulsory census disclosure.
Part Five: The Privacy Critique Falls Short
One of the most analytically powerful insights in social sorting scholarship is that individual privacy law is structurally inadequate to address the harms this scholarship describes.
Privacy rights are built around the individual: your data, your consent, your right to access your record. But social sorting harms occur at the group level. As Mann and Matzner (2019) explain:
“Data gathered at a particular place and time relating to specific persons can be used to build group models applied in different contexts to different persons. Thus, privacy and data protection rights, with their focus on individuals, do not protect from the discriminatory potential of algorithmic profiling.”
The model built from census data does not harm you as an individual; after all, Statistics Canada “anonymizes” your record. The model harms you as a member of a category, and no privacy law protects the category. Anti-discrimination law is more promising but also limited: it requires demonstrating intent to discriminate or demonstrable disparate impact against a protected class, while algorithmic sorting often produces discrimination through proxy variables that correlate with protected characteristics without explicitly using them, as the sketch below illustrates.
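A minimal sketch of that proxy mechanism, run on synthetic data; the “postal zone” variable and the correlation strengths are invented for illustration:

```python
import random

random.seed(0)

# Synthetic population: the proxy (a postal zone) correlates with protected-
# class membership, e.g. through residential segregation.
population = []
for _ in range(10_000):
    protected = random.random() < 0.3
    zone = "A" if random.random() < (0.8 if protected else 0.2) else "B"
    population.append((protected, zone))

def flag_for_investigation(zone: str) -> bool:
    """The sorting rule sees only the proxy, never the protected attribute."""
    return zone == "A"

def flag_rate(group: list) -> float:
    return sum(flag_for_investigation(zone) for _, zone in group) / len(group)

protected_group = [p for p in population if p[0]]
others = [p for p in population if not p[0]]
print(f"protected: {flag_rate(protected_group):.0%}, "
      f"others: {flag_rate(others):.0%}")
# Typical output: roughly 80% of the protected group is flagged versus roughly
# 20% of everyone else, though the rule never references the protected class.
```

An intent standard finds nothing to condemn here, and a disparate-impact claim must first reconstruct a correlation the system itself never writes down.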
This is the structural gap that makes social sorting a justice problem that existing legal frameworks were not designed to address.
Part Six: What Accountability Would Actually Look Like
Scholars across this literature converge on a set of conditions that would make data-driven governance minimally ethical:
- Transparency about categories: the classification schemes used must be publicly disclosed, including who designed them, what assumptions they embed, and what populations they are applied to.
- Independent oversight: algorithmic systems used in public policy, benefits, policing, and nudge campaigns must be subject to review by bodies independent of the government deploying them, with power to halt, modify, or require justification.
- Contestability: individuals and groups who believe they have been incorrectly classified or unfairly targeted must have meaningful avenues to challenge the classification and obtain redress.
- Sunset and proportionality: linked data environments, like SDLE, should have time-limited authorizations, purpose limitations, and mandatory destruction schedules, not permanent, open-ended growth.
- Genuine consent architecture: where collection is truly necessary, anonymized sampling with voluntary participation protects representativeness without coercing intimate disclosure.
Canada currently has none of these in any robust form for its census-SDLE-nudge governance apparatus.
Key Texts and Sources
- Oscar H. Gandy Jr., The Panoptic Sort: A Political Economy of Personal Information (1993; updated edition 2021).
- David Lyon (ed.), Surveillance as Social Sorting: Privacy, Risk and Automated Discrimination (Routledge, 2003).
- Simone Browne, Dark Matters: On the Surveillance of Blackness (Duke University Press, 2015).
- Virginia Eubanks, Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor (St. Martin’s Press, 2018).
- Mann and Matzner (2019), on the limits of individual privacy and data protection rights against group-level algorithmic profiling (quoted in Part Five).
- Amnesty International, October 2024 complaint against the French CNAF risk-scoring algorithm (discussed in Part Four).
The Sentence That Ties It Together
Lyon’s formulation is the one to keep returning to, because it contains the entire argument in two clauses:
“Surveillance today sorts people into categories, assigning worth or risk, in ways that have real effects on their life-chances. Deep discrimination occurs, thus making surveillance not merely a matter of personal privacy but of social justice.”
Substitute “census data infrastructure” for “surveillance”. The government’s entitlement claim is that it is improving policy. The social sorting framework shows that what it is actually doing is assigning worth and risk to groups, under a veneer of neutral statistics, with no meaningful oversight, transparency, or accountability, and that this is not a privacy matter. It is a justice matter.