Database States | Sanjana Varghese

In early 2007, a package sent from the north of England to the National Audit Office (NAO) in London went missing. In it were two discs containing the personal records of twenty-five million people—including their addresses, birthdays, and national insurance numbers, which are required to work in the UK—that the NAO intended to use for an “independent survey” of the child benefits database to check for supposed fraud. Instead, that information was never recovered, a national scandal ensued, and the junior official who mailed the package was fired.

The UK, as it turns out, is not particularly adept at securing its data. In 2009, a group of British academics released a report calling the UK a “database state,” citing the existence of forty-six leaky databases that were poorly constructed and badly maintained. Databases that they examined ranged from one on childhood obesity rates (which recorded the height and weight measurements of every school pupil in the UK between the ages of five and eleven) to IDENT1, a police database containing the fingerprints of all known offenders. “In too many cases,” the researchers wrote, “the public are neither served nor protected by the increasingly complex and intrusive holdings of personal information, invading every aspect of our lives.”

In the years since, databases in the UK—and elsewhere—have only proliferated; increasingly manufactured and maintained by a nexus of private actors and state agencies, they are generated by and produce more and more information streams that inevitably have a material effect on the populations they’re used by and against. More than just a neutral method of storing information, databases shape and reshape the world around us; they aid and abet the state and private industry in matters of surveillance, police violence, environmental destruction, border enforcement, and more.

Databases and their particularities are often seen as a merely technical concern—something to be worried about only when there’s a glitch or a security breach. But as theorist Liam Young argues in his book List Cultures: Knowledge and Poetics from Mesopotamia to BuzzFeed, statistics and information have emerged as the “lifeblood” of the modern state. These massive streams of information—portrayed as seamlessly whizzing from one dataset to another—produce the state inasmuch as they also reveal the state as it is.

As the authors of the 2009 report note, the aims of what they call the “enforcing state” and the “public services agenda” are increasingly fused together; information collected about individuals en masse is cast as a way to better provide for them. To a very limited degree, there is some value to this argument—in certain health care contexts, for example around vaccinations and allergies, detailed datasets offer clear benefits. But these databases demonstrate how excessive—and often unnecessary—the majority of their counterparts are.

In their report, the authors devised a classification system for databases, which goes from red (databases that should be taken offline and changed immediately) to green (databases that are, by and large, appropriate). One of the in-between “amber databases,” the Automatic Number Plate Recognition (ANPR) system, remains in use—even after it was implicated in the murder of a young black man, Chris Kaba, by the police in London last year. ANPR cameras read the registration numbers of cars (sixty million records on a daily basis across the UK, according to the Metropolitan Police of London), which are then cross-checked against police databases containing information on “vehicles of interest.” In the case of Kaba, the ANPR camera linked the car he was driving to a previous firearms incident, although the car wasn’t registered to Kaba himself. Police pursued Kaba down a narrow street, didn’t identify themselves, and then shot and killed him—all under an erroneous suspicion.

At the time of the Database State report, the Labour government then in power was dedicated to what it called “joined-up” thinking, or pulling information from one part of the state to another in order to enact policy and deliver services more efficiently and effectively. In practice, this led to the creation of massive databases containing seemingly indiscriminate amounts of information, databases that frequently had no legal justification for their existence. Though often seen as neutral, or even user-friendly, by those who design them, these repositories are political interventions; they make possible the governmental overreach and omnipresent surveillance that are the dominant features of our times.

Public debate around databases tends to focus on the question of extraction—the extent to which modes of data collection pose a threat to civil liberties. But there’s more to it. As Dan McQuillan, an academic and lecturer in creative and social computing at Goldsmiths College, explains, the relational database “transcribes between informational content and action in the world,” meaning that even seemingly neutral, legal data collection inevitably produces outcomes that may not have occurred otherwise. In this way, too, massive government databases are never just benign repositories.

Take, for instance, then-home secretary Theresa May’s creation of a “hostile environment” for immigrants in 2013, which effectively enlisted thousands of people working ordinary jobs into border management with the introduction of immigration checks throughout civil society. This made it possible for datafication to take place on a massive scale as the number of databases maintained by the Home Office proliferated. As more and more parts of civil society—such as landlords, employers, and doctors—are drawn into enforcing border controls, the potential for harm and violence inflicted by the technology that underpins that enforcement grows.

Even if they are error-prone, what other pathways of marginalization, oppression, and incarceration do databases make possible or more efficient? In 2019, the organizations Southall Black Sisters and Liberty filed a complaint against the Metropolitan Police, which had routinely been requesting information about the immigration status of people who came to them to report violence such as domestic abuse or trafficking. This meant that victims of violent crimes were coming to the police and then being earmarked for removal, sometimes subject to detention, because of their immigration status. Despite the efforts of campaigners, this practice has been fortified by the police through new legislation.

Inherent to many databases is a desire to control and track citizens deemed criminal or untrustworthy in some way. But even when these systems malfunction or fail, the effect is often the same. In 2020, the Independent Chief Inspector of Borders and Immigration released a report on the Home Office’s use of interpreters in asylum cases and found that there were significant gaps in the Interpreter Operations Unit (IOU) database. One applicant who spoke Rohingya was interviewed by three separate interpreters because there was not a single interpreter who spoke Rohingya. Other times, interpreters would be marked as available when they weren’t (and vice versa), causing considerable delays in processing people’s asylum claims.

If you’re on immigration bail in the UK—that is, marked for detention or deportation—you must wear an ankle bracelet that monitors your movement 24/7. The database that this location information is stored on can be accessed for wide-ranging purposes by relevant Home Office staff and used in a variety of ways: “trail” data can reveal someone’s political views (traveling to the location of a known march or a trade union office), their ethnic origin or religious views (traveling to sites of worship, community centers), their health information (traveling to a specialized surgeon’s office or a hospital a certain number of times), as well as other information. GPS monitoring trail data can also, for instance, be used to suggest that someone does not visit their children frequently. Campaigners and advocates worry that this data will be used against defendants in immigration cases—even when it fails to paint an accurate picture.

In the case of the Windrush scandal, where nearly sixty thousand people from Jamaica and the Caribbean who had moved to the UK in the 1960s and 1970s found themselves threatened with detention and deportation as they struggled to prove their right to be in the country, the paper landing cards that testified to their right to be in the UK were destroyed when the Home Office moved workplaces in 2010. This only came to light in 2018, when an ex-employee said that the boxes containing the landing cards, maintained in a government building’s basement, had been destroyed despite the concerns of employees. For many of those affected, it was the sole existing proof of their right to be in the country.

In the weeks after the news of this scandal broke, commentators and other pundits called for the introduction of a national digital ID system: Why, they wanted to know, had the records of people’s immigration status not been maintained digitally? Yet this misses a fundamental point about the desirability of digital databases, which are just as prone, if not more so, to destruction, breaches, and leaks—as even the Home Office admits.

“They’ve been trying to develop these systems over decades, and what you’ve tended to see is that as each politician or Home Secretary has come in, they’ve said, we’re going to sort out this shambolic system. We’re going to do X and Y, but they’ve just demanded new capabilities from the systems that exist,” Edin Omanovic of Privacy International, a London-based charity, told me in an interview. “The direction of travel is for these immigration databases to be combined with law enforcement data,” he adds. “It might technically exist in different systems—they’ll say there’s a firewall—but if what you’re doing is accessing it through one centralized point, it doesn’t matter.”

One database that has come under scrutiny in recent years is the so-called “gangs matrix” developed by the Metropolitan Police to compile information about young, predominantly black teenage boys suspected of being involved in gangs. Successive reviews and inquiries have come to the same conclusion: the database is discriminatory. The vast majority of those in it—nearly 80 percent—are black men, and it relies on racist tropes about criminality and blackness. Recent reports indicate that the information of hundreds of young men is being removed from this database, but others are popping up to take its place. Fingerprint scanners are now routinely carried by police forces around the UK and can be used against anyone “who is suspected of not being who they say they are.” These scanners are hooked up to something called the Biometric Services Gateway, which makes it possible for a police officer to search both the police biometrics and immigration biometrics databases simultaneously. It comes as no surprise, as recent reports indicate, that black people are four times as likely to be stopped and scanned as their white counterparts.

It’s erroneous to suggest that all of this is merely a government function: a dizzying array of contractors and subcontractors, bearing names like Cognizant, are involved in the production and maintenance of this hostile infrastructure. (One such company, Infosys, is linked to the family of current British prime minister, Rishi Sunak.) Freedom of Information Act requests filed with the Home Office are often refused on the grounds that they will reveal commercial information which could be advantageous to a supplier’s competitors.

These issues are not restricted to the UK—vulnerable populations across the world are often the targets and testing grounds of new data-collection technologies. For example, the UN Refugee Agency registers iris scans of individuals in refugee camps to distribute food. Or look at the National Security Agency’s Skynet program, which monitors the cell phone activity of suspected militants in Pakistan and the Middle East in support of the United States’ endless wars. (Former NSA Chief Michael Hayden once bragged that we “kill people based on metadata.”)

The organizing and ordering of potentially disparate bytes of information theoretically makes it possible for the state to construct a cybernetic Frankenstein of each of its citizens. The cryptic databases where such portraits are stored are harmful when they work as designed—and more so when they fail. Rida Qadri writes about how, in Pakistan, the database behind the nation’s Computerized National Identity Card produces errors if someone doesn’t have married parents, thereby cutting them off from all kinds of social and civic benefits (such as being able to vote or open a bank account). In 2021, a massive breach of an Indian government database meant that people were able to buy the details of individuals—their names, addresses, phone numbers, and sometimes photos—for as little as $8. In Afghanistan, biometric information, including family trees, was left on insecure databases used by the government, which then became open to capture by the Taliban in late 2021.

Over time, the mere existence of such databases becomes a continued justification for their use, so entrenched are they in everyday governance, in policy and decision-making. They represent not merely everything that the state already knows about an individual but also what’s possible for the state to know, if and when it becomes ostensibly necessary. In the fifteen years since two discs containing the personal records of twenty-five million British citizens were lost in the mail, there have been many more scandals—and there will certainly be others as the modern state becomes more reliant on data. Rather than reacting to them as they happen, we should consider them as the substrate of contemporary life and the contemporary state.

In The Real World of Technology, her treatise on the relationship between technology and the military, Ursula Franklin asks, “How does one speak about something which is both fish and water, means as well as end?” We can start on the level of minutiae—considering how the underlying data architectures of the modern state come to be, and crucially, resisting attempts to embed them further.