Virtual Searches: Regulating the Covert World of Technological Policing
Christopher Slobogin
New York University Press, $30 (cloth)

Predict and Surveil: Data, Discretion, and the Future of Policing
Sarah Brayne
Oxford University Press, $29.95 (cloth)

As most people know by now, technology is dramatically reshaping the practice of policing. Consider how an investigation might unfold after a spate of shoplifting incidents at a big-box retail store. Authorities believe the same person is responsible for all of them, but they have no leads on the perpetrator’s identity. In a process known as “geofencing,” the police go to a judge and get a warrant instructing Google to use its SensorVault database (which stores location information on any Google users who have “location history” turned on) to provide a list of all cellphones that were within 100 yards of the store during the relevant one-hour window on each day the thefts took place.

Police cross-reference those lists and narrow their focus to sixty-five cellphones present on all the relevant days. Google then provides law enforcement more detailed information on those sixty-five users, such as name, email address, when they signed up to Google services, and which services they use. Police feed those sixty-five names into a facial recognition database and cross-reference the results against the store’s internal security cameras. This allows them to trace the movements of each of these sixty-five individuals throughout their visits to the store—identifying which products they looked at, which ones they chose to purchase and which, if any, they failed to pay for.
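The cross-referencing step in this scenario amounts to a set intersection. The Python sketch below is illustrative only: the device identifiers and the `devices_present_on_all_days` helper are hypothetical, and nothing here reflects how Google or any police department actually structures geofence returns.

```python
from functools import reduce


def devices_present_on_all_days(daily_device_lists):
    """Return the device IDs that appear in every day's geofence list."""
    if not daily_device_lists:
        return set()
    return reduce(set.intersection, (set(day) for day in daily_device_lists))


# Hypothetical, anonymized device IDs returned for three incident windows.
day_1 = ["dev-a", "dev-b", "dev-c", "dev-d"]
day_2 = ["dev-b", "dev-c", "dev-e"]
day_3 = ["dev-c", "dev-b", "dev-f"]

common = devices_present_on_all_days([day_1, day_2, day_3])
print(sorted(common))  # ['dev-b', 'dev-c']
```

Even in this toy version, every device on the daily lists has had its location disclosed, including the ones that drop out of the intersection.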

In another scenario, instead of using cellphone information, the police might use footage recorded from nearby Ring doorbell cameras, automated license plate readers (ALPRs), or drone- or satellite-based cameras to identify vehicles in the area during relevant time windows. In still another possibility, the police might rely on a predictive algorithm identifying the store and its neighborhood as a “hot spot,” where crime is likely to take place, to deploy a large police presence in the area to deter crime.

Whether effective or not, these tools raise significant concerns. Aggregating massive amounts of personal data in the hands of law enforcement poses a significant threat to individual privacy. Data-driven tools generate an insatiable appetite for data—not only data specifically about criminal activity, but data about anything—which is dramatically expanding the scope of law enforcement scrutiny to individuals who have no history of law enforcement contact. In other words, the dragnet of data collection is universal, not limited to suspected criminals. And there are few checks in place to constrain how that data is used or to ensure that it is accurate. In addition, some forms of law enforcement reliance on data threaten to entrench and exacerbate existing racial disparities that continue to plague policing.

Police use of data is nothing new. But the volume, nature, and scope of the data that currently makes its way into law enforcement’s hands, as well as the role that private-sector, for-profit entities play in this information ecosystem, have changed the landscape in ways that raise new and vitally important questions about what law enforcement agencies can do, what they should do, and how they should be regulated. Unfortunately, law enforcement agencies are increasingly embracing technologically enhanced policing techniques for which there are few, if any, existing limits.

Two illuminating books explore these issues from different but complementary perspectives. Christopher Slobogin’s Virtual Searches provides the view from 30,000 feet, offering an immensely useful typology of “police investigatory techniques carried out covertly, remotely, and technologically, rather than through physical intrusion.” A leading scholar of the security and privacy implications of digital policing, Slobogin points out that Virtual Searches is not actually about searches—at least not within the legal meaning of the term. And that is exactly the book’s point (as its clever double-entendre of a title suggests): the Supreme Court’s narrow interpretation of a Fourth Amendment “search” allows police to adopt a vast swath of investigative tactics without having to get a warrant based on probable cause. At the same time, Slobogin recognizes that not all “virtual searches” are created equal, and he provides a reasonable framework for thinking about how to regulate the different kinds of investigations enabled by existing surveillance technology.

For anyone left wanting more details, Sarah Brayne’s recent Predict and Surveil, a sociological study of the LAPD and the LA County Sheriff’s Department, provides a first-person account of how some of these same techniques operate in practice. The most interesting and important discussions in these books come in areas where their focus overlaps: predictive policing (sometimes also referred to as “data-driven” or “precision” policing), and the aggregation of data from multiple sources into large databases that can be mined for information. Together, their discussions illustrate both the promise and the perils of policing in the age of big data.

The traditional locus of law enforcement regulation is the courts, through their interpretation of the Fourth Amendment, which bars the government from performing “unreasonable searches and seizures” of our “persons, houses, papers, and effects.” Slobogin and Brayne each devote a chapter to explaining why this constitutional protection rarely applies to “virtual” searches. Slobogin’s chapter provides a more detailed account of the law, and he also expresses some optimism that constitutional doctrine might be—very incrementally—beginning to confront some of the challenges posed by modern technology. But the upshot of both authors’ legal analysis is that contemporary investigative measures often fall into one of two Fourth Amendment blind spots.

First, a landmark Supreme Court case, Katz v. United States, held that government collection of information is considered a “search” for Fourth Amendment purposes only when there is a “reasonable expectation of privacy” in the information collected. Crucially, neither information exposed to the public (such as your location in a public place) nor information in the hands of a third party (like financial information held by your bank) enjoys such a reasonable expectation. Therefore, the Court has said, Fourth Amendment protections are inapplicable when government agents follow someone “traveling in an automobile on public thoroughfares,” or fly over a privately owned property surrounded by a ten-foot fence to see what’s inside, or acquire from your telecommunications provider a list of the phone numbers that you have dialed.

What this means today is that the Fourth Amendment poses no obstacles to law enforcement agencies collecting, for example, video from CCTV, ALPRs, Ring cameras, or police body cameras. Nor are there constitutional constraints on government access to records about Internet searches and web surfing behavior, non-content call data, or e-commerce activity. The Supreme Court has begun to recognize that its Fourth Amendment jurisprudence might have outlived its time. In the 2018 case of Carpenter v. United States, for example, the Court held that the government does, in fact, need a warrant based on probable cause to get your cellphone provider to turn over a week’s worth of your cell-site location information (CSLI), a catalog of your movements collected when your cellphone identifies its location to nearby cell towers. But that decision still stands as a small island of protection in a veritable sea of information about each of us.

The Fourth Amendment’s other blind spot is that it has nothing to say about how information is used once in the government’s possession. This means that once data has been lawfully collected, police need not demonstrate probable cause or secure a warrant to access it. Imagine an individual’s DNA is collected upon their arrest, a common practice regardless of whether the arrestee is ever prosecuted or convicted of a crime. That DNA profile will be saved in the national Combined DNA Index System (CODIS) database. The police are then free to compare DNA collected from subsequent crime scenes against the profiles in CODIS seeking a match. Similarly, there is no Fourth Amendment bar to police compiling video footage acquired from around town through various means into a single database and then identifying and tracking a particular individual’s movements over time using facial recognition software. Nor is there a prohibition on using past crime data to generate algorithms to predict which individuals are likely to be involved in criminal activity.

Elected officials have failed to close these gaps. Even where they have stepped in to protect sensitive information—such as federal legislation regarding health-related data and the contents of phone calls and emails—there are exemptions for law enforcement access. Contemporary efforts share this shortcoming. Both the Biden administration’s “Blueprint for an AI Bill of Rights,” which seeks to protect against potential harms from inaccurate or biased algorithmic decision-making, and the Data Protection Act of 2021, which seeks to create an independent agency to regulate the collection, processing, and sharing of personal data, exclude law enforcement practices from their ambit. Some state and local jurisdictions have proactively taken steps to limit law enforcement’s use of big data or modern surveillance methodologies. For example, while California’s expansive Consumer Privacy Act does not apply to government agencies, the state does require all users of ALPRs to adopt a policy to ensure that their use is consistent with “respect for individuals’ privacy and civil liberties.” And several major U.S. cities have banned the use of facial recognition, predictive policing, or both.

But these measures are few and far between. As is often the case, technology has far outpaced the law, evolving at a rate that legislatures have not matched. It thus remains an open question how to realize digital tools’ potential to improve public safety without exacerbating racial disparities, enabling police abuse, or transforming society into a dystopian surveillance state where privacy is as outdated as 8-track tapes and typewriters.

Slobogin’s typology of virtual “searches” identifies five categories: suspect-driven (looking for a specific person), event-driven (investigating a specific event, usually a crime), profile-driven (predictive policing), program-driven (surveillance programs not based on suspicion, like ALPR systems), and volunteer-driven (data collection provided voluntarily by non-government actors). The categories that raise the most acute privacy concerns are the profile- and program-driven categories, the two that Brayne examines most closely. A look under the hood of each of these, starting with predictive policing, helps illustrate what’s at stake.

Predictive policing relies on large data sets to generate statistical probabilities or algorithmic models—usually developed by private, for-profit entities—to predict criminal activity. Some predictive policing identifies places (known as “hot spots”) where crimes are more likely to take place, while others identify people (sometimes referred to as “hot people”) who are more likely to either commit or be a victim of crime.

Some hail predictive policing as a means of simultaneously improving police effectiveness and reducing bias in policing. Using statistical information and mathematical algorithms, the argument goes, will eliminate the discretion that so frequently results in disproportionate impacts on communities of color. (Slobogin himself made the case for properly regulated “risk assessment instruments” on these grounds in a previous book, Just Algorithms.) Critics, however, including legal scholars Dorothy Roberts and Andrew Ferguson, point to the potential for reproducing and magnifying existing inequities in the policing system. Consider the impact of one common metric: arrest records. People of color are overrepresented in arrests in part due to things like higher police presence in their neighborhoods, the increased likelihood that the police will stop them, and the fact that such stops are less likely to be based on solid evidence. As a result, any algorithm using historical arrest records as a factor in building a predictive model of criminal behavior will generate more Black false positives: it will more frequently erroneously identify Black people than white people as future criminals. Thus, both what data is collected and how it is used to craft predictive models are hugely consequential.
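The false-positive argument can be made concrete with a little arithmetic. The Python sketch below uses entirely hypothetical numbers and a deliberately crude stand-in for a predictive model (flag anyone with an arrest record). It assumes two groups of equal size with the same number of people who will never offend, differing only in how often those people were arrested in the past.

```python
def false_positive_rate(non_offenders, non_offenders_with_arrest_record):
    """Share of people who will NOT offend but are flagged anyway."""
    return non_offenders_with_arrest_record / non_offenders


# Hypothetical counts: 900 future non-offenders in each group. Heavier
# policing in the first group has left more of them with an arrest record.
heavily_policed = false_positive_rate(900, 180)
lightly_policed = false_positive_rate(900, 45)

print(heavily_policed, lightly_policed)  # 0.2 0.05
```

Under these assumptions the model wrongly flags people in the heavily policed group four times as often, even though the underlying behavior of the two groups is identical.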

According to Slobogin, place-based predictive policing, which tends to be used to predict property crimes, “appears to be a relatively useful way of allocating police resources.” Brayne found that the LAPD used hot spot predictions to identify areas where officers should spend uncommitted time (time not responding to calls or booking someone at the station). At the same time, the LAPD determined that to reduce crime in a particular high-crime area by deploying police helicopters there, they had to fly over the hot spot 51 times per week. Even if that is an efficient allocation of resources, it remains open to debate whether the gain in efficiency outweighs the costs imposed by constant policing.

The true headline regarding predictive policing of individuals is that, despite its widespread use, we just don’t know whether it works. According to Slobogin, “the jury is still out on whether hot people policing is effective,” and as far as Brayne could determine, “there is basically no evidence that big data analytics has led to higher clearance rates, fewer stops without arrest, or a reduction in racial disparities in false arrests.” Nevertheless, both authors go on to explore the issues that predictive regimes raise because, even assuming that eventually this technology proves effective, it still raises significant questions regarding its legality, its impact on privacy, and whether any impact on privacy is equitably distributed across society. 

These tools also bear examining for another reason: they are already being deployed. One driving force behind the expanding use of police technology of questionable efficacy is surveillance capitalism. Police departments facing rising crime and faltering community trust are eager to improve outcomes. When tech companies come along and offer tools that promise to do just that, they seem to offer a win-win proposition. (More on that dynamic later.)

Another reason such tools appeal to law enforcement agencies is that they justify another goal: vast data collection. In Brayne’s detailed account of the LAPD’s now-discontinued Operation LASER (Los Angeles’ Strategic Extraction and Restoration Program), predictive policing became more a rationale for collecting data—a method of expanding surveillance—than for using it effectively. A small unit within the LAPD gathered information from disparate sources to generate a list of so-called “chronic offenders,” which it shared with the rest of the department. This classification was based not on outstanding warrants or the fact that these individuals were wanted for crimes, but rather a mix of prior arrest reports, criminal histories, and data from daily patrols. Officers were instructed to “seek out and gather intelligence on the chronic offenders during routine patrol work,” Brayne writes, in order to improve “situational awareness” and “remove the anonymity of our offenders that are out there,” one officer explained. When the list quickly grew so large that it became unmanageable, Operation LASER implemented a point system to “decide who’s the worst of the worst,” as one officer put it—assigning five points for a past violent crime, five points for gang affiliation, and so on. And while presence on the chronic offender list was not sufficient alone to justify a stop, officers regularly stopped people on that list. As Brayne puts it, “every officer said that the stops still have to be constitutional, but you could always find something to stop someone for if you wanted to.” Brayne notes that some chronic offenders were stopped as many as four times in a day.

Operation LASER therefore fed into a phenomenon Brayne calls “data greed”: the idea that once you have a system driven by data, the incentive becomes to collect as much data as possible, regardless of whether there is reason to tie that information to crime. This phenomenon will come as no surprise to anyone familiar with government surveillance practices (or the world of corporate surveillance, for that matter; Brayne notes that at a surveillance industry trade show she attended, data was viewed as a “strategic asset,” and ROI referred not to “Return on Investment” but “Return on Information”). The U.S. intelligence community has described individual pieces of intelligence as akin to tiles in a mosaic, such that the relevance of one data point may be unclear until it is combined with additional information that helps to reveal a larger picture. This theory leads to the logical conclusion that all information is, or at some point might be, relevant to some investigation. On this view, more collection is always better. Data greed is also closely associated with “function creep,” in which data collected for one purpose is used for another. Once data is in the government’s possession, the temptation is to find ways to put it to work.

Again the LAPD provides a striking example: so-called field interview cards. As Brayne describes them, these are index cards that include an individual’s name, address, physical characteristics, vehicle information, gang affiliations, criminal history, and a spot for narrative or additional information, which might include what the individual was doing or who he was with when interviewed. Those additional individuals are thereby entered into the “system,” facilitating the future tracking of more people in more ways, regardless of whether they have any connection to criminal activity. Officers are instructed to fill one out every time they interact with someone. The cards are, in effect, “a means of gathering ongoing pre-warrant intelligence on people’s locations, social networks, and activities” as well as a means of documenting officer activity. As implemented, Operation LASER was not (only) about preventing violent crime. It was (also) a justification for ongoing intelligence collection that sent a clear message to the community being policed: “we know who you are.”

Despite promises to the contrary, predictive policing programs often replicate the problems posed by human-driven policing in part because, far from being mere productivity-boosting tools that enhance fixed policing practices, they actually transform police practices: where police go, who they interact with, and, importantly, what data they collect and retain. Operation LASER, for example, specified that one point should be added to an individual’s score for every police contact. The policy had a direct impact on policing practice: anyone on the chronic offender list was more likely to be stopped by the police, but every time they were stopped, they got another point, making it more likely that they would be considered a chronic offender and thus more likely to be stopped again.
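The loop described here can be sketched as a toy simulation in Python. Only the one-point-per-contact rule comes from the account above; the names, the starting scores, and the assumption that officers always stop whoever currently has the highest score are hypothetical simplifications, not a description of how Operation LASER actually ran.

```python
def simulate_contact_points(initial_scores, weeks, stops_per_week):
    """Each week, stop the highest-scoring people and add one point per stop.

    The "stop the top scorers" rule is a simplifying assumption; the
    one-point-per-contact rule is the only part taken from the text.
    """
    scores = dict(initial_scores)  # copy so the caller's data is untouched
    for _ in range(weeks):
        # Rank by score (highest first); ties broken by name for determinism.
        ranked = sorted(scores, key=lambda name: (-scores[name], name))
        for name in ranked[:stops_per_week]:
            scores[name] += 1  # one point for every police contact
    return scores


# Hypothetical people whose behavior is identical from here on;
# "A" merely starts one point ahead of "B" and "C".
start = {"A": 6, "B": 5, "C": 5}
end = simulate_contact_points(start, weeks=10, stops_per_week=1)
print(end)  # {'A': 16, 'B': 5, 'C': 5}
```

In this sketch a one-point head start becomes an eleven-point gap after ten weeks without any difference in conduct, because the score is measuring enforcement rather than behavior.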

Using data to drive police decision-making can thus exacerbate, rather than mitigate, existing disparities. In doing so, these tools can not only perpetuate stereotypes but also create feedback loops that become self-fulfilling prophecies. As Brayne puts it, in the end “we can’t know the degree to which we are measuring crime or measuring enforcement.” An LAPD internal review of Operation LASER determined that “there was not strong evidence that predictive policing reduced crime rates and that there were significant civil rights concerns with inconsistent enforcement, opacity, and lack of accountability,” Brayne writes. The program was discontinued, but similar efforts remain in use in departments around the country.

Operation LASER drives home a larger message that Brayne draws from her time in LA: data is sociologically constructed. “People situated in preexisting social, organizational, and institutional contexts decide what data to collect and analyze, about whom, and for what purpose,” she writes. Moreover, numerous forms of data involved in this ecosystem are generated and maintained by private-sector actors. From telecommunications companies and banks to online marketplaces and social media sites, the determination of what data is collected, how it is organized, and whether it is retained is made by private companies operating on the profit motive.

The upshot is that big data policing does not eliminate the use of discretion that enables discrimination; on the contrary, it simply relocates the exercise of discretion to a place earlier in the timeline, which is less visible and less accountable than the human exercise of discretion that it displaces. This less visible discretion is exercised in decisions such as which factors to consider, which model or algorithm to employ, and what level of accuracy a model must achieve before being implemented. When the exercise of discretion is hidden in this way, predictive models can be billed as race-neutral, bias-free, and based on quantification, computation, and automation. Determining whether they live up to these descriptions requires much closer examination.

Predictive policing is just one aspect of the changing nature of technology-based law enforcement. The use of technology that Brayne labels “dragnet surveillance” and Slobogin calls “program-driven” investigations raises a related but distinct set of concerns, bringing into sharp relief the urgent need to place data use as well as data collection at the forefront of regulation.

The distinguishing feature of such investigations is that they combine large volumes of data from numerous sources and provide an interface that allows users to access and link data across what previously would have been siloed systems. The data involved might have been collected by the government in the course of its policing activities, such as arrest records, but at least as often, data sources not usually associated with crime-fighting are included as well. Large amounts of data collected by private entities for purposes entirely unrelated to law enforcement are conveyed to government agencies, which dump that data into enormous repositories, allowing the police to increasingly utilize data on individuals with no prior police contact.

Here, too, Brayne’s sociological examination of the Los Angeles law enforcement ecosystem is revealing. She describes the LAPD’s use of a system provided by Palantir Technologies, a publicly traded, for-profit company valued in the tens of billions of dollars whose original seed money came, in part, from the CIA’s venture capital arm, In-Q-Tel. Palantir offers analytic platforms to entities in the defense, law enforcement, intelligence, and private sector communities. Its clients have included the CIA, FBI, Immigration and Customs Enforcement, the Department of Homeland Security (DHS), and JPMorgan Chase, as well as state and local law enforcement agencies like the LAPD and “fusion centers,” multi-agency entities funded by DHS to facilitate information sharing among law enforcement agencies at the federal, state, local, and tribal levels.

As Brayne explains, Palantir Gotham—the platform employed by the LAPD—enables disparate data points to be searched in relation to one another, and a single individual’s records to be cross-referenced across multiple databases. Once these disparate sources of data are gathered under the umbrella of a user interface like Palantir, they can be used not only to query that data based on particular investigative leads (for example, “search for red four-door sedans whose license plate begins FTK, is driven by a white male, and was within a particular ten-block radius at the time of a particular crime”), but also to set up alerts, which will notify officials when, for example, the license plate of a car reported stolen is captured by an ALPR.
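The two modes of use described here, a lead-driven query and a standing alert, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `PlateRead` record, the `query_reads` and `check_alerts` helpers, and the sample data are invented for illustration and do not represent Palantir’s actual data model or interface. The driver description from the example query is omitted because a plate reader would not capture it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlateRead:
    plate: str
    color: str
    body_style: str
    block: int  # hypothetical grid index standing in for a location


def query_reads(reads, plate_prefix, color, body_style, blocks):
    """Lead-driven query: filter stored reads on several attributes at once."""
    return [
        r for r in reads
        if r.plate.startswith(plate_prefix)
        and r.color == color
        and r.body_style == body_style
        and r.block in blocks
    ]


def check_alerts(new_read, watchlist):
    """Standing alert: True when an incoming read matches a watched plate."""
    return new_read.plate in watchlist


reads = [
    PlateRead("FTK123", "red", "sedan", 4),
    PlateRead("FTK999", "blue", "sedan", 4),
    PlateRead("ABC555", "red", "sedan", 5),
    PlateRead("FTK456", "red", "sedan", 42),
]

# Query: red sedans with plates starting FTK, seen within blocks 1 through 10.
hits = query_reads(reads, "FTK", "red", "sedan", blocks=range(1, 11))
print([r.plate for r in hits])  # ['FTK123']

# Alert: a new read of a plate on a (hypothetical) stolen-vehicle list.
stolen = {"ABC555"}
print(check_alerts(PlateRead("ABC555", "red", "sedan", 7), stolen))  # True
```

The query runs only when an investigator has a lead, while the alert runs against every incoming read, which is why the second mode depends on collecting and checking data about everyone who drives past a reader.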

Notably, Brayne found it impossible to provide a comprehensive account of the sources of data that feed into Palantir in the LA area. She identified at least nineteen databases included in the Palantir system at Southern California’s fusion center—the Joint Regional Intelligence Center—including information from the LAPD’s field interview cards, traffic citations, crime bulletins, ALPR readings, the sex offender registry, county jail records, traffic collisions, warrants, and crime stopper tips. Palantir users Brayne spoke to, while unsure exactly what data was included in the databases they were accessing, also indicated that LexisNexis’s public records database, which those users believed to include information like utility bills and credit card information, was available via Palantir.

As for non-crime-related databases, the ones Brayne encountered included data from repossession and collection agencies; social media; foreclosures; electronic toll passes; address and utility use from utility bills; hospital, pay-parking lot, and university camera feeds; and information collected from consumer rebates. Slobogin, too, points to multiple sources of data deployed across the board, regardless of whose information is involved, simply because it might turn out to be useful to a future criminal investigation. And it can, in fact, be useful. Brayne describes an instance in which an analyst used ALPR data to identify the specific intersection near which a “person of interest” parked at night, narrowing the police’s search for his residence to that area. Such techniques can also be used to seek cars associated with outstanding warrants, or track an individual’s travel patterns. Of course, systematically tracking all Americans 24/7 would also be useful for criminal investigations. The question is thus how useful is the information being collected, and how much privacy should we give up so that law enforcement can exploit that utility? If Brayne’s observation in LA that “the most common use of ALPRs is simply to store data for potential use during a future investigation” is true as a general matter, the gain hardly seems worth the sacrifice.

Linking numerous sources of data in this way raises what legal scholar Daniel Solove has called “the aggregation problem”—which might now be called the Carpenter problem. Combining numerous data points, each of which is relatively unrevealing on its own, can reveal incredibly intimate information. Someone’s location in a public place at any one moment in time is unlikely to provide a very detailed picture of her life. Aggregate many instances of her location twenty-four hours a day for a month, however, and you are likely to learn details about her family, her work, whether she is receiving physical or mental health treatment, whether she attends AA meetings, whether and where she worships, what political activities she engages in, who she socializes with, and more. Moreover, any conclusions law enforcement draws in this way assume both that the data is accurate and that the police will draw the correct inferences from it. Neither of those assumptions is something the public is able to question.

Brayne points out that incorporating non-crime-related data into police investigations can also have another adverse effect: it can cause people to avoid interacting with institutions whose use is fundamental to an individual’s full participation in society. When hospitals, banks, schools, and workplaces are included in the surveillance apparatus, people might hesitate to seek out medical, financial, educational, and labor-related resources.

So much for the technology. Slobogin’s and Brayne’s accounts both touch on another factor critical to understanding modern policing: it is big business. The NYPD alone spent $3 billion between 2007 and 2019 on surveillance tech. Technological advancement is what’s driving market growth—in cloud computing, video surveillance systems, analytics, and software.

In other words, these innovations are coming not from within law enforcement agencies themselves but rather from the private sector, usually in “opaque and unaccountable ways,” as Hannah Bloch-Wehba has noted. They are thus rarely developed because a government agency approaches the private sector to say, “here is what we need.” Instead, companies such as Palantir, or others selling ALPRs, facial recognition software, predictive models, or cellphone tracking, come to the government with a sales pitch and say, “look what we can give you (whether you actually need it or not).” As one of LAPD’s civilian employees put it, “our command staff is easily distracted by the latest and greatest shiny object” regardless of whether its implementation is practicable. And because agencies can often procure the funding for these purchases through federal grants from DHS or the Bureau of Justice Assistance, which range from tens of thousands to millions of dollars, or even from private donors, agency leaders need not choose between them and other departmental necessities: they can have it all.

In fact, vendors often work hand in glove with client agencies to advertise technologies to attract additional licensees. Companies give law enforcement agencies prewritten press releases touting the success of their products. They hire individual officers to participate in marketing videos or to provide testimonials evangelizing for their products. Former cops find jobs at technology companies like ShotSpotter—which produces an audio AI tool that it claims can detect gunshots based on sounds recorded through microphones placed throughout cities—and then leverage their existing relationships to sell products. Sometimes, police even get direct economic incentives to promote products. The NYPD, for example, gets a 30 percent cut of every sale of the Domain Awareness System, which it developed alongside Microsoft. Amazon also partnered with police departments to advertise Ring doorbell cameras, offering free Ring products and access to Ring’s “Law Enforcement Neighborhood Portal” in contractual exchange for encouraging residents to purchase the cameras.

This private-sector role has numerous implications. As Elizabeth Joh and Thomas Joo have noted, questions such as what kind of data an algorithm should use to predict a suspect’s dangerousness, whether and how video is stored and accessed, and what should trigger activation of body or dashboard cameras “are typically raised and resolved by technology vendors—not the police.” Not only are police rarely involved in these design decisions, but the decisions themselves are frequently opaque. Indeed, predictive models are often developed by private entities who regard their operations as a trade secret that they will not reveal to the public, the courts, or even the agencies themselves. Some companies have even required law enforcement clients to enter into nondisclosure agreements as part of a licensing deal, effectively cloaking the fact that these tools even exist. These practices obscure not only information about how the tool functions, but whether and how its efficacy has been demonstrated.

This is only the beginning. Another knock-on effect of relying on proprietary products is path dependence. Once an agency licenses a product from a particular company, it can be locked into that specific technology: switching to another product, even a more effective one, carries high transaction costs.

There are also private-sector firms whose entire business model involves collecting information for the purpose of selling it. Data brokering (aggregating information from public and private sources, processing it to enrich, clean, or analyze it, and then licensing it for a fee to other entities) has been estimated to be a $200 billion per year industry. Data aggregators also cite trade secrecy as a reason not to divulge where they get their data and what data they possess. On a smaller scale, many companies sell customer data, such as data collected by loyalty programs, as a second stream of income. This industry is currently subject to very little regulation, and nothing bars law enforcement from making use of these resources.

Reliance on the private sector can also further erode the scope of Fourth Amendment protections. Some government agencies have expressed the opinion that the Fourth Amendment can never apply to the acquisition of data from data brokers. Lawyers at both DHS and the IRS, for example, have taken the position that, while Carpenter requires the government to get a warrant from a judge based on probable cause to collect historical cell site information from cellphone providers, there is no restriction on acquiring that same information on the open market. As Slobogin puts it, “If the government can obtain personal information simply by announcing it is willing to pay for any data or images about wrongdoing that nongovernmental actors can access,” that would further erode any protections that currently exist.

What is to be done? Slobogin and Brayne each recognize the potential advantages of police use of digital information as well as the fundamental challenges it poses to democratic values. For both authors, the important thing is to strike the right balance, and both join legal scholars like Barry Friedman and David Sklansky in assigning that task to the democratic process. They argue that, prior to their deployment, these tools should be approved by legislatures and subjected to detailed regulatory regimes that specify the harm to be addressed; the permissible uses of the tool; the data that may be used, how long it may be retained, and who may access it under what circumstances; how to ensure data accuracy and security; and what compliance mechanisms will ensure that these rules are followed. Moreover, there must be some means of ensuring that these regimes minimize the kind of government discretion that can lead to discrimination.

There are good reasons to prefer legislation over today’s largely judge-made law. Legislatures are much better suited to line-drawing than the courts. Consider the issue the Supreme Court addressed in Carpenter: whether the police may acquire a week’s worth of cellphone location information without a warrant. The Court said no. But what about three days’ worth? Or one day’s? Or one hour’s? Legislatures make those kinds of distinctions all the time. Legislatures can also establish compliance mechanisms beyond the courts’ authority, such as requiring regular audits of database access or civil remedies for violations. (Brayne points out that the LAPD’s Palantir system can perform audits, but nobody she spoke with could identify an instance in which such an audit had been performed.) More democratic and transparent legislation of surveillance techniques might also go some way to bolstering their legitimacy, which can be particularly valuable in historically overpoliced communities.

Slobogin offers a unifying theory on which to base the body of regulation he proposes. He develops a “proportionality principle” according to which we should permit the least intrusive investigative techniques while sharply curtailing the use of more intrusive methods. On this score, he rejects what he calls the “probable cause forever” view advocated by many privacy law scholars—the view that all government collection of information should be governed by the probable cause standard. That standard is appropriate, Slobogin agrees, in some instances, such as in traditional, non-virtual searches, for arrests, and to compel DNA samples from individuals who belong to families that partially match DNA collected from a crime scene. In other instances, such as investigative activity that infringes on First Amendment rights (including political speech or the right to assembly), an even more stringent standard might be appropriate. But when it comes to less intrusive investigative methods, he argues, the justification the police must offer should be proportionate to the intrusiveness of the investigative tactic at issue. This opens the door to accepting, at times, a much more permissive standard.

Of course, intrusiveness is in the eye of the beholder. A longstanding critique of the Katz standard—according to which protections exist only where an individual has a “reasonable expectation of privacy”—is that it is too subjective. Slobogin recognizes this challenge and argues that intrusiveness assessments should be based on existing legal principles—police action that constitutes a trespass on private property, for example, should be considered more intrusive than police action that does not—and public survey data. Slobogin and others have done significant survey work demonstrating that public sentiments about the intrusiveness of particular police actions do not line up with existing legal protections.

Each of these sources is flawed, however. Indeed, Slobogin advocates surveys because he recognizes that existing laws provide insufficient protection for some sectors of society. For one thing, they skew by wealth. If Fourth Amendment protections were tied to property interests, for example (as some scholars and some Supreme Court justices have suggested), people living in a tent situated on public property, or traveling by foot, would enjoy less privacy protection than people living in a mansion surrounded by acres, or traveling by car. Moreover, existing laws do not account for the fact that we might be willing to cede certain information to the private sector that we want to protect from disclosure to the government. But survey data is no panacea. There is no such thing as a uniform public view on any given tactic. Indeed, even among those most likely to bear the brunt of intrusive police action—communities of color—views on what should be deemed acceptable are mixed. Moreover, public attitudes are shaped by facts on the ground. The existing, minimally regulated use of these technologies might be shaping views on their propriety that would have been much different had they been subjected to extensive democratic scrutiny before being deployed in the first place.

The exhortation to legislate and regulate might also prove inadequate in other ways. Data greed means that any system reliant on data will maximize data collection. Once data collection and aggregation are permitted, history tells us that the government will use the data, regardless of what the rules say. If the data haystack is there, law enforcement will go looking for needles.

Even the program Slobogin holds out as a model example illustrates the problem. He describes a pilot program known as the Aerial Investigation Research (AIR) program, authorized in 2020 by the Baltimore City Council, in which a private company mounted cameras on planes flying high above the city during the day and recorded the footage they captured. If a crime occurred, that footage could be employed to follow cars and individuals who were in the area at the time of the crime backward and forward in time to see where they came from and where they went. Slobogin views the program as relatively noninvasive because it merely identified locations for a short period of time and because, while the data could be stored for forty-five days, it could be used only to investigate violent crimes.

A civil rights and civil liberties audit of AIR performed by the NYU Law School’s Policing Project paints a much more complicated picture, as does independent reporting. First, a similar program was implemented in secret in 2016; only after its revelation sparked public backlash did the Baltimore Police Department adopt the more transparent policy in AIR. Second, and more important, AIR did not conform to the announced limits. It was billed as a way to track people going to and from violent crime scenes, but the Baltimore Police Department also issued “supplemental requests” for information that was not tied to crime scenes, and it stored much more information for a much longer period of time than its public statements seemed to indicate.

Moreover, even the most comprehensive, well-intentioned regulatory regimes will have to account for the risk of bias. Slobogin goes so far as to suggest using race-specific algorithms, or tailoring predictive regimes to the precinct level, to avoid building disparities into the system. But to the extent that big data is possible because of the “big” part (the larger the data set, the more likely the algorithm trained on it will be accurate), slicing and dicing data at that level of granularity likely undermines any benefit of reliance on data in the first place.

The other significant recommendation both authors make is to increase the transparency surrounding these tools. This extends, of course, to transparency regarding the technologies in use and the rules that govern them. Anyone subject to police action based on data-driven tools should be informed that this is the case and permitted to challenge the reliability of the tool. Moreover, everyone should have the opportunity to remediate errors in the data on which law enforcement relies. These types of protections already exist in some consumer contexts, including the use of credit scores, and should be extended to the law enforcement context.

Slobogin also argues that an ambitious set of additional transparency requirements should apply to predictive tools. The government should disclose the factors being employed in any predictive decision-making model as well as the weights those factors are assigned. Moreover, the programs’ efficacy should be verified by independent entities, not just the government or the private-sector developer. This call for transparency in predictive policing echoes similar arguments that have been made for years by both scholars and activists.

But in practice, transparency will be difficult to achieve. Private developers’ reliance on trade secrecy has created a formidable obstacle to meaningful transparency. And to the extent that the tools are using algorithms developed through methods of artificial intelligence, such as machine learning, even complete transparency of the algorithm itself would be insufficient. The benefit of machine learning is that it can identify relationships among data that humans cannot. While machine-learning systems might one day be able to explain themselves to humans in a way that would facilitate better verification and regulation, the current technology is often inscrutable. Perhaps most important, discussion of transparency requirements for predictive models seems to be putting the regulatory cart before the efficacy horse. If these mechanisms cannot be shown to improve policing effectiveness, that is where the discussion of them should end. No level of transparency can justify deploying tools that do not operate as advertised.

Brayne argues that these technological tools can also be employed to achieve a different type of transparency, by turning the surveillance eye around and using it to police the police. Data regarding police activity is notoriously spotty. She points to dashboard and body cameras as well as GPS locators on squad cars as a means of collecting valuable information that could be mined, just as we mine the data on civilians with whom the police interact. Effective data tracking of law enforcement activity can identify bias in policing practices, as it did in the challenge to the NYPD’s stop-and-frisk program. It can also help assess the effectiveness of resource allocation and identify rogue police officers. Perhaps unsurprisingly, officers on the beat in LA failed to embrace these techniques. Their objections were based less on privacy concerns than on the idea that these tools might be used to evaluate officer performance.

In the end, Slobogin and Brayne argue, what we make of big data policing is up to us. We must employ the democratic process to maximize the benefits while minimizing the risks and safeguarding fundamental values. But there is a long way to go from our current situation, which more closely resembles the Wild West than it does the detailed regulatory regime they call for.

It’s not entirely clear how these authors expect to get there from here. All of the incentives facing legislative bodies push for extending more power, authority, and discretion to the police, not less. Law enforcement agencies currently facing major pressure to bring down crime rates will surely resist limitations to their toolkit, and they will be joined in that effort by private-sector companies seeking to maximize lucrative licensing opportunities. Everyday Americans, and especially communities of color, which most frequently bear the greatest cost from overzealous law enforcement, will struggle to make their voices heard over those of the technologies’ advocates.

As with most problems, there is no silver bullet. Progress will require a vigilant public, a responsive government, and a willingness to push back against the unfettered growth of a profitable industry in a time of heightened concern about crime rates and public safety. But anyone seeking to raise public consciousness on these issues would do well to read these clear-eyed books.
