June 30, 2019
Here’s how Pittsburgh cops are working to predict the future
Carnegie Mellon researchers provide a rare glimpse at efforts to foresee crimes before they happen
By Zach Goldstein // Photos via Associated Press
Police departments in cities across the United States use technology to try to predict and prevent crimes. In postindustrial cities, the allure of such technology is palpable: With declining populations and the tight budgets that often follow, police leaders look for any opportunity to do more with less, confronting as many criminal incidents as possible with existing staff. Predictive policing, the science of figuring out where crimes are going to happen, offers that promise of increased efficiency. Since at least February 2017, the Pittsburgh Bureau of Police has experimented with a version of a predictive policing program.
These programs tend to be black boxes. Trying to predict crimes is an investigative effort, so police departments tend to shield details about which crimes are analyzed and how. The Pittsburgh Bureau of Police is no different; it declined multiple requests for an interview for this article. But there is a lot we already know about how predictive policing works in Pittsburgh.
A paper released last year by Carnegie Mellon researchers showed that the Bureau of Police compiled a database of over 206,000 crime incidents — an average of about 112 new incidents per day — from 2011 to 2016. The database also contained roughly a million 911 calls. The effort to gather data continues within the Bureau of Police, painting a detailed, constantly updated picture of crime in Pittsburgh: where it happens, when, and what kind.
With this data, crime analysts at the Bureau of Police, as well as researchers who were given access to the data, can use machine learning methods ranging from simple to cutting-edge to predict where and when certain types of crime are likely to occur. It is not known exactly what data is fed into the predictive algorithms, but it includes three broad sources: past crimes of the same kind as the target crime being predicted, other kinds of past crimes which are found to be predictive of the target crime, and 911 calls about crimes.
Police can use these predictions to make more informed choices about where to send officers to do what is known as “proactive policing,” stopping would-be criminals from breaking the law by sending a police officer to an area and hoping the heightened threat of arrest will deter people from committing crimes.
Wil Gorr, a retired professor at Carnegie Mellon University who worked on predictive policing with the City of Pittsburgh, said in an interview that proactive policing can take various forms, such as a car patrol or a “park and walk,” in which an officer polices an area on foot.
In some cities, predictive policing identifies specific individuals who are thought to be likely offenders. Pittsburgh has been using a “hot-spot-based” approach instead. Pittsburgh’s hot spots are 500 sq. ft. areas where crime is likely to occur, and they can be either “chronic,” meaning crime is likely to happen there at all times, or “temporary,” meaning they have occasional spikes of crime activity.
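To make the chronic versus temporary distinction concrete, here is a minimal Python sketch. It is not the researchers' algorithm, whose code has not been made public. The function name `classify_cell`, the weekly counts, and both thresholds are hypothetical and chosen only for illustration.

```python
from statistics import mean


def classify_cell(weekly_counts, chronic_avg=2.0, spike_ratio=3.0, recent_weeks=4):
    """Label one map cell from its weekly crime counts (oldest first).

    Hypothetical rule, for illustration only:
    - "chronic": the long-run weekly average is at or above chronic_avg
    - "temporary": the recent average is at least spike_ratio times the
      earlier baseline
    - "neither": everything else
    """
    if len(weekly_counts) <= recent_weeks:
        raise ValueError("need more history than the recent window")

    baseline = mean(weekly_counts[:-recent_weeks])  # earlier weeks
    recent = mean(weekly_counts[-recent_weeks:])    # most recent weeks

    if mean(weekly_counts) >= chronic_avg:
        return "chronic"
    # Floor the baseline so a cell with no earlier crime can still register a spike.
    if recent >= spike_ratio * max(baseline, 0.25):
        return "temporary"
    return "neither"


# Made-up example cells, not Pittsburgh data
steady_cell = [3] * 12
spiking_cell = [0] * 8 + [1, 1, 2, 1]
quiet_cell = [0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0]

for cell in (steady_cell, spiking_cell, quiet_cell):
    print(classify_cell(cell))
```

Under these assumed thresholds, a cell with steady activity is labeled chronic, while a normally quiet cell with a recent burst is labeled temporary. A real system would have to choose its thresholds and time windows from the data.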
A new approach
The Pittsburgh Bureau of Police has long used data analytics to monitor where crime is occurring the most and use that information to guide decision-making. But in February 2017, the Pittsburgh Police started to take predictive policing to the next level with the launch of an experiment, conducted in collaboration with Carnegie Mellon researchers Dylan Fitzpatrick, Daniel Neill, and Wil Gorr.
The experiment, which was one of the projects funded by a $600,000 grant from the Richard King Mellon Foundation, started with the researchers identifying chronic and temporary hot spots throughout the city. Once these were identified, some hot spots were selected to receive a “treatment” of proactive policing. Researchers then compared what happened to crime in the treatment and control hot spots to determine whether the interventions had the desired effect of reducing crime.
While the police are still running the experiment, the Carnegie Mellon researchers behind it revealed some of their methods and preliminary results in an October 2018 research paper they posted online.
The paper describes the program as a success. Serious violent crimes were 17% lower and serious property crimes were 11% lower in the hot spots where police officers were sent to patrol than in the hot spots selected as the “control group.”
Some might worry that criminals will just commit crimes elsewhere to avoid the police. The researchers say this didn’t happen, as they found that only 0.3% more violent crimes occurred in the areas next to treatment hot spots compared with the areas next to control hot spots. For property crimes, there were actually 4% fewer crimes in the areas next to treatment hot spots compared to those next to control hot spots, suggesting a beneficial spillover effect of the experiment to an area slightly larger than the targeted hot spot.
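For readers who want to see how a figure like “17% lower” is read, here is a minimal Python sketch of a percent difference between a treatment group and a control group. The function name and the counts are hypothetical. They are not the paper's data, and the paper's own estimates may come from a more elaborate statistical model.

```python
def percent_difference(treatment_count, control_count):
    """Percent by which treatment differs from control.

    Negative values mean fewer crimes in the treatment group.
    """
    if control_count <= 0:
        raise ValueError("control_count must be positive")
    return 100.0 * (treatment_count - control_count) / control_count


# Hypothetical counts chosen only to illustrate the arithmetic
print(round(percent_difference(83, 100), 1))     # hot spots themselves
print(round(percent_difference(1003, 1000), 1))  # areas next to hot spots
```

With these made-up counts, the first line shows a drop of 17 percent inside the treated hot spots and the second shows a rise of 0.3 percent next door, which is the kind of comparison the researchers used to argue that crime was not simply displaced.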
But even if it might be effective in reducing crime, critics of predictive policing still have concerns about Pittsburgh’s program and similar programs in other cities.
Major concerns — and unanswered questions
In his 2017 book, The Rise of Big Data Policing, University of the District of Columbia law professor Andrew Ferguson outlined three big concerns that predictive policing programs raise: transparency, racial bias, and constitutional rights.
Regarding transparency, the research paper describing the experiment includes some information about how the program works, such as an overview of the kind of data used and the statistical methods employed, but a lot of the details remain hidden. Gorr says he is concerned that criminals may be able to get around the program if too many details are made public and they can figure out where and when police are going to be patrolling.
The datasets of crimes and 911 calls the researchers used are not publicly available, and the code the researchers used to build the predictive algorithm is secret as well. Members of Tech4Society, a Carnegie Mellon student activist group organizing against predictive policing, say these details matter and ought to be made public because the data and code are needed to figure out whether the program has negative unintended consequences. “You’d need access to the actual technical internals of the model to be able to do any proper auditing,” said Tech4Society member Priya Donti.
When it comes to racial bias, critics of predictive policing allege that predictions are often biased against racial minorities. They say using past crimes to predict future ones means any discriminatory behavior by law enforcement in the past will translate into discriminatory predictions. If police tended to disproportionately arrest people in minority neighborhoods in the past, then those neighborhoods will show up as being high-crime areas, and the algorithms will tell the police to return there again. Moreover, the algorithms provide a sense of legitimacy and objectivity to predictions which could intensify racial disparities in policing if the algorithms are based on flawed or racially biased data.
Wil Gorr says his team took steps to make the algorithm more equitable by avoiding some of the mistakes that predictive policing programs in other cities have made.
For example, they don’t use drug offense data. They try to predict violent crimes like homicide, rape, robbery, and aggravated assault, rather than more minor ones. (They do, however, use data on less serious crimes like simple assaults as inputs to the model.)
The program predicts locations, not individuals, unlike a recently cancelled program in Los Angeles. The program identifies not only chronic hot spots, but also temporary hot spots, which are more spread out throughout the city. Because of that, the program is able to identify high-crime areas in a wide variety of neighborhoods. And Gorr pointed out that the algorithm doesn’t consider race or any socio-economic factors explicitly.
Despite all these precautions, Tech4Society’s Josh Williams argues that there still may be racial bias in the algorithm.
“Speaking from the perspective of someone who’s been involved with the black community… any idea of predictive policing we are immediately uncomfortable with because in the past, officers’ racial biases have had negative consequences for the community as a whole,” Williams said.
An Allegheny County predictive analytics program about child welfare risk underwent an evaluation to study its “accuracy, fairness, and trustworthiness” in 2018. Tech4Society members want a similar assessment to study the crime hot spot program. Both programs involved partnerships between Carnegie Mellon’s Metro21: Smart Cities Institute and local government, but one included a comprehensive audit to measure fairness and the other did not.
On constitutional rights, Andrew Ferguson says that predictive policing, when done wrong, has the potential to exacerbate police use of force and unconstitutional arrests. One reason is that it can lead to more police officers being sent to minority neighborhoods. Additionally, it might make police officers more likely to make an arrest or use force. An officer who knows that an area is a crime hot spot might be more nervous, more likely to overreact, and more likely to decide that there is “reasonable suspicion” or “probable cause,” the minimum standards needed for a police officer to stop someone on the street for questioning or make an arrest, respectively.
The research paper describing the program doesn’t include an analysis of whether the experiment affected police use of force, arrests, or any outcome besides crimes committed. “The costs and the benefits weren’t viewed on equal terms,” Priya Donti pointed out.
The paper cites a Los Angeles study which found no difference in racial bias in arrests as a result of a predictive policing program, but Tech4Society students argue that this is not applicable to Pittsburgh. Wil Gorr, one of the authors of the Pittsburgh paper, said he doesn’t think the Los Angeles paper is relevant to the experiment in Pittsburgh. The Los Angeles program focused on predicting burglaries, while the Pittsburgh program focused on a broader group of violent crimes. “I don’t know why Dylan cited it,” Gorr said.
For now, the Pittsburgh Police will continue with their experiment, and all the known and unknown costs and benefits of the program will continue as well.
On May 21st, the Pittsburgh Police wrote in an email to Postindustrial that there would be a “full press conference which will occur in early June” about the predictive policing program. The press conference hasn’t yet occurred. Andrew Ferguson says that taking a secretive approach can lead to a backlash and even the cancellation of a predictive policing program, but if done right, it can be a positive technology. “Everyone wants violent crime to go down. And so if you think this is a data-driven strategy to help do that, demonstrate it, show it, prove it, and bring the community in.”
Zach Goldstein is a data journalist and a recent graduate of Carnegie Mellon University in Pittsburgh, PA. He is currently an intern at the Center for Public Integrity in Washington, D.C. and was previously an intern for 90.5 WESA, Pittsburgh's NPR affiliate. You can reach him at firstname.lastname@example.org and follow him on Twitter @zpgoldstein.