What We Know About the Computer Formulas Making Decisions in Your Life
By Lauren Kirchner / ProPublica
This piece originally ran on ProPublica.
We reported Thursday on a study that investigated Uber’s surge pricing patterns in Manhattan and San Francisco and showed riders how they could potentially avoid higher prices. The study’s authors shed some light on Uber’s “black box,” the algorithm that automatically sets prices but is inaccessible to both drivers and riders.
That’s just one of a nearly endless number of algorithms we use every day. The formulas influence far more than your Google search results or Facebook newsfeed. Sophisticated algorithms are now being used to make decisions in everything from criminal justice to education.
But when big data uses bad data, discrimination can result. Federal Trade Commission chairwoman Edith Ramirez recently called for “algorithmic transparency,” since algorithms can contain “embedded assumptions that lead to adverse impacts that reinforce inequality.”
Here are a few good stories that have contributed to our understanding of this relatively new field.
Know any algorithms we should investigate? Send us your tips at [email protected].
Wall Street Journal, December 24, 2012
The Journal staff (including Julia Angwin, now a reporter at ProPublica) showed that Staples was giving online customers different prices for the same products depending on how close those customers were to competitors’ stores. Offering different prices to different customers is not illegal, the article points out. “But using geography as a pricing tool can also reinforce patterns that e-commerce had promised to erase: prices that are higher in areas with less competition, including rural or poor areas. It diminishes the Internet’s role as an equalizer.”
Chicago Tribune, August 21, 2013
Chicago’s police department is at the forefront of “predictive policing”—the idea that police can prevent crimes using a combination of mathematical analysis and careful interventions. Chicago’s “heat list” analyzes residents’ social networks and criminal records to identify people who are most at risk of either perpetrating or falling victim to future violence. (A TechCrunch piece this year discussed some of the thorny problems of bias that this raises.)
The New York Times, July 9, 2015
A recent Carnegie Mellon study found that Google was showing ads for high-paying jobs to more men than women. Another study from Harvard showed that Google searches for “black-sounding” names yielded suggestions for arrest-record sites more often than other types of names. Algorithms are often described as “neutral” and “mathematical,” but as these experiments suggest, they can also reproduce and even reinforce bias.
San Francisco Chronicle, July 22, 2015
The Internet erupted in anger after images of African Americans on Google Photos and Flickr were automatically tagged as “gorillas.” The Chronicle found two underlying issues: the data that programmers use to “teach” algorithmic software matters, and so does the diversity of the Silicon Valley companies that do the teaching. “Not enough photos of African Americans were fed into the program that it could recognize a black person. And there probably weren’t enough black people involved in testing the program to flag the issue before it was released.”
The Marshall Project and FiveThirtyEight, August 4, 2015
“Risk assessment” scores are being used at different stages of the criminal justice system to help evaluate whether defendants and inmates will commit crimes in the future. The formulas include things like a person’s age, employment history, and even the criminal records of family members. But is it fair to score people based not only on their own past criminal behavior, but also on statistics about other people who fit the same profile? And should these scores be used to help determine their sentences?
The New York Times, September 26, 2015
Volkswagen recently admitted to rigging the software in millions of its diesel cars to cheat on emissions tests. The Times points out that some new cars now contain computer software that’s more complex than the Large Hadron Collider. Along with increased convenience and safety, the endless lines of code also make it hard for regulators to keep up.
Los Angeles Times, October 6, 2015
Researchers analyzed hundreds of thousands of military records to create an algorithm that they say the U.S. Army can use to find the soldiers who are at the greatest risk of committing violent crimes. “For men, who accounted for the vast majority of both soldiers and offenders, 24 factors were found to be at play. Those most at risk were young, poor, ethnic minorities with low ranks, disciplinary trouble, a suicide attempt and a recent demotion.”
The Boston Globe, October 7, 2015
Massachusetts’ child welfare system is considering adopting “predictive analytics” software to help caseworkers identify the children and families who are at the greatest risk of abuse. Higher “risk scores” are assigned to people with more extensive criminal records, previous drug addictions, previous mental health problems, and other factors. Critics of the plan, like the ACLU’s Kade Crockford, argue that this technology risks “disproportionately ensnaring the poor and parents of color.”
FiveThirtyEight, October 15, 2015
In a notable example of reporters holding algorithms accountable, a FiveThirtyEight analysis found that Fandango was skewing movie ratings upward. The site, which sells movie tickets, “uses a five-star rating system in which almost no movie gets fewer than three stars.” Confronted with these results, Fandango said that this was due to an error in its “rounding algorithm,” and promised to fix it.
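The summary above does not say what the rounding rule actually was, so the following is only a hypothetical sketch of how a rounding choice could push displayed ratings upward. It assumes ratings are shown in half-star steps and compares rounding to the nearest half star with always rounding up to the next half star. The function names and the sample ratings are invented for illustration and are not Fandango's code or data.

```python
import math


def round_to_nearest_half(rating: float) -> float:
    """Round a raw rating to the nearest half star (ties go up)."""
    return math.floor(rating * 2 + 0.5) / 2


def round_up_to_half(rating: float) -> float:
    """Always round a raw rating UP to the next half star (assumed faulty rule)."""
    return math.ceil(rating * 2) / 2


# Made-up raw averages, chosen only to show the effect.
raw_ratings = [3.1, 3.6, 4.1, 4.6]

nearest = [round_to_nearest_half(r) for r in raw_ratings]
rounded_up = [round_up_to_half(r) for r in raw_ratings]

for raw, near, up in zip(raw_ratings, nearest, rounded_up):
    print(f"raw {raw:.1f} -> nearest {near:.1f}, always-up {up:.1f}")
```

In this toy data, every rating gains half a star under the always-up rule, which is the kind of systematic upward skew that an outside comparison of raw and displayed scores could reveal.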
ProPublica is a Pulitzer Prize-winning investigative newsroom.