A trio of Princeton social scientists recently conducted a mass experiment with 160 research teams to see if any of them could predict how children’s lives would turn out. The participants were given fifteen years of data and were allowed to use any technique they wanted, from good old-fashioned statistical analysis to modern-day artificial intelligence. Nobody even came close.
That’s because artificial intelligence – much like psychics and headless chickens – cannot predict the future. Sure, it can predict trends and in some cases provide valuable insights that can help industries make the best decisions, but determining whether or not a child will become successful requires a level of prescience that brute-force mathematics can’t provide.
According to the Princeton team’s research paper:
We investigated this question with a scientific mass collaboration using the common task method; 160 teams built predictive models for six life outcomes using data from the Fragile Families and Child Wellbeing Study, a high-quality birth cohort study. Despite using a rich dataset and applying machine-learning methods optimized for prediction, the best predictions were not very accurate and were only slightly better than those from a simple benchmark model.
Investigative journalism outlet ProPublica uncovered the sinister truth about predictive AI in a 2016 exposé on the US court system. Through a series of investigative reports, it empirically demonstrated that outright racial bias in machine learning systems used by US courts was responsible for black men receiving harsher sentences than white men, with no way of demonstrating or explaining why.
These systems usually exist in a “black box,” meaning neither the original developers nor the end-users can determine why a machine ends up at a specific conclusion. The AI can tell us what it “predicts,” but it cannot explain why. When we’re dealing with, for example, sales predictions, these insights are useful. When we’re dealing with human lives and freedom, or trying to figure out whether a child will be a success, they’re basically just guesses – and statistically speaking, not very good ones.
For experts who study the use of AI in society, the results are not all that surprising. Even the most accurate risk assessment algorithms in the criminal justice system, for example, max out at 60% or 70%, says Xiang. “Maybe in the abstract that sounds somewhat good,” she adds, but reoffending rates can be lower than 40% anyway. That means predicting no reoffenses will already get you an accuracy rate of more than 60%.
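To make that arithmetic concrete, here is a minimal Python sketch of the "always predict no reoffense" baseline. The function name and the base rates in the loop are hypothetical, chosen only for illustration; they are not figures from any real risk assessment tool.

```python
def majority_baseline_accuracy(reoffense_rate: float) -> float:
    """Accuracy of a 'model' that always predicts the more common outcome.

    reoffense_rate is the share of people who reoffend (between 0 and 1).
    """
    if not 0.0 <= reoffense_rate <= 1.0:
        raise ValueError("reoffense_rate must be between 0 and 1")
    # If fewer than half reoffend, always guessing "no reoffense" is right
    # for everyone who does not reoffend, i.e. 1 - reoffense_rate of cases.
    return max(reoffense_rate, 1.0 - reoffense_rate)


# Hypothetical base rates, used only to illustrate the arithmetic.
for rate in (0.40, 0.35, 0.30):
    baseline = majority_baseline_accuracy(rate)
    print(f"reoffense rate {rate:.0%}: do-nothing baseline accuracy {baseline:.0%}")
```

Seen this way, an algorithm that tops out at 60% or 70% accuracy may be adding little or nothing over a guess that ignores the individual entirely.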
In the end, despite giving the research teams a trove of data gathered over the 15-year “Fragile Families” study on the lives of matriculating children, nobody’s system produced an accurate prediction. Per the Princeton team’s aforementioned research paper:
In other words, even though the Fragile Families data included thousands of variables collected to help scientists understand the lives of these families, participants were not able to make accurate predictions for the holdout cases. Further, the best submissions, which often used complex machine-learning methods and had access to thousands of predictor variables, were only somewhat better than the results from a simple benchmark model that used linear regression.
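One way to picture what "accurate on the holdout cases" could mean is a score that compares a model's squared error with the error from simply guessing the training-set average for everyone. The Python sketch below is an assumption about how such a score might be computed, not a reproduction of the study's code; the function name and the toy numbers are made up for illustration.

```python
def holdout_r_squared(y_true, y_pred, training_mean):
    """Score predictions on held-out cases against a 'guess the average' baseline.

    Returns 1.0 for perfect predictions, 0.0 for predictions no better than
    always guessing training_mean, and a negative value for anything worse.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Squared error of the model being evaluated.
    sse_model = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the training-set average.
    sse_baseline = sum((t - training_mean) ** 2 for t in y_true)
    if sse_baseline == 0:
        raise ValueError("baseline error is zero; the score is undefined")
    return 1.0 - sse_model / sse_baseline


# Toy holdout outcomes (illustrative values, not study data).
outcomes = [1.0, 2.0, 3.0, 4.0]
print(holdout_r_squared(outcomes, [2.5, 2.5, 2.5, 2.5], training_mean=2.5))
print(holdout_r_squared(outcomes, [1.0, 2.0, 3.0, 4.0], training_mean=2.5))
```

On a scale like this, a complex model that lands only a little above a plain linear regression has learned very little that the simpler approach did not already capture.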
This is further confirmation that predictive AI – whether it’s Palantir’s intentionally misleading predictive-policing technology or the demonstrably racist algorithms that power the US judicial system’s sentencing software – is hogwash when it directly affects human lives.