
There are two realities when it comes to artificial intelligence. In one, the future's so bright you need to put on welding goggles just to glance at it. AI is a backbone technology that's just as necessary for global human operations as electricity and the internet. But in the other reality, winter is coming.
An "AI winter" is a period in which nothing can grow. That means nobody's hiring, nobody's acquiring, and nobody's funding. But this impending barren season is special: it won't affect the entire industry.
In fact, most of the experts won't even notice it. Google, OpenAI, DeepMind, Nvidia, Meta, IBM, and any university doing legitimate research have nothing to worry about. Startups with a clear, useful purpose will be fine, typical market issues notwithstanding.
The only people who need to be concerned about the coming chill are those trying to do what we're going to refer to as "black box alchemy."
Black box alchemy
I shudder to call any AI endeavor "alchemy," because at least the idea of turning one metal into another has some scientific merit.
I'm talking about the wildly popular research vein wherein researchers build crappy little prediction models and then invent fake problems the AI can "solve" better than humans.
When you write it all out in one sentence, it sounds like it should be obvious that it's a grift. But I'm here to tell you that black box alchemy represents a huge portion of academic research right now, and that's a bad thing.
Black box alchemy is what happens when AI researchers take something an AI is good at, such as returning relevant results when you search for something on Google, and try to use the same principles to do something that's impossible. Since the AI can't explain why it comes to the results it does (the work happens in a black box we can't see inside), the researchers get to pretend they're doing science without having to show any work.
It's a scam that plays out in myriad forms, ranging from predictive policing and recidivism algorithms all the way to bullshit pop-science facial recognition systems alleged to detect everything from a person's politics to whether they're likely to become a terrorist.
The part that cannot be stressed enough is that this particular scam is being perpetuated throughout academia. It doesn't matter whether you're planning to attend a community college or Stanford: black box alchemy is everywhere.
Here's how the scam works: researchers come up with a scheme that allows them to develop an AI model that is "more accurate" at a given task than humans are.
This is, quite literally, the hardest part. You can't pick a simple task, such as looking at images and deciding whether there's a cat or a dog in them. Humans will wreck the AI at this task 100 out of 100 times. We're really good at telling cats from dogs.
And you can't pick a task that's too complicated. For example, there's no sense in training a prediction model to determine which 1930s patents would be most relevant to modern thermodynamics applications. The number of humans who could win at that game is too small to matter.
You have to pick a task that the average person thinks can be observed, measured, and reported on via the scientific method, but that actually can't be.
Once you've done that, the rest is easy.
Gaydar
My favorite example of black box alchemy is the Stanford gaydar paper. It's a masterpiece of bullshit AI.
Researchers trained a rudimentary computer vision system on a database of human faces. The faces were labeled with self-reported tags indicating whether the individual pictured identified as gay or straight.
Over time, they were able to reach "superhuman" levels of accuracy. According to the researchers, the AI was better than humans at telling which faces were gay, and nobody knows why.
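To make the mechanics concrete, here's a minimal sketch in Python of how a classifier can post "superhuman" accuracy on self-reported labels without detecting anything real. Everything in it is an illustrative assumption (synthetic features, a hypothetical 80% confound, an off-the-shelf scikit-learn model); it is not the Stanford team's actual setup.

```python
# A synthetic toy, not the Stanford pipeline. The feature counts, the 80%
# confound rate, and the model choice are all illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5_000

# "Self-reported" labels: 0 or 1.
labels = rng.integers(0, 2, size=n)

# Twenty features of pure noise, stand-ins for anything intrinsic to a face.
intrinsic = rng.normal(size=(n, 20))

# One confound (think photo styling, or how the dataset was sourced) that
# happens to match the label 80% of the time.
confound = np.where(rng.random(n) < 0.8, labels, 1 - labels)
X = np.column_stack([intrinsic, confound])

X_train, X_test, y_train, y_test = train_test_split(X, labels, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# "Superhuman" accuracy, but only against the self-reported tag, and only
# because the model reads the confound. It has detected dataset sourcing,
# not sexuality.
print(f"accuracy vs. self-reported labels: {model.score(X_test, y_test):.2f}")
```

The score prints at roughly 0.80, exactly the confound rate: the model is graded on reproducing the labeling process, confounds and all, not on measuring a trait.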
Here's the truth: no human can tell if another human is gay. We can guess. Sometimes we might guess right, other times we might guess wrong. This isn't science.
Science requires observation and measurement. If there's nothing to observe or measure, we cannot do science.
Gayness is not a ground truth. There's no scientific measurement for gayness.
Here's what I mean: are you gay if you experience same-sex attraction, or only if you act on it? Can you be a gay virgin? Can you have a queer experience and remain straight? How many gay thoughts does it take to qualify you as gay, and who gets to decide that?
The simple reality is that human sexuality isn't a point you can plot on a chart. Nobody can determine whether someone else is gay. Humans have the right to stay in closets, deny their own experiential sexuality, and decide how much "gayness" or "straightness" they need in their own lives to determine their own labels.
There is no scientific test for being gay. And that means the Stanford team can't train an AI to detect gayness; it can only train an AI to try to beat humans at a discrimination game that has no positive real-world use case.
Three solutions
The Stanford gaydar paper is just one of thousands of examples of black box alchemy out there. Nobody should be surprised that this line of research is so popular: it's the low-hanging fruit of ML research.
Twenty years ago, the number of high school graduates interested in machine learning was a drop in the bucket compared to how many teens are heading off to university to get a degree in AI this year.
And that's both good and bad. The good thing is that there are more brilliant AI/ML researchers in the world today than ever, and that number is just going to keep growing.
The bad thing is that every AI classroom on the planet is littered with students who don't understand the difference between a Magic 8-Ball and a prediction model, and there are even fewer who understand why the former's more useful for predicting human outcomes.
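For anyone who wants to see that difference, or the lack of one, here's a hedged toy in the same vein as the sketch above: when a label has no measurable relationship to the inputs, a trained model and a Magic 8-Ball converge on the same coin-flip accuracy. The data, the model choice, and the sizes are all arbitrary assumptions made for illustration.

```python
# Another synthetic toy. Labels are generated independently of the features,
# a stand-in for "the label isn't measurable from the input at all."
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(4_000, 30))      # arbitrary "features"
y = rng.integers(0, 2, size=4_000)    # labels with no relationship to X

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

model = RandomForestClassifier(random_state=1).fit(X_train, y_train)
eight_ball = rng.integers(0, 2, size=len(y_test))  # the Magic 8-Ball

print(f"trained model: {model.score(X_test, y_test):.2f}")    # ~0.50
print(f"magic 8-ball:  {(eight_ball == y_test).mean():.2f}")  # ~0.50
# Same coin flip. At least the 8-Ball is honest about being random.
```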
And that brings us to the three things every student, researcher, professor, and AI developer can do to make the entire field of AI/ML better for everyone.
- Don't do black box alchemy. The first question you should ask before beginning any AI project related to prediction is: will this affect human outcomes? If the only science you can use to measure your project's efficacy is to compare it to human accuracy, there's a good chance you're not doing great work.
- Don't create new models with the sole purpose of surpassing the benchmarks set by previous models just because you can't afford to curate useful databases.
- Don't train models on data you can't guarantee to be accurate and diverse.
I'd like to just end this article with those three tidbits of advice like some kind of smug mic drop, but it's not that kind of moment.
The fact of the matter is that a huge portion of students are likely to struggle to do anything novel in the field of AI/ML that doesn't involve breaking all three of those rules. And that's because black box alchemy is easy, building bespoke databases is damn near impossible for anyone without big tech's resources, and only a handful of universities and companies can afford to train large-parameter models.
We're stuck in a place where the vast majority of students and would-be developers don't have access to the resources necessary to go beyond trying to find "cool" ways to use open-source algorithms.
The only way to power through this era and into a more productive one is for the next generation of developers to reject the current trends and carve a path away from the status quo, just like the current crop of pioneering AI developers did in their day.