This article was published on February 26, 2022

Here’s how algorithms are made

Author Florian Jaton sheds light on the human side of algorithms


Here’s how algorithms are made

As software and algorithms become an increasingly pervasive part of our lives, there’s growing interest and concern on how they are affecting society, the economy, and politics.

Yet, most social studies of algorithms perceive them as obscure black boxes that function autonomously. This isolated look at algorithms, which separates them from their human elements leads us to the wrong understanding and conclusions.

The Constitution of Algorithms, a book by Florian Jaton, Postdoctoral Researcher at the STS Lab at the University of Lausanne, sheds light on the human side of algorithms by exploring them from the inside instead of studying them from afar. Instead of working his way back from a working algorithm and trying to figure out how it came into being, Jaton starts from seemingly unrelated entities, such as people, desires, documents, curiosities, and then studies how all of these come together and interact to form what we call algorithms.

By marrying ethnography and hands-on practice of ground-truthing, programming, and formulating, Jaton discovers all the small but important details and practices that go into creating algorithms. And in his journey, he shows us how we and our algorithms affect each other. Accordingly, his study of the constitution of algorithms can help uncover new directions to align our software with our values.

“When I started to be interested in the topic of algorithms in 2013, there was already a substantial, and generally quite critical, literature on the social effects of algorithms,” Jaton said. “These important works studied the ways in which algorithms acted on our lives, while also emphasizing algorithms’ opacity.”

While these studies were important and helped draw attention to the impact of algorithms on different levels, Jaton thought that the systematic denunciation of the power of algorithms could be counterproductive.

“I was especially concerned that algorithms were mostly considered abstract and floating entities made of distant code and obscure mathematics,” Jaton said. “How indeed to act against abstract, floating, and obscure entities? The fact that the critical depiction of algorithms did not give them much empirical thickness made, in my opinion, a much-needed collective contestation difficult to carry out.”

The problem, Jaton found out on closer review, was that algorithms are studied from outside, from the offices of critical sociologists who observe them through official accounts, such as reports, software, and academic papers. This methodology filters out the “fragile scaffolding that had previously contributed to their progressive shaping,” he writes.

“This is where my training in Science & Technology Studies (STS) was useful, since one of the basic postulates of this sub-field of social science is to consider techno-scientific devices as the products of situated and accountable practices,” Jaton said.

“A possible remedy to the disarming critical discourse on algorithms seemed then to lie in a drastic change of method, privileging the anthropology of science as framed, for example, by the in situ ethnographic works of Bruno Latour, Michael Lynch and Lucy Suchman, over distant document analysis (that yet remains important).”

To write The Constitution of Algorithms, Jaton spent two-and-a-half years working as part of a team of research scientists who were working on a computer vision algorithm. As he took part in and documented discussions, data-gathering efforts, programming sessions, code debugging practices, and refinement of theories, he realized that a lot of the important work that goes into creating algorithms is disregarded when they are studied in social settings.

In his book, he writes, “The invisibility of the practices underlying the development of algorithms can indeed no longer be considered positive: as they are the object of repeated disputes, it is now certainly important, or at least interesting, to document the practical processes that enable them to come into existence.”

Jaton breaks down the constitution of algorithms into three main phases: Ground-truthing, programming, and formulating.

machine learning

Ground-truthing

When a group of computer scientists, researchers, or engineers gather to create an algorithm, they are initially driven by a set of elements, including desires, skills, means, and hopes. For example, a research group might want to contest or outperform the results of a previously published scientific paper. The members of the team have a set of mathematical and programming skills that they can count on to achieve this goal.

They may have access to computing resources, academic papers, and digital tools that can help them. And finally, they might hope to make a change in a field of science, such as helping improve medical imaging or solve a problem that can later be productized, such as creating an algorithm that can detect defects in manufacturing plants.

But before they can develop an algorithm that can meet their goals, they must go through a process of problematization and ground-truthing. During this phase, the researchers must precisely define the problem they want to solve and determine the type of data they need to validate their algorithm. For example, an image classification algorithm determines whether an object is present in an image or not. An object detection algorithm, on the other hand, must also determine the coordinates of the object in the image. Other specifications might also apply, such as whether the image contains only one object or it may contain several objects? Do lighting conditions vary in the environments where the algorithm will be used? Do the objects appear against various backgrounds or do they appear against the same background all the time?

Once the problem is formulated, the researchers must establish the ground truth by collecting the right material that enables them to verify their algorithms and the models they will build in the future. For example, in the case of computer vision algorithms, the researchers will gather a dataset of images that fit the description of their problem and can be used to train machine learning models. These images then have to be labeled with the data needed to test the algorithms. For example, if they are creating an object detection algorithm, they must annotate each image with the data for the bounding box of the object(s) it contains.

It is important to understand that the way we think about problems and ground truths will largely affect the algorithms we create and the effects they will have. For example, if an object detection algorithm is derived from ground truths in which objects are centered, it might work well on similar images but fail miserably on images that contain multiple scattered objects. Likewise, if a facial recognition algorithm is only validated against images of people from a specific ethnicity, then it will perform poorly on images of other groups. As Jaton points out in The Constitution of Algorithms, “we get the algorithms of our ground truths.”

“To consider the question of ground truths head-on would make it possible to communicate the following message: algorithms, as technical devices, can only retrieve things that have already been defined,” Jaton said.

“As soon as an algorithm produces a result, the reflex should thus be to immediately ask the following question: from which ground-truth database does this algorithm derive? This would highlight the intrinsic limitations of algorithms—they are optimization devices—while not undermining their value and beauty—they are arguably the finest and most beautiful optimization devices we have designed to date.”

Examining and documenting the ground-truthing process is extremely important to the study of algorithms and their effects on society, especially as they take on sensitive tasks. There have been numerous instances where poor design has led to algorithms committing critical mistakes such as making wrongfully biased decisions, creating filter bubbles, and promoting fake news. There’s growing interest in understanding and addressing the risks of algorithms. Having a thorough process for studying and documenting the ground-truthing process will be crucial to addressing these risks.

“As long as the practical work subtending the constitution of algorithms remains abstract and indefinite, modifying the ecology of this work will remain extremely difficult,” Jaton writes in The Constitution of Algorithms. “Changing the biases that root algorithms in order to make them promote different values may, in that sense, be achieved by making the work practices that underlie algorithms’ veracities more visible. If more studies could inquire into the ground-truthing practices algorithms derive from, then actual composition potentials may slowly be suggested.”

Programming

Eventually, an algorithm moves to the programming phase, where a set of modules and a list of instructions are created to solve the defined problem and verified against the ground truth. But while this phase is usually reduced to pure source code, Jaton shows in his book that there is a lot more to programming than putting together a list of computer instructions.

“When cognitivists inquire into what makes programs exist, they cannot go beyond the form ‘program’ that precisely needs to be accounted for. In a surprisingly vicious circle that has to do with the so-called computational metaphor of the mind, cognitivists end up proposing numerous (mental) programs to explain the development of (computer) programs,” Jaton writes in The Constitution of Algorithms.

This problematic view on the practices of programming is deeply rooted in the history of computing, Jaton describes in his book. Scientists, researchers, and companies have tried to frame computers as input-output systems that have been created in the image of the human brain. In reality, however, it was the human mind that was later reimagined as an organic version of the computer.

These metaphors have reduced programming to providing the right set of instructions to a digital brain. They have also shaped how programmers are trained and evaluated, putting more emphasis on instruction writing and disregarding all the other practices that are valuable to developing software.

In his book, Jaton documents his own and his team’s experience in writing instructions, stumbling on bugs, writing intermediary code to zero in on the root of the problem, and consulting with other team members.

He emphasizes the importance of the intermediate steps taken to adjust and improve the program, the interactions between team members in refining the code, and many other steps and practices that are not reflected in the final source code. In reading his book, I got to reflect on my own process in writing code and implementing algorithms and the small details that are important to my work but I take for granted.

“I think that the most overlooked aspect of programming practices are all the small provisionary inscriptions that accompany the writing of programs (and that are then deleted and made invisible),” Jaton said. “I was really surprised to discover all these little writings and experiments that punctuate programming sequences, and integrating them into the analysis of coding practices seems to me to be absolutely crucial to better understand what happens during these engaging moments (because as Donald Knuth rightly says, software is hard).”

The micro-sociological analysis of programming practices is just beginning, Jaton says, and its propositions are therefore still mostly exploratory. Therefore, it is difficult to say where it might lead to. But he is convinced that a mastery of programming practices through in situ micro-sociological studies will enable greater flexibility in algorithmic design.

“A better understanding of whys and wherefores of programming practices would perhaps make it possible to extract what makes good programmers special, in order to then try to infuse this specialness into the algorithmic design communities,” Jaton said.

computing

Formulation

Finally, when an algorithm is implemented and verified against the ground truth, it becomes formulated into a mathematical object that can be later used in other algorithms. An algorithm must stand the test of time, prove its value in applications, and its usefulness in other scientific and applied work. Once proven, these algorithms become abstracted, taken as proven claims that need no further investigation. They become the basis and components of other algorithms, and contribute to further work in science.

But an important point to underline here is that when the problem, ground-truth, and implementation are formulated into an abstract entity, all the small details, and facts that went into creating it become invisible and tend to be ignored.

“If STS has long shown that scientific objects need to be manufactured in laboratories, the heavy apparatus of these locations as well as the practical work needed to make them operative tend to vanish as soon as written claims about scientific objects become certified facts,” Jaton writes in The Constitution of Algorithms.

“Once there are no more controversies or disagreements about a new scientific object, nature tends to be invoked as the realm that always already contained this constructed scientific object… when facts are certified and enrolled in further studies, the experiments, instruments, communities, and practices that allowed their progressive formation are generally put aside.”

But it is important to understand that algorithms, once formulated, become the foundation of other algorithms, where they will contribute to ground-truthing, programming, formulation, and other practices. Having deeper visibility into the different phases of the constitution of algorithms will better equip us to have constructive discussions about them and investigate their broader impact.

“This conception of algorithms as the joint product of ground-truthing, programming, and formulating activities—themselves often supported by other algorithms that may have undergone analog constituting processes—complicates the overall picture while making it more intelligible,” Jaton writes.

“Indeed, whenever controversies arise over the effect of an algorithm, disputants may now refer to this basic mapping and collectively consider questions such as: How was the algorithm’s ground truth produced? Which formulas operated the transformation of the input-data into output-targets? What programming efforts did all this necessitate? And, if deeper reflections are required, disputants may excavate another layer: Which algorithms contributed to these ground-truthing, programming, and formulating processes? And how were these second-order algorithms constituted in the first place?”

Zooming out and looking at this intricate web of algorithms and their social components that evolve together, we get a bigger picture of how we and algorithms are becoming inexorable parts of each other. This is an area where we still have a lot to learn and reconsider our own views, Jaton says.

“I am coming to think more and more that the fact that we are always building an algorithm on top of other algorithms allows us to remember that the questions of effects and uses remain central to the analysis of algorithms in society,” he told me.

“Indeed, when reading the book, one might sometimes think that the sociology/ethnography of the effects and uses of algorithms is quite distinct from the sociology/ethnography of their conception. But on closer reflection, this distinction is in fact counterproductive: those who design new algorithms are also part of society; they are actors who use and are influenced, to varying degrees, by other algorithms, which they sometimes use to build new ones. It is therefore impossible to completely put aside the question of the effects and uses of algorithms, and it is therefore important to also include the study of the social effects of algorithms within the study of their shaping.”

This article was originally published by Ben Dickson on TechTalks, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech, and what we need to look out for. You can read the original article here.

Get the TNW newsletter

Get the most important tech news in your inbox each week.

Also tagged with


Published
Back to top