Thomas is a writer at TNW. He covers the full spectrum of European tech, with a particular focus on deeptech, startups, and government policy.
Scientists have developed an algorithm that scans online ads for escorts to identify human traffickers and their victims.
The algorithm, called InfoShield, searches for escort ads with strong similarities, which can be a sign of trafficking.
Per the study paper:
“The majority of ads suspected of HT [human trafficking] are written by one person, who is controlling ads for four-six different victims at a time. By looking for small clusters of ads that contain similar phrasing, rather than analyzing standalone ads, we’re finding the groups of ads that are most likely to be organized activity, which is a strong signal of HT.”
Law enforcement agencies typically look for HT cases manually. InfoShield is designed to speed up the search by detecting mini-clusters of ads, grouping them together, and summarizing the common parts.
“Our algorithm can put the millions of advertisements together and highlight the common parts,” said study co-author Christos Faloutsos in a statement. “If they have a lot of things in common, it’s not guaranteed, but it’s highly likely that it is something suspicious.”
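To make the idea of grouping similar ads and surfacing their common parts more concrete, here is a minimal Python sketch. It is not InfoShield itself: it swaps in a much simpler technique (Jaccard similarity over word trigrams, merged with union-find), and the function names, the 0.3 threshold, and the sample texts are all assumptions made for illustration.

```python
from itertools import combinations


def shingles(text, n=3):
    """Return the set of word n-grams (as tuples) in a lowercased text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(max(len(words) - n + 1, 1))}


def jaccard(a, b):
    """Jaccard similarity: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_ads(ads, threshold=0.3):
    """Group ads whose trigram sets overlap enough (hypothetical threshold)."""
    parent = list(range(len(ads)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    sets = [shingles(ad) for ad in ads]
    for i, j in combinations(range(len(ads)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(ads)):
        groups.setdefault(find(i), []).append(i)
    # Standalone ads are dropped; only clusters of two or more are kept.
    return [g for g in groups.values() if len(g) > 1]


def common_phrases(ads, group):
    """Return the trigrams shared by every ad in a cluster."""
    shared = set.intersection(*(shingles(ads[i]) for i in group))
    return sorted(" ".join(s) for s in shared)


# Made-up sample texts, for illustration only.
ads = [
    "new in town call anna today sweet and friendly",
    "new in town call maria today sweet and friendly",
    "new in town call lena today sweet and friendly",
    "vintage bicycle for sale barely used",
]

for group in cluster_ads(ads):
    print(group, common_phrases(ads, group))
```

The three near-identical texts end up in one cluster and the unrelated listing is left out. Comparing every pair of documents like this would not scale to millions of ads, so treat it only as an illustration of the principle.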
Researchers at Carnegie Mellon University and McGill University adapted InfoShield from an algorithm used to spot anomalies in data, such as typos in hospital patient information.
In tests on escort listings that had already been identified as advertising victims of trafficking, InfoShield correctly flagged the ads with 84% precision. In addition, it didn’t incorrectly identify any of the listings as trafficking ads.
However, the team had to keep their findings private to protect the victims. To prove their algorithm worked, they applied it to tweets created by bots, which also typically tweet the same information in similar ways.
They found that InfoShield was also highly accurate at detecting the bots. The team also reports that it is scalable, requiring about eight hours to process 4 million documents on a stock laptop.
The team now hopes to see their research help victims of trafficking, and ultimately reduce human suffering.
You can read the study paper here.