How machine learning finds anomalies to catch financial cybercriminals

In the last few months, fraudsters have stolen millions of dollars from unemployment systems, which are under immense pressure from coronavirus-related claims.

A skilled ring of international fraudsters has been submitting false unemployment claims for individuals who still have steady work. The attackers use previously acquired Personally Identifiable Information (PII) such as social security numbers, addresses, names, phone numbers, and bank account information to trick public officials into accepting the claims.

Payouts to these employed people are then redirected to money laundering accomplices who pass the money around to veil the illicit nature of the cash before depositing it into their own accounts.

The acquisition of the PII that enabled these attacks and the pattern of money laundering that financial institutions failed to detect both highlight the importance of renewed security. But where historical rules-based systems fail, artificial intelligence trained on high-quality data excels.

How attackers acquire your financial information

Suppose you’re in need of gasoline, and you’ve stopped at your usual station. You slip your credit card into the slot and the machine reads, “Remove card quickly,” just like always. Yet you probably haven’t noticed the miniature piece of hardware fitted over the slot, looking identical to the usual slot, that reads your credit card number as it passes by.

Or suppose you receive an email from alerts@weIlsfargo.com that reads “We Have Detected Suspicious Activity On Your Account, Did You Recently Spend $5000 on Amazon?” There’s a button that takes you to the website, and a message in the footer that says “Do not give your account credentials to anyone for any reason. Wells Fargo will never ask for your personal information in an email.” When you go to the website, it looks exactly as you would expect, so you enter your password and the hacker now has access to your account. Did you notice that the sender’s address spelled “wells” with an uppercase “i” in place of the first lowercase “L”?

Once the attacker has access, they can spend your money without your permission; as long as the individual transactions aren’t too large, most people don’t notice. Or worse, the attacker can empty your accounts in one move before you realize what’s happened.

Anomaly detection methods

Companies employ machine learning to monitor emails, login attempts, personal transactions, and business activities every day. Most financial institutions use a kind of AI called anomaly detection, a process through which computers can classify activity on a consumer’s account as either typical or suspicious.

One approach to anomaly detection is time series analysis, which compares a consumer’s transactions with their own recent transaction history. It often takes into account parameters like consumer location, transaction location, merchant location, merchant type, transaction amount, time of year, and more. If the estimated probability of suspicious activity exceeds a certain threshold, the system alerts human reviewers to the danger. For very high probabilities, it might block the transaction automatically.
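
As a rough illustration of this kind of thresholding, the Python sketch below scores a new transaction amount against a customer’s recent history. It uses a z-score as a stand-in for the model’s probability of suspicious activity; the function names, both thresholds, and the sample amounts are assumptions made for the example, and a real system would weigh many more signals, such as location and merchant type.

```python
from statistics import mean, stdev

ALERT_THRESHOLD = 3.0  # hypothetical: score at which a human reviewer is alerted
BLOCK_THRESHOLD = 6.0  # hypothetical: score at which the transaction is blocked


def score_transaction(history, amount):
    """Return how many standard deviations `amount` lies above the mean of `history`."""
    if len(history) < 2:
        return 0.0  # too little history to judge in this sketch
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Every past amount was identical, so any different amount is maximally unusual.
        return 0.0 if amount == mu else float("inf")
    return (amount - mu) / sigma


def decide(history, amount):
    """Map the score to one of three actions: allow, alert, or block."""
    z = score_transaction(history, amount)
    if z >= BLOCK_THRESHOLD:
        return "block"
    if z >= ALERT_THRESHOLD:
        return "alert"
    return "allow"


recent = [28.0, 31.5, 30.0, 29.0, 32.0, 30.5]  # made-up recent amounts in dollars
for amount in (31.0, 36.0, 100.0):
    print(amount, decide(recent, amount))
```

Note that this sketch only flags unusually large amounts; unusually small ones are allowed through.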

For example, you may have a history of spending $30 per week at restaurants. If you were suddenly to spend $100 per week at restaurants, an AI may find this change to be normal during the holidays but potentially dangerous at other times of the year.
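
One way to picture this seasonal effect is to compare a week’s spending only with past weeks from the same season. The sketch below does that in Python. The choice of November and December as “the holidays,” the `tolerance` multiplier, and the sample history are all assumptions for illustration, including the assumption that this customer has spent more during past holiday seasons.

```python
from statistics import mean

HOLIDAY_MONTHS = {11, 12}  # assumption: which months count as "the holidays"


def is_unusual(weekly_history, month, amount, tolerance=2.0):
    """Flag `amount` if it exceeds `tolerance` times the customer's average
    weekly spend during the same season (holidays vs. the rest of the year)."""
    in_holidays = month in HOLIDAY_MONTHS
    comparable = [
        spent for m, spent in weekly_history
        if (m in HOLIDAY_MONTHS) == in_holidays
    ]
    if not comparable:
        return False  # no same-season history to compare against in this sketch
    return amount > tolerance * mean(comparable)


# (month, weekly restaurant spend in dollars); made-up values
history = [(3, 30.0), (6, 28.0), (9, 32.0), (11, 85.0), (12, 95.0)]

print(is_unusual(history, 12, 100.0))  # December: False
print(is_unusual(history, 6, 100.0))   # June: True
```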

To make these models effective, high-quality training data is essential. Training data is used to teach the model how to classify transactions as typical or anomalous. Subject matter experts help the computer learn by manually labeling suspicious activity. The machine then uses what it learned from the training data to make predictions about novel data.
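
To make the idea of learning from labeled examples concrete, here is a deliberately tiny Python sketch. It uses a nearest-centroid classifier, chosen only because it is short; it is not meant to represent what financial institutions actually deploy. The two features (amount in hundreds of dollars, distance from home in hundreds of kilometers), the labels, and the data are all hypothetical.

```python
def fit_centroids(examples):
    """Average the feature vectors of each label.

    `examples` is a list of (features, label) pairs labeled by an expert.
    """
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(label):
        return sum((c - f) ** 2 for c, f in zip(centroids[label], features))
    return min(centroids, key=squared_distance)


# Hypothetical expert-labeled training data:
# (amount in $100s, distance from home in 100s of km)
training = [
    ((0.3, 0.05), "typical"),
    ((0.5, 0.10), "typical"),
    ((0.4, 0.02), "typical"),
    ((9.0, 40.0), "suspicious"),
    ((12.0, 55.0), "suspicious"),
    ((7.5, 30.0), "suspicious"),
]

model = fit_centroids(training)
print(predict(model, (0.45, 0.08)))  # a small purchase near home
print(predict(model, (8.0, 35.0)))   # a large purchase far from home
```

The quality of the labels matters here as much as in any larger model: a mislabeled example shifts its centroid and, with it, every later prediction.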

The trouble is that attackers are constantly innovating with new techniques that throw off the computers. A different kind of anomaly detection called unsupervised outlier detection helps us to root out emerging patterns of abuse. Instead of learning from the expertise of a human with training data, the goal of unsupervised outlier detection is to help the human to see patterns they didn’t see before.

Consider a drug trafficking organization that regularly executes cash sales in excess of $1M. If they were to deposit the money directly, the transaction would be detected and blocked. But, instead, they can create “shell” companies that pretend to offer services in exchange for the illicit cash; no actual business need occur. This technique is an example of money laundering.

In this case, rather than identifying individual transactions as criminal based on the training data from the past, the AI would try to define groups of companies that share similar patterns of behavior. This kind of AI might discover a large group of companies conducting business as usual, but it might also discover that there is a much smaller smattering of companies, all located in tax havens, all founded recently, all with relatively few clients, all with a steady flow of business, etc. By examining the groupings discovered by the AI, a security specialist from the finance industry can investigate whether any of the groups, or outliers that don’t belong to a group, might correspond to money laundering schemes. In this way, we can learn how criminals are organizing themselves, and use the information in the future to detect these new kinds of money laundering automatically.
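
The grouping step can be sketched in a few lines of Python. The example below uses a simple greedy, distance-based grouping rather than any particular production algorithm, and then lists the groups that account for only a small share of all companies so a specialist can review them. The company names, the three features (each pre-scaled to the range 0 to 1), the `radius`, and the `max_share` cutoff are all assumptions made for illustration.

```python
from math import sqrt


def distance(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_companies(companies, radius):
    """Greedy grouping: a company joins the first group whose first member
    lies within `radius`; otherwise it starts a new group."""
    groups = []
    for name, features in companies.items():
        for group in groups:
            if distance(features, companies[group[0]]) <= radius:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


def small_groups(groups, max_share=0.3):
    """Return the groups that hold at most `max_share` of all companies."""
    total = sum(len(g) for g in groups)
    return [g for g in groups if len(g) / total <= max_share]


# Hypothetical features: (company age, number of clients, located in a tax haven)
companies = {
    "A": (0.80, 0.70, 0.0),
    "B": (0.90, 0.80, 0.0),
    "C": (0.70, 0.75, 0.0),
    "D": (0.85, 0.65, 0.0),
    "E": (0.75, 0.80, 0.0),
    "X": (0.05, 0.10, 1.0),
    "Y": (0.10, 0.05, 1.0),
}

groups = group_companies(companies, radius=0.5)
print(groups)                # [['A', 'B', 'C', 'D', 'E'], ['X', 'Y']]
print(small_groups(groups))  # [['X', 'Y']]
```

No labels are involved: the code never says which group is criminal. It only surfaces the small cluster of young, tax-haven companies with few clients, and a human decides whether that cluster deserves investigation.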

The future of AI

One of the challenges with anomaly detection, especially when using deep learning techniques, is that it’s sometimes difficult to understand why certain transactions or companies were singled out as suspicious. Strictly speaking, the machine simply yields groupings and anomalies, so a human specialist must interpret the results. But what if an AI could tell us not only what the anomalies are, but also why they were classified as such? This emerging discipline is called explainable AI (XAI).

Let’s return to our example of going out to restaurants. Today’s AI is likely to send an email to alert you that unusual activity has occurred on your account, while an XAI would not only alert you but also tell you this transaction was flagged because it occurred on an unusual day or in an unusual location. Armed with this information, you would be able to better assess whether the email was anything to be concerned about.
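
A very simple form of such an explanation is to report which features of a transaction deviate most from the customer’s history. The Python sketch below does this with per-feature z-scores. Real XAI techniques are considerably more sophisticated; the feature names, the threshold, and the data here are assumptions chosen for the example.

```python
from statistics import mean, stdev


def explain(history, transaction, threshold=3.0):
    """Return (feature, z-score) pairs for every feature of `transaction` that
    deviates from `history` by at least `threshold` standard deviations,
    ordered from most to least unusual."""
    reasons = []
    for feature, value in transaction.items():
        past = [record[feature] for record in history]
        sigma = stdev(past)
        if sigma == 0:
            continue  # the feature never varied, so no z-score; skipped in this sketch
        z = (value - mean(past)) / sigma
        if abs(z) >= threshold:
            reasons.append((feature, round(z, 1)))
    return sorted(reasons, key=lambda reason: abs(reason[1]), reverse=True)


# Hypothetical restaurant transactions: amount in dollars, distance from home in km
history = [
    {"amount": 28.0, "distance_km": 2.0},
    {"amount": 32.0, "distance_km": 3.0},
    {"amount": 30.0, "distance_km": 2.5},
    {"amount": 31.0, "distance_km": 3.5},
    {"amount": 29.0, "distance_km": 4.0},
]

flagged = {"amount": 31.0, "distance_km": 250.0}
for feature, z in explain(history, flagged):
    print(f"Flagged because {feature} is {z} standard deviations from usual")
```

Here the amount is ordinary, so only the location is reported as the reason for the alert.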

The future of security and AI in the finance sector will involve learning from larger and more complex volumes of data. As we collect more and more information about how users behave, the power of AI grows. The more data at our disposal, the more accurately we can scrutinize suspicious behavior. In a world where the amount of data collected and stored doubles almost yearly, AI will be essential for generating the insights that keep us safe.

This article was originally published by Igor Kaufman & Ellery Galvin on TechTalks, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech and what we need to look out for. You can read the original article here.