This article was published on August 11, 2019

Here’s how anonymized personal data should be used to create better services



Ask anyone what they think of companies using their data and the response will almost always be negative. It’s not surprising; one revelation after another has shown that the likes of Google and Facebook not only profit from user data, but are often reckless or intrusive in the way that they do so.

The level of insight that major technology entities have into the average internet user’s lifestyle and preferences is frightening at times.

Back in 2012, Target used buying data to determine that a teen was pregnant before even her own father knew, in a story that has come to represent corporate overreach. The retailer would routinely send coupons for diapers and cribs to women it suspected were pregnant, despite never having been told so explicitly by the customer.

When this was found to be making the recipients uneasy, Target would send booklets with childcare coupons placed within them in such a way as to look like they were there by chance.

This may seem underhanded, but it is not illegal.

Online sellers that track movements on the internet will have even more personal data than Target and will use it in equally intrusive ways.


Public distrust in the way companies handle this kind of personal data has led indirectly to sweeping legislation in Europe in the form of GDPR. The legislation, which came into effect in May 2018, is a welcome development, giving more power to the individual with regard to what data is collected about them and how it is used by corporations.

Operating on an opt-in basis means that users have to actively choose to receive the perceived benefits of corporate data use.

Data pools will be slashed as a result, and contextual advertising is being used to fill the gap. An overwhelming majority (87%) of those surveyed by Sizmek plan to scale up their contextual targeting efforts in light of GDPR, while 77% agree that the legislation makes third-party data ad targeting more difficult.

The lack of user desire to share data with businesses is such that alternatives to personalization – tipped almost unanimously to be the future of advertising – will need to be explored.

What data do you REALLY want to share? 

One element that often gets lost in the conversation around the use of user data is that, in an ideal world, users should want to share their data.

First, the use of data can improve a user’s experience of a webpage. Media outlets can present content tailored to a user’s preferences, much in the same way that retailers can suggest products based on buying history.

Google’s autocomplete feature is heavily based on your browsing history and streamlines the whole process of using the search engine. Go deeper, and personal data can be used to build more efficient cities, create better products and even push the medical industry forward.

One example is personal genetics brand 23andMe. The company will analyse its customers’ DNA for a relatively inexpensive fee, giving them information on their genetic health risks and their carrier status.

According to 23andMe, it has analyzed the DNA of over 10 million customers, with more than 80% opting in to having their data available for research.

On average, the company says, each individual who opts in contributes to 200 different research studies, and 23andMe has published more than 100 peer-reviewed studies in scientific journals. It has partnered with the likes of Genentech and Pfizer, providing data that has helped develop treatments for Parkinson’s and Crohn’s.

23andMe is a particularly interesting example because DNA is so intrinsically personal. It is so unique and identifiable as belonging to one particular individual that you can scarcely imagine sharing anything more intrusive.

The difference is, however, that very few people would be able to personally identify someone by their DNA results, and DNA results reflect very little about their owner’s personality and behaviors. Many people would much sooner have the findings from their DNA shared publicly than they would their browsing histories, for example.

Healthcare is based on personal data

There is a strong argument to be made that private data should be made more accessible for public projects; the effects when it is used for public good can be powerful.

In 2004, popular arthritis and pain drug Vioxx was pulled from the market by manufacturer Merck & Co., after it was found that use of the drug led to increased risk of cardiovascular problems like heart attacks and strokes.

The connection was found when a researcher from the US Food and Drug Administration examined 1.4 million electronic health records and the drug was subsequently withdrawn from the market.

Binary District Journal spoke with Rainier Mallol, president of AIME, a company which is working to close the gap between tech and public health using data-driven projects. Rainier’s work has been lauded, primarily for his company’s ability to predict outbreaks of diseases like Zika and dengue, with the latter seeing an effectiveness of 88%.

We asked Rainier whether he thinks that the wealth of data held by private companies like Google could have a significant impact on what AIME could achieve.

“A lot of the data that we believe they have could be extremely useful in the improvement of public health decisions,” he tells us.

“One dataset that comes to mind is the movement of people from one place to the other. Even without providing sensitive data (name, ID) about a person, this dataset could help organisations like ours predict the rate of certain infections (such as the Flu, Tuberculosis, MERS, and any other disease transmitted by human contact/proximity), and these predictions would allow us to make better decisions regarding what operations should start, what changes in policy should be made, and how funding should be distributed.

“Another example is with Traffic. With a traffic data set, a company like ours could create software platforms for data-driven traffic policymaking. Ideally, a platform like this would be the basis of new policies, and would make our roads safer, reduce traffic, and would make our cities more productive. In terms of public health, we could devise better data-driven policies for road safety, as well as to change the way current road safety operations are handled.”

If health data could be considered more in line with actually serving the public than advertising needs, then real issues could be tackled. Without access to patient data, it would have taken Vioxx far longer to be pulled from the market and avoidable deaths may well have occurred.

The important distinction in this case, for many, will be the anonymity status of the electronic health records. If anonymous data is used to get a dangerous drug taken off of the market, it’s difficult to imagine many people finding the idea offensive. Things get trickier, however, when the data is not properly anonymized.
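To make "properly anonymized" a little more concrete, here is a minimal sketch of one common approach: strip direct identifiers, coarsen the details that could still single someone out, and withhold any record whose remaining combination of details is shared by too few people. This is an illustration only, not the method used in the Vioxx analysis or by any company mentioned here. The field names, the sample records and the threshold `k` are all hypothetical, and real-world anonymization involves considerably more than this.

```python
from collections import Counter

# Hypothetical patient records; every field name and value is made up.
records = [
    {"name": "Ann", "age": 34, "zip": "90210", "drug": "A", "event": True},
    {"name": "Bob", "age": 38, "zip": "90211", "drug": "B", "event": False},
    {"name": "Cy", "age": 61, "zip": "10001", "drug": "A", "event": False},
]


def generalize(record):
    """Drop the direct identifier (name) and coarsen age and zip code."""
    decade = (record["age"] // 10) * 10
    return {
        "age_band": f"{decade}-{decade + 9}",
        "zip_prefix": record["zip"][:3],
        "drug": record["drug"],
        "event": record["event"],
    }


def k_anonymize(rows, k=2):
    """Keep only rows whose (age_band, zip_prefix) pair occurs at least k times."""
    generalized = [generalize(r) for r in rows]
    counts = Counter((g["age_band"], g["zip_prefix"]) for g in generalized)
    return [
        g for g in generalized
        if counts[(g["age_band"], g["zip_prefix"])] >= k
    ]


released = k_anonymize(records, k=2)
```

In this toy dataset the third record is withheld, because nobody else shares its age band and zip prefix, so releasing it could point to one person even with the name removed.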

No such thing as private data

The data collected by private entities, if it were freely available to academics, could power incredible discoveries and enable game-changing developments in areas like machine learning. If academia had the resources, it could build its own data centres or buy market data from the likes of Amazon.

As it stands, it is reliant on handouts from entities like Google and Microsoft, which can be a sporadic and unreliable form of philanthropy around which to base entire projects. Data can be used for much more than just ad targeting (how some major data-holding companies make the bulk of their money).

We asked Rainier if he believed that major companies had any moral obligation to make their data publicly available for use in healthcare. He believes the situation is more complicated than an issue of straightforward morality.

“People in these companies have led numerous R&D teams that have allowed them to have these different technologies,” he says. “It is in their domain whether they share the fruits of their R&D investment with others.

“It could even be possible that the people in these companies are too specialized in the technology, and don’t necessarily know how the datasets they acquire could help in public health. It’s also possible that the systems do not have a way to share this information, and would take time and resources to make it able to share the data.”

Everyone will have a different opinion on how morally obliged companies are to make their data transparent and accessible, but there is no denying its potential to benefit society. In an ideal world, the positive outcomes coming from constant data hoarding and analysis would be so clear that users would be comfortable with the idea of their own data being harvested.

This is not the case, though, with the majority of people still opposed to corporate data collection, and an overwhelming majority opposed to that data being shared. As Rainier makes clear, there is the potential for that data to significantly benefit the healthcare industry going forward – if this were a more defined part of the public debate, public opinion might not be so hostile.

This article was originally published on Binary District by Charlie Sammonds. Binary District is an international collaborative technology community which creates unique competency-based workshops and events on new technologies. Follow them on Twitter.
