This article was published on June 13, 2018

Facebook’s data should be a public resource

Vess Popov, a big data expert, thinks that Facebook needs to give away more data


Image by: Brain Bar

Vess Popov, an expert in big data and psychometrics, spoke at Brain Bar in Budapest on the future of data. In his talk he shared his concern that platforms like Facebook are becoming more closed off: their research is more secretive than ever, and less of it is reaching the scientific community and the general public.

Popov’s concern is understandable as he works for the Psychometrics Center at the University of Cambridge, a research institution that pioneered the study of psychology through big data analysis. They’re basically the ‘good guy’ version of Cambridge Analytica, which infamously manipulated voters based on their psychological profiles.

To find out more about the current state of psychological analysis in big data, TNW sat down with Popov on the bank of the Danube and asked him what the future holds for our personal data. Judging by Popov’s answers, the outlook is bleak, but there could be a solution.

We can’t trust companies with our data

Unfortunately it took a massive scandal like the Facebook/Cambridge Analytica disaster to get all of us interested in how our personal data is being handled. But where do we go from here? One has to ask if there are any steps we can take to be able to fully trust companies with our data. How can we ensure they’re handling our data correctly and not using it to create harmful algorithms?

“The sad answer is we can never know for sure,” says Popov with cynical realism. “Only thing we can do is work on the incentives.” The problem we’re currently facing originates in the dysfunctional system we’ve built around people’s data. We reward companies for abusing our personal information:

Right now the financial incentive to do psychological targeting and marketing is absolutely huge. We published a paper showing that when the ‘personality’ of the ad matches the personality of the customer, it’s twice as profitable.

So there’s nothing to prevent people doing that. And actually, people should be doing that in a way that involves the user because, frankly, I want to get more personal ads. Provided I know what data you’re using to personalize it.

That’s why Popov believes we’ll never be able to trust companies in a ‘blanket way’ when it comes to data handling; we’ll always have to evaluate them case by case. At the core of the problem are destructive incentives, which need to be changed through the market or through regulation. If that fails, Popov adds, the impetus for change falls on us, the individuals.

But even if we succeed in changing the fundamentals of our current data market, will it truly chip away at the profit data giants have made off our personal information? We’ve already lost our data to these companies, and our personalities haven’t changed since our data was mined. Doesn’t that mean that companies like Facebook can keep selling our information to third parties, even after we’ve restricted their access to our data?

Yes absolutely they can. They’ve also been able to track users that don’t even have Facebook accounts — and they’re not unique in doing that. Every large advertiser does exactly the same thing. This is how our advertising infrastructure is built, on the basis of tracking. And tracking, as it currently works, is completely inconsistent with consent — even under the previous data protection law, before GDPR.

The reason is that you can’t say I consent to something that I don’t even know is on, or even understand how it works. Like that 100 ad exchange servers, each of them running a private auction for a split second just to show me an advert. I don’t understand that, I haven’t consented to it — but I don’t have a choice. I might be able to disable cookies or just stop using the internet, but then you’re placing the burden on users rather than companies that make all the money.

Popov emphasizes that even though the burden shouldn’t be on users, that doesn’t mean they shouldn’t be more involved. People need to be given proper control and oversight over their data. Legislation like GDPR goes a long way toward that, but it won’t happen overnight. While we’re waiting for these protections to settle in, what has to be done in the meantime?

Credit: Brain Bar
Vess Popov on stage at Brain Bar

Facebook should keep giving away our data (but to better people)

It might sound weird, just when people have acknowledged the need for more privacy, but Popov argues Facebook should give away more access to the data it has collected, for research. According to Popov, research can shed light on which areas legislation needs to cover. Basically, knowledge helps us better understand the problem we need to fix.

Back in 2007, David Stillwell, Popov’s colleague at the Psychometrics Center, created a Facebook application where six million people opted into sharing their data. This might sound similar to the Kogan/Cambridge Analytica app, but the important difference is that Stillwell only gathered data on people who opted in, not on their unsuspecting friends. This resulted in a huge open-sourced and anonymized database that could be used by academic researchers around the world.

The database led to a paper showing how people’s Facebook likes could be used to determine their personal attributes. Published in 2013, it made the researchers of the Psychometrics Center among the first to discover the capabilities of these types of data collection methods. It showed us, the public, that our Facebook likes (which were public at the time) were actually deeply private information.

“The situation is obviously different now, but you could argue that data would still be public if research like that hadn’t been done,” explains Popov. “We shouldn’t stifle research or innovation in the process of trying to reclaim our privacy. Because it’s actually the research and innovation that has the best promise of us having more privacy in the future.”

“If that research hadn’t been done, everything that Cambridge Analytica did would’ve been totally by the rules because they could’ve just used public data to do it. Then we wouldn’t have any legal problem to go against,” explains Popov.

Academic research that isn’t driven by the urge to monetize is therefore essential to our society, but up until now it’s been hard for researchers to gain access to Facebook’s data. Currently, tech companies themselves decide who gets favorable access, instead of that being settled by a democratized or merit-based process.

Popov mentions that Facebook gave Kogan a massive dataset, unrelated to the Cambridge Analytica case, that was never shared with other researchers. The reason wasn’t that the company had vetted Kogan, but simply that it had a working relationship with him.

To prevent issues like this, Popov prefers a governmental approach, where companies are required to share their data with researchers and research projects are evaluated on merit.

Ultimately, the choice of who can access data shouldn’t be up to the companies making profit from it. They didn’t create it — the public did.

“I think this incredibly valuable resource should be a public resource to a large degree, and I think individuals should be empowered to opt in and share their data with whoever they want,” says Popov.

Enough negativity. What can we actually do to fix things?

Knowing that not much can change without providing a better alternative, Popov says there are two things we can do to save us from a dystopian future:

  • Make tech companies more responsible for the content on their platforms
  • Implement data portability to ensure true competition

“I think we have a chance to impose greater editorial and publishing responsibilities on these platforms. Thus far Silicon Valley has grown as powerful as it has on the basis that it’s not responsible for the content published,” says Popov.

He adds that Facebook and Google have made great efforts in creating algorithms to detect bad content and remove it, but this shouldn’t absolve them of all responsibility. “Fascist, racist content gets a lot of clicks and a lot of shares, and every click and share is money into the pocket of Facebook.”

The other way forward is ‘data portability’: giving users the option to download their data in a convenient format so they can move it between companies and services. The right to data portability is included in GDPR, and Popov is incredibly excited by its potential to break down the current digital monopolies. However, it also happens to be one of the least defined rights within GDPR.

Popov says data portability is thought to stimulate competition and it works really well in the banking sector. Customers can move their account data easily between banks and the process takes seconds instead of weeks — but social media is more difficult.

The problem is that right now I have nowhere to move my Facebook data to. I want to have a social network, I want to stay in touch with my friends and family and so on, but I don’t have an alternative. I can download my Facebook data now but I don’t have a platform to take it to.

This shows the failure of competition regulation, Popov says, as he can’t even move his data to WhatsApp, which is owned by Facebook too. In his opinion, we’ve neglected consumers’ rights in our big push for digital capitalism, leaving consumers with no real choice.

That’s why data portability won’t mean much unless we have a market of secondary users, like banks that accept the APIs of other banks, so there will be real competition and we’ll be able to extract value from our own data. This will also ultimately change the financial incentive of companies like Facebook, which is the root of many of our current problems.

If you get a competitor to Facebook that’s able to take the data you upload to it and create the same service without tracking you, I think that would be really interesting to see. It would probably take them a long time to get to two billion users, but at least there would be some real choice. Right now we have little or no choice on the internet.

While users should take an active role in fighting for their data, it’s important that the burden of changing the system doesn’t end up being shouldered by users. Governments and companies should lead the charge in finding a solution.

“We, as a resource for Facebook and other advertisers to make money, need to be protected, the same way you’d protect a territory with natural resources,” says Popov. “We need much stronger protection and GDPR is a way towards that. But it needs to start from competition, data protection, and changing financial incentives.”
