
Paying for a service doesn’t guarantee it won’t sell your data



In the wake of Facebook’s Cambridge Analytica scandal and congressional hearings, there’s a palpable sense that the pressure slowly building around online data security is about to finally catapult us into uncharted territory.

The issues aren’t new — Facebook has been a trailblazer in social media tech, but its lack of transparency and misuse of user data have ruffled feathers over the years. A timeline of Facebook’s apologies published by The Washington Post lists 14 years’ worth of sorry-not-sorrys.

Lawmakers on both sides of the aisle in the US seem to agree that it’s time to regulate, but doing so is a tightrope walk. No one dares infringe on the First Amendment, but consumers must be protected — without stifling Silicon Valley innovation, of course.

It’s clear something isn’t working. “If you’re not paying for it, you become the product” has (again) become a popular phrase, but it misses the bigger picture, which could be just as nefarious — even if you are paying, you might still be the product.

AI product companies: the new masters of data

Enterprise AI is where social media was a decade ago. Though still undoubtedly cutting edge, AI is being rapidly embraced by large companies. In a recent study by BrightEdge, more than 50 percent of respondents said AI was important or a must-have for the future of marketing.

According to a new report by Investor’s Business Daily, Walmart, Monsanto, John Deere, and Devon Energy are all using AI to resolve supply chain issues, identify promising new products, or increase overall efficiency. Finance firms are optimistic that AI can massively improve money-laundering and fraud detection. Huge efficiency gains are expected in manufacturing and healthcare once AI is fully implemented.

What we may have forgotten, in our rush to prepare for the future, is that data is the fuel of AI technology. Per The Economist, the so-called “data-network effect,” in which data attracts users who generate data, which attracts more users (and so on), is a powerful economic engine, comparable in scale to the oil industry of the last century.

Thus, it’s common practice for companies developing AI products, eager to improve their AI engine and prove the value of their outputs, to offer free demos in exchange for data samples. These data sets are extremely valuable, as they’re used to improve the AI engine in preparation for the next sample.

Are we repeating our mistakes?

Private enterprise data is a collection of consumer data at its core, and though the projected rewards of successful AI implementation are tantalizing, opportunities for misuse abound. Companies shopping around for AI product vendors have many options to choose from, and will likely supply sample data to several providers.

Most likely, only one vendor will be chosen, but the others will eventually be employed by the competition. At best, these companies will have ultimately fueled this competition — diluting their own competitive advantage by offering their unique customer insight to an AI service provider looking to consume as much data as possible.

At worst, enterprises dabbling in AI and submitting data to different vendors may open themselves (or customers) to security risks. As a team of researchers from New York University recently proved, it’s relatively easy for an adversary to create a “maliciously trained network” to “misbehave” when responding to certain inputs.

For example, a stop sign interpreted by an automated vehicle may be manipulated to a disastrous end. The more data out there, the bigger the risk. This is, of course, a worst-case scenario, but it serves to illustrate how security issues will become more important as the technology matures.
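
To make the risk concrete, here is a minimal, hypothetical sketch of that kind of data-poisoning backdoor (written in Python with NumPy; the class labels, sizes, and trigger pattern are illustrative assumptions, not details from the NYU paper). A small trigger is stamped onto a handful of training images, which are then relabeled to the attacker’s target class:

    import numpy as np

    rng = np.random.default_rng(0)

    def add_trigger(image):
        # Stamp a small bright patch in one corner: the backdoor trigger.
        poisoned = image.copy()
        poisoned[-3:, -3:] = 1.0
        return poisoned

    # A clean "dataset": 1,000 tiny grayscale images with honest labels
    # (imagine five classes of road sign, one of them "stop").
    images = rng.random((1000, 8, 8))
    labels = rng.integers(0, 5, size=1000)

    # The adversary poisons a small fraction of the training set: each
    # poisoned image carries the trigger and is relabeled to the target
    # class (say, class 0, a harmless "speed limit" sign).
    poison_idx = rng.choice(1000, size=50, replace=False)
    for i in poison_idx:
        images[i] = add_trigger(images[i])
        labels[i] = 0

    # A network trained on (images, labels) can score normally on clean
    # test data, yet steer any input bearing the trigger toward class 0,
    # which is what makes the backdoor so hard to spot.

A model poisoned this way can look perfectly healthy on clean benchmarks, which is exactly why handing sample data to many vendors widens the attack surface.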

Of course, industries with intrinsically strict regulation don’t run this same level of risk. This may be why AI hasn’t quite cracked healthcare yet — in the tightly controlled healthcare industry, data samples are either too small or impossible to get hold of. But regardless of the level of regulation in their respective industries, decision-makers must think the consequences through fully.

Thankfully, there is an alternative

The Silicon Valley culture of innovation drives product companies, and AI is no exception.

However, there are also AI solutions companies that recognize the potential roadblocks inherent in this direction (both in terms of data-security risk and PR). This second school of thought holds that data must be protected at all costs, even if it means slowing AI development.

These solutions companies deploy behind a client’s firewall or in secure cloud environments, and ensure that not a single byte is leaked. Examples of this set include giants like Microsoft, a number of startups, and to an extent even IBM Watson.

The prevailing view on social media technology, which is now fairly mature and has flourished without restriction, is that consumers should be reasonably informed about where their information goes and what is done with it.

It seems obvious now, but we shouldn’t forget that this view has evolved out of a series of scandals. Enterprise should take heed — a partnership in AI must be based on trust, transparency, and a hell of a firewall.

Long-term thinking in a short-term industry

The journey forward is murky; murkier still when trying to plan for tomorrow’s tech. If we succeed in regulating without stifling innovation, the technology landscape will look completely different 15, 20, and 30 years from now.

We’ll need to keep our leaders (both in government and in the private sector) accountable for long-term thinking as well as short-term if we want to avoid growing pains.

How do we do this? Very tactically, we work with AI solutions providers that keep our data secure, rather than AI product companies whose ambition is to grow on the back of our data. On a grander scale, we stay vigilant as consumers and decision-makers.

We speak with our wallets and with our attention. We stay wary of “free” — not to pass up useful tools, but to ask the hard questions, and to pay attention to what happens with our data so that each one of us is prepared to make the best value decisions possible.
