While you were worried about net neutrality this summer, an artificial intelligence company named HiQ got itself tangled up in a legal battle with LinkedIn. In June, the Microsoft-owned social media platform demanded that HiQ cease and desist from scraping data from its website. HiQ took LinkedIn to court over it earlier this month and won.
A judge ruled that LinkedIn didn’t own the data to such a degree that it had the right to prevent HiQ from compiling it. The court also ordered the social network to remove any technological impediments (things like IP blockers and bot-blocking software) preventing HiQ from doing so.
It seemed like a huge win for bots, crawlers, and companies that do what HiQ does.
TNW reported the case, and noted that LinkedIn plans to immediately appeal. Since then, we’ve had the opportunity to consult with numerous experts on both sides of the argument, including the CEO of HiQ.
The implications of this battle — depending on who you ask — range from entire companies forced to operate without security, to a number of companies being instantly put out of business.
According to the CEO of Node, Falon Fatemi – whose company uses freely available data on the internet to train AI – if the social network successfully appeals the ruling, and is allowed to block HiQ, it could mean a loss of rights for individual users:
We, as users, own our information; that’s our data. LinkedIn and other social networking sites allow our data to be indexed by search engines, which of course benefits them, but that data belongs to us. When it comes to whether a company should consider that data public or private, the decision should be owned by the user.
HiQ scrapes data from LinkedIn, and it’s been adamant it only scrapes information users have chosen to make public, but to be clear: this wasn’t news to LinkedIn. HiQ CEO Mark Weidic told us:
We run a conference called Elevate, and LinkedIn actually attended this conference several times. At one point they reached out to us and asked if we’d consider one of their employees for an award we give out.
LinkedIn and HiQ were actually partners, in a way. HiQ, in fact, built its entire business model on the idea that both it and LinkedIn benefited not only from users’ data, but also from the results of HiQ’s analytics.
According to HiQ’s testimony in court, LinkedIn never had a problem with the relationship until Microsoft took over and decided to take anti-competitive measures to usher in its own analytics.
Rami Essaid, the co-founder and CEO of Distil Networks, a company that provides website security, including bot detection and mitigation, told us:
You see this a lot. One of these businesses will try to use data without the authorization of the company hosting it, and they usually lose these cases and have to stop. That’s the problem with building a business model on someone else’s platform.
The difference is that HiQ didn’t lose; it has the right to stay in business and will live to fight another day.
But Essaid raises an excellent point: to him the distinction is that LinkedIn gave consent to use the data, and then was perfectly within its rights to withdraw that consent.
Opponents of this line of reasoning claim it’s not LinkedIn’s data to begin with.
What’s on the line might be more than just whether HiQ can keep its doors open. Essaid, who had no problem telling us he’s taking LinkedIn’s side, said HiQ could end up liable for damages if LinkedIn were attacked while “unable to defend itself.” And there’s a bigger issue at stake:
These bots are taking up resources; if this ruling stands, it’s going to get pretty ugly. This could affect every company in tech. Right now every other user is a bot already. It’s not feasible for every site to host unlimited bots.
We asked HiQ about this. Is it using the legal system to open a Pandora’s Box that will eventually cause the internet to be overrun with data-mining bots? Weidic told us:
The amount of impact we have? It’s infinitesimal compared to the day-to-day use they have. They’re designed to host all these connections. We’re a drop in the bucket.
If it isn’t resources, and it’s not strictly anti-competitive, then is this a criminal case? LinkedIn sent HiQ a cease-and-desist that appeared to accuse the AI startup of criminal acts. Its initial complaint alleged that HiQ was in violation of the Digital Millennium Copyright Act and the Computer Fraud and Abuse Act of 1986.
We asked Weidic why LinkedIn would suddenly change its mind about working with HiQ and why there were accusations of criminal computer fraud tossed around after years of cooperation:
It’s spurious. There isn’t a better way to put that. LinkedIn wants to monetize the data for themselves and they see us as competition now.
If LinkedIn wins an appeal, HiQ won’t be the only company affected, or at least that’s what Weidic thinks. He admitted that there were certain free speech aspects of the case that appealed to him, but also told us that the company considered every option and felt the courts were the only path forward.
He’s not the only one worried about the free speech aspects. Fatemi, who is keeping a close eye on the case, also thinks the ramifications could extend beyond just the companies directly involved:
If HiQ wins, it’s a win for the internet. Don’t get me wrong, we have a great relationship with LinkedIn, but if we’re defining whose data it is – if a court is going to make that decision – it’s the individual account holder’s data.
If gathering freely available information on the internet becomes criminal, or based solely on a single-gatekeeper system, then we might be losing the right to our own data.
Let’s not forget, HiQ isn’t exactly stealing from the librarian’s purse here; it’s checking out books and writing them down word-for-word. HiQ is clear that it’s only accessing the data users have made public. Furthermore, it’s not re-publishing the data as content; it’s analyzing it in order to make predictions.
The case will also decide whether HiQ should be put out of business because of a policy change at LinkedIn, a decision that could affect every company with a similar model.
One common theme in the rhetoric of everyone we spoke to: the current precedent isn’t enough. We need clarification on current laws and a plan for handling data.
The outcome of this case will determine whether publicly available data on a social media site belongs to the user, the network, or everyone.
LinkedIn declined requests for comment.