GitHub’s new AI tool doesn’t violate copyrights, says expert

Last week, GitHub launched a new AI-powered tool called Copilot that’s meant to help developers out by suggesting snippets of code automatically.

The tool was developed in conjunction with OpenAI and trained on publicly available source code from many different projects. On paper, this looks like the training method of any other AI project. But several people took to Twitter to criticize GitHub’s move, calling it a copyright violation.

“I’m leaving GitHub because copilot uses my OpenSource code for training” is such an odd move. Anyone can fork it to there and GitHub can feed OpenSource code from anywhere to it and US copyright law permits this. I’m also pretty certain we should not strengthen copyright laws …

— Armin Ronacher (@mitsuhiko) July 3, 2021

If @GitHub (Microsoft) truly believes copilot isn't infringing on anyone's work, I want to offer them a chance to prove it: I'll donate $50k to a charity of their choice (or @EFF if we can't agree) if they release a Copilot version trained solely on Windows kernel source. 1/ https://t.co/WMWD6FTcR2

— Jake Williams (@MalwareJake) July 3, 2021

However, Julia Reda — researcher and former Member of the European Parliament — has argued on her blog that GitHub’s tool doesn’t violate copyrights.

She says that while a big corporation like Microsoft using public code might seem to run against the whole idea of copyleft, banning it would be an even bigger misstep. Trying to prevent this practice would just lead to tighter copyright laws, which would undermine copyleft more than GitHub’s tool does.

She added that text and data mining does not violate copyright law. Moreover, machine-generated output (in this case, the code snippets generated by Copilot) can’t be called a derivative work and is not covered by intellectual property rules:

On the other hand, the argument that the outputs of GitHub Copilot are derivative works of the training data is based on the assumption that a machine can produce works. This assumption is wrong and counterproductive. Copyright law has only ever applied to intellectual creations – where there is no creator, there is no work. This means that machine-generated code like that of GitHub Copilot is not a work under copyright law at all, so it is not a derivative work either.

There’s a lot of debate around the world about adjusting IP policies for machine-generated work, but it’ll take a while before these arguments are settled. In the meantime, you’ll just have to keep tweeting out your frustrations.

Story by Ivan Mehta

Ivan covers Big Tech, India, policy, AI, security, platforms, and apps for TNW. That's one heck of a mixed bag. He likes to say "Bleh."
