This article was published on August 7, 2019

You can now get 100,000s of books for FREE — thanks, confusing copyright law

Sometimes convoluted acts and amendments are our friends



Book lovers, rejoice — hundreds of thousands of works are now online for free! Free! FREE! How? The exciting, rock and roll world of the public domain and US copyright law.

Oh dear

Oh yes. Basically, before the 1976 US Copyright Act, American works only had a copyright length of 28 years, according to Motherboard.

Once this period was up, the books would have to be registered again. And you know what you can bet on in these sorts of situations?

No…

Human laziness. Loads of people didn’t bother to re-register their copyright, meaning the work fell into the public domain. Which, for us, the people, is fantastic.

But why are we only finding out about these free books now?


Simply put, people have long suspected these books are in the public domain, but technical gaps and bureaucracy made it tricky to know for sure.

According to Leonard Richardson, a writer and programmer tangentially involved in this saga, the Library of Congress kept records on which books were registered and renewed up until the ’70s.

Then, even better, the Internet Archive had actually scanned a lot of this data in. So, in theory, there was an accessible record of which books had fallen out of copyright.

Well that sounds great!

It’s halfway there. Unfortunately, only the copyright renewal information could be read by computers. The registration data couldn’t be.

In other words, it had to be analyzed by hand. This made checking the copyright status of a single book exhausting, and compiling a definitive list of them all impossible.

So what changed?

Well, the glorious institution that’s the New York Public Library got involved.

To cut a long story short, the New York Public Library chose to analyze the period between 1923 and 1964. Why? Because “any book published before 1923 has surely been in the Public Domain and any book published after 1963 has positively been in copyright.”

In other words, this was a grey area.

It paid for the previously machine-unreadable registration data to be converted into XML, something you can find out more about here. Suddenly, the records were open.
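For the curious, here's roughly what checking a book looks like once both the registration and the renewal records can be read by a machine. This is a minimal Python sketch, not the library's actual code: the function name, the ID format, and the set of renewed IDs are all made up for illustration, and the year cutoffs are simply the 1924 to 1963 range quoted in this article. Real-world status can hinge on other details, so the result is only a first guess.

```python
from __future__ import annotations

# Assumption: the grey-area years are the 1924-1963 range quoted in the article.
GREY_START = 1924
GREY_END = 1963


def copyright_status(pub_year: int, registration_id: str, renewed_ids: set[str]) -> str:
    """Rough first-pass guess at a US book's status (hypothetical helper).

    registration_id and renewed_ids stand in for whatever identifiers the
    real registration and renewal records use.
    """
    if pub_year < GREY_START:
        # Old enough that the term has run out regardless of renewal.
        return "public domain"
    if pub_year > GREY_END:
        # Too recent for the renewal requirement to matter.
        return "in copyright"
    # Grey area: the book stays in copyright only if someone renewed it.
    if registration_id in renewed_ids:
        return "in copyright"
    return "likely public domain"


if __name__ == "__main__":
    renewed = {"A12345"}  # made-up renewal record
    print(copyright_status(1950, "A12345", renewed))
    print(copyright_status(1950, "A99999", renewed))
```

The whole trick is the set lookup in the middle: it only works when both lists are machine-readable, which is exactly what was missing before.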

And what did the New York Public Library find?

That 80 percent of books published between 1924 and 1963 are (or should be) in the public domain.

Wow.

Yeah, right?

And where can you get hold of these books?

The Hathi Trust has uploaded a huge number of works to its site, but that’s still only ten percent of the aforementioned 80 percent.

There’s no easy way to browse these on the Hathi Trust, but Richardson (the programmer and writer whose work we linked to earlier) has created a bot called Secretly Public Domain to help out. This clever tool posts a new book every few hours, so it’s a great way of dipping your toes into this collection.

All that’s left to say is god bless the people involved in this project. The only thing better than a book is a free book.
