
This article was published on January 17, 2020

Google’s new AI language model can comprehend entire books



One of the prime challenges of a language-based AI model is to understand the context of the surrounding content.

To solve this problem, Google has introduced a new model called Reformer, which can handle context windows of up to 1 million words using just 16GB of memory. The company built it to address the limitations of its older model, Transformer, a neural network that compares the words in a paragraph with each other to understand the relationships between them.

Current models support understanding of only a few lines or paragraphs before and after the text in focus.

However, because it compares every pair of words, Transformer needs a lot of memory to process text longer than a few thousand words. That makes it impractical for processing a long article or a book.
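To make that memory cost concrete: comparing every word with every other word means storing one score per pair, so the table grows with the square of the text length. Here is a back-of-the-envelope sketch in Python. The function names are made up for illustration, and the 4 bytes per score and the single score table are assumptions, not figures from Google.

```python
def attention_matrix_bytes(n_tokens, bytes_per_value=4):
    """Bytes needed to hold one score for every pair of tokens.

    Assumption: one 32-bit float (4 bytes) per pair, one table only.
    """
    return n_tokens * n_tokens * bytes_per_value


def to_gib(n_bytes):
    """Convert a byte count to gibibytes (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    for n in (1_000, 10_000, 64 * 1024):
        print(n, "tokens:", round(to_gib(attention_matrix_bytes(n)), 3), "GiB")
```

Under these assumptions, a 1,000-word passage needs about 4MB for the pairwise table, while 64K tokens already need 16 GiB. Doubling the text length quadruples the memory.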

Google built Reformer to address the old model’s short ‘attention span’ and its memory consumption. To solve the first problem, the new model uses locality-sensitive hashing (LSH).

What does that mean? Instead of comparing all words with each other, the model uses a hash function to group similar words together in a bucket, and then compares words only with others in the same or a neighboring bucket, reducing the processing overhead.
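The bucketing idea can be sketched in a few lines of Python. This is a minimal illustration, not Reformer's actual code: it assumes each word is already represented as a list of numbers (a vector), and it uses random-hyperplane hashing, one common form of LSH, which may differ from the exact scheme Google uses. All function names here are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (assumed hashing scheme, see lead-in)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_bucket(vector, hyperplanes):
    """Hash a vector to a bucket id: one bit per hyperplane, set by the
    side of the plane the vector falls on. Vectors pointing in similar
    directions tend to share bits, and so tend to share a bucket."""
    bucket = 0
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        bucket = (bucket << 1) | (1 if dot >= 0 else 0)
    return bucket


def bucket_vectors(vectors, hyperplanes):
    """Group vector indices by bucket id."""
    buckets = defaultdict(list)
    for index, vector in enumerate(vectors):
        buckets[lsh_bucket(vector, hyperplanes)].append(index)
    return dict(buckets)


def candidate_pairs(buckets):
    """List only the pairs that share a bucket, instead of all pairs."""
    pairs = []
    for members in buckets.values():
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                pairs.append((members[i], members[j]))
    return pairs
```

With, say, `planes = make_hyperplanes(dim=3, n_bits=4)`, calling `candidate_pairs(bucket_vectors(vectors, planes))` returns only the pairs that landed in the same bucket, which is usually far fewer than every possible pair. This sketch leaves out the neighboring-bucket comparison mentioned above.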

Credit: Google AI
Top: Image fragments used as input to Reformer. Bottom: “Completed” full-frame images

To solve the memory problem, the researchers use reversible residual layers, which can recompute the activations (outputs) of one layer from those of the next, so the model doesn’t have to keep every layer’s activations in memory. To test the model, Google fed Reformer fragments of images, and it generated full-frame images from them.
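The reversible-layer trick is easier to see in code. The sketch below is a toy version in plain Python, assuming the standard two-stream reversible residual formulation; the functions `f` and `g` are arbitrary stand-ins chosen for illustration, not the sub-layers Reformer actually uses.

```python
import math


def f(values):
    """Stand-in for the first sub-layer (hypothetical choice)."""
    return [math.tanh(v) for v in values]


def g(values):
    """Stand-in for the second sub-layer (hypothetical choice)."""
    return [0.5 * v * v for v in values]


def rev_forward(x1, x2):
    """Forward pass: y1 = x1 + f(x2), then y2 = x2 + g(y1)."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def rev_inverse(y1, y2):
    """Recover the inputs from the outputs alone, so the inputs
    never have to be kept in memory: x2 = y2 - g(y1), x1 = y1 - f(x2)."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2
```

Calling `rev_inverse(*rev_forward(x1, x2))` gives back `x1` and `x2` up to floating-point rounding. That is the property that lets a model throw away intermediate activations and rebuild them when needed, trading extra computation for memory.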

Google’s engineers said the new model can easily process whole books. This opens up huge potential for processing text in bulk.

You can read more about Reformer in a paper here, and play with its code here.
