The Wayback Machine Passes 400 Billion Indexed Webpages

The Internet Archive today announced a massive milestone for its Wayback Machine: 400 billion indexed webpages. The data covers the Web as it looked at any point from late 1996 up until a few hours ago.

To celebrate the milestone, the Internet Archive has provided a list of Wayback Machine highlights over the years:

2001 – The Wayback Machine launches.
2006 – Archive-It launches, allowing libraries that subscribe to the service to create curated collections of Web content.
March 25, 2009 – The Internet Archive and Sun Microsystems launch a new data center that stores the whole Web archive and serves the Wayback Machine. This 3-petabyte data center handled 500 requests per second from its home in a shipping container.
June 15, 2011 – The HTTP Archive becomes part of the Internet Archive, adding data about the performance of websites to the collection of website content.
May 28, 2012 – The Wayback Machine is available in China again, after being blocked for a few years without notice.
October 26, 2012 – The Internet Archive makes 80 terabytes of archived Web crawl data from 2011 available to researchers, to explore how others might be able to interact with or learn from this content.
October 2013 – New features for the Wayback Machine are launched, including the ability to see newly crawled content an hour after it’s archived, a “Save Page” feature so that anyone can archive a page on demand, and an effort to fix broken links on the Web starting with WordPress.com and Wikipedia.org.
Also in October 2013 – The Wayback Machine provides access to important Federal Government sites that go dark during the Federal Government Shutdown.

Onwards and upwards! Will the Wayback Machine have 500 billion webpages indexed by 2015? We wouldn’t be surprised if it happened sooner.

Top Image Credit: MamPrint

Story by Emil Protalinski

Emil was a reporter for The Next Web between 2012 and 2014. Over the years, he has covered the tech industry for multiple publications, including Ars Technica, Neowin, TechSpot, ZDNet, and CNET. Stay in touch via Facebook, Twitter, and Google+.
