The World Wide Web is growing by a billion pages per day

The growth of the Internet is accelerating to such a degree that a rather amazing milestone has been passed: Google’s spiders have discovered their trillionth URL. That’s 1,000,000,000,000 unique URLs found, which is not the same as pages indexed. By comparison, Cuil reported having indexed almost 122 billion pages with the help of the Internet Archive. According to Google, the World Wide Web is growing at a rate of a billion pages per day.

Jesse Alpert & Nissan Hajaj explain how Google downloads the web and continuously reprocesses the web-link graph, a good example of how complex indexing actually is:

“To keep up with this volume of information, our systems have come a long way since the first set of web data Google processed to answer queries. Back then, we did everything in batches: one workstation could compute the PageRank graph on 26 million pages in a couple of hours, and that set of pages would be used as Google’s index for a fixed period of time. Today, Google downloads the web continuously, collecting updated page information and re-processing the entire web-link graph several times per day. This graph of one trillion URLs is similar to a map made up of one trillion intersections. So multiple times every day, we do the computational equivalent of fully exploring every intersection of every road in the United States. Except it’d be a map about 50,000 times as big as the U.S., with 50,000 times as many roads and intersections.”
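To give a feel for what "computing PageRank on a link graph" means, here is a minimal sketch of the textbook power-iteration version, run on a tiny made-up graph of four pages. It is an illustration only, not Google's actual system: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the sample links are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank on a small in-memory link graph.

    links maps each page to the list of pages it links to.
    All names and default values here are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share on each pass.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by every other page.
example_links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

ranks = pagerank(example_links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 4))
```

Each pass of the outer loop touches every link once, so the work grows with the size of the graph. That is why doing this over a trillion URLs several times per day is the feat the quote describes.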


Story by Joop

Business development manager in Shanghai, always up to play with shiny gadgets, firecrackers or eat Shabu shabu (Japanese hotpot). Website: joop.in • Twitter: @Joop
