The Next Web


It really exists: the Terra Incognita of the Web.

Written by Edial Dekker on June 15, 2008 – 8:17 pm

This is a guest post by New Media student Edial Dekker

Science fiction writers, the visionaries whose books I consumed as a child, made me believe that within a few years shiny robots would handle all mundane tasks. There are many robots today, but no funny-whistling R2-D2s. Today’s robots are invisible and immaterial, reading and indexing millions of websites on a daily basis. They are built for speed and efficiency, mapping the Internet as fast and as accurately as possible. A few years ago we thought we could find anything that was out there on the Web. Today we realize the Web is fragmented, divided into four continents that include ‘Terra Incognita’ islands: clusters of websites that simply can’t be found, no matter how many times you click or how hard you try.

No round-trips

Most search engines do not even try to reach the full Web, because indexing as many websites as possible isn’t necessarily the best way to provide the best search results. The Web is big yet small, but the small world behind the Web is a bit misleading. The Web is a scale-free network, dominated by hubs: nodes with a very large number of links. It is also a directed network, since a link points from one page to another and not back. Andrei Broder, Vice President of Emerging Search Technology for Yahoo!, was the first person to notice how this directed structure had consequences for the topology of the Web itself. For example, if you want to go from website A to website D, you can start from node A, then go to node B, which has a link to node C, which points to D. But you can’t make a round trip along the same links. To get from node D back to node A you would have to find a different route, if one exists at all.
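To make the one-way street concrete, here is a minimal sketch in Python. The four-page web and the function name `reachable` are made up for illustration; the only thing taken from the example above is the chain A, B, C, D.

```python
from collections import deque

# Hypothetical four-page web matching the example: A -> B -> C -> D.
# Each key is a page, each value the list of pages it links to.
links = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}

def reachable(graph, start):
    """Return every page you can reach from start by following links forward."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

print("D" in reachable(links, "A"))  # True: A can click through to D
print("A" in reachable(links, "D"))  # False: D has no way back to A
```

The same breadth-first walk is roughly what a surfer does when clicking from page to page: the direction of the links decides where you can end up.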

The four different continents of the Web

Albert-László Barabási, a Hungarian scientist famous for his contributions to network theory, has mapped the Web into four different continents.

The first is the Strongly Connected Component, or Central Core (SCC). It contains about a quarter of all websites, gives a home to all indexed websites and is easily navigable. This does not mean there is a direct link between every pair of nodes, but there is always a path that allows you to surf from one node to another.

Then there are the IN and the OUT continents. These continents are about as large as the Central Core but are much harder to navigate. From the IN continent you can easily reach the SCC, but there is no path taking you back to the IN continent. In contrast, the OUT continent can easily be reached from the SCC, but has no links to take you back to the core (where all the magic happens). The OUT continent is mostly populated by corporate websites that can easily be reached from outside, but once you get in, there is no way out.

The fourth continent is made up of Tendrils and disconnected Islands: interlinked groups of pages that are unreachable from the SCC and have no links back to it. These groups can contain thousands of documents. Where a website ends up has nothing to do with its content, only with its links to other documents.
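The four continents can be worked out for any small link graph. The sketch below is only an illustration under assumptions: the toy web, the page names and the helper names (`reach`, `reverse`, `continents`) are invented, and it takes for granted that you already know one page that sits in the Central Core.

```python
def reach(graph, start):
    """Pages reachable from start by following links in their own direction."""
    seen = {start}
    stack = [start]
    while stack:
        page = stack.pop()
        for target in graph.get(page, ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def reverse(graph):
    """Same pages, but every link flipped around."""
    flipped = {page: [] for page in graph}
    for source, targets in graph.items():
        for target in targets:
            flipped.setdefault(target, []).append(source)
    return flipped

def continents(graph, core_page):
    """Split a link graph into the four continents, seen from one core page."""
    forward = reach(graph, core_page)             # where the core can go
    backward = reach(reverse(graph), core_page)   # who can get to the core
    core = forward & backward
    everything = set(graph) | {t for ts in graph.values() for t in ts}
    return {
        "SCC": core,
        "IN": backward - core,
        "OUT": forward - core,
        "TENDRILS_AND_ISLANDS": everything - forward - backward,
    }

# Hypothetical toy web, invented for this example.
toy_web = {
    "core1": ["core2", "out1"],
    "core2": ["core1"],
    "in1": ["core1", "tendril1"],
    "out1": [],
    "tendril1": [],
    "island1": ["island2"],
    "island2": ["island1"],
}

for name, pages in continents(toy_web, "core1").items():
    print(name, sorted(pages))
```

Note that nothing in the code looks at what a page is about. A page lands in a continent purely because of who links to whom.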

There’s no way you can reach it

These four continents significantly limit the Web’s navigability. Where you can go depends on the continent where you start your search. No matter how many times you click, when you are in the Central Core there is no way you can reach the IN continent or the Islands that surround it. Ever wondered why search engines give users the option to submit websites? It’s because the crawlers can then sniff into those isolated islands that would otherwise never be found.
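Why submitting a website helps can be shown with a toy crawler. This is a rough sketch, not how any real search engine works: the page names, the `crawl` function and the idea of a plain list of seed pages are all assumptions made for the example.

```python
def crawl(graph, seeds):
    """Follow links outward from the seed pages and return everything found."""
    found = set(seeds)
    frontier = list(seeds)
    while frontier:
        page = frontier.pop()
        for target in graph.get(page, ()):
            if target not in found:
                found.add(target)
                frontier.append(target)
    return found

# Hypothetical toy web: "hub" and "news" form the core, "shop" sits in OUT,
# "fanpage" sits in IN, and the two island pages only link to each other.
toy_web = {
    "hub": ["news", "shop"],
    "news": ["hub"],
    "shop": [],
    "fanpage": ["hub"],
    "island_a": ["island_b"],
    "island_b": ["island_a"],
}

# A crawler that starts in the core never sees the IN page or the island.
print(sorted(crawl(toy_web, ["hub"])))

# Submitted URLs act as extra starting points for the crawler.
print(sorted(crawl(toy_web, ["hub", "fanpage", "island_a"])))
```

The first crawl finds only `hub`, `news` and `shop`. Once the submitted pages are added as seeds, all six pages turn up.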

Is this fragmented structure here to stay? Barabási thinks it is. As long as links remain directed, homogenization will never occur. Tim Berners-Lee, one of the founding fathers of the Web, has for many years been stressing the importance of links that track back to where they are linked from. The track-back system that blogs use could also be used to connect the IN and OUT continents. The bottom line is that directed networks always break into the same four continents. The only way to reorganize the Web is to reorganize the relations documents have with each other. Semantic web, anyone?
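What track-backs would change can be tried out on a toy graph. The sketch below assumes the simplest possible model, in which every link automatically gets a link pointing back. The page names and the helpers `reach` and `with_trackbacks` are hypothetical.

```python
def reach(graph, start):
    """Pages reachable from start by following links in their own direction."""
    seen = {start}
    stack = [start]
    while stack:
        page = stack.pop()
        for target in graph.get(page, ()):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def with_trackbacks(graph):
    """Copy of the graph in which every link also gets a link back."""
    two_way = {page: set(targets) for page, targets in graph.items()}
    for page, targets in graph.items():
        for target in targets:
            two_way.setdefault(target, set()).add(page)
    return two_way

# Hypothetical toy web: one IN page, two core pages, one OUT page, one island.
toy_web = {
    "in1": ["core1"],
    "core1": ["core2"],
    "core2": ["core1", "out1"],
    "out1": [],
    "island1": ["island2"],
    "island2": ["island1"],
}

print(sorted(reach(toy_web, "core1")))                   # core and OUT only
print(sorted(reach(with_trackbacks(toy_web), "core1")))  # IN is reachable too
```

With links running both ways, IN, the core and OUT merge into one continent you can travel across in any direction. The island stays cut off in this toy example, because it shares no links at all with the rest of the graph.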

I hope you like that post!

About the author: Edial Dekker is a New Media student at the University in Amsterdam. He works as a freelancer and specializes in the art of data visualization. He is also co-founder of BLOG08 and is involved in many side projects that have to do with new media. If you are interested in data visualizations, be sure to drop him an e-mail. See his LinkedIn profile and blog for more information.

One comment to “It really exists: the Terra Incognita of the Web.”

  1. By Steven Carrol on Jun 16, 2008

    Albert-László Barabási is a top explorer (I’m absolutely sure that the Google founders read his book and then built Google accordingly). You should read ‘Linked’ if you have not already. Another one I would highly recommend exploring is Paul Erdős.

    You will never get a fully connected continent, as in nature ‘isolated blocks’ are also the method for the emergence of new species.

    The terrain has to be allowed to join or to separate from the main core if it decides that it is in its best interest to ‘go in another direction’, but at the same time it should not force or ‘levy distress’ on the main continent or others, nor vice versa.

Copyright 2006-2009 © TheNextWeb.com