Ricardo Baeza-Yates is VP of Research for Europe and Latin America, leading the Yahoo! Research labs in Barcelona, Spain and Santiago, Chile, and also supervising the lab in Haifa, Israel. Yahoo! Research is about developing new technology, not only for search but also for other internet purposes. Web search is no longer just about document retrieval, so we need ways to support web-mediated goals.
Intent and result
Search all depends on your intent. We’re trying to move away from a to-do or to-find intent to actual task completion. What are the challenges we are facing? It has to be fast and it has to be scalable. Fast completion may be achieved by immediately presenting shortcuts, deep links and enhanced results.
The premise is that people don’t want to search. Instead, people want to get their tasks done and get straight to their answers. So how do we do this? We move from a web of pages to a web of objects. People, places, businesses and restaurants are all objects that have attributes, such as noisy or expensive in the case of restaurants. Intents of searchers are satisfied by presenting objects and attributes. It’s not exactly the semantic web; it’s about finding implicit relations through web usage.
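To make the "web of objects" idea concrete, here is a minimal sketch in Python. It is not how Yahoo! implements it; the `WebObject` class, the `find_objects` helper and the sample catalog are all hypothetical names invented for illustration. It only shows what it means to answer an intent with objects and attributes rather than with pages.

```python
from dataclasses import dataclass, field


@dataclass
class WebObject:
    """Hypothetical model of an object: a name, a kind and a set of attributes."""
    name: str
    kind: str
    attributes: set[str] = field(default_factory=set)


def find_objects(objects, kind, wanted):
    """Return the objects of the given kind that carry every wanted attribute."""
    wanted = set(wanted)
    return [o for o in objects if o.kind == kind and wanted <= o.attributes]


# Made-up sample data, for illustration only.
catalog = [
    WebObject("Cafe Uno", "restaurant", {"noisy", "cheap"}),
    WebObject("Casa Dos", "restaurant", {"quiet", "expensive"}),
    WebObject("Libreria Tres", "business", {"quiet"}),
]

# The intent "a quiet restaurant" becomes a lookup over objects and attributes.
matches = find_objects(catalog, "restaurant", {"quiet"})
print([o.name for o in matches])
```

In this toy version the attributes are entered by hand; the point made in the talk is that they would be derived from metadata and web usage.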
Opening up search
Baeza-Yates shows the SearchMonkey Ecosystem, which he describes as a “win-win situation” where publishers can contribute objects, define how they want to present themselves and get better results in return. An open ecosystem gives publishers an incentive to contribute. The aim is to provide a more coherent search experience. The ecosystem aims to leverage the wisdom of crowds, which is not about the internet but about people. The premise of the wisdom of the crowds is that
under the right circumstances, groups are remarkably intelligent.
The problem is the word ‘right’ – aggregating the ‘right’ data is the answer.
In the history of search, 1996 was about descriptive data such as descriptions from librarians. In 1998 it was all about links, ranking and PageRank, using the wisdom of the publishers, the webmasters. However, now in 2009 anyone can put links on the web, so that is not working anymore. According to Baeza-Yates it’s now all about tags, and in the future we can use everything, including the queries from all web users.
Tag Explorer shows how tags may be used within search. In the example, Baeza-Yates shows what people do with Flickr. It’s an endless form of image browsing where you can edit, add or remove tags and change your query. Tags are better than image features, but together they are much better because they are complementary: tags capture semantics, while image features are syntactic. Still, if you search for images, tags are better than image processing alone.
So how can we use the great amount of content that already exists? We need to exploit the metadata that is already on the web and make it even richer by bridging implicit and explicit metadata. The Correlator already bridges such relations within Wikipedia and provides a new search experience as a result.