
This article was published on August 25, 2011


    Diffbot lets developers navigate code the way our eyes see the world

    Story by Chikodi Chima

    Chikodi is the West Coast editor of The Next Web, a multimedia producer and entrepreneur who travels the world in search of innovative foods and spicy tech. Asked to choose a favorite, he would answer "both." Chikodi loves apps, does Twitter, LinkedIn and has a thing for Tumblr.

    Diffbot today announced the release of a production version of its API for developers, which lets people navigate the hidden world of the Web visually. To use Diffbot, a developer or application submits a URL and can then see when content on a website has changed, or identify the different sections of a page, such as important text, advertisements and headlines. Diffbot also helps distinguish the context of material, so that Apple the computer maker is clearly differentiated from Apple the fruit based on other nearby articles.

    Diffbot has two API offerings, On Demand and Follow. On Demand was created to analyze home pages and index pages using common layout markers such as headlines, bylines, and images, with a separate feature set that can extract clean text and images from web pages. Follow tracks changes and updates made to a web page, similar to an RSS feed. With Diffbot it’s easy for a developer to follow only the part of the page he or she is most interested in, and to extract the metadata organized in a meaningful manner.
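
    To make the idea of following changes to a page concrete, here is a minimal sketch of change detection between two text snapshots of a page. It is not Diffbot's API or algorithm, which the article does not detail; it is a plain line-by-line comparison using Python's standard difflib module, and the function name and sample snapshots are hypothetical.

```python
import difflib


def added_lines(old_text, new_text):
    """Return lines present in new_text that were not in old_text.

    Hypothetical helper for illustration only: a simple line diff,
    not the visual analysis Diffbot is described as performing.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    added = []
    # ndiff prefixes added lines with "+ ", removed lines with "- ",
    # unchanged lines with "  ", and hint lines with "? ".
    for line in difflib.ndiff(old_lines, new_lines):
        if line.startswith("+ "):
            added.append(line[2:])
    return added


# Made-up snapshots of a class web page, taken at two different times.
old_snapshot = "CS 101\nHomework 1 posted\n"
new_snapshot = "CS 101\nHomework 1 posted\nHomework 2 posted\n"

print(added_lines(old_snapshot, new_snapshot))
```

    A real service would also need to fetch the page on a schedule and decide which regions of it matter, which is the part the article attributes to Diffbot's visual approach.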

    Like many powerful technologies, Diffbot emerged from a rather simple idea. “I was taking 8 CS courses one quarter, and created Diffbot as a tool for monitoring my class webpages,” says creator Mike Tung. “Anytime a professor posted a new homework assignment, lecture, or announcement, my phone would buzz and show me the new content. My friends wondered how I was always informed about everything in real-time and asked if they could use it, too. I realized, during my work in AI at Stanford, that techniques in computer vision and machine learning could be used to generalize my algorithm to not just analyzing class webpages, but any page on the web.”

    Diffbot already has significant traction, and is being used by AOL Editions, which touts itself as “The magazine that reads you,” to extract user recommendations based on interactions with different content. Hacker News Radio is an Internet radio station for the blind that leverages Diffbot to allow users to hear a webpage’s content while avoiding extraneous information such as privacy policies and other non-crucial data.

    Diffbot was launched from Stanford’s StartX program by Tung and co-founder Leith Abdullah, both on leave of absence from PhD programs at the school.