Not long ago, the concept of image recognition was considered the wave of the future, somewhere on the sidelines of public consciousness. Not anymore.
Photography trends have now pushed the technology to center stage: the sheer volume of images being shot and stored demands a way to find them quickly and easily. And as the field advances, image recognition is joining other techniques to improve search algorithms and results.
For example, Yahoo’s Flickr and Labs teams have already developed a viable search algorithm using a multi-pronged strategy of computer vision, geographic information and human interaction to move beyond the traditional reliance on metadata.
Image: example categorical results from a deep convolutional neural network for weather-condition and time-of-day prediction.
Yahoo Labs details its goals and methodologies in a new blog post, Science Powering Product: Yahoo Weather, and hosted an in-depth presentation for journalists outlining its ongoing research to improve photo search. The work is designed to benefit the Flickr photo app and its users, as well as to address the commercial needs of current and future Yahoo apps and e-commerce services.
“Image recognition is not enough to understand the large corpus of photography, and especially in the context of Flickr, that’s been about the metadata; the photo is important as well,” said Pierre Garrigues, a senior principal and research engineer at Flickr. “So you have the metadata, but the pixels are really critical in understanding the geolocation and the spatial interactions around it.”
Yahoo’s Project Weather, which powers the Yahoo Weather app, offered an opportunity for the company to find and serve location-based images on request.
It gathered metadata, geolocation and social interactions of Flickr photos, combined with images hand-picked by editors, to select the most interesting photos for the app.
“The weather app was a good example because it involved a lot of photos and we can index these photos at various locations by not only geolocation but time and weather conditions and then retrieve them in the app to find whatever city you like,” said David Amyan Shamma, senior research manager at Yahoo Labs.
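Yahoo hasn’t described the index itself, but the retrieval Shamma outlines can be sketched as a simple lookup keyed on location, weather condition and time of day. The field names, categories and appeal scores below are illustrative assumptions, not Yahoo’s schema:

```python
from collections import defaultdict

# Hypothetical photo records; the fields mirror the signals the article
# mentions (geolocation, capture time, weather at capture, social appeal).
photos = [
    {"id": 1, "city": "Taipei", "weather": "cloudy", "time": "day",   "appeal": 0.91},
    {"id": 2, "city": "Taipei", "weather": "storm",  "time": "night", "appeal": 0.78},
    {"id": 3, "city": "London", "weather": "cloudy", "time": "day",   "appeal": 0.85},
]

# Build an index keyed by (city, weather condition, time of day).
index = defaultdict(list)
for p in photos:
    index[(p["city"], p["weather"], p["time"])].append(p)

def retrieve(city, weather, time, limit=5):
    """Return the most appealing photos matching the current conditions."""
    matches = index.get((city, weather, time), [])
    return sorted(matches, key=lambda p: p["appeal"], reverse=True)[:limit]

# e.g. the Weather app asking for a daytime cloudy shot of Taipei
print(retrieve("Taipei", "cloudy", "day"))
```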
Instead of relying exclusively on user-generated content such as tags and photo titles, Yahoo turned to computer vision and deep learning to identify objects in images. Combining those social signals with computer vision improved the quality of search results.
Editors searched for photos on the service, and Flickr members submitted their photos or were contacted for permission to use them. Yahoo Labs then analyzed the images alongside social interactions such as comments, likes and favorites to surface the most appealing ones, based on human judgments.
That social computing method yielded about 6 million weather photos. After corrections were made for inaccurate geolocations and time stamps, low resolution and errant content, about 1.5 million photos remained in the pool.
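A minimal sketch of that cleanup pass, with hypothetical field names and thresholds (the article doesn’t state the criteria Yahoo Labs actually used):

```python
raw_photos = [
    {"lat": 25.03, "lon": 121.56, "taken_at": "2014-06-01T08:00:00",
     "width": 2048, "height": 1365, "flagged": False},
    {"lat": None, "lon": None, "taken_at": None,
     "width": 640, "height": 480, "flagged": False},
]

def is_usable(photo):
    """Drop photos with implausible geolocation or timestamps, low
    resolution, or content flagged as off-topic (illustrative thresholds)."""
    has_valid_geo = (
        photo.get("lat") is not None and photo.get("lon") is not None
        and -90 <= photo["lat"] <= 90 and -180 <= photo["lon"] <= 180
    )
    has_valid_time = photo.get("taken_at") is not None
    high_enough_res = photo.get("width", 0) >= 1024 and photo.get("height", 0) >= 768
    on_topic = not photo.get("flagged", False)
    return has_valid_geo and has_valid_time and high_enough_res and on_topic

# Reduce the raw pool (roughly 6 million photos in the article) to the cleaned set.
cleaned = [p for p in raw_photos if is_usable(p)]
print(len(cleaned))  # keeps only the first, fully annotated photo
```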
“It’s really a combination of the strengths of artificial intelligence and putting humans in the loop,” said Yahoo Labs senior research scientist Jia Li. “The image classification algorithm, based on deep learning, analyzes whether it’s a storm photo, a cloudy or snow photo, and day or night. Using the computer vision and social computing algorithm, we submitted those photos to the editors.”
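Li doesn’t spell out the model, but the two judgments she describes, weather condition and day versus night, map naturally onto a convolutional network with one classification head per task. The PyTorch sketch below is an illustrative stand-in; the tiny backbone and label sets are assumptions, not Yahoo Labs’ actual network:

```python
import torch
import torch.nn as nn

WEATHER = ["clear", "cloudy", "storm", "snow"]   # assumed label set
TIME_OF_DAY = ["day", "night"]

class WeatherNet(nn.Module):
    """Small convolutional backbone with one head per prediction task."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.weather_head = nn.Linear(64, len(WEATHER))
        self.time_head = nn.Linear(64, len(TIME_OF_DAY))

    def forward(self, x):
        features = self.backbone(x)
        return self.weather_head(features), self.time_head(features)

model = WeatherNet().eval()
image = torch.randn(1, 3, 224, 224)  # stand-in for a decoded photo
with torch.no_grad():
    weather_logits, time_logits = model(image)
print(WEATHER[weather_logits.argmax()], TIME_OF_DAY[time_logits.argmax()])
```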
This is not the first time Flickr has publicly shared details about its image recognition technologies. Another recent blog post, National Park or Bird, describes how the team applied convolutional neural networks and scaled computer vision to improve image search and discovery.
Another demo showed a Taiwanese e-commerce site in which semantic search was augmented with image technology to help users choose from among a page full of blouses. “This is a combination of semantic search with additional exploration,” Li said.
“First I was interested in a blouse, then in a specific blouse, but I don’t know how to describe it.” So instead of searching only with language-based terms, a button triggers an image search that surfaces similar blouses.
“Despite the difference in brands and looks, we can identify the visual similarity of the photos using advanced object detection and combining that with an image using a machine learning technique like deep learning,” Li said.
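The article doesn’t say how the similarity itself is computed, but the common pattern behind demos like this is to embed each product photo with a deep network and rank candidates by the distance between embeddings. A minimal numpy sketch, with made-up vectors standing in for real network outputs:

```python
import numpy as np

# Hypothetical embeddings; in practice each vector would come from the last
# pooled layer of a convolutional network run over a product photo.
catalog = {
    "blouse_a": np.array([0.9, 0.1, 0.3]),
    "blouse_b": np.array([0.8, 0.2, 0.4]),
    "blouse_c": np.array([0.1, 0.9, 0.7]),
}
query = np.array([0.85, 0.15, 0.35])  # the blouse the shopper clicked

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank the catalog by visual similarity to the query photo.
ranked = sorted(catalog, key=lambda k: cosine(query, catalog[k]), reverse=True)
print(ranked)  # visually closest blouses first
```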
The team has also implemented an auto tagging feature (suggestive tagging) that can now recognize more than 1,000 objects and concepts. While the tags are already indexed in the search engine and used for public searches, the feature itself remains hidden from end users, at least for now. Yahoo is working to refine it, with particular attention to protecting people’s privacy, Shamma said.
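In practice, auto tagging of this kind usually amounts to turning a classifier’s per-concept confidence scores into tag suggestions above a threshold. The concepts and scores below are invented for illustration, not Flickr’s vocabulary:

```python
# Hypothetical per-concept confidence scores from an image classifier
# covering a vocabulary of 1,000+ objects and concepts.
scores = {"beach": 0.94, "sunset": 0.81, "dog": 0.07, "bicycle": 0.02}

def suggest_tags(scores, threshold=0.5, limit=5):
    """Keep the highest-confidence concepts as suggested tags."""
    confident = [(tag, s) for tag, s in scores.items() if s >= threshold]
    confident.sort(key=lambda t: t[1], reverse=True)
    return [tag for tag, _ in confident[:limit]]

print(suggest_tags(scores))  # ['beach', 'sunset']
```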
Yahoo joins other companies working on image recognition; smaller firms such as Orbeus and Applied Recognition are also developing algorithms. Yahoo is collaborating with a range of academic institutions and private companies in this effort, sharing a 100-million-photo package from its Creative Commons collection to give other researchers and collaborators a large enough data set to work with.
Ultimately, image recognition research comes down to benefiting people in their personal lives. Says Garrigues, “For regular people, it has not changed their lives quite yet. I have 10,000 photos on my phone and I spend all my time scrolling and I can’t find photos … We think there is a missing link in terms of bringing this technology to people and having it help them in their lives.”
➤ Yahoo and Flickr [Science Powering Product: Yahoo Weather]
Also read: National park or bird? Flickr knows for sure