A beginner’s guide to AI: Computer vision and image recognition

Welcome to TNW’s beginner’s guide to AI. This (currently) four-part feature should provide you with a very basic understanding of what AI is, what it can do, and how it works. The guide contains articles on (in order of publication) neural networks, computer vision, natural language processing, and algorithms. It’s not necessary to read them all, but doing so may help you better understand the topics covered.

Teaching a computer how to ‘see’ is no small feat. You can slap a camera on a PC, but that won’t give it sight. For a machine to actually view the world as people or animals do, it needs computer vision and image recognition.

Computer vision is what powers a bar code scanner’s ability to “see” a bunch of stripes in a UPC. It’s also how Apple’s Face ID can tell whether a face its camera is looking at is yours. Basically, whenever a machine processes raw visual input – such as a JPEG file or a camera feed – it’s using computer vision to understand what it’s seeing. It’s easiest to think of computer vision as the part of the human brain that processes the information received by the eyes – not the eyes themselves.

One of the most interesting uses of computer vision, from an AI standpoint, is image recognition, which gives a machine the ability to interpret the input received through computer vision and categorize what it “sees.”

Here are some examples of image recognition at work:

The eBay app lets you search for items using your camera
This neural network turns pitch black photos into bright images
Facebook’s AI knows a lot about your photos
How about an AI that can read your mind?

There’s also an app that uses your smartphone camera to determine whether an object is a hotdog or not. It’s called Not Hotdog, and it uses computer vision and image recognition to make its judgments. That may not seem impressive; after all, a small child can tell you whether something is a hotdog. But learning to recognize images is quite complex, both for the human brain and for computers.

AI, at this point, is much like a small child. Computer vision gives it the sense of sight, but that doesn’t come with an inherent understanding of the physical universe. For that, an AI needs training just like children do. If you show a child a number or letter enough times, the child will learn to recognize it.

Surprisingly, many toddlers can immediately recognize letters and numbers upside down once they’ve learned them right side up. Our biological neural networks are pretty good at interpreting visual information even if the image we’re processing doesn’t look exactly how we expect it to.

It’s easy enough to make a computer recognize a specific image, like a QR code, but computers suck at recognizing things in states they don’t expect. Enter image recognition.
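To make that brittleness concrete, here is a minimal Python sketch. The 3x3 grid `TEMPLATE` and the helper names are invented for illustration and merely stand in for a specific image such as a QR code; real scanners are far more sophisticated. It shows that a naive pixel-for-pixel comparison recognizes the exact image but fails once the same image is turned upside down.

```python
# A hypothetical 3x3 black-and-white "image": 1 = dark pixel, 0 = light pixel.
TEMPLATE = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]

def rotate_180(image):
    """Return a copy of the image turned upside down."""
    return [list(reversed(row)) for row in reversed(image)]

def exact_match(image, template=TEMPLATE):
    """Naive recognition: every pixel must equal the template's pixel."""
    return image == template

upright = [row[:] for row in TEMPLATE]
flipped = rotate_180(TEMPLATE)

print(exact_match(upright))  # the expected state is recognized
print(exact_match(flipped))  # the same picture, upside down, is not
```

A toddler would shrug off the flip; the exact-match check cannot, which is the gap that trained image recognition is meant to close.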

The way image recognition works, typically, involves the creation of a neural network that processes the individual pixels of an image. Researchers feed these networks as many pre-labelled images as they can, in order to “teach” them how to recognize similar images.
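As a rough illustration of that training loop, here is a toy sketch in plain Python. Everything in it is an assumption made for the example: the 2x2 "images", the labels, the learning rate, and the use of a single artificial neuron rather than a full network. Real systems use far larger images and many layers, but the idea of adjusting weights on individual pixels from labelled examples is the same.

```python
import math

# Hypothetical pre-labelled training set: each 2x2 grayscale "image" is
# flattened to four pixel brightness values (0.0 = dark, 1.0 = bright).
# Label 1 = bright on top, label 0 = bright on the bottom.
LABELLED_IMAGES = [
    ([0.9, 0.8, 0.1, 0.2], 1),
    ([1.0, 0.7, 0.0, 0.1], 1),
    ([0.8, 0.9, 0.2, 0.0], 1),
    ([0.1, 0.2, 0.9, 0.8], 0),
    ([0.0, 0.1, 1.0, 0.7], 0),
    ([0.2, 0.0, 0.8, 0.9], 0),
]

def predict(pixels, weights, bias):
    """Single artificial neuron: weighted sum of pixels squashed to 0..1."""
    z = sum(w * p for w, p in zip(weights, pixels)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=200, learning_rate=0.5):
    """Nudge the weights after each labelled image to reduce the error."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for pixels, label in examples:
            error = label - predict(pixels, weights, bias)
            weights = [w + learning_rate * error * p
                       for w, p in zip(weights, pixels)]
            bias += learning_rate * error
    return weights, bias

weights, bias = train(LABELLED_IMAGES)

# Two images the neuron has never seen before.
new_top_bright = [0.7, 1.0, 0.2, 0.1]
new_bottom_bright = [0.1, 0.3, 0.7, 1.0]
print(round(predict(new_top_bright, weights, bias), 2))
print(round(predict(new_bottom_bright, weights, bias), 2))
```

The two test images are not in the training set, yet the neuron scores the first one high and the second one low, because it has learned which pixels matter rather than memorizing pictures.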

In the hotdog example above, the developers would have fed an AI thousands of pictures of hotdogs. The AI then develops a general idea of what a picture of a hotdog should have in it. When you feed it a new image, it doesn’t look up every hotdog picture it has ever seen; instead, it runs the image’s pixels through the patterns it learned during training and produces a score for how hotdog-like the image is. If that score meets a minimum threshold, the AI declares it a hotdog.
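The final thresholding step can be sketched in a few lines of Python. This is not Not Hotdog’s actual code: the two raw outputs, the softmax conversion, the function names, and the 0.8 cutoff are all assumptions chosen to show how a score gets turned into a yes-or-no verdict.

```python
import math

def softmax(scores):
    """Convert raw network outputs into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def label_image(raw_outputs, threshold=0.8):
    """raw_outputs: hypothetical scores for ("hotdog", "not hotdog")."""
    hotdog_probability = softmax(raw_outputs)[0]
    return "hotdog" if hotdog_probability >= threshold else "not hotdog"

print(label_image([2.0, 0.0]))  # confident enough to clear the threshold
print(label_image([0.5, 0.0]))  # leans hotdog, but below the threshold
```

Raising the threshold makes the app pickier about what it calls a hotdog; lowering it makes the app more generous but more prone to false alarms.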

AI systems that process visual information rely on computer vision, and those capable of identifying specific objects or categorizing images based on their content are performing image recognition.

This is incredibly important for robots that need to quickly and accurately recognize and categorize different objects in their environment. Driverless cars, for example, use computer vision and image recognition to identify pedestrians, signs, and other vehicles.

For a deeper dive into computer vision check out the following:

Learn how to develop AI with TensorFlow’s image recognition tutorials
Get your hands dirty DIY style with a Hackster.IO project using image recognition
Take this free artificial intelligence course online

Check out our artificial intelligence section to learn more about the world of machine learning.

Story by Tristan Greene

Editor, Neural by TNW

Tristan is a futurist covering human-centric artificial intelligence advances, quantum computing, STEM, physics, and space stuff. Pronouns: He/him
