ArticulATE Q&A: Google Brain’s Vanhoucke on Robots, AI and Programming vs. Learning

The hardware of a robot is only as good as the software brain that powers it. This is why we are so excited to have Vincent Vanhoucke, Principal Scientist with the Google Brain team, speak at our upcoming ArticulATE conference.

In addition to being the director of Google’s robotics research efforts, Vanhoucke has spent his career researching artificial intelligence and machine learning. Before we sit down with him at the show, we wanted to give you a little taste of what he’ll be talking about with a brief Q&A that we conducted over email.

If you want to see Vanhoucke in person and be a part of the discussion on the future of food robotics and automation, get your ticket to ArticulATE today!

What is Googley about robots and automation?

There is something new and exciting happening in the world of robotics: thanks to the advances in deep learning of the last few years, we now have vision systems that work amazingly well. It means that robots can see, understand, and interact with the complicated, often messy and forever changing human world. This opens up the possibility of a robot helper that understands its environment and physically assists you in your daily life. We are asking ourselves: what could happen if the devices you interact with every day could carry out physical tasks, moving and picking things up — if they could ask ‘How can I help you’ and do it directly?

What are some broad applications that AI and machine learning are really good at right now, and where is the biggest room for improvement?

Perception at large has made enormous progress: visual understanding, localization, sensing, speech and audio recognition. Much of the remaining challenge is to turn this progress into something actionable: connecting perception and action means understanding the impact of what a robot does on the world around it, and how that relates to what it sees. When a robot operates in a human environment, safety is paramount, but so is understanding social preferences, as well as people’s goals and desires. I believe that enabling robots to learn, as opposed to being programmed, is how we can establish that tight degree of human connection.

Do you need massive amounts of data for training AI, or do you just need the right data?

Improving data efficiency has been a major focus in recent years. In the early days of deep learning, we explored what was possible when lots of data was available. Today, we’re probing whether we can do the same thing with a lot less data. In most cases, the answer is yes. In robotics, for instance, we’ve been able to leverage lots of simulated data. We’re also finding new ways to improve systems on the fly, as they operate, by leveraging data-efficient techniques such as self-supervision and meta-learning.

Computer vision + AI is shaping up to be a versatile and powerful combination (spotting diseases on crops, assessing food quality, automating store checkout). Where is this technology headed and what are some untapped uses for it?

If you look at the perception systems that have evolved in animals and humans, they’re largely driven by function: for instance, our own visual system is very good at sensing motion, and at focusing its attention on the entities that we interact with in a scene. Computer vision systems don’t work like that today, because we haven’t closed the loop between sensing and acting. I think that one of the grand challenges of computer vision is how to optimize visual representations for the tasks we care about. Robotics will enable us to close this functional loop, and I expect that we will see the technology improve dramatically as a result.

What is your favorite fictional robot?

Japanese giant robots were a fixture of TV shows when I was a kid in ’80s France. Of the lot, I’m going to have to go with UFO Robot Grendizer out of sheer nostalgia. Today, I find inspiration watching Simone Giertz’s terrific robot contraptions on YouTube.
