Super Curious Mario: Teaching AI to Keep Asking Questions

2017 Winter Power

In the ongoing quest to build artificial intelligence (AI) that more closely mimics the human brain, some computer scientists at Berkeley are focusing on one crucial piece of the puzzle: curiosity.

For the last three years Deepak Pathak and Pulkit Agrawal, Ph.D. students in the Berkeley computer science department, have worked to create software that can learn on its own. Now the team is looking at creating systems that can not only learn, but keep asking questions.

One common approach to machine learning is known as reinforcement learning: we train computers by giving them a goal, or a reward to seek out, like a mouse in a maze. Just as the mouse learns, via trial and error, to get to the cheese, the computer learns by assigning values to each right or wrong move it makes.

But that’s not exactly how humans learn, and it’s a limited approach to comprehension and learning in general, say Pathak and Agrawal.

“As humans, when you’re put in a new situation, you keep learning new things,” says Agrawal. “That’s what makes a person different from a computer … which just knows how to do one thing very, very well.”

So, what if you inject curiosity into a bot?

To answer that question, the Berkeley team designed an “intelligent agent” and set it loose in the colorful, chaotic universe of Super Mario Bros. The agent (Mario) runs through the classic video game’s Mushroom Kingdom without any foreknowledge of the rules or features. Its only mission: Stay curious.

How it does that is more or less basic math. After each discrete action, the bot tries to predict what image it will see next, based on a network of information that the system is constantly improving. It then compares what it actually sees with its prediction. The difference between the two images is assigned a numerical value that reflects Mario’s level of surprise, and higher values (greater differences) are treated as rewards. As Pathak explains, this “high prediction error” triggers curiosity, which inspires learning.
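The predict-compare-reward loop can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not the Berkeley team's code: the `CuriousPredictor` class, the string-valued states and actions, the two-number "images," and the running-average update are all hypothetical stand-ins for the neural network the article describes.

```python
from typing import Dict, List, Tuple

Observation = List[float]  # hypothetical stand-in for a game image


class CuriousPredictor:
    """Toy forward model (an assumption for illustration): it keeps a running
    estimate of the next observation for each (state, action) pair."""

    def __init__(self, obs_size: int, learning_rate: float = 0.5) -> None:
        self.obs_size = obs_size
        self.learning_rate = learning_rate
        self.memory: Dict[Tuple[str, str], Observation] = {}

    def predict(self, state: str, action: str) -> Observation:
        # With no experience yet, guess an all-zero observation.
        return self.memory.get((state, action), [0.0] * self.obs_size)

    def surprise(self, state: str, action: str, actual: Observation) -> float:
        # Mean squared difference between the prediction and what was seen.
        predicted = self.predict(state, action)
        return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / self.obs_size

    def update(self, state: str, action: str, actual: Observation) -> None:
        # Nudge the stored prediction toward what was actually observed.
        predicted = self.predict(state, action)
        self.memory[(state, action)] = [
            p + self.learning_rate * (a - p) for p, a in zip(predicted, actual)
        ]


def curiosity_step(model: CuriousPredictor, state: str, action: str,
                   actual: Observation) -> float:
    """Score the surprise as a reward, then learn from the observation."""
    reward = model.surprise(state, action, actual)
    model.update(state, action, actual)
    return reward


model = CuriousPredictor(obs_size=2)
# Repeat the same (hypothetical) transition three times.
rewards = [curiosity_step(model, "start", "jump", [1.0, 0.0]) for _ in range(3)]
```

In this sketch the reward for an identical transition shrinks on each repetition, because the prediction improves every time the outcome is seen again. Only outcomes the model has not yet learned to predict keep paying out.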

After curious Mario spent time exploring his skills in Level 1, he was able to navigate Level 2 much faster.

Curiosity-augmented AI also seems to hold an advantage over standard AI. One problem with reinforcement learning is that the end goal is often too many steps ahead of the initial actions, interfering with the AI’s ability to draw meaningful conclusions about cause and effect. It can’t connect the dots because they’re too far apart.

When the Berkeley team pitted a curious agent against a control bot (one programmed for standard reinforcement learning) in ViZDoom, a research environment built on the game Doom, the curious agent was the clear winner. Where the curious bot swiftly navigated the 3D virtual environment, the control bot repeatedly bumped into a corner. (The curious bot doesn’t keep bumping against the wall, because it quickly grows bored with it.)

But curiosity can come with pitfalls, too. The bot is insatiable for novelty, but not all novelty is useful. A boundless curiosity can be quickly sidetracked by trivialities and noise.

Trevor Darrell, co-director of the Berkeley Artificial Intelligence Research Lab, notes that curiosity is not a new subject of study in AI, but that the team has been able to revisit it using modern machine-learning techniques and develop algorithms that teach the AI what to focus on and what to ignore.

Using artificial neural networks, the bot analyzes its movements in the context of its environment. “It’s like if you’re walking from the BART station toward the Campanile, you don’t have to look at the clouds to see where you are going,” Agrawal explained. “What we say is, ‘Mario, find the things you need [in order] to move, and if it isn’t required to move, ignore it.’”
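One way to picture "ignore what isn't required to move" is to measure surprise only over the parts of the scene flagged as relevant. The sketch below is a simplification under stated assumptions: the three-number observation layout and the hand-written `relevant` mask are hypothetical, whereas the article describes the bot working out what matters with neural networks.

```python
from typing import Sequence


def masked_surprise(predicted: Sequence[float], actual: Sequence[float],
                    relevant: Sequence[bool]) -> float:
    """Mean squared prediction error over the features flagged as relevant."""
    errors = [
        (p - a) ** 2
        for p, a, keep in zip(predicted, actual, relevant)
        if keep
    ]
    return sum(errors) / len(errors) if errors else 0.0


# Hypothetical observation layout: [mario_x, enemy_x, cloud_flicker]
relevant = [True, True, False]   # hand-picked here; an assumption, not learned
predicted = [0.4, 0.7, 0.1]

clouds_changed = [0.4, 0.7, 0.9]  # only the irrelevant feature differs
mario_moved = [0.6, 0.7, 0.1]     # a relevant feature differs

noise_reward = masked_surprise(predicted, clouds_changed, relevant)
move_reward = masked_surprise(predicted, mario_moved, relevant)
```

Here the flickering clouds earn no curiosity reward at all, while an unexpected change in Mario's position does, so this form of novelty cannot sidetrack the agent.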

The researchers say we’re still a long way from having robots that can walk around and interrogate us about the meaning of life.

Still, the team feels their work is an important step in the right direction if we want to develop AI that learns and adapts autonomously. In the meantime, Darrell says the team’s work has given us a useful framework going forward.

“We can’t just go for what’s novel without worrying about what’s relevant, and we can’t learn what’s relevant without this curiosity bonus to drive our exploration.”
