Super Curious Mario: Teaching AI to Keep Asking Questions

December 20, 2017
by Virgie Hoban

In the ongoing quest to build artificial intelligence (AI) that more closely mimics the human brain, some computer scientists at Berkeley are focusing on one crucial piece of the puzzle: curiosity.

For the last three years, Deepak Pathak and Pulkit Agrawal, Ph.D. students in the Berkeley computer science department, have worked to create software that can learn on its own. Now the team is working on systems that can not only learn, but keep asking questions.

One common approach to machine learning is known as reinforcement learning: we train computers by giving them a goal, or a reward to seek out, like a mouse in a maze. Just as the mouse learns, via trial and error, to get to the cheese, the computer learns by assigning a value to each move it makes, depending on whether that move brings it closer to the reward.
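The mouse-and-cheese idea can be sketched in a few lines of code. The example below is only an illustration, not the Berkeley team's software: it assumes a tiny one-row "maze" of five cells with the cheese in the last one, and it uses tabular Q-learning, one standard reinforcement learning method that the article does not name. All names (`N_CELLS`, `step`, `train`) are made up for this sketch.

```python
import random

# Hypothetical toy maze: cells 0..4 in a row, cheese in the last cell.
N_CELLS = 5
GOAL = N_CELLS - 1
ACTIONS = ("left", "right")


def step(state, action):
    """Move one cell; stepping left from cell 0 leaves the mouse in place."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if nxt == GOAL else 0.0  # the cheese is the only reward
    return nxt, reward


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn a value for every (cell, move) pair by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_CELLS) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the length of each attempt
            action = rng.choice(ACTIONS)  # explore with purely random moves
            nxt, reward = step(state, action)
            # Value of the best follow-up move (nothing follows the goal).
            best_next = 0.0 if nxt == GOAL else max(q[(nxt, a)] for a in ACTIONS)
            # Nudge the stored value toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if state == GOAL:
                break
    return q


q_table = train()
# The learned policy: in each cell, pick the move with the highest value.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(GOAL)]
print(policy)
```

In this toy setup the values for moving toward the cheese end up higher than the values for moving away from it, so the learned policy heads right from every cell. Note that the reward only appears at the very end of the corridor; everything the mouse learns about earlier cells has to trickle back from that single payoff.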

But that’s not exactly how humans learn, and it’s a limited approach to comprehension and learning in general, say Pathak and Agrawal.

“As humans, when you’re put in a new situation, you keep learning new things,” says Agrawal. “That’s what makes a person different from a computer … which just knows how to do one thing very, very well.”

So, what if you inject curiosity into a bot?

To answer that question, the Berkeley team designed an “intelligent agent” and set it loose in the colorful, chaotic universe of Super Mario Bros. The agent (Mario) runs through the classic video game’s Mushroom Kingdom without any foreknowledge of the rules or features. Its only mission: Stay curious.

How it does that is more or less basic math. After each discrete action, the bot tries to predict what image it will see next, using an internal model that the system is constantly improving. It then compares what it actually sees to its prediction. The difference between the two images is given a numerical value, which reflects Mario’s level of surprise, and higher values (greater differences) are treated as larger rewards. As Pathak explains, this “high prediction error” triggers curiosity, which inspires learning.
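The predict-compare-reward loop described above can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the team's actual system: "images" are short lists of numbers, the predictor is a hypothetical lookup table (`TinyPredictor`) rather than a neural network, and surprise is measured as the mean squared difference between the predicted and observed image.

```python
from typing import Dict, List, Tuple


def surprise(predicted: List[float], observed: List[float]) -> float:
    """Mean squared difference between the predicted and the observed image."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


class TinyPredictor:
    """Hypothetical stand-in for a learned model: one running guess per (state, action)."""

    def __init__(self, frame_size: int, learning_rate: float = 0.5) -> None:
        self.frame_size = frame_size
        self.learning_rate = learning_rate
        self.guesses: Dict[Tuple[str, str], List[float]] = {}

    def predict(self, state: str, action: str) -> List[float]:
        # With no experience yet, guess a blank (all-zero) image.
        return self.guesses.get((state, action), [0.0] * self.frame_size)

    def update(self, state: str, action: str, observed: List[float]) -> None:
        # Move the stored guess part of the way toward what was actually seen.
        guess = self.predict(state, action)
        self.guesses[(state, action)] = [
            g + self.learning_rate * (o - g) for g, o in zip(guess, observed)
        ]


def curiosity_reward(model: TinyPredictor, state: str, action: str,
                     observed: List[float]) -> float:
    """Reward equals the prediction error; the model then learns from the outcome."""
    reward = surprise(model.predict(state, action), observed)
    model.update(state, action, observed)
    return reward


model = TinyPredictor(frame_size=4)
frame = [1.0, 0.0, 1.0, 0.0]  # made-up "image" seen after jumping from the start
rewards = [curiosity_reward(model, "start", "jump", frame) for _ in range(4)]
print(rewards)  # [0.5, 0.125, 0.03125, 0.0078125]
```

Each time the same action leads to the same image, the prediction improves and the reward shrinks. An agent that chases this reward is therefore pushed toward situations it cannot yet predict.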

After curious Mario spent time exploring his skills in Level 1, he was then able to navigate Level 2 much faster.

And curiosity-augmented AI seems to have an advantage over standard AI. One problem with reinforcement learning is that the end goal is often too many steps ahead of the initial actions, interfering with the AI’s ability to draw meaningful conclusions about cause and effect. It can’t connect the dots because they’re too far apart.

When the Berkeley team pitted a curious agent against a control bot (one programmed for reinforcement learning alone) in a game called ViZDoom, the curious agent was the clear winner. Where the curious bot swiftly navigated the 3D virtual environment, the control bot repeatedly bumped into a corner. (The curious bot doesn’t keep bumping against the wall, because it quickly grows bored with it.)

But curiosity can come with pitfalls, too. The bot is insatiable for novelty, but not all novelty is useful. A boundless curiosity can be quickly sidetracked by trivialities and noise.

Trevor Darrell, co-director of the Berkeley Artificial Intelligence Research Lab, notes that curiosity is not a new subject of study in AI, but that the team has been able to revisit it using modern machine-learning techniques and develop algorithms that teach the AI what to focus on and what to ignore.

Using artificial neural networks, the bot analyzes its movements in the context of its environment. “It’s like if you’re walking from the BART station toward the Campanile, you don’t have to look at the clouds to see where you are going,” Agrawal explained. “What we say is, ‘Mario, find the things you need [in order] to move, and if it isn’t required to move, ignore it.’”

The researchers say we’re still a long way from having robots that can walk around and interrogate us about the meaning of life.

Still, the team feels their work is an important step in the right direction if we want to develop AI that learns and adapts autonomously. In the meantime, Darrell says the team’s work has given us a useful framework going forward.

“We can’t just go for what’s novel without worrying about what’s relevant, and we can’t learn what’s relevant without this curiosity bonus to drive our exploration.”
