The surprising intersection of video games and AI

Dive into a learning environment where AI can grow and evolve.

By Jeff House, AVP for AI and Emerging Technologies


When you hear “video game” and “AI” in the same sentence, your thoughts might jump straight to that level you couldn’t get past because the enemy AI was too difficult.

Beyond in-game frustration, there’s much more to AI and video games. AI researchers use video games to create a lifelike world environment for AI to work and learn within, testing its problem-solving power.

The history of AI research with games

AI research has a long history with games. In 1958, just two years after AI was adopted as a research program, researchers unveiled the first program that could play chess. It wasn't very good, but the technology quickly improved, and in 1978, a computer defeated a human chess master for the first time. During this period, limited memory and processing power were the biggest constraints these programs faced. Memory and processing power increased rapidly over subsequent decades; by the end of the century, only the most accomplished players in the world could beat the computer.

In the world of AI, chess is considered somewhat simple. While the number of possible positions is unimaginably huge, computers are fast. Twenty years ago, Deep Blue, IBM's chess-playing supercomputer, could analyze 200 million positions a second. With a player taking three minutes to make a move, Deep Blue could consider 50 billion options. Today, chess is considered a “solved problem” in AI.

With the game of chess behind them, researchers moved on to the strategy board game Go, which has a much larger problem space (the number of possible games) for the software to navigate. By the early 2010s, even with ample processing power and a few decades of experience solving difficult problems, software still couldn't beat top Go players, not even when the computer was given extra stones (playing pieces) to start the game. DeepMind Technologies launched the AlphaGo project in 2014, applying deep-learning neural networks to the game. The next year, AlphaGo beat a professional Go player without an in-game advantage. In 2017, AlphaGo beat the top-ranked player in the world, three games to zero.

On November 27, 2019, Lee Se-dol—a South Korean Go expert and the only person to ever beat AlphaGo—announced his retirement because, "I'm not at the top even if I become the number one… There is an entity that cannot be defeated."

It took 50 years to “beat” chess, but only 5 years to beat Go. In response to AlphaGo’s victory, Deep Blue's Murray Campbell called it “the end of an era...Board games are more or less done, and it's time to move on.”

So, what comes next?

While one team at DeepMind was working on AlphaGo, another was working on Deep Q-Networks (DQN), combining neural networks and reinforcement learning (RL)—more on that in a moment—to create a single algorithm that could learn and excel at many different challenging tasks. As an early test in 2015, the team applied this new technology to the curious world of Atari 2600 video games.

As opposed to a game like chess or Go, where the state of the game is represented abstractly, the DQNs were looking at the actual pixels on the screen. Instead of the 19 x 19 grid for Go, this system considered the 192 x 160 grid of pixels and the game score, building up the ability to beat these games from there. Using the same algorithm, DQNs achieved a level of performance equal to a human tester on not just one or two games but 49 different games. Previously, game-playing software had been specially programmed and trained for a single game; here the same code learned and mastered 49 different games based only on what it could see on the screen. Only in the last few months has the 50th and final game, Montezuma's Revenge, finally been solved.
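
To make the pixels-in, actions-out idea concrete, here is a minimal sketch of a DQN-style value network written in Python with PyTorch. The layer shapes, the 84 x 84 preprocessed frames, and the 18-action joystick set loosely follow the published DQN setup, but this is an illustrative assumption on my part, not DeepMind's actual code.

```python
# Illustrative sketch of a DQN-style value network: pixels in, one estimated value per action out.
# Layer shapes loosely follow the published DQN setup; this is not DeepMind's actual code.
import torch
import torch.nn as nn

class AtariQNetwork(nn.Module):
    def __init__(self, num_actions: int, frame_stack: int = 4):
        super().__init__()
        # Convolutions read the (preprocessed) screen directly; no game-specific features are hand-coded.
        self.features = nn.Sequential(
            nn.Conv2d(frame_stack, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        # For 84 x 84 inputs, the conv stack flattens to 64 * 7 * 7 = 3136 values.
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, num_actions),  # one estimated Q-value per joystick action
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(frames))

# A single "observation": a batch of one stack of four 84 x 84 grayscale frames.
observation = torch.zeros(1, 4, 84, 84)
q_values = AtariQNetwork(num_actions=18)(observation)
action = q_values.argmax(dim=1)  # greedy choice: the move the network currently rates highest
```

The same network, with the same code, can be pointed at any of the games; only the number of valid joystick actions changes.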

A big part of this success is due to the field of RL, which falls somewhere between supervised learning (where the machine learning model is trained by giving it sample inputs and the correct results) and unsupervised learning (where the model tries to find patterns without being given the right answers to learn from). RL works by first defining some measure of “reward”; the algorithm then evolves through trial and error, always trying to maximize the reward it receives. Part of the challenge is building a sense of curiosity into the system: an RL agent needs to balance “exploitation” of what it already knows against “exploration” of what it doesn't, all while trying to maximize its reward.
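
To show that reward-and-curiosity loop in miniature, here is a hedged sketch of tabular Q-learning, one of the simplest RL algorithms, playing a made-up six-cell "corridor" game. The environment, reward, and hyperparameters are all invented for illustration; the epsilon parameter is what trades exploration off against exploitation.

```python
# Toy reinforcement learning: tabular Q-learning with epsilon-greedy exploration.
# The "game" is a hypothetical six-cell corridor; reaching the right end scores +1.
import random

NUM_STATES, NUM_ACTIONS = 6, 2       # actions: 0 = move left, 1 = move right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

q = [[0.0] * NUM_ACTIONS for _ in range(NUM_STATES)]   # q[state][action] value estimates

def step(state, action):
    """Environment rules: slide along the corridor; the last cell gives a reward of 1."""
    next_state = max(0, min(NUM_STATES - 1, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == NUM_STATES - 1 else 0.0
    return next_state, reward, next_state == NUM_STATES - 1

def greedy_action(state):
    """Exploitation: pick the best-known action, breaking ties at random."""
    best = max(q[state])
    return random.choice([a for a in range(NUM_ACTIONS) if q[state][a] == best])

for episode in range(500):
    state, done = 0, False
    for _ in range(100):                      # cap episode length so the loop always ends
        # Exploration vs. exploitation: occasionally try a random move instead of the best-known one.
        action = random.randrange(NUM_ACTIONS) if random.random() < EPSILON else greedy_action(state)
        next_state, reward, done = step(state, action)
        # Nudge the estimate toward the reward plus the discounted value of the next state.
        q[state][action] += ALPHA * (reward + GAMMA * max(q[next_state]) - q[state][action])
        state = next_state
        if done:
            break

print("Learned preference for moving right, by cell:",
      [round(q[s][1] - q[s][0], 2) for s in range(NUM_STATES)])
```

Nothing tells the agent that "right" is good; it discovers that purely by chasing the reward, which is exactly the dynamic that scales up (with far bigger networks and budgets) to the systems described above.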

Video games make an ideal test bed for this sort of task. Many games maintain an explicit score that an RL agent wants to maximize; the game presents an environment that needs to be navigated and explored; and success typically relies on knowledge built up over time.
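
That framing maps almost one-to-one onto code. The sketch below uses the open-source Arcade Learning Environment through the Gymnasium API purely as an assumed, convenient toolchain (it requires the gymnasium and ale-py packages plus the Atari ROMs, and is not necessarily the exact setup used in the research described above). The screen pixels arrive as the observation, and changes in the game's score arrive as the reward.

```python
# Sketch: treating an Atari game as an RL environment, with a random "agent" standing in for a learner.
# Assumes the gymnasium and ale-py packages (plus the Atari ROMs) are installed; names may vary by version.
import gymnasium as gym
import ale_py  # importing this registers the ALE/* Atari environments with Gymnasium

env = gym.make("ALE/Breakout-v5")        # observations are screen pixels; rewards are changes in the score
observation, info = env.reset(seed=0)

total_reward = 0.0
for _ in range(1000):
    action = env.action_space.sample()   # a real agent would pick actions to maximize future reward
    observation, reward, terminated, truncated, info = env.step(action)
    total_reward += reward               # the game score is the only learning signal
    if terminated or truncated:
        observation, info = env.reset()

env.close()
print("Points scored by random play:", total_reward)
```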

But beating 1970s-era video games, with their limited resolution and complexity constrained by the technology of the time, is one thing. How about something more modern?

A more modern example

Though it was released almost 10 years ago now, StarCraft II is still the top real-time strategy (RTS) game in the world and the fifth most popular game in South Korea. StarCraft II is a different style of game compared to the Atari games targeted previously. The game space is huge by comparison, the on-screen activity is frenetic, and there are different types of opponents that use different strategies.

It wasn't an easy problem to solve. DeepMind estimates that the processing power needed to train its AlphaStar AI cost $26 million. That AI is now better than 99.8 percent of all StarCraft II players in the world. And, because the training tends to involve playing different versions of the code against one another, the AI starts to develop unique strategies of its own.

"AlphaStar is an intriguing and unorthodox player—one with the reflexes and speed of the best pros but strategies and a style that are entirely its own,” says Diego "Kelazhur" Schwimmer, a professional StarCraft II player. “The way AlphaStar was trained, with agents competing against each other in a league, has resulted in gameplay that’s unimaginably unusual; it really makes you question how much of StarCraft’s diverse possibilities pro players have really explored."

An opportunity to research and learn

Clearly, AI is getting quite good at video games. But the relationship goes beyond this mastery. Increasingly, AI is being trained in virtual environments: NVIDIA is using its FleX physics engine to create environments realistic enough to train robots, via RL, to move in the real world. And Unity, creator of one of the top video game engines, hired away Uber's head of AI to build its own AI team. Unity now runs public challenges that use its tools to build smarter AI.

Video games and AI are no longer distant cousins. AI is being used to create better games, teaching us new strategies—and frustrating the heck out of a lot of great players as records fall. And, perhaps most significantly, researchers are using video games as a learning environment where AI can grow and evolve. RL is an exciting area, and the use of video games as learning targets and learning environments is only going to grow more intriguing.

We’re excited to launch a series of blog posts about emerging technologies. Make sure to follow along with us here at Indigo Insights.
