The Next Level: World Modeling, Video Games and AI

Nov 05, 2024

Because video games are sometimes dismissed as mere entertainment, they aren’t always taken seriously for their often impressive technological achievements. Let’s not forget that the same turbocharged GPUs that power the most advanced AI models are also used to render the graphics of triple-A video games. The computation needed to display those virtual worlds is no trifling matter. Some modern “open world” video games are essentially “micro-verses”: miniature, snow-globe universes that generate rich dynamics and emergent phenomena as a consequence of their gameplay rules, graphics, and engines. For AI researchers and developers, video games, as world simulations of varying degrees of plausibility, could present an enormous and still under-appreciated opportunity. Not just to make AI models that are good at playing video games, I might add, but also ones that could learn the general dynamics of a simulation and perhaps even extend that to the real world.

All About World Models

One frontier area of AI research is called World Modeling. A world model is essentially a representation of the experience of a physical environment, one that factors in notions such as space, time, and cause and effect. The data needed to build one is markedly different from the sequential symbol systems, such as text, code, or math, that most AI models are trained on. Video games, or simulations more broadly, may therefore prove to be a rich source of multi-modal data for a variety of AI systems.

Developers are already tapping into video games as a data source. The startup Decart recently debuted a version of the hugely popular game Minecraft that is generated by a diffusion model. By training their model on millions of hours of Minecraft gameplay, they arrived at an AI that can effectively “daydream” a playable version of Minecraft with no game code involved.
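To make the idea of learning a game from gameplay alone a little more concrete, here is a toy sketch in Python. To be clear, this is not Decart’s method: their system is a diffusion model trained on video, while this is a simple lookup table on a made-up 4x4 grid game. Every name in it (game_engine_step, record_gameplay, learn_world_model, dream) is hypothetical and chosen purely for illustration. The point is only that the learned model can replay the game’s dynamics without ever calling the original game rules.

```python
import random
from collections import Counter, defaultdict

SIZE = 4  # hypothetical 4x4 grid "game", invented for this sketch
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def game_engine_step(state, action):
    """The hand-coded game rules: move one cell, walls block movement."""
    x, y = state
    dx, dy = MOVES[action]
    nx = min(max(x + dx, 0), SIZE - 1)
    ny = min(max(y + dy, 0), SIZE - 1)
    return (nx, ny)


def record_gameplay(num_steps, seed=0):
    """Play randomly and log (state, action, next_state) transitions."""
    rng = random.Random(seed)
    state = (0, 0)
    log = []
    for _ in range(num_steps):
        action = rng.choice(list(MOVES))
        next_state = game_engine_step(state, action)
        log.append((state, action, next_state))
        state = next_state
    return log


def learn_world_model(log):
    """Build a transition table from the log alone (no access to the rules)."""
    counts = defaultdict(Counter)
    for state, action, next_state in log:
        counts[(state, action)][next_state] += 1
    # Keep the most frequently observed outcome for each (state, action).
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def dream(model, state, actions):
    """Roll the learned model forward without calling the game engine."""
    trajectory = [state]
    for action in actions:
        # Assumption: if a transition was never observed, nothing changes.
        state = model.get((state, action), state)
        trajectory.append(state)
    return trajectory


log = record_gameplay(20000)
model = learn_world_model(log)
plan = ["right", "right", "down", "left"]
dreamed = dream(model, (0, 0), plan)
print(dreamed)
```

A real world model replaces the lookup table with a neural network and the grid coordinates with pixels, but the shape of the problem is the same: predict what happens next, given what just happened and what the player did.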

These technical feats are impressive on their own and could have intriguing consequences for the multi-billion-dollar gaming industry further down the road. But at the moment I’m more interested in conceptual questions about what this could mean for the development of AI models generally. It goes back to what I’ve been saying about the depth of video games being underestimated as fodder for AI training data.

Game Theory

Gaming and AI have a long history, going back at least to 1950 with Alan Turing’s seminal imitation game (Turing Test) thought experiment, and then later with IBM’s chess-playing Deep Blue, DeepMind’s Go-playing AlphaGo and IBM’s Jeopardy-winning Watson. Games of all sorts have long been of interest to AI researchers. As clear, well-defined systems with fixed rules, games are attractive for their explicit conditions and quantifiable objectives. Contemporary video games, however, often have complicated gameplay and brilliantly rendered graphics, all of which require beefier processing to handle than the simpler board games or 8- or 16-bit games of earlier years. So we’re entering a new, largely unexplored phase in AI’s relationship with games.

Minecraft stands out even though its graphics are comparatively lo-fi. It is Minecraft’s gameplay rules, rather than its visuals, that make it compelling as an object of study for AI. Minecraft is essentially an open-ended sandbox where you can use its simple mechanics to combine game assets in endless ways to create anything you can imagine. Players have used Minecraft to build replicas of historical landmarks such as the Taj Mahal, or to create other games within it. Minecraft is like the real world in some basic respects: it’s built out of comparatively simple building blocks, and there are different rules for combining and relating these blocks to create all varieties of structures and processes. So a game like Minecraft is particularly interesting for its “universality,” much like a universal syntax; you can learn Minecraft to build anything by applying its flexible game rules and elements in 3D space.

Leveling Up

The real reason video games are exciting for AI research has nothing to do with entertainment. It has everything to do with simulation. By analogy, the real world is just a very, very big complicated video game with really, really good graphics.

Some video games have highly realistic physics engines that could serve as a test environment for AI to experiment in world-modeling physics. A game with realistic driving mechanics could help train self-driving cars without the risk of accidents. So long as we can ensure that the model isn’t overfitted to the game world and can generalize to the real world, it could lead in all sorts of promising directions.

Many of these video games share similarities with the real world. They are perceived in three dimensions, they contain various interactions and dynamics that resemble the real world’s causality, and they generally serve as much simpler, more tightly bounded microcosms of the physical universe. A multimodal AI model that can generate a 3D video game world and preserve its fixed rulesets is a model that’s closer to representing the physical world and its laws of causality.

So in short, video games should not be undersold as a promising opportunity for AI research and development. That holds not only for the video game industry, which could do all sorts of unprecedented things with generative AI, but also for generative AI itself, as a way to train models on richer and deeper datasets safely and in a controlled environment. Getting AI and video games, both computationally intense endeavors, to play together is going to be a serious, but potentially fruitful, challenge. It’s time to take things to the next level.

Logic Bombs