
The A.I. Beat

Dispatches from the frontier of machine intelligence
Industry · May 16, 2026 · 7 min read

Runway Wants to Build World Models Through Video, While AI Radio Stations Prove We're Not There Yet

The AI video startup is betting on a path to general intelligence unlike the one the rest of the industry is taking, just as experiments show current models still can't be trusted to run businesses alone.

Runway is making a big bet that video generation is the bridge to world models, the holy grail of AI: systems that truly understand how the physical world works. It’s a contrarian play from a company that started by building tools for filmmakers and now wants to compete with Google and OpenAI on fundamental AI research.

The startup’s thesis is simple: video contains more information about how the world works than any other modality. When you generate a realistic video of a ball bouncing or water flowing, you’re implicitly modeling physics, causality, and temporal relationships. Do that well enough, at scale, and you’re not just making videos anymore. You’re building a system that understands reality.

It’s an interesting angle, especially coming from an outsider. While the big labs have pivoted toward reasoning models and agents, Runway is doubling down on generative video as a path to something bigger. The company clearly sees its origins outside the traditional AI research pipeline as an advantage, not a handicap.

But before anyone gets too excited about AI systems that truly understand the world, Andon Labs just ran an experiment that shows how far we still have to go.

AI radio stations can’t handle the basics

The setup was straightforward: give four leading AI models (Claude, ChatGPT, Gemini, and Grok) each a radio station to run without human intervention. Simple premise, clear success criteria.

They failed spectacularly.

The experiment was designed to test whether current AI models can handle autonomous business operations. The stations were named “Thinking Frequencies” (Claude), “OpenAIR” (ChatGPT), “Backlink Broadcast” (Gemini), and “Grok and Roll Radio” (Grok). The answer appears to be no.

The models couldn’t maintain consistent programming. They made erratic decisions. And most importantly, they demonstrated exactly why you don’t want AI running critical operations without human oversight right now. These are the same models companies are rushing to deploy as autonomous agents for customer service, data analysis, and business intelligence.

The contrast between Runway’s ambitions and the radio station reality check is telling. We have startups betting billions that video generation will lead to true world models, while current frontier models can’t reliably run a radio station for a few days.

The gap between research and reality

This is the tension the industry is living in right now. Research labs are publishing papers about reasoning, planning, and multi-step problem solving. Startups are raising massive rounds on the promise of AI agents that can handle complex workflows. And then you run an actual experiment where AI has to maintain consistency over time, and it falls apart.

Runway’s bet on video as a path to world models is intellectually compelling. Video really does contain rich information about physics and causality. And there’s a reasonable argument that current approaches to AI, which rely heavily on text and code, are missing something fundamental about how to represent knowledge about the physical world.

But the radio station experiment is a reminder that even our best models struggle with basic operational consistency. They can’t maintain a coherent strategy. They can’t learn from their mistakes in real time. And they definitely can’t be trusted to run something unsupervised, even when the task is relatively constrained.

The question for Runway and everyone else chasing world models is whether better video generation actually gets you there. Maybe it does. Maybe training on enough video data really does give you a system that understands causality and physics in a way that current LLMs don’t.

Or maybe it’s another modality that produces impressive outputs without actually understanding anything. We won’t know until someone builds it.

What this means for the industry

Runway’s approach matters because it represents a different path than the one most of the industry is taking. While OpenAI and Anthropic focus on reasoning and tool use, and Google pours resources into multimodal models that do everything, Runway is making a focused bet on video as the key unlock.

That kind of specialization is rare in AI right now. Most companies are trying to build general-purpose systems that can do a bit of everything. Runway is arguing that going deep on video will get you further than going wide across modalities.

The company is also implicitly betting that you don’t need to be Google to win in foundational AI. That’s a bold claim. Training video generation models is expensive. Competing with labs that have near-unlimited compute and data is hard. But Runway seems to think its focus and domain expertise give it a shot.

Meanwhile, the radio station experiment is the kind of reality check the industry needs more of. It’s easy to get excited about benchmark improvements and demo videos. It’s harder to confront the fact that current AI systems still can’t handle basic autonomous operations.

The gap between what AI can do in controlled settings and what it can do when left alone to run a business is still enormous. That gap isn’t closing as fast as the hype cycle suggests.

Runway’s vision of video-based world models might be the right long-term bet. But the path from here to there is going to be longer and harder than the pitch decks make it sound. The radio stations made that clear.
