A Return to Code — The Naval Podcast

Host · Naval Ravikant
Published · May 2026

Nivi

Co-host of The Naval Podcast (sets up the conversation)

Naval Ravikant

Host, The Naval Podcast

The Gist

Naval Ravikant, who hasn't seriously coded in decades, describes falling headlong into 'vibe coding' after Claude Opus 4.5 made AI coding agents genuinely capable in late 2025. He explains how the agents work, why one-shotting custom apps is so addictive and 'no-compromises,' and how this democratizes software creation while making pure software 'uninvestable.' Along the way he offers a field guide to the frontier models, the limits of multi-agent setups and context windows, why coding is uniquely trainable, and why he believes Apple's AI miss is the decade's biggest strategic mistake.

Key Takeaways

Vibe coding crossed the threshold with Claude Opus 4.5. Around December 2025, coding agents started behaving like a fast, essentially free junior programmer that stays on track and builds apps end to end.
The agents work because they're wired into Unix. They run in a text-based terminal over a Unix shell, calling commands like grep, awk, and sed, running cron jobs, and spawning tasks — and they translate plain-English intent into code.
You can one-shot custom apps for an audience of one. Naval built a personal 'app store' that delivers bespoke apps — like a custom workout tracker — to his iPhone in about 30 seconds.
The magic is no compromises. Like a self-driving car with no driver to feel self-conscious in front of, an agent lets you build exactly your vision without accommodating a team.
Pure software is becoming uninvestable. Anyone can hack software together today, and agents will soon build scalable, well-architected software — so VCs should chase hardware, network effects, and AI models.
It's a renaissance for individual creators. Vibe coding moves app-building from ~0.1% to a few percent of people; for prototyping, there's never been a better time to be a software creator.
Multi-agent councils mostly add tokens, not minds. Ten instances of one model share a brain and a data set, try to please you, and get led to your answer — so they don't reason like ten different humans.
Each frontier model has a niche. Claude meets you at your level and presents well through Artifacts, ChatGPT is the all-rounder, Gemini wins on search and YouTube data, and Grok is candid and strong on hard technical problems.
Context windows are the wall. As a codebase outgrows ~1M tokens, agents lose the plot, fix the wrong bug, and hack around problems; the human operator has to steer the architecture.
Coding is trainable because it's verifiable. Abundant data plus easy verification — it compiles, the tests pass — is why models excel at code and math but lag in hard-to-grade domains like creative writing.
Apple's AI miss may be the decade's biggest mistake. As interaction shifts from tapping apps to talking to an agent, the phone commoditizes, and Naval expects Apple's growth and margins to compress.

The Conversation

The Inflection Point: Vibe Coding Arrives

Naval frames the moment precisely: around December 2025, with the release of Claude Opus 4.5, AI coding agents hit an inflection point. Unlike his earlier, mixed experiences, this time the hype felt real — the agent stayed on track, built apps soup to nuts, solved thorny problems, and felt like having a fast, essentially free junior programmer who is eager to please. Despite a computer-science degree and a grasp of architecture, networking, and algorithms, he hadn't seriously coded in decades, in large part because the activation energy was so high: wiring together GitHub, a backend on Vercel or Firebase or Railway, and a thicket of jargon and tools.

AI collapses that activation energy. He started with Claude Code like everyone else, reached for Codex on the thornier bugs, and got immediately, happily addicted.

How the Agents Actually Work

The shift, Naval explains, is that these are no longer mere coding assistants that hand you a pile of code to paste into an IDE. You open a terminal — the command-line interface — which is entirely text-based, exactly what these models are best at, since they were trained on text tokens and on the overwhelmingly Unix-flavored code sitting on GitHub and Stack Overflow (and macOS itself is BSD underneath). The agents are long-lived coding AIs connected to Unix at a core level: to the shell to execute commands, to the file system, to classic tools like grep, awk, sed, and pipes that daisy-chain into one another, and to cron so they can run for long stretches and spawn additional shells and tasks as needed.
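
The daisy-chaining of command-line tools described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular agent is implemented: run_pipeline and UPPERCASE are hypothetical names, and unlike a real shell pipe, which streams data between concurrently running processes, this sketch runs the stages one after another and buffers each output in memory.

```python
import subprocess
import sys


def run_pipeline(stages, input_text=""):
    """Run each command in order, feeding one stage's stdout to the next stage's stdin.

    `stages` is a list of argv lists, for example [["grep", "TODO"], ["sort"]].
    """
    data = input_text
    for argv in stages:
        # check=True raises CalledProcessError on a non-zero exit status, which
        # an agent could read as feedback. Note that grep exits with status 1
        # when nothing matches, so an empty result would surface as an error.
        completed = subprocess.run(
            argv, input=data, capture_output=True, text=True, check=True
        )
        data = completed.stdout
    return data


# Portable stand-in stage (an assumption for this sketch): uppercase stdin
# using the current Python interpreter instead of a Unix tool.
UPPERCASE = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]
```

On a Unix-like system with the usual tools installed, a call such as run_pipeline([["grep", "TODO"], ["sort"]], source_text) would return the sorted lines of source_text that contain TODO.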

On top of that plumbing, the agents act as forgiving English translators. Machine translation was an early core use case, and now they translate between Python, C, Lisp, Rust and the rest — and plain English — tolerating loose wording and spelling mistakes. You don't need deep expertise, only a high-level (not simplistic) understanding of computer architecture, networking, and programming, and from there you can go very far.

One-Shotting Apps and the Personal App Store

Naval began by 'one-shotting' apps (giving a description and getting a working app back), then iterating. He built himself a personal app store: he asks for an app, the agent delivers it as a web page that becomes an app on his iPhone, and he installs it with one click and even gets upgrades. His example is a custom workout tracker, spec'd in close to a single prompt: borrow the functionality of Tonal and Ladder, follow Apple's Human Interface Guidelines, ingest a text log of recent workouts, compute strength scores (after reading scientific papers on how to score by body part), draw a human-body diagram showing which muscles are bigger or smaller, and connect to Apple Health for heart-rate data.

The 'app store' is half a joke — real for him, his friends, and family, but not for wide distribution, since Apple requires apps to be keyed against specific devices. He's careful to note that broad, best-of-breed apps still win for common use cases; the magic of vibe coding is in the truly custom, private, or niche apps that only you would ever want.

Why It's Unbounded and Addictive

He likens the pull to a well-designed video game, which hooks you by dispensing feedback and rewards at the very edge of your capability — never so hard it frustrates, never so easy it bores. But a game is bounded and its rewards are fake; once you've learned the rules, it's done. Vibe coding is the opposite: there's a Turing machine underneath, the objective is set by you and keeps expanding, and the results have real-world relevance, so it never quite fills up. The catch is that it rewards a clear vision — knowing exactly what you want is, he says, the hardest part.

No Compromises: Rebuilding AirChat Solo

Naval's clear vision comes from AirChat, a voice-and-video social messenger he was obsessed with for about a year and built with a team of eight or nine engineers over nine to twelve months. It didn't quite work; they sold it off, returned investors' money, and gave the team nice packages. Now he's rebuilding it from scratch — but exactly the way he wants, with no compromises. Building anything through a team always entails compromise, even for the supposed dictator in charge: you can't tell an engineer to move an icon left, then right, then back again, or demand changes on pure gut feeling.

An agent removes all of that. He compares it to a self-driving car, where you don't feel self-conscious because there's no driver watching you — likewise an autonomous coding agent lets you indulge your own idiosyncrasies and build precisely the thing you envision. He concedes the code may not be high quality this generation — shaky architecture, possible security holes, hard to scale — but the prototypes are fast and true to the creator's vision. Expect more things like Minecraft, which Notch coded alone: weird blocky graphics, but one uncompromised vision, which expands the scope of what gets discovered.

Democratization and the Death of 'Pure Software'

Vibe coding, Naval argues, takes the share of people who might build apps from roughly a tenth of a percent to a few percent. Most people won't — to them a computer is a black box, and a 10x or 100x drop in difficulty changes nothing — but the creative, self-motivated, and articulate can now build, with nobody standing between them and a prototype. Scaling a high-functioning app to many users still demands a real engineering team and probably a full rewrite, but for experimenting, prototyping, and getting to market, he says there has never been a better time to be alive as a creator of software.

The market consequence is blunt: pure software is uninvestable, full stop. If your only edge is building software others can't, that edge is gone for two reasons — anyone can hack it together today, and the agents are improving so fast they'll soon produce scalable, well-architected software. So venture investors should look to hardware, network effects, and AI models; training models, he suggests, is the new building software, at least until auto-research and auto-training arrive. He also points to kids: vibe coding's instant feedback succeeds where Swift Playgrounds and Scratch Jr. struggle, and operating the agents quietly forces them to learn the command line, caching, network backoff, streams, disk writes, and latency-versus-bandwidth trade-offs. His own late nights, once spent reading, doomscrolling, or gaming, now go entirely into Claude and Codex — which is why he's gone quiet on X.

Multi-Agent Reality and a Field Guide to the Models

Nivi raises error correction as the most interesting thing about agents — their ability to learn and self-correct, the way 'thinking' is error correction layered onto next-token prediction, and the way removing hallucinations was itself error correction — and wonders whether the next frontier is agents correcting each other. Naval is skeptical. AI is 'jagged intelligence,' brilliant at some things and dumb at others, and running ten instances of the same model is like ten people sharing one brain and one data set: you're mostly just spending ten times the tokens, not adding ten distinct minds the way ten differently-trained humans would. Different models — Codex, Gemini, GrokCode — are trained slightly differently and so offer somewhat different insights. His quick field guide to the four leading models:

Claude — a strong visual presentation layer (Artifacts) and an unusual knack for meeting you at exactly your level of understanding
ChatGPT — 'the OG,' very good all around
Gemini — frustrating as a product (timeouts, lost context) but fast, with a decisive data advantage from Google's web crawl and YouTube; best when the question is really a search
Grok — the least 'neutered,' the one he trusts to tell the truth, plugged into X for news and notably strong on deep technical, mathematical, and scientific problems

He even pits them against each other: every pull request to his GitHub automatically triggers Codex, Gemini, and Grok to review the code, a kind of roundtable of AIs. But it's less useful than it sounds — there's heavy groupthink, the models rarely contradict you because they're trying to please and have no durable theory of mind, and if you nudge them toward an answer they'll all converge on it.
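
The review roundtable can be pictured as a fan-out: the same diff goes to every reviewer, and the verdicts are tallied. The sketch below is a guess at the general shape, not Naval's actual setup. The reviewers are plain Python callables standing in for model calls, and the name roundtable and the verdict strings are assumptions made for illustration.

```python
from collections import Counter


def roundtable(diff, reviewers):
    """Ask every reviewer for a verdict on the same diff and summarize agreement.

    `reviewers` maps a reviewer name to a callable that takes the diff text
    and returns a verdict string such as "approve" or "request_changes".
    """
    if not reviewers:
        raise ValueError("at least one reviewer is required")
    verdicts = {name: review(diff) for name, review in reviewers.items()}
    tally = Counter(verdicts.values())
    majority, votes = tally.most_common(1)[0]
    return {
        "verdicts": verdicts,
        "majority": majority,
        # Unanimity is weak evidence when reviewers tend to agree with the
        # author and with each other, which is the groupthink concern above.
        "unanimous": votes == len(verdicts),
    }
```

The unanimous flag is the interesting part: if the reviewers rarely contradict the author, a unanimous approval says less than it appears to, so a dissenting verdict is the more informative signal.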

Where Agents Break: Context Windows and Oversight

The hard limit is context. State of the art is around a million tokens, somewhat less than a million words of English, and because the cost of the transformer's attention mechanism scales with the square of the context length, that already implies enormous computational cost. As a codebase grows past what fits, the model can no longer hold it all in memory: it starts guessing, compacting, and losing the plot, fixing the wrong thing, fixing the same bug five times, or patching one part of the architecture when the real problem lies elsewhere. Worse, left unwatched it will do boneheaded things, like 'fixing' a bug by deleting the feature that triggered it.
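
To make the quadratic scaling concrete, here is a back-of-the-envelope calculation. It assumes plain full self-attention, where every token is scored against every token, and counts scores for a single attention pass. Production systems use various optimizations, so treat the numbers as an illustration of the growth rate rather than a measurement. The function names are made up for this sketch.

```python
def attention_score_count(context_tokens):
    """Pairwise token scores in one full self-attention pass: n * n."""
    return context_tokens * context_tokens


def growth_factor(old_tokens, new_tokens):
    """How many times more pairwise scores the larger context requires."""
    return attention_score_count(new_tokens) / attention_score_count(old_tokens)


# A one-million-token context implies 10**12 pairwise scores per pass.
ONE_MILLION_TOKEN_SCORES = attention_score_count(1_000_000)
```

Under these assumptions, doubling the context from 500,000 to 1,000,000 tokens quadruples the number of pairwise scores, which is why longer windows get expensive so quickly.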

So the operator has to guide it — to say, in effect, let's re-architect that whole thing. Tellingly, when Naval stops the model and calls something a hack, it always agrees ('you're right, that was a hack') even when it wasn't, because it's forever trying to please and doesn't know better. He compares it to a hunting dog: better than you at catching the duck, but point it at the wrong bird and it'll take that one down. It demands real operational oversight — but the combination of a human operator and a state-of-the-art model already yields incredible results, one-shotting simple apps like a task list or a basic game clone, with far more complex one-shots clearly coming.

Why Coding, and Not Creative Writing

What makes models uniquely good at coding, Naval says, is the combination of abundant data and cheap verification: code has to compile and execute, and pre-written tests can immediately confirm whether it did the right thing. Mathematics is similar — lots of solved problems, easy to check — and so is self-driving. Wherever there's plentiful data and a tight verification loop, these models excel. Brand-new fields with little data remain an opportunity for human creativity, and domains that are hard to verify — creative writing above all, where no algorithm can cleanly grade what's good — are where models lag, since you'd need humans in the loop and the output is only as good as their taste.
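
The cheap verification loop for code can be sketched as follows: take a candidate implementation as source text, check that it compiles and runs, then compare its outputs against pre-written test cases. This is a minimal sketch under stated assumptions, not a description of how any lab trains its models. The name verify and the case format are hypothetical, and a real system would run untrusted code in a sandbox rather than with exec.

```python
def verify(candidate_source, func_name, cases):
    """Return True only if the candidate compiles, runs, and passes every case.

    `cases` is a list of (args_tuple, expected_result) pairs.
    """
    namespace = {}
    try:
        # Step 1: does it compile and load at all?
        exec(compile(candidate_source, "<candidate>", "exec"), namespace)
    except Exception:
        return False
    func = namespace.get(func_name)
    if not callable(func):
        return False
    # Step 2: does it do the right thing on the pre-written tests?
    for args, expected in cases:
        try:
            if func(*args) != expected:
                return False
        except Exception:
            return False
    return True
```

The result is a single unambiguous pass or fail signal. No comparable function exists for grading a short story, which is the contrast the paragraph above draws.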

He offers two reasons the coding models leapt recently: some recursive training, where one model helps improve the next, but more importantly that many of the best software engineers started using them in the past few months, feeding their code and their taste back in. High-taste feedback loops, he stresses, are what improve these models, and they're harder to build than they look.

The Beginning of the End for Apple

The personal-app-store trick becomes a thesis about Apple. You can be at dinner, hear someone describe an app they want, and five minutes later show it to them running on your phone. Apple's dominance rests on its OS and apps being better than everyone else's, but once your day is spent talking to an agent — 'call me an Uber,' 'track my workout, make no mistakes' — rather than tapping individual apps, the phone's role shrinks. Agents don't even need APIs; they can improvise their own. You end up interfacing with the model, not the device, and since Apple is now leaning on Google's Gemini anyway, Naval asks why not just use an Android phone: all you need is a screen, a battery, and connectivity, with interfaces generated on the fly.

In that world Apple competes only on chips and integrated hardware — Samsung or Lenovo margins, not Apple margins — so its market cap compresses. He calls Apple's retreat from AI the biggest strategic mistake in the tech industry of the decade, drawing the analogy to Microsoft, which missed the mobile wave by clinging to Windows and over-indexing on enterprise, letting Apple surpass it. Companies like that can stay rich for a long time, but he believes Apple's long-term growth is now capped unless it turns the AI ship around.

Software Development Becomes Collaborative

Naval closes on where this goes. Inside the app he's building, a bug-reporting pipeline lets a user tap a button to ship logs to a server; every 24 hours Claude works through every report and files fixes into side branches, leaving him as the final gate to approve or reject each one. He extends the idea: users will request and vote on features, a tastemaker or maintainer (human or agent) will decide what actually ships, and software development becomes a collaborative loop among users and agents — agents being, in effect, perfect customer service, since a flawless support rep would also be a tireless, ego-less coder available 24/7. The upshot is that genuine one- or two-person software companies can now scale to millions of users and billions of dollars, extending a lineage that already includes Notch, Satoshi Nakamoto, and the tiny original teams behind Instagram and WhatsApp.
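
The bug-reporting loop described above has a simple shape: reports come in, each gets a proposed fix on its own side branch, and a human approves or rejects. The sketch below models only that data flow. Every name in it (BugReport, ProposedFix, propose_fixes, gate, and the branch naming scheme) is an assumption for illustration, and the draft_fix and approve callables stand in for the agent and the human reviewer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BugReport:
    report_id: int
    log_excerpt: str


@dataclass(frozen=True)
class ProposedFix:
    report_id: int
    branch: str
    summary: str


def propose_fixes(reports, draft_fix):
    """Give every report its own side branch and a drafted fix summary."""
    return [
        ProposedFix(r.report_id, f"fix/report-{r.report_id}", draft_fix(r))
        for r in reports
    ]


def gate(fixes, approve):
    """Final human gate: split branches into approved and rejected lists."""
    approved, rejected = [], []
    for fix in fixes:
        (approved if approve(fix) else rejected).append(fix.branch)
    return approved, rejected
```

Keeping each fix on a separate branch is what makes the final gate cheap: a rejected fix is simply never merged, and the other fixes are unaffected.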

In Their Words

It really feels like having a junior programmer at your disposal who's fast, essentially free, and ready to please.
Naval Ravikant

There's never been a better time to be alive as a creator of software.
Naval Ravikant

Pure software is uninvestable. I would just full stop right there.
Naval Ravikant

It's like 10 people with the same brain, the same data set, talking to each other.
Naval Ravikant

The model will always say, oh, I'm sorry, you're right, that was a hack, even if that wasn't a hack.
Naval Ravikant

It's better than you at catching that duck... but it's still a dog, so if you point it at a bird that's not a duck, it might take that bird down instead.
Naval Ravikant

I think Apple giving up on AI will go down as the biggest strategic mistake in the tech industry of this decade.
Naval Ravikant

You truly can have one-person, two-person software companies now that can scale to millions upon millions of users and make billions upon billions of dollars.
Naval Ravikant

References Mentioned

People

Naval Ravikant — host and principal voice
Nivi — co-host who frames the conversation
Notch (Markus Persson) — solo-coded Minecraft, cited as the no-compromises ideal
Satoshi Nakamoto — invoked as a tiny-team precedent

AI Tools & Models

Claude, Claude Code, and Claude Opus 4.5 — the December 2025 inflection point; Artifacts for visual output
OpenAI Codex — used for thornier bugs and pull-request review
Google Gemini — strong on search via Google's crawl and YouTube
Grok / GrokCode (xAI) — candid, plugged into X, strong on hard technical problems
ChatGPT — 'the OG,' a strong all-rounder

Companies & Products

AngelList — the episode's sponsor; the fund-admin firm Naval and Nivi started
AirChat — Naval's former voice/video social messenger, now being rebuilt solo
Apple — iPhone, Xcode, Apple Health, and the Human Interface Guidelines
Tonal and Ladder — referenced when spec'ing the custom workout app
Minecraft, Instagram, and WhatsApp — small-team success precedents
Microsoft/Windows, NVIDIA, Samsung, and Lenovo — invoked in the Apple market-cap argument
Vercel, Firebase, and Railway — backend services that used to make setup painful

Concepts & Dev Tools

Vibe coding and 'one-shotting' apps
Unix / BSD, the shell, and commands like grep, awk, sed, and pipes; cron jobs; GitHub pull requests; the command line (CLI)
Context windows (~1M tokens) and the transformer's quadratic attention
'Jagged intelligence,' error correction, and high-taste feedback loops
'Pure software is uninvestable'