Issue #166 | Write Is the Final Boss of AI

Last week, I made the case that AI raised the floor on mediocrity rather than the ceiling on excellence: that adequacy is now free, which makes it worthless as a competitive position.
Then this week’s news cycle did its level best to test the thesis: Meta launched its official Ads AI Connectors (an MCP server and a CLI, both in open beta) that let advertisers run Meta campaigns from inside ChatGPT, Claude or any MCP-compatible agent. This change came on the heels of a months-long campaign by Meta to ban users who connected AI systems via unverified rails – so whether or not it lasts is anyone’s guess (as is the fate of the users whose ill-timed connections resulted in permabans). Then Google (Alphabet) posted exceptional Q1 results – but the highlights were $35.7B in single-quarter CapEx and 2026 AI infrastructure guidance raised to $180–190B. Microsoft is at $190B for the year. Meta is at $125–145B. The Mag-7 alone will account for ~$0.75T in AI CapEx this year. You read that right: three-quarters of a trillion dollars in 2026 AI CapEx from seven companies.
The conversation that #165 kicked off in DMs kept circling back to a related-but-distinct question, and this week’s news provided a resounding answer: if AI commoditized the visible output, what about the visible deployments? Should the rest of us be running the same playbook the giants are running?
The honest answer is no. And the reason has nothing to do with capability.
The Asymmetry Most People Are Missing
Most of the discourse around AI deployment treats “AI does things” as a single maturity curve: first you let it summarize, then you let it draft, then you let it act. That framing collapses two operations that aren’t in the same category, let alone on the same curve.
Read-actions produce intermediate output. Summarize this thread. Analyze this dataset. Draft this brief. Surface these patterns. The output is presented to a human who exercises judgment before anything goes too far – they edit, delete, modify, re-prompt, challenge, whatever. The commit boundary stays where it has always been: between the AI and the person who decides whether to act on what the AI produced. Errors at this layer are bounded by time-to-discovery, which is short, because a human is in the loop precisely to discover them.
Write-actions invert this. The agent is the commit boundary. Sent emails. Executed trades. Deployed code. Filed contracts. Placed orders. Posted content. Reallocated budget. The human is removed from the loop precisely at the moment irreversibility enters the system. An error at this layer is not a draft to be revised. It’s a transaction that has cleared, an account that has been closed, a customer who has been wronged, a post that’s visible for the entire internet to see.
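To make the commit boundary concrete, here’s a minimal sketch in Python – hypothetical tool names, not any particular framework’s API – of a dispatcher that treats the two classes differently: reads come back as drafts for a human to judge, writes refuse to execute without a named human approval.

```python
from dataclasses import dataclass
from typing import Callable

READ_TOOLS = {"summarize", "analyze", "draft"}           # output goes to a human
WRITE_TOOLS = {"send_email", "execute_trade", "deploy"}  # output IS the commit

@dataclass
class ActionRequest:
    tool: str
    payload: dict
    approved_by: str | None = None  # named human owner, required for writes

def dispatch(req: ActionRequest, tools: dict[str, Callable]) -> dict:
    """Route an agent action across the commit boundary."""
    if req.tool in READ_TOOLS:
        # Read path: the human downstream remains the commit boundary.
        return {"status": "draft", "output": tools[req.tool](req.payload)}
    if req.tool in WRITE_TOOLS:
        # Write path: the agent would BE the commit boundary,
        # so hold the action until a named human signs off.
        if req.approved_by is None:
            return {"status": "held", "reason": "write requires human approval"}
        return {"status": "committed",
                "output": tools[req.tool](req.payload),
                "approved_by": req.approved_by}
    raise ValueError(f"unknown tool: {req.tool}")
```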
The error distribution between these two regimes is not similar-in-shape and different-in-magnitude. It is structurally different. Read errors are linear and recoverable; the cost is bounded by the review step. Write errors are convex and fat-tailed, because time-to-detection becomes the multiplier on the size of the loss.
This is the Taleb distinction, and it’s worth invoking directly. In Mediocristan, observation converges on truth. Watch a hundred deployments, learn the failure rate, plan accordingly. In Extremistan, observation systematically understates risk because the distribution is dominated by rare, high-magnitude events that haven’t yet appeared in your sample. Write-action AI deployment lives in Extremistan. Sampling success at scale isn’t evidence of safety – it’s the prelude to the case study.
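If you’d rather feel the shape difference than take it on faith, a toy simulation does it. Every number below is invented for illustration: read errors cost at most a review cycle, while write losses get multiplied by a heavy-tailed time-to-detection.

```python
import random

random.seed(166)
N = 100_000

# Read regime: every error costs roughly one review cycle. Bounded, linear.
read_losses = [random.uniform(50, 500) for _ in range(N)]

# Write regime: the same base error, multiplied by hours-to-detection drawn
# from a Pareto (heavy-tailed) distribution. All parameters are invented.
def hours_to_detection(alpha: float = 1.3, x_min: float = 1.0) -> float:
    u = 1.0 - random.random()           # u in (0, 1], avoids division by zero
    return x_min * u ** (-1.0 / alpha)  # inverse-CDF sampling of a Pareto

write_losses = [random.uniform(50, 500) * hours_to_detection() for _ in range(N)]

for name, losses in (("read", read_losses), ("write", write_losses)):
    s = sorted(losses)
    print(f"{name:>5}: mean=${sum(s)/N:,.0f}  "
          f"p99=${s[int(0.99 * N)]:,.0f}  max=${s[-1]:,.0f}")
```

Run it and the read column’s worst case sits near its mean, while the write column’s worst case lands orders of magnitude above it – and a small sample would never have shown you that.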
That distinction is where most builders, operators and vibe-coded whatevers are accumulating exposure they aren’t modeling.
The Rails Always Lag the Capability
Here’s the part of this conversation almost no one is having: the historical pattern for transformational technologies is not “invented, then dominant.” It’s “invented, then dormant for decades while the surrounding infrastructure catches up.” The lag between capability and infrastructure is where the actual fortunes are made and lost, and the pattern is consistent enough to be predictive.
Trevithick’s locomotive ran in 1804. The transcontinental railroad was hammered together in 1869. Standardized time zones – which were essentially imposed on the United States by the railroad industry to make scheduling possible – didn’t arrive until 1883. The real arrival of rail as an economically dominant force came closer to 80 years after the locomotive existed.

Henry Ford made the car affordable in 1908. The Federal-Aid Highway Act funded the interstate system in 1956. The 48 years between were filled with gasoline distribution networks, driver licensing, traffic codes, automotive insurance, the medico-legal apparatus around vehicle injury, parking systems, signage standards. Eisenhower’s interstates were the capstone, not the foundation.

Edison’s Pearl Street Station opened in 1882. Rural electrification wasn’t substantively complete until the 1950s. It took in excess of 70 years to deploy the most obviously transformative general-purpose technology of its era.
Even the internet, which feels like it arrived all at once, sits atop a regulatory inheritance most people don’t think about: the 1934 Communications Act framework, common carrier doctrine, FCC oversight, the 1996 Telecommunications Act that enabled commercial deployment, Section 230 immunity that made user-generated content businesses viable. None of those were built by internet companies. None of them could have been built in time. Every commercially important firm on the public network is operating inside infrastructure it didn’t author and couldn’t have replicated on its own timeline.
The pattern decomposes into four categories of rails any general-purpose technology requires:
- Physical infrastructure for AI – compute, power, network – is the only one being built at scale today, and the only one Wall Street is pricing.
- Regulatory infrastructure – liability frameworks for autonomous action, identity standards for agents, audit and disclosure mandates – barely exists.
- Trust and verification infrastructure – cryptographic agent identity, action attestation, reversibility, insurance markets for agent error – essentially doesn’t exist outside narrow domains.
- Human and institutional infrastructure – credentialing, approval workflows that don’t collapse into rubber-stamping, organizational liability assignment, capital instruments that price agent risk – is mostly absent at the scale that general write-action deployment would require.
The capability layer is on a five-year curve. Three of the four rail categories are on 20–30 year curves. The objection that software rails are faster than physical rails is true in the literal sense and false in the operational sense. Code is fast. Trust is slow. Liability frameworks are slow. Insurance markets are slow. Court precedent is slow. Human institutions update on human-institution timelines regardless of how fast the underlying technology improves.
The gap between those curves is where the cautionary tales will live.
The Visible Record Is a Press Release
The case for “this is the playbook” is built on the assumption that the visible record reflects the actual record. It doesn’t. Reasoning about agent risk from public failure cases is like reasoning about military losses from press releases. The filtering is real:
Direct economic incentive. At a large firm, a $4M loss disappears into a cost of revenue line item. Disclosure triggers reputational damage, legal bills, compliance reviews, potentially regulatory scrutiny. Absorption is cheaper – so it is chosen.
NDA-protected settlements. Most counterparties resolve faster and cleaner under non-disclosure. Both sides have incentive to keep it quiet. When a rogue AI agent inadvertently gives airline customers a 90% discount on their fares, the airline asks you to sign a piece of paper, then honors the fare and throws in some freebies (miles, vouchers, whatever). You’re delighted; they don’t have to deal with bad press or with millions of users trying to find a similar exploit in their system.
Regulatory reporting thresholds. Most operational errors don’t cross the magnitude bar that triggers an 8-K or its equivalent. They never become public information.
Internal classification. AI-driven failures get logged as “system errors,” “process exceptions” or “reconciliation issues” because virtually no company’s P&L has AI-specific categories. You can spend hours scouring public-company P&Ls and come up with nothing even when there’s a real something to find – simply because the “something” is buried in a seemingly unrelated line item.
What you’re seeing in the public record – the think pieces, the glowing features, the X threads, the LinkedIn posts, the podcast interviews – is the survivor cohort, plus the specific subset of incidents each firm chose to disclose or publish. That isn’t dishonest. It’s just what reporting looks like when the reporters control the narrative. Treating it as a representative sample of what agent deployment actually produces gets you the wrong distribution and, more importantly, the wrong floor on expected loss.
When JPMorgan talks about LOXM optimizing trade execution (and the resulting gains on those trades and savings vs. human traders), they’re likely not disclosing that agent’s training costs or errors. JPMorgan moves ~$10T a day around the globe. They move billions in commodities positions every day. Nearly a trillion in ForEx. If LOXM messed up (and I’m not saying it did) and made a $4M error, would that be a material event to the bank? No. $4M – in the grand scheme of things – means nothing to JPMorgan. Not only is it a rounding error on a single 10-Q line item, it’s a rounding error vs. the reputational damage the firm would take if such a mistake ever got out. Ergo: if LOXM (or any of JPMorgan’s 300+ agents) made a mistake, it’s in the bank’s interest to handle it quietly.
The harder thing to see is that successful deployments aren’t reducing systemic tail risk – they’re concentrating it. Three mechanisms drive this:
- Correlation. As more firms deploy similar agents using similar models with similar guardrails, the failure modes correlate across the ecosystem rather than diversifying away. When one agent class fails, it fails across all the firms that deployed it – the 2008 mortgage problem, where each individual exposure looked low-risk and the correlated failure mode was invisible until it materialized.
- Complacency. Each successful quarter without an incident reduces vigilance, expands the scope of authorized agent actions, and removes redundant checks. The system gets more brittle as it appears more reliable – the Three Mile Island pattern, the Boeing MCAS pattern, where operational success without an underlying safety culture creates the conditions for catastrophic failure precisely because it relaxes the vigilance that was preventing it.
- Leverage. As deployments succeed, the actions agents are trusted with scale up. The agent that started by drafting emails ends up executing trades. The agent that started by suggesting copy ends up allocating six-figure daily budgets. The blast radius of an eventual error grows even as the per-action error rate falls. Net tail exposure increases.
The current period of widely-publicized successful AI agent deployments isn’t evidence the tail risk thesis is overstated. It’s the period during which tail risk is accumulating fastest, in firms that currently believe they’ve solved the deployment problem.
The Balance Sheet Asymmetry
And that is where the playbook actually breaks down for the 99% of firms that aren’t JPMorgan.
A multinational bank with a trillion-dollar balance sheet can absorb a $4M agent failure as a rounding error. They can self-insure. They have legal infrastructure that contemplates autonomous error. They have deep, complex relationships with every regulator. They have a cost-of-capital advantage that lets them eat the loss and keep moving.
A 25-person company doesn’t have any of those things. Neither does a Series A SaaS company. Neither does a regional services firm doing $30M in revenue. For those operators, the first material AI agent failure may not be a rounding error – it may be a survival event.
That may sound extreme, but it’s not.
Imagine a healthy agency running $10M in monthly client ad spend. They deploy a budget-pacing agent. A vendor API change breaks an assumption the agent was making, and over a single weekend it overspends by $400k – running at roughly 1.6x its normal ~$333k/day pace for two days before anyone catches it. That single error consumes more than the agency’s cash on hand. The E&O policy denies the claim because it never contemplated autonomous agent error. Two impacted clients churn, while some of the remaining clients begin asking uncomfortable questions about what else might be running unsupervised. The firm is now months from insolvency, and the only path through requires raising emergency capital from a position of weakness.
The same nominal incident at a firm with a hundred times the balance sheet would be a footnote. At this agency, it’s the agency.
That’s the key piece that’s missing in the public discourse: the Fortune 500 didn’t deploy agents because the technology was uniquely safe. They deployed agents because their balance sheets, legal teams and risk-mitigation systems made the downside survivable. The visible behavior – let’s go spin up agents like [insert F500 company here] – was the easy thing to copy. What no one mentions is the invisible architecture underneath those deployments that makes them possible. That’s harder to copy.
It’s also worth naming what the small operator is actually choosing when they skip the rails. They think they’re choosing speed over caution. They’re really choosing exposure over insurance, in a market where the premium hasn’t been quoted because no one is selling the policy yet.
Meta Just Did the Lesson in Public
If you wanted a real-time case study on why the rails matter, you got one this month.
Through March and into April, Meta was permanently banning ad accounts that connected through unverified MCP servers. The open-source kind. The ones that took a personal access token, called the Marketing API directly without rate limiting, and looked indistinguishable from bot traffic to Meta’s security systems. The advertisers running them (mostly) weren’t malicious. They were doing exactly what the LinkedIn case studies told them to do: connecting an AI agent to a live ad account so they could prompt their way to campaign management. Meta’s ban bot didn’t ask about intent. It asked whether the connection was authenticated, scoped, rate-limited and auditable. For most of the open-source MCPs floating around, the answer was no. Accounts disappeared. Appeals were rejected. One user posted publicly on X that Meta had permanently banned his account after 16 years and over $1.5M in lifetime ad spend; the trigger was connecting Claude to his account via an MCP server, and the only response to his appeal was an automated “Your review was unsuccessful.” He wasn’t the only one.
This week, Meta launched its own. The official Meta Ads AI Connectors (an MCP server and a CLI, both in open beta) went live with verified app status, scoped permissions, rate limiting, audit trails and a sanctioned authentication flow. Meta did not change its position on AI agents in ad accounts. It changed which agents it was willing to recognize. Same write access, different rails.
Read that sequence again, because it’s the entire thesis on a 4-week timeline. The visible behavior – AI agents managing campaigns – was always the easy part to copy. The invisible architecture (verified app review, scoped permissions, rate limits, audit logs) was the part Meta was actually enforcing. The operators who copied the visible behavior without the architecture had their accounts banned. The operators who waited for the architecture were rewarded.
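For flavor, here’s roughly what “same write access, different rails” looks like in code. This is a hypothetical wrapper – not Meta’s actual connector API – but it shows the shape of what was being enforced: scope checks, a rate limiter and an audit record around the same call a raw personal-access-token script would fire off unthrottled.

```python
import time
from dataclasses import dataclass, field

@dataclass
class ScopedAdsClient:
    """Hypothetical sanctioned rail: same write access, different rails."""
    granted_scopes: set[str]            # e.g. {"ads_read", "ads_write:budget"}
    max_calls_per_minute: int = 30
    _window: list[float] = field(default_factory=list)
    audit_log: list[dict] = field(default_factory=list)

    def call(self, endpoint: str, required_scope: str, payload: dict) -> dict:
        # Scoped permissions: the token can only do what it was granted.
        if required_scope not in self.granted_scopes:
            raise PermissionError(f"scope {required_scope!r} not granted")
        # Rate limiting: back off instead of looking like bot traffic.
        now = time.monotonic()
        self._window = [t for t in self._window if now - t < 60]
        if len(self._window) >= self.max_calls_per_minute:
            raise RuntimeError("rate limit exceeded; retry later")
        self._window.append(now)
        # Audit trail: every write is attributable after the fact.
        self.audit_log.append({"ts": now, "endpoint": endpoint,
                               "scope": required_scope, "payload": payload})
        return {"ok": True}  # stand-in for the real API response

# The unverified-MCP version was effectively a raw token in a request loop:
# no scopes, no throttle, no record. Indistinguishable from a bot, because
# architecturally it was one.
```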
This is the historical pattern playing out in compressed time. Meta built the rails before it sanctioned the deployment. Everyone who tried to deploy ahead of the rails got run over by the regulatory infrastructure – which in this case is Meta’s enforcement layer, but the dynamic is identical to what happens when a regulator, a court, or an insurance carrier eventually shows up and asks who authorized the autonomous action. The operators running ahead of the rails are operating uninsured. They just don’t know it yet.
Constraint Architecture Is the Actual Moat
Which brings me back to last week’s argument from a different angle.
The capability layer is commoditizing. $0.75T in 2026 Mag-7 CapEx is very difficult to ignore. The frontier models are an API call away, the gap between them is shrinking, and compute is being built at a pace that all but guarantees capability access becomes table stakes. If your competitive position depends on having access to a capability your competitors don’t, that position has a six-to-eighteen-month half-life.
The constraint layer is doing the opposite. It’s where the actual leverage lives, and it’s structurally defensible because it isn’t a feature. It’s a system. Hard dollar caps with kill switches. Read-only by default, write-capable by exception, with a named human owner on every exception. Reversibility built into the write layer rather than bolted onto the recovery process. Insurance riders that explicitly contemplate autonomous error rather than implicitly excluding it. Audit trails robust enough to satisfy a regulator with subpoena power or an attorney during discovery.
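As a sketch – illustrative names, and a cap sized to whatever your balance sheet can actually absorb – here’s the difference between the pacing failure above and a bounded one:

```python
from dataclasses import dataclass

@dataclass
class SpendGovernor:
    """Hard dollar cap plus kill switch around an agent's spend actions."""
    daily_cap_usd: float        # sized to survivable loss, not expected spend
    spent_today_usd: float = 0.0
    killed: bool = False

    def authorize(self, amount_usd: float) -> bool:
        if self.killed:
            return False
        if self.spent_today_usd + amount_usd > self.daily_cap_usd:
            self.killed = True  # trip the kill switch; a named human resets it
            self.alert_owner(f"cap breached at ${self.spent_today_usd:,.0f}")
            return False
        self.spent_today_usd += amount_usd
        return True

    def alert_owner(self, msg: str) -> None:
        print(f"[PAGE ON-CALL OWNER] {msg}")  # stand-in for a real pager

# With a $50k daily cap, the weekend failure above is a ~$50k incident, not
# a $400k one: the blast radius is set by the cap, not by time-to-detection.
gov = SpendGovernor(daily_cap_usd=50_000)
for order in (20_000, 20_000, 20_000):  # runaway agent keeps ordering spend
    print(gov.authorize(order))          # True, True, False (killed)
```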
None of that is sexy. None of it shows up in a case study. All of it is what determines whether the first agent failure is a learning event or a closing event.
This is the same lesson as last week, applied one layer deeper. Last week’s issue argued that emulating the visible artifacts of competent output (the newsletter, the LinkedIn post, the deck) without the underlying mechanism (the perspective, the relationship, the worldview) produces work that exists but doesn’t accomplish anything. This week’s argument is the operational version: emulating the visible behavior of large-firm agent deployment without the underlying architecture (balance sheet, legal infrastructure, constraint systems) produces exposure that exists but isn’t survivable.
In both cases, the artifact is the easy thing to copy. The mechanism is the actual asset.
To borrow a phrase from the inimitable Taylor Holiday, “write is the final boss of AI.” That’s not because the technology can’t do it (it absolutely can and increasingly will) but because doing it without the rails is a position no operator of modest capital should hold. When you see a competitor deploying write-action AI successfully, you aren’t seeing proof that it works. You’re seeing the period before the case study, or the visible portion of a much more robust system (or balance sheet) you can’t see. The companies that figure out the constraint architecture first are going to write the case studies. The firms that copy the visible behavior without the rails are going to be the case studies.
Pick which one you’d rather be.
Cheers,
Sam

