The Future of AI in Business Operations

There's an episode of *It's Always Sunny in Philadelphia* where the gang tries to solve the gas crisis by hoarding barrels of gasoline to resell, and it goes exactly as well as you'd expect. That's most enterprise AI strategy I've seen up close. Ambitious. Confident. Mostly setting the kitchen on fire.

Predictions about AI in business are usually wrong in a specific way. They overestimate what changes in two years and underestimate what changes in ten. I'm not going to pretend I know what AI looks like in 2035. I do think there are a few patterns showing up in the work right now that are worth paying attention to, because they're already changing how the smart teams build things.

The first pattern: the unit of automation is shifting from the task to the workflow. For the last decade, "AI in business" mostly meant single-purpose models doing one thing. Score this lead. Classify this email. Forecast this number. What's showing up now, and what I'm spending most of my time on, is multi-step workflows where an agent or a chain of agents handles the whole arc, including the ambiguous parts in the middle. This is harder to build, harder to evaluate, and substantially more useful when it works. It also fails in much weirder ways, which is its own design problem.

The second pattern: the hard problem is no longer the model. It's the boundary between the model and reality. A model that's 95% accurate on a benchmark is not the same as a system that's 95% accurate in production, because production is full of API timeouts, stale data, ambiguous user input, and situations the training set never saw. Most of the work in real AI deployments now is the connective tissue: guardrails, evaluation pipelines, the observability layer that catches drift, the human-in-the-loop fallbacks for the cases the model handles badly. None of that is sexy. All of it is what determines whether the system is actually useful.
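To make the "connective tissue" a little more concrete, here is a minimal sketch of one piece of it: a wrapper that accepts the model's answer only when the call succeeds and the model is confident, and otherwise routes the case to a person. Everything in it is an assumption for illustration. `Prediction`, `Decision`, `decide`, the stub models, and the 0.9 threshold are hypothetical names and values, not any particular library's API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Prediction:
    label: str
    confidence: float  # model's self-reported score in [0, 1] (assumed available)


@dataclass
class Decision:
    label: Optional[str]  # None when the case is handed to a person
    route: str            # "auto" or "human"
    reason: str


def decide(
    text: str,
    model: Callable[[str], Prediction],
    min_confidence: float = 0.9,  # hypothetical threshold; tune on your own data
) -> Decision:
    """Accept the model's answer only if the call succeeds and it is confident."""
    if not text.strip():
        return Decision(None, "human", "empty input")
    try:
        pred = model(text)
    except Exception as exc:  # timeouts, malformed responses, upstream outages
        return Decision(None, "human", f"model call failed: {type(exc).__name__}")
    if pred.confidence < min_confidence:
        return Decision(None, "human", "low confidence")
    return Decision(pred.label, "auto", "ok")


# Stub models standing in for a real classifier (hypothetical).
def confident_model(text: str) -> Prediction:
    return Prediction("refund_request", 0.97)


def unsure_model(text: str) -> Prediction:
    return Prediction("refund_request", 0.55)


def flaky_model(text: str) -> Prediction:
    raise TimeoutError("upstream timed out")


if __name__ == "__main__":
    for model in (confident_model, unsure_model, flaky_model):
        print(model.__name__, decide("I want my money back", model))
```

The point of the sketch is that the model call is only one branch out of four. The other three are the production cases a benchmark never scores.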

The third pattern: the business case is moving from cost savings to capability expansion. The first wave of AI in business was sold on "this will let you do the same thing with fewer people." That framing is fine, but it's small. The more interesting framing, and the one I see working better in practice, is "this will let you do something you couldn't do before at all." A small team can now run support workflows that would have required ten people. A founder can analyze documents that would have required a research department. A venue can handle bookings that would have required a full-time staffer. The math changes when you stop comparing AI to a person and start comparing it to "didn't happen at all."

The fourth pattern, and this one I'm less sure about: the value is going to concentrate in the unglamorous middle. Everyone is paying attention to the foundation models on one end and the consumer-facing chat interfaces on the other. The actual money, I suspect, is going to be made by the boring infrastructure companies in the middle. The vector database. The evaluation framework. The orchestration layer. The compliance tooling. This is where the work is hard and the brand value is low and the moats are real, which is usually where the durable businesses live.

There's a related pattern I want to flag that I think is underappreciated. The biggest blocker to AI deployment in most enterprises is not technical. It's organizational. Companies don't have anyone whose job it is to own the AI system end-to-end. Engineering builds the model. Operations runs the workflow. Compliance reviews the output. Nobody is responsible for whether the whole thing actually works. Until that role exists, deployments stall. The companies that figure out how to staff it are going to move much faster than the ones that don't.

A few things I'm watching with curiosity rather than confidence:

The evaluation problem is going to get harder before it gets easier. We're building systems that we can't fully test. Nobody has a great answer for this yet. The teams that develop strong internal evaluation cultures are going to win quietly while everyone else fights about model rankings.

Small models are going to matter more than people think. Not because they're better than the big ones, but because the cost difference at scale is enormous, and a 7B model fine-tuned on your specific task often beats a 400B general-purpose model on the work you actually care about. I've seen this play out in real deployments. The pattern repeats.
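The cost argument is easier to feel with numbers, so here is a back-of-the-envelope sketch. Every figure in it is a hypothetical placeholder I picked for illustration (request volume, tokens per request, and both per-token prices), not a quote from any vendor. Swap in your own.

```python
def monthly_cost_usd(requests_per_day, tokens_per_request,
                     usd_per_million_tokens, days=30):
    """Total tokens over the period times a flat per-million-token price."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1_000_000 * usd_per_million_tokens


# All values below are hypothetical placeholders, not real pricing.
REQUESTS_PER_DAY = 200_000
TOKENS_PER_REQUEST = 1_500
SMALL_MODEL_USD_PER_M = 0.20  # assumed price for a small fine-tuned model
LARGE_MODEL_USD_PER_M = 5.00  # assumed price for a large general-purpose model

small = monthly_cost_usd(REQUESTS_PER_DAY, TOKENS_PER_REQUEST, SMALL_MODEL_USD_PER_M)
large = monthly_cost_usd(REQUESTS_PER_DAY, TOKENS_PER_REQUEST, LARGE_MODEL_USD_PER_M)

print(f"small: ${small:,.0f}/month, large: ${large:,.0f}/month, "
      f"ratio: {large / small:.0f}x")
```

Under these made-up inputs the small model comes to roughly $1,800 a month against $45,000 for the large one. The absolute numbers mean nothing. What matters is that the ratio between the two prices carries straight through to the bill, and the gap in dollars grows linearly with volume.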

The interface is the product. People are going to stop caring which foundation model is under the hood, the same way they stopped caring which database their app uses. The differentiator will be how well the system handles the messy parts that the model doesn't see.

The thing that won't change, I think, is that the value of AI in any given business is going to be determined by how well someone in that business can translate vague problems into concrete ones. That skill is rare now and will stay rare. Models get cheaper. Translation doesn't.

At the end of *The Maltese Falcon*, Sam Spade is asked what the bird is, and he calls it "the stuff that dreams are made of." Most AI predictions traffic in the same material. Light, valuable-sounding, worth almost nothing once you weigh them. I'd rather be wrong about all of this in a useful way than right in a generic one. If any of these patterns map to what you're seeing on the ground, or if you think I'm reading the field wrong, I'd want to hear it.
