The AI stack buyers want to see (and what each layer signals)

There is now a recognisable reference architecture for production AI. It has five layers. Each layer sends a specific signal to a sophisticated acquirer about how grown-up your AI operation actually is.

The reference architecture

a16z's widely cited write-up on emerging architectures for LLM applications captures the consensus shape of a production AI system [1]. Five layers, bottom to top:

  1. Models: the underlying LLMs (frontier and open-weight) doing the actual inference.
  2. Gateway: a router that sits in front of the models, choosing a model per request and handling fallback [2].
  3. Orchestration: workflows and agents that compose model calls and tool calls into useful work.
  4. Observability and evals: logging, tracing, and automated quality testing of every workflow [3].
  5. Governance: policy, risk, human-in-the-loop rules, and the audit trail that ties it all together [4].

What each layer signals to a buyer

Models — and why "we use the best model" is the wrong answer

A buyer wants to see a documented model policy: which models are approved for which use cases, how that decision was made, and how it is reviewed. "We use GPT-5 for everything" signals vendor concentration risk. "We use cheap models for classification, frontier models for reasoning, and the choice is logged per request" signals an operating system.
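
As a rough illustration of what "the choice is logged per request" could look like, here is a minimal sketch of a model policy expressed as data. The model names, rationales, review dates, and the `select_model` helper are all hypothetical placeholders, not a reference to any particular vendor or product.

```python
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("model_policy")


@dataclass(frozen=True)
class PolicyEntry:
    model: str       # hypothetical model identifier
    rationale: str   # why this model was approved for the use case
    review_due: str  # ISO date of the next scheduled review (placeholder)


# Hypothetical policy table: use case -> approved model.
MODEL_POLICY = {
    "classification": PolicyEntry(
        "small-cheap-model", "high volume, low complexity", "2026-01-01"
    ),
    "reasoning": PolicyEntry(
        "frontier-model", "multi-step analysis", "2026-01-01"
    ),
}


def select_model(use_case: str, request_id: str) -> str:
    """Return the approved model for a use case and log the decision."""
    entry = MODEL_POLICY.get(use_case)
    if entry is None:
        # Unapproved use cases fail loudly instead of falling back silently.
        raise ValueError(f"No approved model for use case: {use_case!r}")
    log.info("request=%s use_case=%s model=%s", request_id, use_case, entry.model)
    return entry.model
```

Calling `select_model("classification", "req-001")` returns the approved model name and leaves a log line behind. The point of the sketch is that the policy lives in one reviewable place rather than being scattered across workflows.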

Gateway — and why its absence is now a red flag

The gateway is what makes the model layer swappable. Without one, every workflow is hard-coded to a vendor; with one, vendor risk is bounded and cost is observable. Gartner has been explicit that single-vendor AI strategies are now a diligence flag.
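
To make "swappable" concrete, here is a minimal sketch of the fallback behaviour a gateway provides, assuming each vendor is wrapped as a plain function that takes a prompt and returns text. The provider functions and names below are stand-ins for illustration, not any real vendor SDK or gateway product.

```python
from typing import Callable, Sequence

# A provider is any callable that takes a prompt and returns text.
Provider = Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def call_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try providers in order; return (provider_name, output) from the first success."""
    errors = []
    for name, provider in providers:
        try:
            return name, provider(prompt)
        except Exception as exc:  # a real gateway would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for two vendors.
def flaky_vendor(prompt: str) -> str:
    raise TimeoutError("upstream timeout")


def backup_vendor(prompt: str) -> str:
    return f"echo: {prompt}"


name, output = call_with_fallback(
    "hi", [("primary", flaky_vendor), ("backup", backup_vendor)]
)
# name == "backup", output == "echo: hi"
```

Because workflows call `call_with_fallback` rather than a vendor directly, changing or reordering vendors is a configuration change, not a rewrite.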

Orchestration — and why it should mostly be workflows

A grown-up AI estate is mostly fixed workflows with a few true agents at the edges. The reverse — agents everywhere, workflows nowhere — signals an organisation that has confused capability with control. We cover the distinction in agentic AI for operations.
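
One way to picture a fixed workflow is as a predeclared list of steps that always run in the same order, in contrast to an agent that decides its own next step. The sketch below is illustrative only: the step functions are hypothetical stubs where a real system would make model or tool calls.

```python
from typing import Callable

# A step takes the current state and returns an updated copy.
Step = Callable[[dict], dict]


def run_workflow(steps: list[Step], state: dict) -> dict:
    """Run steps in a fixed, predeclared order; the path never varies."""
    for step in steps:
        state = step(state)
    return state


# Hypothetical stubs; in practice extract/classify would wrap model calls.
def extract(state: dict) -> dict:
    return {**state, "fields": state["document"].split()}


def classify(state: dict) -> dict:
    label = "invoice" if "invoice" in state["fields"] else "other"
    return {**state, "label": label}


def route(state: dict) -> dict:
    return {**state, "queue": f"{state['label']}-queue"}


INVOICE_WORKFLOW = [extract, classify, route]

result = run_workflow(INVOICE_WORKFLOW, {"document": "invoice 123"})
# result["queue"] == "invoice-queue"
```

The control signal comes from the fact that the sequence is written down and auditable: the model fills in values, but it does not choose the path.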

Observability — the layer that converts AI from claim to evidence

Tools like Langfuse, LangSmith and the major cloud observability vendors all converge on the same primitives: traces, prompts, outputs, costs, latency, eval scores. A buyer reading these dashboards can answer the only question they actually care about — does this work, reliably, at this cost — without having to take management's word for it.
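
The shared primitives can be sketched as a single record per model call. This is a simplified illustration under stated assumptions, not the schema of Langfuse, LangSmith, or any other tool: the field names, the `traced_call` helper, and the flat per-call cost are all hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TraceRecord:
    trace_id: str
    model: str
    prompt: str
    output: str
    cost_usd: float
    latency_ms: float
    eval_score: Optional[float] = None  # filled in later by an eval run


def traced_call(
    trace_id: str,
    model: str,
    prompt: str,
    call: Callable[[str], str],
    cost_per_call_usd: float,  # assumption: flat cost; real costs are usually per token
) -> TraceRecord:
    """Run one model call and capture prompt, output, cost, and latency."""
    start = time.perf_counter()
    output = call(prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    return TraceRecord(trace_id, model, prompt, output, cost_per_call_usd, latency_ms)


# str.upper stands in for a real model call.
record = traced_call("t-1", "stub-model", "hello", str.upper, 0.001)
# record.output == "HELLO"
```

A collection of such records is what lets a reader answer the reliability and cost questions from data rather than from a management presentation.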

Governance — the layer that bounds the downside

NIST's AI RMF four-function model — govern, map, measure, manage — is the standard buyers and their advisors are increasingly anchoring to. A two-page policy that maps to those four functions, with named owners and named controls, is enough at lower-middle-market scale.
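
As a rough sketch of "maps to those four functions, with named owners and named controls", the policy can be held as data and checked for gaps. The four function names come from the NIST AI RMF; the `Control` structure, the owners, and the control descriptions are hypothetical examples.

```python
from dataclasses import dataclass

# The four functions named in the NIST AI RMF.
RMF_FUNCTIONS = ("govern", "map", "measure", "manage")


@dataclass(frozen=True)
class Control:
    function: str     # one of RMF_FUNCTIONS
    owner: str        # a named person or role
    description: str  # what the control actually does


def missing_functions(controls: list[Control]) -> list[str]:
    """Return RMF functions that have no control with a named owner."""
    covered = {c.function for c in controls if c.owner.strip()}
    return [f for f in RMF_FUNCTIONS if f not in covered]


# Hypothetical, deliberately incomplete policy.
controls = [
    Control("govern", "COO", "Approves the model policy annually"),
    Control("map", "Head of Operations", "Maintains the AI use-case register"),
]

gaps = missing_functions(controls)
# gaps == ["measure", "manage"]
```

An empty `gaps` list is not proof of good governance, but a non-empty one is an easy finding for a diligence team.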

How this moves the multiple

A buyer paying a multiple of EBITDA is paying for the probability that EBITDA persists. Each layer of the stack above is, ultimately, a piece of evidence that the AI-driven margin and throughput gains in your trailing twelve months are durable rather than accidental. That is the entire argument for multiple expansion from AI, and it stands or falls on whether the stack is real.

Frequently asked questions

What does a production AI stack look like?
Five layers: models routed per task, a model gateway (e.g. OpenRouter), orchestration (workflows and agents), observability and evals (e.g. Langfuse), and governance aligned to NIST AI RMF or ISO/IEC 42001. The a16z 'emerging architectures' reference is the widely used baseline.
What does each AI stack layer signal to a buyer?
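Models → a documented model policy (approved models per use case, choice logged per request). Gateway → vendor diversification (no single-vendor concentration risk). Orchestration → discipline (mostly fixed workflows, agents only at the edges). Observability → control (every call logged, attributable, replayable). Evals → verifiability (output quality measured over time). Governance → compliance (aligned to a recognised framework).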
Do I need all five layers?
For a business going to market in 12+ months, yes, even at small scale. The cost of running a thin version of all five is small; the cost of missing one in diligence is a discount applied to the multiple.
