Jensen Huang recently described the AI economy as a five-layer cake: energy, chips, infrastructure, models, applications. He used this to sell NVIDIA's position in the stack. But if you stare at the economics of each layer long enough, a more interesting pattern emerges.
Software now carries a real marginal cost for every user. There are only two margin levers in the entire AI economy, and they apply at every layer.
Each layer will have to converge on the same business model: usage-based pricing with cost-plus margins. And the size of the "plus" — the margin you get to keep — is determined by exactly two things:
- How differentiated your offering is vs. everyone else selling the same underlying commodity unit in their own brand wrapper.
- How much you can drive down the cost of producing that commodity unit of value without your customer noticing any change in quality.
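The two levers can be sketched as a toy pricing model. This is an illustrative sketch only; the function names and the $0.10/M-token cost and 4x markup are hypothetical numbers, not quotes from any provider.

```python
# Toy model of usage-based, cost-plus pricing.
# All numbers are hypothetical, for illustration only.

def usage_price(unit_cost: float, markup: float) -> float:
    """Price per commodity unit: cost plus the 'plus' (markup)."""
    return unit_cost * (1 + markup)

def gross_margin(unit_cost: float, unit_price: float) -> float:
    """Fraction of each revenue dollar kept as margin."""
    return 1 - unit_cost / unit_price

# A provider paying $0.10 per million tokens in compute,
# selling at a 4x markup (markup=3.0 means price = 4x cost):
price = usage_price(0.10, markup=3.0)
print(f"price: ${price:.2f}/M tokens, "
      f"margin: {gross_margin(0.10, price):.0%}")
```

Differentiation raises the markup you can sustain; cost innovation lowers the `unit_cost` underneath it. Both widen the same spread.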
The Commodity Unit at Every Layer
Each layer of the stack has an atomic unit of value. Each has a cost driven by usage. Each is, at its core, a commodity.
- Energy: The commodity unit is an electron, priced per kilowatt-hour. Utility-scale solar PPAs run $0.04-0.06/kWh depending on region. Retail rates for data centers run $0.10-0.17+/kWh. The spread between those numbers is where margin lives.
- Chips: The commodity unit is a processor, priced per chip. An NVIDIA H100 costs ~$3,320 to manufacture and sells for $25,000-$40,000. That's an 88% gross margin on the chip itself, 75% at the company level (the fattest in the stack right now).
- Infrastructure: The commodity unit is a GPU-hour, priced per hour of compute. H100 cloud rates have crashed from $8-12/hr at peak to $1.49-3.90/hr today — a 44-75% decline depending on provider. Perceived supply scarcity, paired with the market's certainty of insatiable demand, makes this layer attractive in the near term.
- Models: The commodity unit is a token, priced per million tokens. GPT-4-equivalent performance went from ~$36/million tokens at GPT-4's launch in early 2023 to $0.40/million tokens today via GPT-4.1 mini. The list prices alone imply a ~90x drop; factor in blended input/output rates and efficiency gains and the effective decline approaches ~1,000x in roughly three years.
- Applications: The commodity unit is now intelligence measured in tokens. Pricing is still messy — 61% of SaaS companies now use hybrid models blending seats with usage — but it's migrating toward consumption-based pricing because, for the first time in software history, the marginal cost of serving a user is non-trivial.
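A back-of-envelope check on the figures cited above, using the midpoint of the H100 price range and the conservative end of the GPU-hour decline (approximate numbers as cited, not independent data):

```python
# Back-of-envelope arithmetic on the per-layer figures above.

h100_cost = 3_320
h100_price = (25_000 + 40_000) / 2          # midpoint of cited ASP range
chip_margin = 1 - h100_cost / h100_price    # roughly 0.90 at the midpoint

gpu_peak, gpu_now = 8.0, 3.90               # $/GPU-hour, conservative ends
gpu_decline = 1 - gpu_now / gpu_peak        # roughly half, before the steeper cases

tok_2023, tok_now = 36.0, 0.40              # $/M tokens
tok_decline_x = tok_2023 / tok_now          # 90x on list prices alone

print(f"chip gross margin ~{chip_margin:.0%}")
print(f"GPU-hour decline ~{gpu_decline:.0%}")
print(f"token price decline ~{tok_decline_x:.0f}x (before efficiency gains)")
```

Every layer is the same shape: a commodity unit, a falling cost curve, and a spread somebody is fighting to keep.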
Traditional SaaS had near-zero marginal cost. Every new user was almost pure margin. AI applications burn tokens on every interaction. The economics are fundamentally different, and the pricing has to follow. Applications are not dead; they just have to be more compelling to buy than to build, and priced on usage rather than subscription.
The Two-Axis Framework
Margin in this stack comes down to two variables: how much customers are willing to pay (driven by differentiation) and how far you can push your costs down (driven by infrastructure innovation). The interesting question for each layer is: how much room is there to move on each axis?
Some layers have enormous room for product differentiation. Others are stuck selling something indistinguishable. Some layers have wide-open opportunities for cost innovation. Others are constrained by physics or regulation.
Map every layer on these two axes and you get a clear picture of where durable margin will live and where it won't.
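The framework itself fits in a few lines. The formulas and the two example positions below are illustrative assumptions, not a scoring of any specific layer:

```python
# The two-axis framework as code: per-unit margin is the gap between
# what differentiation lets you charge and what cost innovation lets
# you spend. Inputs are hypothetical scores, not real-layer data.

def margin_per_unit(commodity_cost: float,
                    differentiation_premium: float,
                    cost_reduction: float) -> float:
    """Premium lifts the price you can charge; cost innovation
    lowers what you spend producing the same unit."""
    price = commodity_cost * (1 + differentiation_premium)
    cost = commodity_cost * (1 - cost_reduction)
    return price - cost

# Same $1.00 commodity unit, two different positions on the axes:
stuck = margin_per_unit(1.00, differentiation_premium=0.05, cost_reduction=0.05)
strong = margin_per_unit(1.00, differentiation_premium=0.50, cost_reduction=0.40)
print(f"undifferentiated, no cost edge: ${stuck:.2f}/unit")
print(f"differentiated + cost innovator: ${strong:.2f}/unit")
```

A layer pinned near zero on both axes sells a pure commodity; a layer with room on both compounds the spread from both sides.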
The Bookends Win