$665 Billion Spent. 95% Without Measurable Return. The AI Measurement Crisis.

95% of organisations that deployed generative AI saw zero measurable P&L return (MIT Project NANDA, July 2025)

73% of deployments fail to achieve projected ROI, consistently across 2025–26 (McKinsey Global AI Survey 2026)

42% of companies scrapped most AI initiatives in 2025, up from 17% the year before (S&P Global, 2025)

61% of AI projects approved on projected value are never formally measured post-deployment (MIT Sloan, 2025)

20% of organisations capture 74% of all AI economic value, and measure outcomes consistently (PwC AI Performance Study, April 2026)

Global enterprise AI investment reached an estimated $684 billion in 2025, with enterprise AI spend projected at $665 billion for 2026.^{[RAND, 2025; AI Governance Today, 2026]} The numbers that followed that spend are harder to find in press releases, but they appear consistently across independent research: MIT’s Project NANDA found that 95% of organisations deploying generative AI saw zero measurable financial return.^{[MIT NANDA, 2025]} McKinsey’s 2026 Global AI Survey put the ROI failure rate at 73%.^{[McKinsey, 2026]} S&P Global found that 42% of companies scrapped most of their AI initiatives in 2025 alone — up sharply from 17% the prior year.^{[S&P Global, 2025]}

These figures come from different methodologies, different sample sizes, and different definitions of failure. The direction they point is consistent. The gap between AI spend and AI value is wide, persistent, and — based on the research — almost entirely attributable to organisational and governance failures rather than to limitations in the underlying technology.

The failure mode that recurs most often: organisations approve AI investments on projected value, deploy systems, and move on without establishing the measurement infrastructure needed to determine whether the investment delivered anything. MIT Sloan found this pattern in 61% of enterprise AI projects surveyed.^{[MIT Sloan, 2025]}

What the research actually says about AI project outcomes

The 95% figure from MIT requires careful reading. Project NANDA defines “successfully implemented” AI as systems that deliver sustained productivity gains and documented P&L impact, verified by both end users and executives.^{[MIT NANDA, 2025]} By that standard — which is demanding but operationally reasonable — most enterprise AI deployments in 2025 and 2026 do not qualify. The systems often work; the value rarely materialises in a form that can be reported to a CFO.

Gartner’s December 2025 survey of 782 infrastructure and operations leaders found that only 28% of AI use cases fully met ROI expectations.^{[Gartner, Dec 2025]} Of those who reported at least one failure, 57% attributed it to expecting too much too fast — pointing less to technical overconfidence and more to the absence of realistic, pre-defined success criteria.

RAND Corporation’s 2025 analysis put the overall AI project failure rate at 80.3%: 33.8% abandoned before production, 28.4% completed but failing to deliver expected value, 18.1% completed but cost-unjustifiable.^{[RAND, 2025]} The average large enterprise abandoned 2.3 AI initiatives in 2025 at an average sunk cost of $4.2 million per abandoned project.^{[RAND, 2025]}
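The RAND breakdown above can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in that paragraph; the variable names are illustrative, and the per-enterprise sunk cost is an implied estimate that assumes the two averages can simply be multiplied, which RAND is not quoted as doing.

```python
# Figures as quoted from RAND (2025) in the paragraph above, in percent.
failure_shares = {
    "abandoned before production": 33.8,
    "completed, expected value not delivered": 28.4,
    "completed, cost not justifiable": 18.1,
}

# The three categories should add up to the headline failure rate.
overall_failure_rate = round(sum(failure_shares.values()), 1)

abandoned_per_enterprise = 2.3     # average initiatives abandoned in 2025
sunk_cost_per_project_musd = 4.2   # average sunk cost per project, USD millions

# Assumption: multiplying two averages approximates the per-enterprise total.
implied_sunk_cost_musd = round(
    abandoned_per_enterprise * sunk_cost_per_project_musd, 2
)

print(f"Overall failure rate: {overall_failure_rate}%")
print(f"Implied sunk cost per large enterprise: ${implied_sunk_cost_musd}M")
```

Under that assumption, the quoted averages imply roughly $9.7 million of sunk cost per large enterprise in 2025.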

The pilot purgatory problem

IDC found that 88% of AI proof-of-concepts never transition to production. Gartner estimates 60% of projects without AI-ready data are abandoned before generating insight. These are not technology failures. They are planning failures — projects approved without measurable outcomes defined, deployed without governance infrastructure, evaluated against criteria never established.

The research on causes is consistent across RAND, MIT, McKinsey, and Gartner: unclear or unmeasured success criteria, inadequate data governance, misclassification of AI deployment as an IT project rather than a business transformation, and loss of executive sponsorship within six months of launch.^{[RAND, 2025; McKinsey, 2026]} RAND found that projects with sustained CEO involvement achieved a 68% success rate; those that lost sponsorship within six months achieved 11%.^{[RAND, 2025]}

“The hype on LinkedIn says everything has changed. Nothing fundamental has shifted.” — COO at a large enterprise organisation

MIT Project NANDA: The GenAI Divide — State of AI in Business, 2025

When tokens become the metric, outcomes become the casualty

Alongside the project failure data sits a separate but related problem: the metrics organisations use to track AI adoption increasingly measure activity rather than value. The clearest illustration emerged in early 2026, when Meta and Shopify created internal leaderboards tracking how many tokens employees consume.^{[CNBC, April 2026]} Nvidia CEO Jensen Huang framed the logic at GTC, suggesting he would be “deeply alarmed” if an engineer earning $500,000 per year was not spending at least $250,000 worth of compute annually.^{[CNBC, April 2026]}

Ali Ghodsi, CEO of Databricks, described the dynamic precisely: “If your goal is to just burn a lot of money, there are easy ways to do that. Resubmit the query to ten places. Put up a loop that just does it again and again. It’s going to cost a lot of money and not lead to anything.”^{[CNBC, April 2026]} PYMNTS reached the same conclusion: high token consumption can signal inefficiency rather than productivity, particularly in agentic workflows where poorly designed loops generate substantial token spend with no corresponding output.^{[PYMNTS, March 2026]}

Salesforce introduced Agentic Work Units (AWUs) in April 2026 as a direct counter: a metric that measures output and impact rather than token consumption, translating AI inputs into work completed, such as customer service resolution times and product recommendation quality.^{[Axios, April 2026]} HubSpot CEO Yamini Rangan summarised the emerging consensus: “Outcome maxxing >> token maxxing.” Appian CEO Matt Calkins compared tokenmaxxing to the Soviet Union evaluating chandeliers by weight: a system that produces heavy chandeliers and no light.^{[Axios, April 2026]}

What organisations measure	What it signals	Problem
Token consumption per employee	AI interaction volume	Gameable — loops, retries, inefficient prompting all inflate the number
Number of AI tools deployed	Portfolio breadth	Vanity metric — 84% of audits find more tools than IT knew existed
Pilot completion rate	Project throughput	Misleading — 88% of pilots never reach production
User adoption rates	Tool engagement	Incomplete — engagement without outcome measurement proves nothing
Agentic work units (Salesforce)	Work completed per AI action	Outcome-linked — ties compute to business result
P&L-verified productivity gains	Documented financial impact	Defensible — the metric that separates MIT’s 5% from the 95%
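To make the contrast between activity metrics and outcome metrics concrete, here is a minimal sketch that ranks three teams first by raw token consumption and then by tokens spent per verified outcome. The team names, the numbers, and the `TeamUsage` class are all hypothetical and exist only for illustration; this is not how Salesforce computes AWUs.

```python
from dataclasses import dataclass


@dataclass
class TeamUsage:
    name: str
    tokens: int          # tokens consumed in the period
    tasks_resolved: int  # verified business outcomes, e.g. tickets closed

    @property
    def tokens_per_outcome(self) -> float:
        # A team that delivered nothing has an unbounded cost per outcome.
        if self.tasks_resolved == 0:
            return float("inf")
        return self.tokens / self.tasks_resolved


# Hypothetical figures, for illustration only.
teams = [
    TeamUsage("retry-loop", tokens=9_000_000, tasks_resolved=30),
    TeamUsage("focused", tokens=1_200_000, tasks_resolved=240),
    TeamUsage("idle-agent", tokens=4_000_000, tasks_resolved=0),
]

# A token leaderboard rewards the biggest spender.
by_tokens = [t.name for t in sorted(teams, key=lambda t: t.tokens, reverse=True)]

# An outcome-linked ranking rewards the cheapest verified result.
by_outcome = [t.name for t in sorted(teams, key=lambda t: t.tokens_per_outcome)]

print("Ranked by tokens consumed:  ", by_tokens)
print("Ranked by tokens per outcome:", by_outcome)
```

With these invented numbers, the team at the top of the token leaderboard falls to the middle of the outcome ranking, and the team at the bottom of the leaderboard rises to the top.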

How measurement maturity determines outcomes

The clearest empirical link between measurement practice and value realisation comes from the Return on AI Institute’s March 2026 report, based on a survey of 1,006 global executives.^{[Return on AI Institute, March 2026]} The research maps a six-stage AI economic maturity model and tracks the percentage of organisations reporting a “great deal of value” from their AI investments at each stage. The gradient is steep.

Stage	Maturity level	Description	Share reporting high value
Stage 01	Pilots only, no measurement	AI in proof-of-concept phase with no defined success criteria or post-deployment tracking.	4%
Stage 02	Production, no outcome assessment	Systems reach deployment but value tracking is not established. The most common failure mode.	18%
Stage 03	Post-implementation measurement	Outcomes tracked after deployment against defined business metrics.	44%
Stage 04	Value aggregated across use cases	Impact measured consistently and consolidated across multiple AI deployments.	58%
Stage 05–06	Formal reporting to leadership / external stakeholders	AI value formally reported with financial verification. Accountability is structural, not ad hoc.	85%

The jump from Stage 2 to Stage 3 alone lifts the proportion reporting high value from 18% to 44%. The move to formal reporting raises it to 85%. These are not marginal differences. They describe a different category of organisational outcome, driven entirely by measurement and accountability discipline — not by which AI systems were deployed or how much was spent.
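The stage-to-stage gradient can be computed directly from the reported shares. The sketch below uses the percentages given in this article (the 4% for Stage 1 is the figure the article quotes for pilots with no measurement); the list layout and variable names are illustrative.

```python
# Share of organisations reporting a "great deal of value", by maturity stage,
# as quoted in this article from the Return on AI Institute (March 2026).
stages = [
    ("Stage 1: pilots only, no measurement", 4),
    ("Stage 2: production, no outcome assessment", 18),
    ("Stage 3: post-implementation measurement", 44),
    ("Stage 4: value aggregated across use cases", 58),
    ("Stage 5-6: formal reporting", 85),
]

# Percentage-point lift gained by moving up from the previous stage.
lifts = [
    (current[0], current[1] - previous[1])
    for previous, current in zip(stages, stages[1:])
]

# Gap between the lowest and highest maturity stages.
total_gap = stages[-1][1] - stages[0][1]

for name, lift in lifts:
    print(f"{name}: +{lift} points")
print(f"Total gap: {total_gap} points")
```

The two largest single steps are the move into post-implementation measurement (+26 points) and the move into formal reporting (+27 points).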

The finance function's role

The Return on AI Institute identifies finance department involvement in certifying AI value as one of seven factors that most reliably drive economic returns. Deloitte found that only 28% of global finance leaders can report clear, measurable value from AI investments. The gap between those two data points is where most AI spend currently disappears.

What separates the leaders from the rest

PwC’s April 2026 AI Performance Study found that 74% of AI’s economic value is captured by just 20% of organisations.^{[PwC, April 2026]} The research identified what distinguishes those organisations: they use AI as a growth and business reinvention engine targeting new revenue opportunities — not just cost reduction. They are 2.6 times more likely than peers to report AI improving their ability to reinvent their business model.^{[PwC, April 2026]}

MIT’s NANDA study found the 5% generating real P&L impact shared two consistent characteristics: they applied AI to specific, high-impact problems rather than horizontal deployment, and they established KPI ladders before build started — lead metrics capturing early behavioural signals within two weeks, lag metrics measuring P&L outcomes at 90 and 180 days.^{[MIT NANDA, 2025]}
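A KPI ladder of the kind described above can be written down as a small data structure before any build work starts. The following is a minimal sketch, assuming one lead metric reviewed at two weeks and lag metrics reviewed at 90 and 180 days; the `Kpi` class, the metric names, and the launch date are hypothetical and are not taken from the NANDA study.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Kpi:
    name: str
    kind: str        # "lead" (early behavioural signal) or "lag" (P&L outcome)
    review_day: int  # days after launch when the metric is reviewed


def review_schedule(launch: date, ladder: list[Kpi]) -> list[tuple[date, str, str]]:
    """Return (review date, kind, name) tuples in chronological order."""
    return sorted(
        (launch + timedelta(days=k.review_day), k.kind, k.name) for k in ladder
    )


# Hypothetical ladder, defined before build starts.
ladder = [
    Kpi("weekly active users of the assistant", "lead", 14),
    Kpi("cost per resolved case", "lag", 90),
    Kpi("P&L impact verified by finance", "lag", 180),
]

schedule = review_schedule(date(2026, 1, 5), ladder)
for review_date, kind, name in schedule:
    print(review_date.isoformat(), kind, name)
```

Fixing the review dates at approval time is the point of the exercise: the checkpoints exist before anyone knows whether the results will be flattering.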

McKinsey’s finding reinforces the governance angle: organisations with explicit ownership of responsible AI score 2.6 out of 4 on the RAI maturity model; those without score 1.8.^{[McKinsey AI Trust Survey, 2026]} The 0.8-point gap is the largest single differentiator in McKinsey’s dataset — larger than industry, region, or investment level in isolation. For a CAIO or board considering where to direct governance investment, this resolves the prioritisation question: accountability structure delivers more than budget or sector alone.

Why governance determines whether AI pays off

Seven factors that drive AI economic returns

The Return on AI Institute and Harvard Business School identified seven factors most reliably associated with AI value: clarity on what kind of value the organisation is pursuing; seeking value in both products and processes; using the full range of AI types; adopting a formal framework for value creation; involving finance in certifying outcomes; training both users and leadership; and using an economic maturity model to track how AI creates value. Six of the seven are governance and process decisions, with no dependency on model selection or infrastructure spend.

Across these studies, the organisations that generate measurable AI value share governance characteristics: they define success criteria before deployment, track outcomes after deployment, assign formal accountability for results, and involve finance in certifying the value claimed. The organisations that skip these steps — and the data suggests most do — produce spend without returns.

Larridin’s survey of 350 finance and IT leaders found 83% report Shadow AI growing faster than IT can track, and 84% discover more AI tools during audits than expected.^{[Larridin]} Heinz Marketing described it as “the equivalent of pouring fuel into a car with no dashboard, speedometer, or steering alignment.”^{[Heinz Marketing, 2025]} Spend is tracked; outcomes are not.

The Return on AI Institute’s maturity data translates this into measurable terms. Stage 1 organisations — pilots with no measurement — report high value in 4% of cases. Those at the highest maturity stage — formal outcome reporting to leadership and external stakeholders — report it in 85% of cases.^{[Return on AI Institute, March 2026]} That 81-point gap closes entirely through governance decisions, not technology choices.

Sources

MIT NANDA. The GenAI Divide: State of AI in Business 2025. MIT Project NANDA, July 2025. 300+ AI initiatives, practitioner interviews and structured surveys.

McKinsey. McKinsey Global AI Survey 2026; State of AI Trust in 2026 (March 2026). ~500 organisations surveyed Dec 2025–Jan 2026.

RAND Corporation. AI Project Outcomes Analysis, 2025. Synthesis across 2,400+ enterprise AI initiatives.

S&P Global. Enterprise AI Initiative Outcomes, 2025.

PwC. AI Performance Study, April 2026. 1,217 senior executives across 25 sectors.

Return on AI Institute / HBR. Economic Maturity for Artificial Intelligence, March 2026. 1,006 global executives. With Thomas Davenport, Babson College / MIT IDE.

Gartner. AI use cases in Infrastructure and Operations, December 2025. 782 I&O leaders.

Deloitte. Navigate the Economics of AI / AI Token Economics for CFOs, January 2026.

Axios / CNBC. Tokenmaxxing and Salesforce AWU coverage, April 2026.

PYMNTS. AI Adoption Metrics Face Scrutiny Over Token-Based Measures, March 2026.

MIT Sloan. Enterprise AI project approval and measurement study, 2025.

IDC / Larridin / Heinz Marketing. Supporting data on pilot-to-production rates, Shadow AI, and AI infrastructure visibility.