The VGER Blog

95% Failure

Written by Patrick O'Leary | Aug 22, 2025 9:50:58 PM

This past week I’ve been peppered with questions about the reported 95% failure rate of AI projects in a paper titled "The GenAI Divide: STATE OF AI IN BUSINESS 2025" (not public yet; requires signup and approval) from an MIT team working on Project NANDA.

Side Note: NANDA is MIT’s framework for building distributed, capable agent systems using standards like MCP and A2A; it’s part of what they call the emerging ‘Agentic Web.’
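For the curious, here’s a minimal sketch of what an MCP integration looks like, using the official `mcp` Python SDK. This is my own illustration, not from the paper; the server name and the `lookup_order` tool are hypothetical placeholders.

```python
# Minimal MCP tool server sketch using the official Python SDK ("mcp" package).
# The server name and lookup_order tool are illustrative placeholders.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")

@mcp.tool()
def lookup_order(order_id: str) -> str:
    """Return the status of an order (stubbed for illustration)."""
    return f"Order {order_id}: shipped"

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio so an agent host can discover and call it
```

Any agent host that speaks MCP can discover and invoke that tool without bespoke glue code, which is the kind of interoperability the Agentic Web idea builds on.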

MIT report: 95% of generative AI pilots at companies are failing

It’s definitely a headline grabber. I got hold of the actual paper, so let’s dig into it.

The actual statement in the paper is:

Despite $30–40 billion in enterprise investment into GenAI, this report uncovers a surprising result in that 95% of organizations are getting zero return

Followed by:

Tools like ChatGPT and Copilot are widely adopted. Over 80 percent of organizations have explored or piloted them, and nearly 40 percent report deployment. But these tools primarily enhance individual productivity, not P&L performance.

AI technology itself works well; employees already see daily returns from "shadow AI" tools like ChatGPT, Claude, and Copilot. (The "shadow" part is when employees use the tooling without corporate accounts, contracts, BAAs, etc.)

The failures lie in enterprise-scale AI projects: ambitious pilots, unnecessarily complex systems, and poorly aligned tools that lack scalability and adaptability.

The actual story isn't about "95% failure"—it's that "95% of spending is being misdirected toward the wrong types of AI initiatives."


What Did Companies Do Wrong?

From the report and broader industry experience, three key issues emerge:

  • Over-expectation: Executives anticipated immediate P&L impact, treating AI as another "silver bullet" technology.
  • Poor focus: Companies directed spending towards flashy front-office applications (sales & marketing) instead of back-office workflows, where ROI is demonstrably higher.
  • Inadequate learning approach: Pilots became lengthy, expensive, and inflexible. While effective research models fail quickly and cheaply, enterprise AI initiatives often fail slowly and at great cost.

The MIT NANDA team is coining this the "GenAI Divide" (at least, that’s my reading of it).

Lastly, on user experience: throughout the report, user feedback described internal solutions as lacking capability, e.g. memory, flexibility, and adaptability, fueling user rejection.

For many, that drove them to adopt shadow AI, like personal ChatGPT accounts. ChatGPT is a product built with memory, tooling, and adaptable agents (search, deep research, images, etc.), all of which tap into powerful GPT LLMs.

Hundreds of millions have been spent developing it; don’t try to build your own.

Trust me, I’ve built several of my own.
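To make the capability gap concrete, here’s about as far as many internal builds get: a rolling-transcript "memory" that forgets everything when the process exits. This is my own minimal sketch, assuming the `openai` Python SDK and an API key in the environment; the model name is illustrative.

```python
# Naive "memory": the conversation transcript itself is the only state.
# Assumes the openai Python SDK and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()
history = [{"role": "system", "content": "You are an internal support assistant."}]

def ask(user_msg: str) -> str:
    history.append({"role": "user", "content": user_msg})
    reply = client.chat.completions.create(model="gpt-4o-mini", messages=history)
    answer = reply.choices[0].message.content
    history.append({"role": "assistant", "content": answer})  # fold the reply back in
    return answer
```

The distance from this to persistent, cross-session, cross-tool memory is where those hundreds of millions go.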

The Missing Role of Product Management in AI

Most of the failing efforts are engineering or data science led, with little product rigor. That means:

  • Too much focus on model quality, features, and technical novelty.
  • Not enough focus on workflow integration, user needs, and measurable outcomes.
  • AI is being deployed as a “cool feature” rather than a solution to a validated problem.

A strong AI product manager would ask:

  • Does this solve a real business pain point?
  • How will it fit into existing workflows?
  • What does success look like in P&L terms?

Without that discipline, companies build demos, not durable tools.

What’s Proven to Be Successful?

The report shows patterns in the 5% that succeed, albeit with a low sample size (N=300 / K=15; see the quick sanity check after this list):

  • External partnerships - succeed at twice the rate of internal builds.
  • Workflow-specific customization - narrow, high-value use cases scale better than general-purpose tools.
  • Learning-capable systems - systems with memory, adaptable (agentic) workflows, continuous feedback loops, and the teams to support them generate ROI.
  • Back-office focus - eliminating BPO spend, agency contracts, and repetitive admin yields real, measurable savings.
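On that sample size: assuming my reading of the shorthand is right (roughly K=15 successes out of N=300 deployments), a 95% Wilson score interval shows how fuzzy the headline "5%" really is.

```python
import math

def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score confidence interval for a binomial proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return centre - half, centre + half

print(wilson_interval(15, 300))  # roughly (0.031, 0.081)
```

In other words, the true success rate could plausibly sit anywhere between about 3% and 8%, so treat the 5% figure as directional, not precise.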

Front office AI can succeed, but only when designed with a realistic understanding of AI’s capabilities. The overlooked back office, however, often delivers faster and clearer ROI.

I have worked on far too many recovery or tiger-team projects where we found "AI" simply meant LLMs, and product gaps were filled with naive chatbots and prompts. That’s not the way to do it.

Strategies for Struggling GenAI Initiatives

For companies stuck on the wrong side of the GenAI Divide:

  • Embrace rapid, cost-effective experimentation: Instead of investing 9 months and millions in a pilot, test concepts in weeks using inexpensive prototypes.
  • Integrate product management discipline: Treat AI initiatives as products, not research experiments. Clearly define the problem, establish success metrics, and map out the adoption journey.
  • Redirect investment focus: Look beyond flashy front-office applications. Allocate resources to back-office functions and workflow automation, where ROI accumulates more predictably (focus on easy wins).
  • Support grassroots adoption: Bottom-up implementation led by power users and managers typically succeeds more often than top-down IT directives.
  • Partner with vendors: Collaborate with partners who help evolve your workflows.

Remember, you are in a time of change; you can’t have adoption without transformation.


Originally linked from https://www.linkedin.com/pulse/95-failure-patrick-o-leary-csd6e