Why Enterprise AI Agents Fail Before the Model Ever Runs

Enterprise technology has a habit of declaring transformations finished before the work is actually done. Cloud was called complete while most companies were still running hybrid environments they barely understood. Digital transformation was declared a success while the workflows underneath were still built for paper. The same pattern is now playing out with AI.

In the past eighteen months, the conversation has jumped from how to deploy AI to how to deploy agentic AI. Almost every enterprise is now talking about autonomous systems that run long workflows on their own, with the human moved up into a supervisory role while the agent executes. The capabilities really do exist, and the productivity case is genuine. What the conversation has skipped is everything that sits beneath the agent.

I see these systems deployed across industrial customers, and the pattern in real deployments is consistent. When a deployment fails, the cause is almost always upstream of the model. The reasoning works, and the model does what it is supposed to do. The trouble is in what the agent is reading, and whether it should believe what it has read.

This is what I mean by data perception. Before an agent can be useful inside an enterprise, it has to be able to see the data that matters: the right records, in whatever format they happen to sit, under whatever access rules apply, with some confidence that what it reads is current and accurate. Building that visibility is itself an agentic problem. It takes specialized agents that read across different source systems and check each other’s work as they go. Without it, autonomy becomes a liability, because the speed of automation turns small errors into expensive ones before anyone is watching.

96% of organizations are already running AI agents in some capacity (OutSystems 2026)

94% are concerned that agent sprawl is raising complexity, technical debt and security risk (OutSystems 2026)

12% have a centralized platform to govern any of it (OutSystems 2026)

The structural mismatch

Enterprise data did not grow up to be read by an agent. It grew up to be read by people who know how to handle ambiguity, who know when to ask a question and when to stop and check before they act. A senior engineer working in an SAP system, a planner working in a Teamcenter PLM, a clinician working in a hospital EMR, a risk officer working in a bank’s trading platform: these are people who know what they do not know. An agent does not, unless it has been built to.

The data itself is messier than the demos suggest. Most large enterprises hold their information in hundreds of formats, from flat files and scanned PDFs to sensor streams, SQL tables, document warehouses, proprietary schemas in CAD and PLM systems, and vendor APIs that change without warning. Access permissions have drifted from their original intent. Lineage is partial at best. Different parts of the same business often hold different views of what should be the same number. This is the actual environment any agent has to operate in.

Drop an autonomous agent into that environment and it does one of three things. It flags the gaps and asks. It papers over them with inference that sounds confident. Or it proceeds on bad data and acts on a wrong conclusion. Most agents in production today do the second or the third, because they were built for demo conditions where the data is clean, access is open and the questions are predictable. Those conditions disappear the moment the agent meets a real deployment.

Three responses to messy data

What an agent does when the data underneath it is not clean

Only the first response is the one you want. Agents tuned on clean demo data tend toward the second and third, where the failure is invisible until it has already moved downstream.
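The first response, flagging the gap and asking, can be sketched in a few lines. This is a minimal illustration rather than a production design: the `Reading` type, the `decide` function and the one-hour freshness threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Reading:
    """One value an agent has read, plus the context needed to judge it."""
    name: str
    value: Optional[float]   # None means the source had no value
    updated_at: datetime     # when the source system last changed it
    source: str


def decide(reading: Reading, now: datetime, max_age: timedelta) -> tuple[str, str]:
    """Return ("ask", reason) when the data should not be trusted,
    otherwise ("proceed", reason). The agent never infers a missing value."""
    if reading.value is None:
        return "ask", f"{reading.name} is missing in {reading.source}"
    if now - reading.updated_at > max_age:
        return "ask", f"{reading.name} in {reading.source} is older than {max_age}"
    return "proceed", "value present and fresh"


# Usage: a stale reading is escalated instead of acted on.
now = datetime(2026, 1, 2, tzinfo=timezone.utc)
stale = Reading("inventory_level", 40.0, now - timedelta(days=2), "erp")
action, reason = decide(stale, now, max_age=timedelta(hours=1))
```

The point of the sketch is the default: when a value is missing or stale, the agent stops and asks rather than inferring or proceeding.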

Why the failures compound

A single bad recommendation from an AI system is a contained problem. An autonomous agent that touches a thousand customer accounts in a day, or a thousand trades, can produce a thousand problems before anyone notices the pattern. Compounding is the issue. By the time the failures show up at the supervisory layer, the window to clean them up has already closed.
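One common way to bound that kind of compounding is a circuit breaker that halts the agent once recent failures cross a threshold. The sketch below is an assumption about how such a guard might look, not a description of any particular product; the class name, window size and failure limit are illustrative.

```python
from collections import deque


class CircuitBreaker:
    """Stops an agent after too many failures in a sliding window of actions."""

    def __init__(self, window: int = 50, max_failures: int = 5) -> None:
        self.results: deque = deque(maxlen=window)  # most recent outcomes only
        self.max_failures = max_failures
        self.tripped = False

    def record(self, ok: bool) -> None:
        """Log the outcome of one action and trip if failures pile up."""
        self.results.append(ok)
        if self.results.count(False) >= self.max_failures:
            self.tripped = True

    def allow(self) -> bool:
        """The agent checks this before every action."""
        return not self.tripped

    def reset(self) -> None:
        """Intended to be called by a human after review, not by the agent."""
        self.results.clear()
        self.tripped = False
```

With a guard like this, a thousand queued actions stop at the first handful of failures, which leaves the supervisory layer something small enough to clean up.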

The scale of adoption is what makes this urgent. A recent OutSystems survey of nearly 1,900 enterprise IT leaders found that 96 percent of organizations are already running AI agents in some form, and 94 percent are worried that the resulting sprawl is adding complexity and security exposure. Only 12 percent have any kind of centralized way to govern it. That gap is the part that concerns me. It is a foundation problem, and the foundation is the data underneath.


Why moving the data backfires

When organizations see how messy their data environment really is, the first reaction is usually to fix it by moving the data. Pull it into a data lake, normalize it, ship it to a third-party AI platform that promises to handle the cleanup. I have watched this play out across dozens of customer engagements, and it almost always makes the problem harder, for three reasons.

The first is integrity. A copy of operational data starts drifting from its source the moment it is created. For quarterly financial reporting, that drift is tolerable. For an agent making decisions in manufacturing, logistics, energy or banking, the drift turns into wrong decisions taken at machine speed. The agent ends up operating on yesterday’s reality and acting at today’s pace.

The second is security. Every time data is moved, a new copy lands in a new place with new controls. A handful of movements are manageable. The volume of data needed to support agentic workflows generates hundreds of them, and the attack surface grows with each one.

The third is competitive. The unique part of your AI system is your data: the operational processes and institutional knowledge you have built up over years. The model layer behind the agent is commoditizing faster than most companies appreciate, and what sets your version apart is the part the model never had access to. Ship that data out to a third-party platform and you have handed a competitor most of what they would need to copy what makes your version different.

The alternative is to leave the data where it is and bring the intelligence to it. That is what data perception looks like at the system level: an AI capability that reads across the existing enterprise environment, in whatever form the data exists, while the data itself stays inside the perimeter the organization already controls. How that capability gets built is the harder engineering question, and the part the broader industry has not yet caught up to.

| Dimension | Move the data out | Bring intelligence to the data |
| --- | --- | --- |
| Integrity | Copies drift from the source the moment they are made | Agents read the live source, so decisions track current reality |
| Security | Every copy is a new location with its own controls; the attack surface grows | Data stays inside the perimeter the organization already controls |
| Competitive moat | Institutional knowledge leaves the building with the data | The part rivals never had access to stays in-house |
| Over time | Needs constant re-syncing and still falls behind | Stays calibrated against the source systems as they change |

“The model layer is commoditizing fast. What sets your AI apart is the part the model never had access to. Move that data out, and you hand a competitor most of what they would need to copy you.”

How a perception layer gets built

The way data perception works in practice is itself agentic. The same pattern that makes the application-level agent useful, a model paired with tools, running inside a harness, checked and corrected by other agents, is what makes the perception substrate work.

In a real deployment, the perception layer is a set of specialized agents, each built for a particular kind of system. One agent knows how to read structured systems like SAP and Oracle ERP. Another handles engineering systems such as Teamcenter and Windchill. Others cover clinical systems like Epic and Cerner, time-series and sensor feeds, and the unstructured document and image stores that hold so much of what an enterprise actually knows. Each agent carries its own harness for that system: the schema it expects, the access rules it has to follow, the checks it runs, and what it does when something does not add up. That is the kind of knowledge a person who has worked with the system for years would already carry in their head.
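A per-system harness of the kind described above can be pictured as a small data structure. The sketch below is hypothetical: the `Harness` class, its field names and the `"ok"` / `"escalate"` / `"denied"` statuses are assumptions made for illustration, and a real reader agent would talk to the source system rather than receive a dictionary.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

Record = dict[str, Any]
Check = Callable[[Record], bool]


@dataclass
class Harness:
    """What one reader agent knows about the system it reads."""
    system: str
    required_fields: set[str]      # the schema it expects
    allowed_roles: set[str]        # the access rules it has to follow
    checks: dict[str, Check] = field(default_factory=dict)  # the checks it runs

    def read(self, record: Record, role: str) -> dict[str, Any]:
        """Validate one record; escalate instead of guessing when it does not add up."""
        if role not in self.allowed_roles:
            return {"status": "denied",
                    "problems": [f"role {role!r} may not read {self.system}"]}
        missing = sorted(self.required_fields - set(record))
        problems = [f"missing field: {name}" for name in missing]
        if not problems:  # only run value checks on structurally complete records
            problems = [f"failed check: {name}"
                        for name, check in self.checks.items() if not check(record)]
        return {"status": "escalate" if problems else "ok", "problems": problems}


# Usage: an ERP harness that refuses to pass along a non-positive quantity.
erp = Harness(
    system="erp",
    required_fields={"order_id", "quantity"},
    allowed_roles={"planner"},
    checks={"quantity_positive": lambda r: r["quantity"] > 0},
)
result = erp.read({"order_id": "A1", "quantity": 0}, role="planner")
```

Keeping the schema, access rules and checks together in one object is a design choice made here for readability; it mirrors the idea that each reader carries the knowledge of its own system.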

The architecture in one view. Each source system keeps its data in place. A reader agent built for that system pulls a view, the agents cross-check one another, a reconciling agent settles conflicts, and only the validated result rises to the agent running the workflow. The perception agents are governed with the same validation and audit discipline as the agent they feed.

The layers cross-check each other continuously. When an extraction agent reads a customer record from one system and a lineage agent flags an inconsistency with the same record somewhere else, a third agent reconciles the difference and writes the correction back into the perception layer. That feedback loop is the part that matters most. Without it, the perception layer drifts from reality just like a static data lake. With it, the layer stays calibrated against the source systems as those systems change.
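The cross-check and reconcile loop described above can be sketched as follows. Everything here is illustrative: the `View` type and function names are hypothetical, and the "most recently updated source wins" rule is an assumed policy chosen to keep the example short. A real reconciling agent might weigh lineage or escalate to a person instead.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class View:
    """One reader agent's view of one field of one record."""
    system: str
    record_id: str
    field_name: str
    value: Any
    updated_at: datetime


def reconcile(group: list[View]) -> View:
    """Assumed policy: the most recently updated source wins."""
    return max(group, key=lambda v: v.updated_at)


def perceive(views: list[View]):
    """Cross-check views of the same field and return (validated, corrections)."""
    groups: dict[tuple[str, str], list[View]] = {}
    for v in views:
        groups.setdefault((v.record_id, v.field_name), []).append(v)

    validated: dict[tuple[str, str], Any] = {}
    corrections: list[tuple[str, tuple[str, str], Any, Any]] = []
    for key, group in groups.items():
        winner = reconcile(group)
        validated[key] = winner.value
        # Record which systems disagreed so the layer can be corrected.
        corrections += [(v.system, key, v.value, winner.value)
                        for v in group if v.value != winner.value]
    return validated, corrections


# Usage: two systems disagree on a credit limit; only the reconciled value rises.
views = [
    View("crm", "C-1", "credit_limit", 5000, datetime(2026, 1, 1)),
    View("erp", "C-1", "credit_limit", 7500, datetime(2026, 3, 1)),
    View("crm", "C-1", "region", "EMEA", datetime(2026, 1, 1)),
]
validated, corrections = perceive(views)
```

The `corrections` list stands in for the feedback loop: it is the record of what was wrong and where, which is what keeps the layer from drifting.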

This is the practical consequence of building data perception as agentic infrastructure. The agents that perceive the data have to be governed exactly like the agents that consume it, held to the same validation and audit discipline and behind the same approval gates. A perception layer built without those properties becomes one more fragile dependency that fails the moment something relies on it.

What governance has to answer

Responsible agentic deployment comes down to a short set of questions that most organizations cannot currently answer. They are worth putting to any agent before it goes near production.

The governance checklist

Four questions most organizations cannot yet answer

Supervisory oversight does not move accountability. Legal and operational responsibility stays with the organization that deployed the agent, even when the agent was acting on its own. An agent that cannot produce a record of what it did is an agent that cannot be governed.

Take the first question, which is what the agent actually sees. That has to be measured by what it reads in production as the data environment changes, not by what a design document said it should read. The second, who is accountable, is harder, because supervisory oversight does not move accountability anywhere. The legal and operational responsibility stays with the organization that deployed the agent, even when the agent was acting on its own.

The third is auditability. Every decision and every data access, along with the reasoning behind it, has to be reconstructable after the fact. An agent that cannot produce that record cannot be governed, and that will be a regulatory minimum in most jurisdictions well before the end of this decade. The last question, what happens when the agent fails, is the one teams skip. The agent will fail at some point, and the protocols for that failure need to exist before it goes live, because compounding gives most incident-response processes too little time to react.
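The auditability requirement above can be made concrete with an append-only log in which each entry carries a hash of the one before it, so that later edits are detectable. This is a sketch under stated assumptions: the `AuditLog` class and its field names are hypothetical, and an in-memory list stands in for whatever durable store a real deployment would use.

```python
import hashlib
import json
from dataclasses import dataclass, field


def _digest(body: dict) -> str:
    """Stable SHA-256 over an entry's contents."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class AuditLog:
    """Append-only record of what an agent did, read, and why."""
    entries: list[dict] = field(default_factory=list)

    def record(self, agent: str, action: str,
               data_accessed: list[str], reasoning: str) -> dict:
        prev = self.entries[-1]["hash"] if self.entries else ""
        body = {"agent": agent, "action": action,
                "data_accessed": data_accessed, "reasoning": reasoning,
                "prev": prev}  # links this entry to the previous one
        entry = {**body, "hash": _digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Return False if any entry was altered or the chain was broken."""
        prev = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev"] != prev or _digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True


# Usage: two recorded decisions form a verifiable chain.
log = AuditLog()
log.record("pricing-agent", "update_quote", ["erp:order/A1"], "cost input changed")
log.record("pricing-agent", "notify_owner", ["crm:account/C-1"], "quote moved over 5%")
chain_ok = log.verify()
```

A log like this answers the reconstruction question directly: each entry names the action, the data touched and the stated reasoning, and `verify` shows whether the record has been altered since it was written.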

Do the foundation work in parallel

The argument here is about sequencing. The capability is advancing too fast, and the productivity gains are too real, for hesitation to be a strategy. Organizations that wait for a perfect governance framework before they deploy anything will miss the window entirely.

The ones getting this right are doing the perception work in parallel with the agent rollouts. They are taking real inventory of where their data lives and setting up governance before deployment rather than retrofitting it later. And they are keeping the institutional knowledge that makes their systems valuable inside the company. That is how an organization ends up with autonomous AI running at production scale and the means to keep it there.

The ones that skip the foundation will spend the next three years cleaning up what got built in the rush, long after the rest of the market has moved on. The agentic era has arrived. The companies that get real value from it will be the ones that did the unglamorous work of seeing their own data first.

Dr. Arun Subramaniyan

Founder & CEO, Articul8

Arun Subramaniyan is the founder and CEO of Articul8, a domain-specific generative AI platform. He previously led cloud and AI strategy at Intel and ran the extreme-scale computing team at AWS, spanning machine learning, high-performance computing and quantum computing. Earlier he founded the AI products team at GE’s Oil & Gas division and led development of GE’s Digital Twin platform at its Global Research Center. He is an Executive Fellow at Harvard Business School, where he teaches generative AI for business leaders, and holds a PhD in Aerospace Engineering from Purdue University.

Sources

OutSystems, 2026 State of AI Development report, based on a third-party survey of nearly 1,900 global IT leaders, for the figures on agent adoption, sprawl concern and centralized governance (96 percent, 94 percent and 12 percent).
The remaining observations on deployment failure, data movement and governance reflect the author’s own experience building agentic systems for industrial customers.

Editorial Note

The agent-adoption figures cited in this op-ed come from the OutSystems 2026 State of AI Development report. OutSystems is itself an enterprise AI development vendor. Readers are encouraged to consult the original report for full figures and methodology. The wider analysis is the author’s own.