AI confidence slips as engineering teams confront assurance reality

TL;DR:

  • Expleo’s AI Pulse sentiment tracking shows that confidence in organisations’ ability to deploy AI successfully is slipping. Expleo UK MD Jeff Hoyle frames this as a reality check: the easy phase of AI adoption is over.
  • The pinch point is not model performance but the surrounding estate: fragmented data, inconsistent documentation, and assurance regimes built for systems with stable behaviour rather than for AI whose outputs depend on data conditions and usage patterns.
  • Resultsense view: this is the moment when UK engineering and operations leaders move from “model demos” to “operational evidence”, and when the UK government’s AI assurance guidance, AI Management Essentials and the ISO/IEC 42001 standard become live procurement criteria rather than reference documents.

Hoyle argues that the cost of going from pilot to production rarely lies in the model itself. Engineering teams inherit ageing systems and data captured for unrelated purposes; labels drift, histories contain gaps, and data that is good enough for reporting often falls short in live AI use. Confidence dips because that gap becomes visible only once AI is operating against day-to-day reality.

Traditional assurance under pressure

Engineering assurance has historically been built around systems with defined behaviour and stable boundaries. AI breaks that assumption: outputs depend on data, model assumptions, and how operators use the system. Teams now need to show where a system works, where it does not, and how it behaves over time, backed by oversight that continues after deployment.

UK government guidance frames this clearly. The Introduction to AI Assurance treats the discipline as building justified trust through measurement, evaluation and communication. AI Management Essentials focuses on the management processes around AI rather than the product alone, and the ISO/IEC 42001 standard formalises those processes into a management system. Hoyle’s argument is that mature UK adoption now needs clear ownership, documented controls, and a defined trigger for retraining or human oversight when conditions shift.
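
As a rough illustration of what a defined trigger could look like in practice, the sketch below compares live input data against a baseline sample and maps the result to an action. It uses the Population Stability Index (PSI), one common drift measure; neither Hoyle nor the guidance cited above prescribes it. The function names, the bin count and the 0.1 and 0.25 thresholds are assumptions chosen for illustration (the thresholds are a widely used rule of thumb, not a standard), and a real control would be tuned to the system and documented alongside its owner.

```python
import math
from typing import Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a baseline sample and a live sample.

    Bins are equal-width over the baseline's range; live values outside that
    range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # degenerate baseline: every value falls into one bin

    def shares(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Additive smoothing (assumed 0.5 per bin) keeps empty bins from
        # producing log(0) or division by zero.
        total = len(values) + 0.5 * bins
        return [(c + 0.5) / total for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def oversight_action(score: float, review_at: float = 0.1, retrain_at: float = 0.25) -> str:
    """Map a drift score to a documented action (thresholds are assumptions)."""
    if score >= retrain_at:
        return "retrain"
    if score >= review_at:
        return "human_review"
    return "continue"


if __name__ == "__main__":
    baseline = [i / 100 for i in range(100)]       # hypothetical training-time feature values
    live = [0.9 + i / 1000 for i in range(100)]    # hypothetical live values, shifted upward
    print(oversight_action(psi(baseline, live)))
```

The point of the sketch is less the statistic than the shape of the control: a measurable condition, an agreed threshold, and a named action that someone owns.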

Why aviation matters as a benchmark

EASA’s AI Concept Paper places as much weight on explainability and learning assurance as on capability itself. That bias toward post-deployment evidence is, Hoyle argues, where every regulated sector is heading. UK financial services regulators have been signalling similar expectations, and in the same week the Bank of England’s PRA explicitly described AI-accelerated vulnerability discovery as a financial-stability concern.

Looking forward

For UK organisations buying AI from frontier labs and their newly formed deployment arms, the practical question is now whether the vendor can produce evidence that survives contact with a live engineering environment, not just results from controlled benchmarks. Expect tender language for AI services to start citing ISO/IEC 42001, AI Management Essentials and the AI Assurance guidance as default reference points, and expect UK SMEs in regulated supply chains to be asked to demonstrate assurance maturity well before their first AI deployment goes into production.