KPMG pulls report on AI usage due to apparent hallucinations

KPMG withdrew a report titled "Total Experience: Redefining Excellence in the Age of Agentic AI" from its websites after the AI detection firm GPTZero identified a string of inaccuracies that traced back to AI hallucinations.

The Financial Times independently verified the findings. Four organisations named in the report, UBS, the UK's National Health Service, Swiss Federal Railways and Transport for London, told the FT that the claims made about their own use of AI were either untrue or misleading.

The irony writes itself, and most of the coverage has not resisted it. A professional services firm that sells AI strategy consulting to enterprise clients published a flagship report on AI adoption that appears to have been written with inadequate human verification of AI-generated content. The scale of the citation audit is worth setting out plainly:

45 citations appeared in the withdrawn report
Only 5 pointed to real, intact sources
The other 40 were either fabricated outright or pointed to material that did not say what the report claimed

That is not a quality control slip on the margins of an otherwise sound document. It is a report whose evidentiary basis was almost entirely invented, published under the name of one of the most trusted brands in global professional services.

How bad, specifically

The scale of the fabrication is the detail that separates this from a routine correction. GPTZero also flagged a statistical inconsistency within KPMG's own publishing: the withdrawn report claimed 55% of chief executives ranked AI as their top investment priority, while a separate KPMG publication released around the same time cited a figure of 71% for what appears to be the same underlying data point.

This kind of internal contradiction suggests the report was not carefully checked against KPMG's own existing research before publication, let alone against external sources.
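A cross-check of this kind is simple to automate once reported figures are logged in one place. The sketch below is a minimal illustration, not a description of KPMG's or GPTZero's process: the `StatClaim` structure, the `find_conflicts` helper and the one-point tolerance are all assumptions made for the example. The two figures are the ones quoted above, and the code treats them as the same metric, which the reporting only describes as apparent.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StatClaim:
    """One reported figure. Hypothetical structure for illustration."""
    publication: str   # where the figure appeared
    metric: str        # normalised description of the data point
    value_pct: float   # reported percentage


def find_conflicts(claims, tolerance_pct=1.0):
    """Return metrics whose reported values differ by more than the tolerance."""
    by_metric = defaultdict(list)
    for claim in claims:
        by_metric[claim.metric].append(claim)

    conflicts = {}
    for metric, group in by_metric.items():
        values = [c.value_pct for c in group]
        if max(values) - min(values) > tolerance_pct:
            conflicts[metric] = group
    return conflicts


# Figures as quoted in this article; metric label is our own wording.
METRIC = "CEOs ranking AI as top investment priority"
claims = [
    StatClaim("withdrawn report", METRIC, 55.0),
    StatClaim("separate KPMG publication", METRIC, 71.0),
]

conflicts = find_conflicts(claims)
for metric, group in conflicts.items():
    print(metric, sorted(c.value_pct for c in group))
```

The hard part in practice is not the comparison but the normalisation: deciding that two differently worded statistics refer to the same survey question still needs a human reviewer.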

A KPMG spokesperson confirmed the firm removed the report while conducting an internal investigation, stating: "We expect all our people to follow our guidelines on the responsible use of AI, including human oversight to validate content and verify independent sources."

This is the right policy. The report's existence is evidence the policy was not followed in practice, at least on this occasion.

This is not an isolated incident, and the pattern is the real story

KPMG is the second major professional services firm to withdraw a report over AI-generated fabrications in as many months. EY pulled a study on loyalty rewards programmes in May after GPTZero identified what it described as fake footnotes and hallucinated content in that report too.

Edward Tian, GPTZero's chief executive, has warned that the danger compounds over time: inaccurate reports published by trusted institutional names get cited by other media outlets and downstream research, spreading what he called "second-hand AI hallucinations," fabrications that get laundered into the next generation of supposedly credible analysis simply by virtue of citation.

This dynamic should worry anyone in AEC who increasingly relies on third-party market research, consulting white papers and industry benchmarking reports to inform technology investment or strategic planning decisions. If two Big Four firms have already published AI-hallucinated content in flagship reports this year, the base rate for this kind of error across the wider consulting and research industry is very unlikely to be zero.

The practical takeaway is not that AI-assisted research is inherently unreliable. It is that the verification step many firms assumed was happening internally evidently was not at two of the most resourced and reputation-conscious organisations in the professional services sector.

Why this lands differently in construction

AEC has its own version of this exposure already underway. Construction firms are rapidly adopting AI agents for scheduling, estimating, RFI drafting and document review, often with the explicit promise that AI reduces the manual verification burden instead of adding to it.

RIB Software's newly launched Unify platform and Procore's continued AI-native acquisitions are both built on that promise. The KPMG episode is a useful, uncomfortable reminder that AI confidence and AI accuracy are not the same thing, and that the most polished, fluent-sounding AI output is often exactly where human reviewers are most likely to let their guard down, because it reads as already finished.

A piece from Judy AI Lab made this point well in its analysis of the incident: the structural failure here is not one employee's carelessness, it is a workflow that did not build in mandatory source verification at the point where output looked most complete.

For an AEC firm using AI to draft cost estimates, compliance summaries or safety documentation, that is precisely the failure mode worth designing against. The smoother the draft looks, the more scrutiny it should get, not less.

What good governance looks like in response

The fix being recommended across multiple analyses of this episode converges on a similar shape: a mandatory checkpoint in any AI-assisted content or analysis workflow requiring independent verification of every factual claim, citation and quoted source before publication or use in a client-facing or regulatory context.

That checkpoint evidently was not operating at KPMG when this report went out, even though the firm's own guidelines on responsible AI use call for human oversight to validate content and verify sources.
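The mandatory checkpoint described above can be sketched as a release gate. This is a minimal illustration under stated assumptions, not any firm's actual workflow: the `Citation` record, the `Status` values and the rule that every citation needs a named human reviewer are all hypothetical choices made for the example, and the sample entries are placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    UNCHECKED = "unchecked"            # nobody has opened the source yet
    VERIFIED = "verified"              # source exists and supports the claim
    NOT_FOUND = "not_found"            # source could not be located
    MISREPRESENTED = "misrepresented"  # source exists but does not say this


@dataclass
class Citation:
    """One factual claim and the source it leans on (hypothetical record)."""
    claim: str
    source: str
    status: Status = Status.UNCHECKED
    reviewer: str = ""  # the named human who checked the primary source


def publication_blockers(citations):
    """Return every citation that should stop the document being released."""
    return [
        c for c in citations
        if c.status is not Status.VERIFIED or not c.reviewer
    ]


def can_publish(citations):
    """The gate: release only when nothing is left unverified."""
    return not publication_blockers(citations)


# Placeholder entries, not taken from any real report.
draft = [
    Citation("Illustrative claim A", "Illustrative source A", Status.VERIFIED, "reviewer-1"),
    Citation("Illustrative claim B", "Illustrative source B", Status.NOT_FOUND, "reviewer-1"),
    Citation("Illustrative claim C", "Illustrative source C"),
]

blockers = publication_blockers(draft)
print("Can publish:", can_publish(draft))
for c in blockers:
    print("Blocked by:", c.claim, "-", c.status.value)
```

The design choice that matters is the default. A citation starts as unchecked and blocks release until a person marks it verified, so a fluent draft cannot pass simply because nobody looked.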

For construction firms further down the AI adoption curve than KPMG's consulting arm, the more conservative lesson is to treat any AI-generated output destined for a client, a regulator or a public document the same way you would treat a junior analyst's first draft: useful, often largely correct, and never assumed accurate without an independent check against primary sources.

A pattern, not a one-off

This is the second time in two months a Big Four firm has pulled a flagship report over AI hallucinations, and the repeated shape of the failure is worth naming directly:

A polished, confidently written report is published
An external detection firm, in both cases GPTZero, identifies inconsistencies that internal review missed
The firm pulls the document only after independent media verification makes the issue impossible to ignore quietly

This sequence, occurring twice within two months at two different Big Four firms, is the strongest evidence yet that the gap is structural and not a matter of individual carelessness at either organisation.

What this means for AEC firms buying research and consulting services

Construction and engineering organisations regularly commission or purchase market research, technology benchmarking studies and strategic consulting reports from exactly the tier of firm now implicated in two separate hallucination scandals this year.

The practical question this raises is not whether to stop trusting professional services research altogether, which would be an overcorrection, but whether procurement processes for this kind of work should now include an explicit verification clause: a contractual or, at minimum, a stated expectation that cited sources, statistics and named case studies in any delivered report can be independently traced and confirmed before the work is accepted as final.

A report that looks identical in formatting and tone to one written entirely by experienced human analysts may, as KPMG's episode demonstrates, have been produced with far less rigour than its presentation suggests.

Asking the firm that produced a report directly how its sources were verified, and treating an unsatisfying answer as a red flag instead of a formality, is now a reasonable part of due diligence, particularly for anything touching on AI adoption benchmarks, technology vendor comparisons or market sizing data that will inform real capital allocation decisions.

Takeaway

Build mandatory source verification into any AI-assisted reporting or analysis workflow before publication, not as an afterthought. KPMG's own guidelines required this and were not followed in practice, which is a more instructive failure than the hallucination itself.
Treat fluent, polished AI output as a higher-risk category for unverified errors, not a lower one. The smoothness of a finished-looking draft is precisely what makes hallucinated content harder to catch.
Be more sceptical of third-party consulting reports and market research cited in your own technology decisions. Two Big Four hallucination incidents within two months suggest this is a sector-wide verification gap, not a one-off lapse.
If your firm is piloting AI for client deliverables, compliance documentation or safety reporting, this is a useful internal case study to circulate now, before your own version of this story happens.

Verify before you publish, and before you buy

Project Flux covers where AI governance is failing in practice, going beyond where vendors claim it is succeeding. Subscribe to the weekly newsletter that keeps the verification conversation honest.

All content reflects our personal views and is not intended as professional advice or to represent any organisation.