AI experts have warned that doctors and specialists could soon be offering clinical guidance based on AI-hallucinated studies, after one of the world’s most respected medical publications revealed a 12-fold increase in fabricated citations, which one expert called a “dangerous new form of phantom evidence in healthcare”.
One AI specialist warned about “the possibility of fabricated references influencing evidence reviews and clinical guidance through repetition and perceived legitimacy”, while another said the scale of the problem shows “how hallucination becomes infrastructure”.
The Lancet, a peer-reviewed general medical journal respected by academics globally, recently published a Columbia University audit of 2.5 million biomedical papers that found 4,046 citations to research that doesn’t exist, scattered across nearly 3,000 published studies.
The rate climbed from 1 in 2,828 papers in 2023 to 1 in 277 by early 2026, with the sharpest spike coinciding, perhaps unsurprisingly, with the rise of AI writing tools in mid-2024.
The study’s lead author, Maxim Topaz, Associate Professor of Nursing, discovered the problem when an AI tool slipped a fake citation into his own paper, and it passed through multiple rounds of peer review before a single editor caught it.
After 15 years of research work, he said, he was professionally “mortified”.
Medical scientists use AI to help find supporting research, but if those initial finds are supported by fabricated citations lower in the chain, they don’t know if the ‘facts’ they are seeing are accurate, AI experts warn.
No researcher can verify the full citation chain, which is where fabricated references hide, they add.
Meanwhile, many doctors and specialists will use medical publications to guide their patient care, on the assumption that their content is robust.
The journals with the highest rates of fabrication belong to large open-access publishers that charge authors to publish, according to the audit published in The Lancet.
Phantom
Mitali Deypurkaystha, AI Strategist & Author at Newcastle upon Tyne-based Impact Icon AI, said critical treatment decisions could be taken because of fake data.
She continued: “Nearly 30% of UK GPs are already using AI tools during patient consultations, according to the Nuffield Trust, yet fabricated biomedical citations are now spreading through research at a rate of 1 in 277 papers, creating a dangerous new form of ‘phantom evidence’ in healthcare.
“The old warning that a lie can travel halfway around the world before the truth has pulled its boots on becomes far more serious when AI can package fiction with the confidence and formatting of legitimate science.
“The real danger is not a single fake study, but fabricated references quietly recycling through reviews, guidelines and AI systems until nobody can tell where the contamination started.
“If a treatment decision is later linked to research that never existed, we may discover modern medicine has created an accountability black hole where publishers, AI companies, reviewers and regulators can all deny responsibility.”
Infrastructure
Rohit Parmar-Mistry, Founder at Burton-on-Trent-based Pattrn Data, said patients may be at risk.
He added: “This is not a footnote problem. It is a patient safety problem wearing academic clothing. If a fake study survives AI drafting, author review and peer review, the system is telling us something uncomfortable: publication has become too fast and too trusting of plausible text.
“A rate of 1 in 277 papers is not background noise. In medicine, small errors compound. One fabricated citation can be copied into another paper, then into a review, then into guidance, until nobody can see the original fraud because there was never an original to find. That is how hallucination becomes infrastructure.
“In our AI Audits, the recurring failure is not AI use. It is AI use without verification strong enough for the risk. Biomedical research needs provenance checks, citation sampling and liability that follows the decision chain.
“If a journal charges for volume but cannot verify the basics, regulators should give its evidence less weight. Trust is not peer-reviewed. It is earned, citation by citation.”
Perceived legitimacy
Katrina Young, AI & Digital Transformation Strategist at KYC Digital, said incorrect citations may be passed on with no checks.
She added: “The issue is not just AI hallucinations. It is a verification weakness AI has accelerated. Peer review was never designed to authenticate every citation in a reference chain, and researchers cannot realistically verify every downstream source manually.
“That creates conditions where fabricated references can pass through authors, reviewers and publishers unnoticed.
“The concern is not one fake citation in isolation, but the possibility of fabricated references influencing evidence reviews and clinical guidance through repetition and perceived legitimacy. Regulators and publishers now need stronger citation validation and provenance checks.”
Ramifications
Colette Mason, Author and AI Ethics Consultant at London-based Clever Clogs AI, said these findings have “important ramifications for all industries”.
She added: “If an academic with 15 years of peer-review experience nearly published a hallucinated citation despite multiple rounds of peer review, what chance does a member of the public have when they upload biomedical papers to analyse their own needs?
“Patients routinely ask AI tools to summarise studies, explain treatment options, and find supporting evidence, with no way of knowing whether the citations they’re shown point to work that was actually conducted.
“When the professional safeguards failed a specialist, expecting a lone layperson to catch what peer review couldn’t isn’t a realistic expectation.
“This has important ramifications for all industries. The audit only looked at biomedical research, and legal, financial, engineering and other sectors weren’t analysed. There’s no reason to believe they’re clean either.”


