Doctors Horrified After Google’s Healthcare AI Makes Up Nonexistent Human Body Part

Written By Joseph Brown

SpookySight Staff

Artificial intelligence is being hailed as the next big breakthrough in medicine. Advocates say it could help doctors diagnose diseases faster, catch errors that humans might miss, and save countless hours of administrative work. The vision is seductive: AI tools scanning thousands of medical images in seconds, spotting patterns invisible to the human eye, and generating reports with the click of a button.

But for all its promise, AI has a fatal flaw — it can be wrong. Not just “oops, a typo” wrong, but confidently, persuasively, and dangerously wrong. And unlike a tired doctor who might hesitate before writing something unusual, AI will state its mistakes with the swagger of absolute certainty.

Recently, Google’s own healthcare AI, Med-Gemini, made an error so bizarre that it sounds like the setup for a medical comedy skit. It claimed to have found a problem in a part of the brain that doesn’t exist.

The Case of the Fictional Brain Part

In a 2024 research paper, Google proudly showcased Med-Gemini’s abilities by having it analyze brain scans. Somewhere in that analysis, the AI flagged what it described as an “old left basilar ganglia infarct.”

Now, to someone outside the medical field, that might sound legitimate — maybe even impressively precise. But neurologists spotted a glaring issue: there’s no such thing as the “basilar ganglia.” The real structure is called the basal ganglia, which is involved in movement control, learning, and habit formation. The AI appears to have accidentally combined this with the basilar artery, a large blood vessel at the base of the brainstem.

The result? A brain part that sounds like it belongs in a medical textbook but is pure fiction.



How It Took Over a Year to Catch the Mistake

This wasn’t an error caught in the lab before publication. It sat in a public research paper for more than a year before board-certified neurologist Bryan Moore noticed it and contacted The Verge.

Google later admitted the error probably came from “mis-transcription” in its training data — essentially, the AI learned the wrong word pairing because it had seen similar errors in real-world medical reports. In its blog post, Google corrected the terminology, noting that “basilar” often shows up mistakenly instead of “basal.”

But here’s the twist: while Google fixed its blog post, the original scientific paper still contains the phantom brain part. For a company aiming to lead in healthcare AI, leaving such an error untouched in official documentation is eyebrow-raising, to say the least.

What Exactly Is an AI Hallucination?

The “basilar ganglia” mix-up is a textbook example of what’s called an AI hallucination. This isn’t a glitch in the sense of a computer crash. It’s a confident fabrication.

Large language models (LLMs) — the tech behind systems like Med-Gemini — don’t “know” facts the way humans do. They predict the most likely sequence of words based on patterns in massive amounts of text. When they encounter gaps or inconsistencies in their training data, they sometimes improvise — blending details, inventing terms, or pulling concepts from unrelated contexts.

To a human reader, these outputs often sound plausible, because the AI has learned to mimic the style and vocabulary of real experts. But plausibility is not the same as accuracy.
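To make that concrete, here is a deliberately simplified sketch in Python, with invented probabilities and nothing taken from Med-Gemini’s actual code, of how a model that simply picks the most likely next word can confidently produce a term like “basilar ganglia” if transcription errors were common in its training data.

```python
# Toy illustration only: a language model chooses the next word by learned
# probability, not by checking an anatomy textbook. The numbers below are
# made up purely to show how a plausible-but-wrong term can win.

# Hypothetical learned probabilities for the word that follows "old left ..."
next_word_probs = {
    "basal": 0.46,    # the correct anatomical term
    "basilar": 0.51,  # nudged ahead by transcription errors in training reports
    "lateral": 0.03,
}

def pick_next_word(probs):
    """Greedy decoding: always return the highest-probability candidate."""
    return max(probs, key=probs.get)

chosen = pick_next_word(next_word_probs)
print(f"old left {chosen} ganglia infarct")  # prints the fictional "basilar ganglia"
```

The point of the sketch is that nothing in this process asks whether the winning word refers to something real; the model only knows which word tended to come next in the text it saw.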

Why Medicine Is Especially Vulnerable

In everyday online searches, a hallucination might give you a wrong date for a historical event or an incorrect cooking temperature — annoying but rarely life-threatening. In healthcare, however, the stakes are astronomically higher.

There are a few reasons medicine is a particularly tricky domain for AI:

  1. Specialized Language – Medical terms are complex and often differ by just a few letters (like “basal” vs. “basilar”), making it easy for an AI to mix them up.
  2. Sparse Data – Unlike general internet text, high-quality, peer-reviewed medical data is limited, which means the AI often has less reliable training material.
  3. Context Sensitivity – In medicine, a tiny change in terminology can mean a completely different diagnosis or treatment.
  4. Authority Bias – Doctors and patients may give undue weight to AI-generated suggestions because they appear objective or data-driven.

Other High-Profile AI Errors in Healthcare

The “basilar ganglia” case isn’t the first time AI has made a potentially dangerous medical blunder:

  • IBM Watson for Oncology once recommended cancer treatments that oncologists described as “unsafe and incorrect” during internal testing.
  • In 2022, an AI system designed to screen for skin cancer misclassified benign moles as malignant and vice versa when lighting conditions in photos varied.
  • A research project in the UK found that an AI tool for diagnosing pneumonia sometimes relied more on quirks in X-ray machine settings than on the actual presence of disease — meaning it wasn’t really “seeing” the illness at all.

These cases underscore the danger of assuming AI systems are infallible just because they produce sophisticated-sounding answers.


Google’s Other Medical AI — And Its Own Issues

Med-Gemini isn’t Google’s only player in this space. Its newer model, MedGemma, also shows inconsistencies. In testing, it gave different answers depending on how a medical question was phrased. This isn’t just a minor inconvenience — in healthcare, small shifts in wording can make the difference between an accurate diagnosis and a completely wrong one.

Dr. Judy Gichoya of Emory University points out a fundamental flaw: “Their nature is that they tend to make up things, and it doesn’t say ‘I don’t know.’” In other words, these systems can’t admit uncertainty — they’ll always try to give some answer, even if it’s fiction.

The Rapid Push Into Hospitals

Despite these problems, AI systems are being deployed in hospitals at an accelerating pace. Current uses include:

  • Radiology – Interpreting CT scans, MRIs, and X-rays.
  • Pathology – Analyzing tissue samples for signs of disease.
  • Clinical Documentation – Transcribing and summarizing doctor-patient conversations.
  • Predictive Analytics – Forecasting patient outcomes or the likelihood of complications.
  • Drug Discovery – Identifying promising chemical compounds for future medicines.

While the potential benefits are real, the speed of rollout raises questions about safety, oversight, and long-term reliability.

Why Human Oversight Is Still Essential

For now, the consensus among experts is that AI should assist, not replace, medical professionals. Every AI-generated report must be reviewed by someone who can verify its accuracy.

Dr. Shah of Providence Health warns against treating “human-level performance” as the benchmark: “AI has to have a way higher bar of error than a human. Maybe other people are like, ‘If we can get as high as a human, we’re good enough.’ I don’t buy that for a second.”

This is because human doctors bring not just knowledge but judgment, context awareness, and the ability to recognize when something doesn’t make sense.

The Efficiency Paradox

Here’s the irony: one of AI’s biggest selling points in healthcare is efficiency — the idea that it will save time. But if every AI result needs thorough human review to guard against hallucinations, the time savings shrink. In some cases, verifying AI outputs can take longer than doing the work manually.

That’s not to say AI has no role. It can still be a valuable tool for flagging unusual cases, offering second opinions, or handling routine administrative tasks. But for anything involving a diagnosis, treatment plan, or surgery preparation, human oversight isn’t just advisable — it’s non-negotiable.



The Future — Smarter AI or Smarter Use of AI?

The “basilar ganglia” incident is unlikely to be the last of its kind. As AI tools become more embedded in medicine, the focus shouldn’t just be on making the models “smarter” but on building systems around them that catch and correct errors before they reach patients.

This could involve:

  • Mandatory expert review of AI outputs before they enter medical records.
  • Training AI on higher-quality, verified medical datasets rather than mixed internet sources.
  • Developing AI systems that can express uncertainty rather than always giving a definitive answer (a rough sketch of this idea follows below).
  • Transparent error tracking so hospitals and regulators can see patterns in AI mistakes.
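
As a rough illustration of the review and uncertainty ideas above, here is a minimal, hypothetical sketch in Python of an “uncertainty gate.” The confidence score, threshold, and routing labels are all invented, and a real clinical system would need far stricter validation and audit trails, but the shape of the idea is simple: a low-confidence AI draft gets escalated for specialist review, and even a high-confidence draft still goes to a clinician for sign-off before it touches a medical record.

```python
# Hypothetical sketch of routing AI-generated findings based on uncertainty.
# Field names, threshold, and routing messages are assumptions for illustration.

from dataclasses import dataclass

@dataclass
class AIDraft:
    text: str          # AI-generated report text
    confidence: float  # model's self-reported confidence, 0.0 to 1.0

REVIEW_THRESHOLD = 0.95  # assumed cutoff; low-confidence drafts are never auto-filed

def route_draft(draft: AIDraft) -> str:
    """Decide how an AI draft reaches a human: routine sign-off or escalation."""
    if draft.confidence < REVIEW_THRESHOLD:
        return "ESCALATE: flag as uncertain and require specialist review"
    return "QUEUE: send to clinician for routine sign-off before filing"

# Example: a confidently worded but low-confidence finding gets escalated.
print(route_draft(AIDraft("old left basilar ganglia infarct", confidence=0.62)))
```

The design choice worth noticing is that no path in this sketch writes directly to the record; the gate only changes how urgently a human looks at the output.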

Final Thoughts — Proceed, But With Caution

On the surface, the invention of a fictional brain part might seem amusing — the kind of goof you’d share in a “tech fails” list. But in reality, it’s a symptom of a deeper issue: AI is not yet a perfect fit for life-and-death decision-making.

The real danger isn’t that AI makes mistakes — humans do that too. The danger is that AI mistakes often sound so convincing that they slip past even trained professionals.

Until AI can consistently tell the difference between fact and fiction — and admit when it’s unsure — it will remain a powerful but risky partner in healthcare. For now, if your medical report says you have an issue in a body part you’ve never heard of, the safest bet is still to ask a real human.

Image: Freepik.