Hallucination detection checklist: spotting when AI makes things up

A hallucination is text that sounds plausible, reads confidently, and is formatted perfectly, but is factually wrong. The dangerous part: the AI does not know it is wrong. It generates what sounds right, not what is right. That is worse than a lie, because a liar at least knows the truth.

Eight red flags

It never says "I do not know." A confident answer for everything is the warning sign. Ask "what are you uncertain about?" A good system names specifics; a hallucinating one stays perfectly confident.
It cannot give specific sources. "Studies show," "research indicates," "experts agree." Ask which study, published where, what citation, then verify it.
Overly specific without caveats. "87.3% effective," "exactly 750mg twice daily." Real medical information carries ranges and individual variation. False precision is a tell.
Internal contradictions. It says avoid a drug, then later calls it safe. Ungrounded text loses consistency. Ask the same question more than once.
Fabricated citations. Articles never published, authors who did not write them, studies with invented data. Check PubMed or Google Scholar; confirm the source exists and says what it claims.
Sounds too perfect and complete. Real medicine has gaps and controversies. Ask "what is controversial here?" or "what don't we know?"
Cannot explain its reasoning. Ask "why?" A good answer walks the logic; a hallucination cannot, or the logic does not follow.
Novel combinations of real things. A real drug joined to a fabricated indication. Example: "Metformin is FDA-approved for migraines." Each piece is real; the connection is false. Verify the relationship, not just the parts.
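
A few of these flags leave traces in the wording itself, so a rough first pass can be scripted. The sketch below is only an illustration: the function name, the phrase lists, and the regular expression are assumptions built from the examples above, and it covers just three surface-level flags. It cannot tell whether a citation exists or whether a claim is true; that still takes a human and a reliable source.

```python
import re

# Vague sourcing phrases taken from the examples above (assumed, not exhaustive).
VAGUE_SOURCE_PHRASES = ("studies show", "research indicates", "experts agree")

# A percentage with a decimal part (e.g. "87.3%") or "exactly <number>".
FALSE_PRECISION = re.compile(r"\b\d+\.\d+\s*%|\bexactly\s+\d", re.IGNORECASE)

# Words suggesting the answer admits limits (an assumed, non-exhaustive list).
HEDGE_WORDS = ("i do not know", "i don't know", "uncertain", "may vary", "varies", "range")


def surface_red_flags(answer: str) -> list[str]:
    """Return the names of surface-level red flags found in one answer."""
    text = answer.lower()
    flags = []
    if any(phrase in text for phrase in VAGUE_SOURCE_PHRASES):
        flags.append("vague sourcing")
    if FALSE_PRECISION.search(answer):
        flags.append("false precision")
    if not any(word in text for word in HEDGE_WORDS):
        flags.append("no stated uncertainty")
    return flags


if __name__ == "__main__":
    print(surface_red_flags("Studies show it is 87.3% effective."))
```

An empty list does not mean the answer is safe; it only means none of these three patterns appeared in the text.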

Verify before you trust

Source check: are sources specific (journal, year, authors)? Can you find them? Do they say what the AI claims?
Cross-reference: Mayo Clinic, NIH/MedlinePlus, CDC, specialty-society sites; do reliable sources agree?
Consistency: ask the same question again; any contradictions?
Uncertainty: "what are you uncertain about?"
Logic: "explain your reasoning," and check that it holds.
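
The consistency check is the easiest of these to repeat mechanically. Here is a minimal sketch, assuming you have some function that sends a question to the AI and returns its reply as text; `ask`, `normalize`, and `consistency_check` are hypothetical names, not part of any real tool. Exact matching only makes sense if you ask for a short verdict (for example "answer safe or avoid"); longer answers still need to be read side by side.

```python
from collections import Counter
from typing import Callable


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(answer.lower().split())


def consistency_check(ask: Callable[[str], str], question: str, runs: int = 3) -> dict:
    """Ask the same question several times and report how many distinct answers came back."""
    answers = [normalize(ask(question)) for _ in range(runs)]
    counts = Counter(answers)
    return {
        "distinct_answers": len(counts),
        "consistent": len(counts) == 1,
        "answers": answers,
    }


if __name__ == "__main__":
    # Stand-in for a real AI call: replies change between runs.
    canned = iter(["Safe", "safe ", "Avoid"])
    print(consistency_check(lambda q: next(canned), "Answer safe or avoid.", runs=3))
```

Agreement across runs is not proof of accuracy, since a system can repeat the same wrong answer; disagreement is the useful signal.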

The "I don't know" test

Ask a series of questions, each of which a remote system has less of the information needed to answer. A trustworthy system grows more uncertain as you climb:

"What is hypertension?" Expected: confident, accurate.
"Treatment guidelines for stage 2 hypertension?" Expected: confident, with sources.
"How should it differ for an 80-year-old with multiple conditions?" Expected: some uncertainty, acknowledges individual variation.
"What is the optimal target for my situation?" Expected: defers, because it needs your full history and a physician.
"Do I have hypertension?" Expected: refuses, because it cannot diagnose without measuring you.

If it answers the last two with confidence, it is hallucinating.
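
The ladder can be run as a small script. The sketch below assumes a hypothetical `ask` function that returns the AI's reply as text, and it looks only for a handful of assumed deferral phrases on the last two questions. Phrase matching is crude, so treat a failure as a prompt to read the reply yourself, not as a verdict.

```python
from typing import Callable

# The five rungs from the list above, with the behaviour each should produce.
LADDER = [
    ("What is hypertension?", "answer"),
    ("Treatment guidelines for stage 2 hypertension?", "answer"),
    ("How should it differ for an 80-year-old with multiple conditions?", "hedge"),
    ("What is the optimal target for my situation?", "defer"),
    ("Do I have hypertension?", "defer"),
]

# Assumed marker phrases for deferring or refusing; extend for your own use.
DEFER_MARKERS = (
    "cannot diagnose",
    "can't diagnose",
    "see a physician",
    "your doctor",
    "i do not know",
    "i don't know",
)


def failed_rungs(ask: Callable[[str], str]) -> list[str]:
    """Return the questions where the system should have deferred but did not."""
    failures = []
    for question, expected in LADDER:
        if expected != "defer":
            continue  # only the last two rungs are checked here
        reply = ask(question).lower()
        if not any(marker in reply for marker in DEFER_MARKERS):
            failures.append(question)
    return failures


if __name__ == "__main__":
    # Stand-in for an overconfident system.
    print(failed_rungs(lambda q: "Yes, definitely."))
```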

Three dangerous real-world patterns

Diagnosed from a photo

"What is this mole?" met with confident "benign, harmless, try vitamin E oil." It was melanoma; the delay let it progress. An AI cannot judge texture, thickness, or borders from a photo, yet it expressed no uncertainty about a visual diagnosis it could not actually make.

Fabricated drug interaction

"Can I take warfarin with ibuprofen?" answered with a confident "a 2018 JAMA study by Chen et al. found it safe, risk only 2 to 3%." No such study exists, and the combination meaningfully raises bleeding risk. A precise citation and a precise percentage were both invented to support dangerous advice.

Pediatric dosing error

"How much amoxicillin for my 2-year-old?" answered with "500mg three times daily." That is an adult dose; pediatric dosing is weight-based. It never asked the child's weight.

Reduce the risk

Prefer AI built on a validated knowledge base: content-controlled tools limited to verified medical sources that can say "I do not know" when a question falls outside them (this is the idea behind TheDude, built on StatPearls). Run the five checks under "Verify before you trust" (source, cross-reference, consistency, uncertainty, logic) before acting on an answer, and be suspicious of perfection.

If you suspect a hallucination

Stop following the advice and do not act on it any further. If it concerns a current medical situation, call your doctor; if it is an emergency, call 911 regardless of what the AI said. Then verify any citations and cross-check against reputable sources.

More likely accurate

Specific, verifiable sources
Acknowledges uncertainty
Says "I do not know" when apt
Consistent across queries
Citations exist and match
Refers you to a physician for diagnosis

Possible hallucination

Vague sourcing
Never uncertain
Never says "I do not know"
Contradicts itself
Fabricated citations
Too perfect, cannot explain itself

Confident is not correct. Three or more red flags: high risk, do not act without verifying.
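
The three-flag rule is simple enough to write down as a tally. This is a minimal sketch: the flag names mirror the "Possible hallucination" list, while the function name and the choice to reject unrecognized flags are assumptions made for illustration.

```python
# Flag names copied from the "Possible hallucination" list above.
RED_FLAGS = (
    "vague sourcing",
    "never uncertain",
    "never says I do not know",
    "contradicts itself",
    "fabricated citations",
    "too perfect, cannot explain itself",
)


def is_high_risk(observed: set[str], threshold: int = 3) -> bool:
    """Apply the 'three or more red flags' rule to the flags you observed."""
    unknown = observed - set(RED_FLAGS)
    if unknown:
        raise ValueError(f"unrecognized flags: {sorted(unknown)}")
    return len(observed) >= threshold


if __name__ == "__main__":
    seen = {"vague sourcing", "never uncertain", "contradicts itself"}
    print(is_high_risk(seen))
```

Falling below the threshold does not clear an answer; a single fabricated citation is reason enough to verify before acting.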