Module 9: Clinical Scenarios (Applying the Framework)

Eight Patients Walk Into Your Office With AI Printouts, and Here's What Actually Happens


Introduction

Theory is comfortable. Practice is where you learn whether you actually understood anything.

For the past eight modules, I’ve given you frameworks, principles, scripts, and philosophy. The velociraptor test. The 10 billion sensors. Intelligent Humility. The liability asymmetry. You’ve nodded along, maybe highlighted some phrases, possibly thought “that makes sense.”

Now let’s see if it actually works.

This module is different. No new concepts. No additional frameworks. Just patients—eight of them, each presenting with a story that involves AI, each requiring you to integrate everything you’ve learned into a coherent clinical encounter.

These aren’t hypotheticals I invented. They’re composites of real patients I’ve seen, real AI failures I’ve witnessed, real integration challenges I’ve navigated. The details are changed, but the patterns are true.

For each scenario, I’m going to walk you through:

  • What the patient presents with
  • What AI told them (and why it seemed reasonable)
  • What AI missed (and why it matters)
  • What your examination reveals
  • The actual dialogue—what you say, how you say it
  • The teaching moment that helps them use AI better next time

Read these actively. Imagine yourself in the room. Practice the dialogues out loud if you can—they’ll flow more naturally when you need them.

Because you will need them. These patients are coming. Maybe not these exact presentations, but these patterns. AI-informed, AI-misled, AI-reassured patients who need you to integrate their digital research with your physical reality.

Let’s practice.

Scenario 1: The Chest Pain

Presentation:

25-year-old man, athletic build, presents with three days of left-sided chest pain. Worse with deep breathing and movement. Before you can ask your opening question, he says: “I already looked this up on ChatGPT. It said probably costochondritis. But my dad had a heart attack at 52, so I wanted to get checked.”

What AI Told Him:

  • Left-sided chest pain worse with movement and breathing suggests musculoskeletal cause
  • Costochondritis common in young adults
  • Cardiac causes unlikely given age and symptom pattern
  • Recommended seeing doctor “if concerned about family history”

What AI Got Right:

  • Symptom pattern does suggest musculoskeletal etiology
  • Costochondritis is a reasonable consideration
  • Appropriate to recommend evaluation given family history concerns

What AI Missed:

  • Cannot perform physical examination to confirm reproducibility
  • Cannot assess whether family history significantly elevates risk
  • Cannot rule out other causes (PE, pericarditis, etc.) without exam
  • Cannot provide the reassurance that comes from actual evaluation

Your Examination Findings:

Normal vital signs. Comfortable appearance. No distress. Reproducible point tenderness at left costochondral junction. Pain reproduced exactly with palpation. Normal cardiac exam. Lungs clear. No leg swelling or other PE risk factors.

Integration Dialogue:

You: “Let me start by saying—ChatGPT actually got this right. The pattern you described, and what I’m finding on exam, is consistent with costochondritis. Watch this.”

[Press on costochondral junction]

Patient: “Ow—yeah, that’s exactly it.”

You: “That right there is something AI couldn’t do. I just reproduced your pain with physical pressure on your chest wall. Heart pain doesn’t work that way—you can’t make cardiac pain happen by pressing on the outside. That’s a finding that turns ‘probably costochondritis’ into ‘confirmed costochondritis.’”

Patient: “What about my family history?”

You: “Good question, and I understand why it worries you. Here’s the reality: your dad had his heart attack at 52. You’re 25. At your age, with this symptom pattern, with a normal exam and reproducible chest wall tenderness, your cardiac risk is essentially zero. That family history becomes relevant in about 15-20 years—and we’ll monitor you appropriately as you get there. But today? This is musculoskeletal. AI got the diagnosis right. I just confirmed it.”

Patient: “So I didn’t need to come in?”

You: “Actually, you did. Not because AI was wrong—it wasn’t. But because the anxiety about your dad was real, and that anxiety deserved a real answer. AI can give you probabilities. I can give you certainty. You’ll sleep better tonight knowing I examined you than you would have trusting statistics alone. That’s appropriate use of the healthcare system.”

Teaching Moment:

“You used AI well here. You researched your symptoms, got a reasonable answer, but came in anyway because of your family history concern. That’s exactly right. AI for information, physician for confirmation. Keep doing that.”

Scenario 2: The Pediatric Fever

Presentation:

Mother brings 9-month-old with fever of 102°F for 24 hours. She’s been up all night researching. “The app said give Tylenol, push fluids, and monitor. It said to come in if the fever goes above 104 or if she has a rash or stiff neck. She doesn’t have any of those, but… I don’t know. Something feels off.”

What AI Told Her:

  • Fever common in infants, usually viral
  • Antipyretics for comfort
  • Hydration important
  • Red flags: fever >104°F, rash, neck stiffness, lethargy, difficulty breathing
  • See doctor if red flags present or parental concern

What AI Got Right:

  • Most fevers in this age group are viral
  • General management advice is sound
  • Red flag list is accurate
  • Appropriately included “parental concern” as reason to be seen

What AI Missed:

  • Cannot assess baby’s interaction with environment
  • Cannot evaluate quality of cry
  • Cannot determine hydration status
  • Cannot see subtle lethargy versus normal tired baby
  • Cannot validate or investigate parental “something feels off”

Your Examination Findings:

Baby is quiet. Not crying, but not engaging either. Making poor eye contact. When you try to get her attention, responses are delayed. Skin turgor slightly decreased. Heart rate 180 (higher than fever alone would explain). Capillary refill 3 seconds.
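Those findings are exactly the inputs a symptom checker never receives. To make that concrete, here is a minimal sketch of the kind of screening logic at work, assuming rough textbook reference ranges for a 6-to-12-month-old; the thresholds are illustrative stand-ins, not a validated pediatric early-warning score.

```python
# Illustrative only: infant vital signs checked against approximate
# reference ranges (assumed values, not a validated scoring tool).

def flag_infant_vitals(hr_bpm: int, cap_refill_sec: float, temp_f: float) -> list[str]:
    """Return concerning findings for a 6-12-month-old, by rough cutoffs."""
    flags = []
    # Awake heart rate at this age is very roughly 90-160 bpm; fever raises
    # it somewhat, but 180 exceeds what fever alone would explain.
    if hr_bpm > 160:
        flags.append(f"tachycardia: HR {hr_bpm} above ~160 bpm")
    # Capillary refill beyond ~2 seconds suggests poor perfusion.
    if cap_refill_sec > 2:
        flags.append(f"delayed capillary refill: {cap_refill_sec} s")
    if temp_f >= 100.4:
        flags.append(f"fever: {temp_f} F")
    return flags

print(flag_infant_vitals(hr_bpm=180, cap_refill_sec=3, temp_f=102))
# All three fire. None of them exist as data until someone examines the baby.
```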

Integration Dialogue:

You: “Mom, I need to tell you something. You were right to come in.”

Mother: “But the app said—”

You: “The app gave you good general guidance. For most fevers in 9-month-olds, that advice would be correct. But here’s what I’m seeing that no app could see: when I try to engage your daughter, she’s not tracking my face the way a healthy 9-month-old should. Her heart rate is faster than I’d expect even with this fever. Her skin doesn’t bounce back when I pinch it—that’s dehydration. She doesn’t hit a single item on the red flag checklist, but she’s not right.”

Mother: “So what does that mean?”

You: “It means we need to admit her for evaluation and IV fluids. It might be a serious bacterial infection, it might be severe dehydration from a virus—I don’t know yet. But I know she needs more than Tylenol and fluids at home.”

Mother: “I almost didn’t come in. The app said she was fine…”

You: “The app said her symptoms as you described them didn’t meet red flag criteria. But you said something the app couldn’t process: ‘something feels off.’ That’s not nothing. That’s nine months of continuous observation telling you this baby isn’t acting like your baby. Your threat detection system picked up a pattern the app couldn’t see. You did exactly the right thing coming in.”

Teaching Moment:

“Here’s what I want you to remember: with kids, especially babies, parental instinct is the most sensitive diagnostic tool we have. Your brain has been building a model of normal for this baby since birth. When that model throws an error—’something’s off’—that’s real data. Apps check boxes. You know your child. Never let an algorithm override that knowledge.”

Outcome:

Blood cultures grew Streptococcus pneumoniae. Early sepsis caught before clinical deterioration. IV antibiotics, full recovery.

Scenario 3: The Diabetic Ketoacidosis

Presentation:

28-year-old woman with Type 1 diabetes for 15 years. Three days of excessive thirst, frequent urination, fatigue, and “feeling weird.” She checked ChatGPT, which recommended adjusting insulin and monitoring blood sugar more frequently. Her sugars have been running 300-400 despite increased insulin.

What AI Told Her:

  • Symptoms suggest hyperglycemia
  • Increase basal and bolus insulin
  • Check blood sugar every 2-3 hours
  • Increase hydration
  • Contact doctor if not improving within 24-48 hours

What AI Got Right:

  • Recognized hyperglycemia pattern
  • Insulin adjustment is part of management
  • Hydration is important
  • Recommended medical contact if not improving

What AI Got Wrong:

  • Failed to recognize DKA risk indicators
  • “Feeling weird” should have triggered more urgent guidance
  • 24-48 hour timeline is dangerous if DKA developing
  • Cannot assess physical signs that distinguish high sugar from DKA

Your Assessment:

She looks ill. Respirations are deep and labored—classic Kussmaul breathing. Her breath has a fruity smell you can detect from three feet away. She’s slightly confused, answering questions slowly. Blood sugar 450, urine shows large ketones, VBG shows pH 7.15.
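For readers who want the numbers behind the distinction the dialogue is about to draw: the commonly taught thresholds for DKA are glucose above roughly 250 mg/dL, pH below 7.30, and significant ketones, with severity stratified by the degree of acidemia. A minimal sketch of that check, using the usual published cutoffs; it is illustrative, not a management tool.

```python
def classify_dka(glucose_mg_dl: float, ph: float, ketones: bool) -> str:
    """Textbook-style DKA check (standard cutoffs; illustrative only)."""
    if not (glucose_mg_dl > 250 and ph < 7.30 and ketones):
        return "criteria not met: hyperglycemia without DKA (by these inputs)"
    # Severity is conventionally graded by how acidemic the blood is.
    if ph < 7.00:
        return "severe DKA"
    if ph < 7.25:
        return "moderate DKA"
    return "mild DKA"

print(classify_dka(glucose_mg_dl=450, ph=7.15, ketones=True))
# -> moderate DKA. The pH, not the glucose, is what separates this episode
#    from the high sugars she has managed at home before.
```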

Integration Dialogue:

You: “I need you to understand what’s happening, because this is serious. You’re in diabetic ketoacidosis. DKA. This is an emergency.”

Patient: “But I’ve had high sugars before…”

You: “This isn’t just high sugar. Here’s what I’m detecting that ChatGPT couldn’t: your breath smells like fruit—that’s acetone from ketone production. Your breathing is deep and labored—that’s your body trying to blow off acid. You’re a little confused—your brain is affected. These are physical signs that differentiate ‘high blood sugar that needs adjustment’ from ‘DKA that needs an ICU.’ AI can’t smell, can’t assess your breathing pattern, can’t detect subtle mental status changes. That’s why home management advice was wrong here.”

Patient: “I thought I was managing it…”

You: “You were doing what seemed reasonable based on what you knew. But DKA can develop even when you’re checking sugars and adjusting insulin, especially if there’s a trigger like infection. The physical signs tell me what the glucose number alone couldn’t. We’re starting IV fluids and insulin now, and you’re going to the hospital.”

Teaching Moment:

“Here’s what I want you to remember: blood sugar numbers aren’t the whole story. When you’re feeling this bad—not just high-sugar bad, but ‘something’s really wrong’ bad—come in. Don’t try to manage it at home based on AI advice. The signs that distinguish manageable hyperglycemia from DKA require someone who can see you, smell you, and assess your breathing. Those are sensors you can’t replicate at home.”

Outcome:

ICU admission for DKA management. Trigger was developing UTI. Full recovery with education about DKA warning signs.

Scenario 4: The Medication Interaction

Presentation:

72-year-old man on warfarin, metoprolol, lisinopril, and metformin. He wants to start a turmeric supplement for joint pain after reading about its anti-inflammatory properties. He asked ChatGPT about drug interactions and was told “no significant interactions documented.”

What AI Told Him:

  • Searched standard drug interaction databases
  • No major documented interactions between turmeric and his medications
  • Generally safe to use as supplement
  • Recommended “consulting healthcare provider” as standard caveat

What AI Got Right:

  • No major drug-drug interactions in standard databases
  • Turmeric is generally considered safe
  • Appropriately recommended provider consultation

What AI Missed:

  • Turmeric has anticoagulant properties that can potentiate warfarin
  • Patient has CKD (not mentioned to AI)
  • Supplement would need dose consideration for renal function
  • Alternative approaches might be safer

Your Assessment:

Review his chart: eGFR 35 (stage 3b CKD), INR at last check 2.8 (high end of the therapeutic range). Turmeric can potentiate warfarin, so standard supplement doses could push his INR dangerously high, and there is little safety data for supplements at his level of kidney function.
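That chart review does something a database lookup cannot: it layers patient-specific context over population-level interaction data. Here is a minimal sketch of the difference; every rule and threshold in it is my own illustrative stand-in, not the logic of any real interaction database.

```python
# Contrast between a database-style lookup and a context-aware check.
# All rules below are invented stand-ins for illustration.

KNOWN_INTERACTIONS: dict[tuple[str, str], str] = {}  # turmeric x warfarin: absent

def database_check(drug_a: str, drug_b: str) -> str:
    return KNOWN_INTERACTIONS.get((drug_a, drug_b), "no documented interaction")

def contextual_check(supplement: str, meds: list[str],
                     inr: float, egfr: float) -> list[str]:
    warnings = []
    if supplement == "turmeric" and "warfarin" in meds and inr >= 2.5:
        warnings.append("turmeric may potentiate warfarin; INR already high-therapeutic")
    if egfr < 60:
        warnings.append("reduced kidney function: little supplement safety data")
    return warnings

print(database_check("turmeric", "warfarin"))  # what the patient was told
print(contextual_check("turmeric", ["warfarin", "metoprolol"], inr=2.8, egfr=35))
# The second answer only exists because someone opened the chart.
```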

Integration Dialogue:

You: “I’m glad you asked me before starting this. ChatGPT told you there were no interactions, and in a narrow sense, that’s true—it’s not in the standard drug interaction databases. But here’s what AI didn’t account for.”

Patient: “I told it all my medications…”

You: “Did you tell it your kidney function?”

Patient: “No, I didn’t think about that.”

You: “That’s the problem. Turmeric has mild blood-thinning properties—usually not an issue, but you’re already on warfarin and your INR is at the high end of therapeutic. Adding turmeric could push you into bleeding territory. Also, your kidney function is reduced, and we have very little data on how supplements behave at that level of kidney function. Standard doses might not be safe for you. AI knew your medication list. It didn’t know your organ function, your current INR, your individual risk factors. That’s population-level safety versus you-level safety.”

Patient: “So I can’t take anything for my joints?”

You: “I didn’t say that. Let’s talk about what would actually work for you, with your specific situation. There are options—we just need to pick the right one, not the one an algorithm approved for a generic patient.”

Teaching Moment:

“This is a perfect example of AI’s blind spot. It knows general rules—interactions, contraindications, standard doses. But medicine is individual. Your kidney function, your INR, your complete picture—that context changes everything. For medication questions, always check with us. We know you, not just your medication list.”

Scenario 5: The “Just Anxiety”

Presentation:

24-year-old woman with four weeks of palpitations, shortness of breath, lightheadedness, and exercise intolerance. She asked ChatGPT, which told her the symptoms were consistent with panic attacks. She’s frustrated: “Everyone keeps saying it’s anxiety. It’s not anxiety. I know what anxiety feels like. This is different.”

What AI Told Her:

  • Symptom cluster consistent with panic disorder
  • Common in young women
  • Recommended breathing exercises, therapy, stress management
  • Noted cardiac causes “unlikely at your age”
  • Suggested seeing doctor “if symptoms persist”

What AI Got Right:

  • Panic attacks are the most common cause of these symptoms in young women
  • General management suggestions for panic are reasonable
  • Did recommend medical evaluation if persistent

What AI Got Wrong:

  • Failed to ask about recent COVID infection (she had it 6 weeks ago)
  • Missed the exercise intolerance as a red flag
  • Didn’t consider post-viral dysautonomia
  • Dismissed her self-knowledge (“I know what anxiety feels like”)

Your Examination:

Resting heart rate 118. You have her lie flat for 5 minutes and her heart rate settles to 95; when she stands, it jumps to 145 with lightheadedness. She’s not anxious during this test; her body is simply responding abnormally to position change.
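The bedside test has a simple quantitative core. The conventional adult criterion for POTS is a sustained heart-rate rise of at least 30 beats per minute within about ten minutes of standing, without a drop in blood pressure; a minimal sketch of that arithmetic follows. The full diagnosis also requires chronic symptoms, so treat this as the screening step only.

```python
def orthostatic_hr_rise(supine_hr: int, standing_hr: int) -> tuple[int, bool]:
    """HR rise on standing, and whether it meets the conventional adult
    POTS threshold of >= 30 bpm. Screening arithmetic only."""
    rise = standing_hr - supine_hr
    return rise, rise >= 30

print(orthostatic_hr_rise(supine_hr=95, standing_hr=145))
# -> (50, True): well past the threshold, and produced by standing up,
#    not by anxious thoughts.
```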

Integration Dialogue:

You: “I need to show you something. Lie down for me.”

[After 5 minutes lying flat]

You: “Your heart rate is 95. Now stand up slowly.”

[She stands, you recheck]

You: “Your heart rate just jumped to 145. You didn’t think anxious thoughts. You didn’t have a panic attack. You simply stood up. That’s not anxiety. That’s dysautonomia—your autonomic nervous system isn’t regulating your heart rate properly when you change position.”

Patient: “So I’m not crazy?”

You: “You’re not crazy, and you’re not anxious—at least not as the primary problem. You had COVID six weeks ago. What you’re describing is consistent with POTS—postural orthostatic tachycardia syndrome. It’s a known complication of COVID. AI pattern-matched your symptoms to the most common cause in young women. But AI couldn’t check your orthostatic vitals. That one test changes everything.”

Patient: “Everyone else just said anxiety…”

You: “I know. And I’m sorry. Young women get dismissed with ‘anxiety’ by AI and sometimes by physicians too. But you said something important: ‘I know what anxiety feels like. This is different.’ That’s expertise. You’re an expert on your own body. When your expertise conflicts with a pattern-match—whether from AI or a rushed doctor—trust yourself and keep pushing for answers.”

Teaching Moment:

“Here’s what I want you to remember: when you tell an AI or a doctor that something ‘feels different’ from what you’ve experienced before, that’s data. Your body has been sensing itself for 24 years. You know the difference between your normal anxiety and something new. Never let a pattern-match override your lived experience. If you know it’s different, it’s probably different. Find someone who will actually examine you.”

Outcome:

Formal tilt table test confirmed POTS. Started on treatment protocol with gradual improvement. She’s now able to exercise again.

Scenario 6: The Melanoma Miss

Presentation:

45-year-old man with a mole on his back that his wife says has changed over the past six months. He took a photo and asked ChatGPT to analyze it. AI said it “appears to be a benign nevus” but recommended monitoring for changes.

What AI Told Him:

  • Image appears to show a pigmented nevus
  • Borders appear relatively regular
  • Color relatively uniform
  • Recommended monitoring for changes
  • Suggested dermatologist “if concerned”

What AI Got Right:

  • Acknowledged it couldn’t provide definitive diagnosis
  • Recommended monitoring and professional evaluation if concerned
  • Didn’t claim to diagnose definitively

What AI Got Wrong:

  • Photo quality and angle masked key features
  • AI cannot assess lesion evolution (patient’s wife could)
  • “Appears benign” created false reassurance
  • Delayed presentation by several months

Your Examination:

The lesion is 8mm, asymmetric, with irregular borders and multiple colors (brown, black, and areas of regression with pink). His wife confirms it used to be a uniform brown circle.
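Those findings map directly onto the ABCDE mnemonic for melanoma, which is worth making explicit because it is also a checklist patients can learn. A minimal sketch follows; the boolean inputs are judgment calls that only an in-person exam, or someone who has watched the lesion over time, can supply.

```python
from dataclasses import dataclass

@dataclass
class Lesion:
    asymmetry: bool         # A: the halves don't match
    border_irregular: bool  # B: jagged, notched edges
    color_varied: bool      # C: more than one color
    diameter_mm: float      # D: concern rises above 6 mm
    evolving: bool          # E: changed over time (the wife's observation)

def abcde_flags(lesion: Lesion) -> list[str]:
    flags = []
    if lesion.asymmetry:        flags.append("A: asymmetry")
    if lesion.border_irregular: flags.append("B: irregular border")
    if lesion.color_varied:     flags.append("C: color variation")
    if lesion.diameter_mm > 6:  flags.append("D: diameter > 6 mm")
    if lesion.evolving:         flags.append("E: evolving")
    return flags

print(abcde_flags(Lesion(True, True, True, 8.0, True)))
# All five fire. A single phone photo can at best hint at three of them,
# and it can never supply E.
```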

Integration Dialogue:

You: “I need to talk to you about this mole. AI told you it looked benign based on the photo. But photos lie—lighting, angle, resolution all affect what’s visible. What I’m seeing in person is different.”

Patient: “It doesn’t look that bad to me…”

You: “Let me show you what I’m seeing. This mole is asymmetric—if I drew a line down the middle, the two halves don’t match. The borders are irregular—see how they’re jagged here? And there are multiple colors—brown here, darker here, and this pink area suggests the mole may have been even larger and partially regressed. These are the ABCDE criteria for melanoma. AI couldn’t assess any of this properly from a phone photo.”

Patient: “So it’s cancer?”

You: “I don’t know yet. That’s what the biopsy is for. But I’m concerned enough that we’re doing this today, not monitoring. Here’s the thing: your wife noticed this was changing. She’s been looking at your back for years. Her observation of evolution—that this used to look different—is actually the most important diagnostic information I have. AI looked at a single image. She saw change over time. That’s the E in ABCDE—evolution.”

Teaching Moment:

“Never trust AI for visual diagnosis. Skin lesions, rashes, anything that requires actually seeing—AI is working with degraded information from photos. Descriptions don’t capture what a trained eye can see in person. And here’s the key: your wife’s observation that this changed matters more than any AI analysis. Someone who knows what your skin looked like before is more valuable than a thousand image analyses.”

Outcome:

Biopsy confirmed melanoma in situ. Excised with clear margins. No evidence of spread. Patient doing well with regular surveillance.

Scenario 7: The Abdominal Pain

Presentation:

22-year-old woman with 18 hours of worsening right lower quadrant abdominal pain. She asked ChatGPT whether this could be appendicitis. AI said “appendicitis typically presents with fever, nausea, and pain that migrates from the umbilicus. Your symptoms are atypical. Consider other causes such as ovarian cyst, constipation, or muscle strain.”

What AI Told Her:

  • Classic appendicitis usually includes fever
  • Migration from periumbilical to RLQ is typical
  • Her presentation is “atypical”
  • Listed alternative diagnoses
  • Recommended ER “if pain becomes severe”

What AI Got Right:

  • Classic appendicitis presentation is fever + migration
  • Differential diagnosis considerations were reasonable
  • Did recommend ER if worsening

What AI Got Wrong:

  • 30% of appendicitis cases don’t have fever
  • “Atypical” presentations are actually common
  • Cannot assess peritoneal signs
  • “Wait for severe pain” is dangerous advice

Your Examination:

She looks uncomfortable. Temperature 99.1°F (not technically fever). When you press on her right lower quadrant and release suddenly, she gasps—positive rebound tenderness. Voluntary guarding. Rovsing’s sign positive. Psoas sign positive.
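One way to see why the absence of fever is weak reassurance: in the Alvarado score, a standard clinical decision rule for appendicitis, fever is worth a single point out of ten, while the exam findings she does have carry more. The sketch below uses the published component weights, but her labs are still pending, so this is an illustration rather than her actual score.

```python
# Alvarado (MANTRELS) score: published component weights, illustrative use.
ALVARADO_WEIGHTS = {
    "migration_of_pain": 1, "anorexia": 1, "nausea_vomiting": 1,
    "rlq_tenderness": 2, "rebound_tenderness": 1, "fever_ge_37_3_c": 1,
    "leukocytosis": 2, "left_shift": 1,
}

def alvarado(findings: dict[str, bool]) -> int:
    return sum(w for name, w in ALVARADO_WEIGHTS.items() if findings.get(name))

# Her bedside findings alone, labs pending (99.1 F is below the 37.3 C cutoff):
print(alvarado({"rlq_tenderness": True, "rebound_tenderness": True}))
# -> 3 of 10 before any bloodwork. The feature the AI leaned on, fever,
#    is worth 1 point; her physical findings are already worth 3.
```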

Integration Dialogue:

You: “I’m concerned about appendicitis.”

Patient: “But ChatGPT said my symptoms were atypical…”

You: “ChatGPT played the statistics—most appendicitis has fever, yours is 99.1°F, which is borderline. But here’s what I’m finding that AI couldn’t assess: when I press on your belly and let go quickly, you have rebound pain. That’s a peritoneal sign—it means there’s inflammation irritating the lining of your abdomen. Your muscles are guarding, tensing up to protect the area. When I press on your left side, it hurts on the right—that’s Rovsing’s sign. These are physical findings that suggest appendicitis regardless of whether you hit the ‘typical’ criteria.”

Patient: “So AI was wrong?”

You: “AI was statistically accurate—most appendicitis looks different from yours. But you don’t have most appendicitis. You have your appendicitis, and your body is showing me signs that trump statistics. Medicine isn’t always typical. Physical exam findings override pattern-matching from symptom descriptions.”

Teaching Moment:

“Here’s the lesson: ‘atypical presentation’ doesn’t mean ‘probably not.’ It means ‘doesn’t fit the textbook.’ Real patients often don’t fit textbooks. When AI tells you your symptoms are ‘atypical,’ that’s not reassurance—that’s just a description. What matters is what examination shows, not what boxes you check. Your peritoneal signs are more diagnostic than any symptom checklist.”

Outcome:

CT confirmed appendicitis. Laparoscopic appendectomy. Uncomplicated recovery. She now knows that “atypical” presentations are still real presentations.

Scenario 8: The Medication Dosing Error

Presentation:

68-year-old woman was prescribed gabapentin by another physician for neuropathic pain. She looked up the medication on ChatGPT to understand dosing. AI provided standard dose escalation schedule (300mg TID target). She followed it. Now she’s drowsy, confused, and unsteady on her feet.

What AI Told Her:

  • Gabapentin standard starting dose 300mg at bedtime
  • Titrate to 300mg three times daily
  • May increase further based on response
  • Common side effects include drowsiness, dizziness
  • “Consult your physician” for dose adjustments

What AI Got Right:

  • Standard dosing information is accurate
  • Side effects are correctly listed
  • Recommended physician consultation

What AI Got Wrong:

  • She has CKD stage 4 (eGFR 22)
  • Gabapentin is renally cleared
  • Standard doses will accumulate to toxic levels
  • Her “side effects” are actually toxicity from accumulation

Your Assessment:

She’s somnolent, ataxic, with slurred speech. Gabapentin level is three times therapeutic. Her last creatinine shows eGFR of 22.
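The arithmetic underneath this case is simple and sobering. US labeling ties gabapentin’s total daily dose to creatinine clearance, roughly as sketched below; treat the ranges as approximate recollections of the label, and note that eGFR and creatinine clearance are not interchangeable, which is part of why this belongs with a prescriber.

```python
# Approximate gabapentin total daily dose ranges by creatinine clearance,
# per US labeling (ranges approximate; illustrative only, not dosing advice).
DOSE_RANGES_MG_PER_DAY = {
    (60, float("inf")): (900, 3600),
    (30, 60): (400, 1400),
    (15, 30): (200, 700),
    (0, 15): (100, 300),
}

def renal_dose_range(crcl_ml_min: float) -> tuple[int, int]:
    for (lo, hi), dose_range in DOSE_RANGES_MG_PER_DAY.items():
        if lo <= crcl_ml_min < hi:
            return dose_range
    raise ValueError("clearance out of range")

print(renal_dose_range(22))
# -> (200, 700) mg/day at her level of kidney function, versus the
#    900 mg/day she reached by following the standard titration schedule.
```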

Integration Dialogue:

You: “I need to explain what happened. You’re having toxicity from gabapentin—not regular side effects, but the drug building up to dangerous levels.”

Patient’s daughter: “But she took the dose the internet said…”

You: “The dose the internet said is correct—for people with normal kidney function. Your mother’s kidneys work at about 22% of normal. Gabapentin is eliminated through the kidneys. When you give a normal dose to someone with reduced kidney function, the drug accumulates instead of being cleared. She took what would be the right dose for most people. But she’s not most people.”

Patient’s daughter: “AI should have asked about that…”

You: “AI did say to consult a physician about dosing. But it also provided the standard numbers, and those numbers seemed like an answer. Here’s the fundamental problem: AI gives population-level information. Your mother needed individual dosing based on her kidney function. For her, the correct dose is maybe 100mg once daily—a fraction of what she took. AI doesn’t know her creatinine. It doesn’t know her eGFR. It knows the standard numbers. The standard numbers almost killed her.”

Teaching Moment:

“Never dose medications based on AI or internet information, even if it seems straightforward. Drug dosing requires knowing your specific organ function—kidneys, liver—and your other medications. The ‘standard dose’ is standard for an average person with normal function. You’re not an average person. You’re you, with your specific kidneys and your specific liver and your specific situation. Medication dosing is one area where population advice is genuinely dangerous.”

Outcome:

Gabapentin held. Levels cleared over 72 hours. Symptoms resolved. Restarted at renal-adjusted dose with good pain control and no toxicity.

Summary: The Integration Patterns

Across these eight scenarios, notice the recurring patterns:

The Opening: You acknowledge what AI got right before addressing what it missed. This builds trust and positions you as fair arbiter, not defensive competitor.

The Pivot: You transition from AI’s assessment to your findings with “Here’s what I’m finding that AI couldn’t assess…”

The Exam: You make your physical findings explicit. You don’t just examine—you narrate what you’re detecting and why it matters.

The Education: You explain the specific limitation that caused AI to miss this case—visual assessment, physical findings, individual context, atypical presentations.

The Teaching Moment: You give patients a framework they can use in the future. Not “don’t use AI” but “here’s how to recognize when AI might be wrong.”

These patterns work because they’re collaborative rather than dismissive, educational rather than judgmental, and specific rather than vague.

Practice them. Adapt them to your specialty, your patient population, your style. The words don’t have to be mine—but the structure should become yours.

Final Remarks

Eight patients. Eight AI assessments. Eight moments where the gap between algorithmic pattern-matching and clinical medicine became visible.

In every case, AI did something reasonable. It pattern-matched symptoms to likely diagnoses. It provided standard information accurately. It recommended physician evaluation when uncertain.

And in every case, AI missed something that mattered. Something physical. Something contextual. Something individual.

That’s not because AI is bad. It’s because AI is incomplete. It’s a tool that does certain things well and other things not at all. Your job isn’t to fight AI or dismiss it—it’s to know where it ends and where you begin.

The velociraptor brain. The 10 billion sensors. The physical exam. The individual context. These are your domain.

AI generates hypotheses. You test them.

AI matches patterns. You see patients.

AI provides information. You make decisions.

These scenarios are your practice ground. Use them. Adapt them. And when the real patients come—with their AI printouts and their ChatGPT conversations and their “the app said” opening lines—you’ll be ready.

Not to fight them. To integrate them.

That’s the skill of medicine in the AI age.