Last time I discussed the book Freakonomics and how it showed that people do what they are incentivized to do, often with unintended results. Today I’m going to discuss a recently controversial aspect of medical education where Freakonomics applies: the USMLE Step 2 CS (Clinical Skills) exam.
In 2004, “the powers that be” (more on who these powers are later) declared that all US medical students must pass this newly created clinical skills examination in order to become licensed.
Wait, you may wonder. Were medical students– our future physicians– not being taught clinical skills before 2004? Were medical schools not testing them? As a medical school graduate of the class of 1996, I can assure you that yes, learning clinical skills was part of the medical school curriculum well before the Step 2 CS exam. And yes, my medical school tested me. Repeatedly. And if I didn’t pass these tests, I would not have graduated. And if I didn’t graduate, I couldn’t get licensed. That’s kind of how medical school works.
So why create the “USMLE Step 2 CS” exam, and what is it? In short, it is a standardized assessment of clinical skills. At least, that’s the intention. The implication of standardized testing is that anyone who has passed it has met some minimum ‘standard’ (there’s that word again) of skills and/or knowledge. That may sound reasonable, but if there’s any chance that the test doesn’t accomplish that, the entire rationale for the test’s existence falls apart. If you think that isn’t possible, that surely this test is a rigorously proven instrument, think again.
This is not a traditional pen-and-paper test. Students are placed in a simulated environment with actors presenting as patients (referred to as “SP’s” on the medical student message boards I’ve lurked on, which I assume stands for ‘simulated patient’). These SP’s are interviewed and examined by the examinees, who are observed (and also filmed, apparently) by some ‘examiner’ and graded based on the interaction. But what exactly are examinees being evaluated on? Things like hand washing, it turns out.
Hand washing is widely cited as an important infection control measure, and I don’t dispute that. I suppose that is why it became an important part of the CS exam. I had heard that students basically fail the exam if they don’t wash their hands correctly. The idea that you could somehow wash your hands incorrectly on this test, in a setting that doesn’t demand the sterility of an operating room, sounded too bizarre to be true. But a search on Reddit confirmed that it was true, with one poster saying “I am having a mild freak out. Do I go in and wash my hands THEN shake their hand? Or go in, shake hands, interview then wash hands and do the physical? At my school for our [practice clinical skills exam], I just hit the Purell on my way in and out of the room.” and another commenting “If the patient on the CS exam is stable and not in distress or pain, how would you keep the introductory flow going when going straight to washing your hands? You would walk past them to the sink, wash your hands and then shake their hands or no shake? I could see how you do that while talking and then going to shake their hands but I think it’s most important before the actual physical examination from what I’ve been reading.” Wow. And I thought medical school was bad when I went through it over 20 years ago WITHOUT having to deal with yet another high-stakes exam nit-picking when/how I should wash my hands or whether hand sanitizer was good enough.
I should add that I’m not entirely unfamiliar with the concept of a simulated patient exam. My medical school was interested in implementing a simulated patient evaluation as part of their curriculum while I was a student, and I was forced to participate in its development. It didn’t count for my grade, as they were still working out the kinks, but I recall having to run a gauntlet of interviewing SP’s and quickly jotting down a progress note detailing my reasoning and diagnosis after each encounter.
My lingering impression 20+ years later is that the process was absurd. I knew it was fake, and I found it hard to take it very seriously. Additionally, these simulated patients were not medically trained and only knew as much about the disease they were trying to simulate as they were told, and could remember. I recall examining the ‘abdominal pain’ patient and asking her whether it hurt more when I gradually applied pressure to her belly or when I quickly released it (assessing for the presence of ‘rebound tenderness’, a classic sign of peritonitis). She abruptly broke character and stopped writhing and moaning on the gurney to pause and think before she finally answered, “I don’t know.”
After I left the room, but before one of my classmates could step up to the plate and take their turn to examine her, I heard an instructor enter and exasperatedly review what she was supposed to do and say. It’s a tall order to have a non-medically educated person fake symptoms in a realistic manner and apparently that didn’t occur to the administrators in charge of implementing this simulated encounter. That’s a pretty big weakness that I don’t think is possible to overcome short of having attending physicians be the SP’s.
While the current USMLE Step 2 CS exam is probably more evolved than the experimental one from my medical student days, I have my doubts that it is significantly better — or worthwhile. Even if they were to strip out medical knowledge that a patient would need to know in order to simulate a clinical scenario and just evaluate clinical skills, what does that mean? Does that mean maintaining appropriate eye contact? And how do we define ‘appropriate’? What about shaking hands? Do we wash hands or shake hands first? Does it matter? In physics, the ‘observer effect’ is the concept that simply observing a situation or phenomenon necessarily changes that phenomenon. How an examinee is affected by the artificial nature of an observed simulated patient examination is somehow lost on test makers. I couldn’t suspend disbelief enough to perform very well on my experimental clinical exam, and I doubt I’m alone in that. I don’t think that was a fair assessment of me. Why should students be punished for performing poorly on an exam that was so obviously artificial that they were too weirded out to be at their best?
Speaking of punishment, because of the resources involved (all of the SP’s need to be trained and paid, space for the testing centers needs to be purchased or rented, IT and other staff need to be hired, plus fees, overhead, etc.; none of that is free or even cheap), the exam costs medical students $1,275 to take. And there are only five testing centers for the exam, so students who don’t live in or near one of the centers in Los Angeles, Philadelphia, Houston, Chicago, or Atlanta have to pay for travel and for a night or two at a hotel.
So why implement an expensive, inconvenient, and unproven standardized test for something medical schools have been doing as part of their mission since their inception? Is a standardized test better than individual medical schools teaching and testing clinical skills?
It doesn’t seem to be. As endstep2cs.com, a group founded by Harvard medical students, notes, “School-based skills exams have several benefits over the national CS exam. For example, while Step 2 CS provides students only a pass/fail grade and a bar graph of clinical versus interpersonal performance, medical schools can provide students with a much more comprehensive assessment as well as targeted feedback that can allow them to improve their skills in communication, history taking, physical examination, and clinical reasoning.”
So why go through all of this? That brings me back to the Freakonomics of why this happened.
Let’s start with “the powers that be” that declared that this new test be mandatory. In this case, two entities were responsible: the National Board of Medical Examiners and the Federation of State Medical Boards. Together they administer the USMLE– the series of tests required for medical licensure. They get to make changes in the exams (which includes the addition of another exam to the series) and do not have to answer to anyone else if they choose not to.
That’s a problem, as noted by the American Medical Association in a 2003 article in which they discuss the reasons for their opposition to the (then) pending new exam requirement (http://journalofethics.ama-assn.org/2003/12/pfor1-0312.html). They noted, “While the NBME and FSMB leadership have heard the AMA’s concerns, they have not taken meaningful measures to address them. The inability of the AMA, AAMC, and medical student groups to influence medical licensing reform has raised fears that professional organizations dealing with medical education have lost their ability to self-regulate.” And so the Step 2 CS Exam came to pass. Again: why?
The answer is money. There were roughly 28,000 medical school graduates in 2017, all of whom had to take and pass this exam. So 28,000 graduates paying $1,275 each comes out to $35,700,000. That’s more than thirty-five million dollars into the coffers of the NBME and FSMB every year. Prior to 2004, that torrent of cash did not exist for them. At that rate, the NBME and FSMB have taken in nearly half a billion dollars over the 13 years since the exam’s inception, without any proven benefit to physicians or patients. That’s what business people call “creation of revenue streams”.
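For anyone who wants to check my math, here is a rough back-of-the-envelope sketch in Python. The graduate count and the fee are the figures quoted above; the function names are mine, and the 13-year total assumes the same number of graduates and the same fee every year, which is a simplification on my part and not actual revenue data.

```python
# Back-of-the-envelope estimate using the figures quoted in this post.
GRADUATES_PER_YEAR = 28_000  # rough 2017 graduate count cited above
EXAM_FEE_USD = 1_275         # Step 2 CS fee cited above
YEARS_SINCE_2004 = 13        # years since the exam became mandatory


def annual_revenue(graduates: int, fee_usd: int) -> int:
    """Fee revenue for one year if every graduate pays once."""
    return graduates * fee_usd


def cumulative_revenue(graduates: int, fee_usd: int, years: int) -> int:
    """Total over several years.

    ASSUMPTION: flat graduate count and flat fee every year. Real
    numbers almost certainly varied, so treat this as an estimate.
    """
    return annual_revenue(graduates, fee_usd) * years


if __name__ == "__main__":
    yearly = annual_revenue(GRADUATES_PER_YEAR, EXAM_FEE_USD)
    total = cumulative_revenue(GRADUATES_PER_YEAR, EXAM_FEE_USD, YEARS_SINCE_2004)
    print(f"Per year: ${yearly:,}")
    print(f"Over {YEARS_SINCE_2004} years: ${total:,}")
```

Under those assumptions the yearly figure is $35,700,000 and the 13-year figure is $464,100,000, which is where “nearly half a billion” comes from.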
So in the end, these organizations saw a chance to create roughly $35,000,000 per year in revenue for themselves. Since nobody could stop them, they did it. One group is, after all, called the National Board of Medical Examiners. True to their name, they’re in the business of making students take tests. That’s how they make money. If we decided collectively as a society that tests were overrated and we needed fewer of them, the NBME would make much less money. And they don’t want to make less money. So, they make more tests.
That’s what Freakonomics showed me.