Medical educators have used machine learning to reconsider, and ultimately select, some 20 qualified residency applicants who had been screened out by human reviewers at their institution.
The residency program to which the team applied the technique is a large internal-medicine program that handles a heavy volume of residency applications every year. From 2018 to 2020, for example, the program received more than 8,200 applications.
The AI-based decision-support tool that kept the 20 standout applicants from slipping away can weigh more than 60 applicant factors, including qualifications, shortcomings, experiences, skill sets and demographic features.
In most of the near-misses, residency program leadership “felt the human reviewer had either overlooked the applicant in the setting of the large application burden or had overweighted a single factor (like a low score in the U.S. Medical Licensing Examination),” the researchers write in a study published in Academic Medicine, which is produced by the Association of American Medical Colleges.
“Identifying such ‘diamonds in the rough’ for further holistic review,” they add, “is a promising aspect of our approach and highlights how such machine learning models can be adjusted for different uses [such as] widescale screening vs. more directed selection decisions.”
The work was conducted at New York University by lead author Jesse Burk-Rafel, MD, senior researcher Yindalon Aphinyanaphongs, MD, PhD, and numerous colleagues from multiple academic disciplines.
The team developed the algorithm to predict each applicant’s probability of receiving an invitation to interview from the director of the internal medicine residency program.
They packaged the algorithm as a decision-support tool that program leaders can use to find initially overlooked applicants.
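In outline, such a screening model is a binary classifier trained on past interview decisions. Below is a minimal sketch under stated assumptions: tabular numeric features, scikit-learn’s gradient boosting, and invented file, column and threshold names; the study does not disclose its exact model or features.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Each row is one applicant; "invited" records the program director's
# historical decision (1 = invited to interview, 0 = screened out).
# File and column names are hypothetical.
apps = pd.read_csv("applications.csv")
X = apps.drop(columns=["invited"])  # 60+ applicant variables
y = apps["invited"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = GradientBoostingClassifier().fit(X_train, y_train)
probs = model.predict_proba(X_test)[:, 1]
print(f"Held-out AUC: {roc_auc_score(y_test, probs):.3f}")

# Decision support: surface applicants the model scores highly but whom
# human reviewers screened out, for holistic re-review.
flagged = X_test[(probs >= 0.5) & (y_test.to_numpy() == 0)]
```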
The authors report that, based on the 20 catches made in internal validation, their approach shows promise in augmenting existing human review.
Additionally, the model maintained its high performance when U.S. Medical Licensing Examination scores were removed. This finding suggests such “holistic” screening methodologies “may obviate the need for USMLE scores in screening—an important outcome, as [the eight-hour] Step 1 [exam] will go pass/fail in January 2022.”
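Continuing the hypothetical sketch above, the ablation check the authors report would amount to retraining the same model without the licensing-exam columns and comparing held-out performance (the column names are again invented):

```python
# Drop the (hypothetical) licensing-exam score columns and retrain.
usmle_cols = ["usmle_step1", "usmle_step2_ck"]
ablated = GradientBoostingClassifier().fit(
    X_train.drop(columns=usmle_cols), y_train
)
probs_ablated = ablated.predict_proba(X_test.drop(columns=usmle_cols))[:, 1]
print(f"AUC without USMLE scores: {roc_auc_score(y_test, probs_ablated):.3f}")
```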
Burk-Rafel et al. call for further refinement of their model with external validation and natural language processing of unstructured textual data.
The authors comment:
Our program uses applicant experiences, attributes and academic metrics—consistent with holistic review—in selecting applicants for interview but does not employ a strict rubric. As the machine learning model learns based on prior program director decisions, creation of the model enable[s] quantification of the weights of different applicant factors in our selection process, at both the applicant and cohort level.
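The paper does not specify how those weights were computed. One standard way to quantify factor weights at the cohort level in a model like the sketch above is permutation importance; per-applicant attributions would call for something like SHAP values.

```python
from sklearn.inspection import permutation_importance

# Cohort-level view: how much does shuffling each factor hurt the
# model's ability to reproduce past interview decisions?
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
ranked = sorted(
    zip(X_test.columns, result.importances_mean),
    key=lambda pair: pair[1], reverse=True,
)
for name, importance in ranked[:10]:
    print(f"{name:30s} {importance:+.4f}")
```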
The study is available as a free PDF download here.