Development and Validation of a Machine-Learning-Based Decision Support Tool for Residency Applicant Screening and Review.
PURPOSE: Residency programs face overwhelming numbers of residency applications, limiting holistic review. Artificial intelligence techniques have been proposed to address this challenge but have not been created. Here, a multidisciplinary team sought to develop and validate a machine-learning (ML) based decision support tool (DST) for residency applicant screening and review.
METHOD: Categorical applicant data from the 2018, 2019, and 2020 residency application cycles (n = 8,243 applicants) at one large internal medicine residency program were downloaded from the Electronic Residency Application Service and linked to the outcome measure: interview invitation by human reviewers (n = 1,235 invites). A ML model using gradient boosting was designed using training data (80% of applicants) with over 60 applicant features (e.g., demographics, experiences, academic metrics). Model performance was validated on held-out data (20% of applicants). Sensitivity analysis was conducted without US Medical Licensing Examination (USMLE) scores. An interactive DST incorporating the ML model was designed and deployed that provided applicant- and cohort-level visualizations.
RESULTS: The ML model areas under the receiver operating characteristic and precision recall curves were 0.95 and 0.76, respectively; these changed to 0.94 and 0.72, respectively, with removal of USMLE scores. Applicants' medical school information was an important driver of predictions - which had face validity based on the local selection process - but numerous predictors contributed. Program directors used the DST in the 2021 application cycle to select 20 applicants for interview that had been initially screened out during human review.
CONCLUSIONS: The authors developed and validated a ML algorithm for predicting residency interview offers from numerous application elements with high performance-even when USMLE scores were removed. Model deployment in a DST highlighted its potential for screening candidates and helped quantify and mitigate biases existing in the selection process. Further work will incorporate unstructured textual data through natural language processing methods.