Development and Validation of a Machine-Learning-Based Decision Support Tool for Residency Applicant Screening and Review.

Document Type

Article

Publication Date

8-3-2021

Publication Title

Academic Medicine

Abstract

PURPOSE: Residency programs face overwhelming numbers of residency applications, limiting holistic review. Artificial intelligence techniques have been proposed to address this challenge but have not yet been implemented. Here, a multidisciplinary team sought to develop and validate a machine learning (ML)-based decision support tool (DST) for residency applicant screening and review.

METHOD: Categorical applicant data from the 2018, 2019, and 2020 residency application cycles (n = 8,243 applicants) at one large internal medicine residency program were downloaded from the Electronic Residency Application Service and linked to the outcome measure: interview invitation by human reviewers (n = 1,235 invites). An ML model using gradient boosting was designed using training data (80% of applicants) with over 60 applicant features (e.g., demographics, experiences, academic metrics). Model performance was validated on held-out data (20% of applicants). Sensitivity analysis was conducted without US Medical Licensing Examination (USMLE) scores. An interactive DST that incorporated the ML model and provided applicant- and cohort-level visualizations was designed and deployed.

RESULTS: The ML model's areas under the receiver operating characteristic and precision-recall curves were 0.95 and 0.76, respectively; these changed to 0.94 and 0.72, respectively, with removal of USMLE scores. Applicants' medical school information was an important driver of predictions (which had face validity based on the local selection process), but numerous predictors contributed. Program directors used the DST in the 2021 application cycle to select 20 applicants for interview who had been initially screened out during human review.
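The two reported metrics can be illustrated with a minimal sketch. The code below is not the authors' evaluation code; the function names and the toy labels and scores are hypothetical, and average precision is used here as one common estimator of the area under the precision-recall curve. The only figures taken from the abstract are the applicant and invite counts, which give an overall invitation rate of about 0.15. A ranker with no skill would be expected to score near that rate on the precision-recall metric (assuming the held-out split has a similar rate), which is useful context for reading the reported 0.76.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at each rank where a positive appears.

    Assumes no tied scores; with ties the result depends on sort order.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_pos


# Hypothetical toy data: 1 = invited to interview, 0 = not invited.
toy_labels = [1, 0, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.6, 0.1]

toy_auroc = roc_auc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)

# Overall invitation rate from the counts in the abstract (about 0.15).
invite_rate = 1235 / 8243

print(f"toy AUROC: {toy_auroc:.3f}")
print(f"toy average precision: {toy_ap:.3f}")
print(f"invitation rate: {invite_rate:.3f}")
```

The pairwise AUROC loop is quadratic in the number of applicants, which is fine for illustration; a rank-based formulation or a library routine would be the usual choice at the scale of thousands of applicants.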

CONCLUSIONS: The authors developed and validated an ML algorithm for predicting residency interview offers from numerous application elements with high performance, even when USMLE scores were removed. Model deployment in a DST highlighted its potential for screening candidates and helped quantify and mitigate biases existing in the selection process. Further work will incorporate unstructured textual data through natural language processing methods.

Volume

Online ahead of print

DOI

10.1097/ACM.0000000000004317

ISSN

1938-808X

PubMed ID

34348383
