Development and Fairness Assessment of Machine Learning Models for Predicting 30-Day Readmission After Lung Cancer Surgery

Document Type

Conference Proceeding

Publication Date

6-1-2025

Publication Title

Journal of Clinical Oncology

Abstract

Background: Predicting post-surgical readmissions is essential for improving patient outcomes and reducing healthcare costs. While machine learning (ML) models offer high predictive accuracy, they may perpetuate healthcare disparities if not rigorously evaluated for algorithmic bias. In this study, we examine the limitations of ML-based readmission prediction models, highlighting how bias can persist despite strong performance metrics. We also explore the impact of integrating fairness constraints to mitigate these disparities, ensuring equitable clinical decision-making across racial and ethnic groups. Methods: We analyzed National Surgical Quality Improvement Program (NSQIP) data (2016–2020) for 23,843 lung cancer surgery patients. Multiple ML models were developed using demographic, clinical, and laboratory variables. Model performance was assessed using standard accuracy metrics alongside fairness evaluations, including Demographic Parity and Equalized Odds, to measure disparities across racial groups. Results: The cohort was 56.5% female; 66.4% of patients were White, 6.3% were Black, and 2.9% were of Hispanic ethnicity. The median [Q1, Q3] age was 69.0 [62.0, 74.0] years, and the overall readmission rate was 7.5%. The median operation time was higher among readmitted cases (171.0 minutes vs. 157.0; p < 0.001). However, the difference in median [Q1, Q3] length of stay (LOS) between the two groups was not clinically significant (4.0 [2.0, 6.0] vs. 4.0 [3.0, 7.0]; p < 0.001). The best-performing model (CatBoost) achieved high accuracy but showed disparities in prediction rates across racial groups (Demographic Parity Difference: 0.030, Equalized Odds Difference: 0.333), as the model disproportionately flagged Hispanic patients for readmission risk while potentially under-identifying risk in other groups. Significant predictors included operative time, preoperative sodium (139 vs. 140 mmol/L, p < 0.001), and COPD status (33.8% vs. 25.3%, p < 0.001). 
After implementing fairness constraints, the model maintained strong predictive performance while reducing demographic disparities, with selection rates balancing across racial groups (range: 0.51%-3.50%). Conclusions: Despite their high accuracy, ML models for predicting post-surgical readmissions can reinforce existing healthcare disparities. Our findings underscore the importance of fairness-aware modeling to mitigate bias, ensuring equitable clinical decision support. While fairness constraints improved demographic balance, residual disparities persisted, highlighting the need for ongoing scrutiny when deploying AI in clinical settings. This study emphasizes the critical need for continuous fairness evaluation in medical AI applications to prevent unintentional harm to vulnerable patient populations.
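The two fairness metrics reported above have standard definitions: Demographic Parity Difference is the gap between the highest and lowest selection (positive-prediction) rates across groups, and Equalized Odds Difference is the larger of the cross-group gaps in true-positive rate and false-positive rate. A minimal from-scratch sketch of both (the group labels and toy predictions below are illustrative, not study data; the abstract does not specify the authors' implementation, which may instead use a library such as fairlearn):

```python
# Illustrative implementations of the two fairness metrics named in the
# abstract. Toy data only; NOT the study's patients or model outputs.

def selection_rate(y_pred, mask):
    """Fraction of positive predictions within one group."""
    preds = [p for p, m in zip(y_pred, mask) if m]
    return sum(preds) / len(preds)

def demographic_parity_difference(y_pred, groups):
    """Max minus min selection rate across groups."""
    rates = [selection_rate(y_pred, [g == grp for g in groups])
             for grp in set(groups)]
    return max(rates) - min(rates)

def _conditional_rate(y_true, y_pred, mask, label):
    """TPR (label=1) or FPR (label=0) restricted to one group."""
    preds = [p for t, p, m in zip(y_true, y_pred, mask) if m and t == label]
    return sum(preds) / len(preds)

def equalized_odds_difference(y_true, y_pred, groups):
    """Larger of the cross-group TPR gap and FPR gap."""
    grps = sorted(set(groups))
    tprs = [_conditional_rate(y_true, y_pred, [g == grp for g in groups], 1)
            for grp in grps]
    fprs = [_conditional_rate(y_true, y_pred, [g == grp for g in groups], 0)
            for grp in grps]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))

if __name__ == "__main__":
    groups = ["A"] * 4 + ["B"] * 4          # hypothetical group labels
    y_true = [1, 0, 1, 0, 1, 0, 1, 0]       # hypothetical outcomes
    y_pred = [1, 0, 0, 0, 1, 1, 1, 0]       # hypothetical predictions
    print(demographic_parity_difference(y_pred, groups))
    print(equalized_odds_difference(y_true, y_pred, groups))
```

A Demographic Parity Difference of 0 would mean every racial group is flagged at the same rate; the 0.030 reported above indicates a 3-percentage-point spread, while the larger Equalized Odds Difference (0.333) shows the error rates themselves diverged more sharply across groups.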

Volume

43

Issue

16 Suppl

First Page

1532

Comments

2025 ASCO (American Society of Clinical Oncology) Annual Meeting, May 30 - June 3, 2025, Chicago, IL

Last Page

1532

DOI

10.1200/JCO.2025.43.16_suppl.1532
