Deep learning for contour quality assurance for RTOG 0933: In-silico evaluation.

Document Type

Article

Publication Date

8-31-2024

Publication Title

Radiotherapy and oncology : journal of the European Society for Therapeutic Radiology and Oncology

Abstract

PURPOSE: To validate a CT-based deep learning (DL) hippocampal segmentation model trained on a single-institutional dataset and explore its utility for multi-institutional contour quality assurance (QA).

METHODS: A DL model was trained to contour hippocampi from a dataset generated by an institutional observer (IO) contouring on brain MRIs from a single-institution cohort. The model was then evaluated on the RTOG 0933 dataset by comparing the treating physician (TP) contours to blinded IO and DL contours using Dice and Hausdorff distance (HD) agreement metrics, and by evaluating differences in dose to hippocampi when TP vs. IO vs. DL contours are used for planning. The specificity and sensitivity of the DL model to capture planning discrepancies were quantified using criteria of HD > 7 mm and Dmax hippocampi > 17 Gy.
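
As an illustration of the two agreement metrics named above, the following is a minimal sketch and not the study's implementation. It assumes each contour is represented as a set of voxel indices and that voxel spacing in mm is known; the function names, the toy contours, and the choice of which contour pair is compared against the 7 mm criterion are all hypothetical.

```python
import math


def dice(a, b):
    """Dice coefficient between two sets of voxel indices."""
    if not a and not b:
        return 1.0  # two empty contours agree trivially
    return 2.0 * len(a & b) / (len(a) + len(b))


def hausdorff_mm(a, b, spacing=(1.0, 1.0, 1.0)):
    """Symmetric Hausdorff distance in mm between two non-empty voxel sets."""
    # Convert voxel indices to physical coordinates (assumed spacing in mm).
    pa = [tuple(i * s for i, s in zip(v, spacing)) for v in a]
    pb = [tuple(i * s for i, s in zip(v, spacing)) for v in b]

    def directed(p, q):
        # Largest distance from any point in p to its nearest point in q.
        return max(min(math.dist(x, y) for y in q) for x in p)

    return max(directed(pa, pb), directed(pb, pa))


def fails_contour_qa(hd_mm, threshold_mm=7.0):
    """Flag a contour pair using the HD > 7 mm criterion from the abstract."""
    return hd_mm > threshold_mm


# Toy contours (hypothetical): a 4x4 patch and the same patch shifted by 2 voxels.
tp_contour = {(x, y, 0) for x in range(4) for y in range(4)}
dl_contour = {(x, y, 0) for x in range(2, 6) for y in range(4)}

dice_value = dice(tp_contour, dl_contour)
hd_unit = hausdorff_mm(tp_contour, dl_contour)
hd_coarse = hausdorff_mm(tp_contour, dl_contour, spacing=(4.0, 1.0, 1.0))
flagged = fails_contour_qa(hd_coarse)
```

The brute-force distance search is quadratic in the number of voxels, which is acceptable for a structure as small as the hippocampus but would normally be replaced by a distance-transform or KD-tree approach for larger volumes.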

RESULTS: The DL model showed greater agreement with IO contours compared to TP contours (DL:IO L/R Dice 74 %/73 %, HD 4.86/4.74 mm; DL:TP L/R Dice 62 %/65 %, HD 7.23/6.94 mm, all p < 0.001). Thirty percent of contours and 53 % of dose plans failed QA. The DL model achieved an AUC L/R 0.80/0.79 on the contour QA task via Hausdorff comparison and AUC of 0.91 via Dmax comparison. The false negative rate was 17.2 %/20.5 % (contours) and 5.8 % (dose). False negative cases tended to demonstrate a higher DL:IO Dice agreement (L/R p = 0.42/0.03) and better qualitative visual agreement compared with true positive cases.
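
To make the QA performance measures above concrete, here is a minimal sketch of how sensitivity, specificity, false negative rate, and AUC can be computed from per-case values. It is an illustration under stated assumptions, not the study's analysis: the toy numbers are invented, the reference labels are assumed to come from an expert comparison, the scores are assumed to be DL-based HD values in mm, and the false negative rate is assumed to be missed failures divided by all true failures (the abstract does not state its denominator).

```python
def confusion_rates(reference_fail, predicted_fail):
    """Sensitivity, specificity, and false negative rate from boolean lists."""
    pairs = list(zip(reference_fail, predicted_fail))
    tp = sum(1 for r, p in pairs if r and p)
    fn = sum(1 for r, p in pairs if r and not p)
    fp = sum(1 for r, p in pairs if not r and p)
    tn = sum(1 for r, p in pairs if not r and not p)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_negative_rate": fn / (tp + fn),  # assumed definition
    }


def auc(scores, labels):
    """AUC as the probability that a failing case outscores a passing one."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    # Count pairwise wins; ties contribute half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: True means the case fails QA per the reference.
reference_fail = [True, True, True, False, False, False, False, False]
dl_hd_mm = [9.1, 7.8, 6.2, 5.0, 7.4, 3.3, 4.1, 6.0]

# Apply the HD > 7 mm criterion to the DL-based score.
predicted_fail = [hd > 7.0 for hd in dl_hd_mm]

rates = confusion_rates(reference_fail, predicted_fail)
auc_value = auc(dl_hd_mm, reference_fail)
```

The pairwise AUC is equivalent to the normalized Mann-Whitney U statistic; it is quadratic in the number of cases, which is fine for a trial-sized dataset.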

CONCLUSION: Our study demonstrates the feasibility of using a single-institutional DL model to perform contour QA on a multi-institutional trial for the task of hippocampal segmentation.

First Page

110519

Last Page

110519

DOI

10.1016/j.radonc.2024.110519

ISSN

1879-0887

PubMed ID

39222847
