Original Research Article|Articles in Press, 100426

A simple single-cycle interactive strategy to improve deep learning-based segmentation of organs-at-risk in head-and-neck cancer

Open Access | Published: March 04, 2023 | DOI: https://doi.org/10.1016/j.phro.2023.100426

      Abstract

      Purpose

      Interactive segmentation seeks to incorporate human knowledge into segmentation models, thereby reducing the total amount of editing of auto-segmentations. By performing only interactions which provide new information, segmentation performance may increase cost-effectively. The aim of this study was to develop, evaluate and test the feasibility of a deep learning-based single-cycle interactive segmentation model, with the input being computed tomography (CT) images and a small amount of information-rich contours.

      Methods and Materials

      A single-cycle interactive segmentation model, which took the CT and the most cranial and caudal contour slices for each of 16 organs-at-risk for head-and-neck cancer as input, was developed. A CT-only model served as control. The models were evaluated with the Dice similarity coefficient, Hausdorff distance 95th percentile and average symmetric surface distance. A subset of eight organs-at-risk was selected for a feasibility test, in which a designated radiation oncologist used both single-cycle interactive segmentation and atlas-based auto-contouring for three cases. Contouring time and added path length were recorded.

      Results

      Compared with CT-only, the medians of Dice increased with single-cycle interactive segmentation by 0.004 (Brain) to 0.90 (EyeBack_merged). In the feasibility test, contouring time and added path length were reduced for all three cases as compared to editing atlas-based auto-segmentations.

      Conclusion

      Single-cycle interactive segmentation improved segmentation metrics when compared to the CT-only model and was clinically feasible from a technical and usability point of view. The study suggested that it may be cost-effective to add a small amount of contouring input to deep learning-based segmentation models.


      1. Background

      Contouring guidelines [1,2] provide a framework for defining organs-at-risk to achieve consistency between observers and varying patient anatomies. This implies, however, that organs-at-risk with non-visible borders are defined by guidelines to be spatially bounded by other anatomical landmarks with high image contrast. Often, this manifests as sharp edges in the axial plane, e.g. the most cranial axial slice of the dens axis marks the transition between spinal cord and brain stem [1].
      Research in auto-segmentation in recent years has provided evidence that observer variation [3-6] and contouring time can be reduced for organs-at-risk and target volumes in head-and-neck cancer by editing auto-segmentations rather than contouring manually from scratch [5,7-10]. Auto-segmentation does, however, still require human oversight. Two studies have found that deep learning-based auto-segmentations needed most editing at the cranial and caudal edges of organs-at-risk [11,12], which is also where the sharp edges defined by guidelines occur. Hence, future segmentation models should focus on improving these areas.
      The emerging research field of interactive segmentation seeks to deal with challenging segmentation tasks by incorporating human knowledge at contour prediction time. In a recent study, Bai et al. [13] demonstrated a multi-cycle interactive segmentation strategy for organs-at-risk in the head-and-neck region. Through an iterative process of defining extremity points on predicted contours, their model provided updated predictions. Another multi-cycle interactive segmentation strategy was demonstrated by Smith et al. [14], where corrections to heart contours were saved and used to continuously retrain a segmentation model.
      Because the overall aim of interactive segmentation is to reduce contouring time, it is important to perform only interactions that provide the model with information not readily available in the images. Because auto-segmentations need most editing at the cranial and caudal edges, the most cranial and caudal axial slices of a structure should provide the most information per interaction. Hence, the aim of this study was to train and evaluate a deep learning-based single-cycle interactive segmentation model which took these slices along with the planning computed tomography (CT) scan as model input. In a feasibility test, the model was integrated with our treatment planning system (TPS), and contouring time and the magnitude of contour corrections were compared against our clinically used atlas-based auto-segmentation for three patients.
      • 2 Materials and Methods
        • 2.1 Model Development
          • 2.1.1 Image and Contour Data
      The data consisted of 730 planning CTs along with the clinical contours available from patients treated for head-and-neck cancer at our institution between 2013 and 2020. All tumor sites were included (sino-nasal, oral cavity, pharynx, larynx, salivary glands, unknown primary). Not all organs-at-risk of interest were present for all patients [Table 1]. Data access was approved by the Danish National Science Ethics Committee.
      • 2.1.2 Image and Contour Preprocessing
      Table 1. Overview of the presence of organs-at-risk in the data set. PCM: pharyngeal constrictor muscle.
      Organ-at-risk          N
      Total                  730
      Brain                  679
      BrainStem              727
      SpinalCord             727
      Lips                   711
      Esophagus              706
      Parotid_merged         713
      PCM_Low                389
      PCM_Mid                389
      PCM_Up                 388
      Mandible               720
      Submandibular_merged   524
      Thyroid                699
      OpticNerve_merged      162
      EyeFront_merged        154
      EyeBack_merged         155
      OralCavity             706
      Images in Digital Imaging and Communications in Medicine (DICOM) format and structure files were converted into Neuroimaging Informatics Technology Initiative (NIFTI) format with the software package dcmrtstruct2nii [15] to be used for model training. For 16 organs-at-risk, the cranial and caudal contouring input was simulated by sampling the corresponding slices from the clinical contours. These slices of the ground truth were merged into a single NIFTI file with a discrete integer for each organ-at-risk. Ground truth contours were also merged into single NIFTI files, and bilateral organs-at-risk (e.g. parotid glands) were merged and treated as one to avoid crossover predictions.
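      To make the preprocessing concrete, the following minimal Python sketch illustrates how such an input channel could be simulated from ground-truth masks; the function and variable names, the axis ordering and the label encoding are assumptions for illustration, not the study's actual code.

```python
# Hypothetical sketch: build the simulated contouring-input channel from
# ground-truth masks (names, axis order and label encoding are assumptions).
import numpy as np
import nibabel as nib

def simulate_edge_slice_input(gt_masks: dict[str, np.ndarray]) -> np.ndarray:
    """Label map containing only the most cranial and caudal contoured slice
    of each organ-at-risk, encoded as discrete integers."""
    any_mask = next(iter(gt_masks.values()))
    user_input = np.zeros(any_mask.shape, dtype=np.uint8)
    for label, (organ, mask) in enumerate(gt_masks.items(), start=1):
        slices_with_contour = np.where(mask.any(axis=(0, 1)))[0]  # assumes z is the last axis
        if slices_with_contour.size == 0:
            continue
        caudal, cranial = slices_with_contour.min(), slices_with_contour.max()
        for z in (caudal, cranial):
            user_input[..., z][mask[..., z] > 0] = label
    return user_input

# Example usage with two (hypothetical) organ masks loaded from NIFTI files:
# gt = {"BrainStem": nib.load("BrainStem.nii.gz").get_fdata() > 0,
#       "Parotid_merged": nib.load("Parotid_merged.nii.gz").get_fdata() > 0}
# nib.save(nib.Nifti1Image(simulate_edge_slice_input(gt), affine=np.eye(4)),
#          "simulated_input.nii.gz")
```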
      • 2.1.3 Model Training
      The data set was split with 90 % (657 CTs) for training/validation and 10 % (73 CTs) for an independent test set. With single folds of nnUNet [16,17] and default parameters (3D full resolution, nnUNetTrainerV2, 1000 epochs), two models were trained: 1) CT only (1st channel) and 2) CT (1st channel) + simulated user input (2nd channel), i.e. the single-cycle interactive segmentation model. To speed up model training and prediction, all inputs and ground truths were cropped to the outermost organs-at-risk with a padding of 15 voxels in the X and Y directions and 7 voxels in the Z direction. See Figure 1 for a schematic overview of the method.
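      As an illustration, the cropping step could be implemented as in the sketch below; the axis ordering and bounding-box convention are assumptions, not the study's actual code.

```python
# Minimal sketch of the cropping step: crop CT, input channel and ground truth
# to the bounding box of all organs-at-risk, padded by 15 voxels in X/Y and
# 7 voxels in Z (axis order X, Y, Z is assumed).
import numpy as np

def crop_to_organs(ct: np.ndarray, user_input: np.ndarray, gt: np.ndarray,
                   pad_xy: int = 15, pad_z: int = 7):
    """Return the three volumes cropped to the padded bounding box of gt > 0."""
    coords = np.argwhere(gt > 0)
    lo = np.maximum(coords.min(axis=0) - np.array([pad_xy, pad_xy, pad_z]), 0)
    hi = np.minimum(coords.max(axis=0) + np.array([pad_xy, pad_xy, pad_z]) + 1, gt.shape)
    sl = tuple(slice(a, b) for a, b in zip(lo, hi))
    return ct[sl], user_input[sl], gt[sl]
```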
      • 2.1.4 Model Evaluation
      Figure 1. Overview of the method. Two segmentation models were trained using the same U-Net architecture. The CT-only model (red) and the single-cycle interactive segmentation model (blue) were trained using nnUNet. SCIS: single-cycle interactive segmentation.
      The models were evaluated on the test set with the volumetric Dice coefficient, Hausdorff distance 95th percentile (HD95) and average symmetric surface distance (ASSD), obtained with the built-in evaluation function of nnUNet. Variation within groups was evaluated with the standard deviation. Dice coefficients were calculated if the given organ-at-risk was present in the ground truth; a missing prediction resulted in a Dice coefficient of 0. HD95 and ASSD were calculated where a given structure existed in both the prediction and the test set.
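      The metrics themselves were obtained with nnUNet's built-in evaluation; purely for illustration, the volumetric Dice coefficient with the convention that a missing prediction scores 0 corresponds to the following sketch.

```python
# Illustrative volumetric Dice coefficient; not the nnUNet evaluation code.
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    pred, gt = pred > 0, gt > 0
    if not gt.any():
        raise ValueError("organ-at-risk not present in ground truth")
    if not pred.any():
        return 0.0  # missing prediction scores 0 by convention
    return 2.0 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum())
```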
      • 2.2 Feasibility Test
        • 2.2.1 Integration to the TPS
      A data flow infrastructure was developed and implemented to allow real-time contouring interactions from within MIM Maestro (MIM; MIM Software Inc., OH, USA) [18]:
      1) The user contoured the most cranial and most caudal slice of each organ-at-risk on a “clean” scan.
      2) When finished, the user launched a MIM extension [19], which sent the CT along with the contouring input to a separate GPU-accelerated inference server [20] deployed on a standalone computer with one NVIDIA RTX3090. When the predicted contours were ready, they were sent back and automatically loaded into MIM. MIM's Python API exports scans and contours as Numpy arrays, which were converted to NIFTI and used as model input; contours were loaded back into MIM through the API from Numpy arrays (see the sketch after this list).
      3) The user corrected the predicted contours.
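      The sketch below illustrates step 2 of this data flow; the endpoint URL, payload format and helper names are illustrative assumptions and do not reflect the actual interfaces of the MIM extension or the inference server.

```python
# Hedged sketch of the data flow: pack Numpy arrays as NIFTI, post them to a
# GPU-accelerated inference server, and return the predicted label map as a
# Numpy array. URL and payload layout are assumptions.
import io
import numpy as np
import nibabel as nib
import requests

def send_to_inference_server(ct: np.ndarray, user_input: np.ndarray,
                             affine: np.ndarray,
                             url: str = "http://inference-server:8000/predict") -> np.ndarray:
    files = {}
    for name, arr in (("ct", ct), ("user_input", user_input)):
        buf = io.BytesIO()
        # to_bytes() serializes the NIFTI image to an in-memory byte string
        buf.write(nib.Nifti1Image(np.asarray(arr, dtype=np.float32), affine).to_bytes())
        buf.seek(0)
        files[name] = (f"{name}.nii", buf, "application/octet-stream")
    response = requests.post(url, files=files, timeout=600)
    response.raise_for_status()
    prediction = nib.Nifti1Image.from_bytes(response.content)
    return np.asarray(prediction.dataobj)  # loaded back into the TPS as a Numpy array
```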
      • 2.2.2 Evaluation
      The feasibility of single-cycle interactive segmentation was tested with our current clinical practice as reference, which is editing of auto-segmentations from a multi-atlas workflow provided in MIM. The atlas consisted of consensus contours of 50 in-house patients. One radiation oncologist (JGE) agreed to participate but was new to contouring in MIM and was therefore given one practice case for editing contours with both approaches. Eight organs-at-risk (brain stem, esophagus, lips, oral cavity, the three pharyngeal constrictor muscles (PCM) and the submandibular glands) were selected for the feasibility test because they were expected to benefit from single-cycle interactive segmentation and were available in the atlas.
      To simulate the current clinical setting as closely as possible, we selected the three newest cases from our test set (from 2020). Over the course of two sessions, the radiation oncologist (1) edited the atlas-based contours and (2) contoured with single-cycle interactive segmentation. The training case and one test case were done during the first session, and the remaining two cases during the second session.
      Recorded endpoints were (1) contouring time and (2) added path length [21]. Added path length was implemented with a tolerance of zero voxels [22] and thus captured the absolute number of deleted/inserted voxels of the contour boundary between the predictions and the final clinically acceptable contours.
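      For illustration, the added path length with zero tolerance could be computed as in the following sketch, reading the deleted/inserted count as the symmetric difference of the two contour boundaries; the slice-wise boundary extraction is an assumption and not necessarily how ContourEval implements it.

```python
# Illustrative added path length with zero-voxel tolerance: count boundary
# voxels that differ between the predicted and the final contour.
import numpy as np
from scipy.ndimage import binary_erosion

def boundary(mask: np.ndarray) -> np.ndarray:
    """Boundary voxels of a binary mask, extracted per axial slice."""
    edge = np.zeros_like(mask, dtype=bool)
    for z in range(mask.shape[-1]):
        sl = mask[..., z].astype(bool)
        edge[..., z] = sl & ~binary_erosion(sl)
    return edge

def added_path_length(pred: np.ndarray, final: np.ndarray) -> int:
    """Number of boundary voxels deleted or inserted between the prediction
    and the final clinically accepted contour (tolerance of zero voxels)."""
    return int(np.logical_xor(boundary(pred), boundary(final)).sum())
```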
      • 2.3 Statistical Analysis
      Results of the retrospective analysis were assumed not to follow a normal distribution. Metrics were compared with the Wilcoxon signed-rank test. Confidence intervals of medians were estimated by reverse percentile bootstrapping (9999 iterations).
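      For illustration, the statistical comparison could be carried out as in the sketch below; variable names and the bootstrap helper are assumptions, not the study's analysis code.

```python
# Sketch: paired Wilcoxon signed-rank test and a reverse (basic) percentile
# bootstrap confidence interval for the median.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(42)

def median_ci_reverse_percentile(x: np.ndarray, n_boot: int = 9999,
                                 alpha: float = 0.05) -> tuple[float, float]:
    """95% CI for the median via the reverse (basic) percentile bootstrap."""
    med = np.median(x)
    boot = np.array([np.median(rng.choice(x, size=x.size, replace=True))
                     for _ in range(n_boot)])
    lo, hi = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return 2 * med - hi, 2 * med - lo

# Example: compare per-patient Dice of the two models for one organ-at-risk
# (dice_scis and dice_ct_only would be paired arrays of equal length).
# stat, p_value = wilcoxon(dice_scis, dice_ct_only)
# ci_low, ci_high = median_ci_reverse_percentile(dice_scis)
```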
      The feasibility test yielded only three data points for each endpoint. These were compared head-to-head on raw numbers.
      All analyses and data handling were done with Pandas 1.3.4 [23], Numpy 1.20.3 [24], Scipy 1.9.3 [25] and Python 3.9.7 [26].
      • 3 Results
        • 3.1 Model Evaluation
      The single-cycle interactive segmentation model provided predictions for all 876 organs-at-risk present in the test set, while the CT-only model failed to predict a contour in 36 instances (PCM_Low: 5, PCM_Mid: 6, PCM_Up: 5, EyeFront_merged: 7, EyeBack_merged: 13).
      Significant differences (p < 0.01) were observed between the single-cycle interactive segmentation model and CT-only model for all 16 investigated organs-at-risk on all three metrics.
      The smallest increase in median Dice was 0.004 (Brain), the largest was 0.90 (EyeBack_merged), with a median increase of 0.14 (Esophagus). The variation in Dice decreased for all organs-at-risk: the smallest decrease in standard deviation was 0.01 (Mandible), the largest was 0.33 (EyeBack_merged), with a median decrease of 0.07 (Esophagus) [Figure 2].
      Figure 2. Dice coefficient of the 16 organs-at-risk for the CT-only model (red) and the single-cycle interactive segmentation model (blue). Medians are shown with horizontal solid bars and confidence intervals with vertical black bars. SCIS: single-cycle interactive segmentation.
      Medians of HD95 decreased for all organs-at-risk except SpinalCord, where no change was observed. The largest decrease was 8.4 mm (Esophagus), and the median decrease was 2.6 mm (Submandibular_merged) [Figure S1 and Table S2 in the supplementary material].
      Medians of ASSD decreased for all organs-at-risk. The smallest decrease was 0.1 mm (SpinalCord), the largest was 3.5 mm (OralCavity), with a median decrease of 0.7 mm (EyeBack_merged) [Figure S2 and Table S3 in the supplementary material].
      For further details on means, medians and standard deviations, see Tables S1-S3 in the supplementary material.
      When visually inspecting predictions of the single-cycle interactive segmentation model, three contour characteristics emerged. First, the most cranial and caudal slices often had perfect overlap with, and did not extend beyond, the provided input [Figure 3, A]. Second, in some instances, the contour input changed the shape of the predicted contour in non-edge slices [Figure 3, B]. Third, if the single-cycle interactive segmentation model received a relevant input, contours were always predicted, whereas the CT-only model failed to predict 36 organs-at-risk which were present in the test set [Figure 3, C].
      • 3.2 Feasibility Test
      Figure 3. A) An example of a brain stem with the CT-only model (top) and SCIS (bottom). The axial slices shown are the most cranial slice in the ground truth. In this slice, there is perfect overlap between the ground truth and SCIS. The CT-only model also shows good overlap in the axial view, but extends two slices more cranially. Red colors: ground truth of brain stem and spinal cord. Yellow colors: predictions. B) An example of an upper pharyngeal constrictor muscle, predicted with CT-only (left) and SCIS (right). The upper and lower slices shown are the most cranial and caudal slices. For the SCIS model, there is perfect overlap in these slices, while there is good agreement in the middle parts. The CT-only prediction misses the “true” top and contours the organ wider and larger than the ground truth. C) An instance of missing/faulty predictions by the CT-only model, which were corrected with SCIS. Shades of red and green are ground truth of eyes and optic nerves, respectively. Predictions for the optic nerves are in blue, and for the eyes in beige, yellow and purple. SCIS: single-cycle interactive segmentation.
      In the feasibility test, time savings for the three test cases, respectively, were 7.1 (29%), 3.7 (21%) and 11.7 (64%) minutes with single-cycle interactive segmentation compared to correcting the atlas predictions [Figure 4, top]. Added path length decreased by 1827 (24 %), 2184 (39 %) and 1435 (51 %) voxels for the three test cases, respectively [Figure 4, bottom].
      • 4 Discussion
      Figure 4. The top panel shows contouring times and the bottom panel the added path lengths for the three cases in the feasibility test. Red bars denote correction of atlas-based auto-segmentations. Green bars denote the contouring of the most cranial and most caudal slice of all organs-at-risk. Blue bars denote correction of SCIS. SCIS: single-cycle interactive segmentation.
      The retrospective analysis and the feasibility test both supported the potential benefit of single-cycle interactive segmentation. Figure 2 shows that at least 12 of the 16 organs-at-risk improved notably in precision and accuracy. Organs-at-risk which already performed well with the CT-only model (e.g. Parotid, Brain and Mandible) gained little or nothing from single-cycle interactive segmentation [Figure 2]. On the other hand, the performance gains increased with decreasing performance of the CT-only model [Figure 2 and supplementary Figures S3-S5]. It should be noted that the applied statistic is sensitive to the direction of changes (increase or decrease in a metric) rather than to their magnitude. Because ground truth was actively added as input to the single-cycle interactive segmentation model, it was not surprising that all organs-at-risk improved at least somewhat as compared to the CT-only model, thereby producing highly significant statistics.
      Supplementary Tables S1, S2 and S3 provide an overview of the best Dice, HD95 and ASSD for each structure as reported in [27], alongside all results from the present study. Keeping in mind that comparing segmentation metrics across studies is tricky and often misleading due to differences in the underlying data, the single-cycle interactive segmentation model performed approximately on par with or better than the best reported fully automatic segmentation models. While this serves as a sanity check of the prediction quality, the purpose of this study was to explore the potential of interactive segmentation rather than to report a model which outperforms the current state-of-the-art. The available data set was large and spanned a long period of time. Contouring guidelines were introduced during this period and contouring practices undoubtedly changed. Furthermore, contours were produced by many different radiation oncologists with varying levels of experience. The data set therefore suffered from heterogeneity, which must be considered a contributing factor to the relatively poor performance of the CT-only model compared with other published auto-segmentation models [27]. For instance, the CT-only model predicted four cases of Brain with a Dice coefficient below 0.3. It turned out that the ground truths were cropped just above the cerebellum in these cases. While the CT-only model (correctly) predicted a full brain, the metrics were evaluated against the misleading ground truths. For OralCavity, there was also a mix of guidelines applied, i.e. “oral cavity” and “extended oral cavity”. The single-cycle interactive segmentation model was provided with the information about the most cranial slice - even when this was erroneous or followed different guidelines - and hence got the prediction “correct”. This obviously skewed the results in favor of the single-cycle interactive segmentation model, but it also revealed an unanticipated yet potential benefit of the model. Despite vast heterogeneity in the training data, the single-cycle interactive segmentation model was able to make precise and accurate contours. Hence, if the boundaries of the input contours respect guidelines, the boundaries of the predictions are also likely to do so, thus mitigating the need for editing in the cranial and caudal extent as described in [11,12].
      It is unknown whether the benefit of single-cycle interactive segmentation would persist if applied to highly curated consensus data. However, one might speculate that the benefit would drop for organs-at-risk with a low observer variation but remain high for organs-at-risk with a high observer variation.
      Today, it is well-established that editing of auto-contours reduces inter-observer variability [28]. One reason for this is probably that there exists a range of plausible contours which are clinically acceptable. When clinicians are presented with a contour from within this space, they accept it, thereby reducing the inconsistencies that stem from contouring from scratch. Hypothetically, if the proposed interactive method were implemented into clinical practice, inter-observer variation might not be reduced as much as with the traditional editing of auto-contours. As shown in Figure 3, B, the user input can modify the shape of predicted contours. Hence, inter-observer variation would need to be mitigated through focused and continuous training in defining borders.
      The reasoning behind using the most cranial and caudal slices was that sharp edges primarily occur in the cranial-caudal extent of structures [1,2], and that deep learning models are likely to get these edges wrong [11,21]. Further investigations should seek to uncover to what extent the contouring input can be minimized before segmentation performance is penalized. Perhaps it is not necessary to actually contour the cranial and caudal slices, but only to mark a few extremity points as in [13], or even just to mark the transversal planes bounding each structure. Time savings will likely increase if it is possible to diminish the contouring input while maintaining the benefit of single-cycle interactive segmentation.
      A strength of the strategy used in this study was that only one cycle of interaction was used. This allowed implementation of a central inference server with graphical processing units (GPUs), because network overhead is negligible when only one cycle is performed. Multi-cycle interactive segmentation such as [13,14] requires many more and much faster interactions; if these were deployed on a central inference server, network overhead would likely become a real issue. A remote inference server is much cheaper and easier to maintain compared with installing GPUs at all workstations. Single-cycle interactive segmentation does, however, add a layer of technical complexity compared to regular offline auto-segmentation. The TPS must be able to export scans and contours directly, send them to an inference server and have the contours automatically loaded again. This proved to be feasible in MIM, but may not be for other systems.
      In the feasibility test, time savings were observed for all three cases. Time measurements were, however, prone to random prolongations, because the radiation oncologist was new to contouring in MIM. Added path lengths, on the other hand, are insensitive to “idle” contouring time and were reduced for all three cases. This indicates that contour predictions of the single-cycle interactive segmentation model were closer to being clinically acceptable than predictions of the atlas-based model. It does not, however, allow conclusions regarding whether single-cycle interactive segmentation is better than the current state-of-the-art auto-segmentation.
      From a clinical point of view, single-cycle interactive segmentation was deemed feasible. When the radiation oncologist submitted the initial contours to the model, it took around two minutes before the predicted contours were loaded and could be edited. During this time, the radiation oncologist could continue contouring other structures and target volumes. Therefore, the prediction time was not perceived as time wasted.
      In conclusion, a single-cycle interactive segmentation model was developed, in which the most cranial and caudal contour slice of each organ-at-risk along with the CT were used as input for a deep learning-based segmentation model. Retrospectively, the model improved metrics for most investigated organs-at-risk in head-and-neck cancer. In a small feasibility test, single-cycle interactive segmentation was found to reduce contouring time and added path length in three out of three cases when compared to our clinic’s current atlas-based auto-segmentation practice. Single-cycle interactive segmentation was deemed feasible both from a technical and a clinical point of view. The efficacy must, however, be confirmed on high-quality consensus data.

      Declaration of Competing Interest

      The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
      Acknowledgements
      MER was funded by Department of Clinical Medicine, Aarhus University.

      References

        1. Brouwer CL, Steenbakkers RJHM, Bourhis J, Budach W, Grau C, Grégoire V, et al. CT-based delineation of organs at risk in the head and neck region: DAHANCA, EORTC, GORTEC, HKNPCSG, NCIC CTG, NCRI, NRG Oncology and TROG consensus guidelines. Radiother Oncol 2015;117:83-90. https://doi.org/10.1016/j.radonc.2015.07.041
        2. Jensen K, Friborg J, Hansen CR, Samsøe E, Johansen J, Andersen M, et al. radiotherapy guidelines. Radiother Oncol 2020. https://doi.org/10.1016/j.radonc.2020.07.037
        3. Stapleford LJ, Lawson JD, Perkins C, Edelman S, Davis L, McDonald MW, et al. Evaluation of Automatic Atlas-Based Lymph Node Segmentation for Head-and-Neck Cancer. Int J Radiat Oncol Biol Phys 2010;77:959-966. https://doi.org/10.1016/j.ijrobp.2009.09.023
        4. Chao KSC, Bhide S, Chen H, Asper J, Bush S, Franklin G, et al. Reduce in Variation and Improve Efficiency of Target Volume Delineation by a Computer-Assisted System Using a Deformable Image Registration Approach. Int J Radiat Oncol Biol Phys 2007;68:1512-1521. https://doi.org/10.1016/j.ijrobp.2007.04.037
        5. Sarrade T, Gautier M, Schernberg A, Jenny C, Orthuon A, Maingon P, et al. Educative Impact of Automatic Delineation Applied to Head and Neck Cancer Patients on Radiation Oncology Residents. J Cancer Educ 2022. https://doi.org/10.1007/s13187-022-02157-9
        6. Lin L, Dou Q, Jin YM, Zhou GQ, Tang YQ, Chen WL, et al. Deep learning for automated contouring of primary tumor volumes by MRI for nasopharyngeal carcinoma. Radiology 2019;291:677-686. https://doi.org/10.1148/radiol.2019182012
        7. Walker GV, Awan M, Tao R, Koay EJ, Boehling NS, Grant JD, et al. Prospective randomized double-blind study of atlas-based organ-at-risk autosegmentation-assisted radiation planning in head and neck cancer. Radiother Oncol 2014;112:321-325. https://doi.org/10.1016/j.radonc.2014.08.028
        8. Teguh DN, Levendag PC, Voet PWJ, Al-Mamgani A, Han X, Wolf TK, et al. Clinical validation of atlas-based auto-segmentation of multiple target volumes and normal tissue (swallowing/mastication) structures in the head and neck. Int J Radiat Oncol Biol Phys 2011;81:950-957. https://doi.org/10.1016/j.ijrobp.2010.07.009
        9. Lee H, Lee E, Kim N, Kim Jho, Park K, Lee H, et al. Clinical evaluation of commercial atlas-based auto-segmentation in the head and neck region. Front Oncol 2019;9. https://doi.org/10.3389/fonc.2019.00239
        10. Hu K, Lin A, Young A, Kubicek G, Piper JW, Nelson AS, et al. Timesavings for Contour Generation in Head and Neck IMRT: Multi-institutional Experience with an Atlas-based Segmentation Method. Int J Radiat Oncol Biol Phys 2008;72:S391. https://doi.org/10.1016/j.ijrobp.2008.06.1261
        11. Brouwer CL, Boukerroui D, Oliveira J, Looney P, Steenbakkers RJHM, Langendijk JA, et al. Assessment of manual adjustment performed in clinical practice following deep learning contouring for head and neck organs at risk in radiotherapy. Phys Imaging Radiat Oncol 2020;16:54-60. https://doi.org/10.1016/j.phro.2020.10.001
        12. Vaassen F, Boukerroui D, Looney P, Canters R, Verhoeven K, Peeters S, et al. Real-world analysis of manual editing of deep learning contouring in the thorax region. Phys Imaging Radiat Oncol 2022;22:104-110. https://doi.org/10.1016/j.phro.2022.04.008
        13. Bai T, Balagopal A, Dohopolski M, Morgan HE, McBeth R, Tan J, et al. A Proof-of-Concept Study of Artificial Intelligence-assisted Contour Editing. Radiol Artif Intell 2022. https://doi.org/10.1148/ryai.210214
        14. Smith AG, Petersen J, Terrones-Campos C, Berthelsen AK, Forbes NJ, Darkner S, et al. RootPainter3D: Interactive-machine-learning enables rapid and accurate contouring for radiotherapy. Med Phys 2022;49:461-473. https://doi.org/10.1002/mp.15353
        15. Phil T, thomas-albrecht, Gay S. Sikerdebaard/dcmrtstruct2nii: dcmrtstruct2nii v2, 2022. https://doi.org/10.5281/zenodo.6330598
        16. Isensee F, Jaeger PF, Kohl SAA, Petersen J, Maier-Hein KH. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods 2021;18:203-211. https://doi.org/10.1038/s41592-020-01008-z
        17. nnU-Net, 2022. https://github.com/MIC-DKFZ/nnUNet (accessed August 23, 2022).
        18. MIM Maestro® | Radiation Oncology Automation | AI AutoContouring. https://www.mimsoftware.com/radiation-oncology/mim-maestro (accessed August 15, 2022).
        19. InferenceServer Client - MIMExtensions. https://github.com/mathiser/MIMExtensions/tree/main/pythonInferenceGTV (accessed August 15, 2022).
        20. mathiser/inference_server. https://github.com/mathiser/inference_server (accessed August 15, 2022).
        21. Vaassen F, Hazelaar C, Vaniqui A, Gooding M, van der Heyden B, Canters R, et al. Evaluation of measures for assessing time-saving of automatic organ-at-risk segmentation in radiotherapy. Phys Imaging Radiat Oncol 2020;13:1-6. https://doi.org/10.1016/j.phro.2019.12.001
        22. mathiser. ContourEval, 2022. https://github.com/mathiser/ContourEval (accessed August 23, 2022).
        23. pandas - Python Data Analysis Library. https://pandas.pydata.org/ (accessed October 6, 2022).
        24. NumPy. https://numpy.org/ (accessed October 6, 2022).
        25. SciPy. https://scipy.org/ (accessed December 29, 2022).
        26. Python. https://www.python.org/ (accessed October 6, 2022).
        27. Vrtovec T, Močnik D, Strojan P, Pernuš F, Ibragimov B. Auto-segmentation of organs at risk for head and neck radiotherapy planning: From atlas-based to deep learning methods. Med Phys 2020. https://doi.org/10.1002/mp.14320
        28. van der Veen J, Willems S, Deschuymer S, Robben D, Crijns W, Maes F, et al. Benefits of deep learning for delineation of organs at risk in head and neck cancer. Radiother Oncol 2019;138:68-74. https://doi.org/10.1016/j.radonc.2019.05.010