Evaluation Metrics
The Dice similarity coefficient and 95% Hausdorff distance were used to quantify the accuracy of the segmented structures. These metrics are described in Chapter 15. In contrast to the implementation described in Chapter 15, evaluation was performed on voxelized segmentations using Plastimatch. To overcome inconsistencies in the superior and inferior extents of the manually labeled tubular structures (spinal cord and esophagus), both the segmented structures and the ground truth structures were cropped by 10 mm at the superior and inferior borders, measured with respect to the ground truth structures.
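The cropping step can be sketched as follows. This is a minimal illustration of one reading of the text, not the implementation used in the study: it assumes binary masks indexed (z, y, x) with z running from inferior to superior, and it assumes that both masks are restricted to the ground truth extent shortened by the margin at each end. The function name and arguments are hypothetical.

```python
import math
import numpy as np

def crop_to_truth_extent(pred, truth, slice_spacing_mm, margin_mm=10.0):
    """Keep only the slices inside the ground-truth extent, shortened by
    margin_mm at the inferior and superior ends (assumed interpretation).

    pred, truth: boolean arrays indexed (z, y, x).
    Returns cropped copies of (pred, truth) with the same shape.
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    # Slice indices in which the ground truth contains at least one voxel.
    occupied = np.flatnonzero(truth.any(axis=(1, 2)))
    if occupied.size == 0:
        raise ValueError("ground truth mask is empty")
    # Number of whole slices needed to cover the margin.
    n_margin = math.ceil(margin_mm / slice_spacing_mm)
    lo = occupied[0] + n_margin
    hi = occupied[-1] - n_margin
    if lo > hi:
        raise ValueError("margin removes the whole structure")
    keep = np.zeros(truth.shape[0], dtype=bool)
    keep[lo:hi + 1] = True
    mask = keep[:, None, None]  # broadcast the slice selection over y and x
    return pred & mask, truth & mask
```

Applying the same slice range to both masks means that disagreement about where a tubular structure starts or ends no longer contributes to the score.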
The directed percent Hausdorff measure, for a percentile r, is the rth percentile distance over all distances from points in X to their closest point in Y.
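The two metrics can be sketched in a few lines of NumPy. This is an illustrative sketch, not the Plastimatch implementation: the function names are hypothetical, the point sets are assumed to be given in millimeters, and taking the maximum of the two directed values is only one common way of obtaining a symmetric measure.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient of two binary masks of equal shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement here
    return 2.0 * np.logical_and(a, b).sum() / total

def directed_percent_hausdorff(x_pts, y_pts, r=95.0):
    """r-th percentile of the distances from each point in X to its
    closest point in Y. x_pts, y_pts: arrays of shape (n, dims)."""
    x = np.asarray(x_pts, dtype=float)
    y = np.asarray(y_pts, dtype=float)
    # Brute-force pairwise distances, shape (n_x, n_y); fine for small sets.
    d = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
    nearest = d.min(axis=1)
    return float(np.percentile(nearest, r))

def percent_hausdorff(x_pts, y_pts, r=95.0):
    """Symmetric variant: maximum of the two directed measures (assumption)."""
    return max(directed_percent_hausdorff(x_pts, y_pts, r),
               directed_percent_hausdorff(y_pts, x_pts, r))
```

The brute-force distance matrix grows with the product of the two point counts, so large surfaces would need a spatial index or a distance transform instead.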
Experimental Steps
The LCTSC data [18], described in Chapter 1, were used for the experiments. The 36 training cases were used to optimize the registration and atlas fusion parameters. The remaining 24 offline and online test cases were used to evaluate accuracy. Four deformable image registration strategies were carried out. The first three strategies used the B-spline algorithm, and the last strategy used the demons algorithm. All strategies used multiple registration stages, as shown in Table 4.3. The B-spline methods used five stages and the demons algorithm used four stages. The first two stages of every strategy were rigid, followed by the deformable stages. Three parameters were varied for the B-spline-based model: (1) the image subsampling rate (Res) was varied from 2 mm to 6 mm, (2) the regularization weight and type (Reg) were varied between the curvature regularizer with weights from 0 to 100 and the third-order regularizer with weights from 1 to 10, and (3) the B-spline grid spacing (GS) was varied from 10 mm to 100 mm. For the last strategy (demons), the width of the Gaussian kernel used to smooth the displacement field was varied from 1 mm to 4 mm. The value ranges for these parameters were selected based on the authors’ prior experience with image registration. To narrow down the parameter ranges, a few experiments were first run on a subset of atlases. Table 4.1 describes the acronyms used in the chapter. The fixed and constant parameters for all registration strategies are described in Tables 4.2 and 4.3. Only one parameter was varied for a given strategy, keeping all other parameters fixed. The fixed parameter
TABLE 4.1
Acronyms Used for Each Parameter and Their Explanations

| Acronym used in text | Explanation of parameter |
|---|---|
| Res | Image subsampling rate |
| Reg | Regularization weight and type |
| GS | Grid spacing |
| Demons | Width of Gaussian kernel |
TABLE 4.2
Parameters Varied for Each Registration Strategy
Columns Rigid1 through Deformable3 are the registration stages.

| Registration strategy | Rigid1 | Rigid2 | Deformable1 | Deformable2 | Deformable3 |
|---|---|---|---|---|---|
| Res 1 | 4x4x3 | 4x4x3 | 4x4x3 | 2x2x3 | 2x2x3 |
| Res 2 | 4x4x3 | 4x4x3 | 4x4x3 | 3x3x3 | 3x3x3 |
| Res 3 | 4x4x3 | 4x4x3 | 4x4x3 | 4x4x3 | 4x4x3 |
| Res 4 | 4x4x12 | 4x4x12 | 4x4x12 | 2x2x6 | 2x2x6 |
| Res 5 | 6x6x6 | 6x6x6 | 6x6x6 | 3x3x3 | 2x2x3 |
| Res 6 | 6x6x6 | 6x6x6 | 6x6x6 | 3x3x3 | 3x3x3 |
| Res 7 | 6x6x6 | 6x6x6 | 6x6x6 | 6x6x6 | 6x6x6 |
| Reg 0 | NA | NA | Curvature 0 | Curvature 0 | Curvature 0 |
| Reg 1 | NA | NA | Curvature 100 | Curvature 1 | Curvature 0.1 |
| Reg 2 | NA | NA | Curvature 100 | Curvature 10 | Curvature 0.1 |
| Reg 3 | NA | NA | Curvature 100 | Curvature 10 | Curvature 1 |
| Reg 4 | NA | NA | Curvature 100 | Curvature 100 | Curvature 100 |
| Reg 5 | NA | NA | Third order 100 | Third order 10 | Third order 1 |
| Reg 6 | NA | NA | Third order 1000 | Third order 10 | Third order 1 |
| Reg 7 | NA | NA | Third order 1000 | Third order 100 | Third order 1 |
| Reg 8 | NA | NA | Third order 1000 | Third order 100 | Third order 10 |
| GS 1 | NA | NA | 100 x 100 x 100 | 30 x 30 x 30 | 10 x 10 x 10 |
| GS 2 | NA | NA | 100 x 100 x 100 | 30 x 30 x 30 | 20 x 20 x 20 |
| GS 3 | NA | NA | 100 x 100 x 100 | 50 x 50 x 50 | 10 x 10 x 10 |
| GS 4 | NA | NA | 100 x 100 x 100 | 50 x 50 x 50 | 20 x 20 x 20 |
| GS 5 | NA | NA | 100 x 100 x 100 | 50 x 50 x 50 | 30 x 30 x 30 |
| GS 6 | NA | NA | 100 x 100 x 100 | 50 x 50 x 50 | 50 x 50 x 50 |
| GS 7 | NA | NA | 100 x 100 x 100 | 100 x 100 x 100 | 100 x 100 x 100 |
| Demons 1 | NA | NA | 2 | 1 | NA |
| Demons 2 | NA | NA | 3 | 2 | NA |
| Demons 3 | NA | NA | 4 | 3 | NA |
| Demons 4 | NA | NA | 5 | 4 | NA |
settings are marked in bold in Table 4.2. For example, when the image subsampling rate is varied, the grid spacing, regularization type, and weight are held constant as seen in Table 4.3. Once the optimal parameters were determined, each query image was segmented based on the optimal parameters and the segmentations were evaluated for each anatomical structure against the ground truth.
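The one-parameter-at-a-time design can be sketched as a small helper that starts from a fixed baseline and swaps in one candidate schedule per strategy. This is only an illustrative sketch: the baseline values are copied from the constant parameters in Table 4.3, the candidate schedules are a subset of the rows of Table 4.2, and the dictionary layout and function name are hypothetical rather than any tool's configuration format.

```python
# Baseline schedules taken from the constant parameters in Table 4.3.
BASELINE = {
    # Image subsampling for all five stages (Rigid1 .. Deformable3).
    "res": ["6x6x6", "6x6x6", "6x6x6", "3x3x3", "3x3x3"],
    # Regularizer (type, weight) for the three deformable stages.
    "reg": [("curvature", 100), ("curvature", 1), ("curvature", 0.1)],
    # Isotropic grid spacing in mm for the three deformable stages.
    "gs": [100, 50, 30],
}

# A subset of the candidate schedules listed in Table 4.2.
CANDIDATES = {
    "gs": {
        "GS 1": [100, 30, 10],
        "GS 2": [100, 30, 20],
        "GS 3": [100, 50, 10],
        "GS 4": [100, 50, 20],
        "GS 5": [100, 50, 30],
        "GS 6": [100, 50, 50],
        "GS 7": [100, 100, 100],
    },
    "reg": {
        "Reg 0": [("curvature", 0), ("curvature", 0), ("curvature", 0)],
        "Reg 1": [("curvature", 100), ("curvature", 1), ("curvature", 0.1)],
        "Reg 2": [("curvature", 100), ("curvature", 10), ("curvature", 0.1)],
    },
}

def one_at_a_time(baseline, candidates):
    """Return {strategy label: settings}, where each strategy differs from
    the baseline in exactly one parameter schedule."""
    strategies = {}
    for name, options in candidates.items():
        for label, schedule in options.items():
            settings = dict(baseline)   # shallow copy of the baseline
            settings[name] = schedule   # override only the varied parameter
            strategies[label] = settings
    return strategies

strategies = one_at_a_time(BASELINE, CANDIDATES)
```

Each entry of `strategies` would then be run over the training cases, and the schedule with the best average score kept for the test cases.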
TABLE 4.3
Constant Parameters for the Registration Strategies
Columns Rigid1 through Deformable3 are the registration stages.

| Parameter varied | Constant parameter | Rigid1 | Rigid2 | Deformable1 | Deformable2 | Deformable3 |
|---|---|---|---|---|---|---|
| Image subsampling (mm) | Regularizer type and weight | NA | NA | Curvature 100 | Curvature 1 | Curvature 0.1 |
| Image subsampling (mm) | Grid spacing (mm) | NA | NA | 100 x 100 x 100 | 50 x 50 x 50 | 30 x 30 x 30 |
| Regularizer type and weight | Image subsampling (mm) | 6x6x6 | 6x6x6 | 6x6x6 | 3x3x3 | 3x3x3 |
| Regularizer type and weight | Grid spacing (mm) | NA | NA | 100 x 100 x 100 | 50 x 50 x 50 | 30 x 30 x 30 |
| Grid spacing (mm) | Image subsampling (mm) | 6x6x6 | 6x6x6 | 6x6x6 | 3x3x3 | 3x3x3 |
| Grid spacing (mm) | Regularizer type and weight | NA | NA | Curvature 100 | Curvature 1 | Curvature 0.1 |
| Width of Gaussian kernel | Image subsampling (mm) | 6x6x6 | 6x6x6 | 6x6x6 | 3x3x3 | NA |
Results
The voxel sampling rate was found not to substantially affect the performance of either registration algorithm. The B-spline method was also relatively robust to variations in the control-point spacing. However, both methods were most sensitive to the tuning of the regularizer. Figure 4.1 compares the average Dice similarity coefficient (DSC) over the five organs-at-risk (OARs) when varying (a) the width of the Gaussian kernel and (b) the curvature regularizer weight. Increasing the smoothness lowered the performance as measured by DSC. A Gaussian kernel of 1 mm width and a curvature regularizer weight of 0.1 led to the highest average DSC.
GS4 was the best overall optimization strategy. The segmentation accuracy of the lungs was significantly higher than that of the other structures (DSC = 0.95 ± 0.02, 95% HD = 5.29 ± 3.25). The segmentation accuracy of the esophagus and spinal cord was relatively low (spinal cord: DSC = 0.8 ± 0.08, 95% HD = 12.34 ± 13.9; esophagus: DSC = 0.59 ± 0.08, 95% HD = 9.47 ± 6.57) due to poor soft-tissue contrast. Figure 4.2 shows box plots of (a) DSC and (b) 95% HD achieved over the 24 test cases by the best-performing strategy (GS4) for each OAR and for the average over the OARs, indicating the [5, 95]% confidence interval and the corresponding outliers. Figure 4.3 shows the segmentation performance for the best-performing strategy. Rows 1 to 5 show the 5th, 25th, 50th, 75th, and 95th percentiles of the median distribution, with the 5th percentile being the worst and the 95th percentile being the best segmentation. The 5th percentile example produces a poor segmentation due to the presence of a tumor in the left lung.
FIGURE 4.1 Average DSC over the five OARs achieved by varying (a) width of the Gaussian kernel and (b) curvature regularizer weight for the 24 test cases.
FIGURE 4.2 (a) DSC and (b) 95% HD achieved by the best-performing strategy for the five OARs and their average over the 24 test cases.
FIGURE 4.3 Segmentations generated using the best-performing strategy. Rows 1 to 5 show the 5th, 25th, 50th, 75th, and 95th percentile images, from worst to best, from the 24 test cases.
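Choosing representative cases at fixed percentiles can be sketched as below. This is a hedged illustration rather than the procedure used to build Figure 4.3: it assumes that the "median distribution" refers to the per-case median DSC over the OARs, and the function name and input layout are hypothetical.

```python
import numpy as np

def representative_cases(scores, percentiles=(5, 25, 50, 75, 95)):
    """Pick one case per requested percentile.

    scores: array of shape (n_cases, n_structures) holding DSC values.
    Returns the indices of the cases whose per-case median DSC lies
    closest to each percentile of the per-case median distribution.
    """
    scores = np.asarray(scores, dtype=float)
    per_case = np.median(scores, axis=1)             # one summary value per case
    targets = np.percentile(per_case, percentiles)   # percentile values to match
    return [int(np.argmin(np.abs(per_case - t))) for t in targets]
```

Because a higher DSC indicates better agreement, the case returned for the lowest percentile is the worst of the selected examples and the one for the highest percentile is the best.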
Summary
Identifying the optimal registration and voting parameters is a challenging exercise. For practical purposes, numerous algorithm details such as atlas selection and label fusion parameters must be held constant during testing. However, one must consider that interplay could exist between these algorithm settings. For example, a registration with high regularization may require a different number of atlases than a registration with low regularization.
In general, the B-spline and demons registrations performed similarly. The algorithms were not very sensitive to the voxel sampling rate, which means that faster registrations at higher subsampling rates can be considered. An intermediate schedule with a final grid spacing of 20 mm was preferred for the B-spline grid spacing. Final segmentation results were not highly sensitive to these parameters, however, and the average Dice similarity varied by only a few percent over fairly broad parameter ranges. Both algorithms were more affected by the choice of regularizer parameters, with smaller regularization penalty terms preferred for both B-spline and demons registrations.
References
1. Alven, J, et al. (2016). “Uberatlas: fast and robust registration for multi-atlas segmentation.” 80: 249-255.
2. Bai, J, et al. (2012). “Atlas-based automatic mouse brain image segmentation revisited: model complexity vs. image registration.” 30(6): 789-798.
3. Datteri, R, et al. (2011). “Estimation of registration accuracy applied to multi-atlas segmentation.” MICCAI Workshop on Multi-Atlas Labeling and Statistical Fusion.
4. Doshi, J, et al. (2016). “MUSE: multi-atlas region segmentation utilizing ensembles of registration algorithms and parameters, and locally optimal atlas selection.” 127: 186-195.
5. Rueckert, D, et al. (1999). “Nonrigid registration using free-form deformations: application to breast MR images.” 18(8): 712-721.
6. Heckemann, RA, et al. (2010). “Improving intersubject image registration using tissue-class information benefits robustness and accuracy of multi-atlas based anatomical segmentation.” 51(1): 221-227.
7. Lotjonen, JM, et al. (2010). “Fast and robust multi-atlas segmentation of brain magnetic resonance images.” 49(3): 2352-2365.
8. Sjoberg, C, et al. (2013). “Multi-atlas based segmentation using probabilistic label fusion with adaptive weighting of image similarity measures.” 110(3): 308-319.
9. Yeo, BT, et al. (2008). “Effects of registration regularization and atlas sharpness on segmentation accuracy.” 12(5): 603-615.
10. Zaffino, P, et al. (2016). “Plastimatch MABS, an open source tool for automatic image segmentation.” 43(9): 5155-5160.
11. Shah, KD, et al. (2020). “A generalized framework for analytic regularization of uniform cubic B-spline displacement fields.” arXiv:2010.02400.
12. Shackleford, JA, et al. (2012). “Analytic regularization of uniform cubic B-spline deformation fields.” International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer.
13. Shackleford, JA, et al. (2010). “On developing B-spline registration algorithms for multi-core processors.” 55(21): 6329.
14. Joshi, S, et al. (2004). “Unbiased diffeomorphic atlas construction for computational anatomy.” 23(Supplement 1): S151-S160.
15. Avants, BB, et al. (2008). “Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain.” 12(1): 26-41.
16. Vercauteren, T, et al. (2009). “Diffeomorphic demons: efficient non-parametric image registration.” 45(1): S61-S72.
17. Thirion, JP (1998). “Image matching as a diffusion process: an analogy with Maxwell’s demons.” Medical Image Analysis 2(3): 243-260.
18. Yang, J, et al. (2018). “Autosegmentation for thoracic radiation treatment planning: a grand challenge at AAPM 2017.” 45(10): 4568-4581.