Task-based evaluation of deep image super-resolution in medical imaging

[Collaborative project with Professor Anastasio]

In medical imaging, it is sometimes desirable to acquire high-resolution images that reveal anatomical and physiological information to support clinical tasks such as diagnosis and image-guided therapy. However, for certain imaging modalities (e.g., magnetic resonance imaging (MRI)), acquiring high-resolution images can be a very time-consuming and resource-intensive process. One popular, recently developed solution is to create a high-resolution version of the acquired low-resolution image by use of deep learning-based image super-resolution (DL-SR) methods. It has been demonstrated in the literature that deep super-resolution networks can improve image quality (IQ) as measured by traditional physical metrics such as mean square error (MSE), structural similarity index measure (SSIM), and peak signal-to-noise ratio (PSNR). However, it is not clear how well these metrics quantify the diagnostic value of the generated super-resolution (SR) images. Here, a task-based SR image quality assessment is conducted to quantitatively evaluate the efficiency and performance of DL-SR methods. A Rayleigh task is designed to investigate the impact of signal length and super-resolution network complexity on binary detection performance. Numerical observers (NOs), including the regularized Hotelling observer (RHO), the anthropomorphic Gabor channelized Hotelling observer (Gabor CHO), and the ResNet-approximated ideal observer (ResNet-IO), are implemented to assess Rayleigh task performance. For the datasets considered in this study, the considered DL-SR networks yielded little to no improvement in the task performance of the considered NOs, despite substantial improvement in traditional IQ metrics.
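To make the traditional IQ metrics mentioned above concrete, the following is a minimal sketch of MSE and PSNR computed with NumPy. The function names and the `data_range` parameter are illustrative assumptions and are not taken from the study's code; SSIM is omitted because it requires a windowed computation that is better left to an established library.

```python
import numpy as np


def mse(reference, estimate):
    """Mean square error between a reference (HR) image and an estimate (SR) image."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    return float(np.mean((reference - estimate) ** 2))


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB.

    data_range is the assumed maximum possible pixel value span
    (hypothetical default of 1.0 for images scaled to [0, 1]).
    """
    err = mse(reference, estimate)
    if err == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / err))
```

Both quantities depend only on pixel-wise differences from the reference image, which is why they can improve without any corresponding gain in performance on a specific detection task.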

Methodology

Results

Our numerical experiments confirmed that, as expected, DL-SR could improve traditional measures of image quality (IQ). However, for many of the study designs considered, the DL-SR methods provided little or no improvement in task performance and could even degrade it. DL-SR was observed to improve the task performance of sub-optimal observers under certain conditions. The presented study highlights the urgent need for the objective assessment of DL-SR methods and suggests avenues for improving their efficacy in medical imaging applications.

RHO templates computed on (a) HR and LR images, and (b–f) images from SRCNN and SRGAN resulting from sweeping the regularization parameter λ
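As a rough illustration of how a regularized Hotelling observer template and its task performance might be computed, the sketch below uses a Tikhonov-style regularization, w = (K + λI)⁻¹ Δḡ, where K is the average of the two class covariance matrices and Δḡ is the difference of the class mean images. This particular form of regularization is an assumption chosen for simplicity and may differ from the one used in the study; the function names `rho_template` and `auc` are likewise hypothetical. Images are assumed to be flattened into rows of shape (n_images, n_pixels).

```python
import numpy as np


def rho_template(class0_images, class1_images, lam):
    """Regularized Hotelling observer template (Tikhonov form, an assumption).

    class0_images, class1_images: arrays of shape (n_images, n_pixels),
    one flattened image per row, for the two hypotheses of the binary task.
    lam: regularization parameter added to the covariance diagonal.
    """
    g0 = np.asarray(class0_images, dtype=np.float64)
    g1 = np.asarray(class1_images, dtype=np.float64)
    # Difference of the class mean images.
    delta_mean = g1.mean(axis=0) - g0.mean(axis=0)
    # Average of the two sample covariance matrices (pixels are the variables).
    cov = 0.5 * (np.atleast_2d(np.cov(g0, rowvar=False))
                 + np.atleast_2d(np.cov(g1, rowvar=False)))
    # Solve (K + lam * I) w = delta_mean instead of forming an explicit inverse.
    regularized = cov + lam * np.eye(cov.shape[0])
    return np.linalg.solve(regularized, delta_mean)


def auc(class0_stats, class1_stats):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 0.5)."""
    t0 = np.asarray(class0_stats, dtype=np.float64)
    t1 = np.asarray(class1_stats, dtype=np.float64)
    diff = t1[:, None] - t0[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))
```

In use, the template would be estimated on a training set for each value of λ in the sweep, the test statistic for each held-out image would be computed as `image @ w`, and the λ giving the best held-out `auc` would be retained. Repeating this for the HR, LR, and DL-SR image ensembles gives one task-based figure of merit per ensemble.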

Related publications

    1. Varun A. Kelkar, Xiaohui Zhang, Jason Granstedt, Hua Li, and Mark A. Anastasio, “Task-based evaluation of deep image super-resolution in medical imaging,” Proc. SPIE 11599, Medical Imaging 2021: Image Perception, Observer Performance, and Technology Assessment, 115990X.