CVPR 2021

Iterative Filter Adaptive Network
for Single Image Defocus Deblurring

POSTECH

We propose IFAN, which is specifically designed to effectively handle spatially-varying and large defocus blur.

Abstract

We propose a novel end-to-end learning-based approach for single image defocus deblurring. The proposed approach is equipped with a novel Iterative Filter Adaptive Network (IFAN) that is specifically designed to handle spatially-varying and large defocus blur. For adaptively handling spatially-varying blur, IFAN predicts pixel-wise deblurring filters, which are applied to defocused features of an input image to generate deblurred features. For effectively managing large blur, IFAN models deblurring filters as stacks of small-sized separable filters. Predicted separable deblurring filters are applied to defocused features using a novel Iterative Adaptive Convolution (IAC) layer. We also propose a training scheme based on defocus disparity estimation and reblurring, which significantly boosts the deblurring quality. We demonstrate that our method achieves state-of-the-art performance both quantitatively and qualitatively on real-world images.

Iterative Adaptive Convolution (IAC)

To effectively handle spatially-varying and large defocus blur, it is critical to secure a large receptive field. However, while the Filter Adaptive Convolution (FAC) layer facilitates spatially-adaptive processing of features, increasing the filter size to cover a wider receptive field results in huge memory consumption and computational cost. To resolve this limitation, we propose the IAC layer, which iteratively applies small-sized separable filters to efficiently enlarge the receptive field with little computational overhead.
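
For reference, below is a minimal PyTorch sketch of the per-pixel separable filtering performed by IAC, assuming the predicted 1 x k and k x 1 filters are shared across channels at each pixel and that a per-pixel bias is added after every iteration; the tensor layouts and filter parameterization are illustrative, not the released implementation.

import torch
import torch.nn.functional as F

def adaptive_conv1d(feat, filt, dim):
    # feat: (B, C, H, W); filt: (B, k, H, W) per-pixel 1D filters,
    # assumed shared across channels; dim: 'h' for a k x 1 filter, 'w' for a 1 x k filter.
    B, C, H, W = feat.shape
    k = filt.shape[1]
    ksize, pad = ((k, 1), (k // 2, 0)) if dim == 'h' else ((1, k), (0, k // 2))
    patches = F.unfold(feat, kernel_size=ksize, padding=pad)    # (B, C*k, H*W)
    patches = patches.view(B, C, k, H, W)
    return (patches * filt.view(B, 1, k, H, W)).sum(dim=2)      # (B, C, H, W)

def iac(feat, filters_v, filters_h, biases):
    # Iteratively apply N separable per-pixel filter sets.
    # filters_v, filters_h: lists of (B, k, H, W); biases: list of (B, C, H, W).
    out = feat
    for fv, fh, b in zip(filters_v, filters_h, biases):
        out = adaptive_conv1d(out, fh, dim='w')   # 1 x k pass
        out = adaptive_conv1d(out, fv, dim='h')   # k x 1 pass
        out = out + b                             # per-iteration bias
    return out

Because each iteration applies only two 1D filters, the effective receptive field grows with the number of iterations while the per-pixel filter cost stays small.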


Disparity Map Estimation

As the disparity between dual-pixel stereo images is proportional to the blur magnitude, we can guide IFAN to learn more accurate defocus blur information by training it to predict the disparity map. To this end, during training, we feed the right image \(I_B^r\) of a dual-pixel stereo pair to our deblurring network and train the disparity map estimator to predict a disparity map \(d^{r\rightarrow l}\!\in\!\mathbb{R}^{h\times w}\) between the downsampled left and right stereo images, using the disparity loss \[\mathcal{L}_{disp} = MSE(I_{B\downarrow}^{r\rightarrow l}, I_{B\downarrow}^l),\] where \(MSE(\cdot)\) is the mean-squared error function, \(I_{B\downarrow}^l\) is the left image downsampled by \(\frac{1}{8}\), and \(I_{B\downarrow}^{r\rightarrow l}\) is the right image downsampled by \(\frac{1}{8}\) and warped to the left view by the disparity map \(d^{r\rightarrow l}\).
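
A hedged sketch of how \(\mathcal{L}_{disp}\) can be computed is shown below, assuming the predicted disparity shifts sampling positions horizontally and that warping uses bilinear grid sampling; the helper names and the sign convention of the disparity are illustrative.

import torch
import torch.nn.functional as F

def warp_by_disparity(img_r, disp):
    # img_r: (B, 3, h, w) downsampled right view; disp: (B, 1, h, w) in pixels.
    B, _, h, w = img_r.shape
    ys, xs = torch.meshgrid(torch.arange(h, device=img_r.device),
                            torch.arange(w, device=img_r.device), indexing='ij')
    xs = xs.unsqueeze(0).float() + disp.squeeze(1)   # shift sampling positions horizontally
    ys = ys.unsqueeze(0).float().expand_as(xs)
    grid = torch.stack((2.0 * xs / (w - 1) - 1.0,    # normalize to [-1, 1] for grid_sample
                        2.0 * ys / (h - 1) - 1.0), dim=-1)
    return F.grid_sample(img_r, grid, align_corners=True)

def disparity_loss(img_r_down, img_l_down, disp):
    warped = warp_by_disparity(img_r_down, disp)     # I_B_down^{r->l}
    return F.mse_loss(warped, img_l_down)            # L_disp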


Reblurring

We also train IFAN using a reblurring task. To learn reblurring, we introduce an auxiliary reblurring network attached at the end of IFAN, which is trained to invert the deblurring filters \(\textbf{F}_{deblur}\) into reblurring filters \(\textbf{F}_{reblur}\). Then, using \(\textbf{F}_{reblur}\), the IAC layer reblurs a downsampled ground-truth image \(I_{S\downarrow}\in\mathbb{R}^{h\times w\times 3}\) to reproduce a downsampled version of the defocused input image. To train IFAN as well as the reblurring network, we use a reblurring loss defined as \[\mathcal{L}_{reblur}=MSE(I_{SB\downarrow}, I_{B\downarrow}),\] where \(I_{SB\downarrow}\) is the reblurred image obtained from \(I_{S\downarrow}\) using \(\textbf{F}_{reblur}\), and \(I_{B\downarrow}\) is the downsampled input image.
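
The reblurring loss can be sketched as follows, assuming a hypothetical reblur_net that maps \(\textbf{F}_{deblur}\) to \(\textbf{F}_{reblur}\) and the same iac interface as in the sketch above; this illustrates the training objective rather than the released code.

import torch.nn.functional as F

def reblurring_loss(sharp_down, blurry_down, deblur_filters, reblur_net, iac):
    # sharp_down: I_S_down (B, 3, h, w); blurry_down: I_B_down (B, 3, h, w).
    # reblur_net (hypothetical) inverts the predicted deblurring filters into
    # reblurring filters, which the IAC layer applies to the ground-truth image.
    filters_v, filters_h, biases = reblur_net(deblur_filters)       # F_reblur from F_deblur
    reblurred = iac(sharp_down, filters_v, filters_h, biases)       # I_SB_down
    return F.mse_loss(reblurred, blurry_down)                       # L_reblur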


RealDOF Test Set

To quantitatively measure the performance of our method on real-world defocus blur images, we prepare a new dataset named the Real Depth of Field (RealDOF) test set. RealDOF consists of 50 scenes; for each scene, the dataset provides a pair of a defocused image and its corresponding all-in-focus image. To capture image pairs of the same scene with different depths of field, we built a dual-camera system with a beam splitter. Specifically, the system consists of two cameras attached to a vertical rig with a beam splitter, together with a multi-camera trigger that synchronizes the camera shutters so that both images are captured simultaneously. The captured images are post-processed for geometric and photometric alignment, similarly to [25].
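
As a rough illustration of the kind of post-processing involved (not the exact pipeline of [25]), the sketch below geometrically aligns a pair with a RANSAC homography estimated from ORB matches and then matches per-channel mean and standard deviation for photometric alignment; all function and parameter choices here are assumptions for illustration.

import cv2
import numpy as np

def align_pair(defocused, all_in_focus):
    # Geometric alignment: homography from ORB feature matches (RANSAC).
    orb = cv2.ORB_create(4000)
    k1, d1 = orb.detectAndCompute(cv2.cvtColor(all_in_focus, cv2.COLOR_BGR2GRAY), None)
    k2, d2 = orb.detectAndCompute(cv2.cvtColor(defocused, cv2.COLOR_BGR2GRAY), None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    h, w = defocused.shape[:2]
    aligned = cv2.warpPerspective(all_in_focus, H, (w, h))

    # Photometric alignment: match per-channel mean and standard deviation.
    a = aligned.astype(np.float32)
    b = defocused.astype(np.float32)
    a = (a - a.mean(axis=(0, 1))) / (a.std(axis=(0, 1)) + 1e-6) \
        * b.std(axis=(0, 1)) + b.mean(axis=(0, 1))
    return np.clip(a, 0, 255).astype(np.uint8), defocused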


Results

Ablation Study


Quantitative Comparison

Qualitative Comparison

BibTeX

@InProceedings{Lee2021IFAN,
    author    = {Junyong Lee and Hyeongseok Son and Jaesung Rim and Sunghyun Cho and Seungyong Lee},
    title     = {Iterative Filter Adaptive Network for Single Image Defocus Deblurring},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2021}
}