CVPR 2021

Iterative Filter Adaptive Network
for Single Image Defocus Deblurring

[Figure] Overall framework. We propose IFAN, which is specifically designed to effectively handle spatially-varying and large defocus blur.


We propose a novel end-to-end learning-based approach for single image defocus deblurring. The proposed approach is equipped with a novel Iterative Filter Adaptive Network (IFAN) that is specifically designed to handle spatially-varying and large defocus blur. For adaptively handling spatially-varying blur, IFAN predicts pixel-wise deblurring filters, which are applied to defocused features of an input image to generate deblurred features. For effectively managing large blur, IFAN models deblurring filters as stacks of small-sized separable filters. Predicted separable deblurring filters are applied to defocused features using a novel Iterative Adaptive Convolution (IAC) layer. We also propose a training scheme based on defocus disparity estimation and reblurring, which significantly boosts the deblurring quality. We demonstrate that our method achieves state-of-the-art performance both quantitatively and qualitatively on real-world images.

Iterative Adaptive Convolution (IAC)

For effectively handling spatially-varying and large defocus blur, it is critical to secure large receptive fields. However, while FAC facilitates spatially-adaptive processing of features, increasing the filter size to cover wider receptive fields results in huge memory consumption and computational cost. To resolve this limitation, we propose the IAC layer that iteratively applies small-sized separable filters to efficiently enlarge the receptive fields with little computational overhead.
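To make the idea concrete, below is a minimal NumPy sketch of one way an IAC-style layer could apply per-pixel separable filters iteratively: each iteration performs a vertical (k×1) pass and a horizontal (1×k) pass with filter taps predicted per pixel, then adds a per-pixel bias. The function and argument names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def iac_layer(feat, filters_v, filters_h, biases):
    """Sketch of an Iterative Adaptive Convolution (IAC) layer.

    feat:      (H, W, C) input feature map
    filters_v: (N, H, W, k) per-pixel vertical (k x 1) filter taps, one set per iteration
    filters_h: (N, H, W, k) per-pixel horizontal (1 x k) filter taps
    biases:    (N, H, W, C) per-pixel biases
    Stacking N small separable filters enlarges the receptive field
    without the cost of one large 2D per-pixel filter.
    """
    n_iter, H, W, k = filters_v.shape
    r = k // 2
    for n in range(n_iter):
        # vertical pass: per-pixel weighted sum over k vertical neighbours
        pad_v = np.pad(feat, ((r, r), (0, 0), (0, 0)), mode="edge")
        feat = sum(filters_v[n, :, :, t, None] * pad_v[t:t + H] for t in range(k))
        # horizontal pass: per-pixel weighted sum over k horizontal neighbours
        pad_h = np.pad(feat, ((0, 0), (r, r), (0, 0)), mode="edge")
        feat = sum(filters_h[n, :, :, t, None] * pad_h[:, t:t + W] for t in range(k))
        feat = feat + biases[n]
    return feat
```

With identity filters (a single 1 at the centre tap, zero bias), the layer reduces to a pass-through, which is a convenient sanity check.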


Disparity Map Estimation

As the disparities between dual-pixel stereo images are proportional to defocus blur magnitudes, we can guide IFAN to learn more accurate blur information by training it to predict a disparity map. To this end, during training, we feed the right image \(I_B^r\) of a dual-pixel stereo pair to our deblurring network and train the disparity map estimator to predict a disparity map \(d^{r\rightarrow l}\!\in\!\mathbb{R}^{h\times w}\) between the downsampled left and right stereo images. The estimator is supervised with a warping-based loss: \[\mathcal{L}_{disp} = MSE(I_{B\downarrow}^{r\rightarrow l}, I_{B\downarrow}^l),\] where \(MSE(\cdot)\) is the mean-squared-error function, \(I_{B\downarrow}^l\) is the left image downsampled to \(\frac{1}{8}\) scale, and \(I_{B\downarrow}^{r\rightarrow l}\) is the right image downsampled to \(\frac{1}{8}\) scale and warped to the left view by the disparity map \(d^{r\rightarrow l}\).
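The loss above can be sketched in a few lines: warp the right image toward the left view using the predicted per-pixel horizontal disparity, then compare against the left image with MSE. This sketch uses nearest-neighbour resampling for brevity (a real implementation would use differentiable bilinear warping); names are illustrative assumptions.

```python
import numpy as np

def warp_horizontal(img, disp):
    """Warp the right image toward the left view using a per-pixel
    horizontal disparity map (nearest-neighbour resampling for brevity)."""
    H, W, _ = img.shape
    xs = np.arange(W)[None, :] + np.round(disp).astype(int)
    xs = np.clip(xs, 0, W - 1)  # clamp samples that fall outside the image
    return img[np.arange(H)[:, None], xs]

def disparity_loss(right_ds, left_ds, disp):
    """L_disp = MSE(warped right image, left image), both downsampled."""
    warped = warp_horizontal(right_ds, disp)
    return np.mean((warped - left_ds) ** 2)
```

If the left image really is the right image shifted by the disparity, the loss vanishes, which is what drives the estimator toward correct disparities.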



We also train IFAN using a reblurring task. To learn reblurring, we introduce an auxiliary reblurring network attached at the end of IFAN, which is trained to invert the deblurring filters \(\textbf{F}_{deblur}\) into reblurring filters \(\textbf{F}_{reblur}\). Using \(\textbf{F}_{reblur}\), the IAC layer then reblurs a downsampled ground-truth image \(I_{S\downarrow}\in\mathbb{R}^{h\times w\times 3}\) to reproduce a downsampled version of the defocused input image. To train IFAN together with the reblurring network, we use a reblurring loss defined as: \[\mathcal{L}_{reblur}=MSE(I_{SB\downarrow}, I_{B\downarrow}),\] where \(I_{SB\downarrow}\) is the reblurred image obtained from \(I_{S\downarrow}\) using \(\textbf{F}_{reblur}\), and \(I_{B\downarrow}\) is the downsampled input image.
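Putting the pieces together, the training objective combines the deblurring, disparity, and reblurring losses. The sketch below assumes equal weighting of the three terms and illustrative argument names; it is not the paper's exact training code.

```python
import numpy as np

def mse(a, b):
    # mean-squared error, used by all three loss terms
    return np.mean((a - b) ** 2)

def training_losses(deblurred, sharp, warped_right_ds, left_ds,
                    reblurred_ds, blurred_ds):
    """Total training loss as a sum of three MSE terms.
    Equal weights are an assumption of this sketch."""
    l_deblur = mse(deblurred, sharp)          # deblurring supervision
    l_disp = mse(warped_right_ds, left_ds)    # L_disp (disparity warping)
    l_reblur = mse(reblurred_ds, blurred_ds)  # L_reblur (reblurring)
    return l_deblur + l_disp + l_reblur
```

Note that the disparity and reblurring branches are only used during training; at test time, only IFAN and the deblurring path are needed.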


RealDOF Test Set

To quantitatively measure the performance of our method on real-world defocused images, we present a new dataset, the Real Depth of Field (RealDOF) test set. RealDOF consists of 50 scenes; for each scene, it provides a pair of a defocused image and its corresponding all-in-focus image. To capture image pairs of the same scene with different depths of field, we built a dual-camera system in which two cameras are attached to a vertical rig with a beam splitter. The system is also equipped with a multi-camera trigger that synchronizes the camera shutters so that both images are captured simultaneously. The captured images are post-processed for geometric and photometric alignment, similarly to [25].



Ablation Study


Quantitative Comparison

Qualitative Comparison


@InProceedings{Lee_2021_CVPR,
    author    = {Junyong Lee and Hyeongseok Son and Jaesung Rim and Sunghyun Cho and Seungyong Lee},
    title     = {Iterative Filter Adaptive Network for Single Image Defocus Deblurring},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2021}
}