[C77] Multi-Frame ISP: Enhancing Vision-Based Tasks with RAW and Infrared Videos

Abstract

Recent models for vision-based downstream tasks perform well under standard conditions but degrade in low-light or high-exposure environments because they rely on well-lit datasets aligned with the human visual system. To address this, we propose a Multi-Frame ISP method that uses RAW and infrared (IR) video frames to improve robustness. RAW images retain rich environmental detail in low-light scenes, while IR images are unaffected by visible light. Unlike traditional ISP pipelines or paired translation models, our approach combines a Global ISP, which performs image-wide color correction through a Selective ISP module, with a Local ISP, which performs region-specific enhancement using temporal information. Experiments on the RAW video datasets ImageVID and YouTubeVOS and on the IR dataset FLIR show that our method outperforms conventional ISP pipelines and translation models by dynamically adapting to lighting variations, thereby improving object detection and segmentation.

Publication
IEEE International Conference on Advanced Visual and Signal-Based Systems