A Novel Iterative Fusion Multi-Task Learning Framework for Solving Dense Prediction

EasyChair Preprint 12128, 15 pages. Date: February 14, 2024

Abstract

Dense prediction tasks, such as semantic segmentation, monocular depth estimation, and edge estimation, are active topics in computer vision that aim to produce a prediction for every pixel of an input image. With advances in deep learning, performance on many dense prediction tasks has improved substantially. Multi-task learning is one of the main research directions for boosting performance further: properly designed multi-task architectures achieve better accuracy and lower memory usage than collections of single-task models. This paper proposes a novel multi-task learning (MTL) framework with a task pair interaction module for tackling several dense prediction tasks. Unlike most widely used MTL structures, which share features up to a specific layer and then branch into task-specific layers, our framework remixes the resulting task-specific features via a task pair interaction module (TPIM) to obtain richer shared features. Because the tasks are learned jointly, they supervise one another and exchange rich shared information that improves the final results. The TPIM includes a novel cross-task interaction block (CIB) that combines two attention mechanisms: self-attention and pixel-wise global attention. In contrast to the commonly used global attention mechanism, an iterative fusion block (IFB) is introduced to effectively fuse affinity information between task pairs. Extensive experiments on two benchmark datasets (NYUD-v2 and PASCAL) demonstrate that the proposed approach is effective compared with existing methods.

Keyphrases: dense prediction, cross-task interaction, iterative fusion, multi-task learning
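The abstract describes the TPIM, CIB (self-attention plus pixel-wise global attention), and IFB only at a high level. The sketch below is a minimal, hypothetical PyTorch illustration of that general idea, not the authors' implementation: the module name CrossTaskInteractionBlock, the 1x1-convolution gate standing in for pixel-wise global attention, and the number of fusion iterations are all assumptions made for illustration.

```python
# Minimal sketch (not the paper's code): one task's features are refined with
# self-attention, gated by a pixel-wise attention map derived from a second
# task's features, and then mixed back in through a small iterative fusion loop.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossTaskInteractionBlock(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4, fusion_steps: int = 2):
        super().__init__()
        # Self-attention over the spatial positions of the primary task.
        self.self_attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # Pixel-wise gate computed from the auxiliary task's features
        # (an assumed, simplified stand-in for pixel-wise global attention).
        self.pixel_gate = nn.Conv2d(channels, channels, kernel_size=1)
        # Fusion layer applied repeatedly in the iterative fusion loop.
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.fusion_steps = fusion_steps

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        b, c, h, w = feat_a.shape
        # Self-attention on task A's features: flatten spatial dims into tokens.
        tokens = feat_a.flatten(2).transpose(1, 2)            # (B, H*W, C)
        attn_out, _ = self.self_attn(tokens, tokens, tokens)
        feat_a_sa = attn_out.transpose(1, 2).reshape(b, c, h, w)
        # Cross-task signal: task B's pixel-wise gate modulates task A's features.
        gate = torch.sigmoid(self.pixel_gate(feat_b))          # (B, C, H, W)
        cross = feat_a_sa * gate
        # Iterative fusion: repeatedly mix the cross-task signal back in.
        fused = feat_a
        for _ in range(self.fusion_steps):
            fused = F.relu(self.fuse(torch.cat([fused, cross], dim=1)))
        return fused


if __name__ == "__main__":
    block = CrossTaskInteractionBlock(channels=64)
    seg_feat = torch.randn(2, 64, 32, 32)    # e.g. segmentation features
    depth_feat = torch.randn(2, 64, 32, 32)  # e.g. depth features
    out = block(seg_feat, depth_feat)
    print(out.shape)                          # torch.Size([2, 64, 32, 32])
```

In a multi-task setting such a block would be applied per ordered task pair, and the refined features of each task would then be passed to that task's prediction head; the exact pairing and fusion schedule used by the paper are not specified in the abstract.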