ConvMTL: Multi-task Learning via Self-supervised Learning for Simultaneous Dense Predictions

Vijayasri Iyer, Senthil Kumar Thangavel, Madhusudana Rao Nalluri, Maiga Chang

    Research output: Chapter in Book/Report/Conference proceeding › Published Conference contribution › peer-review

    Abstract

    Perception systems in autonomous vehicles are required to perform multiple scene-understanding tasks under tight latency and power constraints. Single-task neural networks scale poorly as the number of tasks in the perception stack increases. Multi-task learning has been shown to improve parameter efficiency and to enable models to learn more generalizable task representations than single-task neural networks. This work explores a novel convolutional multi-task neural network architecture that simultaneously performs two dense prediction tasks: semantic segmentation and depth estimation. A self-supervised ResNet-50 backbone is used as the basis of the proposed network, along with a multi-scale feature fusion module and a dense decoder. The model uses a simple weighted loss function, with an informed search algorithm identifying the optimal parameters. Performance on the segmentation task is assessed using the mean Intersection over Union (mIoU) and pixel accuracy, while the depth estimation task is assessed using absolute and relative errors. The obtained results are an mIoU of 73.81% and a pixel accuracy of 93.52% for segmentation, and an absolute error of 0.130 and a relative error of 29.05 for depth estimation. The model's performance is comparable to existing multi-task algorithms on the Cityscapes dataset, using only 2975 training samples.
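
The evaluation quantities named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the metric definitions follow common usage (mIoU averaged over classes that occur, absolute error as mean |pred - target|, relative error as mean |pred - target| / target over pixels with valid depth), and the loss weights are placeholders for the values that the paper's search procedure would select.

```python
import numpy as np

def confusion_matrix(pred, target, num_classes):
    """Count (target, pred) label pairs; rows are ground truth, columns are predictions."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    idx = target * num_classes + pred
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class matches the ground truth."""
    return np.trace(cm) / cm.sum()

def mean_iou(cm):
    """Per-class intersection over union, averaged over classes that occur."""
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    present = union > 0
    return (tp[present] / union[present]).mean()

def depth_errors(pred, target):
    """Mean absolute and mean relative error over pixels with valid (positive) depth."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    valid = target > 0
    diff = np.abs(pred[valid] - target[valid])
    return diff.mean(), (diff / target[valid]).mean()

def weighted_loss(seg_loss, depth_loss, w_seg, w_depth):
    """Simple weighted sum of the two task losses (weights are assumed inputs)."""
    return w_seg * seg_loss + w_depth * depth_loss

# Toy 2x2 segmentation example with two classes.
cm = confusion_matrix([[0, 1], [1, 1]], [[0, 1], [0, 1]], num_classes=2)
acc = pixel_accuracy(cm)
miou = mean_iou(cm)

# Toy depth example with two valid pixels.
abs_err, rel_err = depth_errors([2.0, 4.0], [1.0, 5.0])

# Placeholder weights, not values from the paper.
total = weighted_loss(2.0, 4.0, w_seg=0.5, w_depth=0.25)
```

In practice the confusion matrix would be accumulated over the whole validation set before computing mIoU, rather than averaged per image.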

    Original language: English
    Title of host publication: Computer Vision and Image Processing - 8th International Conference, CVIP 2023, Revised Selected Papers
    Editors: Harkeerat Kaur, Vinit Jakhetiya, Puneet Goyal, Pritee Khanna, Balasubramanian Raman, Sanjeev Kumar
    Pages: 455-466
    Number of pages: 12
    DOIs
    Publication status: Published - 2024
    Event: 8th International Conference on Computer Vision and Image Processing, CVIP 2023 - Jammu, India
    Duration: 3 Nov. 2023 to 5 Nov. 2023

    Publication series

    Name: Communications in Computer and Information Science
    Volume: 2009 CCIS
    ISSN (Print): 1865-0929
    ISSN (Electronic): 1865-0937

    Conference

    Conference: 8th International Conference on Computer Vision and Image Processing, CVIP 2023
    Country/Territory: India
    City: Jammu
    Period: 3/11/23 to 5/11/23

    Keywords

    • Autonomous Driving
    • Computer Vision
    • Deep Learning
    • Multi-task Learning
    • Transfer Learning
