OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks

Sophia Sirko-Galouchenko Alexandre Boulch Spyros Gidaris
Andrei Bursuc Antonin Vobecky Renaud Marlet Patrick Pérez

CVPR Workshop WAD 2024

Abstract

We introduce a self-supervised pretraining method, called OccFeat, for camera-only Bird's-Eye-View (BEV) segmentation networks. With OccFeat, we pretrain a BEV network via occupancy prediction and feature distillation tasks. Occupancy prediction provides a 3D geometric understanding of the scene to the model. However, the geometry learned is class-agnostic. Hence, we add semantic information to the model in the 3D space through distillation from a self-supervised pretrained image foundation model. Models pretrained with our method exhibit improved BEV semantic segmentation performance, particularly in low-data scenarios. Moreover, empirical results affirm the efficacy of integrating feature distillation with 3D occupancy prediction in our pretraining approach.
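
To make the two pretraining tasks concrete, here is a minimal NumPy sketch of how an occupancy term and a feature distillation term could be combined into one objective. The abstract does not give the loss definitions, so the specific choices are assumptions for illustration only: binary cross-entropy for occupancy, cosine distance for distillation, distillation restricted to occupied voxels, and a scalar `weight` balancing the two terms. The function names and tensor shapes (`(X, Y, Z)` for occupancy, `(X, Y, Z, C)` for features) are likewise hypothetical.

```python
import numpy as np


def occupancy_loss(occ_logits, occ_target):
    """Binary cross-entropy over all voxels, computed from raw logits.

    Assumed shapes: occ_logits and occ_target are both (X, Y, Z);
    occ_target holds 1.0 for occupied voxels and 0.0 for empty ones.
    """
    x, y = occ_logits, occ_target
    # Numerically stable BCE-with-logits: max(x, 0) - x*y + log(1 + exp(-|x|))
    per_voxel = np.maximum(x, 0.0) - x * y + np.log1p(np.exp(-np.abs(x)))
    return float(np.mean(per_voxel))


def distillation_loss(pred_feats, teacher_feats, occ_target, eps=1e-8):
    """Mean cosine distance between predicted and teacher features.

    Assumption: only occupied voxels carry a teacher feature, so the
    loss is averaged over those voxels. Feature shapes are (X, Y, Z, C).
    """
    mask = occ_target > 0.5
    if not mask.any():
        return 0.0
    p = pred_feats[mask]      # (N, C) predicted features on occupied voxels
    t = teacher_feats[mask]   # (N, C) teacher features on the same voxels
    p = p / (np.linalg.norm(p, axis=-1, keepdims=True) + eps)
    t = t / (np.linalg.norm(t, axis=-1, keepdims=True) + eps)
    return float(np.mean(1.0 - np.sum(p * t, axis=-1)))


def pretraining_loss(occ_logits, occ_target, pred_feats, teacher_feats, weight=1.0):
    """Sum of the occupancy term and the weighted distillation term."""
    return occupancy_loss(occ_logits, occ_target) + weight * distillation_loss(
        pred_feats, teacher_feats, occ_target
    )
```

In this sketch the occupancy term supplies the class-agnostic geometric signal, while the distillation term is where semantic information from the image foundation model would enter; how the teacher features are lifted into the 3D voxel grid is outside what the abstract describes and is not modeled here.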

BibTeX

@misc{sirkogalouchenko2024occfeat,
      title={OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks}, 
      author={Sophia Sirko-Galouchenko and Alexandre Boulch and Spyros Gidaris and Andrei Bursuc and Antonin Vobecky and Patrick Pérez and Renaud Marlet},
      year={2024},
      eprint={2404.14027},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
