Research

Our work spans four intertwined axes.

3D Perception and Scene Understanding

Autonomous vehicles rely on cameras, LiDAR, radar, and ultrasonic sensors to perceive their surroundings. We study how to fuse and interpret these multi-modal signals to build accurate 3D representations of the driving scene, spanning object detection, semantic segmentation, depth estimation, motion forecasting, and pose estimation.

Foundation Models and World Modeling

Large-scale pretrained models can go beyond fixed ontologies and adapt to a wide variety of downstream tasks. We investigate vision and vision-language foundation models, as well as world models that learn to simulate and predict how driving scenes evolve over time, enabling better generalization with less task-specific supervision.

Physical AI and End-to-End Planning

Rather than treating perception and decision-making as separate modules, end-to-end approaches learn to map sensor inputs directly to driving actions. We explore neural planning architectures and physical AI methods that reason jointly about scene understanding and trajectory planning, aiming for driving systems that are both simpler and more effective.

Robust, Reliable and Explainable Models

Safety-critical applications demand models that remain reliable under distribution shifts, adverse conditions, and unexpected inputs. We work on uncertainty estimation, domain generalization, robustness to corruptions and adversarial perturbations, and explainability methods that help practitioners understand and trust the decisions made by deep learning systems.