As autonomous vehicles are actively developed, it becomes crucial to give driving systems the capacity to explain their decisions. In this work, we focus on generating high-level driving explanations as the vehicle drives. We present BEEF, for BEhavior Explanation with Fusion, a deep architecture which explains the behavior of a trajectory prediction model. Supervised by annotations of human justifications for driving decisions, BEEF learns to fuse features from multiple levels. Leveraging recent advances in the multi-modal fusion literature, BEEF is carefully designed to model the correlations between high-level decision features and mid-level perceptual features. The flexibility and efficiency of our approach are validated with extensive experiments on the HDD and BDD-X datasets.
@article{beef2021,
  author  = {Hedi Ben{-}Younes and {\'{E}}loi Zablocki and Patrick P{\'{e}}rez and Matthieu Cord},
  title   = {Driving Behavior Explanation with Multi-level Fusion},
  journal = {Pattern Recognition (PR)},
  year    = {2022}
}