HabiCrowd: A High Performance Simulator for Crowd-Aware Visual Navigation

IROS 2024

An Dinh Vuong1       Toan Nguyen1       Minh Nhat Vu2      
Baoru Huang3       Binh Huynh4       Thieu Vo5               Anh Nguyen6

1FPT Software AI Center   2ACIN - TU Wien   3Imperial College London  
4HUST   5Ton Duc Thang University   6University of Liverpool

HabiCrowd is a new dataset and benchmark for crowd-aware visual navigation that surpasses other benchmarks in terms of human diversity and computational utilization.

Abstract

Visual navigation, a foundational aspect of Embodied AI (E-AI), has been studied extensively in the past few years. While many 3D simulators have been introduced to support visual navigation tasks, few works have been directed towards incorporating human dynamics, leaving a gap between simulation and real-world applications. Furthermore, current 3D simulators that incorporate human dynamics have several limitations, particularly in terms of computational efficiency, which is a promise of E-AI simulators. To overcome these shortcomings, we introduce HabiCrowd, the first standard benchmark for crowd-aware visual navigation that integrates a crowd dynamics model with diverse human settings into photorealistic environments. Empirical evaluations demonstrate that our proposed human dynamics model achieves state-of-the-art performance in collision avoidance, while exhibiting superior computational efficiency compared to its counterparts. We leverage HabiCrowd to conduct several comprehensive studies on crowd-aware visual navigation tasks and human-robot interactions.

Key Contributions

UPL++, our proposed human dynamics model, achieves collision-free motion while requiring significantly less computation time than ORCA, the state-of-the-art human dynamics model used in iGibson-SN.

Comparison with related human dynamics models.

Furthermore, our simulator shows remarkably better computational utilization than related benchmarks. Its rendering speed is approximately two times faster than Isaac Sim and approximately three times faster than iGibson-SN, while its RAM utilization is approximately six times smaller than that of Isaac Sim. We also achieve a wider range of human diversity. More information can be found in our paper.

Comparison with related benchmarks.

An example of our human dynamics model UPL++ projected onto a 2D floor plan.
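
To make the collision-free claim more concrete, the sketch below shows one way collisions between simulated humans could be counted on a 2D floor plan. It is an illustration only, not the evaluation code used in the paper: the function name `count_collisions`, the disc-shaped agent model, and the 0.25 m radius are all assumptions introduced here.

```python
import math
from itertools import combinations


def count_collisions(trajectories, radius=0.25):
    """Count (timestep, agent pair) events where two agents overlap.

    Hypothetical helper, not part of HabiCrowd. Each agent is assumed to be
    a disc of the given radius (metres) on the 2D floor plan, and each
    trajectory is a list of (x, y) positions, one per timestep.
    """
    if not trajectories:
        return 0
    # Only compare timesteps that every agent has a position for.
    n_steps = min(len(t) for t in trajectories)
    collisions = 0
    for step in range(n_steps):
        for a, b in combinations(range(len(trajectories)), 2):
            ax, ay = trajectories[a][step]
            bx, by = trajectories[b][step]
            # Two discs overlap when their centres are closer than 2 * radius.
            if math.hypot(ax - bx, ay - by) < 2 * radius:
                collisions += 1
    return collisions


# Two agents walking head-on along the same line overlap at the middle step.
head_on = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(2.2, 0.0), (1.2, 0.0), (0.2, 0.0)],
]
# The same scenario, but the second agent sidesteps at the middle step.
sidestep = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(2.2, 0.0), (1.2, 0.8), (0.2, 0.0)],
]

print(count_collisions(head_on))
print(count_collisions(sidestep))
```

Under this assumed metric, a dynamics model would count as collision-free on a scenario when the function returns zero for the trajectories it produces.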

Crowd-aware Visual Navigation

Utilizing HabiCrowd, we benchmark two tasks: point-goal and object-goal navigation. We evaluate five state-of-the-art baselines drawn from both the crowd-aware navigation (robotics) and egocentric-based (computer vision) literature.

Comparison with related human dynamics models.

Acknowledgements

This GitHub page is borrowed from HyperNeRF. Special thanks to them!