TY - JOUR
T1 - Learning social compliant multi-modal distributions of human path in crowds
AU - Shi, X.
AU - Zhang, H.
AU - Yuan, W.
AU - Huang, D.
AU - Guo, Z.
AU - Shibasaki, R.
N1 - Publisher Copyright:
© 2022 Copernicus GmbH. All rights reserved.
PY - 2022/5/17
Y1 - 2022/5/17
N2 - Long-term human path forecasting in crowds is critical for autonomous moving platforms (such as autonomous driving cars and social robots) to avoid collisions and make high-quality plans. It is difficult for prediction systems to successfully take social interactions into account and predict a distribution of possible future paths in highly interactive and dynamic circumstances. In this paper, we develop a data-driven model for long-term trajectory prediction, which naturally accounts for social interactions through a spatio-temporal graph representation and predicts multiple modes of future trajectories. Unlike generative adversarial network (GAN) based models, which generate samples and then provide distributions of samples, we use mixture density functions to describe human motion and intuitively map the distribution of future paths with explicit densities. To prevent the model from collapsing into a single mode and to truly capture the intrinsic multi-modality, we further use a Winner-Takes-All (WTA) loss instead of computing the loss over all modes. Extensive experiments on several trajectory prediction benchmarks demonstrate that our method is able to capture the multi-modality of human motion and forecast the distributions of plausible futures in complex scenarios.
AB - Long-term human path forecasting in crowds is critical for autonomous moving platforms (such as autonomous driving cars and social robots) to avoid collisions and make high-quality plans. It is difficult for prediction systems to successfully take social interactions into account and predict a distribution of possible future paths in highly interactive and dynamic circumstances. In this paper, we develop a data-driven model for long-term trajectory prediction, which naturally accounts for social interactions through a spatio-temporal graph representation and predicts multiple modes of future trajectories. Unlike generative adversarial network (GAN) based models, which generate samples and then provide distributions of samples, we use mixture density functions to describe human motion and intuitively map the distribution of future paths with explicit densities. To prevent the model from collapsing into a single mode and to truly capture the intrinsic multi-modality, we further use a Winner-Takes-All (WTA) loss instead of computing the loss over all modes. Extensive experiments on several trajectory prediction benchmarks demonstrate that our method is able to capture the multi-modality of human motion and forecast the distributions of plausible futures in complex scenarios.
KW - Deep Learning
KW - LSTM
KW - Multi-modal
KW - Pedestrian Trajectory Prediction
KW - Social Interactions
UR - http://www.scopus.com/inward/record.url?scp=85132035536&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85132035536&partnerID=8YFLogxK
U2 - 10.5194/isprs-annals-V-4-2022-91-2022
DO - 10.5194/isprs-annals-V-4-2022-91-2022
M3 - Conference article
AN - SCOPUS:85132035536
SN - 2194-9042
VL - 5
SP - 91
EP - 98
JO - ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
JF - ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
IS - 4
T2 - 2022 24th ISPRS Congress on Imaging Today, Foreseeing Tomorrow, Commission IV
Y2 - 6 June 2022 through 11 June 2022
ER -