TY - JOUR
T1 - Data-Driven Policy Learning Methods from Biological Behavior
T2 - A Systematic Review
AU - Wang, Yuchen
AU - Hayashibe, Mitsuhiro
AU - Owaki, Dai
N1 - Publisher Copyright: © 2024 by the authors.
PY - 2024/5
Y1 - 2024/5
N2 - Policy learning enables agents to learn how to map states to actions, thus enabling adaptive and flexible behavioral generation in complex environments. Policy learning methods are fundamental to reinforcement learning techniques. However, as problem complexity and the requirement for motion flexibility increase, traditional methods that rely on manual design have revealed their limitations. Conversely, data-driven policy learning focuses on extracting strategies from biological behavioral data and aims to replicate these behaviors in real-world environments. This approach enhances the adaptability of agents to dynamic substrates. Furthermore, this approach has been extensively applied in autonomous driving, robot control, and interpretation of biological behavior. In this review, we survey developments in data-driven policy-learning algorithms over the past decade. We categorized them into the following three types according to the purpose of the method: (1) imitation learning (IL), (2) inverse reinforcement learning (IRL), and (3) causal policy learning (CPL). We describe the classification principles, methodologies, progress, and applications of each category in detail. In addition, we discuss the distinct features and practical applications of these methods. Finally, we explore the challenges these methods face and prospective directions for future research.
AB - Policy learning enables agents to learn how to map states to actions, thus enabling adaptive and flexible behavioral generation in complex environments. Policy learning methods are fundamental to reinforcement learning techniques. However, as problem complexity and the requirement for motion flexibility increase, traditional methods that rely on manual design have revealed their limitations. Conversely, data-driven policy learning focuses on extracting strategies from biological behavioral data and aims to replicate these behaviors in real-world environments. This approach enhances the adaptability of agents to dynamic substrates. Furthermore, this approach has been extensively applied in autonomous driving, robot control, and interpretation of biological behavior. In this review, we survey developments in data-driven policy-learning algorithms over the past decade. We categorized them into the following three types according to the purpose of the method: (1) imitation learning (IL), (2) inverse reinforcement learning (IRL), and (3) causal policy learning (CPL). We describe the classification principles, methodologies, progress, and applications of each category in detail. In addition, we discuss the distinct features and practical applications of these methods. Finally, we explore the challenges these methods face and prospective directions for future research.
KW - behavior strategy
KW - causal inference
KW - imitation learning
KW - inverse reinforcement learning
KW - policy learning
UR - http://www.scopus.com/inward/record.url?scp=85194161185&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85194161185&partnerID=8YFLogxK
U2 - 10.3390/app14104038
DO - 10.3390/app14104038
M3 - Review article
AN - SCOPUS:85194161185
SN - 2076-3417
VL - 14
JO - Applied Sciences (Switzerland)
JF - Applied Sciences (Switzerland)
IS - 10
M1 - 4038
ER -