TY - GEN
T1 - Towards generating simulated walking motion using position based deep reinforcement learning
AU - Jones, William
AU - Gangapurwala, Siddhant
AU - Havoutis, Ioannis
AU - Yoshida, Kazuya
N1 - Funding Information:
Acknowledgments. This research is supported by the UKRI and EPSRC (EP/R026084/1, EP/R026173/1, EP/S002383/1) and the EU H2020 project MEMMO (780684). This work has been conducted as part of ANYmal Research, a community to advance legged robotics.
Publisher Copyright:
© Springer Nature Switzerland AG 2019.
PY - 2019
Y1 - 2019
N2 - Much of robotics research aims to develop control solutions that exploit the machine's dynamics to achieve extraordinarily agile behaviour [1]. This, however, is limited by the use of traditional model-based control techniques such as model predictive control and quadratic programming. These solutions are often based on simplified mechanical models, which result in mechanically constrained and inefficient behaviour, thereby limiting the agility of the robotic system under development [2]. Treating the control of robotic systems as a reinforcement learning (RL) problem enables the use of model-free algorithms that attempt to learn a policy which maximizes the expected future (discounted) reward without inferring the effects of an executed action on the environment.
AB - Much of robotics research aims to develop control solutions that exploit the machine's dynamics to achieve extraordinarily agile behaviour [1]. This, however, is limited by the use of traditional model-based control techniques such as model predictive control and quadratic programming. These solutions are often based on simplified mechanical models, which result in mechanically constrained and inefficient behaviour, thereby limiting the agility of the robotic system under development [2]. Treating the control of robotic systems as a reinforcement learning (RL) problem enables the use of model-free algorithms that attempt to learn a policy which maximizes the expected future (discounted) reward without inferring the effects of an executed action on the environment.
KW - ANYmal
KW - Proximal Policy Optimization
KW - Reinforcement learning
KW - Walking robot
UR - http://www.scopus.com/inward/record.url?scp=85073911518&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85073911518&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-25332-5_42
DO - 10.1007/978-3-030-25332-5_42
M3 - Conference contribution
AN - SCOPUS:85073911518
SN - 9783030253318
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 467
EP - 470
BT - Towards Autonomous Robotic Systems - 20th Annual Conference, TAROS 2019, Proceedings
A2 - Althoefer, Kaspar
A2 - Konstantinova, Jelizaveta
A2 - Zhang, Ketao
PB - Springer Verlag
T2 - 20th Towards Autonomous Robotic Systems Conference, TAROS 2019
Y2 - 3 July 2019 through 5 July 2019
ER -