Towards generating simulated walking motion using position based deep reinforcement learning

William Jones, Siddhant Gangapurwala, Ioannis Havoutis, Kazuya Yoshida

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

Much of robotics research aims to develop control solutions that exploit the machine's dynamics in order to achieve extraordinarily agile behaviour [1]. This, however, is limited by the use of traditional model-based control techniques such as model predictive control and quadratic programming. These solutions are often based on simplified mechanical models that result in mechanically constrained and inefficient behaviour, thereby limiting the agility of the robotic system in development [2]. Treating the control of robotic systems as a reinforcement learning (RL) problem enables the use of model-free algorithms that attempt to learn a policy which maximizes the expected future (discounted) reward without inferring the effects of an executed action on the environment.
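To make the objective mentioned in the abstract concrete, the quantity a model-free RL policy seeks to maximize in expectation is the discounted return. A minimal sketch follows; the reward values and discount factor are illustrative placeholders, not taken from the paper.

```python
def discounted_return(rewards, gamma=0.99):
    """Discounted return G_t = sum_k gamma**k * r_{t+k}.

    Computed backwards over the reward sequence so each step
    folds in the discounted value of all later rewards.
    """
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Illustrative episode rewards (hypothetical values).
rewards = [1.0, 0.0, 2.0]
print(discounted_return(rewards, gamma=0.9))  # 1 + 0.9*(0 + 0.9*2) = 2.62
```

A model-free learner estimates this return from sampled rollouts alone, which is what lets it avoid modelling how an action changes the environment.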

Original language: English
Title of host publication: Towards Autonomous Robotic Systems - 20th Annual Conference, TAROS 2019, Proceedings
Editors: Kaspar Althoefer, Jelizaveta Konstantinova, Ketao Zhang
Publisher: Springer Verlag
Pages: 467-470
Number of pages: 4
ISBN (Print): 9783030253318
DOIs
Publication status: Published - 2019
Event: 20th Towards Autonomous Robotic Systems Conference, TAROS 2019 - London, United Kingdom
Duration: 2019 Jul 3 – 2019 Jul 5

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11650 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 20th Towards Autonomous Robotic Systems Conference, TAROS 2019
Country/Territory: United Kingdom
City: London
Period: 19/7/3 – 19/7/5

Keywords

  • ANYmal
  • Proximal Policy Optimization
  • Reinforcement learning
  • Walking robot
