TY - JOUR
T1 - FUSION SPARSE AND SHAPING REWARD FUNCTION IN SOFT ACTOR-CRITIC DEEP REINFORCEMENT LEARNING FOR MOBILE ROBOT NAVIGATION
AU - Bakar, Mohamad Hafiz Abu
AU - Shamsudin, Abu Ubaidah
AU - Soomro, Zubair Adil
AU - Tadokoro, Satoshi
AU - Salaan, C. J.
N1 - Publisher Copyright:
© 2024, Penerbit UTM Press. All rights reserved.
PY - 2024/3
Y1 - 2024/3
N2 - Recent advances in autonomous robots have been driven by the development of new technologies. Deep Reinforcement Learning (DRL) allows a system to operate autonomously, so that the robot learns its next movement through interaction with the environment. Because robots require continuous actions, Soft Actor-Critic Deep Reinforcement Learning (SAC DRL) is considered a state-of-the-art DRL solution: SAC can control continuous actions and thus produce more precise movements. Although SAC is fundamentally robust against unpredictability, weaknesses have been identified, particularly in the exploration process, which affects learning accuracy and convergence speed. To address this issue, this study uses a reward function appropriate to the system to guide the learning process. The research proposes several types of reward functions based on sparse and shaping rewards in the SAC method to investigate their effect on mobile robot learning. Experiments show that fusing sparse and shaping rewards in SAC DRL enables the robot to navigate successfully to the target position and also increases accuracy, with an average error of 4.99%.
AB - Recent advances in autonomous robots have been driven by the development of new technologies. Deep Reinforcement Learning (DRL) allows a system to operate autonomously, so that the robot learns its next movement through interaction with the environment. Because robots require continuous actions, Soft Actor-Critic Deep Reinforcement Learning (SAC DRL) is considered a state-of-the-art DRL solution: SAC can control continuous actions and thus produce more precise movements. Although SAC is fundamentally robust against unpredictability, weaknesses have been identified, particularly in the exploration process, which affects learning accuracy and convergence speed. To address this issue, this study uses a reward function appropriate to the system to guide the learning process. The research proposes several types of reward functions based on sparse and shaping rewards in the SAC method to investigate their effect on mobile robot learning. Experiments show that fusing sparse and shaping rewards in SAC DRL enables the robot to navigate successfully to the target position and also increases accuracy, with an average error of 4.99%.
KW - Deep Reinforcement Learning
KW - Mobile robot navigation
KW - Reward function
KW - Shaping reward
KW - Soft Actor Critic Deep Reinforcement Learning (SAC DRL)
KW - Sparse reward
UR - http://www.scopus.com/inward/record.url?scp=85185671528&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85185671528&partnerID=8YFLogxK
U2 - 10.11113/jurnalteknologi.v86.20147
DO - 10.11113/jurnalteknologi.v86.20147
M3 - Article
AN - SCOPUS:85185671528
SN - 0127-9696
VL - 86
SP - 37
EP - 49
JO - Jurnal Teknologi
JF - Jurnal Teknologi
IS - 2
ER -