Reinforcement Learning for Robotic Assembly Using Non-Diagonal Stiffness Matrix

Masahide Oikawa, Tsukasa Kusakabe, Kyo Kutsuzawa, Sho Sakaino, Toshiaki Tsuji

Research output: Contribution to journal › Article › peer-review

18 Citations (Scopus)


Contact-rich tasks, in which multiple contact transitions occur over a series of operations, have been extensively studied for task automation. Precision assembly, a typical example of a contact-rich task, requires a rapid response to changes in contact state. This letter therefore proposes a local trajectory planning method for precision assembly with a short response time. Because the non-diagonal components of a stiffness matrix can induce motion at high sampling frequencies, we use this concept to design a stiffness matrix that guides the motion of an object, and we propose a method to control it. We introduce reinforcement learning (RL) to select the stiffness matrix, because the relationship between the desired direction of motion and the sensor response is difficult to model. An architecture with different sampling rates for RL and admittance control enables a rapid response, owing to the short time constant of the local trajectory modification. The effectiveness of the method is verified experimentally on two contact-rich tasks: inserting a peg into a hole and inserting a gear. With the proposed method, the average total time needed to insert the peg into the hole is 1.64 s, less than half the time reported by the best of the existing state-of-the-art studies.
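The key mechanism in the abstract — a non-diagonal stiffness matrix coupling the axes of an admittance controller, so that a contact force along one axis induces compliant motion along another — can be sketched as follows. This is a minimal illustration with hypothetical gains and a generic admittance law, not the authors' implementation; in the paper, an RL policy selects the stiffness matrix at a low rate while the admittance loop runs at a high rate.

```python
import numpy as np

DT = 0.001  # admittance-loop period (1 kHz, illustrative)

def admittance_step(x, x_dot, f_ext, M, D, K, x_ref):
    """One explicit-Euler step of M*xdd + D*xd + K*(x - x_ref) = f_ext."""
    x_dd = np.linalg.solve(M, f_ext - D @ x_dot - K @ (x - x_ref))
    x_dot = x_dot + x_dd * DT
    x = x + x_dot * DT
    return x, x_dot

# Hypothetical 2-D (lateral x, normal y) parameters.
M = np.diag([1.0, 1.0])    # virtual mass
D = np.diag([50.0, 50.0])  # virtual damping

# A non-diagonal stiffness matrix (one candidate an RL agent could select):
# the off-diagonal term couples the normal force into lateral motion,
# steering the part sideways toward the hole during contact.
K_coupled = np.array([[500.0, 300.0],
                      [300.0, 500.0]])

x = np.zeros(2)
x_dot = np.zeros(2)
f_contact = np.array([0.0, -10.0])  # purely normal contact force

for _ in range(1000):  # simulate 1 s of contact
    x, x_dot = admittance_step(x, x_dot, f_contact, M, D, K_coupled, np.zeros(2))

# The off-diagonal stiffness produces lateral (x) displacement even though
# the applied force is purely normal; a diagonal K would not.
print(x)
```

With a diagonal stiffness matrix the same force would only compress the normal axis; the coupling term is what lets the force response itself generate the search motion.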

Original language: English
Article number: 9361338
Pages (from-to): 2737-2744
Number of pages: 8
Journal: IEEE Robotics and Automation Letters
Issue number: 2
Publication status: Published - 2021 Apr


  • Compliance and impedance control
  • compliant assembly
  • force and tactile sensing
  • reinforcement learning

