TY - GEN
T1 - Analysis of human learning process on manual control of complex systems
AU - Goto, Takakuni
AU - Homma, Noriyasu
AU - Yoshizawa, Makoto
AU - Abe, Kenichi
PY - 2006
Y1 - 2006
N2 - In this paper, a novel analysis technique is applied to investigate human operators' trial-and-error learning process in controlling a nonholonomic system, a 2-link planar underactuated manipulator (2PUAM). The core of the technique is to use a value function from the reinforcement learning scheme to reveal how the operators find a control strategy. An advantage of the proposed technique over others is that transitions of the value function may explain changes in the operators' strategies during the learning process. According to the results of the analysis, the operators tended to explore an objective trajectory first and then shift to tracking control of that trajectory. The tracking was accompanied by acceleration to reach the goal faster. Interestingly, the acceleration disturbs the objective trajectory because of the complex dynamics of the target and induces further exploration for better trajectories. The fact that this phase transition structure under an unsupervised learning environment is consistent with previously reported results for a supervised case implies that the structure may be a general property of the human learning process.
KW - 2-link planar underactuated manipulator
KW - Manual control
KW - Reinforcement learning
KW - Value function
UR - http://www.scopus.com/inward/record.url?scp=34548119480&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34548119480&partnerID=8YFLogxK
U2 - 10.1109/ICSMC.2006.384955
DO - 10.1109/ICSMC.2006.384955
M3 - Conference contribution
AN - SCOPUS:34548119480
SN - 1424401003
SN - 9781424401000
T3 - Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
SP - 1648
EP - 1653
BT - 2006 IEEE International Conference on Systems, Man and Cybernetics
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2006 IEEE International Conference on Systems, Man and Cybernetics
Y2 - 8 October 2006 through 11 October 2006
ER -