Abstract
It is an easy task for the human visual system to gaze continuously at an object moving in three-dimensional (3-D) space. While tracking the object, human vision seems able to comprehend its 3-D shape with binocular vision. We conjecture that, in the human visual system, the function of comprehending the 3-D shape is essential for robust tracking of a moving object. To examine this conjecture, we constructed an experimental binocular vision system for motion tracking. The system is composed of a pair of active pan-tilt cameras and a robot arm. The cameras simulate the two eyes of a human, while the robot arm simulates the motion of the human body below the neck. The two active cameras are controlled so as to fix their gaze at a particular point on an object surface. The shape of the object surface around that point is reconstructed in real time from the two images taken by the cameras, based on the differences in image brightness. If the two cameras successfully gaze at a single point on the object surface, it is possible to reconstruct the local object shape in real time. At the same time, the reconstructed shape is used to keep the fixation point on the object surface, which enables robust tracking of the object. Thus these two processes, reconstruction of the 3-D shape and maintenance of the fixation point, must be mutually connected and form one closed loop. We demonstrate the effectiveness of this framework for visual tracking through several experiments.
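The gaze-fixation geometry described in the abstract can be illustrated with a minimal planar sketch. It is not the authors' method: the abstract gives no camera model, so the layout here (two cameras on the x-axis separated by a hypothetical `baseline`, pan angles measured from the forward z-axis, tilt ignored) and the function names `vergence_angles` and `fixation_point` are assumptions made purely for illustration.

```python
import math


def vergence_angles(baseline, x, z):
    """Pan angles (radians) that point both cameras at the point (x, z).

    Assumed layout: left camera at (-baseline/2, 0), right camera at
    (+baseline/2, 0), both facing +z; positive pan rotates toward +x.
    """
    pan_left = math.atan2(x + baseline / 2.0, z)
    pan_right = math.atan2(x - baseline / 2.0, z)
    return pan_left, pan_right


def fixation_point(baseline, pan_left, pan_right):
    """Intersect the two optical axes to recover the fixated point (x, z)."""
    denom = math.tan(pan_left) - math.tan(pan_right)
    if denom <= 1e-12:
        # Parallel or diverging axes: no fixation point in front of the cameras.
        raise ValueError("optical axes do not converge in front of the cameras")
    z = baseline / denom
    x = -baseline / 2.0 + z * math.tan(pan_left)
    return x, z


if __name__ == "__main__":
    # Round trip: aim at a known point, then recover it from the pan angles.
    left, right = vergence_angles(0.2, 0.1, 1.0)
    print(fixation_point(0.2, left, right))
```

In a closed loop of the kind the abstract describes, the recovered point would feed back into the camera commands for the next frame; the brightness-based shape reconstruction itself is outside the scope of this sketch.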

Field | Value |
---|---|
Original language | English |
Pages (from-to) | 1057-1072 |
Number of pages | 16 |
Journal | Advanced Robotics |
Volume | 17 |
Issue number | 10 |
Publication status | Published - 2003 |
Keywords
- Active vision
- Binocular vision
- Object tracking
- Robot vision
- Three-dimensional shape reconstruction