Binocular motion tracking by gaze fixation control and three-dimensional shape reconstruction

Yoshinori Satoh, Takayuki Okatani, Koichiro Deguchi

Research output: Contribution to journal › Article › peer-review

5 Citations (Scopus)


It is an easy task for the human visual system to gaze continuously at an object moving in three-dimensional (3-D) space. While tracking the object, human vision seems able to comprehend its 3-D shape with binocular vision. We conjecture that, in the human visual system, the function of comprehending the 3-D shape is essential for robust tracking of a moving object. In order to examine this conjecture, we constructed an experimental system of binocular vision for motion tracking. The system is composed of a pair of active pan-tilt cameras and a robot arm. The cameras are for simulating the two eyes of a human while the robot arm is for simulating the motion of the human body below the neck. The two active cameras are controlled so as to fix their gaze at a particular point on an object surface. The shape of the object surface around the point is reconstructed in real-time from the two images taken by the cameras based on the differences in the image brightness. If the two cameras successfully gaze at a single point on the object surface, it is possible to reconstruct the local object shape in real-time. At the same time, the reconstructed shape is used for keeping a fixation point on the object surface for gazing, which enables robust tracking of the object. Thus these two processes, reconstruction of the 3-D shape and maintaining the fixation point, must be mutually connected and form one closed loop. We demonstrate the effectiveness of this framework for visual tracking through several experiments.
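The geometric side of gaze fixation can be illustrated with a small sketch. It is not the authors' method: the abstract does not give the camera model or control law, so the planar setup, the function name `fixation_point`, the `baseline` parameter, and the angle convention below are all assumptions made for illustration. Under those assumptions, when both cameras gaze at the same surface point, their pan angles alone determine where that point is.

```python
import math

def fixation_point(baseline, pan_left, pan_right):
    """Locate the fixation point of two verging cameras (hypothetical planar model).

    Assumptions (not from the paper): the cameras sit at (-baseline/2, 0) and
    (+baseline/2, 0) in the x-z plane, z points forward, and each pan angle is
    measured in radians from the forward axis, positive toward +x.
    Returns (x, z) of the point where the two optical axes intersect.
    """
    # Each optical axis satisfies x = camera_x + z * tan(pan).
    # Equating the two rays gives z * (tan(pan_left) - tan(pan_right)) = baseline.
    spread = math.tan(pan_left) - math.tan(pan_right)
    if spread <= 0.0:
        # Parallel or diverging axes never meet in front of the cameras.
        raise ValueError("optical axes do not converge in front of the cameras")
    z = baseline / spread
    x = -baseline / 2.0 + z * math.tan(pan_left)
    return x, z

if __name__ == "__main__":
    # Cameras 0.2 m apart, both gazing at a point 1 m ahead and 0.1 m to the right.
    left = math.atan2(0.1 + 0.1, 1.0)   # angle from the left camera to the target
    right = math.atan2(0.1 - 0.1, 1.0)  # angle from the right camera to the target
    print(fixation_point(0.2, left, right))
```

In a closed-loop system of the kind the abstract describes, an estimate like this would be only one ingredient: the reconstructed local surface shape is what keeps the fixation point anchored on the object as it moves.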

Original language: English
Pages (from-to): 1057-1072
Number of pages: 16
Journal: Advanced Robotics
Issue number: 10
Publication status: Published - 2003


Keywords

  • Active vision
  • Binocular vision
  • Object tracking
  • Robot vision
  • Three-dimensional shape reconstruction

