Learning Dense Correspondences for Video Objects

Wen Chi Chin, Zih Jian Jhang, Hwann Tzong Chen, Koichi Ito

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


We introduce a learning-based method for extracting distinctive features on video objects. From the extracted features, we derive dense correspondences between the object in the current video frame and a reference template, and then use these correspondences to identify grasping points on the object. We train a deep-learning model to predict dense feature maps, using training data collected by solving simultaneous localization and mapping (SLAM). Further, a new feature-aggregation technique based on the optical flow between consecutive frames integrates multiple feature maps to reduce uncertainty. We also use the optical flow information to assess the reliability of feature matching. The experimental results show that our approach effectively reduces unreliable correspondences and thus improves the matching accuracy.
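As a rough illustration of the two ideas named in the abstract, the sketch below matches dense descriptor maps by nearest neighbour and averages a feature map with the previous one warped along the optical flow. It is not the authors' implementation: the function names `match_dense` and `aggregate_with_flow`, the Euclidean distance, the equal-weight average, and the backward-flow convention (per-pixel `(dy, dx)` pointing from the current frame into the previous one) are all assumptions made for the example.

```python
import numpy as np

def match_dense(template_feat, frame_feat):
    """For each template pixel, return the (row, col) of the nearest frame descriptor.

    Both inputs are (H, W, D) descriptor maps; the spatial sizes may differ.
    Nearest-neighbour matching under Euclidean distance is an assumption here.
    """
    h, w, d = frame_feat.shape
    t = template_feat.reshape(-1, d)
    f = frame_feat.reshape(-1, d)
    # Squared Euclidean distance between every template and frame descriptor.
    dist = (t ** 2).sum(1)[:, None] - 2.0 * t @ f.T + (f ** 2).sum(1)[None, :]
    idx = dist.argmin(axis=1)
    return np.stack(np.unravel_index(idx, (h, w)), axis=1)

def aggregate_with_flow(prev_feat, cur_feat, flow):
    """Average cur_feat with prev_feat warped along an assumed backward flow.

    flow[r, c] = (dy, dx) is the offset from pixel (r, c) in the current frame
    to its position in the previous frame, rounded to the nearest pixel.
    """
    h, w, _ = cur_feat.shape
    rows, cols = np.mgrid[0:h, 0:w]
    src_r = np.clip(np.rint(rows + flow[..., 0]).astype(int), 0, h - 1)
    src_c = np.clip(np.rint(cols + flow[..., 1]).astype(int), 0, w - 1)
    return 0.5 * (cur_feat + prev_feat[src_r, src_c])
```

In this sketch, `match_dense(template, frame)` returns one frame coordinate per template pixel, and `aggregate_with_flow` could be applied before matching so that descriptors from consecutive frames are combined.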

Original language: English
Title of host publication: 2019 IEEE International Conference on Image Processing, ICIP 2019 - Proceedings
Publisher: IEEE Computer Society
Number of pages: 5
ISBN (Electronic): 9781538662496
Publication status: Published - 2019 Sept
Event: 26th IEEE International Conference on Image Processing, ICIP 2019 - Taipei, Taiwan, Province of China
Duration: 2019 Sept 22 → 2019 Sept 25

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
ISSN (Print): 1522-4880


Conference: 26th IEEE International Conference on Image Processing, ICIP 2019
Country/Territory: Taiwan, Province of China


  • dense correspondence
  • feature map aggregation
  • optical flow
  • visual descriptor

