TY - JOUR
T1 - A cluster-based disambiguation method using pose consistency verification for structure from motion
AU - Gong, Ye
AU - Zhou, Pengwei
AU - Liu, Changfeng
AU - Yu, Yan
AU - Yao, Jian
AU - Yuan, Wei
AU - Li, Li
N1 - Publisher Copyright: © 2024
PY - 2024/3
Y1 - 2024/3
N2 - Structure from motion (SfM) recovers scene structure and camera poses from feature matches, and it is challenged by ambiguous scenes. Real environments contain a large number of ambiguous scenes with many duplicate structures and textures. This ambiguity leads to incorrect feature matches between images with similar appearance and causes geometric misalignment in SfM. To address this problem, recent methods have focused on investigating the inconsistencies in feature topology among multi-view images. However, the feature topology is derived directly from 2D images and is therefore susceptible to feature occlusion caused by changes in perspective. We therefore propose a new method that disambiguates scenes using pose consistency rather than feature consistency. Pose consistency is evaluated in 3D geometric space, which is less sensitive to feature occlusion, so it is more robust than feature consistency. Our core motivation is that incorrect matches between ambiguous images cause pose deviations from the global poses generated by correct matches. To detect this pose deviation, we first combine local and global information of the scene to generate reliable global camera poses. The local information of each image is obtained by image clustering, and it strengthens the global information, which is represented as a verified maximum spanning tree of clusters. Then, the global poses serve as the reference for further pose consistency verification. The global poses also enable us to perform both rotation and translation consistency verification for uncertain matches. During pose consistency verification, the pose deviation calculated at the image level may be too small to be noticed. We therefore perform pose consistency verification at the cluster level instead of the image level to amplify the pose deviation.
In the experiments, we compared our approach with several state-of-the-art methods, including COLMAP, Geodesic-SfM and TC-SfM, on both ambiguous and regular datasets. The results demonstrate that our approach achieves the best robustness: it is the only one that succeeds on all ambiguous image sequences (14/14). The quantitative evaluation results on image sequences with ground truth also show that our approach achieves the best accuracy (average RMSE of translation = 0.109, average RMSE of rotation = 0.827) among all methods. The source code of our approach is publicly available at https://github.com/gongyeted/MA-SfM.
AB - Structure from motion (SfM) recovers scene structure and camera poses from feature matches, and it is challenged by ambiguous scenes. Real environments contain a large number of ambiguous scenes with many duplicate structures and textures. This ambiguity leads to incorrect feature matches between images with similar appearance and causes geometric misalignment in SfM. To address this problem, recent methods have focused on investigating the inconsistencies in feature topology among multi-view images. However, the feature topology is derived directly from 2D images and is therefore susceptible to feature occlusion caused by changes in perspective. We therefore propose a new method that disambiguates scenes using pose consistency rather than feature consistency. Pose consistency is evaluated in 3D geometric space, which is less sensitive to feature occlusion, so it is more robust than feature consistency. Our core motivation is that incorrect matches between ambiguous images cause pose deviations from the global poses generated by correct matches. To detect this pose deviation, we first combine local and global information of the scene to generate reliable global camera poses. The local information of each image is obtained by image clustering, and it strengthens the global information, which is represented as a verified maximum spanning tree of clusters. Then, the global poses serve as the reference for further pose consistency verification. The global poses also enable us to perform both rotation and translation consistency verification for uncertain matches. During pose consistency verification, the pose deviation calculated at the image level may be too small to be noticed. We therefore perform pose consistency verification at the cluster level instead of the image level to amplify the pose deviation.
In the experiments, we compared our approach with several state-of-the-art methods, including COLMAP, Geodesic-SfM and TC-SfM, on both ambiguous and regular datasets. The results demonstrate that our approach achieves the best robustness: it is the only one that succeeds on all ambiguous image sequences (14/14). The quantitative evaluation results on image sequences with ground truth also show that our approach achieves the best accuracy (average RMSE of translation = 0.109, average RMSE of rotation = 0.827) among all methods. The source code of our approach is publicly available at https://github.com/gongyeted/MA-SfM.
KW - Duplicate structure disambiguation
KW - Image-based 3D reconstruction
KW - Pose consistency verification
KW - Structure from motion
UR - http://www.scopus.com/inward/record.url?scp=85186418568&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85186418568&partnerID=8YFLogxK
U2 - 10.1016/j.isprsjprs.2024.02.016
DO - 10.1016/j.isprsjprs.2024.02.016
M3 - Article
AN - SCOPUS:85186418568
SN - 0924-2716
VL - 209
SP - 398
EP - 414
JO - ISPRS Journal of Photogrammetry and Remote Sensing
JF - ISPRS Journal of Photogrammetry and Remote Sensing
ER -