TY - GEN
T1 - Learning to Bundle-adjust
T2 - 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021
AU - Tanaka, Tetsuya
AU - Sasagawa, Yukihiro
AU - Okatani, Takayuki
N1 - Funding Information:
This work was partly supported by JSPS KAKENHI Grant Numbers 20H05952 and JP19H01110.
Publisher Copyright:
© 2021 IEEE
PY - 2021
Y1 - 2021
N2 - Bundle adjustment (BA) accounts for a large portion of the execution time of structure from motion (SfM) and visual SLAM. Local BA over the several most recent keyframes plays a crucial role in visual SLAM. Its execution time must be short enough for robust tracking; this is especially critical for embedded systems with limited computational resources. This study proposes a learning-based bundle adjuster built on a graph network. It runs faster than conventional optimization-based BA and can be used in its place. The graph network operates on a graph whose nodes are keyframes and landmarks and whose edges represent landmark visibility. It receives the initial parameter values as input and predicts the updates that move them to their optimal values. Internally, it uses an intermediate representation of the inputs whose design is inspired by the normal equation of the Levenberg-Marquardt method. It is trained with the sum of reprojection errors as the loss function. Experiments show that the proposed method produces parameter estimates with slightly lower accuracy in 1/60 to 1/10 of the time required by conventional BA.
UR - http://www.scopus.com/inward/record.url?scp=85127819370&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85127819370&partnerID=8YFLogxK
U2 - 10.1109/ICCV48922.2021.00619
DO - 10.1109/ICCV48922.2021.00619
M3 - Conference contribution
AN - SCOPUS:85127819370
T3 - Proceedings of the IEEE International Conference on Computer Vision
SP - 6230
EP - 6239
BT - Proceedings - 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 11 October 2021 through 17 October 2021
ER -