Visual Odometry Based on Convolutional Neural Networks for Large-Scale Scenes

EasyChair Preprint 413
10 pages • Date: August 9, 2018

Abstract

The main task of visual odometry (VO) is to measure camera motion and image depth, which is the basis of 3D reconstruction and the front end of simultaneous localization and mapping (SLAM). However, most existing methods have low accuracy or require advanced sensors. In order to predict camera pose and image depth simultaneously, with high accuracy, from image sequences captured by a regular camera, we train a novel framework named PD-Net, based on a convolutional neural network (CNN). It contains two main modules: a pose estimator that estimates the 6-DoF camera pose, and a depth estimator that computes the depth of the current view. The key to our proposed framework is that PD-Net comprises shared convolutional layers and then divides into two branches that estimate camera motion and image depth, respectively. Experiments on the KITTI and TUM datasets show that our method produces meaningful depth estimates and successfully estimates frame-to-frame camera rotations and translations in large-scale scenes, even texture-less ones. It outperforms previous methods in terms of accuracy and robustness.

Keyphrases: 3D reconstruction, CNN, depth prediction, SLAM, visual odometry, pose estimation
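The shared-encoder, two-branch idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' PD-Net implementation; all layer counts, kernel sizes, and weight initializations below are placeholder assumptions, chosen only to show how one feature extractor can feed both a 6-DoF pose head and a per-pixel depth head.

```python
import numpy as np

def conv2d(x, k):
    """Valid single-channel 2-D convolution via sliding windows."""
    kh, kw = k.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
frame_pair = rng.standard_normal((32, 32))  # stand-in for stacked input frames

# Shared encoder: two conv + ReLU layers (random placeholder weights)
shared = np.maximum(conv2d(frame_pair, rng.standard_normal((3, 3))), 0)
shared = np.maximum(conv2d(shared, rng.standard_normal((3, 3))), 0)
feat = shared.ravel()

# Pose branch: fully connected layer producing a 6-DoF vector
# (3 rotation + 3 translation parameters)
W_pose = rng.standard_normal((6, feat.size)) * 0.01
pose = W_pose @ feat

# Depth branch: a depth value per pixel at the feature-map resolution
W_depth = rng.standard_normal((shared.size, feat.size)) * 0.01
depth = (W_depth @ feat).reshape(shared.shape)

print(pose.shape, depth.shape)
```

Running this prints a `(6,)` pose vector and a depth map matching the feature-map resolution, mirroring the paper's description of shared convolutional layers splitting into motion and depth branches.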