Authors:
Philippe Pérez de San Roman 1,2; Pascal Desbarats 2; Jean-Philippe Domenger 2 and Axel Buendia 3,4
Affiliations:
1 ITECA, 264 Rue Fontchaudiere, 16000 Angoulême, France
2 Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, F-33400 Talence, France
3 CNAM-CEDRIC Paris, 292 Rue Saint Martin, 75003 Paris, France
4 SpirOps, 8 Passage de la Bonne Graine, 75011 Paris, France
Keyword(s):
Deep Learning, 6-DOF Pose Estimation, 3D Detection, Dataset, RGB-D.
Abstract:
Maintenance is inevitable, time-consuming, expensive, and risky for both production and maintenance operators. Porting maintenance support applications to mixed reality (MR) headsets would ease operations. To function, such an application needs to anchor 3D graphics onto real objects, i.e. locate and track real-world objects in three dimensions. This task is known in the computer vision community as Six Degrees of Freedom (6-DoF) Pose Estimation and is best solved using Convolutional Neural Networks (CNNs). Training them requires numerous examples, but acquiring real labeled images for 6-DoF pose estimation is a challenge in its own right. In this article, we first present a thorough review of existing non-synthetic datasets for 6-DoF pose estimation. This review identifies several reasons why synthetic training data has been favored over real training data. Yet nothing can replace real images. We then show that the limitations faced by previous datasets can be overcome by presenting a new methodology for acquiring labeled images. Finally, we present a new dataset named NEMA that allows deep learning methods to be trained without the need for synthetic data.