3D object detection with deep learning

Félix Escalona Moncholi; Miguel Ángel Rodríguez González; Francisco Gómez; Jesús Martínez Gómez; Miguel Cazorla Quevedo



Escalona, Félix ^[1] ; Rodríguez, Ángel ; Gomez-Donoso, Francisco ^[1] ; Martínez-Gómez, Jesús ; Cazorla, Miguel ^[1]
[1] Universitat d'Alacant, Alicante, Spain
Published in: JoPha: Journal of Physical Agents, ISSN-e 1888-0258, Vol. 8, No. 1, 2017 (Special Issue on Advances on Physical Agents), pp. 3-10
Language: English
DOI: 10.14198/jopha.2017.8.1.02
Abstract
- Finding an appropriate environment representation is a crucial problem in robotics. 3D data has recently come into use thanks to the advent of low-cost RGB-D cameras. We propose a new way to represent a 3D map based on the information provided by an expert. Namely, the expert is the output of a Convolutional Neural Network trained with deep learning techniques. Relying on such information, we propose the generation of 3D maps using individual semantic labels, which are associated with objects in the environment. For each label, we obtain a partial 3D map whose data belong to the 3D perceptions, namely point clouds, that have an associated probability above a given threshold. The final map is obtained by registering and merging all these partial maps. The use of semantic labels provides us with a way to build the map while recognizing objects.
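The per-label thresholding and merging described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the frame layout, and the `threshold` value are hypothetical, and registration is replaced by a known per-frame translation purely as a stand-in (the paper's actual registration step is not specified in this record).

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]
# Hypothetical frame layout: (point cloud, translation offset, label -> probability)
Frame = Tuple[List[Point], Point, Dict[str, float]]


def build_partial_maps(frames: List[Frame], threshold: float) -> Dict[str, List[Point]]:
    """Group point clouds into one partial map per semantic label.

    A cloud contributes to a label's partial map only when the classifier
    probability for that label is above `threshold`.
    """
    partial_maps: Dict[str, List[Point]] = defaultdict(list)
    for points, offset, label_probs in frames:
        # Assumption: a known translation stands in for real registration.
        registered = [
            (x + offset[0], y + offset[1], z + offset[2]) for x, y, z in points
        ]
        for label, prob in label_probs.items():
            if prob > threshold:
                partial_maps[label].extend(registered)
    return dict(partial_maps)


def merge_maps(partial_maps: Dict[str, List[Point]]) -> List[Tuple[str, Point]]:
    """Merge all partial maps into one list of labelled points."""
    merged: List[Tuple[str, Point]] = []
    for label in sorted(partial_maps):
        merged.extend((label, p) for p in partial_maps[label])
    return merged


if __name__ == "__main__":
    frames: List[Frame] = [
        ([(0.0, 0.0, 0.0)], (1.0, 0.0, 0.0), {"chair": 0.9, "table": 0.2}),
        ([(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)], (0.0, 0.0, 0.0), {"chair": 0.4, "table": 0.8}),
    ]
    partial = build_partial_maps(frames, threshold=0.5)
    print(merge_maps(partial))
```

Keeping one partial map per label means each object category can be inspected or re-registered on its own before the final merge.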