Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition

Deep learning based methods have achieved unprecedented success in solving several computer vision problems involving RGB images. However, this level of success is yet to be seen on RGB-D images owing to two major challenges in this domain: training data deficiency and multi-modality input dissimila...


Bibliographic Details
Main Authors:	Mohd Zaki, Hasan Firdaus, Shafait, Faisal, Mian, Ajmal
Format:	Proceeding Paper
Language:	English
Published:	IEEE 2016
Subjects:	QA75 Electronic computers. Computer science
Online Access:	http://irep.iium.edu.my/60177/ http://irep.iium.edu.my/60177/3/60177%20Convolutional%20Hypercube%20Pyramid.pdf http://irep.iium.edu.my/60177/2/60177%20Convolutional%20Hypercube%20Pyramid.scopus.pdf

_version_	1848785445688705024
author	Mohd Zaki, Hasan Firdaus Shafait, Faisal Mian, Ajmal
author_facet	Mohd Zaki, Hasan Firdaus Shafait, Faisal Mian, Ajmal
author_sort	Mohd Zaki, Hasan Firdaus
building	IIUM Repository
collection	Online Access
description	Deep learning based methods have achieved unprecedented success in solving several computer vision problems involving RGB images. However, this level of success is yet to be seen on RGB-D images owing to two major challenges in this domain: training data deficiency and multi-modality input dissimilarity. We present an RGB-D object recognition framework that addresses these two key challenges by effectively embedding depth and point cloud data into the RGB domain. We employ a convolutional neural network (CNN) pre-trained on RGB data as a feature extractor for both color and depth channels and propose a rich coarse-to-fine feature representation scheme, coined Hypercube Pyramid, that is able to capture discriminatory information at different levels of detail. Finally, we present a novel fusion scheme to combine the Hypercube Pyramid features with the activations of fully connected neurons to construct a compact representation prior to classification. By employing Extreme Learning Machines (ELM) as non-linear classifiers, we show that the proposed method outperforms ten state-of-the-art algorithms for several tasks in terms of recognition accuracy on the benchmark Washington RGB-D and 2D3D object datasets by a large margin (up to 50% reduction in error rate).
first_indexed	2025-11-14T16:53:16Z
format	Proceeding Paper
id	iium-60177
institution	International Islamic University Malaysia
institution_category	Local University
language	English
last_indexed	2025-11-14T16:53:16Z
publishDate	2016
publisher	IEEE
recordtype	eprints
repository_type	Digital Repository
spelling	iium-60177 2018-08-06T08:13:23Z http://irep.iium.edu.my/60177/ Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition Mohd Zaki, Hasan Firdaus Shafait, Faisal Mian, Ajmal QA75 Electronic computers. Computer science Deep learning based methods have achieved unprecedented success in solving several computer vision problems involving RGB images. However, this level of success is yet to be seen on RGB-D images owing to two major challenges in this domain: training data deficiency and multi-modality input dissimilarity. We present an RGB-D object recognition framework that addresses these two key challenges by effectively embedding depth and point cloud data into the RGB domain. We employ a convolutional neural network (CNN) pre-trained on RGB data as a feature extractor for both color and depth channels and propose a rich coarse-to-fine feature representation scheme, coined Hypercube Pyramid, that is able to capture discriminatory information at different levels of detail. Finally, we present a novel fusion scheme to combine the Hypercube Pyramid features with the activations of fully connected neurons to construct a compact representation prior to classification. By employing Extreme Learning Machines (ELM) as non-linear classifiers, we show that the proposed method outperforms ten state-of-the-art algorithms for several tasks in terms of recognition accuracy on the benchmark Washington RGB-D and 2D3D object datasets by a large margin (up to 50% reduction in error rate). IEEE 2016 Proceeding Paper PeerReviewed application/pdf en http://irep.iium.edu.my/60177/3/60177%20Convolutional%20Hypercube%20Pyramid.pdf application/pdf en http://irep.iium.edu.my/60177/2/60177%20Convolutional%20Hypercube%20Pyramid.scopus.pdf Mohd Zaki, Hasan Firdaus and Shafait, Faisal and Mian, Ajmal (2016) Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition. 
In: 2016 IEEE International Conference on Robotics and Automation (ICRA), 16-21 May 2016, Stockholm, Sweden. https://ieeexplore.ieee.org/document/7487310/ 10.1109/ICRA.2016.7487310
spellingShingle	QA75 Electronic computers. Computer science Mohd Zaki, Hasan Firdaus Shafait, Faisal Mian, Ajmal Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition
title	Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition
title_full	Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition
title_fullStr	Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition
title_full_unstemmed	Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition
title_short	Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition
title_sort	convolutional hypercube pyramid for accurate rgb-d object category and instance recognition
topic	QA75 Electronic computers. Computer science
url	http://irep.iium.edu.my/60177/ http://irep.iium.edu.my/60177/3/60177%20Convolutional%20Hypercube%20Pyramid.pdf http://irep.iium.edu.my/60177/2/60177%20Convolutional%20Hypercube%20Pyramid.scopus.pdf
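
The description in this record names Extreme Learning Machines (ELM) as the non-linear classifiers applied to the fused features. As a rough illustration of the general ELM idea only, the Python sketch below fixes a random hidden layer and solves for the output weights in closed form with ridge regression. It is not the authors' implementation: the function names (train_elm, predict_elm), the tanh activation, the hidden-layer size, the regularization value, and the toy two-cluster data are all assumptions made for this example, and the CNN and Hypercube Pyramid feature extraction steps are not reproduced.

```python
import numpy as np


def train_elm(X, y, n_hidden=64, reg=1e-3, seed=0):
    """Fit a basic ELM (hypothetical helper, not from the paper).

    X: (n_samples, n_features) feature matrix.
    y: (n_samples,) integer class labels starting at 0.
    """
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    # Hidden-layer weights and biases are drawn at random and never trained.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)              # hidden activations
    T = np.eye(n_classes)[y]            # one-hot targets
    # Output weights from regularized least squares (closed form).
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta


def predict_elm(model, X):
    """Return the predicted class index for each row of X."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)


if __name__ == "__main__":
    # Toy data standing in for extracted features (assumption for the demo).
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(-3, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)
    model = train_elm(X, y)
    print("training accuracy:", (predict_elm(model, X) == y).mean())
```

Because only the output weights are solved for, training reduces to a single linear solve, which is the usual reason given for pairing ELMs with high-dimensional pre-extracted features.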


Similar Items