Model based methods for locating, enhancing and recognising low resolution objects in video

Visual perception is our most important sense which enables us to detect and recognise objects even in low detail video scenes. While humans are able to perform such object detection and recognition tasks reliably, most computer vision algorithms struggle with wide angle surveillance videos that mak...

Bibliographic Details
Main Author:	Kramer, Annika
Format:	Thesis
Language:	English
Published:	Curtin University 2009
Subjects:	object recognition; video surveillance; visual perception; object detection; video processing; person-specific shape information; face pose; 3D generic face model
Online Access:	http://hdl.handle.net/20.500.11937/585

_version_	1848743421202661376
author	Kramer, Annika
author_facet	Kramer, Annika
author_sort	Kramer, Annika
building	Curtin Institutional Repository
collection	Online Access
description	Visual perception is our most important sense which enables us to detect and recognise objects even in low detail video scenes. While humans are able to perform such object detection and recognition tasks reliably, most computer vision algorithms struggle with wide angle surveillance videos that make automatic processing difficult due to low resolution and poor detail objects. Additional problems arise from varying pose and lighting conditions as well as non-cooperative subjects. All these constraints pose problems for automatic scene interpretation of surveillance video, including object detection, tracking and object recognition. Therefore, the aim of this thesis is to detect, enhance and recognise objects by incorporating a priori information and by using model based approaches. Motivated by the increasing demand for automatic methods for object detection, enhancement and recognition in video surveillance, different aspects of the video processing task are investigated with a focus on human faces. In particular, the challenge of fully automatic face pose and shape estimation by fitting a deformable 3D generic face model under varying pose and lighting conditions is tackled. Principal Component Analysis (PCA) is utilised to build an appearance model that is then used within a particle filter based approach to fit the 3D face mask to the image. This recovers face pose and person-specific shape information simultaneously. Experiments demonstrate its use at different resolutions and under varying pose and lighting conditions. Following that, a combined tracking and super resolution approach enhances the quality of poor detail video objects. A 3D object mask is subdivided such that every mask triangle is smaller than a pixel when projected into the image and then used for model based tracking. The mask subdivision then allows for super resolution of the object by combining several video frames. This approach achieves better results than traditional super resolution methods without the use of interpolation or deblurring. Lastly, object recognition is performed in two different ways. The first recognition method is applied to characters and used for license plate recognition. A novel character model is proposed to create different appearances which are then matched with the image of unknown characters for recognition. This allows for simultaneous character segmentation and recognition, and high recognition rates are achieved for low resolution characters down to only five pixels in size. While this approach is only feasible for objects with a limited number of different appearances, like characters, the second recognition method is applicable to any object, including human faces. To this end, a generic 3D face model is automatically fitted to an image of a human face and recognition is performed on a mask level rather than image level. This approach requires neither an initial pose estimation nor the selection of feature points; the face alignment is provided implicitly by the mask fitting process.
first_indexed	2025-11-14T05:45:18Z
format	Thesis
id	curtin-20.500.11937-585
institution	Curtin University Malaysia
institution_category	Local University
language	English
last_indexed	2025-11-14T05:45:18Z
publishDate	2009
publisher	Curtin University
recordtype	eprints
repository_type	Digital Repository
spelling	curtin-20.500.11937-585 2017-02-20T06:42:15Z Model based methods for locating, enhancing and recognising low resolution objects in video Kramer, Annika object recognition video surveillance visual perception object detection video processing person-specific shape information face pose 3D generic face model Visual perception is our most important sense which enables us to detect and recognise objects even in low detail video scenes. While humans are able to perform such object detection and recognition tasks reliably, most computer vision algorithms struggle with wide angle surveillance videos that make automatic processing difficult due to low resolution and poor detail objects. Additional problems arise from varying pose and lighting conditions as well as non-cooperative subjects. All these constraints pose problems for automatic scene interpretation of surveillance video, including object detection, tracking and object recognition. Therefore, the aim of this thesis is to detect, enhance and recognise objects by incorporating a priori information and by using model based approaches. Motivated by the increasing demand for automatic methods for object detection, enhancement and recognition in video surveillance, different aspects of the video processing task are investigated with a focus on human faces. In particular, the challenge of fully automatic face pose and shape estimation by fitting a deformable 3D generic face model under varying pose and lighting conditions is tackled. Principal Component Analysis (PCA) is utilised to build an appearance model that is then used within a particle filter based approach to fit the 3D face mask to the image. This recovers face pose and person-specific shape information simultaneously. Experiments demonstrate its use at different resolutions and under varying pose and lighting conditions. Following that, a combined tracking and super resolution approach enhances the quality of poor detail video objects. A 3D object mask is subdivided such that every mask triangle is smaller than a pixel when projected into the image and then used for model based tracking. The mask subdivision then allows for super resolution of the object by combining several video frames. This approach achieves better results than traditional super resolution methods without the use of interpolation or deblurring. Lastly, object recognition is performed in two different ways. The first recognition method is applied to characters and used for license plate recognition. A novel character model is proposed to create different appearances which are then matched with the image of unknown characters for recognition. This allows for simultaneous character segmentation and recognition, and high recognition rates are achieved for low resolution characters down to only five pixels in size. While this approach is only feasible for objects with a limited number of different appearances, like characters, the second recognition method is applicable to any object, including human faces. To this end, a generic 3D face model is automatically fitted to an image of a human face and recognition is performed on a mask level rather than image level. This approach requires neither an initial pose estimation nor the selection of feature points; the face alignment is provided implicitly by the mask fitting process. 2009 Thesis http://hdl.handle.net/20.500.11937/585 en Curtin University fulltext
spellingShingle	object recognition video surveillance visual perception object detection video processing person-specific shape information face pose 3D generic face model Kramer, Annika Model based methods for locating, enhancing and recognising low resolution objects in video
title	Model based methods for locating, enhancing and recognising low resolution objects in video
title_full	Model based methods for locating, enhancing and recognising low resolution objects in video
title_fullStr	Model based methods for locating, enhancing and recognising low resolution objects in video
title_full_unstemmed	Model based methods for locating, enhancing and recognising low resolution objects in video
title_short	Model based methods for locating, enhancing and recognising low resolution objects in video
title_sort	model based methods for locating, enhancing and recognising low resolution objects in video
topic	object recognition; video surveillance; visual perception; object detection; video processing; person-specific shape information; face pose; 3D generic face model
url	http://hdl.handle.net/20.500.11937/585

Similar Items