Model based methods for locating, enhancing and recognising low resolution objects in video

Visual perception is our most important sense which enables us to detect and recognise objects even in low detail video scenes. While humans are able to perform such object detection and recognition tasks reliably, most computer vision algorithms struggle with wide angle surveillance videos that mak...

Bibliographic Details
Main Author: Kramer, Annika
Format: Thesis
Language: English
Published: Curtin University 2009
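Subjects: object recognition; video surveillance; visual perception; object detection; video processing; person-specific shape information; face pose; 3D generic face model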
Online Access: http://hdl.handle.net/20.500.11937/585
_version_ 1848743421202661376
author Kramer, Annika
author_facet Kramer, Annika
author_sort Kramer, Annika
building Curtin Institutional Repository
collection Online Access
description Visual perception is our most important sense, enabling us to detect and recognise objects even in low detail video scenes. While humans perform such object detection and recognition tasks reliably, most computer vision algorithms struggle with wide angle surveillance videos, in which low resolution and poor detail objects make automatic processing difficult. Additional problems arise from varying pose and lighting conditions as well as non-cooperative subjects. All these constraints pose problems for automatic scene interpretation of surveillance video, including object detection, tracking and object recognition. Therefore, the aim of this thesis is to detect, enhance and recognise objects by incorporating a priori information and by using model based approaches. Motivated by the increasing demand for automatic methods for object detection, enhancement and recognition in video surveillance, different aspects of the video processing task are investigated with a focus on human faces. In particular, the thesis tackles the challenge of fully automatic face pose and shape estimation by fitting a deformable 3D generic face model under varying pose and lighting conditions. Principal Component Analysis (PCA) is utilised to build an appearance model that is then used within a particle filter based approach to fit the 3D face mask to the image. This recovers face pose and person-specific shape information simultaneously. Experiments demonstrate its use at different resolutions and under varying pose and lighting conditions. Following that, a combined tracking and super resolution approach enhances the quality of poor detail video objects. A 3D object mask is subdivided such that every mask triangle is smaller than a pixel when projected into the image and then used for model based tracking. The mask subdivision then allows for super resolution of the object by combining several video frames. This approach achieves better results than traditional super resolution methods without the use of interpolation or deblurring. Lastly, object recognition is performed in two different ways. The first recognition method is applied to characters and used for license plate recognition. A novel character model is proposed to create different appearances, which are then matched with the image of unknown characters for recognition. This allows for simultaneous character segmentation and recognition, and high recognition rates are achieved for low resolution characters down to only five pixels in size. While this approach is only feasible for objects with a limited number of different appearances, like characters, the second recognition method is applicable to any object, including human faces. To this end, a generic 3D face model is automatically fitted to an image of a human face and recognition is performed on a mask level rather than image level. This approach requires neither an initial pose estimation nor the selection of feature points; the face alignment is provided implicitly by the mask fitting process.
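The abstract mentions matching generated character appearances against the image of an unknown low resolution character. As a rough illustration only, the sketch below scores a 5x5 patch against a few fixed templates using normalised cross-correlation. This is a generic stand-in, not the character model proposed in the thesis: the names `ncc`, `recognise` and `TEMPLATES`, the 5x5 glyphs and the choice of similarity measure are all assumptions made for this example.

```python
from math import sqrt

def ncc(a, b):
    """Normalised cross-correlation of two equal-length pixel lists, in [-1, 1]."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0  # a flat patch carries no shape information
    return sum(x * y for x, y in zip(da, db)) / denom

def recognise(patch, templates):
    """Return the (label, score) pair of the best-matching template."""
    scored = ((label, ncc(patch, tpl)) for label, tpl in templates.items())
    return max(scored, key=lambda pair: pair[1])

# Hypothetical 5x5 binary glyphs, stored row by row (illustrative only).
TEMPLATES = {
    "I": [0, 0, 1, 0, 0] * 5,
    "L": [1, 0, 0, 0, 0] * 4 + [1, 1, 1, 1, 1],
    "T": [1, 1, 1, 1, 1] + [0, 0, 1, 0, 0] * 4,
}

# A dim, low-contrast "L": the score is unaffected by brightness offset and gain.
patch = [0.2 + 0.5 * v for v in TEMPLATES["L"]]
label, score = recognise(patch, TEMPLATES)
print(label)  # prints L
```

Because the score subtracts the mean and divides by the contrast, a uniform change in lighting does not affect the match. Handling pose changes and segmentation, as the abstract describes, would need considerably more than this sketch.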
first_indexed 2025-11-14T05:45:18Z
format Thesis
id curtin-20.500.11937-585
institution Curtin University Malaysia
institution_category Local University
language English
last_indexed 2025-11-14T05:45:18Z
publishDate 2009
publisher Curtin University
recordtype eprints
repository_type Digital Repository
spelling curtin-20.500.11937-585 2017-02-20T06:42:15Z Model based methods for locating, enhancing and recognising low resolution objects in video Kramer, Annika object recognition video surveillance visual perception object detection video processing person-specific shape information face pose 3D generic face model 2009 Thesis http://hdl.handle.net/20.500.11937/585 en Curtin University fulltext
spellingShingle object recognition
video surveillance
visual perception
object detection
video processing
person-specific shape information
face pose
3D generic face model
Kramer, Annika
Model based methods for locating, enhancing and recognising low resolution objects in video
title Model based methods for locating, enhancing and recognising low resolution objects in video
title_full Model based methods for locating, enhancing and recognising low resolution objects in video
title_fullStr Model based methods for locating, enhancing and recognising low resolution objects in video
title_full_unstemmed Model based methods for locating, enhancing and recognising low resolution objects in video
title_short Model based methods for locating, enhancing and recognising low resolution objects in video
title_sort model based methods for locating, enhancing and recognising low resolution objects in video
topic object recognition
video surveillance
visual perception
object detection
video processing
person-specific shape information
face pose
3D generic face model
url http://hdl.handle.net/20.500.11937/585