Digital system for bio-inspired visual attention processing fast and efficient information theoretic modelling of saliency

Bibliographic Details
Main Author: Ngo, Anh Cat Le
Format: Thesis (University of Nottingham only)
Language: English
Published: 2015
Online Access: https://eprints.nottingham.ac.uk/30984/
_version_ 1848794103704190976
author Ngo, Anh Cat Le
building Nottingham Research Data Repository
collection Online Access
description Visual attention is a biological mechanism by which the human visual system copes with rich and fast-changing visual information in the surrounding environment. Visual saliency is a strategy that recommends spots to be attended to, in descending order of interest or information content. This thesis aims to utilise information theory in computational saliency models, on the assumption that more attention is drawn toward more informative locations. As visual media, i.e. images and videos, are high-dimensional data, information estimation is often computationally infeasible because of the enormous amount of computation and the number of data samples required. This thesis proposes and analyses three practical and innovative information-based saliency models. The first model, called the entropy-based saliency method (ENT), measures salient information with a centre-surround operation using conditional entropy (ENT-CON) or Kullback-Leibler divergence (ENT-KLD). However, ENT only estimates information from local features of fixed-size windows; it does not utilise the multi-scale and global information of visual media, which have been shown to be important in biological visual attention. To utilise multi-scale information, the second model, Wavelet-based Scale-Saliency (WSS), estimates information from the power distribution of the data across wavelet sub-band basis descriptors at multiple dyadic scales. Though WSS benefits from local features at multiple scales, it does not integrate information about global context or the statistical characteristics of natural images. The third model, Multiscale Discriminant Saliency (MDIS), adopts the Wavelet Hidden Markov Tree (WHMT) to unify multi-scale and global information in a comprehensive saliency method.
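As a rough illustration of the centre-surround idea behind ENT-KLD, the sketch below compares the intensity histogram of a small centre window with that of a larger surrounding window using the Kullback-Leibler divergence. It is a minimal sketch only, not the thesis's implementation: the function names, window radii, bin count and toy image are all hypothetical, and here the surround window also contains the centre pixels.

```python
import math

def histogram(values, bins=8):
    """Normalised histogram of values in [0, 1], floored to avoid log(0)."""
    counts = [1e-6] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """D(p || q) in nats for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def window(image, y, x, r):
    """Flatten the (2r+1) x (2r+1) window centred on pixel (y, x)."""
    return [image[i][j]
            for i in range(y - r, y + r + 1)
            for j in range(x - r, x + r + 1)]

def kld_saliency(image, y, x, centre=1, surround=3, bins=8):
    """Centre-surround KL divergence at one pixel (assumed radii and bins)."""
    p = histogram(window(image, y, x, centre), bins)
    q = histogram(window(image, y, x, surround), bins)
    return kl_divergence(p, q)

# Toy 9x9 images: one flat, one with a bright 3x3 block in the middle.
flat = [[0.1] * 9 for _ in range(9)]
blob = [[0.1] * 9 for _ in range(9)]
for i in range(3, 6):
    for j in range(3, 6):
        blob[i][j] = 0.9

print("flat image:", kld_saliency(flat, 4, 4))
print("bright block:", kld_saliency(blob, 4, 4))
```

The bright block differs from its surround, so its divergence is well above that of the flat image, where centre and surround histograms nearly coincide.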
All three models, ENT, WSS and MDIS, are evaluated and compared against well-known saliency methods such as PSS, AIM and DIS, both quantitatively, using standard numerical measures (Normalized Scanpath Saliency (NSS), Linear Correlation Coefficient (LCC) and Area Under Curve (AUC)) on N. Bruce’s, Kootstra’s and Judd’s databases with human eye-tracking ground truth, and qualitatively, by visual examination of individual cases. The performance and comprehensiveness of the three models are reflected in the numerical results of an experiment on Bruce’s database. As each successive model is designed to be more comprehensive and computationally complex than the previous one, the three quantitative scores (LCC, NSS, AUC) generally increase in that order, as does the computation time.

Table 1: Quantitative results of ENT, WSS and MDIS on N. Bruce’s database

                 ENT        WSS        MDIS
LCC              0.02263    -0.01731   0.02382
NSS              -0.17533   0.31782    0.48019
AUC              0.78167    0.70292    0.88335
TIME (s/frame)   0.87040    1.26889    2.32734
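For readers unfamiliar with the NSS and LCC scores reported above, the sketch below shows how they are commonly computed from a saliency map and eye-tracking data. It follows the usual textbook definitions, which may differ in detail from the thesis's evaluation code; the function names and the toy four-pixel maps are hypothetical, and AUC is left out for brevity.

```python
import math

def mean(values):
    return sum(values) / len(values)

def std(values):
    """Population standard deviation."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def lcc(saliency, fixation_density):
    """Pearson correlation between two flattened maps of equal length."""
    ms, mf = mean(saliency), mean(fixation_density)
    cov = mean([(s - ms) * (f - mf)
                for s, f in zip(saliency, fixation_density)])
    return cov / (std(saliency) * std(fixation_density))

def nss(saliency, fixation_indices):
    """Mean z-scored saliency value at the fixated locations."""
    m, s = mean(saliency), std(saliency)
    return mean([(saliency[i] - m) / s for i in fixation_indices])

# Toy flattened maps (made-up values, for illustration only).
saliency = [0.1, 0.2, 0.9, 0.3]
fixation_density = [0.0, 0.1, 1.0, 0.2]
fixations = [2]  # index of the single fixated pixel

print("LCC:", lcc(saliency, fixation_density))
print("NSS:", nss(saliency, fixations))
```

Under these definitions, an LCC near 1 means the saliency map is close to a linear function of the fixation density, and a positive NSS means fixated locations have above-average saliency.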
first_indexed 2025-11-14T19:10:52Z
format Thesis (University of Nottingham only)
id nottingham-30984
institution University of Nottingham Malaysia Campus
institution_category Local University
language English
last_indexed 2025-11-14T19:10:52Z
publishDate 2015
recordtype eprints
repository_type Digital Repository
spelling nottingham-30984 2025-02-28T13:22:12Z https://eprints.nottingham.ac.uk/30984/ 2015-07-25 Thesis (University of Nottingham only) NonPeerReviewed application/pdf en arr https://eprints.nottingham.ac.uk/30984/1/thesis.pdf Ngo, Anh Cat Le (2015) Digital system for bio-inspired visual attention processing fast and efficient information theoretic modelling of saliency. PhD thesis, University of Nottingham.
title Digital system for bio-inspired visual attention processing fast and efficient information theoretic modelling of saliency
url https://eprints.nottingham.ac.uk/30984/