Combining residual networks with LSTMs for lipreading
We propose an end-to-end deep learning architecture for word-level visual speech recognition. The system combines spatiotemporal convolutional, residual, and bidirectional Long Short-Term Memory networks. We trained and evaluated it on the Lipreading In-The-Wild benchmark, a challenging da...
| Main Authors: | Stafylakis, Themos; Tzimiropoulos, Georgios |
|---|---|
| Format: | Conference or Workshop Item |
| Published: | 2017 |
| Subjects: | |
| Online Access: | https://eprints.nottingham.ac.uk/44756/ |
Similar Items
Deep word embeddings for visual speech recognition
by: Stafylakis, Themos, et al.
Published: (2018)
Speaker discriminability for visual speech modes
by: Kim, J., et al.
Published: (2009)
A new multi-purpose audio-visual UNMC-VIER database with multiple variabilities
by: Wong, Y.W., et al.
Published: (2011)
Motion Path Generation Using A Modified 6th Order Polynomial Function for Visual Speech Synthesis
by: Salleh, Siti Salwa
Published: (2008)
Figurative language detection using deep and contextual features
by: Razali, Md Saifullah
Published: (2023)
A new penalty term for the BIC with respect to speaker diarization
by: Stafylakis, Themos, et al.
Published: (2010)
A semi-automatic methodology for facial landmark annotation
by: Sagonas, Christos, et al.
Published: (2013)
A multi-filter system for speech enhancement under low signal-to-noise ratios
by: Yiu, Ka Fai, et al.
Published: (2009)
Web application for lipreading
by: Lau, Yee Lin
Published: (2024)
Self-supervised learning for automatic speech recognition In low-resource environments
by: Fatehi, Kavan
Published: (2024)
Machine learning for neural coding of sound envelopes: slithering from sinusoids to speech
by: Levy, Alban Hugo
Published: (2018)
Multichannel filters for speech recognition using a particle swarm optimization
by: Chan, Kit Yan, et al.
Published: (2012)
Development of an Isolated Digit Speech Recognition Based on Multilayer Perceptron Model
by: Mohamad Hussin, Ummu Salmah
Published: (2004)
Deep Learning Using Tiny Domain-Specific Datasets with Sparse Labels
by: Smith, Thomas J
Published: (2021)
Automatic speech recognition predicts speech intelligibility and comprehension for listeners with simulated age-related hearing loss
by: Fontan, Lionel, et al.
Published: (2017)
An investigation of deep learning for image processing applications
by: Hou, Xianxu
Published: (2018)
Speech recognition enhancement using beamforming and a genetic algorithm
by: Chan, Kit Yan, et al.
Published: (2009)
Deep learning approach with image noise reduction to determine planting density and defected paddy seedlings
by: Mohamed Anuar, Mohamed Marzhar
Published: (2022)
Deep tissue analysis: advancing optical techniques with interpretable deep learning and aberration correction
by: Kok, Yong En
Published: (2025)
Improving understanding of EEG measurements using transparent machine learning models
by: Roadknight, Chris, et al.
Published: (2019)
Deep machine learning provides state-of-the-art performance in image-based plant phenotyping
by: Pound, Michael P., et al.
Published: (2017)
A hybrid noise suppression filter for accuracy enhancement of commercial speech recognizers in varying noisy conditions
by: Chan, Kit, et al.
Published: (2014)
Enhancement of speech recognitions for control automation using an intelligent particle swarm optimization
by: Chan, Kit Yan, et al.
Published: (2012)
EEG activity evoked in preparation for multi-talker listening by adults and children
by: Holmes, Emma, et al.
Published: (2016)
Sensor fusion of motion-based sign language interpretation with deep learning
by: Lee, Boon Giin, et al.
Published: (2020)
The Cosmic Evolution of Galaxy Structure and Morphology at 0.5 < z < 8
by: Ferreira, Leonardo
Published: (2023)
End-to-end DVB-S2X system design with deep learning-based channel estimation over satellite fading channels
by: Mfarej, Sumaya Dhari Awad
Published: (2021)
Efficient online subspace learning with an indefinite kernel for visual tracking and recognition
by: Liwicki, Stephan, et al.
Published: (2012)
A hybrid medical text classification framework: integrating attentive rule construction and neural network
by: Li, Xiang, et al.
Published: (2021)
Principal component analysis of image gradient orientations for face recognition
by: Tzimiropoulos, Georgios, et al.
Published: (2011)
An affine invariant function using PCA bases with an application to within-class object recognition
by: Tzimiropoulos, Georgios, et al.
Published: (2007)
Instance segmentation of front doors in mobile mapping system images
by: Klimkowska, Anna Maria
Published: (2020)
An investigation into image-based indoor localization using deep learning
by: Li, Qing
Published: (2020)
Synthetic data driven deep learning for plant phenotyping
by: Hartley, Zane K.J.
Published: (2024)
Multitasking deep neural network models for Arabic dialect sentiment analysis
by: Alali, Muath Mohammad Oqlah
Published: (2022)
Input matters: speed of word recognition in 2-year-olds exposed to multiple accents
by: Buckler, Helen, et al.
Published: (2017)
Deep learning models of biological visual information processing
by: Turcsány, Diána
Published: (2016)
Large-scale detection, mapping, and initial health assessment of date palm trees using multiplatform remotely-sensed data and deep learning techniques
by: Gibril, Mohamed Barakat Abdelfatah
Published: (2023)
Portable form filling assistant for the visually impaired
by: Peng, En, et al.
Published: (2010)
Robust recognition of planar shapes under affine transforms using principal component analysis
by: Tzimiropoulos, Georgios, et al.
Published: (2007)