Machine learning for neural coding of sound envelopes: slithering from sinusoids to speech

Bibliographic Details
Main Author: Levy, Alban Hugo
Format: Thesis (University of Nottingham only)
Language: English
Published: 2018
Subjects: Neuroscience; Auditory Neuroscience; Machine Learning; Speech Recognition; Cochlear Nucleus
Online Access: https://eprints.nottingham.ac.uk/52224/
author Levy, Alban Hugo
building Nottingham Research Data Repository
collection Online Access
description Specific locations within the brain contain neurons which respond, by firing action potentials (spikes), when a sound is played in the ear of a person or animal. The number and timing of these spikes encode information about the sound; this code is the basis of our ability to perceive and understand the acoustic world around us. To understand how the brain processes sound, we must understand this code; the difficulty lies in evaluating this unknown neural code. This thesis applies Machine Learning (ML) to evaluate the auditory coding of dynamic sounds by spike trains, using datasets of varying complexity. In the first part, a battery of ML algorithms is used to evaluate modulation-frequency coding from the neural response to amplitude-modulated sinusoids in cat Cochlear Nucleus spike train data. On this recognition task, it is found that, whilst absolute performance levels depend on the type of algorithm, the algorithms' performance relative to each other is the same across different types of neurons; thus a single powerful classification algorithm is sufficient for evaluating neural codes. Similarly, different performance measures are useful for understanding differences between ML algorithms, but they shed little light on different neural coding strategies. In contrast, the features used for classification are crucial; for example, Vector Strength does not provide an accurate measure of the information contained in spike timing. Overall, different types of neurons do not encode the same amount of amplitude-modulation information. This emphasises the value of applying powerful Machine Learning methods to raw spike timing information. In the second part, a more ecologically natural and heterogeneous set of sounds, speech, is used. The application of Hidden Markov Model-based Automatic Speech Recognition (ASR) is tested within the constraints of an electrophysiological experiment. The findings suggest that a continuous digit recognition task is amenable to a physiology experiment: using only 10 minutes of simulated recording to train statistical models of phonemes, an accuracy of 70% could be achieved, rising to about 85% with 200 minutes of simulated data. A digit recognition framework is sufficient to examine how performance is influenced by the size and nature of the neural population and by the role of spike timing. Previous results suggest, however, that this accuracy would be reduced if experimental Inferior Colliculus data were used instead of a guinea-pig cochlear model. On the other hand, a fully fledged continuous ASR task on a large vocabulary with many speakers may yield phoneme accuracy (around 40%) too low to support an investigation of auditory coding. Overall, this suggests that complex ML algorithms such as ASR can nevertheless be used in practice to assess the neural coding of speech, provided the features are selected carefully.
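The decoding exercise described in the first part can be made concrete with a small sketch. The code below is not from the thesis: it is a hypothetical, self-contained Python example in which synthetic spike trains are binned at 1 ms and a single off-the-shelf classifier is asked to recover which amplitude-modulation frequency produced each trial. The firing rates, modulation frequencies, trial counts and choice of classifier are illustrative assumptions only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
duration, bin_width = 0.5, 0.001            # 500 ms trials, 1 ms bins (assumed values)
n_bins = int(duration / bin_width)
mod_freqs = [50.0, 100.0, 200.0]            # candidate modulation frequencies in Hz (assumed)
trials_per_freq = 60

def simulate_trial(f_mod):
    """Inhomogeneous-Poisson spike counts whose rate follows the AM envelope."""
    t = (np.arange(n_bins) + 0.5) * bin_width
    rate = 100.0 * (1.0 + np.sin(2.0 * np.pi * f_mod * t))   # spikes/s
    return rng.poisson(rate * bin_width)                     # counts per 1 ms bin

# Each row of X is one trial's raw binned spike train; y is the true modulation frequency index.
X = np.array([simulate_trial(f) for f in mod_freqs for _ in range(trials_per_freq)])
y = np.repeat(np.arange(len(mod_freqs)), trials_per_freq)

# Cross-validated accuracy of one generic classifier on the raw binned spike timing;
# the thesis compares a battery of such algorithms on real Cochlear Nucleus recordings.
clf = LogisticRegression(max_iter=2000)
print(cross_val_score(clf, X, y, cv=5).mean())
```

The decoding accuracy obtained this way plays the role of the performance measure discussed in the abstract: it quantifies how much modulation-frequency information a given feature representation makes available to the classifier.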
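Vector Strength, mentioned above as an example of a feature that can discard spike-timing information, is a standard measure of phase locking: it collapses all spike times into a single number between 0 and 1 describing how tightly spikes cluster at one phase of the modulation cycle. A minimal sketch of the usual definition (again not code from the thesis, with made-up spike times) follows.

```python
import numpy as np

def vector_strength(spike_times, mod_freq):
    """Phase-locking strength of spikes to a modulation frequency (Hz).

    Returns |mean(exp(i * 2*pi*f*t))| over spike times t: 1.0 if every spike
    falls at the same modulation phase, near 0.0 for uniformly random phases.
    """
    phases = 2.0 * np.pi * mod_freq * np.asarray(spike_times, dtype=float)
    return float(np.abs(np.mean(np.exp(1j * phases))))

# Illustrative check with synthetic spike times (in seconds):
rng = np.random.default_rng(0)
locked = np.arange(0.0, 1.0, 0.01) + rng.normal(0.0, 0.0005, 100)  # jittered spikes locked to 100 Hz
unlocked = np.sort(rng.uniform(0.0, 1.0, 100))                      # no phase locking
print(vector_strength(locked, 100.0))    # close to 1
print(vector_strength(unlocked, 100.0))  # close to 0
```

Because a single scalar like this cannot distinguish response patterns that lock equally well but differ in their temporal fine structure, decoding directly from binned spike times (as in the previous sketch) can retain information that Vector Strength discards, which is the point the abstract makes about feature choice.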
format Thesis (University of Nottingham only)
id nottingham-52224
institution University of Nottingham Malaysia Campus
institution_category Local University
language English
publishDate 2018
recordtype eprints
repository_type Digital Repository
spelling nottingham-52224 2025-02-28T14:09:34Z https://eprints.nottingham.ac.uk/52224/ 2018-07-19 Thesis (University of Nottingham only) NonPeerReviewed application/pdf en arr https://eprints.nottingham.ac.uk/52224/1/Thesis_AlbanLevy_4219116_FinalVersion.pdf Levy, Alban Hugo (2018) Machine learning for neural coding of sound envelopes: slithering from sinusoids to speech. PhD thesis, University of Nottingham.
title Machine learning for neural coding of sound envelopes: slithering from sinusoids to speech
topic Neuroscience
Auditory Neuroscience
Machine Learning
Speech Recognition
Cochlear Nucleus
url https://eprints.nottingham.ac.uk/52224/