Machine learning for neural coding of sound envelopes: slithering from sinusoids to speech

Bibliographic Details
Main Author: Levy, Alban Hugo
Format: Thesis (University of Nottingham only)
Language: English
Published: 2018
Subjects: Neuroscience; Auditory Neuroscience; Machine Learning; Speech Recognition; Cochlear Nucleus
Online Access: https://eprints.nottingham.ac.uk/52224/
author Levy, Alban Hugo
building Nottingham Research Data Repository
collection Online Access
description Specific locations within the brain contain neurons which respond, by firing action potentials (spikes), when a sound is played in the ear of a person or animal. The number and timing of these spikes encode information about the sound; this code is the basis of our ability to perceive and understand the acoustic world around us. To understand how the brain processes sound, we must understand this code; the difficulty lies in evaluating this unknown neural code. This thesis applies Machine Learning (ML) to evaluate the auditory coding of dynamic sounds by spike trains, using datasets of varying complexity. In the first part, a battery of ML algorithms is used to evaluate modulation-frequency coding from the neural response to amplitude-modulated sinusoids in cat Cochlear Nucleus spike train data. On this recognition task, it is found that, whilst absolute performance levels depend on the type of algorithm, the algorithms' performance relative to each other is the same across different types of neurons; thus a single powerful classification algorithm is sufficient for evaluating neural codes. Similarly, different performance measures are useful for understanding differences between ML algorithms, but they shed little light on different neural coding strategies. In contrast, the features used for classification are crucial; for example, Vector Strength does not provide an accurate measure of the information contained in spike timing. Overall, different types of neurons do not encode the same amount of amplitude-modulation information. This emphasises the value of applying powerful Machine Learning methods to raw spike timing information. In the second part, a more ecologically natural and heterogeneous set of sounds, speech, is used. The application of Hidden Markov Model-based Automatic Speech Recognition (ASR) is tested within the constraints of an electrophysiological experiment. The findings suggest that a continuous digit recognition task is amenable to a physiology experiment: using only 10 minutes of simulated recording to train statistical models of phonemes, an accuracy of 70% could be achieved, rising to about 85% with 200 minutes of simulated data. A digit recognition framework is sufficient to examine how performance is influenced by the size and nature of the neural population and by the role of spike timing. Previous results suggest, however, that this accuracy would be reduced if experimental Inferior Colliculus data were used instead of a guinea-pig cochlear model. On the other hand, a fully fledged continuous ASR task on a large vocabulary with many speakers may yield phoneme accuracy (around 40%) too low to support an investigation of auditory coding. Overall, this suggests that complex ML algorithms such as ASR can nevertheless be used in practice to assess the neural coding of speech, provided the features are selected carefully.
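The decoding exercise described in the first part can be made concrete with a small sketch. The code below is not from the thesis: it is a hypothetical, self-contained Python example in which synthetic spike trains are binned at 1 ms and a single off-the-shelf classifier is asked to recover which amplitude-modulation frequency produced each trial. The firing rates, modulation frequencies, trial counts and choice of classifier are illustrative assumptions only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
duration, bin_width = 0.5, 0.001            # 500 ms trials, 1 ms bins (assumed values)
n_bins = int(duration / bin_width)
mod_freqs = [50.0, 100.0, 200.0]            # candidate modulation frequencies in Hz (assumed)
trials_per_freq = 60

def simulate_trial(f_mod):
    """Inhomogeneous-Poisson spike counts whose rate follows the AM envelope."""
    t = (np.arange(n_bins) + 0.5) * bin_width
    rate = 100.0 * (1.0 + np.sin(2.0 * np.pi * f_mod * t))   # spikes/s
    return rng.poisson(rate * bin_width)                     # counts per 1 ms bin

# Each row of X is one trial's raw binned spike train; y is the true modulation frequency index.
X = np.array([simulate_trial(f) for f in mod_freqs for _ in range(trials_per_freq)])
y = np.repeat(np.arange(len(mod_freqs)), trials_per_freq)

# Cross-validated accuracy of one generic classifier on the raw binned spike timing;
# the thesis compares a battery of such algorithms on real Cochlear Nucleus recordings.
clf = LogisticRegression(max_iter=2000)
print(cross_val_score(clf, X, y, cv=5).mean())
```

The decoding accuracy obtained this way plays the role of the performance measure discussed in the abstract: it quantifies how much modulation-frequency information a given feature representation makes available to the classifier.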
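Vector Strength, mentioned above as an example of a feature that can discard spike-timing information, is a standard measure of phase locking: it collapses all spike times into a single number between 0 and 1 describing how tightly spikes cluster at one phase of the modulation cycle. A minimal sketch of the usual definition (again not code from the thesis, with made-up spike times) follows.

```python
import numpy as np

def vector_strength(spike_times, mod_freq):
    """Phase-locking strength of spikes to a modulation frequency (Hz).

    Returns |mean(exp(i * 2*pi*f*t))| over spike times t: 1.0 if every spike
    falls at the same modulation phase, near 0.0 for uniformly random phases.
    """
    phases = 2.0 * np.pi * mod_freq * np.asarray(spike_times, dtype=float)
    return float(np.abs(np.mean(np.exp(1j * phases))))

# Illustrative check with synthetic spike times (in seconds):
rng = np.random.default_rng(0)
locked = np.arange(0.0, 1.0, 0.01) + rng.normal(0.0, 0.0005, 100)  # jittered spikes locked to 100 Hz
unlocked = np.sort(rng.uniform(0.0, 1.0, 100))                      # no phase locking
print(vector_strength(locked, 100.0))    # close to 1
print(vector_strength(unlocked, 100.0))  # close to 0
```

Because a single scalar like this cannot distinguish response patterns that lock equally well but differ in their temporal fine structure, decoding directly from binned spike times (as in the previous sketch) can retain information that Vector Strength discards, which is the point the abstract makes about feature choice.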
format Thesis (University of Nottingham only)
id nottingham-52224
institution University of Nottingham Malaysia Campus
institution_category Local University
language English
publishDate 2018
recordtype eprints
repository_type Digital Repository
spelling nottingham-52224 2025-02-28T14:09:34Z https://eprints.nottingham.ac.uk/52224/ 2018-07-19 Thesis (University of Nottingham only) NonPeerReviewed application/pdf en arr https://eprints.nottingham.ac.uk/52224/1/Thesis_AlbanLevy_4219116_FinalVersion.pdf Levy, Alban Hugo (2018) Machine learning for neural coding of sound envelopes: slithering from sinusoids to speech. PhD thesis, University of Nottingham.
title Machine learning for neural coding of sound envelopes: slithering from sinusoids to speech
topic Neuroscience
Auditory Neuroscience
Machine Learning
Speech Recognition
Cochlear Nucleus
url https://eprints.nottingham.ac.uk/52224/