Time series prediction of COVID-19 by mutation rate analysis using recurrent neural network-based LSTM model

Bibliographic Details
Main Authors: Pathan, R. K.; Biswas, M.; Khandaker, Mayeen Uddin *
Format: Article
Language: English
Published: Elsevier 2020
Subjects: R895-920 Medical Physics/Medical Radiology
Online Access: http://eprints.sunway.edu.my/1625/
http://eprints.sunway.edu.my/1625/1/Mayeen%20Time%20Series.pdf
_version_ 1848802102048980992
author Pathan, R. K.
Biswas, M.
Khandaker, Mayeen Uddin *
author_sort Pathan, R. K.
building SU Institutional Repository
collection Online Access
description SARS-CoV-2, the novel coronavirus that causes the disease known as COVID-19, has created a global pandemic. The world is now immobilized by this infectious RNA virus. As of June 15, 2020, more than 7.9 million people had been infected and 432,000 had died. This RNA virus is able to mutate in the human body. Accurate determination of mutation rates is essential to understand the evolution of the virus and to assess the risk of emergent infectious disease. This study explores the mutation rate across whole genomic sequences gathered from patient datasets from different countries. The collected dataset is processed to determine nucleotide mutations and codon mutations separately. Based on the size of the dataset, the determined mutation rates are grouped into four regions: China, Australia, the United States, and the rest of the world. In all regions, a large share of thymine (T) and adenine (A) is found to mutate to other nucleotides, whereas codons mutate less frequently than nucleotides. A recurrent neural network-based Long Short-Term Memory (LSTM) model is applied to predict the future mutation rate of the virus. The LSTM model gives a root mean square error (RMSE) of 0.06 in testing and 0.04 in training, which is an optimized value. Using this training and testing process, the nucleotide mutation rate of the 400th patient at a future time is predicted. An increase of about 0.1% in the mutation rate is found for mutations from T to C and G, C to G, and G to T, while a decrease of 0.1% is seen for T to A and A to C. The model could be used to predict mutation rates on a daily basis if more up-to-date patient data become available.
first_indexed 2025-11-14T21:18:00Z
format Article
id sunway-1625
institution Sunway University
institution_category Local University
language English
last_indexed 2025-11-14T21:18:00Z
publishDate 2020
publisher Elsevier
recordtype eprints
repository_type Digital Repository
spelling sunway-1625 2021-04-22T03:25:59Z http://eprints.sunway.edu.my/1625/ Elsevier 2020-09 Article PeerReviewed text en cc_by_nc_4 http://eprints.sunway.edu.my/1625/1/Mayeen%20Time%20Series.pdf Pathan, R. K. and Biswas, M. and Khandaker, Mayeen Uddin * (2020) Time series prediction of COVID-19 by mutation rate analysis using recurrent neural network-based LSTM model. Chaos, Solitons & Fractals, 138. p. 110018. ISSN 0960-0779 http://doi.org/10.1016/j.chaos.2020.110018 doi:10.1016/j.chaos.2020.110018
title Time series prediction of COVID-19 by mutation rate analysis using recurrent neural network-based LSTM model
topic R895-920 Medical Physics/Medical Radiology
url http://eprints.sunway.edu.my/1625/
http://eprints.sunway.edu.my/1625/1/Mayeen%20Time%20Series.pdf