Integration of CNN and LSTM networks for behavior feature recognition: an analysis

Bibliographic Details
Main Authors: Aris, Teh Noranis Mohd, Ningning, Chen, Mustapha, Norwati, Zolkepli, Maslina
Format: Article
Language: English
Published: Insight Society 2024
Online Access: http://psasir.upm.edu.my/id/eprint/117497/
http://psasir.upm.edu.my/id/eprint/117497/1/117497.pdf
author Aris, Teh Noranis Mohd
Ningning, Chen
Mustapha, Norwati
Zolkepli, Maslina
building UPM Institutional Repository
collection Online Access
description This study explores an integration model combining convolutional neural networks (CNNs) and long short-term memory networks (LSTMs) for behavior feature recognition. Initially, a straightforward three-dimensional deep CNN structure was introduced for behavior recognition, capturing static and dynamic characteristics, and the network's convergence speed was analyzed. Subsequent experiments used the VGG16 CNN model, substituting the fully connected layer with global average pooling. A comparative experiment between the models was then conducted on the MSRC-12 behavior dataset. Due to the complexity of LSTM, a simpler GRU model with similar effectiveness was used for comparison. The experimental results showed that the GRU-CNN model performed best, outperforming other algorithms in the literature on the same dataset. Under the same experimental parameters, the GRU-CNN model converged significantly faster than the LSTM-CNN model and trained more quickly. In addition, the best accuracy was achieved by adjusting the dropout rate and the number of epochs. Under the cross-validation used in this study, the GRU-CNN models achieved good experimental results when the hidden node dropout rate was 0.5. The number of epochs had negligible impact on the GRU-CNN model. Still, the accuracy of the CNN and CNN-GRU models increased significantly with more epochs, further validating the effectiveness of the GRU-CNN model. These experiments also indicate that convolutional neural networks based on deep learning are superior to traditional machine learning methods for human behavior recognition. Using depth images instead of conventional images allows for better extraction of spatial features, and the integration with long short-term memory networks enhances the extraction of temporal features from sequences.
first_indexed 2025-11-15T14:33:45Z
format Article
id upm-117497
institution Universiti Putra Malaysia
institution_category Local University
language English
last_indexed 2025-11-15T14:33:45Z
publishDate 2024
publisher Insight Society
recordtype eprints
repository_type Digital Repository
spelling upm-117497 2025-05-27T08:47:50Z http://psasir.upm.edu.my/id/eprint/117497/ PeerReviewed text en Aris, Teh Noranis Mohd and Ningning, Chen and Mustapha, Norwati and Zolkepli, Maslina (2024) Integration of CNN and LSTM networks for behavior feature recognition: an analysis. International Journal on Advanced Science, Engineering and Information Technology, 14 (5). pp. 1793-1799. ISSN 2460-6952; eISSN: 2088-5334 https://ijaseit.insightsociety.org/index.php/ijaseit/article/view/10116 doi:10.18517/ijaseit.14.5.10116
title Integration of CNN and LSTM networks for behavior feature recognition: an analysis
url http://psasir.upm.edu.my/id/eprint/117497/
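
Illustrative note on the description above: the abstract mentions two design choices, replacing VGG16's fully connected layer with global average pooling and using a GRU as a simpler alternative to an LSTM. The sketch below is not the authors' code. It only illustrates, in plain Python, what those two choices mean in terms of computation and parameter count. The layer sizes (`input_dim=512`, `hidden_dim=128`) are hypothetical values chosen for the example, and the parameter formulas follow the textbook formulation with one bias vector per gate; some library implementations carry extra bias terms.

```python
def recurrent_params(input_dim, hidden_dim, gates):
    """Trainable parameters of a recurrent layer built from `gates` blocks.

    Each block has an input weight matrix (hidden_dim x input_dim),
    a recurrent weight matrix (hidden_dim x hidden_dim) and one bias vector.
    """
    per_block = hidden_dim * (input_dim + hidden_dim) + hidden_dim
    return gates * per_block


def lstm_params(input_dim, hidden_dim):
    # LSTM: input, forget and output gates plus the candidate cell state.
    return recurrent_params(input_dim, hidden_dim, gates=4)


def gru_params(input_dim, hidden_dim):
    # GRU: update and reset gates plus the candidate hidden state.
    return recurrent_params(input_dim, hidden_dim, gates=3)


def global_average_pool(feature_maps):
    """Reduce each channel (a 2D list of numbers) to its mean value.

    This operation has no trainable parameters, unlike a fully connected layer.
    """
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]


if __name__ == "__main__":
    # Hypothetical sizes: 512 pooled CNN features per frame, 128 hidden units.
    input_dim, hidden_dim = 512, 128
    print("LSTM parameters:", lstm_params(input_dim, hidden_dim))
    print("GRU parameters: ", gru_params(input_dim, hidden_dim))

    # Two tiny 2x2 "feature maps" standing in for CNN output channels.
    maps = [[[1, 2], [3, 4]], [[0, 0], [0, 8]]]
    print("Pooled features:", global_average_pool(maps))
```

Under these assumptions the GRU layer has three quarters of the LSTM layer's parameters for the same input and hidden sizes, which is one common reading of "simpler"; the record itself does not state which sizes or implementation the authors used.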