Leveraging 3D skeleton video extraction and deep learning for real-time sign language recognition model


Bibliographic Details
Main Author: Ang, Zi Ying
Format: Final Year Project / Dissertation / Thesis
Published: 2024
Subjects:
Online Access:http://eprints.utar.edu.my/6489/
http://eprints.utar.edu.my/6489/1/Ang_Zi_Ying_Full_Report.pdf
Description
Summary: Sign language recognition is widely regarded as key research for reducing communication barriers between deaf and hearing people. Over the past two decades, technological advances have drawn considerable research interest to the field. Despite extensive study, developing a highly accurate real-time model remains difficult because recognizing sign language from video is computationally time-consuming. Owing to the lack of a Malaysian Sign Language dataset, a video-based Malaysian Sign Language dataset (MSL10) was created, and the results were further validated on the Argentinean Sign Language dataset (LSA64). This study proposes a combination of keypoint extraction of important features and a deep learning recurrent neural network, yielding a high-accuracy, low-computation model suitable for real-time sign language recognition. Extracting a 3D skeleton from video with MediaPipe removes unnecessary information and reduces computation time. Compared with whole-body feature analysis, the study shows that hand features alone can reduce computation time and improve accuracy. The study also found that a two-layer BiLSTM model outperforms the LSTM and three-layer BiLSTM models in both accuracy and computation time.
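
The abstract describes a pipeline of MediaPipe hand-landmark extraction feeding a two-layer BiLSTM classifier. The sketch below illustrates one way such a pipeline could look in Python; the fixed frame count, LSTM unit sizes, and the assumption of 10 sign classes (suggested by the MSL10 name) are illustrative choices, not values taken from the thesis.

```python
# Minimal sketch, not the thesis implementation: MediaPipe 3D hand landmarks
# per frame, stacked into a fixed-length sequence, classified by a two-layer
# BiLSTM. Sequence length, unit counts, and class count are assumptions.
import cv2
import mediapipe as mp
import numpy as np
import tensorflow as tf

NUM_FRAMES = 30        # assumed fixed number of frames per sign clip
NUM_CLASSES = 10       # assumed from the "MSL10" dataset name
FEATURES = 2 * 21 * 3  # two hands x 21 landmarks x (x, y, z)

def extract_hand_keypoints(video_path: str) -> np.ndarray:
    """Return a (NUM_FRAMES, FEATURES) array of 3D hand landmarks for one clip."""
    hands = mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=2)
    cap = cv2.VideoCapture(video_path)
    frames = []
    while len(frames) < NUM_FRAMES:
        ok, frame = cap.read()
        if not ok:
            break
        result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        keypoints = np.zeros(FEATURES, dtype=np.float32)
        if result.multi_hand_landmarks:
            for h, hand in enumerate(result.multi_hand_landmarks[:2]):
                coords = [[lm.x, lm.y, lm.z] for lm in hand.landmark]
                keypoints[h * 63:(h + 1) * 63] = np.asarray(coords).flatten()
        frames.append(keypoints)
    cap.release()
    hands.close()
    # Pad short clips with zero frames so every sample has the same length.
    while len(frames) < NUM_FRAMES:
        frames.append(np.zeros(FEATURES, dtype=np.float32))
    return np.stack(frames)

# Two-layer BiLSTM classifier over the landmark sequences.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(NUM_FRAMES, FEATURES)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Restricting the input to hand landmarks (126 values per frame instead of whole-body keypoints or raw pixels) is what keeps the sequence small enough for real-time inference, which matches the abstract's finding that hand features reduce computation time while improving accuracy.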