<strong>Paper Title</strong><br>

INDIAN SIGN LANGUAGE INTERPRETATION USING VIDEO CLASSIFICATION<br>

<br>


<strong>Abstract</strong><br>

This study investigates deep learning models for Indian Sign Language (ISL) interpretation through video classification, with an emphasis on spatial and temporal gesture patterns. Videos were preprocessed with MediaPipe's pose solution by extracting 543 landmarks from each frame (21 for each hand, 468 face, and 33 pose landmarks), and labels were taken from the folder organization. The labels were encoded for model compatibility, and the data was split into training and testing sets. Four models were assessed: an Attention Based Bi-Directional LSTM Convolutional Network, a Pose-Guided Graph Convolutional Network (PGCN), a High Order Graph Convolutional Network (HOGCN), and a Dynamic Graph Convolutional Network (DGCN). The Attention Based Bi-Directional LSTM model performed best, with an accuracy of 95.87% and a ROC-AUC of 0.9989. These results indicate that the Attention Based Bi-Directional LSTM model captures intricate ISL gestures well, supporting real-world ISL interpretation applications.
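The data preparation steps named in the abstract (per-frame landmark counts, folder-based labels, label encoding, and a train/test split) can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names, the example paths, the 20% test fraction, and the fixed seed are all assumptions made for illustration, and the landmark extraction itself is omitted.

```python
import random
from pathlib import Path

# Landmark counts per frame as stated in the abstract.
LANDMARKS = {"left_hand": 21, "right_hand": 21, "face": 468, "pose": 33}
N_LANDMARKS = sum(LANDMARKS.values())  # 21 + 21 + 468 + 33 = 543


def labels_from_folders(video_paths):
    """Use each video's parent folder name as its class label."""
    return [Path(p).parent.name for p in video_paths]


def encode_labels(labels):
    """Map label strings to integer ids, assigned in sorted order."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return [index[name] for name in labels], classes


def split_train_test(items, test_fraction=0.2, seed=0):
    """Shuffle and hold out a fraction for testing (the ratio is an assumption)."""
    order = list(range(len(items)))
    random.Random(seed).shuffle(order)
    n_test = max(1, int(round(len(items) * test_fraction)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return [items[i] for i in train_idx], [items[i] for i in test_idx]


# Hypothetical directory layout: one folder per sign.
videos = [
    "data/hello/v1.mp4",
    "data/hello/v2.mp4",
    "data/thanks/v1.mp4",
    "data/thanks/v2.mp4",
    "data/yes/v1.mp4",
]
labels = labels_from_folders(videos)
label_ids, classes = encode_labels(labels)
train, test = split_train_test(list(zip(videos, label_ids)))
```

In this layout each video would later be paired with an array of shape (frames, 543, coordinates), so the integer label ids line up one-to-one with the landmark sequences fed to the models.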

Keywords - Attention Based Bi-Directional LSTM Convolutional Network, Pose-Guided Graph Convolutional Network, High Order Graph Convolutional Network, Dynamic Graph Convolutional Network, Graph Convolutional Neural Network, Bi-Directional Long Short-Term Memory, Attention Model, Indian Sign Language Interpretation, Video Classification, MediaPipe, Pose Landmarks, Face Landmarks, Hand Landmarks