Sign Language Recognition (SLR) has emerged as one of the most important areas in human–computer interaction and assistive communication technologies. With the rapid development of deep learning, SLR systems have progressed from traditional handcrafted features to highly efficient data-driven models capable of capturing complex spatial and temporal patterns in sign sequences. Recent work spans a wide range of approaches, from convolutional neural networks (CNNs) and recurrent architectures to graph convolutional networks (GCNs) for skeletal modelling and transformer frameworks for long-range sequence understanding. In addition, multimodal systems that fuse RGB, depth, skeletal information, and radio-frequency signals demonstrate improved robustness under challenging real-world conditions. This survey provides an in-depth review of modern advances in SLR, highlighting methodological innovations, commonly used datasets, architectural enhancements, and reported performance results. Shared challenges, including signer variability, limited dataset diversity, occlusions, and real-time processing constraints, are discussed in detail. The survey concludes by outlining emerging trends and future research directions toward scalable, accurate, and context-aware SLR systems that can be deployed effectively in practical assistive applications.
Keywords
Sign Language Recognition, Deep Learning, CNN, GCN, Transformers, Pose Estimation, Multimodal Fusion, Continuous SLR, Word-Level SLR, Human–Computer Interaction.