Conference Paper: Applying (3+2+1)D Residual Neural Network with Frame Selection for Hong Kong Sign Language Recognition

Title: Applying (3+2+1)D Residual Neural Network with Frame Selection for Hong Kong Sign Language Recognition
Authors: Zhou, Z; Wong Lui, KS; Tam, VWL; Lam, EYM
Keywords: Sign Language Recognition; Residual Neural Network; Video Recognition
Issue Date: 2021
Publisher: IEEE, Computer Society. The journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000545
Citation: Proceedings of 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10-15 January 2021, p. 4296-4302
Abstract: As reported by the Hong Kong Government in 2017, more than 1.5 million residents in Hong Kong suffer from hearing impairment. Most of them rely on Hong Kong Sign Language (HKSL) for daily communication, yet there are only 63 registered sign language interpreters in Hong Kong. To address this social issue and facilitate effective communication between the hearing impaired and other people, this paper introduces a word-level HKSL dataset that currently includes 45 isolated words, with at least 30 sign videos per word performed by different signers (more than 1,500 videos in total, and still growing). Based on this dataset, the paper systematically compares the performance of several deep learning approaches to HKSL recognition: (1) 2D histogram of oriented gradients (HOG) features, pose estimation, or feature extraction, each combined with a long short-term memory (LSTM) layer; (2) a 3D Residual Neural Network (ResNet); and (3) a (2+1)D ResNet. To further improve recognition accuracy, the paper proposes a novel method, the (3+2+1)D ResNet model with frame selection, which applies blurriness detection with a Laplacian kernel to construct high-quality video clips and combines (2+1)D and 3D ResNets to recognize the signs. The experimental results show that the proposed method outperforms the other deep learning approaches, attaining an accuracy of 94.6% on our dataset.
Persistent Identifier: http://hdl.handle.net/10722/304345
ISSN: 1051-4651
2020 SCImago Journal Rankings: 0.276
ISI Accession Number ID: WOS:000678409204056
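
The frame-selection idea in the abstract, scoring frames for blurriness with a Laplacian kernel, is commonly implemented as the variance of the Laplacian response (sharper frames give higher variance). The Python/OpenCV sketch below is a minimal illustration under that assumption; the authors' exact scoring rule and clip-construction procedure may differ, and the function names and the clip length of 16 are hypothetical.

```python
# Hypothetical sketch of Laplacian-based frame selection; the paper's exact
# scoring and clip-construction rules are not reproduced here.
import cv2
import numpy as np

def sharpness(frame: np.ndarray) -> float:
    """Variance of the Laplacian: higher means sharper (less motion blur)."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def select_frames(video_path: str, clip_len: int = 16) -> list:
    """Keep the clip_len sharpest frames, preserving temporal order."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    scores = [sharpness(f) for f in frames]
    # Indices of the sharpest frames, re-sorted to keep temporal order.
    keep = sorted(sorted(range(len(frames)), key=lambda i: scores[i])[-clip_len:])
    return [frames[i] for i in keep]
```

The abstract also contrasts 3D ResNets with (2+1)D ResNets. In the (2+1)D factorization of R(2+1)D (Tran et al., 2018), each t x k x k 3D convolution is replaced by a 1 x k x k spatial convolution followed by a t x 1 x 1 temporal one, with the intermediate width chosen so the parameter count roughly matches the full 3D convolution. The PyTorch sketch below shows one such block; how the paper's (3+2+1)D model fuses its (2+1)D and 3D branches is not described here, so this is illustrative only.

```python
# A minimal (2+1)D convolution block in the style of R(2+1)D (Tran et al.,
# 2018). The channel widths, and how the paper's (3+2+1)D model combines
# this with plain 3D ResNet branches, are assumptions, not from the source.
import torch
import torch.nn as nn

class Conv2Plus1D(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3, t: int = 3):
        super().__init__()
        # Intermediate width chosen so the parameter count roughly matches
        # the equivalent full 3D convolution (as in the R(2+1)D paper).
        mid = (t * k * k * in_ch * out_ch) // (k * k * in_ch + t * out_ch)
        self.spatial = nn.Conv3d(in_ch, mid, (1, k, k),
                                 padding=(0, k // 2, k // 2), bias=False)
        self.temporal = nn.Conv3d(mid, out_ch, (t, 1, 1),
                                  padding=(t // 2, 0, 0), bias=False)
        self.bn = nn.BatchNorm3d(mid)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time, height, width)
        return self.temporal(self.relu(self.bn(self.spatial(x))))

# Example: one 16-frame, 112x112 RGB clip.
clip = torch.randn(1, 3, 16, 112, 112)
out = Conv2Plus1D(3, 64)(clip)  # -> (1, 64, 16, 112, 112)
```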

 

DC Field: Value
dc.contributor.author: Zhou, Z
dc.contributor.author: Wong Lui, KS
dc.contributor.author: Tam, VWL
dc.contributor.author: Lam, EYM
dc.date.accessioned: 2021-09-23T08:58:45Z
dc.date.available: 2021-09-23T08:58:45Z
dc.date.issued: 2021
dc.identifier.citation: Proceedings of 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10-15 January 2021, p. 4296-4302
dc.identifier.issn: 1051-4651
dc.identifier.uri: http://hdl.handle.net/10722/304345
dc.description.abstract: As reported by the Hong Kong Government in 2017, more than 1.5 million residents in Hong Kong suffer from hearing impairment. Most of them rely on Hong Kong Sign Language (HKSL) for daily communication, yet there are only 63 registered sign language interpreters in Hong Kong. To address this social issue and facilitate effective communication between the hearing impaired and other people, this paper introduces a word-level HKSL dataset that currently includes 45 isolated words, with at least 30 sign videos per word performed by different signers (more than 1,500 videos in total, and still growing). Based on this dataset, the paper systematically compares the performance of several deep learning approaches to HKSL recognition: (1) 2D histogram of oriented gradients (HOG) features, pose estimation, or feature extraction, each combined with a long short-term memory (LSTM) layer; (2) a 3D Residual Neural Network (ResNet); and (3) a (2+1)D ResNet. To further improve recognition accuracy, the paper proposes a novel method, the (3+2+1)D ResNet model with frame selection, which applies blurriness detection with a Laplacian kernel to construct high-quality video clips and combines (2+1)D and 3D ResNets to recognize the signs. The experimental results show that the proposed method outperforms the other deep learning approaches, attaining an accuracy of 94.6% on our dataset.
dc.language: eng
dc.publisher: IEEE, Computer Society. The journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000545
dc.relation.ispartof: International Conference on Pattern Recognition
dc.rights: International Conference on Pattern Recognition. Copyright © IEEE, Computer Society.
dc.rights: ©2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.subject: Sign Language Recognition
dc.subject: Residual Neural Network
dc.subject: Video Recognition
dc.title: Applying (3+2+1)D Residual Neural Network with Frame Selection for Hong Kong Sign Language Recognition
dc.type: Conference_Paper
dc.identifier.email: Tam, VWL: vtam@hkucc.hku.hk
dc.identifier.email: Lam, EYM: elam@eee.hku.hk
dc.identifier.authority: Wong Lui, KS=rp00188
dc.identifier.authority: Tam, VWL=rp00173
dc.identifier.authority: Lam, EYM=rp00131
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.doi: 10.1109/ICPR48806.2021.9412075
dc.identifier.scopus: eid_2-s2.0-85110455764
dc.identifier.hkuros: 325008
dc.identifier.spage: 4296
dc.identifier.epage: 4302
dc.identifier.isi: WOS:000678409204056
dc.publisher.place: United States
