Conference Paper: Applying (3+2+1)D Residual Neural Network with Frame Selection for Hong Kong Sign Language Recognition

Title: Applying (3+2+1)D Residual Neural Network with Frame Selection for Hong Kong Sign Language Recognition
Authors: Zhou, Z; Wong Lui, KS; Tam, VWL; Lam, EYM
Keywords: Sign Language Recognition; Residual Neural Network; Video Recognition
Issue Date: 2021
Publisher: IEEE, Computer Society. The journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000545
Citation: Proceedings of 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10-15 January 2021, p. 4296-4302
Abstract: As reported by the Hong Kong Government in 2017, more than 1.5 million residents in Hong Kong suffer from hearing impairment. Most of them rely on Hong Kong Sign Language (HKSL) for daily communication, yet there are only 63 registered sign language interpreters in Hong Kong. To address this social issue and facilitate effective communication between the hearing impaired and other people, this paper introduces a word-level HKSL dataset that currently includes 45 isolated words, with at least 30 sign videos per word performed by different signers (more than 1,500 videos in total, and still growing). Based on this dataset, the paper systematically compares the performance of several deep learning approaches to HKSL recognition: (1) 2D histogram of oriented gradients (HOG) features, pose estimation, or feature extraction, each combined with a long short-term memory (LSTM) layer; (2) a 3D Residual Neural Network (ResNet); and (3) a (2+1)D ResNet. To further improve recognition accuracy, the paper proposes a novel method, the (3+2+1)D ResNet model with frame selection, which applies blurriness detection with a Laplacian kernel to construct high-quality video clips and combines (2+1)D and 3D ResNets to recognize the signs. The experimental results show that the proposed method outperforms the other deep learning approaches, attaining an accuracy of 94.6% on our dataset.
Persistent Identifier: http://hdl.handle.net/10722/304345
ISSN: 1051-4651
2020 SCImago Journal Rankings: 0.276
ISI Accession Number ID: WOS:000678409204056
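
The frame-selection idea in the abstract, scoring frames for blurriness with a Laplacian kernel, is commonly implemented as the variance of the Laplacian response (sharper frames give higher variance). The Python/OpenCV sketch below is a minimal illustration under that assumption; the authors' exact scoring rule and clip-construction procedure may differ, and the function names and the clip length of 16 are hypothetical.

```python
# Hypothetical sketch of Laplacian-based frame selection; the paper's exact
# scoring and clip-construction rules are not reproduced here.
import cv2
import numpy as np

def sharpness(frame: np.ndarray) -> float:
    """Variance of the Laplacian: higher means sharper (less motion blur)."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def select_frames(video_path: str, clip_len: int = 16) -> list:
    """Keep the clip_len sharpest frames, preserving temporal order."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    scores = [sharpness(f) for f in frames]
    # Indices of the sharpest frames, re-sorted to keep temporal order.
    keep = sorted(sorted(range(len(frames)), key=lambda i: scores[i])[-clip_len:])
    return [frames[i] for i in keep]
```

The abstract also contrasts 3D ResNets with (2+1)D ResNets. In the (2+1)D factorization of R(2+1)D (Tran et al., 2018), each t x k x k 3D convolution is replaced by a 1 x k x k spatial convolution followed by a t x 1 x 1 temporal one, with the intermediate width chosen so the parameter count roughly matches the full 3D convolution. The PyTorch sketch below shows one such block; how the paper's (3+2+1)D model fuses its (2+1)D and 3D branches is not described here, so this is illustrative only.

```python
# A minimal (2+1)D convolution block in the style of R(2+1)D (Tran et al.,
# 2018). The channel widths, and how the paper's (3+2+1)D model combines
# this with plain 3D ResNet branches, are assumptions, not from the source.
import torch
import torch.nn as nn

class Conv2Plus1D(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3, t: int = 3):
        super().__init__()
        # Intermediate width chosen so the parameter count roughly matches
        # the equivalent full 3D convolution (as in the R(2+1)D paper).
        mid = (t * k * k * in_ch * out_ch) // (k * k * in_ch + t * out_ch)
        self.spatial = nn.Conv3d(in_ch, mid, (1, k, k),
                                 padding=(0, k // 2, k // 2), bias=False)
        self.temporal = nn.Conv3d(mid, out_ch, (t, 1, 1),
                                  padding=(t // 2, 0, 0), bias=False)
        self.bn = nn.BatchNorm3d(mid)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time, height, width)
        return self.temporal(self.relu(self.bn(self.spatial(x))))

# Example: one 16-frame, 112x112 RGB clip.
clip = torch.randn(1, 3, 16, 112, 112)
out = Conv2Plus1D(3, 64)(clip)  # -> (1, 64, 16, 112, 112)
```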

 

DC Field: Value
dc.contributor.author: Zhou, Z
dc.contributor.author: Wong Lui, KS
dc.contributor.author: Tam, VWL
dc.contributor.author: Lam, EYM
dc.date.accessioned: 2021-09-23T08:58:45Z
dc.date.available: 2021-09-23T08:58:45Z
dc.date.issued: 2021
dc.identifier.citation: Proceedings of 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10-15 January 2021, p. 4296-4302
dc.identifier.issn: 1051-4651
dc.identifier.uri: http://hdl.handle.net/10722/304345
dc.description.abstract: As reported by the Hong Kong Government in 2017, more than 1.5 million residents in Hong Kong suffer from hearing impairment. Most of them rely on Hong Kong Sign Language (HKSL) for daily communication, yet there are only 63 registered sign language interpreters in Hong Kong. To address this social issue and facilitate effective communication between the hearing impaired and other people, this paper introduces a word-level HKSL dataset that currently includes 45 isolated words, with at least 30 sign videos per word performed by different signers (more than 1,500 videos in total, and still growing). Based on this dataset, the paper systematically compares the performance of several deep learning approaches to HKSL recognition: (1) 2D histogram of oriented gradients (HOG) features, pose estimation, or feature extraction, each combined with a long short-term memory (LSTM) layer; (2) a 3D Residual Neural Network (ResNet); and (3) a (2+1)D ResNet. To further improve recognition accuracy, the paper proposes a novel method, the (3+2+1)D ResNet model with frame selection, which applies blurriness detection with a Laplacian kernel to construct high-quality video clips and combines (2+1)D and 3D ResNets to recognize the signs. The experimental results show that the proposed method outperforms the other deep learning approaches, attaining an accuracy of 94.6% on our dataset.
dc.language: eng
dc.publisher: IEEE, Computer Society. The journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000545
dc.relation.ispartof: International Conference on Pattern Recognition
dc.rights: International Conference on Pattern Recognition. Copyright © IEEE, Computer Society.
dc.rights: ©2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.subject: Sign Language Recognition
dc.subject: Residual Neural Network
dc.subject: Video Recognition
dc.title: Applying (3+2+1)D Residual Neural Network with Frame Selection for Hong Kong Sign Language Recognition
dc.type: Conference_Paper
dc.identifier.email: Tam, VWL: vtam@hkucc.hku.hk
dc.identifier.email: Lam, EYM: elam@eee.hku.hk
dc.identifier.authority: Wong Lui, KS=rp00188
dc.identifier.authority: Tam, VWL=rp00173
dc.identifier.authority: Lam, EYM=rp00131
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.doi: 10.1109/ICPR48806.2021.9412075
dc.identifier.scopus: eid_2-s2.0-85110455764
dc.identifier.hkuros: 325008
dc.identifier.spage: 4296
dc.identifier.epage: 4302
dc.identifier.isi: WOS:000678409204056
dc.publisher.place: United States
