Links for fulltext
(May Require Subscription)
- Publisher Website: 10.21437/Interspeech.2016-986
- Scopus: eid_2-s2.0-84994335417
- WOS: WOS:000409394401236
Conference Paper: Deep Neural Networks for Voice Quality Assessment Based on the GRBAS Scale
Title | Deep Neural Networks for Voice Quality Assessment Based on the GRBAS Scale |
---|---|
Authors | Xie, S; Yan, N; Yu, P; Ng, ML; Wang, L; Ji, Z |
Keywords | Automatic assessment DBN GRBAS MLP Voice quality |
Issue Date | 2016 |
Publisher | International Speech Communication Association (ISCA). |
Citation | Proceedings of the 17th INTERSPEECH conference 2016, San Francisco, USA, 8-12 September 2016, p. 2656-2660 |
Abstract | In the field of voice therapy, perceptual evaluation is widely used by expert listeners to evaluate pathological and normal voice quality. This approach is inherently subjective: it is subject to listener bias, and high inter- and intra-listener variability can be found. As such, research has emerged on automatic assessment of pathological voices using a combination of subjective and objective analyses. The present study aimed to develop a complementary automatic assessment system for voice quality based on the well-known GRBAS scale, using a battery of multidimensional acoustical measures with Deep Neural Networks. A total of 44 parameters, including Mel-frequency Cepstral Coefficients, Smoothed Cepstral Peak Prominence and Long-Term Average Spectrum, were adopted. In addition, a state-of-the-art automatic assessment system based on Modulation Spectrum (MS) features and GMM classifiers was used as a comparison system. The classification results using the proposed method revealed a moderate correlation with subjective GRBAS scores of dysphonic severity and yielded better performance than the MS-GMM system, with a best accuracy of around 81.53%. The findings indicate that such an assessment system can be used as an appropriate evaluation tool in determining the presence and severity of voice disorders. |
Description | Poster Presentation - Session: Learning, Education and Different Speech - no. Sun-P-7-3-3, paper ID 986 |
Persistent Identifier | http://hdl.handle.net/10722/260889 |
ISSN | 1990-9772 (2020 SCImago Journal Rankings: 0.689) |
ISI Accession Number ID | WOS:000409394401236 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Xie, S | - |
dc.contributor.author | Yan, N | - |
dc.contributor.author | Yu, P | - |
dc.contributor.author | Ng, ML | - |
dc.contributor.author | Wang, L | - |
dc.contributor.author | Ji, Z | - |
dc.date.accessioned | 2018-09-14T08:49:04Z | - |
dc.date.available | 2018-09-14T08:49:04Z | - |
dc.date.issued | 2016 | - |
dc.identifier.citation | Proceedings of the 17th INTERSPEECH conference 2016, San Francisco, USA, 8-12 September 2016, p. 2656-2660 | - |
dc.identifier.issn | 1990-9772 | - |
dc.identifier.uri | http://hdl.handle.net/10722/260889 | - |
dc.description | Poster Presentation - Session: Learning, Education and Different Speech - no. Sun-P-7-3-3, paper ID 986 | - |
dc.description.abstract | In the field of voice therapy, perceptual evaluation is widely used by expert listeners to evaluate pathological and normal voice quality. This approach is inherently subjective: it is subject to listener bias, and high inter- and intra-listener variability can be found. As such, research has emerged on automatic assessment of pathological voices using a combination of subjective and objective analyses. The present study aimed to develop a complementary automatic assessment system for voice quality based on the well-known GRBAS scale, using a battery of multidimensional acoustical measures with Deep Neural Networks. A total of 44 parameters, including Mel-frequency Cepstral Coefficients, Smoothed Cepstral Peak Prominence and Long-Term Average Spectrum, were adopted. In addition, a state-of-the-art automatic assessment system based on Modulation Spectrum (MS) features and GMM classifiers was used as a comparison system. The classification results using the proposed method revealed a moderate correlation with subjective GRBAS scores of dysphonic severity and yielded better performance than the MS-GMM system, with a best accuracy of around 81.53%. The findings indicate that such an assessment system can be used as an appropriate evaluation tool in determining the presence and severity of voice disorders. | - |
dc.language | eng | - |
dc.publisher | International Speech Communication Association (ISCA). | - |
dc.relation.ispartof | Interspeech Conference Proceedings | - |
dc.subject | Automatic assessment | - |
dc.subject | DBN | - |
dc.subject | GRBAS | - |
dc.subject | MLP | - |
dc.subject | Voice quality | - |
dc.title | Deep Neural Networks for Voice Quality Assessment Based on the GRBAS Scale | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Ng, ML: manwa@hku.hk | - |
dc.identifier.authority | Ng, ML=rp00942 | - |
dc.identifier.doi | 10.21437/Interspeech.2016-986 | - |
dc.identifier.scopus | eid_2-s2.0-84994335417 | - |
dc.identifier.hkuros | 290499 | - |
dc.identifier.spage | 2656 | - |
dc.identifier.epage | 2660 | - |
dc.identifier.isi | WOS:000409394401236 | - |
dc.publisher.place | United States | - |
dc.identifier.issnl | 1990-9772 | - |