A Hilbert-fine-structure-derived physical metric for predicting the intelligibility of noise-distorted and noise-suppressed speech

Chen, FF; Wong, LLN; Hu, Y

Publisher Website: 10.1016/j.specom.2013.06.016
Scopus: eid_2-s2.0-84881406707
WOS: WOS:000324227500007
Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Division of Speech & Hearing Sciences: Journal/Magazine Articles

Title	A Hilbert-fine-structure-derived physical metric for predicting the intelligibility of noise-distorted and noise-suppressed speech
Authors	Chen, FF; Wong, LLN; Hu, Y
Keywords	Speech intelligibility; Hilbert fine-structure signal; Speech transmission index
Issue Date	2013
Publisher	Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/specom
Citation	Speech Communication, 2013, v. 55 n. 10, p. 1011-1020. DOI: http://dx.doi.org/10.1016/j.specom.2013.06.016
Abstract	Despite the established importance of temporal fine-structure (TFS) on speech perception in noise, existing speech transmission metrics use primarily envelope information to model speech intelligibility variance. This study proposes a new physical metric for predicting speech intelligibility using information obtained from the Hilbert-derived TFS waveform. It is found that by making explicit use of coherence information contained in the complex spectra of the Hilbert-derived TFS waveforms of the clean and corrupted speech signals, and assessing the extent to which the coherence in the Hilbert fine structure is affected following the linear or non-linear processing (e.g., noise distortion, speech enhancement, etc.) of the stimulus, the predictive power of the intelligibility measure can be significantly improved for noise-distorted and noise-suppressed speech signals. When evaluated with speech recognition scores obtained with normal-hearing listeners, including a total of sixty-four noise-suppressed conditions with nonlinear distortions and eight noisy conditions without subsequent noise reduction, the proposed TFS-based measure was found to predict speech intelligibility better than most envelope- and coherence-based measures. High correlation was maintained for all types of maskers tested, with a maximum correlation of r = 0.95 achieved in car and street noise conditions.
Persistent Identifier	http://hdl.handle.net/10722/184718
ISSN	0167-6393; 2023 Impact Factor: 2.4; 2023 SCImago Journal Rankings: 0.769
ISI Accession Number ID	WOS:000324227500007
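
The abstract describes the metric only at a high level. As a rough illustration of the underlying idea, and not the authors' published algorithm, the sketch below extracts a Hilbert fine-structure carrier from a clean and a degraded signal and averages their magnitude-squared coherence over frequency. The function names, the frame length and hop size, the single-band (broadband) treatment, and the plain mean over frequency bins are all assumptions made for illustration; this record does not give the paper's band decomposition or weighting. Synthetic modulated noise stands in for speech.

```python
import numpy as np


def analytic_signal(x):
    """FFT-based analytic signal: zero the negative frequencies, double the positive ones."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def hilbert_fine_structure(x):
    """Unit-amplitude carrier cos(phase) of the analytic signal (envelope discarded)."""
    return np.cos(np.angle(analytic_signal(x)))


def msc(x, y, frame_len=256, hop=128):
    """Magnitude-squared coherence per frequency bin, averaged over Hann-windowed frames."""
    window = np.hanning(frame_len)
    starts = range(0, len(x) - frame_len + 1, hop)
    X = np.array([np.fft.rfft(window * x[s:s + frame_len]) for s in starts])
    Y = np.array([np.fft.rfft(window * y[s:s + frame_len]) for s in starts])
    cross = np.abs(np.sum(X * np.conj(Y), axis=0)) ** 2
    power = np.sum(np.abs(X) ** 2, axis=0) * np.sum(np.abs(Y) ** 2, axis=0)
    return cross / (power + 1e-12)  # small constant guards against division by zero


def tfs_coherence_score(clean, degraded):
    """Hypothetical summary score: mean coherence between the two fine-structure waveforms."""
    tfs_clean = hilbert_fine_structure(clean)
    tfs_degraded = hilbert_fine_structure(degraded)
    return float(np.mean(msc(tfs_clean, tfs_degraded)))


# Synthetic stand-in for speech: Gaussian noise with a slow (4 Hz) amplitude modulation.
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
clean = (1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)) * rng.standard_normal(fs)
masker = rng.standard_normal(fs)

for noise_gain in (0.1, 1.0, 3.0):
    score = tfs_coherence_score(clean, clean + noise_gain * masker)
    print(f"noise gain {noise_gain}: score {score:.3f}")
```

Under these assumptions the score should fall as the masker gets stronger, which is the behaviour an intelligibility predictor of this kind relies on. Mapping such a score to recognition scores would still require the listener data described in the abstract.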

DC Field	Value	Language
dc.contributor.author	Chen, FF	-
dc.contributor.author	Wong, LLN	-
dc.contributor.author	Hu, Y	-
dc.date.accessioned	2013-07-15T10:05:38Z	-
dc.date.available	2013-07-15T10:05:38Z	-
dc.date.issued	2013	-
dc.identifier.citation	Speech Communication, 2013, v. 55 n. 10, p. 1011-1020	-
dc.identifier.issn	0167-6393	-
dc.identifier.uri	http://hdl.handle.net/10722/184718	-
dc.language	eng	-
dc.publisher	Elsevier BV. The Journal's web site is located at http://www.elsevier.com/locate/specom	-
dc.relation.ispartof	Speech Communication	-
dc.subject	Speech intelligibility	-
dc.subject	Hilbert fine-structure signal	-
dc.subject	Speech transmission index	-
dc.title	A Hilbert-fine-structure-derived physical metric for predicting the intelligibility of noise-distorted and noise-suppressed speech	-
dc.type	Article	-
dc.identifier.email	Chen, FF: feichen1@hku.hk	-
dc.identifier.email	Wong, LLN: llnwong@hku.hk	-
dc.identifier.authority	Chen, FF=rp01593	-
dc.identifier.authority	Wong, LLN=rp00975	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1016/j.specom.2013.06.016	-
dc.identifier.scopus	eid_2-s2.0-84881406707	-
dc.identifier.hkuros	216595	-
dc.identifier.hkuros	219367	-
dc.identifier.volume	55	-
dc.identifier.issue	10	-
dc.identifier.spage	1011	-
dc.identifier.epage	1020	-
dc.identifier.isi	WOS:000324227500007	-
dc.publisher.place	Netherlands	-
dc.identifier.issnl	0167-6393	-
