Article: An Audio-Ultrasound Synchronized Database of Tongue Movement for Mandarin speech

Title: An Audio-Ultrasound Synchronized Database of Tongue Movement for Mandarin speech
Authors: Yang, Yudong; Su, Rongfeng; Zhao, Shaofeng; Wei, Jianguo; Ng, Manwa Lawrence; Yan, Nan; Wang, Lan
Issue Date: 11-Apr-2025
Publisher: Nature Research
Citation: Scientific Data, 2025, v. 12
Abstract

Ultrasound imaging has been widely adopted in speech research to visualize dynamic tongue movements during speech production. These images are commonly used as visual feedback in interventions for articulation disorders or as visual cues in speech recognition. Nevertheless, high-quality audio-ultrasound datasets remain scarce. The present study therefore aims to construct a multimodal database designed for Mandarin speech. The dataset integrates synchronized ultrasound images of lingual movement with the corresponding audio recordings and text annotations, elicited from 43 healthy speakers and 11 patients with dysarthria through speech tasks (including vowels, monosyllables, and sentences), with a total duration of 22.31 hours. In addition, a customized helmet structure was employed to stabilize the ultrasound probe, precisely controlling for head movement and minimizing displacement interference. The proposed database offers clear value for automatic speech recognition, silent speech interface development, and research in speech pathology and linguistics.


Persistent Identifier: http://hdl.handle.net/10722/359157
ISSN: 2052-4463
2023 Impact Factor: 5.8
2023 SCImago Journal Rankings: 1.937


DC Field: Value
dc.contributor.author: Yang, Yudong
dc.contributor.author: Su, Rongfeng
dc.contributor.author: Zhao, Shaofeng
dc.contributor.author: Wei, Jianguo
dc.contributor.author: Ng, Manwa Lawrence
dc.contributor.author: Yan, Nan
dc.contributor.author: Wang, Lan
dc.date.accessioned: 2025-08-22T00:30:38Z
dc.date.available: 2025-08-22T00:30:38Z
dc.date.issued: 2025-04-11
dc.identifier.citation: Scientific Data, 2025, v. 12
dc.identifier.issn: 2052-4463
dc.identifier.uri: http://hdl.handle.net/10722/359157
dc.description.abstract: Ultrasound imaging has been widely adopted in speech research to visualize dynamic tongue movements during speech production. These images are commonly used as visual feedback in interventions for articulation disorders or as visual cues in speech recognition. Nevertheless, high-quality audio-ultrasound datasets remain scarce. The present study therefore aims to construct a multimodal database designed for Mandarin speech. The dataset integrates synchronized ultrasound images of lingual movement with the corresponding audio recordings and text annotations, elicited from 43 healthy speakers and 11 patients with dysarthria through speech tasks (including vowels, monosyllables, and sentences), with a total duration of 22.31 hours. In addition, a customized helmet structure was employed to stabilize the ultrasound probe, precisely controlling for head movement and minimizing displacement interference. The proposed database offers clear value for automatic speech recognition, silent speech interface development, and research in speech pathology and linguistics.
dc.language: eng
dc.publisher: Nature Research
dc.relation.ispartof: Scientific Data
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.title: An Audio-Ultrasound Synchronized Database of Tongue Movement for Mandarin speech
dc.type: Article
dc.identifier.doi: 10.1038/s41597-025-04917-w
dc.identifier.volume: 12
dc.identifier.eissn: 2052-4463
dc.identifier.issnl: 2052-4463
