Article: An Audio-Ultrasound Synchronized Database of Tongue Movement for Mandarin speech

Title: An Audio-Ultrasound Synchronized Database of Tongue Movement for Mandarin speech
Authors: Yang, Yudong; Su, Rongfeng; Zhao, Shaofeng; Wei, Jianguo; Ng, Manwa Lawrence; Yan, Nan; Wang, Lan
Issue Date: 11-Apr-2025
Publisher: Nature Research
Citation: Scientific Data, 2025, v. 12
Abstract

Ultrasound imaging has been widely adopted in speech research to visualize dynamic tongue movements during speech production. These images are commonly used as visual feedback in interventions for articulation disorders or as visual cues in speech recognition. Nevertheless, high-quality audio-ultrasound datasets remain scarce. The present study therefore aims to construct a multimodal database designed for Mandarin speech. The dataset integrates synchronized ultrasound images of lingual movement with the corresponding audio recordings and text annotations, elicited from 43 healthy speakers and 11 patients with dysarthria through speech tasks (including vowels, monosyllables, and sentences), with a total duration of 22.31 hours. In addition, a customized helmet structure was employed to stabilize the ultrasound probe, precisely controlling for head movement and minimizing displacement interference. The proposed database offers clear value for automatic speech recognition, silent speech interface development, and research in speech pathology and linguistics.


Persistent Identifier: http://hdl.handle.net/10722/359157
ISSN: 2052-4463
2023 Impact Factor: 5.8
2023 SCImago Journal Rankings: 1.937


DC Field: Value
dc.contributor.author: Yang, Yudong
dc.contributor.author: Su, Rongfeng
dc.contributor.author: Zhao, Shaofeng
dc.contributor.author: Wei, Jianguo
dc.contributor.author: Ng, Manwa Lawrence
dc.contributor.author: Yan, Nan
dc.contributor.author: Wang, Lan
dc.date.accessioned: 2025-08-22T00:30:38Z
dc.date.available: 2025-08-22T00:30:38Z
dc.date.issued: 2025-04-11
dc.identifier.citation: Scientific Data, 2025, v. 12
dc.identifier.issn: 2052-4463
dc.identifier.uri: http://hdl.handle.net/10722/359157
dc.description.abstract: Ultrasound imaging has been widely adopted in speech research to visualize dynamic tongue movements during speech production. These images are commonly used as visual feedback in interventions for articulation disorders or as visual cues in speech recognition. Nevertheless, high-quality audio-ultrasound datasets remain scarce. The present study therefore aims to construct a multimodal database designed for Mandarin speech. The dataset integrates synchronized ultrasound images of lingual movement with the corresponding audio recordings and text annotations, elicited from 43 healthy speakers and 11 patients with dysarthria through speech tasks (including vowels, monosyllables, and sentences), with a total duration of 22.31 hours. In addition, a customized helmet structure was employed to stabilize the ultrasound probe, precisely controlling for head movement and minimizing displacement interference. The proposed database offers clear value for automatic speech recognition, silent speech interface development, and research in speech pathology and linguistics.
dc.language: eng
dc.publisher: Nature Research
dc.relation.ispartof: Scientific Data
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.title: An Audio-Ultrasound Synchronized Database of Tongue Movement for Mandarin speech
dc.type: Article
dc.identifier.doi: 10.1038/s41597-025-04917-w
dc.identifier.volume: 12
dc.identifier.eissn: 2052-4463
dc.identifier.issnl: 2052-4463
