Article: Perception of synthesized voice quality in connected speech by Cantonese speakers

Title: Perception of synthesized voice quality in connected speech by Cantonese speakers
Authors: Yiu, EML; Murdoch, B; Hird, K; Lau, P
Keywords: Physics; Sound
Issue Date: 2002
Publisher: Acoustical Society of America. The Journal's web site is located at http://asa.aip.org/jasa.html
Citation: Journal of the Acoustical Society of America, 2002, v. 112 n. 3, p. 1091-1101
Abstract: Perceptual voice analysis is a subjective process. However, despite reports of varying degrees of intrajudge and interjudge reliability, it is widely used in clinical voice evaluation. One of the ways to improve the reliability of this procedure is to provide judges with signals as external standards so that comparison can be made in relation to these "anchor" signals. The present study used a Klatt speech synthesizer to create a set of speech signals with varying degree of three different voice qualities based on a Cantonese sentence. The primary objective of the study was to determine whether different abnormal voice qualities could be synthesized using the "built-in" synthesis parameters using a perceptual study. The second objective was to determine the relationship between acoustic characteristics of the synthesized signals and perceptual judgment. Twenty Cantonese-speaking speech pathologists with at least three years of clinical experience in perceptual voice evaluation were asked to undertake two tasks. The first was to decide whether the voice quality of the synthesized signals was normal or not. The second was to decide whether the abnormal signals should be described as rough, breathy, or vocal fry. The results showed that signals generated with a small degree of aspiration noise were perceived as breathiness while signals with a small degree of flutter or double pulsing were perceived as roughness. When the flutter or double pulsing increased further, tremor and vocal fry, rather than roughness, were perceived. Furthermore, the amount of aspiration noise, flutter, or double pulsing required for male voice stimuli was different from that required for the female voice stimuli with a similar level of perceptual breathiness and roughness. These findings showed that changes in perceived vocal quality could be achieved by systematic modifications of synthesis parameters. This opens up the possibility of using synthesized voice signals as external standards or "anchors" to improve the reliability of clinical perceptual voice evaluation. © 2002 Acoustical Society of America.
Persistent Identifier: http://hdl.handle.net/10722/45331
ISSN: 0001-4966
2023 Impact Factor: 2.1
2023 SCImago Journal Rankings: 0.687
ISI Accession Number ID: WOS:000177996400032
DC Field | Value | Language
dc.contributor.author | Yiu, EML | en_HK
dc.contributor.author | Murdoch, B | en_HK
dc.contributor.author | Hird, K | en_HK
dc.contributor.author | Lau, P | en_HK
dc.date.accessioned | 2007-10-30T06:23:02Z | -
dc.date.available | 2007-10-30T06:23:02Z | -
dc.date.issued | 2002 | en_HK
dc.identifier.citation | Journal of the Acoustical Society of America, 2002, v. 112 n. 3, p. 1091-1101 | -
dc.identifier.issn | 0001-4966 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/45331 | -
dc.format.extent | 170240 bytes | -
dc.format.extent | 2411 bytes | -
dc.format.mimetype | application/pdf | -
dc.format.mimetype | text/plain | -
dc.language | eng | en_HK
dc.publisher | Acoustical Society of America. The Journal's web site is located at http://asa.aip.org/jasa.html | en_HK
dc.relation.ispartof | Journal of the Acoustical Society of America | en_HK
dc.rights | Copyright 2002 Acoustical Society of America. This article may be downloaded for personal use only. Any other use requires prior permission of the author and the Acoustical Society of America. The following article appeared in Journal of the Acoustical Society of America, 2002, v. 112 n. 3, p. 1091-1101 and may be found at https://doi.org/10.1121/1.1500753 | -
dc.subject | Physics | en_HK
dc.subject | Sound | en_HK
dc.title | Perception of synthesized voice quality in connected speech by Cantonese speakers | en_HK
dc.type | Article | en_HK
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=0001-4966&volume=112&issue=3&spage=1091&epage=1101&date=2002&atitle=Perception+of+synthesized+voice+quality+in+connected+speech+by+Cantonese+speakers | en_HK
dc.identifier.email | Yiu, EML: eyiu@hku.hk | en_HK
dc.identifier.authority | Yiu, EML=rp00981 | en_HK
dc.description.nature | published_or_final_version | en_HK
dc.identifier.doi | 10.1121/1.1500753 | en_HK
dc.identifier.pmid | 12243157 | en_HK
dc.identifier.scopus | eid_2-s2.0-0036711748 | en_HK
dc.identifier.hkuros | 78478 | -
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-0036711748&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.volume | 112 | en_HK
dc.identifier.issue | 3 | -
dc.identifier.spage | 1091 | en_HK
dc.identifier.epage | 1101 | en_HK
dc.identifier.isi | WOS:000177996400032 | -
dc.publisher.place | United States | en_HK
dc.identifier.scopusauthorid | Yiu, EML=7003337895 | en_HK
dc.identifier.scopusauthorid | Murdoch, B=7005161745 | en_HK
dc.identifier.scopusauthorid | Hird, K=6701518192 | en_HK
dc.identifier.scopusauthorid | Lau, P=23768195700 | en_HK
dc.identifier.issnl | 0001-4966 | -
