Appears in Collections: Conference Paper
| Title | Attention-LSTM Autoencoder for Phonotactics Learning from Raw Audio Input |
|---|---|
| Authors | Do, Youngah; Tan, Lihui |
| Issue Date | 29-Jun-2024 |
| Abstract | Infants develop phonemic awareness by 6 to 8 months and phonotactic knowledge by 8 to 10 months. They have statistical learning capabilities and prefer sequences with higher transitional probabilities. However, it is unclear how these abilities operate in early phonological acquisition. This study investigates the ability of a neural network model to acquire phonotactic knowledge from a raw audio corpus. The model is designed without prior knowledge of phonemes or rules and relies solely on raw audio sequences as input. The study focuses on the aspiration alternation in English voiceless stop consonants occurring after the sibilant fricative /s/. A subset of the LibriSpeech corpus is used, containing word-initial voiceless stops and /s/-stop sequences. The data are transformed into Mel-spectrograms, and an autoencoder model is trained to compress and decode the input. Ten models are trained and evaluated, and attention matrices are analyzed to measure the model's focus on different segments. The study finds that the model exhibits sensitivity to contrast points and allocates more attention to the /s/ segment when reconstructing the following plosive. The model also differentiates between stops that follow an /s/ and those that do not. Overall, the study demonstrates how an autoencoder model implicitly learns phonotactic knowledge from raw audio data, resembling early stages of language acquisition. |
| Persistent Identifier | http://hdl.handle.net/10722/342847 |
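The abstract describes analyzing attention matrices to measure how much of the decoder's focus falls on the /s/ segment while the following plosive is reconstructed. The sketch below illustrates one way such a measurement could be computed; it is not the authors' code, and the frame indices, function name, and toy attention matrix are all hypothetical, assuming only a row-normalized decoder-to-encoder attention matrix.

```python
import numpy as np

def segment_attention(attn, decode_frames, source_frames):
    """Mean attention mass that the given decoder steps place on the
    given encoder (source) frames of the Mel-spectrogram."""
    # Select the sub-block of attention weights: rows are decoder steps,
    # columns are encoder frames.
    block = attn[np.ix_(decode_frames, source_frames)]
    # Sum over source frames per decoder step, then average across steps.
    return block.sum(axis=1).mean()

# Toy attention matrix: 8 decoder steps x 8 encoder frames, rows sum to 1
# (as softmax-normalized attention weights would).
rng = np.random.default_rng(0)
attn = rng.random((8, 8))
attn /= attn.sum(axis=1, keepdims=True)

s_frames = [0, 1, 2]        # frames assumed to cover /s/ (illustrative)
plosive_frames = [3, 4, 5]  # frames assumed to cover the following stop

score = segment_attention(attn, plosive_frames, s_frames)
print(f"attention on /s/ while decoding the stop: {score:.3f}")
```

A higher score for /s/-stop tokens than for stop-initial tokens would correspond to the pattern the abstract reports, where more attention is allocated to the /s/ segment during reconstruction of the following plosive.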
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Do, Youngah | - |
| dc.contributor.author | Tan, Lihui | - |
| dc.date.accessioned | 2024-05-02T03:06:19Z | - |
| dc.date.available | 2024-05-02T03:06:19Z | - |
| dc.date.issued | 2024-06-29 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/342847 | - |
| dc.description.abstract | Infants develop phonemic awareness by 6 to 8 months and phonotactic knowledge by 8 to 10 months. They have statistical learning capabilities and prefer sequences with higher transitional probabilities. However, it is unclear how these abilities operate in early phonological acquisition. This study investigates the ability of a neural network model to acquire phonotactic knowledge from a raw audio corpus. The model is designed without prior knowledge of phonemes or rules and relies solely on raw audio sequences as input. The study focuses on the aspiration alternation in English voiceless stop consonants occurring after the sibilant fricative /s/. A subset of the LibriSpeech corpus is used, containing word-initial voiceless stops and /s/-stop sequences. The data are transformed into Mel-spectrograms, and an autoencoder model is trained to compress and decode the input. Ten models are trained and evaluated, and attention matrices are analyzed to measure the model's focus on different segments. The study finds that the model exhibits sensitivity to contrast points and allocates more attention to the /s/ segment when reconstructing the following plosive. The model also differentiates between stops that follow an /s/ and those that do not. Overall, the study demonstrates how an autoencoder model implicitly learns phonotactic knowledge from raw audio data, resembling early stages of language acquisition. | - |
| dc.language | eng | - |
| dc.relation.ispartof | Laboratory Phonology 19 (26/06/2024-29/06/2024, Seoul) | - |
| dc.title | Attention-LSTM Autoencoder for Phonotactics Learning from Raw Audio Input | - |
| dc.type | Conference_Paper | - |