
Conference Paper: Do large language models resolve semantic ambiguities in the same way as humans? The case of word segmentation in Chinese sentence reading

Title: Do large language models resolve semantic ambiguities in the same way as humans? The case of word segmentation in Chinese sentence reading
Authors: Liao, W; Wang, Z; Shum, K; Chan, AB; Hsiao, J
Issue Date: 24-Jul-2024
Abstract

Large language models (LLMs) are trained to predict words without the explicit semantic word representations that humans have. Here we compared LLMs and humans in resolving semantic ambiguities at the word/token level by examining the segmentation of overlapping ambiguous strings in Chinese sentence reading, where a three-character string “ABC” can be segmented as either “AB/C” or “A/BC” depending on the context. Although LLMs performed worse than humans, they showed a similar interaction effect between segmentation structure and word frequency order, suggesting that this effect in humans could be accounted for by statistical learning of word/token occurrence regularities, without assuming explicit semantic word representations. Nevertheless, across stimuli the LLMs’ responses were not correlated with any human performance or eye movement measure, suggesting differences in the underlying processing mechanisms. It is therefore essential to understand these differences through explainable AI (XAI) methods to facilitate LLM adoption.
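As a concrete illustration of the overlapping-ambiguity case described in the abstract, the sketch below enumerates the two candidate segmentations of a three-character string and checks each against a toy lexicon. The example string 美国会, the lexicon, and the function names are hypothetical illustrations, not the study's stimuli or method; in the experiments it is the surrounding sentence context (and, for the LLMs, their learned word/token statistics) that selects between the two readings.

```python
# Minimal sketch (illustrative only, not from the paper): enumerate the two
# candidate segmentations of a three-character overlapping ambiguous string
# "ABC" and check each against a toy lexicon. The string, lexicon, and
# function names here are hypothetical.

TOY_LEXICON = {
    "美国": "America",          # "AB" reading
    "国会": "congress",         # "BC" reading
    "美": "US (abbreviation)",
    "会": "will / meeting",
}

def candidate_segmentations(abc):
    """Return the two possible segmentations AB/C and A/BC of a 3-character string."""
    assert len(abc) == 3, "expects exactly three characters"
    a, b, c = abc
    return [(a + b, c), (a, b + c)]

def lexically_valid(segmentation, lexicon=TOY_LEXICON):
    """A candidate reading counts only if every piece is a word in the lexicon."""
    return all(piece in lexicon for piece in segmentation)

if __name__ == "__main__":
    abc = "美国会"  # both 美国/会 ("the US will ...") and 美/国会 ("US congress") are plausible
    for seg in candidate_segmentations(abc):
        print("/".join(seg), "lexically valid:", lexically_valid(seg))
    # Both readings pass the lexicon check in isolation; only sentence context
    # (or, for an LLM, statistics over likely continuations) disambiguates them.
```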


Persistent Identifier: http://hdl.handle.net/10722/343639

 

dc.contributor.author: Liao, W
dc.contributor.author: Wang, Z
dc.contributor.author: Shum, K
dc.contributor.author: Chan, AB
dc.contributor.author: Hsiao, J
dc.date.accessioned: 2024-05-24T04:12:39Z
dc.date.available: 2024-05-24T04:12:39Z
dc.date.issued: 2024-07-24
dc.identifier.uri: http://hdl.handle.net/10722/343639
dc.description.abstract: Large language models (LLMs) were trained to predict words without having explicit semantic word representations as humans do. Here we compared LLMs and humans in resolving semantic ambiguities at the word/token level by examining the case of segmenting overlapping ambiguous strings in Chinese sentence reading, where three characters “ABC” could be segmented in either “AB/C” or “A/BC” depending on the context. We showed that although LLMs performed worse than humans, they demonstrated a similar interaction effect between segmentation structure and word frequency order, suggesting that this effect observed in humans could be accounted for by statistical learning of word/token occurrence regularities without assuming an explicit semantic word representation. Nevertheless, across stimuli LLMs’ responses were not correlated with any human performance or eye movement measures, suggesting differences in the underlying processing mechanisms. Thus, it is essential to understand these differences through XAI methods to facilitate LLM adoption.
dc.language: eng
dc.relation.ispartof: CogSci 2024 (24/07/2024-27/07/2024, Rotterdam)
dc.title: Do large language models resolve semantic ambiguities in the same way as humans? The case of word segmentation in Chinese sentence reading
dc.type: Conference_Paper
