Conference Paper: Do large language models resolve semantic ambiguities in the same way as humans? The case of word segmentation in Chinese sentence reading
Field | Value
---|---
Title | Do large language models resolve semantic ambiguities in the same way as humans? The case of word segmentation in Chinese sentence reading
Authors | Liao, W; Wang, Z; Shum, K; Chan, AB; Hsiao, J
Issue Date | 24-Jul-2024
Abstract | Large language models (LLMs) are trained to predict words without the explicit semantic word representations that humans have. Here we compared LLMs and humans in resolving semantic ambiguities at the word/token level by examining the segmentation of overlapping ambiguous strings in Chinese sentence reading, where a three-character string “ABC” can be segmented as either “AB/C” or “A/BC” depending on the context. We showed that although LLMs performed worse than humans, they demonstrated a similar interaction effect between segmentation structure and word frequency order, suggesting that this effect observed in humans could be accounted for by statistical learning of word/token occurrence regularities, without assuming an explicit semantic word representation. Nevertheless, across stimuli the LLMs’ responses were not correlated with any human performance or eye movement measure, suggesting differences in the underlying processing mechanisms. It is therefore essential to understand these differences through explainable AI (XAI) methods to facilitate LLM adoption.
Persistent Identifier | http://hdl.handle.net/10722/343639
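
The overlapping-ambiguity phenomenon described in the abstract can be made concrete with a short sketch. The following is a minimal, hypothetical illustration, not the authors' stimuli or models: the toy lexicon, its frequency counts, and the function names are all assumptions. It shows how a simple unigram (word-frequency) segmenter chooses between the two readings of an overlapping ambiguous string such as “研究生活” (“研究生/活”, graduate student / live, versus “研究/生活”, research / life).

```python
from math import log

# Toy lexicon with hypothetical relative frequencies (illustrative
# assumptions only; not the word counts or stimuli used in the paper).
LEXICON = {
    "研究": 50,    # research
    "生": 30,      # give birth to / raw
    "研究生": 20,  # graduate student
    "活": 10,      # live
    "生活": 40,    # life
}

def segmentations(s):
    """Enumerate every way to split s into words found in LEXICON."""
    if not s:
        yield []
        return
    for i in range(1, len(s) + 1):
        head = s[:i]
        if head in LEXICON:
            for rest in segmentations(s[i:]):
                yield [head] + rest

def best_segmentation(s):
    """Return the segmentation with the highest unigram log-probability."""
    total = sum(LEXICON.values())
    def score(seg):
        return sum(log(LEXICON[w] / total) for w in seg)
    candidates = list(segmentations(s))
    return max(candidates, key=score) if candidates else None

# "研究生活" is an overlapping ambiguous string: the middle character
# "生" can attach left ("研究生/活") or right ("研究/生活").
print(best_segmentation("研究生活"))  # -> ['研究', '生活'] under these toy counts
```

A unigram scorer like this captures only the word-frequency side of the effect; resolving the same string differently in different sentence contexts, as the humans and LLMs in the study do, requires a context-sensitive model (e.g., n-gram statistics or an LLM's token predictions).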
DC Field | Value | Language |
---|---|---
dc.contributor.author | Liao, W | - |
dc.contributor.author | Wang, Z | - |
dc.contributor.author | Shum, K | - |
dc.contributor.author | Chan, AB | - |
dc.contributor.author | Hsiao, J | - |
dc.date.accessioned | 2024-05-24T04:12:39Z | - |
dc.date.available | 2024-05-24T04:12:39Z | - |
dc.date.issued | 2024-07-24 | - |
dc.identifier.uri | http://hdl.handle.net/10722/343639 | - |
dc.description.abstract | Large language models (LLMs) are trained to predict words without the explicit semantic word representations that humans have. Here we compared LLMs and humans in resolving semantic ambiguities at the word/token level by examining the segmentation of overlapping ambiguous strings in Chinese sentence reading, where a three-character string “ABC” can be segmented as either “AB/C” or “A/BC” depending on the context. We showed that although LLMs performed worse than humans, they demonstrated a similar interaction effect between segmentation structure and word frequency order, suggesting that this effect observed in humans could be accounted for by statistical learning of word/token occurrence regularities, without assuming an explicit semantic word representation. Nevertheless, across stimuli the LLMs’ responses were not correlated with any human performance or eye movement measure, suggesting differences in the underlying processing mechanisms. It is therefore essential to understand these differences through explainable AI (XAI) methods to facilitate LLM adoption. | -
dc.language | eng | - |
dc.relation.ispartof | CogSci 2024 (24/07/2024-27/07/2024, Rotterdam) | -
dc.title | Do large language models resolve semantic ambiguities in the same way as humans? The case of word segmentation in Chinese sentence reading | - |
dc.type | Conference_Paper | - |