Article: What Should Embeddings Embed? Autoregressive Models Represent Latent Generating Distributions

Title: What Should Embeddings Embed? Autoregressive Models Represent Latent Generating Distributions
Authors: Zhang, Liyi; Li, Michael Y.; McCoy, R. Thomas; Sumers, Theodore R.; Zhu, Jian Qiao; Griffiths, Thomas L.
Issue Date: 2025
Citation: Transactions on Machine Learning Research, 2025, v. July-2025
Abstract: Autoregressive language models have demonstrated a remarkable ability to extract latent structure from text. The embeddings from large language models have been shown to capture aspects of the syntax and semantics of language. But what should embeddings represent? We show that the embeddings from autoregressive models correspond to predictive sufficient statistics. By identifying settings where the predictive sufficient statistics are interpretable distributions over latent variables, including exchangeable models and latent state models, we show that embeddings of autoregressive models encode these explainable quantities of interest. We conduct empirical probing studies to extract information from transformers about latent generating distributions. Furthermore, we show that these embeddings generalize to out-of-distribution cases, do not exhibit token memorization, and that the information we identify is more easily recovered than other related measures. Next, we extend our analysis of exchangeable models to more realistic scenarios where the predictive sufficient statistic is difficult to identify by focusing on an interpretable subcomponent of language, topics. We show that large language models encode topic mixtures inferred by latent Dirichlet allocation (LDA) in both synthetic datasets and natural corpora.
Persistent Identifier: http://hdl.handle.net/10722/367863
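
The probing methodology summarized in the abstract can be illustrated with a small experiment. For n exchangeable coin flips x_1, ..., x_n with bias theta drawn from a Beta(a, b) prior, the posterior mean (a + sum x_i) / (a + b + n) is a predictive sufficient statistic: it is all a predictor needs to forecast the next flip. The sketch below is not the authors' code; it is a minimal illustration, assuming GPT-2 as the autoregressive model, an "H"/"T" token encoding, and a ridge-regression probe (all illustrative choices), of how one can test whether last-token embeddings encode this statistic.

    import numpy as np
    import torch
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import train_test_split
    from transformers import AutoModel, AutoTokenizer

    # Exchangeable Beta-Bernoulli source (illustrative assumption, not the
    # paper's exact setup): each sequence is n coin flips with bias theta
    # drawn from a Beta(a, b) prior.
    rng = np.random.default_rng(0)
    a, b, n = 2.0, 2.0, 20
    sequences, targets = [], []
    for _ in range(500):
        theta = rng.beta(a, b)
        flips = rng.binomial(1, theta, size=n)
        # Predictive sufficient statistic: the posterior mean of theta.
        targets.append((a + flips.sum()) / (a + b + n))
        sequences.append(" ".join("H" if f else "T" for f in flips))

    # Last-token embeddings from a small pretrained autoregressive model.
    tok = AutoTokenizer.from_pretrained("gpt2")
    lm = AutoModel.from_pretrained("gpt2").eval()
    embeddings = []
    with torch.no_grad():
        for s in sequences:
            inputs = tok(s, return_tensors="pt")
            embeddings.append(lm(**inputs).last_hidden_state[0, -1].numpy())

    # Linear probe: can the posterior mean be read out of the embedding?
    X, y = np.stack(embeddings), np.array(targets)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    probe = Ridge(alpha=1.0).fit(X_tr, y_tr)
    print("held-out probe R^2:", probe.score(X_te, y_te))

A high held-out R^2 would indicate, in line with the abstract, that the embedding linearly encodes the posterior mean of the latent generating distribution.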

 

DC Field: Value
dc.contributor.author: Zhang, Liyi
dc.contributor.author: Li, Michael Y.
dc.contributor.author: McCoy, R. Thomas
dc.contributor.author: Sumers, Theodore R.
dc.contributor.author: Zhu, Jian Qiao
dc.contributor.author: Griffiths, Thomas L.
dc.date.accessioned: 2025-12-19T08:00:03Z
dc.date.available: 2025-12-19T08:00:03Z
dc.date.issued: 2025
dc.identifier.citation: Transactions on Machine Learning Research, 2025, v. July-2025
dc.identifier.uri: http://hdl.handle.net/10722/367863
dc.description.abstract: Autoregressive language models have demonstrated a remarkable ability to extract latent structure from text. The embeddings from large language models have been shown to capture aspects of the syntax and semantics of language. But what should embeddings represent? We show that the embeddings from autoregressive models correspond to predictive sufficient statistics. By identifying settings where the predictive sufficient statistics are interpretable distributions over latent variables, including exchangeable models and latent state models, we show that embeddings of autoregressive models encode these explainable quantities of interest. We conduct empirical probing studies to extract information from transformers about latent generating distributions. Furthermore, we show that these embeddings generalize to out-of-distribution cases, do not exhibit token memorization, and that the information we identify is more easily recovered than other related measures. Next, we extend our analysis of exchangeable models to more realistic scenarios where the predictive sufficient statistic is difficult to identify by focusing on an interpretable subcomponent of language, topics. We show that large language models encode topic mixtures inferred by latent Dirichlet allocation (LDA) in both synthetic datasets and natural corpora.
dc.language: eng
dc.relation.ispartof: Transactions on Machine Learning Research
dc.title: What Should Embeddings Embed? Autoregressive Models Represent Latent Generating Distributions
dc.type: Article
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.scopus: eid_2-s2.0-105011731541
dc.identifier.volume: July-2025
dc.identifier.eissn: 2835-8856
