Conference Paper: BRAINSCUBA: FINE-GRAINED NATURAL LANGUAGE CAPTIONS OF VISUAL CORTEX SELECTIVITY

Title: BRAINSCUBA: FINE-GRAINED NATURAL LANGUAGE CAPTIONS OF VISUAL CORTEX SELECTIVITY
Authors: Luo, Andrew F.; Henderson, Margaret M.; Tarr, Michael J.; Wehbe, Leila
Issue Date: 2024
Citation: 12th International Conference on Learning Representations, ICLR 2024, 2024
Abstract: Understanding the functional organization of higher visual cortex is a central focus in neuroscience. Past studies have primarily mapped the visual and semantic selectivity of neural populations using hand-selected stimuli, which may potentially bias results towards pre-existing hypotheses of visual cortex functionality. Moving beyond conventional approaches, we introduce a data-driven method that generates natural language descriptions for images predicted to maximally activate individual voxels of interest. Our method - Semantic Captioning Using Brain Alignments (“BrainSCUBA”) - builds upon the rich embedding space learned by a contrastive vision-language model and utilizes a pre-trained large language model to generate interpretable captions. We validate our method through fine-grained voxel-level captioning across higher-order visual regions. We further perform text-conditioned image synthesis with the captions, and show that our images are semantically coherent and yield high predicted activations. Finally, to demonstrate how our method enables scientific discovery, we perform exploratory investigations on the distribution of “person” representations in the brain, and discover fine-grained semantic selectivity in body-selective areas. Unlike earlier studies that decode text, our method derives voxel-wise captions of semantic selectivity. Our results show that BrainSCUBA is a promising means for understanding functional preferences in the brain, and provides motivation for further hypothesis-driven investigation of visual cortex. Code and project site: https://www.cs.cmu.edu/~afluo/BrainSCUBA.
Persistent Identifier: http://hdl.handle.net/10722/352502
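
The abstract describes a voxel-wise encoding approach: a linear encoder maps a contrastive vision-language embedding (e.g. CLIP) to predicted voxel activations, and the direction that maximizes a voxel's predicted response is then interpreted in the shared image-text embedding space. A loose sketch of that idea, using synthetic data as a stand-in for real CLIP features and fMRI responses (all names and the candidate-ranking step are illustrative; the actual paper generates captions with a pre-trained language model rather than ranking a fixed candidate set):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: X plays the role of CLIP image embeddings of the
# stimuli, Y the role of measured fMRI voxel responses to those stimuli.
n_images, d, n_voxels = 200, 32, 5
X = rng.standard_normal((n_images, d))
W_true = rng.standard_normal((d, n_voxels))
Y = X @ W_true + 0.1 * rng.standard_normal((n_images, n_voxels))

# 1) Fit a linear voxel-wise encoder (ridge regression): embedding -> activation.
lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)  # shape (d, n_voxels)

# 2) For a voxel of interest, its encoder weight vector gives the embedding
#    direction that maximally drives the predicted activation.
voxel = 0
optimal_embedding = W[:, voxel] / np.linalg.norm(W[:, voxel])

# 3) Illustrative substitute for caption generation: score candidate
#    (hypothetical) text embeddings by alignment with the voxel's optimal
#    direction in the shared embedding space.
candidates = rng.standard_normal((10, d))
candidates /= np.linalg.norm(candidates, axis=1, keepdims=True)
scores = candidates @ optimal_embedding
best = int(np.argmax(scores))
print("best-aligned candidate:", best)
```

The key design point this sketch reflects is that because the encoder lives in a joint image-text space, a voxel's preferred visual direction can be read out in language, rather than only by decoding text from brain activity.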

 

DC Field: Value
dc.contributor.author: Luo, Andrew F.
dc.contributor.author: Henderson, Margaret M.
dc.contributor.author: Tarr, Michael J.
dc.contributor.author: Wehbe, Leila
dc.date.accessioned: 2024-12-16T03:59:29Z
dc.date.available: 2024-12-16T03:59:29Z
dc.date.issued: 2024
dc.identifier.citation: 12th International Conference on Learning Representations, ICLR 2024, 2024
dc.identifier.uri: http://hdl.handle.net/10722/352502
dc.description.abstract: (same text as the Abstract above)
dc.language: eng
dc.relation.ispartof: 12th International Conference on Learning Representations, ICLR 2024
dc.title: BRAINSCUBA: FINE-GRAINED NATURAL LANGUAGE CAPTIONS OF VISUAL CORTEX SELECTIVITY
dc.type: Conference_Paper
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.scopus: eid_2-s2.0-85200586853
