
Article: Labeling Social Media Posts: Does Showing Coders Multimodal Content Produce Better Human Annotation, and a Better Machine Classifier?

Title: Labeling Social Media Posts: Does Showing Coders Multimodal Content Produce Better Human Annotation, and a Better Machine Classifier?
Authors: Chen, Haohan; Bisbee, James; Tucker, Joshua A.; Nagler, Jonathan
Issue Date: 2-Jul-2025
Publisher: Cambridge University Press
Citation: Political Science Research and Methods, 2025
Abstract: The increasing multimodality (e.g., images, videos, links) of social media data presents opportunities and challenges. But text-as-data methods continue to dominate as modes of classification, as multimodal social media data are costly to collect and label. Researchers who face a budget constraint may need to make informed decisions regarding whether to collect and label only the textual content of social media data or their full multimodal content. In this article, we develop five measures and an experimental framework to assist with these decisions. We propose five performance metrics to measure the costs and benefits of multimodal labeling: average time per post, average time per valid response, valid response rate, intercoder agreement, and classifier's predictive power. To estimate these measures, we introduce an experimental framework to evaluate coders' performance under text-only and multimodal labeling conditions. We illustrate the method with a tweet labeling experiment.
Persistent Identifier: http://hdl.handle.net/10722/366620
ISSN: 2049-8470
2023 Impact Factor: 2.5
2023 SCImago Journal Rankings: 2.431

 

DC Field | Value | Language
dc.contributor.author | Chen, Haohan | -
dc.contributor.author | Bisbee, James | -
dc.contributor.author | Tucker, Joshua A. | -
dc.contributor.author | Nagler, Jonathan | -
dc.date.accessioned | 2025-11-25T04:20:35Z | -
dc.date.available | 2025-11-25T04:20:35Z | -
dc.date.issued | 2025-07-02 | -
dc.identifier.citation | Political Science Research and Methods, 2025 | -
dc.identifier.issn | 2049-8470 | -
dc.identifier.uri | http://hdl.handle.net/10722/366620 | -
dc.description.abstract | The increasing multimodality (e.g., images, videos, links) of social media data presents opportunities and challenges. But text-as-data methods continue to dominate as modes of classification, as multimodal social media data are costly to collect and label. Researchers who face a budget constraint may need to make informed decisions regarding whether to collect and label only the textual content of social media data or their full multimodal content. In this article, we develop five measures and an experimental framework to assist with these decisions. We propose five performance metrics to measure the costs and benefits of multimodal labeling: average time per post, average time per valid response, valid response rate, intercoder agreement, and classifier's predictive power. To estimate these measures, we introduce an experimental framework to evaluate coders' performance under text-only and multimodal labeling conditions. We illustrate the method with a tweet labeling experiment. | -
dc.language | eng | -
dc.publisher | Cambridge University Press | -
dc.relation.ispartof | Political Science Research and Methods | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.title | Labeling Social Media Posts: Does Showing Coders Multimodal Content Produce Better Human Annotation, and a Better Machine Classifier? | -
dc.type | Article | -
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.1017/psrm.2025.10010 | -
dc.identifier.eissn | 2049-8489 | -
dc.identifier.issnl | 2049-8470 | -
