Published Version: Download (PDF, 2MB) | License: Creative Commons Attribution 4.0
GPT-4o for Visual Political Communication: Toward Automated Image Type Analysis
Achmann-Denkler, Michael, Haim, Mario and Wolff, Christian (2025) GPT-4o for Visual Political Communication: Toward Automated Image Type Analysis. In: Websci '25: 17th ACM Web Science Conference 2025, May 20-24, 2025, New Brunswick, NJ, USA.
Date of publication of this fulltext: 29 Jul 2025 04:28
Conference or workshop item
DOI to cite this document: 10.5283/epub.77445
Abstract
This study explores the potential of multimodal large language models (LLMs), specifically GPT-4o, for automating visual political communication analysis on social media. Using a hierarchical decision tree, we guided non-expert annotators in categorizing Instagram campaign images, achieving reliable annotations (Krippendorff’s α = 0.66–0.86). The annotated dataset was used to test GPT-4o’s ability to classify images through prompts reflecting either a hierarchical structure or flat descriptions. Overall, classification for dominant categories like Campaign Event and Collage reached high F1 scores (0.89-0.90), while hierarchies in prompts influenced the outcome minimally. These findings demonstrate that LLMs can effectively assist in classifying selected image types, reducing the workload for human annotators.
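The per-category F1 scores reported above combine precision and recall for a single image type. As a minimal sketch of how such scores can be computed from paired human and model labels, the following uses a hypothetical helper `per_class_f1` and invented toy labels; neither the function nor the data comes from the paper, and the authors' actual evaluation code may differ.

```python
from collections import Counter


def per_class_f1(gold, predicted):
    """Return {label: F1} for every label seen in gold or predicted.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # model wrongly predicted class p
            fn[g] += 1   # model missed class g

    scores = {}
    for label in set(gold) | set(predicted):
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Toy labels, invented for illustration only (not data from the study).
gold = ["Campaign Event", "Collage", "Campaign Event", "Collage", "Campaign Event"]
pred = ["Campaign Event", "Collage", "Collage", "Collage", "Campaign Event"]
scores = per_class_f1(gold, pred)
```

In this toy example one Campaign Event image is misclassified as Collage, which lowers recall for Campaign Event and precision for Collage, so both categories end up with the same F1.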
Details
| Field | Value |
|---|---|
| Item type | Conference or workshop item (Paper) |
| ISBN | 979-8-4007-1483-2 |
| Title of Book | Websci '25: Proceedings of the 17th ACM Web Science Conference 2025 |
| Publisher | Association for Computing Machinery |
| Place of Publication | New York |
| Page Range | pp. 504-509 |
| Date | 2025 |
| Institutions | Languages and Literatures > Institut für Information und Medien, Sprache und Kultur (I:IMSK) > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff); Informatics and Data Science > Department Human-Centered Computing > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff) |
| Identification Number | |
| Keywords | Multimodal Large Language Models, Visual Political Communication, Image Classification, Political Campaign Analysis, Social Media Analysis |
| Dewey Decimal Classification | 000 Computer science, information & general works > 004 Computer science |
| Status | Published |
| Refereed | Yes, this version has been refereed |
| Created at the University of Regensburg | Partially |
| URN of the UB Regensburg | urn:nbn:de:bvb:355-epub-774459 |
| Item ID | 77445 |