| Published version: Download (PDF | 2MB) | License: Creative Commons Attribution 4.0 International |
GPT-4o for Visual Political Communication: Toward Automated Image Type Analysis
Achmann-Denkler, Michael, Haim, Mario and Wolff, Christian (2025) GPT-4o for Visual Political Communication: Toward Automated Image Type Analysis. In: Websci '25: 17th ACM Web Science Conference 2025, May 20-24, 2025, New Brunswick, NJ, USA.
Publication date of this full text: 29 Jul 2025 04:28
Conference or workshop contribution
DOI for citing this document: 10.5283/epub.77445
Abstract
This study explores the potential of multimodal large language models (LLMs), specifically GPT-4o, for automating visual political communication analysis on social media. Using a hierarchical decision tree, we guided non-expert annotators in categorizing Instagram campaign images, achieving reliable annotations (Krippendorff’s α = 0.66–0.86). The annotated dataset was used to test GPT-4o’s ability to classify images through prompts reflecting either a hierarchical structure or flat descriptions. Overall, classification for dominant categories like Campaign Event and Collage reached high F1 scores (0.89-0.90), while hierarchies in prompts influenced the outcome minimally. These findings demonstrate that LLMs can effectively assist in classifying selected image types, reducing the workload for human annotators.
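The two metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the toy labels, and the extra category "Other" are assumptions made for illustration, and only "Campaign Event" and "Collage" come from the abstract. The first function computes Krippendorff's α for nominal labels from a coincidence matrix; the second computes per-class F1 for predicted labels against human annotations.

```python
from collections import Counter


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of tuples; each tuple holds the labels that the
    annotators assigned to one image. Units with fewer than two labels
    are not pairable and are skipped.
    """
    coincidences = Counter()
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue
        # Every ordered pair of labels from different annotators
        # contributes 1 / (m - 1) to the coincidence matrix.
        for i, a in enumerate(ratings):
            for j, b in enumerate(ratings):
                if i != j:
                    coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    )
    if expected == 0:
        # Only one label was ever used, so alpha is undefined.
        raise ValueError("alpha is undefined when only one label occurs")
    return 1.0 - (n - 1) * observed / expected


def per_class_f1(gold, predicted):
    """Return {label: F1} for single-label classification."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label was wrong
            fn[g] += 1  # gold label was missed
    scores = {}
    for label in set(gold) | set(predicted):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


if __name__ == "__main__":
    # Made-up toy data: two annotators, four images.
    annotations = [("a", "a"), ("a", "a"), ("b", "b"), ("a", "b")]
    print(round(krippendorff_alpha_nominal(annotations), 3))

    # Made-up toy data: human labels versus model output.
    gold = ["Campaign Event", "Campaign Event", "Collage", "Collage", "Other"]
    pred = ["Campaign Event", "Collage", "Collage", "Collage", "Other"]
    print(per_class_f1(gold, pred))
```

Reporting F1 per class rather than as one averaged score matches how the abstract states its results, since it singles out the dominant categories.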
Details
| Field | Value |
|---|---|
| Document type | Conference or workshop contribution (Paper) |
| ISBN | 979-8-4007-1483-2 |
| Book title | Websci '25: Proceedings of the 17th ACM Web Science Conference 2025 |
| Publisher | Association for Computing Machinery |
| Place of publication | New York |
| Page range | pp. 504-509 |
| Date | 2025 |
| Institutions | Sprach- und Literatur- und Kulturwissenschaften > Institut für Information und Medien, Sprache und Kultur (I:IMSK) > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff); Informatik und Data Science > Fachbereich Menschzentrierte Informatik > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff) |
| Identification number | |
| Keywords | Multimodal Large Language Models, Visual Political Communication, Image Classification, Political Campaign Analysis, Social Media Analysis |
| Dewey Decimal Classification | 000 Computer science, information science, general works > 004 Computer science |
| Status | Published |
| Peer reviewed | Yes, this version has been peer reviewed |
| Created at the University of Regensburg | In part |
| URN of the University Library of Regensburg | urn:nbn:de:bvb:355-epub-774459 |
| Document ID | 77445 |