Schmidt, Thomas ; Schiller, Fabian ; Götz, Mathias ; Wolff, Christian

A Corpus of Memes from Reddit: Acquisition, Preparation and First Case Studies

Schmidt, Thomas, Schiller, Fabian, Götz, Mathias and Wolff, Christian (2023) A Corpus of Memes from Reddit: Acquisition, Preparation and First Case Studies. In: Klein, Maike and Krupka, Daniel and Winter, Cornelia and Wohlgemuth, Volker (eds.) INFORMATIK 2023. Designing Futures: Zukünfte gestalten. Lecture Notes in Informatics (LNI), 337. Gesellschaft für Informatik e.V. (GI), Bonn, pp. 795-804. ISBN 978-3-88579-731-9.

Publication date of this full text: 28 May 2024 11:04
Book chapter


Abstract

We present a corpus of memes and their textual components acquired from the popular meme platform r/memes, a subreddit of Reddit and one of the major outlets of online meme culture. The corpus consists of the most popular memes posted on the platform from 2013 to 2021 and comprises 11,701 memes and 280,351 text tokens. We conduct several case studies focused on diachronic analysis to highlight the possibilities of the corpus for research in internet studies and online culture. We examine the general activity on the platform over the years and identify a significant increase in meme production beginning in 2017. Results of sentiment analysis show a tendency towards memes with positively classified texts. The analysis of the most frequent words per half-year spotlights the importance of certain cultural events for meme culture (e.g. the 2016 US election). An analysis of swear and sexual words using LIWC shows an overall decrease in their usage, which points to increased moderation of the platform. The corpus is publicly available to the research community for further studies.




Details

Document type: Book chapter
ISBN: 978-3-88579-731-9
Book title: INFORMATIK 2023. Designing Futures: Zukünfte gestalten
Publisher: Gesellschaft für Informatik e.V. (GI)
Place of publication: Bonn
Other series: Lecture Notes in Informatics (LNI)
Volume: 337
Page range: pp. 795-804
Date: September 2023
Institutions: Sprach- und Literatur- und Kulturwissenschaften > Institut für Information und Medien, Sprache und Kultur (I:IMSK) > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff)
Informatik und Data Science > Fachbereich Menschzentrierte Informatik > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff)
Identification number: 10.18420/inf2023_89 (DOI)
Related URLs: https://github.com/lauchblatt/reddit_memes (Supplementary Material)
Keywords: memes, internet studies, corpus, natural language processing, sentiment analysis, Reddit
Dewey Decimal Classification: 000 Computer science, information science, general works > 004 Computer science
Status: Published
Peer reviewed: Yes, this version has been peer reviewed
Created at the University of Regensburg: Yes
URN of the UB Regensburg: urn:nbn:de:bvb:355-epub-582593
Document ID: 58259
