Schmidt, Thomas ; Dennerlein, Katrin ; Wolff, Christian

Emotion Classification in German Plays with Transformer-based Language Models Pretrained on Historical and Contemporary Language

Schmidt, Thomas, Dennerlein, Katrin and Wolff, Christian

(2021) Emotion Classification in German Plays with Transformer-based Language Models Pretrained on Historical and Contemporary Language. In: The 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, November 11, 2021, Punta Cana, Dominican Republic (online).

Publication date of this full text: 03 Feb 2022 09:06
Conference or workshop contribution
DOI for citing this document: 10.5283/epub.51576

License: Creative Commons Attribution 4.0 International

Abstract

We present results of a project on emotion classification on historical German plays of Enlightenment, Storm and Stress, and German Classicism. We have developed a hierarchical annotation scheme consisting of 13 sub-emotions like suffering, love and joy that sum up to 6 main and 2 polarity classes (positive/negative). We have conducted textual annotations on 11 German plays and have acquired over 13,000 emotion annotations by two annotators per play. We have evaluated multiple traditional machine learning approaches as well as transformer-based models pretrained on historical and contemporary language for a single-label text sequence emotion classification for the different emotion categories. The evaluation is carried out on three different instances of the corpus: (1) taking all annotations, (2) filtering overlapping annotations by annotators, (3) applying a heuristic for speech-based analysis. Best results are achieved on the filtered corpus with the best models being large transformer-based models pretrained on contemporary German language. For the polarity classification accuracies of up to 90% are achieved. The accuracies become lower for settings with a higher number of classes, achieving 66% for 13 sub-emotions. Further pretraining of a historical model with a corpus of dramatic texts led to no improvements.
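
The hierarchical scheme described in the abstract, in which sub-emotions roll up into main classes and polarity, could be represented roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: only the sub-emotion names suffering, love and joy come from the abstract, while the main-class labels (`main_a`, `main_b`, `main_c`), the polarity assignments, and the helpers `project_label` and `class_distribution` are hypothetical placeholders chosen for illustration.

```python
from collections import Counter

# Hypothetical excerpt of the hierarchy: sub-emotion -> (main class, polarity).
# Only the sub-emotion names appear in the abstract; the main-class names and
# polarity assignments below are assumptions made for this sketch.
HIERARCHY = {
    "suffering": ("main_a", "negative"),
    "love": ("main_b", "positive"),
    "joy": ("main_c", "positive"),
}

LEVELS = ("sub", "main", "polarity")


def project_label(sub_emotion: str, level: str) -> str:
    """Map a sub-emotion label to the requested level of the hierarchy."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    if sub_emotion not in HIERARCHY:
        raise KeyError(f"unknown sub-emotion: {sub_emotion!r}")
    if level == "sub":
        return sub_emotion
    main_class, polarity = HIERARCHY[sub_emotion]
    return main_class if level == "main" else polarity


def class_distribution(annotations: list[str], level: str) -> Counter:
    """Count annotations per class after projecting them to one level."""
    return Counter(project_label(a, level) for a in annotations)


if __name__ == "__main__":
    sample = ["joy", "love", "suffering", "joy"]
    for level in LEVELS:
        print(level, dict(class_distribution(sample, level)))
```

Keeping a single mapping from the finest level upwards means one set of annotations can be re-projected for each classification setting (sub-emotion, main class, polarity) without re-annotating.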

Participating institutions

Sprach- und Literatur- und Kulturwissenschaften > Institut für Information und Medien, Sprache und Kultur (I:IMSK) > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff)
Informatik und Data Science > Fachbereich Menschzentrierte Informatik > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff)

Details

Document type

Conference or workshop contribution (Paper)

Publisher:

Association for Computational Linguistics

Open access type:

Gold (no APC)

Page range:

pp. 67-79

Date

2021

Additional information (public)

Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

Identification number

Value	Type
10.18653/v1/2021.latechclfl-1.8	DOI

Dewey Decimal Classification

000 Computer science, information science, general works > 004 Computer science

Status

Published

Peer-reviewed

Yes, this version has been peer-reviewed

Originated at the University of Regensburg

URN of the University Library of Regensburg

urn:nbn:de:bvb:355-epub-515760

Document ID

51576
