Hellwig, Nils Constantin ; Fehle, Jakob ; Wolff, Christian

Exploring large language models for the generation of synthetic training samples for aspect-based sentiment analysis in low resource settings

Hellwig, Nils Constantin, Fehle, Jakob and Wolff, Christian (2024) Exploring large language models for the generation of synthetic training samples for aspect-based sentiment analysis in low resource settings. Expert Systems with Applications 261, p. 125514.

Date of publication of this fulltext: 28 Oct 2024 12:45
Article
DOI to cite this document: 10.5283/epub.59433


Abstract

Aspect-Based Sentiment Analysis (ABSA) is a fine-grained sentiment analysis task that aims to identify the sentiment expressed towards specific aspects of an entity. This paper explores the use of Large Language Models (LLMs), specifically GPT-3.5-turbo and Llama-3-70B, for generating annotated ABSA data, with the aim of addressing the scarcity of labelled datasets in the field. Two low-resource scenarios are considered, with 25 and 500 manually annotated examples available. In the 25-example scenario, adding synthetic examples generated through few-shot prompting resulted in F1 scores of 81.33 for Aspect Category Detection (ACD) and 71.71 for Aspect Category Sentiment Analysis (ACSA). In the 500-example scenario, synthetic data augmentation yielded a notable gain only for the ACSA task, raising the F1 score from 84.54 to 86.70.
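
For readers unfamiliar with how an F1 score is obtained for a multi-label task such as ACD, the following is a minimal sketch, not the authors' evaluation code. The abstract does not state which averaging scheme was used; micro-averaging is assumed here purely for illustration, and the category names and example labels are hypothetical.

```python
from typing import List, Set


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1 over per-sentence label sets.

    Assumption: micro-averaging is used only for illustration; the
    abstract does not specify the averaging scheme behind its scores.
    """
    # Count true positives, false positives and false negatives
    # across all sentences before computing precision and recall.
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical aspect categories for three sentences.
gold = [{"FOOD", "SERVICE"}, {"PRICE"}, {"AMBIENCE"}]
pred = [{"FOOD"}, {"PRICE", "SERVICE"}, {"AMBIENCE"}]

score = micro_f1(gold, pred)
print(f"micro F1: {100 * score:.2f}")
```

For ACSA, the same function could be applied to sets of (category, polarity) pairs instead of bare category names, so that a prediction only counts as correct when both parts match.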




Details

Item type: Article
Journal or Publication Title: Expert Systems with Applications
Publisher: Elsevier
Volume: 261
Page Range: p. 125514
Date: 17 October 2024
Institutions: Languages and Literatures > Institut für Information und Medien, Sprache und Kultur (I:IMSK) > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff); Informatics and Data Science > Department Human-Centered Computing > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff)
Identification Number: 10.1016/j.eswa.2024.125514 (DOI)
Keywords: Natural language processing (NLP), Sentiment analysis (SA), Aspect-based sentiment analysis (ABSA), Large language models (LLMs), Synthetic data generation, Low-resource settings, Data augmentation
Dewey Decimal Classification: 000 Computer science, information & general works > 004 Computer science
Status: Published
Refereed: Yes, this version has been refereed
Created at the University of Regensburg: Yes
URN of the UB Regensburg: urn:nbn:de:bvb:355-epub-594331
Item ID: 59433
