Published version: PDF (1MB). License: Creative Commons Attribution 4.0
Exploring large language models for the generation of synthetic training samples for aspect-based sentiment analysis in low resource settings
Hellwig, Nils Constantin, Fehle, Jakob and Wolff, Christian (2024) Exploring large language models for the generation of synthetic training samples for aspect-based sentiment analysis in low resource settings. Expert Systems with Applications 261, p. 125514.
Date of publication of this fulltext: 28 Oct 2024 12:45
Article
DOI to cite this document: 10.5283/epub.59433
Abstract
Aspect-Based Sentiment Analysis (ABSA) is a fine-grained task in sentiment analysis, aiming to identify sentiment expressed towards specific aspects of an entity. This paper explores the use of Large Language Models (LLMs), specifically GPT-3.5-turbo and Llama-3-70B, for generating annotated data in Aspect-Based Sentiment Analysis (ABSA), aiming to address the scarcity of labelled datasets in the field. Two low-resource scenarios are considered, with 25 and 500 manually annotated examples available. In the 25-example scenario, adding synthetic examples generated through few-shot prompting resulted in F1 scores of 81.33 for Aspect Category Detection (ACD) and 71.71 for Aspect Category Sentiment Analysis (ACSA). For the 500-example scenario, synthetic data augmentation showed a notable gain only for the ACSA task, raising the F1 score from 84.54 to 86.70.
Details
| Field | Value |
|---|---|
| Item type | Article |
| Journal or Publication Title | Expert Systems with Applications |
| Publisher | Elsevier |
| Volume | 261 |
| Page Range | p. 125514 |
| Date | 17 October 2024 |
| Institutions | Languages and Literatures > Institut für Information und Medien, Sprache und Kultur (I:IMSK) > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff); Informatics and Data Science > Department Human-Centered Computing > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff) |
| Identification Number | |
| Keywords | Natural language processing (NLP), Sentiment analysis (SA), Aspect-based sentiment analysis (ABSA), Large language models (LLMs), Synthetic data generation, Low-resource settings, Data augmentation |
| Dewey Decimal Classification | 000 Computer science, information & general works > 004 Computer science |
| Status | Published |
| Refereed | Yes, this version has been refereed |
| Created at the University of Regensburg | Yes |
| URN of the UB Regensburg | urn:nbn:de:bvb:355-epub-594331 |
| Item ID | 59433 |