Burghardt, Manuel ; Granvogl, Daniel ; Wolff, Christian

Creating a Lexicon of Bavarian Dialect by Means of Facebook Language Data and Crowdsourcing

Burghardt, Manuel, Granvogl, Daniel and Wolff, Christian (2016) Creating a Lexicon of Bavarian Dialect by Means of Facebook Language Data and Crowdsourcing. In: LREC 2016, Tenth Int. Conf. on Language Resources and Evaluation : May 23-28, 2016, Portorož, Slovenia; Proc. European Language Resources Association, Paris, pp. 2029-2033. ISBN 978-2-9517408-9-1.

Date of publication of this fulltext: 29 May 2017 12:00
Book section
DOI to cite this document: 10.5283/epub.35701


Abstract

Data acquisition in dialectology is typically a tedious task, as dialect samples of spoken language have to be collected via questionnaires or interviews. In this article, we suggest using the “web as a corpus” approach for dialectology. We present a case study that demonstrates how authentic language data for the Bavarian dialect (ISO 639-3:bar) can be collected automatically from the social network Facebook. We also show that Facebook can be used effectively as a crowdsourcing platform, where users are willing to translate dialect words collaboratively in order to create a common lexicon of their Bavarian dialect. Key insights from the case study are summarized as “lessons learned”, together with suggestions for future enhancements of the lexicon creation approach.



Details

Item type: Book section
ISBN: 978-2-9517408-9-1
Title of Book: LREC 2016, Tenth Int. Conf. on Language Resources and Evaluation : May 23-28, 2016, Portorož, Slovenia; Proc.
Publisher: European Language Resources Association
Place of Publication: Paris
Page Range: pp. 2029-2033
Date: 2016
Institutions: Languages and Literatures > Institut für Information und Medien, Sprache und Kultur (I:IMSK) > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff); Informatics and Data Science > Department Human-Centered Computing > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff)
Keywords: dialectology, Bavarian, ISO 639-3:bar, dialect lexicon, crowdsourcing, social media, Facebook
Dewey Decimal Classification: 000 Computer science, information & general works > 004 Computer science; 000 Computer science, information & general works > 020 Library & information sciences
Status: Published
Refereed: Yes, this version has been refereed
Created at the University of Regensburg: Yes
URN of the UB Regensburg: urn:nbn:de:bvb:355-epub-357012
Item ID: 35701
