Published version: PDF, 315 kB. License: Creative Commons Attribution Non-commercial 4.0
Creating a Lexicon of Bavarian Dialect by Means of Facebook Language Data and Crowdsourcing
Burghardt, Manuel, Granvogl, Daniel and Wolff, Christian (2016) Creating a Lexicon of Bavarian Dialect by Means of Facebook Language Data and Crowdsourcing. In: LREC 2016, Tenth Int. Conf. on Language Resources and Evaluation : May 23-28, 2016, Portorož, Slovenia; Proc. European Language Resources Association, Paris, pp. 2029-2033. ISBN 978-2-9517408-9-1.
Date of publication of this fulltext: 29 May 2017 12:00
Book section
DOI to cite this document: 10.5283/epub.35701
Abstract
Data acquisition in dialectology is typically a tedious task, as dialect samples of spoken language have to be collected via questionnaires or interviews. In this article, we suggest using the “web as a corpus” approach for dialectology. We present a case study that demonstrates how authentic language data for the Bavarian dialect (ISO 639-3:bar) can be collected automatically from the social network Facebook. We also show that Facebook can be used effectively as a crowdsourcing platform, where users are willing to translate dialect words collaboratively in order to create a common lexicon of their Bavarian dialect. Key insights from the case study are summarized as “lessons learned”, together with suggestions for future enhancements of the lexicon creation approach.
Details
| Item type | Book section |
|---|---|
| ISBN | 978-2-9517408-9-1 |
| Title of Book | LREC 2016, Tenth Int. Conf. on Language Resources and Evaluation : May 23-28, 2016, Portorož, Slovenia; Proc. |
| Publisher | European Language Resources Association |
| Place of Publication | Paris |
| Page Range | pp. 2029-2033 |
| Date | 2016 |
| Institutions | Languages and Literatures > Institut für Information und Medien, Sprache und Kultur (I:IMSK) > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff); Informatics and Data Science > Department Human-Centered Computing > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff) |
| Keywords | dialectology, Bavarian, ISO 639-3:bar, dialect lexicon, crowdsourcing, social media, Facebook |
| Dewey Decimal Classification | 000 Computer science, information & general works > 004 Computer science; 000 Computer science, information & general works > 020 Library & information sciences |
| Status | Published |
| Refereed | Yes, this version has been refereed |
| Created at the University of Regensburg | Yes |
| URN of the UB Regensburg | urn:nbn:de:bvb:355-epub-357012 |
| Item ID | 35701 |