Biemann, Christian ; Böhm, Karsten ; Quasthoff, Uwe ; Wolff, Christian

Automatic Discovery and Aggregation of Compound Names for the Use in Knowledge Representations

Biemann, Christian, Böhm, Karsten, Quasthoff, Uwe and Wolff, Christian (2003) Automatic Discovery and Aggregation of Compound Names for the Use in Knowledge Representations. J.UCS – Journal of Universal Computer Science 9 (6), pp. 530-541.

Publication date of this full text: 18 Sep 2009 10:33
Article
DOI for citing this document: 10.5283/epub.6837

Abstract

Automatic acquisition of information structures like Topic Maps or semantic networks from large document collections is an important issue in knowledge management. An inherent problem with automatic approaches is the treatment of multiword terms as single semantic entities. Taking company names as an example, we present a method for learning multiword terms from large text corpora exploiting their internal structure. Through the iteration of a search step and a verification step the single words typically forming company names are learnt. These name elements are used for recognizing compounds in order to use them for further processing. We give some evaluation of experiments on company name extraction and discuss some applications.
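The iteration of a search step and a verification step described in the abstract can be illustrated with a small bootstrapping sketch. This is not the authors' implementation: the function names, the capitalisation filter, the adjacency criterion, the thresholds `min_ratio` and `min_count`, and the toy corpus are all assumptions made for illustration. The sketch starts from a few seed elements (here the legal forms "AG" and "GmbH"), collects capitalised words that occur next to known elements, accepts those that occur there often enough, and finally aggregates runs of known elements into compound names.

```python
from collections import Counter


def learn_name_elements(sentences, seeds, min_ratio=0.5, min_count=2, max_rounds=10):
    """Bootstrap name elements from tokenized sentences (illustrative sketch)."""
    known = set(seeds)
    for _ in range(max_rounds):
        total = Counter()       # occurrences of each unknown capitalised word
        near_known = Counter()  # occurrences directly next to a known element
        # Search step (assumed criterion): capitalised words adjacent to known elements.
        for tokens in sentences:
            for i, tok in enumerate(tokens):
                if tok in known or not tok[:1].isupper():
                    continue
                total[tok] += 1
                left = i > 0 and tokens[i - 1] in known
                right = i + 1 < len(tokens) and tokens[i + 1] in known
                if left or right:
                    near_known[tok] += 1
        # Verification step (assumed criterion): the candidate must appear next to
        # known elements at least min_count times and in at least min_ratio of its uses.
        accepted = {w for w, c in near_known.items()
                    if c >= min_count and c / total[w] >= min_ratio}
        if not accepted:
            break
        known |= accepted
    return known


def extract_compounds(tokens, known):
    """Return maximal runs of two or more consecutive known elements."""
    compounds, run = [], []
    for tok in list(tokens) + [None]:  # the sentinel flushes the final run
        if tok in known:
            run.append(tok)
        else:
            if len(run) >= 2:
                compounds.append(" ".join(run))
            run = []
    return compounds


# Toy corpus with invented company names, for demonstration only.
corpus = [s.split() for s in [
    "shares of Acme Holding AG rose",
    "the Acme Holding GmbH expanded",
    "analysts praised Nordwind Holding AG today",
    "the weather was fine today",
]]
elements = learn_name_elements(corpus, seeds={"AG", "GmbH"})
for sentence in corpus:
    print(extract_compounds(sentence, elements))
```

On this toy corpus "Holding" is learnt in the first round and "Acme" in the second, while "Nordwind" is rejected because it occurs only once. A stricter or looser verification threshold trades precision against recall; the values used here are arbitrary.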

Participating institutions

Sprach- und Literatur- und Kulturwissenschaften > Institut für Information und Medien, Sprache und Kultur (I:IMSK) > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff)
Informatik und Data Science > Fachbereich Menschzentrierte Informatik > Lehrstuhl für Medieninformatik (Prof. Dr. Christian Wolff)

Details

Document type

Article

Title of journal or periodical

J.UCS – Journal of Universal Computer Science

Volume:

9

Issue or chapter number:

6

Page range:

pp. 530-541

Date

2003

Additional information (public)

Proc. I-Know03, Graz, July 2003

Identification number

Value	Type
10.3217/jucs-009-06-0530	DOI

Classification

Notation	Type
H.3.3	CCS
H.5.3	CCS
I.2.6	Not selected

Keywords

corpora, knowledge management, named entity extraction, semantic relations, text mining, topic maps

Dewey Decimal Classification

400 Language > 400 Linguistics
000 Computer science, information science, general works > 004 Computer science

Status

Published

Peer-reviewed

Yes, this version has been peer-reviewed

Created at the University of Regensburg

In part

URN of the University Library of Regensburg

urn:nbn:de:bvb:355-epub-68376

Document ID

6837
