Abstract
1. Research Context and Previous Work
Recently, a growing number of systems that allow personal content annotation (tagging) have been
created, ranging from personal sites for organising bookmarks (del.icio.us), photos (flickr.com) or
videos (video.google.com, youtube.com) to systems for managing bibliographies for scientific research
projects (citeulike.org, connotea.org). Simultaneously, a debate on the pros and cons of allowing users
to add personal keywords to digital content has arisen.
One recurrent point of discussion is whether tagging can solve the well-known vocabulary problem: in
order to support successful retrieval in complex environments, it is necessary to index an object with a
variety of aliases (cf. Furnas 1987). In this spirit, social tagging extends the rigid pool of traditional
keywords with user-created retrieval vocabularies. Furthermore, tagging goes beyond simple
personal content-based keywords by providing meta-keywords like funny or interesting that “identify
qualities or characteristics” (Golder and Huberman 2006, Kipp and Campbell 2006, Kipp 2007,
Feinberg 2006, Kroski 2005). Conversely, tagging is claimed to introduce semantic difficulties
that may reduce the precision and recall of such systems (e.g. the polysemy problem, cf. Marlow
2006, Lakoff 2005, Golder and Huberman 2006).
Empirical research on social tagging is still rare and mostly takes a computational linguistics or librarian
point of view (Voß 2007), focusing either on automatic statistical analyses of large data sets or on the
intellectual inspection of single cases of tag usage. Some researchers have studied the evolution of tag
vocabularies and tag distribution in specific systems (Golder and Huberman 2006, Hammond 2005).
Others concentrate on tagging behaviour and tagger characteristics in collaborative systems
(Hammond 2005, Kipp and Campbell 2007, Feinberg 2006, Sen 2006). However, little research has
been conducted on the functional and linguistic characteristics of tags.1 An analysis of these patterns
could show differences between user wording and conventional keywording. In order to provide a
reasonable basis for comparison, a classification system for existing tags is needed.
Therefore, our main research questions are as follows:
• Is it possible to discover regular patterns in tag usage and to establish a stable category model?
• Does a specific tagging language comparable to internet slang or chatspeak evolve?
• How do social tags differ from traditional (author / expert) keywords?
• To what degree are social tags taken from or findable in the full text of the tagged resource?
• Do tags in a research literature context go beyond simple content description (e.g. tags indicating
time or task-related information, cf. Kipp et al. 2006)?