Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models

Waack, Stephan and Keller, Oliver and Asper, Roman and Brodag, Thomas and Damm, Carsten and Fricke, Wolfgang Florian and Surovcik, Katharina and Meinicke, Peter and Merkl, Rainer (2006) Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models. BMC Bioinformatics 7, p. 142.

Other URL: http://www.biomedcentral.com/1471-2105/7/142

Abstract

Background: Horizontal gene transfer (HGT) is considered a strong evolutionary force shaping the content of microbial genomes in a substantial manner. It is the difference in speed enabling the rapid adaptation to changing environmental demands that distinguishes HGT from gene genesis, duplications or mutations. For a precise characterization, algorithms are needed that identify transfer events with high reliability. Frequently, the transferred pieces of DNA have a considerable length, comprise several genes and are called genomic islands (GIs) or more specifically pathogenicity or symbiotic islands.

Results: We have implemented the program SIGI-HMM, which predicts GIs and the putative donor of each individual alien gene. It is based on the analysis of the codon usage (CU) of each individual gene of a genome under study. The CU of each gene is compared against a carefully selected set of CU tables representing microbial donors or highly expressed genes. Multiple tests are used to identify putatively alien genes, to predict putative donors and to mask putatively highly expressed genes. Thus, we determine the states and emission probabilities of an inhomogeneous hidden Markov model operating at the gene level. For the transition probabilities, we draw upon classical test theory with the intention of integrating a sensitivity controller in a consistent manner. SIGI-HMM was written in Java and is publicly available. It accepts as input any file in EMBL format. It generates output in the common GFF format, readable by genome browsers. Benchmark tests showed that the output of SIGI-HMM is in agreement with known findings. Its predictions were consistent both with annotated GIs and with predictions generated by different methods.
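
The general idea in the Results paragraph, scoring each gene's codon usage against reference tables and then smoothing the per-gene calls with a gene-level hidden Markov model, can be illustrated with a minimal Python sketch. This is not SIGI-HMM's actual procedure: the two-state model, the log-odds emission, the function names, and the `stay` parameter are all assumptions made for illustration, and the codon tables a caller supplies would be toy values rather than the curated CU tables the abstract refers to.

```python
import math
from collections import Counter


def codon_counts(seq):
    """Count in-frame codons of a coding sequence; a trailing partial codon is ignored."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def log_odds(seq, donor_freq, host_freq, floor=1e-3):
    """Sum over codons of log(P_donor / P_host).

    Positive values mean the gene resembles the (hypothetical) donor table
    more than the host table. `floor` stands in for codons missing from a table.
    """
    score = 0.0
    for codon, n in codon_counts(seq).items():
        score += n * math.log(donor_freq.get(codon, floor) / host_freq.get(codon, floor))
    return score


def viterbi_two_state(scores, stay=0.9):
    """Label each gene NATIVE or ALIEN with a two-state Viterbi pass.

    scores[i] is used as the log emission of ALIEN for gene i; the log
    emission of NATIVE is fixed at 0. `stay` is an assumed probability of
    remaining in the same state; raising it suppresses isolated calls.
    """
    if not scores:
        return []
    names = ("NATIVE", "ALIEN")
    log_stay, log_switch = math.log(stay), math.log(1.0 - stay)

    def emit(state, x):
        return x if state == 1 else 0.0

    best = [math.log(0.5) + emit(s, scores[0]) for s in (0, 1)]
    back = []  # back[t][s] = best predecessor of state s at gene t + 1
    for x in scores[1:]:
        new, ptr = [], []
        for s in (0, 1):
            cands = [best[p] + (log_stay if p == s else log_switch) for p in (0, 1)]
            p = 0 if cands[0] >= cands[1] else 1
            new.append(cands[p] + emit(s, x))
            ptr.append(p)
        best = new
        back.append(ptr)

    # Trace back from the better final state.
    s = 0 if best[0] >= best[1] else 1
    path = [s]
    for ptr in reversed(back):
        s = ptr[s]
        path.append(s)
    return [names[s] for s in reversed(path)]
```

In this sketch, a run of genes with clearly positive scores is labelled as one contiguous ALIEN block, while a single weakly positive gene between native neighbours is absorbed when `stay` is high. That loosely mirrors how transition probabilities can serve as a sensitivity control, although the abstract derives its transition probabilities from classical test theory rather than from a hand-set constant.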

Conclusion: SIGI-HMM is a sensitive tool for the identification of GIs in microbial genomes. It allows users to interactively analyze genomes in detail and to generate or test hypotheses about the origin of acquired genes.

Item Type: Article
Institutions: Biology, Preclinical Medicine > Institut für Biophysik und physikalische Biochemie > Prof. Dr. Reinhard Sterner > Arbeitsgruppe PD Dr. Rainer Merkl
Identification Number: 10.1186/1471-2105-7-142 (DOI)
Subjects: 500 Science > 570 Life sciences
Status: Published
Refereed: Yes, this version has been refereed
Created at the University of Regensburg: Partially
Owner: Rainer Merkl
Deposited On: 13 Nov 2009 11:16
Last Modified: 21 Jul 2011 00:07
Item ID: 10938