Using statistical properties of short subsequences in microbial identification

Sergei Chumakov, Catherine Putonti, B. Montgomery Pettitt, George Fox, Richard C. Willson, Yuriy Fofanov

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Scopus citation

Abstract

A comparative analysis of the presence/absence distributions of short subsequences of different lengths ("n-mers", n = 5-20) was performed on more than 100 microbial genomes. Our results show that for organisms that are not close relatives, the presence or absence of different 10-20-mers in their genomes is not correlated. For close biological relatives, some correlation in the presence of n-mers appears, but it is not as strong as expected. The suppressed correlations among the n-mers present in different genomes make it possible to use random sets of n-mers (with appropriately chosen n) to discriminate the genomes of different organisms with a low probability of error. We performed in silico experiments to demonstrate that the presence/absence pattern of 1000 random oligomers of length 12-13 in a bacterial genome is sufficiently characteristic to readily and unambiguously distinguish any known bacterial genome from any other.
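
The presence/absence idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`random_nmers`, `fingerprint`, `hamming`) are hypothetical, random toy sequences stand in for real genomes, only one strand is scanned, and n = 8 is an assumption chosen to suit the short toy sequences rather than the 12-13 reported for real bacterial genomes.

```python
import random

ALPHABET = "ACGT"


def random_nmers(count, n, rng):
    """Draw `count` distinct random n-mers over the DNA alphabet (hypothetical helper)."""
    probes = set()
    while len(probes) < count:
        probes.add("".join(rng.choices(ALPHABET, k=n)))
    return sorted(probes)


def nmer_set(genome, n):
    """Collect every n-mer that occurs in `genome` (single strand only, a simplification)."""
    return {genome[i:i + n] for i in range(len(genome) - n + 1)}


def fingerprint(genome, probes):
    """Presence/absence pattern of the probe n-mers in `genome`, as a tuple of booleans."""
    present = nmer_set(genome, len(probes[0]))
    return tuple(p in present for p in probes)


def hamming(a, b):
    """Number of probe positions at which two fingerprints disagree."""
    return sum(x != y for x, y in zip(a, b))


def demo():
    rng = random.Random(0)  # fixed seed so the toy run is repeatable
    probes = random_nmers(1000, 8, rng)
    # Toy stand-ins for two unrelated genomes (assumption: uniform random bases).
    genome_a = "".join(rng.choices(ALPHABET, k=50_000))
    genome_b = "".join(rng.choices(ALPHABET, k=50_000))
    fp_a = fingerprint(genome_a, probes)
    fp_b = fingerprint(genome_b, probes)
    print("probes present in A:", sum(fp_a))
    print("probes present in B:", sum(fp_b))
    print("positions that differ:", hamming(fp_a, fp_b))


if __name__ == "__main__":
    demo()
```

When presence and absence are uncorrelated between two sequences, as the abstract reports for unrelated organisms, the two fingerprints disagree at many probe positions, which is what makes a random probe set discriminating. A fuller treatment would presumably also scan the reverse-complement strand and pick n so that roughly half of all possible n-mers are present in a genome of the given size.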

Original language: English (US)
Title of host publication: Proceedings of the International Conference on Mathematics and Engineering Techniques in Medicine and Biological Sciences, METMBS'04
Editors: F. Valafar, H. Valafar
Pages: 363-367
Number of pages: 5
State: Published - 2004
Externally published: Yes
Event: Proceedings of the International Conference on Mathematics and Engineering Techniques in Medicine and Biological Sciences, METMBS'04 - Las Vegas, NV, United States
Duration: Jun 21 2004 to Jun 24 2004

Publication series

Name: Proceedings of the International Conference on Mathematics and Engineering Techniques in Medicine and Biological Sciences, METMBS'04

Keywords

  • Microarray
  • Pathogen identification

ASJC Scopus subject areas

  • General Engineering
