A computational tool for the genomic identification of regions of unusual compositional properties and its utilization in the detection of horizontally transferred sequences

Catherine Putonti, Yi Luo, Charles Katili, Sergey Chumakov, George E. Fox, Dan Graur, Yuriy Fofanov

Research output: Contribution to journal › Article

Abstract

Similarity Plot (S-plot) is a Windows-based application for large-scale comparison and 2-dimensional visualization of compositional similarities between genomic sequences. The application combines 2 approaches widely used in genomics: window analysis of statistical characteristics along genomes and dot-plot visual representation. S-plot is effective in identifying highly similar regions between genomes as well as regions with unusual compositional properties (RUCPs) within a single genome, which may be indicative of horizontal gene transfer or of locus-specific selective forces. We use S-plot to identify regions that may have originated through horizontal gene transfer using a 2-step approach: first comparing a genomic sequence to itself and then comparing it to the genomic sequence of a closely related taxon. Moreover, by comparing these suspect sequences to one another, we can estimate a minimum number of sources for these putative xenologous sequences. We illustrate the uses of S-plot in a comparison involving Escherichia coli K12 and E. coli O157:H7. In O157:H7, we found 145 regions that most probably originated through horizontal gene transfer. When S-plot was used to compare each of these regions with 277 completely sequenced prokaryotic genomes, 1 sequence was found to have compositional properties similar to those of the Yersinia pseudotuberculosis genome, indicating a transfer from a Yersinia or a Yersinia relative. Based on our analysis of RUCPs in O157:H7, we infer that there were at least 53 sources of horizontally transferred sequences.
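
The workflow the abstract describes (window analysis, a 2-dimensional similarity matrix, self-comparison to find RUCPs, and grouping of suspect regions to bound the number of sources) can be illustrated with a minimal Python sketch. This is not S-plot's implementation. The abstract does not state which compositional statistic or similarity measure S-plot uses, so the dinucleotide frequencies, the Pearson correlation, the thresholds, and the connected-component grouping below are all assumptions chosen for illustration, and every function name is hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_profile(seq, k=2):
    """Relative frequency of every k-mer over ACGT in one window (assumed statistic)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def window_profiles(genome, window, step, k=2):
    """Slide a fixed-size window along the sequence; one profile per window."""
    return [kmer_profile(genome[s:s + window], k)
            for s in range(0, len(genome) - window + 1, step)]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def similarity_matrix(profiles_a, profiles_b):
    """Window-by-window similarities: the values a dot-plot-style image would show."""
    return [[pearson(p, q) for q in profiles_b] for p in profiles_a]


def unusual_windows(self_matrix, threshold):
    """Windows whose mean similarity to all other windows falls below threshold."""
    flagged = []
    for i, row in enumerate(self_matrix):
        others = [v for j, v in enumerate(row) if j != i]
        if others and sum(others) / len(others) < threshold:
            flagged.append(i)
    return flagged


def min_sources(profiles, threshold):
    """Count connected components when regions with similarity >= threshold are linked.

    Assumption: regions that resemble one another could share a source, so the
    number of components is a lower bound on the number of sources.
    """
    seen, groups = set(), 0
    for start in range(len(profiles)):
        if start in seen:
            continue
        groups += 1
        seen.add(start)
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(profiles)):
                if j not in seen and pearson(profiles[i], profiles[j]) >= threshold:
                    seen.add(j)
                    stack.append(j)
    return groups


# Toy genome: an AC-rich background with one GT-rich insert (made-up data).
genome = "AC" * 100 + "GT" * 20 + "AC" * 100
profiles = window_profiles(genome, window=40, step=40)
self_matrix = similarity_matrix(profiles, profiles)
flagged = unusual_windows(self_matrix, threshold=0.5)
print("windows with unusual composition:", flagged)
```

In a 2-step analysis like the one the abstract outlines, the flagged windows would next be compared against the profiles of a closely related genome by passing two different profile lists to `similarity_matrix`. Windows that are unusual within their own genome and have no compositionally similar counterpart in the relative would remain candidates for horizontal transfer.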

Original language: English (US)
Pages (from-to): 1863-1868
Number of pages: 6
Journal: Molecular Biology and Evolution
Volume: 23
Issue number: 10
DOI: 10.1093/molbev/msl053
State: Published - Oct 2006
Externally published: Yes

Fingerprint

genomics
Gene transfer
genome
Genes
Horizontal Gene Transfer
Yersinia
Escherichia coli
Yersinia pseudotuberculosis
Escherichia coli K12
Escherichia coli O157
visualization
statistical analysis
detection
loci
analysis

Keywords

  • Escherichia coli K12
  • Escherichia coli O157:H7
  • Horizontal (lateral) gene transfer
  • Sequence composition

ASJC Scopus subject areas

  • Genetics
  • Biochemistry
  • Genetics (clinical)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Ecology, Evolution, Behavior and Systematics
  • Agricultural and Biological Sciences (miscellaneous)
  • Molecular Biology

Cite this

A computational tool for the genomic identification of regions of unusual compositional properties and its utilization in the detection of horizontally transferred sequences. / Putonti, Catherine; Luo, Yi; Katili, Charles; Chumakov, Sergey; Fox, George E.; Graur, Dan; Fofanov, Yuriy.

In: Molecular Biology and Evolution, Vol. 23, No. 10, 10.2006, p. 1863-1868.

@article{16f1dcbdb2c040208179c67697ace6d1,
title = "A computational tool for the genomic identification of regions of unusual compositional properties and its utilization in the detection of horizontally transferred sequences",
keywords = "Escherichia coli K12, Escherichia coli O157:H7, Horizontal (lateral) gene transfer, Sequence composition",
author = "Catherine Putonti and Yi Luo and Charles Katili and Sergey Chumakov and Fox, {George E.} and Dan Graur and Yuriy Fofanov",
year = "2006",
month = "10",
doi = "10.1093/molbev/msl053",
language = "English (US)",
volume = "23",
pages = "1863--1868",
journal = "Molecular Biology and Evolution",
issn = "0737-4038",
publisher = "Oxford University Press",
number = "10",

}

TY - JOUR

T1 - A computational tool for the genomic identification of regions of unusual compositional properties and its utilization in the detection of horizontally transferred sequences

AU - Putonti, Catherine

AU - Luo, Yi

AU - Katili, Charles

AU - Chumakov, Sergey

AU - Fox, George E.

AU - Graur, Dan

AU - Fofanov, Yuriy

PY - 2006/10

Y1 - 2006/10

KW - Escherichia coli K12

KW - Escherichia coli O157:H7

KW - Horizontal (lateral) gene transfer

KW - Sequence composition

UR - http://www.scopus.com/inward/record.url?scp=33748778231&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33748778231&partnerID=8YFLogxK

U2 - 10.1093/molbev/msl053

DO - 10.1093/molbev/msl053

M3 - Article

C2 - 16829541

AN - SCOPUS:33748778231

VL - 23

SP - 1863

EP - 1868

JO - Molecular Biology and Evolution

JF - Molecular Biology and Evolution

SN - 0737-4038

IS - 10

ER -