A visual data mining tool that facilitates reconstruction of transcription regulatory networks

Daniel Jupiter, Vincent VanBuren

Research output: Contribution to journal (Article)

20 Citations (Scopus)

Abstract

Background: Although the use of microarray technology has seen exponential growth, analysis of microarray data remains a challenge to many investigations. One difficulty lies in the interpretation of a list of differentially expressed genes, or in how to plan new experiments given that knowledge. Clustering methods can be used to identify groups of genes with similar expression patterns, and genes with unknown function can be provisionally annotated based on the concept of "guilt by association", where function is tentatively inferred from the known functions of genes with similar expression patterns. These methods frequently suffer from two limitations: (1) visualization usually only gives access to group membership, rather than specific information about nearest neighbors, and (2) the resolution or quality of the relationship is not easily inferred.

Methodology/Principal Findings: We have addressed these issues by improving the precision of similarity detection over that of a single experiment and by creating a tool to visualize tractable networks: we (1) performed meta-analysis computation of correlation coefficients for all gene pairs in a heterogeneous data set collection from 2,145 publicly available microarray samples in mouse, (2) filtered the resulting distribution of over 130 million correlation coefficients to build a new, more tractable distribution from the strongest correlations, and (3) designed and implemented a new Web-based tool (StarNet, http://vanburenlab.medicine.tamhsc.edu/stamet.html) for visualization of sub-networks of the correlation coefficients built according to user-specified parameters.

Conclusion/Significance: Correlations were calculated across a heterogeneous collection of publicly available microarray data. Users can access this analysis using a new freely available Web-based application for visualizing tractable correlation networks that are flexibly specified by the user. This new resource enables rapid hypothesis development for transcription regulatory relationships.
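
The three steps in the abstract (correlate all gene pairs, keep the strongest correlations, view the sub-network around a gene of interest) can be illustrated with a minimal sketch. It is not the StarNet implementation: the abstract does not name the correlation measure, so Pearson correlation is an assumption here, and the function names, the threshold, the neighbor limit, and the toy expression values are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat constant profiles as uncorrelated
    return cov / sqrt(var_x * var_y)


def strong_correlations(expression, threshold):
    """Steps (1) and (2): correlate every gene pair, keep |r| >= threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges


def sub_network(edges, seed, max_neighbors):
    """Step (3): the seed gene's most strongly correlated neighbors."""
    neighbors = []
    for (g1, g2), r in edges.items():
        if seed == g1:
            neighbors.append((g2, r))
        elif seed == g2:
            neighbors.append((g1, r))
    neighbors.sort(key=lambda item: abs(item[1]), reverse=True)
    return neighbors[:max_neighbors]


# Toy expression profiles (made-up values, four samples per gene).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}

edges = strong_correlations(expression, threshold=0.9)
star = sub_network(edges, seed="geneA", max_neighbors=2)
```

In this toy data, geneB and geneC track geneA closely (positively and negatively), while geneD falls below the threshold and is filtered out. The all-pairs loop grows quadratically with the number of genes, which is why filtering to the strongest correlations matters at the scale the abstract describes.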

Original language: English (US)
Article number: e1717
Journal: PLoS One
Volume: 3
Issue number: 3
DOI: 10.1371/journal.pone.0001717
State: Published - Mar 5 2008
Externally published: Yes

Fingerprint

Data Mining
Transcription
Data mining
Microarrays
transcription (genetics)
Genes
genes
Visualization
Guilt
Microarray Analysis
Cluster Analysis
Meta-Analysis
microarray technology
Medicine
Technology
Gene Expression
meta-analysis
Experiments
Association reactions
medicine

ASJC Scopus subject areas

  • Agricultural and Biological Sciences(all)
  • Biochemistry, Genetics and Molecular Biology(all)
  • Medicine(all)

Cite this

A visual data mining tool that facilitates reconstruction of transcription regulatory networks. / Jupiter, Daniel; VanBuren, Vincent.

In: PLoS One, Vol. 3, No. 3, e1717, 05.03.2008.

@article{7f2cb36ba21e4823bc53fb2d53ba3913,
title = "A visual data mining tool that facilitates reconstruction of transcription regulatory networks",
author = "Daniel Jupiter and Vincent VanBuren",
year = "2008",
month = "3",
day = "5",
doi = "10.1371/journal.pone.0001717",
language = "English (US)",
volume = "3",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "3",

}

TY - JOUR

T1 - A visual data mining tool that facilitates reconstruction of transcription regulatory networks

AU - Jupiter, Daniel

AU - VanBuren, Vincent

PY - 2008/3/5

Y1 - 2008/3/5

UR - http://www.scopus.com/inward/record.url?scp=45849086962&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=45849086962&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0001717

DO - 10.1371/journal.pone.0001717

M3 - Article

VL - 3

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 3

M1 - e1717

ER -