The role of complementary bipartite visual analytical representations in the analysis of SNPs

A case study in ancestral informative markers

Suresh Bhavnani, Gowtham Bellala, Sundar Victor, Kevin E. Bassler, Shyam Visweswaran

Research output: Contribution to journal › Article

7 Citations (Scopus)

Abstract

Objective: Several studies have shown how sets of single-nucleotide polymorphisms (SNPs) can help to classify subjects on the basis of their continental origins, with applications to case-control studies and population genetics. However, most of these studies use dimensionality-reduction methods, such as principal component analysis, or clustering methods that result in unipartite (either subjects or SNPs) representations of the data. Such analyses conceal important bipartite relationships, such as how subject and SNP clusters relate to each other, and the genotypes that determine their cluster memberships. Methods: To overcome the limitations of current methods of analyzing SNP data, the authors used three bipartite analytical representations (bipartite network, heat map with dendrograms, and Circos ideogram) that enable the simultaneous visualization and analysis of subjects, SNPs, and subject attributes. Results: The results demonstrate (1) novel insights into SNP data that are difficult to derive from purely unipartite views of the data, (2) the strengths and limitations of each method, revealing the role that each plays in revealing novel insights, and (3) implications for how the methods can be used for the analysis of SNPs in genomic studies associated with disease. Conclusion: The results suggest that bipartite representations can reveal new patterns in SNP data compared with existing unipartite representations. However, the novel insights require multiple representations to discover, verify, and comprehend the complex relationships. The results therefore motivate the need for a complementary visual analytical framework that guides the use of multiple bipartite representations to analyze complex relationships in SNP data.

Original language: English (US)
Journal: Journal of the American Medical Informatics Association
Volume: 19
Issue number: E1
DOI: 10.1136/amiajnl-2011-000745
State: Published - Jun 2012

Fingerprint

Single Nucleotide Polymorphism
Population Genetics
Principal Component Analysis
Cluster Analysis
Hot Temperature
Genotype

ASJC Scopus subject areas

  • Health Informatics

Cite this

The role of complementary bipartite visual analytical representations in the analysis of SNPs: A case study in ancestral informative markers. / Bhavnani, Suresh; Bellala, Gowtham; Victor, Sundar; Bassler, Kevin E.; Visweswaran, Shyam.

In: Journal of the American Medical Informatics Association, Vol. 19, No. E1, 06.2012.

@article{43371b3b9b5348f9b59145a9b315cde3,
title = "The role of complementary bipartite visual analytical representations in the analysis of SNPs: A case study in ancestral informative markers",
abstract = "Objective: Several studies have shown how sets of single-nucleotide polymorphisms (SNPs) can help to classify subjects on the basis of their continental origins, with applications to case-control studies and population genetics. However, most of these studies use dimensionality-reduction methods, such as principal component analysis, or clustering methods that result in unipartite (either subjects or SNPs) representations of the data. Such analyses conceal important bipartite relationships, such as how subject and SNP clusters relate to each other, and the genotypes that determine their cluster memberships. Methods: To overcome the limitations of current methods of analyzing SNP data, the authors used three bipartite analytical representations (bipartite network, heat map with dendrograms, and Circos ideogram) that enable the simultaneous visualization and analysis of subjects, SNPs, and subject attributes. Results: The results demonstrate (1) novel insights into SNP data that are difficult to derive from purely unipartite views of the data, (2) the strengths and limitations of each method, revealing the role that each plays in revealing novel insights, and (3) implications for how the methods can be used for the analysis of SNPs in genomic studies associated with disease. Conclusion: The results suggest that bipartite representations can reveal new patterns in SNP data compared with existing unipartite representations. However, the novel insights require multiple representations to discover, verify, and comprehend the complex relationships. The results therefore motivate the need for a complementary visual analytical framework that guides the use of multiple bipartite representations to analyze complex relationships in SNP data.",
author = "Suresh Bhavnani and Gowtham Bellala and Sundar Victor and Bassler, {Kevin E.} and Shyam Visweswaran",
year = "2012",
month = "6",
doi = "10.1136/amiajnl-2011-000745",
language = "English (US)",
volume = "19",
journal = "Journal of the American Medical Informatics Association : JAMIA",
issn = "1067-5027",
publisher = "Oxford University Press",
number = "E1",

}

TY - JOUR

T1 - The role of complementary bipartite visual analytical representations in the analysis of SNPs

T2 - A case study in ancestral informative markers

AU - Bhavnani, Suresh

AU - Bellala, Gowtham

AU - Victor, Sundar

AU - Bassler, Kevin E.

AU - Visweswaran, Shyam

PY - 2012/6

Y1 - 2012/6

AB - Objective: Several studies have shown how sets of single-nucleotide polymorphisms (SNPs) can help to classify subjects on the basis of their continental origins, with applications to case-control studies and population genetics. However, most of these studies use dimensionality-reduction methods, such as principal component analysis, or clustering methods that result in unipartite (either subjects or SNPs) representations of the data. Such analyses conceal important bipartite relationships, such as how subject and SNP clusters relate to each other, and the genotypes that determine their cluster memberships. Methods: To overcome the limitations of current methods of analyzing SNP data, the authors used three bipartite analytical representations (bipartite network, heat map with dendrograms, and Circos ideogram) that enable the simultaneous visualization and analysis of subjects, SNPs, and subject attributes. Results: The results demonstrate (1) novel insights into SNP data that are difficult to derive from purely unipartite views of the data, (2) the strengths and limitations of each method, revealing the role that each plays in revealing novel insights, and (3) implications for how the methods can be used for the analysis of SNPs in genomic studies associated with disease. Conclusion: The results suggest that bipartite representations can reveal new patterns in SNP data compared with existing unipartite representations. However, the novel insights require multiple representations to discover, verify, and comprehend the complex relationships. The results therefore motivate the need for a complementary visual analytical framework that guides the use of multiple bipartite representations to analyze complex relationships in SNP data.

UR - http://www.scopus.com/inward/record.url?scp=84863543326&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863543326&partnerID=8YFLogxK

U2 - 10.1136/amiajnl-2011-000745

DO - 10.1136/amiajnl-2011-000745

M3 - Article

VL - 19

JO - Journal of the American Medical Informatics Association : JAMIA

JF - Journal of the American Medical Informatics Association : JAMIA

SN - 1067-5027

IS - E1

ER -