A novel significance score for gene selection and ranking

Yufei Xiao, Tzu Hung Hsiao, Uthra Suresh, Hung I.Harry Chen, Xiaowu Wu, Steven Wolf, Yidong Chen

Research output: Contribution to journal › Article

56 Citations (Scopus)

Abstract

Motivation: When identifying differentially expressed (DE) genes from high-throughput gene expression measurements, we would like to take both statistical significance (such as P-value) and biological relevance (such as fold change) into consideration. In gene set enrichment analysis (GSEA), a score that combines fold change and P-value is needed for better gene ranking.

Results: We defined a gene significance score, the π-value, by combining expression fold change and statistical significance (P-value), and explored its statistical properties. When compared to various existing methods, the π-value-based approach is more robust in selecting DE genes, with the largest area under the curve in its receiver operating characteristic curve. We applied the π-value to GSEA and found it comparable to P-value and t-statistic based methods, with added protection against false discovery in certain situations. Finally, in a gene functional study of breast cancer profiles, we showed that using the π-value helps elucidate important biological functions that would otherwise be overlooked.
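The abstract does not state the formula for the π-value. As a minimal sketch only, the example below assumes the score is the product of the log2 fold change and −log10 of the P-value, and ranks genes by its absolute value. The function names `pi_value` and `rank_genes` and the three example genes are hypothetical and chosen purely for illustration.

```python
import math


def pi_value(log2_fold_change, p_value):
    """Assumed form of the score: log2 fold change times -log10(P).

    This formula is an assumption for illustration; the abstract only says
    the score combines fold change and P-value.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    return log2_fold_change * -math.log10(p_value)


def rank_genes(stats):
    """Rank genes by the absolute value of the assumed score, largest first.

    stats maps a gene name to a (log2_fold_change, p_value) pair.
    """
    scores = {gene: pi_value(fc, p) for gene, (fc, p) in stats.items()}
    return sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)


# Made-up values, not taken from the paper.
example = {
    "geneA": (3.0, 0.04),   # large fold change, modest significance
    "geneB": (0.2, 1e-8),   # tiny fold change, very small P-value
    "geneC": (-2.0, 1e-4),  # moderate on both axes
}
ranking = rank_genes(example)
print(ranking)
```

Under this assumed form, a gene with a very small P-value but a negligible fold change (geneB) ranks below genes that are moderate on both axes, which is the kind of trade-off the abstract motivates.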

Original language: English (US)
Pages (from-to): 801-807
Number of pages: 7
Journal: Bioinformatics
Volume: 30
Issue number: 6
DOI: https://doi.org/10.1093/bioinformatics/btr671
State: Published - Mar 1 2014
Externally published: Yes

Fingerprint

Gene Selection
Ranking
Genes
Gene
Fold
Statistical Significance
Receiver Operating Characteristic Curve
Breast Cancer
Gene expression
ROC Curve
Statistical property
High Throughput
Area Under Curve
Gene Expression
Statistic
Throughput
Statistics
Breast Neoplasms
Curve

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

Xiao, Y., Hsiao, T. H., Suresh, U., Chen, H. I. H., Wu, X., Wolf, S., & Chen, Y. (2014). A novel significance score for gene selection and ranking. Bioinformatics, 30(6), 801-807. https://doi.org/10.1093/bioinformatics/btr671

@article{36c8a1f1f50e41be82ad36b7c620c21d,
title = "A novel significance score for gene selection and ranking",
author = "Yufei Xiao and Hsiao, {Tzu Hung} and Uthra Suresh and Chen, {Hung I.Harry} and Xiaowu Wu and Steven Wolf and Yidong Chen",
year = "2014",
month = "3",
day = "1",
doi = "10.1093/bioinformatics/btr671",
language = "English (US)",
volume = "30",
pages = "801--807",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "6",

}

TY - JOUR

T1 - A novel significance score for gene selection and ranking

AU - Xiao, Yufei

AU - Hsiao, Tzu Hung

AU - Suresh, Uthra

AU - Chen, Hung I.Harry

AU - Wu, Xiaowu

AU - Wolf, Steven

AU - Chen, Yidong

PY - 2014/3/1

Y1 - 2014/3/1

UR - http://www.scopus.com/inward/record.url?scp=84897892578&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84897892578&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btr671

DO - 10.1093/bioinformatics/btr671

M3 - Article

C2 - 22321699

AN - SCOPUS:84897892578

VL - 30

SP - 801

EP - 807

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 6

ER -