Significance-testing of periodogram for short time series

Andrzej Kudlicki, Malgorzata Rowicka-Kudlicka, Zbyszek Otwinowski

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Periodicity detection in time-course gene expression data is usually based on the periodogram method and a pre-defined significance threshold. Existing periodogram significance-testing formulae are often inaccurate for short time series. We demonstrate that the magnitude of the error resulting from using the theoretical approximations is not negligible in such cases and discuss how it depends on different factors. Our conclusions are illustrated by examples of short time series typically produced by microarray studies. In these examples, we show a substantial discrepancy between the numbers of periodic genes found using different theoretical formulae and using numerical simulations. We demonstrate that the accuracy of simulations is much higher than that of any of the theoretical approximations and conclude that estimation of significance should be based on comparisons with simulated datasets rather than theoretical approximations whenever feasible.
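The abstract does not spell out the authors' simulation procedure or which theoretical formulae they compare, so the following is only a minimal sketch of the general idea under stated assumptions: the null model is Gaussian white noise, the test statistic is the largest normalized periodogram ordinate, and the "theoretical" value is the common large-sample approximation that treats ordinates as independent unit exponentials. The function names (`peak_power`, `simulated_p_value`, `asymptotic_p_value`) are hypothetical and do not come from the paper.

```python
import numpy as np


def peak_power(x):
    """Largest normalized periodogram ordinate of a 1-D series.

    Only the Fourier frequencies strictly between zero and Nyquist are used.
    Normalizing by n * variance makes each ordinate approximately Exp(1)
    for long white-noise series.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    m = (n - 1) // 2  # number of frequencies strictly between 0 and Nyquist
    if m < 1:
        raise ValueError("need at least 3 time points")
    x = x - x.mean()
    var = x.var()
    if var == 0.0:
        return 0.0  # constant series carries no periodic signal
    spec = np.abs(np.fft.rfft(x)) ** 2 / (n * var)
    return float(spec[1 : m + 1].max())


def simulated_p_value(x, n_sim=10000, seed=0):
    """Monte Carlo p-value of the peak under an ASSUMED white-noise null."""
    rng = np.random.default_rng(seed)
    observed = peak_power(x)
    n = len(x)
    null_peaks = np.array(
        [peak_power(rng.standard_normal(n)) for _ in range(n_sim)]
    )
    # Add-one correction keeps the estimate strictly positive.
    return (1 + int(np.sum(null_peaks >= observed))) / (n_sim + 1)


def asymptotic_p_value(x):
    """Large-sample approximation: m independent Exp(1) ordinates."""
    m = (len(x) - 1) // 2
    z = peak_power(x)
    return float(1.0 - (1.0 - np.exp(-z)) ** m)


if __name__ == "__main__":
    # Hypothetical short series: 12 time points, one noisy oscillation.
    rng = np.random.default_rng(42)
    t = np.arange(12)
    series = np.sin(2 * np.pi * 2 * t / 12) + rng.standard_normal(12)
    print("simulated p-value: ", simulated_p_value(series))
    print("asymptotic p-value:", asymptotic_p_value(series))
```

With only a dozen time points the normalized ordinates are bounded (they cannot exceed n/2) and are not independent, so the two p-values can disagree noticeably; that kind of gap is what the abstract argues should be settled in favor of simulation.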

Original language: English (US)
Title of host publication: Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008
Pages: 424-430
Number of pages: 7
State: Published - 2008
Externally published: Yes
Event: 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008 - Las Vegas, NV, United States
Duration: Jul 14 2008 to Jul 17 2008

Other

Other: 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008
Country: United States
City: Las Vegas, NV
Period: 7/14/08 to 7/17/08

Fingerprint

Time series
Testing
Periodicity
Microarrays
Gene expression
Genes
Computer simulation
Datasets

Keywords

  • Microarray
  • Periodogram
  • Statistical significance
  • Time series

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Biomedical Engineering
  • Health Informatics

Cite this

Kudlicki, A., Rowicka-Kudlicka, M., & Otwinowski, Z. (2008). Significance-testing of periodogram for short time series. In Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008 (pp. 424-430)

Significance-testing of periodogram for short time series. / Kudlicki, Andrzej; Rowicka-Kudlicka, Malgorzata; Otwinowski, Zbyszek.

Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008. 2008. p. 424-430.

Kudlicki, A, Rowicka-Kudlicka, M & Otwinowski, Z 2008, Significance-testing of periodogram for short time series. in Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008. pp. 424-430, 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008, Las Vegas, NV, United States, 7/14/08.

Kudlicki A, Rowicka-Kudlicka M, Otwinowski Z. Significance-testing of periodogram for short time series. In Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008. 2008. p. 424-430

Kudlicki, Andrzej ; Rowicka-Kudlicka, Malgorzata ; Otwinowski, Zbyszek. / Significance-testing of periodogram for short time series. Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008. 2008. pp. 424-430
@inproceedings{767c84f452dc460eb4fd7d07a32ca543,
title = "Significance-testing of periodogram for short time series",
abstract = "Periodicity detection in time-course gene expression data is usually based on the periodogram method and a pre-defined significance threshold. Existing periodogram significance-testing formulae are often inaccurate for short time series. We demonstrate that the magnitude of the error resulting from using the theoretical approximations is not negligible in such cases and discuss how it depends on different factors. Our conclusions are illustrated by examples of short time series typically produced by microarray studies. In these examples, we show a substantial discrepancy between the numbers of periodic genes found using different theoretical formulae and using numerical simulations. We demonstrate that the accuracy of simulations is much higher than that of any of the theoretical approximations and conclude that estimation of significance should be based on comparisons with simulated datasets rather than theoretical approximations whenever feasible.",
keywords = "Microarray, Periodogram, Statistical significance, Time series",
author = "Andrzej Kudlicki and Malgorzata Rowicka-Kudlicka and Zbyszek Otwinowski",
year = "2008",
language = "English (US)",
isbn = "1601320558",
pages = "424--430",
booktitle = "Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008",
}

TY - GEN

T1 - Significance-testing of periodogram for short time series

AU - Kudlicki, Andrzej

AU - Rowicka-Kudlicka, Malgorzata

AU - Otwinowski, Zbyszek

PY - 2008

Y1 - 2008

N2 - Periodicity detection in time-course gene expression data is usually based on the periodogram method and a pre-defined significance threshold. Existing periodogram significance-testing formulae are often inaccurate for short time series. We demonstrate that the magnitude of the error resulting from using the theoretical approximations is not negligible in such cases and discuss how it depends on different factors. Our conclusions are illustrated by examples of short time series typically produced by microarray studies. In these examples, we show a substantial discrepancy between the numbers of periodic genes found using different theoretical formulae and using numerical simulations. We demonstrate that the accuracy of simulations is much higher than that of any of the theoretical approximations and conclude that estimation of significance should be based on comparisons with simulated datasets rather than theoretical approximations whenever feasible.

KW - Microarray

KW - Periodogram

KW - Statistical significance

KW - Time series

UR - http://www.scopus.com/inward/record.url?scp=62649133412&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=62649133412&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:62649133412

SN - 1601320558

SN - 9781601320551

SP - 424

EP - 430

BT - Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008

ER -