Abstract
Periodicity detection in time-course gene expression data is usually based on the periodogram method and a predefined significance threshold. Existing periodogram significance-testing formulae are often inaccurate for short time series. We demonstrate that the error resulting from the theoretical approximations is not negligible in such cases and discuss how it depends on different factors. Our conclusions are illustrated by examples of short time series typically produced by microarray studies. In these examples, we show a substantial discrepancy between the different theoretical formulae and the results of numerical simulations in the number of periodic genes found. We demonstrate that the accuracy of simulations is much higher than that of any of the theoretical approximations and conclude that estimation of significance should be based on comparisons with simulated datasets rather than theoretical approximations whenever feasible.
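The abstract does not spell out the authors' procedure, so the following is only a minimal sketch of what simulation-based significance testing of a periodogram peak can look like. It assumes a Gaussian white-noise null model and uses Fisher's g statistic (largest periodogram ordinate divided by the sum of ordinates) as the test statistic; both choices, and the function names `max_periodogram_fraction`, `simulated_p_value`, and `leading_term_p_value`, are illustrative assumptions rather than the paper's method. The closed-form value shown for comparison is the leading term of Fisher's series, m(1 - g)^(m - 1), a common textbook approximation.

```python
import numpy as np


def max_periodogram_fraction(x):
    """Fisher's g: largest periodogram ordinate divided by the sum of ordinates.

    Works on a single series (1-D) or a batch of series (last axis = time).
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean(axis=-1, keepdims=True)
    power = np.abs(np.fft.rfft(x, axis=-1)) ** 2
    n = x.shape[-1]
    m = (n - 1) // 2  # Fourier frequencies, excluding zero and Nyquist
    power = power[..., 1:m + 1]
    return power.max(axis=-1) / power.sum(axis=-1)


def simulated_p_value(x, n_sim=20000, seed=0):
    """Monte Carlo p-value under an assumed Gaussian white-noise null."""
    rng = np.random.default_rng(seed)
    g_obs = max_periodogram_fraction(x)
    null = rng.standard_normal((n_sim, len(x)))
    g_null = max_periodogram_fraction(null)
    # The +1 terms keep the estimate strictly positive.
    return (1 + np.count_nonzero(g_null >= g_obs)) / (n_sim + 1)


def leading_term_p_value(g, m):
    """Leading term of Fisher's series, m * (1 - g)**(m - 1), capped at 1."""
    return min(1.0, m * (1.0 - g) ** (m - 1))


if __name__ == "__main__":
    # Hypothetical short series: 12 time points, noisy oscillation.
    rng = np.random.default_rng(1)
    t = np.arange(12)
    series = np.sin(2 * np.pi * t / 6) + 0.8 * rng.standard_normal(t.size)
    g = max_periodogram_fraction(series)
    m = (t.size - 1) // 2
    print("g statistic:       ", g)
    print("simulated p-value: ", simulated_p_value(series))
    print("leading-term value:", leading_term_p_value(g, m))
```

In practice the null model would need to match the data (for example by permuting or resampling measured expression values), and the number of simulations bounds the smallest p-value that can be resolved, which matters when thousands of genes are tested.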
Original language | English (US) |
---|---|
Title of host publication | Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008 |
Pages | 424-430 |
Number of pages | 7 |
State | Published - 2008 |
Externally published | Yes |
Event | 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008 - Las Vegas, NV, United States |
Duration | Jul 14 2008 → Jul 17 2008 |
Keywords
- Microarray
- Periodogram
- Statistical significance
- Time series
ASJC Scopus subject areas
- Computer Science Applications
- Software
- Biomedical Engineering
- Health Informatics
Cite this
Significance-testing of periodogram for short time series. / Kudlicki, Andrzej; Rowicka-Kudlicka, Malgorzata; Otwinowski, Zbyszek.
Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008. 2008. p. 424-430.
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
TY - GEN
T1 - Significance-testing of periodogram for short time series
AU - Kudlicki, Andrzej
AU - Rowicka-Kudlicka, Malgorzata
AU - Otwinowski, Zbyszek
PY - 2008
Y1 - 2008
AB - Periodicity detection in time-course gene expression data is usually based on the periodogram method and a predefined significance threshold. Existing periodogram significance-testing formulae are often inaccurate for short time series. We demonstrate that the error resulting from the theoretical approximations is not negligible in such cases and discuss how it depends on different factors. Our conclusions are illustrated by examples of short time series typically produced by microarray studies. In these examples, we show a substantial discrepancy between the different theoretical formulae and the results of numerical simulations in the number of periodic genes found. We demonstrate that the accuracy of simulations is much higher than that of any of the theoretical approximations and conclude that estimation of significance should be based on comparisons with simulated datasets rather than theoretical approximations whenever feasible.
KW - Microarray
KW - Periodogram
KW - Statistical significance
KW - Time series
UR - http://www.scopus.com/inward/record.url?scp=62649133412&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=62649133412&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:62649133412
SN - 1601320558
SN - 9781601320551
SP - 424
EP - 430
BT - Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008
ER -