### Abstract

Periodicity detection in time-course gene expression data is usually based on the periodogram method and a pre-defined significance threshold. Existing periodogram significance testing formulae are often inaccurate for short time series. We demonstrate that the magnitude of the error resulting from using the theoretical approximations is not negligible in such cases and discuss how it depends on different factors. Our conclusions are illustrated by examples of short time series typically produced by microarray studies. In these examples, we show a substantial discrepancy between different theoretical formulae and results of numerical simulations in the number of periodic genes found. We demonstrate that the accuracy of simulations is much higher than that of any of the theoretical approximations and conclude that estimation of significance should be based on comparisons with simulated datasets rather than theoretical approximations whenever feasible.
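The comparison described in the abstract can be illustrated with a minimal sketch. This record does not give the paper's formulae or its simulation protocol, so everything below is an assumption made for illustration: the test statistic is Fisher's g (largest periodogram ordinate divided by the sum of ordinates at the Fourier frequencies), the null model is Gaussian white noise on an evenly sampled series, and the "theoretical" value is the first term of Fisher's series for the g-test. The function names `fisher_g`, `p_first_term` and `p_simulated` are hypothetical.

```python
import numpy as np


def fisher_g(x):
    """Fisher's g: largest periodogram ordinate over the sum of all ordinates.

    Works on a single series or on a 2-D array with one series per row.
    Only Fourier frequencies 1..m are used (zero frequency and Nyquist excluded).
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    m = (n - 1) // 2
    centered = x - x.mean(axis=-1, keepdims=True)
    power = np.abs(np.fft.rfft(centered, axis=-1)) ** 2
    power = power[..., 1:m + 1]
    return power.max(axis=-1) / power.sum(axis=-1)


def p_first_term(g, n):
    """First-term approximation to Fisher's g-test p-value, capped at 1."""
    m = (n - 1) // 2
    return min(1.0, m * (1.0 - g) ** (m - 1))


def p_simulated(g, n, n_sim=20000, seed=0):
    """Empirical p-value from simulated white-noise series (assumed null model)."""
    rng = np.random.default_rng(seed)
    null_g = fisher_g(rng.standard_normal((n_sim, n)))
    # Add-one correction so the estimate is never exactly zero.
    return (1 + np.count_nonzero(null_g >= g)) / (n_sim + 1)


if __name__ == "__main__":
    # Hypothetical short series: 12 time points, one noisy oscillation.
    rng = np.random.default_rng(1)
    t = np.arange(12)
    series = np.sin(2 * np.pi * 2 * t / 12) + rng.standard_normal(12)
    g = fisher_g(series)
    print("g =", g)
    print("first-term approximation:", p_first_term(g, len(series)))
    print("simulated p-value:       ", p_simulated(g, len(series)))
```

Whether the two p-values agree depends on the series length, the statistic, and how well the assumed null model matches the data; the simulated estimate can be rerun under any other null model by replacing the call to `standard_normal`.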

Original language | English (US) |
---|---|
Title of host publication | Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008 |
Pages | 424-430 |
Number of pages | 7 |
State | Published - 2008 |
Externally published | Yes |
Event | 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008 - Las Vegas, NV, United States. Duration: Jul 14 2008 → Jul 17 2008 |

### Other

Other | 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008 |
---|---|
Country | United States |
City | Las Vegas, NV |
Period | 7/14/08 → 7/17/08 |

### Keywords

- Microarray
- Periodogram
- Statistical significance
- Time series

### ASJC Scopus subject areas

- Computer Science Applications
- Software
- Biomedical Engineering
- Health Informatics

### Cite this

*Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008* (pp. 424-430)

**Significance-testing of periodogram for short time series.** / Kudlicki, Andrzej; Rowicka-Kudlicka, Malgorzata; Otwinowski, Zbyszek.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

*Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008.* pp. 424-430, 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008, Las Vegas, NV, United States, 7/14/08.

TY - GEN

T1 - Significance-testing of periodogram for short time series

AU - Kudlicki, Andrzej

AU - Rowicka-Kudlicka, Malgorzata

AU - Otwinowski, Zbyszek

PY - 2008

Y1 - 2008

N2 - Periodicity detection in time-course gene expression data is usually based on the periodogram method and a pre-defined significance threshold. Existing periodogram significance testing formulae are often inaccurate for short time series. We demonstrate that the magnitude of the error resulting from using the theoretical approximations is not negligible in such cases and discuss how it depends on different factors. Our conclusions are illustrated by examples of short time series typically produced by microarray studies. In these examples, we show a substantial discrepancy between different theoretical formulae and results of numerical simulations in the number of periodic genes found. We demonstrate that the accuracy of simulations is much higher than that of any of the theoretical approximations and conclude that estimation of significance should be based on comparisons with simulated datasets rather than theoretical approximations whenever feasible.

KW - Microarray

KW - Periodogram

KW - Statistical significance

KW - Time series

UR - http://www.scopus.com/inward/record.url?scp=62649133412&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=62649133412&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:62649133412

SN - 1601320558

SN - 9781601320551

SP - 424

EP - 430

BT - Proceedings of the 2008 International Conference on Bioinformatics and Computational Biology, BIOCOMP 2008

ER -