TY - GEN
T1 - Performance comparison of tumor classification based on linear and non-linear dimensionality reduction methods
AU - Wang, Shu Lin
AU - You, Hong Zhu
AU - Lei, Ying Ke
AU - Li, Xue Ling
PY - 2010
Y1 - 2010
N2 - Gene expression profiles play an increasingly important role in accurate tumor diagnosis and treatment. However, the curse of dimensionality, in which the number of genes far exceeds the number of samples, poses a challenge to traditional dimensionality reduction methods. Here, based on a two-stage dimensionality reduction model, we design 18 tumor classification methods by combining two classical gene filters with three common dimensionality reduction methods, namely principal component analysis (PCA), linear discriminative analysis (LDA), and multidimensional scaling (MDS), to extract discriminative features, and we use three common machine learning methods to evaluate the prediction accuracy of the extracted features on six tumor datasets. Although gene expression exhibits non-linear characteristics, the non-linear dimensionality reduction method MDS does not always achieve the best prediction accuracy among the three dimensionality reduction methods across the six tumor datasets. Moreover, the performance comparison indicates that no single dimensionality reduction method is superior to the others on all six tumor datasets. Our results also suggest that the prediction accuracy depends strongly on the dataset and less on the gene selection and classification methods.
AB - Gene expression profiles play an increasingly important role in accurate tumor diagnosis and treatment. However, the curse of dimensionality, in which the number of genes far exceeds the number of samples, poses a challenge to traditional dimensionality reduction methods. Here, based on a two-stage dimensionality reduction model, we design 18 tumor classification methods by combining two classical gene filters with three common dimensionality reduction methods, namely principal component analysis (PCA), linear discriminative analysis (LDA), and multidimensional scaling (MDS), to extract discriminative features, and we use three common machine learning methods to evaluate the prediction accuracy of the extracted features on six tumor datasets. Although gene expression exhibits non-linear characteristics, the non-linear dimensionality reduction method MDS does not always achieve the best prediction accuracy among the three dimensionality reduction methods across the six tumor datasets. Moreover, the performance comparison indicates that no single dimensionality reduction method is superior to the others on all six tumor datasets. Our results also suggest that the prediction accuracy depends strongly on the dataset and less on the gene selection and classification methods.
KW - Gene expression profiles
KW - dimensionality reduction
KW - linear discriminative analysis
KW - multidimensional scaling
KW - principal component analysis
KW - tumor classification
UR - http://www.scopus.com/inward/record.url?scp=77958460554&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77958460554&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-14922-1_37
DO - 10.1007/978-3-642-14922-1_37
M3 - Conference contribution
AN - SCOPUS:77958460554
SN - 3642149219
SN - 9783642149214
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 291
EP - 300
BT - Advanced Intelligent Computing Theories and Applications - 6th International Conference on Intelligent Computing, ICIC 2010, Proceedings
T2 - 6th International Conference on Intelligent Computing, ICIC 2010
Y2 - 18 August 2010 through 21 August 2010
ER -