Neighborhood rough set model based gene selection for multi-subtype tumor classification

Shulin Wang, Xueling Li, Shanwen Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

8 Citations (Scopus)

Abstract

Multi-subtype tumor diagnosis based on gene expression profiles is promising for clinical medicine. Accordingly, a great deal of research on tumor classification from gene expression profiles has been carried out, applying various machine learning approaches to construct the best possible tumor classification model. To achieve this goal, extracting features or finding informative genes with good classification ability is crucial. We propose a novel gene selection approach that uses the Kruskal-Wallis rank sum test to rank all genes and then applies an algorithm based on the neighborhood rough set model to gene reduction, yielding gene subsets with fewer genes and greater classification ability. Experiments on a small round blue cell tumor (SRBCT) dataset show that our approach can achieve very high classification accuracy with only three or four genes, as evaluated by three classifiers: support vector machines, K-nearest neighbor, and the neighborhood classifier.
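
The two-stage idea in the abstract (rank genes with the Kruskal-Wallis test, then reduce them with a neighborhood rough set criterion) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the distance metric, the neighborhood radius, or the stopping rule, so the Euclidean distance, the `delta` radius, the `top_k` cutoff, the greedy forward search, and all function names below are assumptions made for illustration, and the toy data is invented.

```python
import math
from collections import defaultdict

def kruskal_wallis_h(values, labels):
    """Tie-corrected Kruskal-Wallis H statistic for one gene (needs >= 2 samples)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks, tie_term, i = [0.0] * n, 0, 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1          # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    rank_sums, counts = defaultdict(float), defaultdict(int)
    for r, c in zip(ranks, labels):
        rank_sums[c] += r
        counts[c] += 1
    h = 12 / (n * (n + 1)) * sum(rank_sums[c] ** 2 / counts[c] for c in counts) - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    return h / correction if correction > 0 else 0.0   # constant gene -> 0.0

def dependency(X, y, genes, delta):
    """Fraction of samples whose delta-neighborhood (assumed Euclidean) is label-pure."""
    n, consistent = len(X), 0
    for i in range(n):
        pure = True
        for j in range(n):
            dist = math.sqrt(sum((X[i][g] - X[j][g]) ** 2 for g in genes))
            if dist <= delta and y[i] != y[j]:
                pure = False
                break
        consistent += int(pure)
    return consistent / n

def select_genes(X, y, top_k=10, delta=0.2, eps=1e-9):
    """Rank genes by H, then greedily add the gene that most raises dependency."""
    n_genes = len(X[0])
    scores = [kruskal_wallis_h([row[g] for row in X], y) for g in range(n_genes)]
    candidates = sorted(range(n_genes), key=lambda g: -scores[g])[:top_k]
    selected, dep = [], 0.0
    while candidates:
        best_gene, best_dep = None, dep
        for g in candidates:                 # candidates stay in rank order
            d = dependency(X, y, selected + [g], delta)
            if d > best_dep + eps:
                best_gene, best_dep = g, d
        if best_gene is None:                # no gene improves dependency: stop
            break
        selected.append(best_gene)
        candidates.remove(best_gene)
        dep = best_dep
    return selected, dep

# Invented toy data: 6 samples, 3 genes, 3 subtypes (values scaled to [0, 1]).
X = [[0.0, 0.5, 1.0], [0.1, 0.9, 1.0], [0.5, 0.4, 1.0],
     [0.6, 0.8, 1.0], [0.9, 0.45, 1.0], [1.0, 0.85, 1.0]]
y = ["A", "A", "B", "B", "C", "C"]
selected, dep = select_genes(X, y, top_k=3, delta=0.2)
print(selected, dep)
```

On this toy input the first gene separates the three subtypes on its own, so the search stops after one gene. With real expression data the features would normally be scaled to a common range first, since the neighborhood radius is only meaningful relative to the feature scale.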

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 146-158
Number of pages: 13
Volume: 5226 LNCS
DOI: https://doi.org/10.1007/978-3-540-87442-3_20
State: Published - 2008
Externally published: Yes
Event: 4th International Conference on Intelligent Computing, ICIC 2008 - Shanghai, China
Duration: Sep 15 2008 to Sep 18 2008

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5226 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 4th International Conference on Intelligent Computing, ICIC 2008
Country: China
City: Shanghai
Period: 9/15/08 to 9/18/08

Keywords

  • Gene expression profiles
  • K-nearest neighbor
  • Neighborhood classifier
  • Support vector machines
  • Tumor classification

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Wang, S., Li, X., & Zhang, S. (2008). Neighborhood rough set model based gene selection for multi-subtype tumor classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5226 LNCS, pp. 146-158). https://doi.org/10.1007/978-3-540-87442-3_20

@inproceedings{86ac9668c5754ec090c0f7d800ac3241,
title = "Neighborhood rough set model based gene selection for multi-subtype tumor classification",
keywords = "Gene expression profiles, K-nearest neighbor, Neighborhood classifier, Support vector machines, Tumor classification",
author = "Shulin Wang and Xueling Li and Shanwen Zhang",
year = "2008",
doi = "10.1007/978-3-540-87442-3_20",
language = "English (US)",
isbn = "3540874402",
volume = "5226 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "146--158",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN
T1 - Neighborhood rough set model based gene selection for multi-subtype tumor classification
AU - Wang, Shulin
AU - Li, Xueling
AU - Zhang, Shanwen
PY - 2008
Y1 - 2008
KW - Gene expression profiles
KW - K-nearest neighbor
KW - Neighborhood classifier
KW - Support vector machines
KW - Tumor classification
UR - http://www.scopus.com/inward/record.url?scp=56549111659&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=56549111659&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-87442-3_20
DO - 10.1007/978-3-540-87442-3_20
M3 - Conference contribution
AN - SCOPUS:56549111659
SN - 3540874402
SN - 9783540874409
VL - 5226 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 146
EP - 158
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -