Weighted neighborhood classifier for the classification of imbalanced tumor dataset

Shu Lin Wang, Xueling Li, Jun Feng Xia, Xiao Ping Zhang

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

Machine learning is widely applied to molecular tumor classification based on gene expression profiles, but the sample imbalance problem is often overlooked. This paper proposes a subclass-weighted neighborhood classifier to address the imbalanced sample set problem, and a novel neighborhood rough set model that selects informative genes to improve classification performance. Experiments on three publicly available tumor datasets demonstrate that the proposed method is effective both on imbalanced datasets with an obscure boundary between two subtypes and for informative gene selection, and that it achieves higher cross-validation accuracy with far fewer tumor-related genes.
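The abstract does not spell out the weighting scheme, so the following is only a minimal sketch of the general idea behind a class-weighted neighborhood vote, not the authors' algorithm. It assumes a Euclidean neighborhood of radius `delta`, a vote weight of 1 / (class size), and a nearest-sample fallback when the neighborhood is empty; the function name, the parameters, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def weighted_neighborhood_predict(train_X, train_y, x, delta):
    """Predict the label of x from training samples within distance delta.

    Each neighbor's vote is scaled by 1 / (size of its class). This is an
    assumed weighting that keeps a large class from dominating the vote.
    """
    class_sizes = Counter(train_y)
    dists = [math.dist(row, x) for row in train_X]
    # Neighborhood: every training sample within radius delta of x.
    neighbors = [i for i, d in enumerate(dists) if d <= delta]
    if not neighbors:
        # Assumed fallback: use the single nearest sample.
        neighbors = [min(range(len(dists)), key=dists.__getitem__)]
    votes = defaultdict(float)
    for i in neighbors:
        votes[train_y[i]] += 1.0 / class_sizes[train_y[i]]
    return max(votes, key=votes.get)


# Toy one-feature data (hypothetical): six samples of class "A", two of class "B".
train_X = [(0.0,), (0.2,), (0.4,), (0.6,), (0.8,), (1.0,), (1.3,), (1.5,)]
train_y = ["A"] * 6 + ["B"] * 2

# The neighborhood of 1.1 holds two "A" samples and one "B" sample.
# Weighted votes: A = 2/6, B = 1/2, so the minority class wins.
print(weighted_neighborhood_predict(train_X, train_y, (1.1,), 0.35))  # B
```

An unweighted majority vote over the same neighborhood would return "A", which is the kind of bias toward the larger class that a weighted scheme is meant to counter.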

Original language: English (US)
Pages (from-to): 259-273
Number of pages: 15
Journal: Journal of Circuits, Systems and Computers
Volume: 19
Issue number: 1
State: Published - Feb 1 2010

Keywords

  • Gene expression profiles
  • Imbalanced dataset
  • Kruskal-Wallis rank sum test
  • Molecular tumor classification
  • Neighborhood rough set model
  • Weighted neighborhood classifier

ASJC Scopus subject areas

  • Hardware and Architecture
  • Electrical and Electronic Engineering
