A novel gene selection algorithm for cancer identification based on random forest and particle swarm optimization

Elnaz Pashaei, Mustafa Ozen, Nizamettin Aydin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

15 Citations (Scopus)

Abstract

To identify informative genes among the thousands of candidate genes that contribute to cancer symptoms, two novel gene selection approaches for the classification of multiclass microarray datasets are proposed. In the first method, we use k-means clustering to remove redundancy and then apply Random Forest (RF) to rank the genes in every cluster to remove irrelevance. The top-scoring genes from each cluster are gathered to form a new feature subset (filtered genes). In the last stage, the filtered genes are used as input to eight benchmark classification methods. In the second approach, we develop a novel method that uses Particle Swarm Optimization (PSO) combined with a BoostedC5.0 decision tree as the classifier. We apply the filtered genes obtained by the first method as input to the PSO+BoostedC5.0 classifier and compare its performance with that of the eight benchmark classifiers. Experimental results show that clustering combined with RF ranking selects a smaller feature subset and yields better classification accuracy. Applying this method to ten microarray datasets and using the filtered genes as input to nine classifiers, we also show that the proposed PSO+BoostedC5.0 reduces the feature set effectively and obtains higher classification accuracy than the other classification methods.
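The first (filter) stage described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn's `KMeans` and `RandomForestClassifier`, and the function name `filter_genes`, the parameters `n_clusters` and `top_per_cluster`, their default values, and the synthetic data are all hypothetical choices that the abstract does not specify.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier


def filter_genes(X, y, n_clusters=10, top_per_cluster=2, seed=0):
    """Return sorted column indices of genes kept by a cluster-then-rank filter.

    X: (n_samples, n_genes) expression matrix; y: (n_samples,) class labels.
    All parameter defaults are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    # Cluster genes, not samples: each gene is a point described by its
    # expression profile across samples, so k-means runs on the transpose.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict(X.T)

    selected = []
    for c in range(n_clusters):
        members = np.flatnonzero(labels == c)
        if members.size == 0:
            continue
        # Rank the genes of this cluster by Random Forest importance.
        rf = RandomForestClassifier(n_estimators=100, random_state=seed)
        rf.fit(X[:, members], y)
        order = np.argsort(rf.feature_importances_)[::-1]
        # Keep the top-ranked genes of the cluster (fewer if the cluster is small).
        selected.extend(members[order[:top_per_cluster]].tolist())
    return sorted(selected)


# Synthetic stand-in for a microarray dataset: 60 samples, 200 genes, 3 classes.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 200))
y_demo = rng.integers(0, 3, size=60)
X_demo[:, 0] += 2.0 * y_demo  # make one gene class-dependent
kept = filter_genes(X_demo, y_demo, n_clusters=5, top_per_cluster=3)
print(len(kept), kept)
```

The returned indices play the role of the "filtered genes"; `X_demo[:, kept]` would then be passed to whichever downstream classifiers are being compared.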

Original language: English
Title of host publication: 2015 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781479969265
Publication status: Published - 16 Oct 2015
Externally published: Yes
Event: IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2015 - Niagara Falls, Canada
Duration: 12 Aug 2015 to 15 Aug 2015

Publication series

Name: 2015 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2015

Conference

Conference: IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2015
Country/Territory: Canada
City: Niagara Falls
Period: 12/08/15 to 15/08/15

Bibliographical note

Publisher Copyright:
© 2015 IEEE.

Keywords

  • Decision tree classifier
  • Gene expression
  • Particle swarm optimization
  • Random Forest
