Co-training with relevant random subspaces

Yusuf Yaslan*, Zehra Cataltepe

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

78 Citations (Scopus)

Abstract

We introduce the relevant random subspace Co-training (Rel-RASCO) algorithm, which produces relevant random subspaces and then performs semi-supervised ensemble learning using those subspaces and unlabeled data. Ensemble learning algorithms may benefit from diversity among the classifiers used. However, for high dimensional data, choosing subspaces randomly, as in the RASCO (Random Subspace Method for Co-training, Wang et al. 2008 [5]) algorithm, may produce diverse but inaccurate classifiers. We produce relevant random subspaces by drawing features with probabilities proportional to their relevances, measured by the mutual information between features and class labels. We show that Rel-RASCO achieves better accuracy with this relevant and random subspace selection scheme. Experiments on five real datasets and one synthetic dataset show that the Rel-RASCO algorithm outperforms both RASCO and Co-training in terms of the accuracy achieved at the end of Co-training.
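The subspace selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `mutual_information` and `draw_relevant_subspace` are hypothetical, features are assumed to be discrete, drawing without replacement within one subspace is an assumption, and the small constant added to the weights (so that zero-relevance features remain drawable) is likewise an assumption.

```python
import math
import random
from collections import Counter


def mutual_information(feature, labels):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    px = Counter(feature)
    py = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (px[x] * py[y]))
    return mi


def draw_relevant_subspace(relevances, size, rng, eps=1e-12):
    """Draw `size` distinct feature indices, each draw proportional to relevance.

    Assumption: sampling is without replacement, and `eps` keeps the total
    weight positive even if every remaining relevance is zero.
    """
    if size > len(relevances):
        raise ValueError("subspace size exceeds the number of features")
    remaining = list(range(len(relevances)))
    weights = [relevances[i] + eps for i in remaining]
    chosen = []
    for _ in range(size):
        pos = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(pos))
        weights.pop(pos)
    return chosen


# Toy data (made up for illustration): four discrete features, eight samples.
labels = [0, 1, 0, 1, 0, 1, 0, 1]
columns = [
    [0, 1, 0, 1, 0, 1, 0, 1],  # identical to the labels
    [0, 1, 0, 1, 0, 1, 1, 0],  # agrees with the labels on 6 of 8 samples
    [0, 0, 1, 1, 0, 0, 1, 1],  # statistically independent of the labels
    [0, 0, 0, 0, 0, 0, 0, 0],  # constant
]
relevances = [mutual_information(col, labels) for col in columns]

# Each subspace would then feed one classifier of the ensemble.
rng = random.Random(0)
subspaces = [draw_relevant_subspace(relevances, 2, rng) for _ in range(3)]
```

Under these assumptions, features with higher mutual information appear in more subspaces, while the randomness of the draws keeps the subspaces different from one another.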

Original language: English
Pages (from-to): 1652-1661
Number of pages: 10
Journal: Neurocomputing
Volume: 73
Issue number: 10-12
Publication status: Published - Jun 2010

Funding

This work was partially supported by Tubitak (The Scientific and Technological Research Foundation of Turkey) research project 109E162 and the Istanbul Technical University BAP (Scientific Research Projects) ‘Co-training on High Dimensional Datasets’ project. The authors would also like to thank the anonymous reviewers, whose comments greatly improved the quality of the paper.

Funders (funder number)
Istanbul Technical University BAP
Scientific and Technological Research Foundation of Turkey (109E162)

    Keywords

    • Co-training
    • Multiple classifier systems
    • Random subspace methods
    • RASCO
    • Relevant subspace method
    • Semi-supervised learning
