TY - GEN
T1 - Random relevant and non-redundant feature subspaces for co-training
AU - Yaslan, Yusuf
AU - Cataltepe, Zehra
PY - 2009
Y1 - 2009
N2 - Random feature subspace selection can produce diverse classifiers and help with Co-training, as shown by the RASCO algorithm of Wang et al. (2008). For datasets with many irrelevant or noisy features, RASCO may end up with inaccurate classifiers. To remedy this problem, we introduce two algorithms for selecting relevant and non-redundant feature subspaces for Co-training. The first algorithm, Rel-RASCO (Relevant Random Subspaces for Co-training), produces subspaces by drawing features with probabilities proportional to their relevance. We also modify a successful feature selection algorithm, mRMR (Minimum Redundancy Maximum Relevance), for random feature subset selection and introduce Prob-mRMR (Probabilistic-mRMR). Experiments on five datasets demonstrate that the proposed algorithms outperform both RASCO and Co-training in terms of the accuracy achieved at the end of Co-training. A theoretical analysis of the proposed algorithms is also provided.
AB - Random feature subspace selection can produce diverse classifiers and help with Co-training, as shown by the RASCO algorithm of Wang et al. (2008). For datasets with many irrelevant or noisy features, RASCO may end up with inaccurate classifiers. To remedy this problem, we introduce two algorithms for selecting relevant and non-redundant feature subspaces for Co-training. The first algorithm, Rel-RASCO (Relevant Random Subspaces for Co-training), produces subspaces by drawing features with probabilities proportional to their relevance. We also modify a successful feature selection algorithm, mRMR (Minimum Redundancy Maximum Relevance), for random feature subset selection and introduce Prob-mRMR (Probabilistic-mRMR). Experiments on five datasets demonstrate that the proposed algorithms outperform both RASCO and Co-training in terms of the accuracy achieved at the end of Co-training. A theoretical analysis of the proposed algorithms is also provided.
KW - Co-training
KW - mRMR
KW - RASCO
KW - Random Subspace Methods
UR - http://www.scopus.com/inward/record.url?scp=76249093361&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-04394-9_83
DO - 10.1007/978-3-642-04394-9_83
M3 - Conference contribution
AN - SCOPUS:76249093361
SN - 3642043933
SN - 9783642043932
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 679
EP - 686
BT - Intelligent Data Engineering and Automated Learning - IDEAL 2009 - 10th International Conference, Proceedings
T2 - 10th International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2009
Y2 - 23 September 2009 through 26 September 2009
ER -