TY - JOUR
T1 - Combining multiple views
T2 - Case studies on protein and arrhythmia features
AU - Sakar, C. Okan
AU - Kursun, Olcay
AU - Seker, Huseyin
AU - Gurgen, Fikret
AU - Aydin, Nizamettin
AU - Favorov, Oleg
PY - 2014/2
Y1 - 2014/2
AB - Computational annotation of protein functions and structures from sequence features and prediction of certain diseases from gene expression levels are among the important applications of computational biology. Developing methods capable of such predictions is not only important for its biological and medical uses but is also a very challenging pattern recognition task because of high input dimensionality and small sample size. Ensemble and multi-view learning have gained popularity with the rapid rise of such datasets with large numbers of variables (such as the protein and arrhythmia datasets used in this paper). However, the classical ensemble approach does not take into account conditional interdependences among the views. In this paper, we present a two-stage supervised multi-view learning technique called parallel interacting multi-view learning (PIML). In the first stage of PIML, as in the ensemble method, each view is used individually by a predictor, and the class posterior probability estimates are obtained. In the second stage, each view is trained on its own features together with the class posterior probability estimates of the other views, which serve as summary information about those views. This is a hybrid way of combining the views, in which the views influence each other during training by using one another's predictions to account for their interdependences. PIML is demonstrated and compared with the classical ensemble approach on three real datasets.
KW - Arrhythmia type prediction
KW - Ensemble methods
KW - Multi-view learning
KW - Protein structure prediction
KW - Protein sub-nuclear location prediction
UR - http://www.scopus.com/inward/record.url?scp=84892900082&partnerID=8YFLogxK
U2 - 10.1016/j.engappai.2013.11.004
DO - 10.1016/j.engappai.2013.11.004
M3 - Article
AN - SCOPUS:84892900082
SN - 0952-1976
VL - 28
SP - 174
EP - 180
JO - Engineering Applications of Artificial Intelligence
JF - Engineering Applications of Artificial Intelligence
ER -