Markovian encoding models in human splice site recognition using SVM

Elham Pashaei*, Nizamettin Aydin

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)

Abstract

Splice site recognition is among the most significant and challenging tasks in bioinformatics because of its key role in gene annotation. Effective splice site prediction requires nucleotide encoding methods that expose the characteristics of DNA sequences and supply appropriate features as input to machine learning classifiers. Markovian models are among the most influential encoding methods and are widely used for pattern recognition in biological data. However, a direct performance comparison of these methods in the splice site domain has not yet been carried out. This study addresses that gap: it compares various Markovian encoding models for splice site prediction using a support vector machine (SVM), the most prominent learning method in the domain, and provides a new, precise evaluation of the Markovian approaches. Moreover, a novel sequence encoding approach based on a third-order Markov model (MM3) is proposed. The experimental results show that the proposed method, MM3-SVM, performs significantly better than thirteen of the best-known state-of-the-art algorithms when tested on the HS3D dataset under several performance criteria. Further, it achieved higher prediction accuracy than several well-known tools, such as NNsplice, MEM, MM1, WMM, and GeneID, on an independent test set of 50 genes. We also developed MMSVM, a web tool that predicts splice sites in any human sequence using the proposed approach. The MMSVM web server can be accessed at https://pashaei.shinyapps.io/mmsvm.
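To make the idea of Markovian encoding concrete, the following is a minimal sketch of a generic position-specific Markov encoding, in which each base of an aligned sequence is replaced by its estimated probability given the preceding bases. It is an illustration only and is not taken from the paper: the function names `fit_markov` and `encode`, the pseudocount smoothing, and the toy training sequences are all assumptions, and the authors' MM3 encoding may differ in its details.

```python
from collections import defaultdict

ALPHABET = "ACGT"

def fit_markov(seqs, order=1, pseudocount=1.0):
    """Estimate P(base at position i | previous `order` bases)
    from equal-length, aligned training sequences (hypothetical helper)."""
    length = len(seqs[0])
    # counts[i][context][base] = number of times `base` follows `context` at position i
    counts = [defaultdict(lambda: defaultdict(float)) for _ in range(length)]
    for s in seqs:
        for i in range(order, length):
            counts[i][s[i - order:i]][s[i]] += 1

    def prob(i, context, base):
        c = counts[i].get(context, {})
        # Additive smoothing (an assumption) avoids zero probabilities
        total = sum(c.values()) + pseudocount * len(ALPHABET)
        return (c.get(base, 0.0) + pseudocount) / total

    return prob

def encode(seq, prob, order=1):
    """Turn a DNA sequence into a numeric feature vector of conditional probabilities."""
    return [prob(i, seq[i - order:i], seq[i]) for i in range(order, len(seq))]

# Toy aligned sequences standing in for true splice site windows (made-up data)
train = ["AAGT", "ACGT", "AAGT"]
prob = fit_markov(train, order=1)
vec = encode("AAGT", prob, order=1)
print(vec)
```

Feature vectors of this kind could then be passed to any SVM implementation; setting `order=3` would give a third-order variant, which needs far more training data to estimate the larger number of contexts reliably.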

Original language: English
Pages (from-to): 159-170
Number of pages: 12
Journal: Computational Biology and Chemistry
Volume: 73
Publication status: Published - Apr 2018
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2018 Elsevier Ltd

Keywords

  • DNA encoding method
  • Machine learning
  • Markovian model
  • MMSVM
  • Splice sites
