Frequency difference based DNA encoding methods in human splice site recognition

Elham Pashaei, Nizamettin Aydin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

Identifying gene structure in human genomes depends heavily on accurate recognition of the boundaries between exons and introns, i.e. splice sites. Hence, the development of new methods for effective splice site detection is essential. DNA encoding approaches are used to extract features from gene sequences, while machine learning methods classify splice sites using those extracted features. This paper presents a new DNA encoding method based on triplet nucleotide encoding with the frequency difference between true and false splice site sequences (TN-FDTF). Support Vector Machine (SVM), Artificial Neural Network (NN), Random Forest (RF) and AdaBoost classifiers are then used to predict splice sites. The performance of the proposed method was assessed on the Homo Sapiens Splice Site Dataset (HS3D) using 10-fold cross-validation. The results showed that AdaBoost outperformed the other classifiers considered. In addition, the proposed method achieved higher prediction accuracy than most existing state-of-the-art methods. The authors believe that the proposed method can help to achieve better results in human splice site recognition and eukaryotic gene detection.
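The abstract does not spell out how TN-FDTF is computed, so the following is only a minimal sketch of one plausible reading: for each position in a fixed-length window, estimate how often each nucleotide triplet occurs in true and in false splice site sequences, and encode a sequence by the difference of those two frequencies for the triplet it carries at that position. The function names, the overlapping-triplet windowing, and the toy sequences are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from itertools import product

# All 64 nucleotide triplets over the DNA alphabet.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def positional_triplet_freqs(seqs):
    """Relative frequency of each triplet at each position.

    Assumes all sequences have the same length and uses overlapping
    triplets (stride 1); both choices are assumptions of this sketch.
    """
    n_pos = len(seqs[0]) - 2
    counts = [Counter() for _ in range(n_pos)]
    for s in seqs:
        for i in range(n_pos):
            counts[i][s[i:i + 3]] += 1
    return [{t: c[t] / len(seqs) for t in TRIPLETS} for c in counts]


def fit_fdtf(true_seqs, false_seqs):
    """Per-position table of (true frequency - false frequency) per triplet."""
    f_true = positional_triplet_freqs(true_seqs)
    f_false = positional_triplet_freqs(false_seqs)
    return [
        {t: f_true[i][t] - f_false[i][t] for t in TRIPLETS}
        for i in range(len(f_true))
    ]


def encode(seq, table):
    """Map a sequence to one frequency-difference value per position.

    Triplets containing non-ACGT symbols are encoded as 0.0 (an assumption).
    """
    return [table[i].get(seq[i:i + 3], 0.0) for i in range(len(table))]


# Toy sequences, invented for illustration only (not HS3D data).
true_sites = ["AAGGTAAG", "CAGGTGAG", "AAGGTAAG"]
false_sites = ["ACGGTCCA", "TTGGTTTC", "GAGGTACC"]

table = fit_fdtf(true_sites, false_sites)
features = encode("AAGGTAAG", table)
print(features)
```

The resulting numeric vectors are the kind of input that could be passed to any of the classifiers named in the abstract. In a cross-validation setting, the frequency table would need to be fitted on the training folds only, so that labels from held-out sequences do not leak into their own features.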

Original language: English
Title of host publication: 2nd International Conference on Computer Science and Engineering, UBMK 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 586-591
Number of pages: 6
ISBN (Electronic): 9781538609309
Publication status: Published - 31 Oct 2017
Externally published: Yes
Event: 2nd International Conference on Computer Science and Engineering, UBMK 2017 - Antalya, Turkey
Duration: 5 Oct 2017 → 8 Oct 2017

Bibliographical note

Publisher Copyright:
© 2017 IEEE.

Keywords

  • DNA encoding methods
  • Gene detection
  • Machine learning
  • Splice site prediction
