CNN-based Text-independent Automatic Speaker Identification Using Short Utterances

  • Mandana Fasounaki
  • Emirhan Burak Yüce
  • Serkan Öncül
  • Gökhan Ince
  • Istanbul Technical University
  • Arçelik A.S.

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

9 Citations (Scopus)

Abstract

With the widespread use of voice-controlled services and devices, research on robust and fast systems for automatic speaker identification has accelerated. In this paper, we present a Convolutional Neural Network (CNN) architecture for text-independent automatic speaker identification. The primary purpose is to identify a speaker, among many others, using a short speech segment. Most current research focuses on deep CNNs that were originally designed for computer vision tasks. In addition, most existing speaker identification methods require audio samples longer than 3 seconds in the query phase to achieve high accuracy. We created a CNN architecture suited to voice and speech-related classification tasks. In our experiments, the proposed model achieves 99.5% accuracy on LibriSpeech and 90% accuracy on the VoxCeleb 1 dataset using only 1-second test utterances.
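The evaluation setting described in the abstract (closed-set identification scored on 1-second test segments) can be illustrated with a minimal sketch. It does not reproduce the authors' CNN, which the abstract does not specify. The function names `split_into_segments`, `identify`, and `accuracy` are hypothetical, the 16 kHz sample rate is an assumption, and the per-speaker scores stand in for whatever a trained classifier would output.

```python
from typing import Dict, List, Sequence

SAMPLE_RATE_HZ = 16_000  # assumption: 16 kHz audio; adjust to the data at hand


def split_into_segments(samples: Sequence[float],
                        sample_rate: int = SAMPLE_RATE_HZ,
                        seconds: float = 1.0) -> List[Sequence[float]]:
    """Cut a waveform into non-overlapping fixed-length segments.

    Any trailing remainder shorter than one segment is dropped.
    """
    seg_len = int(sample_rate * seconds)
    return [samples[i:i + seg_len]
            for i in range(0, len(samples) - seg_len + 1, seg_len)]


def identify(scores: Dict[str, float]) -> str:
    """Closed-set identification: pick the enrolled speaker with the highest score."""
    return max(scores, key=scores.get)


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of test segments attributed to the correct speaker."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equally long")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


if __name__ == "__main__":
    waveform = [0.0] * 40_000                 # 2.5 s of silence as placeholder audio
    segments = split_into_segments(waveform)
    print(len(segments))                      # number of full 1-second segments
    print(identify({"spk_a": 0.1, "spk_b": 0.7, "spk_c": 0.2}))
```

In this sketch each 1-second segment would be scored independently, so the reported accuracy is a per-segment figure; whether the paper aggregates over segments is not stated in the abstract.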

Original language: English
Title of host publication: Proceedings - 6th International Conference on Computer Science and Engineering, UBMK 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 413-418
Number of pages: 6
ISBN (Electronic): 9781665429085
Publication status: Published - 2021
Event: 6th International Conference on Computer Science and Engineering, UBMK 2021 - Ankara, Türkiye
Duration: 15 Sep 2021 to 17 Sep 2021

Publication series

Name: Proceedings - 6th International Conference on Computer Science and Engineering, UBMK 2021

Conference

Conference: 6th International Conference on Computer Science and Engineering, UBMK 2021
Country/Territory: Türkiye
City: Ankara
Period: 15/09/21 to 17/09/21

Bibliographical note

Publisher Copyright:
© 2021 IEEE

Funding

This work has been supported by Arcelik ITU R&D Center and Scientific Project Unit (BAP) of Istanbul Technical University, project number: MOA-2019-42321. The authors thank Cagri Aslanbas, Berna Erden, Pinar Baki, Ugur Halatoglu and Baris Bayram for their fruitful discussions.

Funders (funder number)
Arcelik ITU R&D Center and Scientific Project Unit
British Association for Psychopharmacology
Istanbul Teknik Üniversitesi (MOA-2019-42321)
