AI-based offline speech recognition for Kazakh, Russian and English languages

  • Nursultan Nyssanov
  • Zuleikha Syzdykova
  • Kuandyk Niyazaliyev
  • Ibraheem Shayea
  • Astana IT University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper presents a novel, fully autonomous multilingual audio transcription system tailored to Kazakh, Russian, and English. The proposed solution integrates a language detection module based on SpeechBrain with a transcription engine built on Vosk, and uses FFmpeg for robust audio preprocessing. The system automatically detects the language from the first 10 seconds of an audio stream, selects the corresponding acoustic model, and produces a text transcription. Experimental evaluations on both synthetic and real audio data indicate that the approach achieves competitive accuracy (word error rates of 5 to 10% under optimal conditions) and processing speed, while running entirely on local resources with no dependency on cloud services. These features make it particularly suitable for digital forensics and other domains that require secure real-time transcription.
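The pipeline described in the abstract (FFmpeg preprocessing, language detection on the first 10 seconds, model selection, Vosk transcription) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the 16 kHz mono WAV target, the language codes, the `MODEL_DIRS` paths, and all function names are assumptions made for the example. The language code is taken as an input here, since the abstract does not say how the SpeechBrain detector is configured.

```python
import json
import wave
from pathlib import Path

SAMPLE_RATE = 16000        # assumed target rate; not stated in the abstract
DETECTION_WINDOW_S = 10    # the abstract's "first 10 seconds"

# Hypothetical local model directories, one per supported language.
MODEL_DIRS = {
    "kk": Path("models/vosk-kk"),
    "ru": Path("models/vosk-ru"),
    "en": Path("models/vosk-en"),
}


def ffmpeg_command(src, dst, max_seconds=None):
    """Build an FFmpeg call that converts `src` to mono 16-bit PCM WAV.

    When `max_seconds` is given, only that much audio is written, which is
    how a short clip for language detection could be produced.
    """
    cmd = ["ffmpeg", "-y", "-i", str(src)]
    if max_seconds is not None:
        cmd += ["-t", str(max_seconds)]
    cmd += ["-ac", "1", "-ar", str(SAMPLE_RATE), "-c:a", "pcm_s16le", str(dst)]
    return cmd


def select_model_dir(language_code):
    """Map a detected language code to its (hypothetical) model directory."""
    try:
        return MODEL_DIRS[language_code]
    except KeyError:
        raise ValueError(f"unsupported language: {language_code!r}") from None


def transcribe(wav_path, language_code):
    """Transcribe a preprocessed WAV file with the Vosk model for the language."""
    from vosk import KaldiRecognizer, Model  # imported lazily; needs vosk installed

    recognizer = KaldiRecognizer(Model(str(select_model_dir(language_code))), SAMPLE_RATE)
    pieces = []
    with wave.open(str(wav_path), "rb") as wav:
        while True:
            data = wav.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True when an utterance has been finalized.
            if recognizer.AcceptWaveform(data):
                pieces.append(json.loads(recognizer.Result())["text"])
    pieces.append(json.loads(recognizer.FinalResult())["text"])
    return " ".join(p for p in pieces if p)
```

In use, one would run `ffmpeg_command(src, clip, DETECTION_WINDOW_S)` through `subprocess.run(..., check=True)` to get the detection clip, pass that clip to the language detector, then convert the full file and call `transcribe` with the detected code. Everything stays on local disk, which matches the offline constraint the abstract emphasizes.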

Original language: English
Title of host publication: Proceedings - 29th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2025-Summer
Editors: Hyun Yoe, Ha Jin Hwang, Meonghun Lee, Rackwoo Kim, Ryugap Lim, Sungtaek Lee, Seaeul Kim, Simon Xu, Miguel Garcia-Ruiz, Wenyin Feng, A B M Bodrul Alam, Randy Lin, Ajmery Sultana, Faria Khandaker, Mahreen Nasir, Ken Higuchi, Shinichiro Mori, Teruhisa Hochin
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331512583
DOIs
Publication status: Published - 2025
Event: 29th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2025-Summer - Busan, Korea, Republic of
Duration: 25 Jun 2025 to 27 Jun 2025

Publication series

Name: Proceedings - 29th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2025-Summer

Conference

Conference: 29th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2025-Summer
Country/Territory: Korea, Republic of
City: Busan
Period: 25/06/25 to 27/06/25

Bibliographical note

Publisher Copyright:
©2025 IEEE.

Keywords

  • Audio Preprocessing
  • Digital Forensics
  • FFmpeg
  • Kazakh Language Processing
  • Language Detection
  • Modular System Architecture
  • Multilingual Speech Transcription
  • Offline Speech Recognition
  • Resource-Constrained Environments
  • SpeechBrain
  • Vosk API
  • Word Error Rate
