Abstract
This paper presents a novel, fully autonomous multilingual audio transcription system tailored to Kazakh, Russian, and English. The proposed solution integrates a language detection module based on SpeechBrain with a transcription engine built on Vosk, and uses FFmpeg for robust audio preprocessing. The system automatically detects the language from the first 10 seconds of an audio stream, selects the corresponding acoustic model, and produces a text transcription. Experimental evaluations on both synthetic and real audio data indicate that the approach achieves competitive accuracy (word error rates of 5% to 10% under optimal conditions) and processing speed, while running entirely on local resources with no dependency on cloud services. These features make it particularly suitable for digital forensics and other domains that require secure, real-time transcription.
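The pipeline described above (preprocess, detect the language from the first 10 seconds, select a model, transcribe) and the word error rate metric can be illustrated with a minimal Python sketch. It is not the authors' implementation. The model directory names, function names, and the 16 kHz mono WAV target are assumptions made for illustration, and the SpeechBrain detector and Vosk recognizer are passed in as plain callables rather than called directly, since the abstract does not specify how they are invoked.

```python
import subprocess
from pathlib import Path
from typing import Callable, List, Optional, Tuple

DETECT_SECONDS = 10  # the abstract states detection uses the first 10 seconds

# Hypothetical model directories; the abstract does not name the models used.
MODEL_DIRS = {"kk": "models/vosk-kk", "ru": "models/vosk-ru", "en": "models/vosk-en"}


def ffmpeg_args(src: str, dst: str, seconds: Optional[float] = None) -> List[str]:
    """Build an FFmpeg command converting src to 16 kHz mono 16-bit PCM WAV.

    The target format is an assumption, not taken from the paper.
    """
    args = ["ffmpeg", "-y", "-i", src]
    if seconds is not None:
        args += ["-t", str(seconds)]  # keep only the leading segment
    args += ["-ac", "1", "-ar", "16000", "-c:a", "pcm_s16le", dst]
    return args


def select_model(lang: str) -> str:
    """Map a detected language code to its (hypothetical) model directory."""
    if lang not in MODEL_DIRS:
        raise ValueError(f"unsupported language: {lang!r}")
    return MODEL_DIRS[lang]


def transcribe_file(
    src: str,
    detect_language: Callable[[str], str],  # e.g. a wrapper around SpeechBrain
    transcribe: Callable[[str, str], str],  # e.g. a wrapper around Vosk
    run: Callable[..., object] = subprocess.run,
    workdir: str = ".",
) -> Tuple[str, str]:
    """Run the four pipeline stages and return (language, transcript)."""
    head = str(Path(workdir) / "head.wav")
    full = str(Path(workdir) / "full.wav")
    run(ffmpeg_args(src, head, seconds=DETECT_SECONDS), check=True)
    lang = detect_language(head)
    run(ffmpeg_args(src, full), check=True)
    return lang, transcribe(full, select_model(lang))


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (r != h)))
        prev = cur
    return prev[-1] / len(ref)
```

Injecting the detector and recognizer keeps the control flow testable without any models installed. In a real deployment, `detect_language` and `transcribe` would wrap the actual SpeechBrain and Vosk calls.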
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 29th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2025-Summer |
| Editors | Hyun Yoe, Ha Jin Hwang, Meonghun Lee, Rackwoo Kim, Ryugap Lim, Sungtaek Lee, Seaeul Kim, Simon Xu, Miguel Garcia-Ruiz, Wenyin Feng, A B M Bodrul Alam, Randy Lin, Ajmery Sultana, Faria Khandaker, Mahreen Nasir, Ken Higuchi, Shinichiro Mori, Teruhisa Hochin |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| ISBN (Electronic) | 9798331512583 |
| Publication status | Published - 2025 |
| Event | 29th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2025-Summer, Busan, Korea, Republic of. Duration: 25 Jun 2025 → 27 Jun 2025 |
Publication series
| Name | Proceedings - 29th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2025-Summer |
|---|---|
Conference
| Conference | 29th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2025-Summer |
|---|---|
| Country/Territory | Korea, Republic of |
| City | Busan |
| Period | 25/06/25 → 27/06/25 |
Bibliographical note
Publisher Copyright: © 2025 IEEE.
Keywords
- Audio Preprocessing
- Digital Forensics
- FFmpeg
- Kazakh Language Processing
- Language Detection
- Modular System Architecture
- Multilingual Speech Transcription
- Offline Speech Recognition
- Resource-Constrained Environments
- SpeechBrain
- Vosk API
- Word Error Rate
Title: AI-based offline speech recognition for Kazakh, Russian and English languages