Face-Dubbing++: LIP-Synchronous, Voice Preserving Translation Of Videos

Alexander Waibel*, Moritz Behr, Dogucan Yaman, Fevziye Irem Eyiokur, Tuan Nam Nguyen, Carlos Mullov, Mehmet Arif Demirtas, Alperen Kantarci, Stefan Constantin, Hazim Kemal Ekenel

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Citations (Scopus)

Abstract

In this paper, we propose a neural end-to-end system for voice-preserving, lip-synchronous video translation. The system combines multiple component models and produces a video of the original speaker speaking in the target language that is lip-synchronous with the target speech, yet maintains the emphases in speech, voice characteristics, and face video of the original speaker. The result is a video of a speaker speaking a language they do not actually know. For the evaluation, we present a user study of the complete system and separate evaluations of the individual components. Since no dataset is available for evaluating the whole system, we collect our own test set. The results indicate that our system is able to generate convincing videos of the original speaker speaking the target language while preserving the original speaker's characteristics.

Original language: English
Title of host publication: ICASSPW 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350302615
Publication status: Published - 2023
Event: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, ICASSPW 2023 - Rhodes Island, Greece
Duration: 4 Jun 2023 to 10 Jun 2023

Publication series

Name: ICASSPW 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, Proceedings

Conference

Conference: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, ICASSPW 2023
Country/Region: Greece
City: Rhodes Island
Period: 4/06/23 to 10/06/23

Bibliographical note

Publisher Copyright:
© 2023 IEEE.
