Translatica: A Survey and Implementation Study on Speech-to-Speech Translation and Voice Synthesis with Speaker Preservation

Description
This thesis presents Translatica, a modular speech-to-speech translation (S2ST) system that preserves both linguistic meaning and the speaker’s vocal identity across languages. Alongside developing a working prototype, this work surveys the landscape of S2ST methods and motivates the choice of a modular architecture over direct approaches, emphasizing flexibility, interpretability, and voice fidelity. The system combines state-of-the-art tools in transcription, translation, and voice synthesis to enable expressive, speaker-preserving dubbing of prerecorded videos. Through implementation and evaluation, the thesis explores the trade-offs between accuracy, latency, and control, demonstrating how modular design enables customization for diverse use cases. Future work includes real-time translation, enhanced speaker tracking, and applications in education and live media.

Restrictions Statement

Barrett Honors College theses and creative projects are restricted to ASU community members.

Details

Date Created
2025-05
Additional Information
English
Series
  • Academic Year 2024-2025
Extent
  • 29 pages