Voice-Pro: Ultimate AI Voice Conversion and Multilingual Translation Tool 🔊
🌍 한국어 ∙ English ∙ 中文简体 ∙ 中文繁體 ∙ 日本語 ∙ Deutsch ∙ Español ∙ Português

🎙️ Powerful AI-powered web application for YouTube video processing, speech recognition, translation, and text-to-speech with multilingual support
Voice-Pro is a state-of-the-art web app that transforms multimedia content creation. It integrates YouTube video downloading, voice separation, speech recognition, translation, and text-to-speech into a single, powerful tool for creators, researchers, and multilingual professionals.
- 🔊 Top-tier speech recognition: Whisper, Faster-Whisper, Whisper-Timestamped
- 🎤 Zero-shot voice cloning: F5-TTS, E2-TTS, CosyVoice
- 📢 Multilingual text-to-speech: Edge-TTS, kokoro
- 🎥 YouTube processing & audio extraction: yt-dlp
- 🌍 Instant translation for 100+ languages: Deep-Translator
- 🔇 Pro-grade vocal isolation: UVR5
- 🔥 AI cover creation: RVC
A robust alternative to ElevenLabs, Voice-Pro empowers podcasters, developers, and creators with advanced voice solutions.
⚠️ Please Note
- Upgrading from v1.x to v2.x: Not supported. Delete the installer_files folder, then run start.bat from the latest version.
- Upgrading within v2.x: Supported. Download the latest code, then run update.bat.
- First-time users: Please refer to the installation instructions below.
- Troubleshooting: In most cases, issues can be resolved by deleting the installer_files folder and then running configure.bat and start.bat in sequence.
📰 News & History
- Voice-Pro has been updated to v2.x (Python 3.10.15, Torch 2.5.1+cu124, Gradio 5.14.0)
- 🆓 The free trial supports media up to 60 seconds in length.
- 🔥 AI Cover feature has been added.
- 🎤 CosyVoice and kokoro support has been added.
- ⏳ The first run downloads CosyVoice2-0.5B (9GB). This may take more than an hour, depending on network speed.
- 🎧 Voice samples for voice cloning will be continuously updated.
- Introduced spaCy for natural sentence-by-sentence translation and TTS.
- ☁️ Subscription version supports Microsoft Azure’s Translator and TTS.
- 🏪 Subscription version offers unlimited usage within the subscription period (no 60-second limit) and can be purchased through Shopify.
▶️ Demos
Dubbing Studio Tab: Transcription, Translation & TTS
End-to-end demo of the Studio tab's media workflow: download a YouTube video, separate the voice with AI, generate subtitles with Whisper, translate them, and dub the result with F5-TTS.
F5-TTS-Multi Tab: Podcast Creation
Demo of F5-TTS zero-shot voice cloning: the voices of Mark Zuckerberg and Elon Musk are cloned to narrate entirely new content.
AI Cover Tab
Make a Trump version of IU's 'Cupid', Kim Kwang-seok's 'I Miss You', and 'Private's Letter'.
Live Translation Tab: Real-Time Recognition & Translation
Demo of real-time multilingual translation: BBC news audio is captured live, subtitled in real time, and immediately translated into other languages.
⭐ Key Features
1. Dubbing Studio
- YouTube video downloads & audio extraction
- Voice separation with MDX-Net & Demucs
- Supports 100+ languages for speech recognition & translation
2. Speech Technologies
- Speech-to-Text: Whisper, Faster-Whisper, Whisper-Timestamped
- Text-to-Speech:
  - Edge-TTS: 100+ languages, 400+ voices
  - E2-TTS, F5-TTS, CosyVoice: Zero-shot cloning
  - kokoro: Ranked #2 in HuggingFace TTS Arena
- 🔥 AI Cover (Speech-to-Speech): Vocal removal via UVR5, modulation with RVC
3. Real-Time Translation
- Instant speech recognition
- Multilingual translation on the fly
- Customizable audio inputs
🤖 WebUI
Dubbing Studio Tab
- All-in-one hub: YouTube downloads, noise removal, subtitles, translation, & TTS
- Supports all ffmpeg-compatible formats
- Output options: WAV, FLAC, MP3
- Subtitles & recognition for 100+ languages
- TTS with speed, volume, & pitch controls
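The three output formats map onto standard ffmpeg audio encoders. The sketch below only builds the command line; it is not Voice-Pro's actual invocation, and the helper name `build_ffmpeg_cmd` and the choice of 16-bit PCM for WAV are assumptions made for illustration.

```python
from pathlib import Path

# Encoder per output format. These are standard ffmpeg encoder names;
# 16-bit PCM for WAV is an assumption, not a documented Voice-Pro setting.
AUDIO_CODECS = {
    "wav": "pcm_s16le",
    "flac": "flac",
    "mp3": "libmp3lame",
}


def build_ffmpeg_cmd(src, fmt):
    """Return an ffmpeg argument list that extracts audio from src as fmt."""
    fmt = fmt.lower()
    if fmt not in AUDIO_CODECS:
        raise ValueError(f"unsupported output format: {fmt}")
    dst = Path(src).with_suffix("." + fmt)
    if dst == Path(src):
        raise ValueError("input and output would be the same file")
    # -y overwrites an existing output, -vn drops any video stream,
    # -c:a selects the audio encoder.
    return ["ffmpeg", "-y", "-i", str(src), "-vn",
            "-c:a", AUDIO_CODECS[fmt], str(dst)]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is on the PATH.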
Whisper Caption Tab
- Subtitle-focused: 90+ languages
- Video-integrated subtitle display
- Word-level highlighting & denoise options
Translate Tab
- Translation for 100+ languages
- Supports subtitle files (ASS, SSA, SRT, etc.)
- Real-time voice recognition & translation
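As a rough illustration of subtitle-file translation, the sketch below walks an SRT file cue by cue, leaves the index and timing lines untouched, and passes only the caption text to a translator callback. It handles SRT only (not ASS/SSA), the function name `translate_srt` is hypothetical, and it is not Voice-Pro's own implementation.

```python
import re

# An SRT timing line looks like: 00:00:01,000 --> 00:00:04,000
TIMING = re.compile(r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def translate_srt(srt_text, translate):
    """Translate the caption text of each SRT cue with translate(str) -> str."""
    out = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        # Find the timing line; everything after it is caption text.
        idx = next(
            (i for i, ln in enumerate(lines) if TIMING.match(ln.strip())),
            None,
        )
        if idx is None:
            out.append(block)  # not a cue; pass it through unchanged
            continue
        head = lines[: idx + 1]
        # Join wrapped caption lines so the translator sees a full sentence.
        text = " ".join(ln.strip() for ln in lines[idx + 1:])
        out.append("\n".join(head + ([translate(text)] if text else [])))
    return "\n\n".join(out) + "\n"
```

With the deep-translator package installed, the callback could be `GoogleTranslator(source="auto", target="ko").translate`; any function that maps a string to a string works.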
Speech Generation Tab
- Options: Edge-TTS, F5-TTS, CosyVoice, kokoro
- Celeb voice podcasts & multilingual support
🔥 AI Cover Tab
🎤✨ Reference Voice
- To request a new voice, please open an issue on the Issues page.
English
Andrew Bustamante | Andrew Huberman | Avi Loeb | Ben Shapiro | Brett Johnson | Brian Keating
Coffeezilla | Dan Carlin | David Buss | David Fravor | David Kipping | Dennis Whyte
Donald Hoffman | Donald Trump | Douglas Murray | Duncan Trussell | Elon Musk | Garry Nolan
Jack Barsky | James Sexton | Jeff Bezos | Joe Rogan | John Mearsheimer | Jordan Peterson
Kanye 'Ye' West | Mark Zuckerberg | Michael Levin | Michael Saylor | Michio Kaku | MrBeast
Nick Lane | Paul Rosolie | Ryan Graves | Sam Altman | Sam Harris | Stephen Wolfram
Tucker Carlson | Vitalik Buterin | Yuval Harari
Chinese
迪丽热巴 (Dílì Rèbā) | 蔡依林 (Cài Yīlín) | 吴亦凡 (Wú Yìfán) | 李易峰 (Lǐ Yìfēng) | 杨幂 (Yáng Mì) | 赵丽颖 (Zhào Lìyǐng)
Korean
BTS 진 (Jin) | BTS RM | IU (아이유) | 이병헌 | 이정재 | 유재석
Japanese
綾瀬はるか (Ayase Haruka)
💻 System Requirements
- OS: Windows 10/11 (64-bit) ※ Linux/Mac unsupported
- GPU: NVIDIA with CUDA 12.4 (recommended)
- VRAM: 4GB+ (8GB+ preferred)
- RAM: 4GB+
- Storage: 20GB+ free space
- Internet: Required
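The requirements above can be checked before installing. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and RAM and VRAM have to be supplied by the caller because the standard library cannot measure them portably.

```python
import platform
import shutil

# Thresholds copied from the requirements list above.
MIN_FREE_GB = 20
MIN_RAM_GB = 4
MIN_VRAM_GB = 4


def check_requirements(os_name, is_64bit, free_gb, ram_gb=None, vram_gb=None):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if os_name != "Windows":
        problems.append(f"unsupported OS: {os_name}")
    if not is_64bit:
        problems.append("a 64-bit OS is required")
    if free_gb < MIN_FREE_GB:
        problems.append(f"only {free_gb:.1f} GB free, need {MIN_FREE_GB}+")
    # RAM and VRAM are checked only when the caller provides them.
    if ram_gb is not None and ram_gb < MIN_RAM_GB:
        problems.append(f"only {ram_gb} GB RAM, need {MIN_RAM_GB}+")
    if vram_gb is not None and vram_gb < MIN_VRAM_GB:
        problems.append(f"only {vram_gb} GB VRAM, need {MIN_VRAM_GB}+")
    return problems


def check_this_machine(install_dir="."):
    """Run the OS and disk-space checks against the current machine."""
    free_gb = shutil.disk_usage(install_dir).free / 1024**3
    return check_requirements(
        os_name=platform.system(),
        is_64bit=platform.machine().endswith("64"),
        free_gb=free_gb,
    )
```

Calling `check_this_machine()` from the folder where Voice-Pro will be installed returns the list of unmet requirements, if any.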
📀 Installation
Install Voice-Pro with ease using configure.bat and start.bat.
1. Get the Package
- Clone the repository, or download the latest release (Source code (zip)) from the repository's Releases page:

git clone https://github.com/abus-aikorea/voice-pro.git
2. Install & Run
- 🚀 configure.bat
  - Sets up git, ffmpeg, and CUDA (if an NVIDIA GPU is present)
  - Run once; takes an hour or more and requires an internet connection
  - Don’t close the command window while it runs
- 🚀 start.bat
  - Launches the Voice-Pro WebUI
  - First run installs dependencies (1+ hour)
  - If problems occur, delete installer_files and retry
3. Update
- 🚀 update.bat: Refreshes Python environment (faster than reinstall)
4. Uninstall
- Run uninstall.bat or delete the folder (portable install)
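The recovery step mentioned above (delete installer_files, then run configure.bat and start.bat again) can be scripted. The helper below is a hypothetical sketch, not part of Voice-Pro; it only removes the folder and reports which scripts to run next.

```python
import shutil
from pathlib import Path


def reset_install(voice_pro_dir):
    """Delete installer_files and return (removed, scripts_to_run_next)."""
    root = Path(voice_pro_dir)
    target = root / "installer_files"
    removed = target.is_dir()
    if removed:
        # Removes the downloaded environment; the next run rebuilds it.
        shutil.rmtree(target)
    # The scripts are returned in the order they should be run.
    return removed, [root / "configure.bat", root / "start.bat"]
```

Expect the next run of the scripts to take about as long as a first install, since the dependencies are downloaded again.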
❓ Tips & Tricks
If the browser does not open automatically
- Close the Command Prompt window and run start.bat again.
- Or open the browser yourself and enter the address shown in the Command Prompt window (e.g. http://127.0.0.1:7870) in the address bar.
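Before opening the address by hand, it can help to confirm that the WebUI is actually listening. The sketch below assumes the example address above (127.0.0.1, port 7870); the real port is whatever the command window prints, and both function names are illustrative.

```python
import socket
import webbrowser


def is_listening(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def open_webui(host="127.0.0.1", port=7870):
    """Open the WebUI in the default browser if the server is up."""
    if not is_listening(host, port):
        return False  # server not started yet, or running on another port
    webbrowser.open(f"http://{host}:{port}")
    return True
```

If `open_webui()` returns False, start.bat has probably not finished launching yet or is using a different port.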
If a CUDA Out-Of-Memory error occurs
- Check the GPU memory status in Windows Task Manager - Performance tab.
- Set the Denoise level to 0 or 1. Denoise level 2 requires at least 8GB of GPU memory.
- Set Compute Type to an int type. Float types give better quality but require more GPU memory.
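The advice above can be turned into a simple rule of thumb. In this sketch only the 8GB threshold for denoise level 2 comes from this README; the 4GB cut-off is borrowed from the minimum VRAM requirement, and the compute type names `float16` and `int8` are the ones used by Faster-Whisper, assumed here to correspond to the options in the WebUI.

```python
def suggest_settings(vram_gb):
    """Map available GPU memory (GB) to conservative WebUI settings."""
    if vram_gb >= 8:
        # Denoise level 2 needs at least 8GB of GPU memory.
        return {"denoise_level": 2, "compute_type": "float16"}
    if vram_gb >= 4:
        # Assumed middle tier: lighter denoising, quantized model.
        return {"denoise_level": 1, "compute_type": "int8"}
    # Below the documented minimum: turn denoising off entirely.
    return {"denoise_level": 0, "compute_type": "int8"}
```

For example, `suggest_settings(6)` suggests denoise level 1 with an int8 model.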
How can I improve subtitle quality?
- Larger Whisper models usually produce better subtitles, though not in every case. From largest to smallest: large > medium > small > base > tiny
- Among compute types, the float types give the best accuracy. The int types are quantized models that use less GPU memory and run faster, at some cost in accuracy.
- Raising the denoise level removes more background sound, so only the remaining voice is passed to speech recognition. This does not always improve results.
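One way to apply the model-size ordering is to pick the largest model that fits in GPU memory. The VRAM figures below are the approximate values listed in the openai/whisper README; they are not Voice-Pro measurements, and Faster-Whisper with an int compute type typically needs less. The function name is hypothetical.

```python
# Approximate VRAM needs in GB, ordered from largest model to smallest.
# Figures are from the openai/whisper README and are only a rough guide.
WHISPER_VRAM_GB = [
    ("large", 10),
    ("medium", 5),
    ("small", 2),
    ("base", 1),
    ("tiny", 1),
]


def pick_model(vram_gb):
    """Return the largest Whisper model expected to fit in vram_gb."""
    for name, need_gb in WHISPER_VRAM_GB:
        if vram_gb >= need_gb:
            return name
    return "tiny"  # smallest model as a last resort
```

Since larger is not always better, treat the result as a starting point and compare it with the next size down on your own material.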
📢 Caution
Windows Defender may warn that Voice-Pro is an untrusted application and block it from running.
If SmartScreen is set to “Warn”, click “More info” and then “Run anyway”.
If SmartScreen is set to “Block”, there is no button to continue. In that case, open the Properties of start.bat, check “Unblock”, apply the change, and run start.bat again.
When Windows Defender mistakenly flags a batch file as a Trojan, this is called a false positive. You can resolve it in one of the following ways:
- Add an exclusion: Windows Defender can skip scanning for specific files or processes. On Windows 10:
  - Click ‘Start’ and open ‘Settings’.
  - Click ‘Update & Security’.
  - Select ‘Windows Security’, then ‘Virus & threat protection’.
  - Under ‘Virus & threat protection settings’, click ‘Manage settings’.
  - Under ‘Exclusions’, click ‘Add or remove exclusions’, then ‘Add an exclusion’.
  - Choose ‘File’ or ‘Folder’, locate the batch file, and add it.
- Temporarily disable Windows Defender: This works only as a stopgap. Use it with care, because it exposes your computer to other threats while protection is off.
- Report the false positive to Microsoft: If you are sure the file is not a Trojan, you can submit it to Microsoft as a false positive. Microsoft will review it and take any necessary action.
🚨 Notice
- This repository offers a free trial of Voice-Pro.
- The free trial version of Voice-Pro allows you to process up to 60 seconds of media.
- The subscription version supports Microsoft Azure TTS and Translator. Purchase it on Shopify.
| | Trial Version | ☕ Contributor Version | Subscription Version |
|---|---|---|---|
| Media Length Limit | 60 seconds | Unlimited | Unlimited |
| Translation Service | Google Translate (Open Source) | Google Translate (Open Source) | Azure Translate (Microsoft) |
| Text-to-Speech Service | Edge TTS (Open Source) | Edge TTS (Open Source) | Azure TTS (Microsoft) |
☕ Contributions
Hello, I’m David from the Voice-Pro team.
Our team discovers the best AI technologies in the industry and provides them for anyone to use easily and conveniently.
We are a small startup in Korea that has only been around for a year. We are working hard to help you and other creators produce great content.
Your ⭐⭐⭐⭐⭐ review would be greatly appreciated as it helps our business grow with you. Please help support our small team.
Thank you,
ABUS Customer Service
- If you would like to take part in this project, feel free to open an issue.
- If you find a problem, please submit a pull request to help improve the project.
- Any type of contribution is welcome.
- For inquiries about purchases, business partnerships, technical tuning, investments, and other matters, please contact us by email ([email protected]).
- If you like this project, please star this repository. We would greatly appreciate it. ⭐⭐⭐
- You can support Voice-Pro with a donation here:

📬 Contact
👍 YouTube
🙏 Credits
©️ Copyright
by ABUS