SillyTavern-extras: Switch to the neo GitHub branch
SillyTavern: Switch to the staging branch
In a file browser, navigate to \SillyTavern-extras\data\models\rvc, then copy your .pth and .index files into that rvc folder
Install requirements with:
pip install -r requirements-rvc.txt
python server.py --coqui-gpu --enable-modules=caption,summarize,classify,rvc,coqui-tts --classification-model=joeddav/distilbert-base-uncased-go-emotions-student --share
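If you turn modules on and off a lot, the launch line above can be built from a list instead of edited by hand. This is only a sketch: the function name build_server_command and its parameters are made up for illustration, and the flags are exactly the ones from the command above.

```python
def build_server_command(modules, classification_model, coqui_gpu=True, share=True):
    """Build the server.py launch command as a list of arguments."""
    command = ["python", "server.py"]
    if coqui_gpu:
        command.append("--coqui-gpu")
    # Modules are passed as one comma-separated value, as in the command above.
    command.append("--enable-modules=" + ",".join(modules))
    command.append("--classification-model=" + classification_model)
    if share:
        command.append("--share")
    return command


MODULES = ["caption", "summarize", "classify", "rvc", "coqui-tts"]
MODEL = "joeddav/distilbert-base-uncased-go-emotions-student"

# Print the command so it can be copied into a terminal.
print(" ".join(build_server_command(MODULES, MODEL)))
```

The returned list can also be handed to subprocess.run(command, cwd=...) with cwd set to your SillyTavern-extras folder.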
In SillyTavern, go to Extensions --> RVC and enable it
Voicemap:
INSERTCHARCARDNAMEHERE:INSERTYOUREPICVOICEHERE
Select pitch extraction: rmvpe
Go to Extensions --> TTS and enable it
Select TTS Provider: Coqui
Voicemap:
INSERTCHARCARDNAMEHERE:tts_models--multilingual--multi-dataset--your_tts\INSERTYOUREPICVOICEHERE.pth[5][0]
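The two voicemap lines follow a fixed pattern, so they can be generated from a character card name and a voice name. A minimal sketch, assuming the patterns shown above: the function names are hypothetical, "Megumin" is just an example name, and the trailing [5][0] is copied as-is because these notes do not say what the two indices select.

```python
# Prefix and trailing indices are copied verbatim from the Coqui voicemap line;
# what [5][0] refers to is not stated here, so they are kept literal.
COQUI_MODEL_PREFIX = "tts_models--multilingual--multi-dataset--your_tts"
COQUI_SUFFIX = "[5][0]"


def rvc_voicemap(character, voice):
    """RVC voicemap entry: CHARACTER:VOICE."""
    return f"{character}:{voice}"


def coqui_voicemap(character, voice):
    """Coqui voicemap entry: CHARACTER:<model prefix>\\VOICE.pth[5][0]."""
    return f"{character}:{COQUI_MODEL_PREFIX}\\{voice}.pth{COQUI_SUFFIX}"


print(rvc_voicemap("Megumin", "Megumin"))
print(coqui_voicemap("Megumin", "Megumin"))
```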
Depending on your Windows version:
Windows 10:
pip install espeak-ng
Windows 11: Download the latest .msi package from:
https://github.com/espeak-ng/espeak-ng/releases/
Choose any speaker you like.
Put the .pth in the weights folder.
Make a folder for the index and give it the same name as the .pth file.
Put the .index inside that folder, then put the folder in the logs folder.
Example .pth:
\Mangio-RVC-v23.7.0\weights\Megumin.pth
Example .index:
\Mangio-RVC-v23.7.0\logs\Megumin\added_IVF136_Flat_nprobe_1_Megumin_v2.index
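The file placement above can be scripted if you do it often. A rough sketch, assuming only the layout shown in the two example paths (weights\NAME.pth and logs\NAME\something.index); install_voice is a made-up helper name, not part of Mangio-RVC.

```python
import shutil
from pathlib import Path
from typing import Tuple


def install_voice(rvc_root: Path, pth_file: Path, index_file: Path) -> Tuple[Path, Path]:
    """Copy a .pth into weights/ and its .index into logs/<name of the .pth>/."""
    weights_dir = rvc_root / "weights"
    # The index folder gets the same name as the .pth file (without extension).
    index_dir = rvc_root / "logs" / pth_file.stem
    weights_dir.mkdir(parents=True, exist_ok=True)
    index_dir.mkdir(parents=True, exist_ok=True)
    pth_dest = Path(shutil.copy2(pth_file, weights_dir / pth_file.name))
    index_dest = Path(shutil.copy2(index_file, index_dir / index_file.name))
    return pth_dest, index_dest
```

Call it with the path to your Mangio-RVC folder and the two downloaded files, for example install_voice(Path(r"\Mangio-RVC-v23.7.0"), Path("Megumin.pth"), Path("added_IVF136_Flat_nprobe_1_Megumin_v2.index")).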
Put the voice audio you want to train on in the datasets folder
Make sure there is NO BACKGROUND NOISE in the audio file, only raw voice! The longer the audio, the better the output quality.
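Since longer audio helps, it can be handy to check how much you actually have before training. A small sketch using only the Python standard library; it assumes your dataset is uncompressed .wav files (these notes do not say which formats are accepted), and total_wav_seconds is a hypothetical helper name.

```python
import wave
from pathlib import Path


def total_wav_seconds(dataset_dir: Path) -> float:
    """Sum the duration, in seconds, of every .wav file directly inside dataset_dir."""
    total = 0.0
    for wav_path in sorted(dataset_dir.glob("*.wav")):
        with wave.open(str(wav_path), "rb") as wav:
            # Duration = number of frames / frames per second.
            total += wav.getnframes() / wav.getframerate()
    return total
```

For example, total_wav_seconds(Path("datasets")) / 60 gives the dataset length in minutes. It cannot detect background noise, so you still have to listen to the files yourself.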
In the webui, click on the Train tab
Enter the experiment name: my-epic-voice-model
Set version to v2
Click on Process data
Click on Feature extraction
Set Save frequency to 50
Set Total training epochs to 300
Click on Train feature index
Click on Train model
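To see what the two numbers above mean together: assuming a checkpoint is written every "Save frequency" epochs (that is how the setting reads, but check your webui version), 300 epochs with a save frequency of 50 gives six checkpoints. The helper name checkpoint_epochs is made up for this sketch.

```python
def checkpoint_epochs(total_epochs, save_frequency):
    """Epochs at which a checkpoint would be saved, assuming one every save_frequency epochs."""
    return list(range(save_frequency, total_epochs + 1, save_frequency))


print(checkpoint_epochs(300, 50))  # [50, 100, 150, 200, 250, 300]
```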