Speech Synthesis using NVIDIA Flowtron

seedhartha · April 15, 2021

Hello,

I have recently discovered this video, which got me interested in the field of speech synthesis. I have researched the subject and, as it turns out, today's technology is quite capable of producing semi-realistic speech, which I believe is more than good enough for modders to do voice-overs. I was able to replicate the results shown in the video and, while at it, wrote some Python scripts and a guide of sorts on how to train a model on Bastila's (or any other character's) voice. Here's the link:

https://github.com/seedhartha/reone/wiki/TTS-Research

Training an AI model is a tedious and error-prone process, but this step can be avoided altogether if you use a pre-trained model. I'm not sure I want to make my models public due to legal concerns, but contact me if you're interested in the topic (e.g. you need a voice-over for your mod), and we can work something out.

Sith Holocron · April 15, 2021

I believe @lachjames is also working on speech synthesis. Perhaps you should discuss the matter in a PM?

At the very least you can ask him to share all of the files that I sent him so you have extra material to train your models.

seedhartha · April 15, 2021

5 minutes ago, Sith Holocron said:

I believe @lachjames is also working on speech synthesis. Perhaps you should discuss the matter in a PM?

At the very least you can ask him to share all of the files that I sent him so you have extra material to train your models.

Yes, @lachjames and I are in close contact on the matter. I'm curious, what are these extra materials you're talking about? My primary sources are TLK and DLG files, and I have sufficient tooling to extract this data.
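
For readers wondering what happens after extraction, here is a minimal sketch of turning extracted lines into a training filelist. It assumes the dialogue has already been pulled out of the TLK and DLG files as (sound file name, text) pairs, and that the trainer expects pipe-separated `path|text|speaker_id` lines, which is how the sample filelists in NVIDIA's Flowtron repository are laid out. The function name `build_filelist` and the sample names are hypothetical, not part of any existing tooling.

```python
from pathlib import Path


def build_filelist(pairs, audio_dir, speaker_id):
    """Return 'path|text|speaker_id' lines for pairs that have a WAV file.

    `pairs` is an iterable of (sound_name, text) tuples. How they are
    extracted from TLK/DLG files is out of scope here (assumption).
    """
    lines = []
    for sound_name, text in pairs:
        wav = Path(audio_dir) / f"{sound_name.lower()}.wav"
        # Collapse newlines and repeated spaces into single spaces.
        text = " ".join(text.split())
        # Skip empty lines, text that would break the '|' separator,
        # and entries without a matching audio file.
        if not text or "|" in text or not wav.is_file():
            continue
        lines.append(f"{wav.as_posix()}|{text}|{speaker_id}")
    return lines
```

Writing the result out is then a matter of `Path("train_filelist.txt").write_text("\n".join(lines))`, with one speaker ID per character voice.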

Sith Holocron · April 15, 2021

Sound files. From assorted sources.

seedhartha · April 16, 2021

To give you an idea, here's what I've managed so far. This really isn't the best example, and @lachjames got a lot further than that. These models were both trained for 10,000 iterations (around 1 hour of fine-tuning, after which the validation loss plateaus) using the method described on my wiki page.

Just to make it perfectly clear, this is not the project I'm actively working on, but rather a discovery I wanted to share, and a guide on how to get similar results.

atton_as_canderous.mp3 bastila_as_kreia.mp3
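
As an aside on the "validation loss plateaus" remark in the post above: one common way to decide when to stop fine-tuning is a patience check over the logged validation losses. The sketch below is a generic illustration, not taken from the wiki page or from Flowtron; `plateau_iteration`, `patience`, and `min_delta` are hypothetical names and the thresholds are arbitrary.

```python
def plateau_iteration(val_losses, patience=3, min_delta=1e-3):
    """Return the index of the check at which the loss is judged to have
    plateaued, or None if it keeps improving.

    A check counts as an improvement only if it beats the best loss seen
    so far by more than `min_delta`. After `patience` consecutive checks
    without improvement, the loss is considered flat.
    """
    best = float("inf")
    stale = 0
    for i, loss in enumerate(val_losses):
        if best - loss > min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return i
    return None
```

If validation runs every 1,000 iterations, an index returned here maps back to an iteration count by multiplying by that interval.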

Guest Qui-Gon Glenn · April 16, 2021

My phone is somehow missing the codec required to listen to this. Regardless, interesting this is.