Background
Lately I have been on a quest to find a good-quality AI TTS tool that I can run locally, in order to create audio for some videos I am making.
I have limited knowledge of neural/Bayesian networks, and the field has moved a lot since I last studied it in detail, almost a decade ago.
So I am admittedly a newcomer to everything TTS-related.
What I tried
At first I tried online SaaS tools, like ElevenLabs, but the restrictions are massive and I simply cannot afford to pay.
So I moved to local tools. I tried:
The first 3 failed because they are either no longer maintained, require an NVIDIA GPU (which I don’t have), or simply lack enough guides/information online on how to train models with them.
I am currently trying out Piper, but I am having trouble finding voice datasets in the format it requires for training (I only know of a German one, and I need an English one).
What I need
I am looking for a tool that can generate high-quality male-voiced audio to read lectures. I don’t need it to be super efficient, but I do need it to work without an NVIDIA GPU. Given my novice status here, I would also really appreciate it if there were a community that could help with my questions when setting up or using the tool.
Which TTS tools would you recommend that fit these requirements?