nixolas1/epub-to-audiobook-local

 
 

Epub to Audiobook (M4B)

Epub to M4B Audiobook, with StyleTTS2 via a local TTS API

Notes

  • This fork runs the TTS through a local server / predictor. I got it working with https://replicate.com/adirik/styletts2, but others should work as well.
  • You need approx. 5 GB of VRAM and 5 GB of RAM to run the model locally (a bit less without a custom voice).
  • Generation is designed to handle failure gracefully: if it crashes or errors partway through (this happens occasionally with the HF API), the next run skips the chapters that were already generated and resumes from the first missing one. This is also handy for breaking up the generation of large books. The m4b file is only created once all chapters have been generated.
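
The skip-and-resume behaviour described in the last note can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the `chapter_000.wav` naming scheme and the helper names `missing_chapters` and `ready_to_merge` are assumptions made up for this example.

```python
from pathlib import Path


def missing_chapters(out_dir: Path, chapter_count: int, ext: str = ".wav") -> list[int]:
    """Return the indices of chapters that still need to be generated.

    Assumes (hypothetically) one audio file per chapter, named
    chapter_000.wav, chapter_001.wav, ... An empty file is treated as
    missing, since a crash may leave a zero-byte file behind.
    """
    missing = []
    for index in range(chapter_count):
        audio_file = out_dir / f"chapter_{index:03d}{ext}"
        if not audio_file.is_file() or audio_file.stat().st_size == 0:
            missing.append(index)
    return missing


def ready_to_merge(out_dir: Path, chapter_count: int) -> bool:
    """The final m4b should only be built once no chapter is missing."""
    return not missing_chapters(out_dir, chapter_count)
```

On a rerun, generation would start at the first index returned by `missing_chapters`, and the merge step would only run when `ready_to_merge` returns `True`.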

Directions

  • Clone this repository locally.

  • Install all dependencies, as needed: pip install -r requirements.txt. Also make sure you have ffmpeg installed!

  • Run a local StyleTTS2 server using docker run -d -p 5000:5000 --gpus=all nixolas1/styletts2-api

  • Run using python3 epub-to-audiobook-hf.py <filename-of-epub> --voice reference_voice.wav

  • Using the command line flag --voice reference_voice.wav is recommended. Leaving it out defaults to LJSpeech, which is faster but (imo) worse sounding and has only one voice.
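
Since the steps above depend on ffmpeg being installed, a small pre-flight check can catch a missing install before a long generation run. This is only a sketch using the standard library; the helper name `missing_tools` is hypothetical and not part of this repository.

```python
import shutil


def missing_tools(tools=("ffmpeg",)):
    """Return the names of required executables that are not on PATH."""
    # shutil.which returns None when the executable cannot be found.
    return [tool for tool in tools if shutil.which(tool) is None]
```

Calling `missing_tools()` before starting returns an empty list when ffmpeg is available, and `["ffmpeg"]` otherwise.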

A Big Thanks To:

