dertilo/speech-recognition

Speech-Recognition

based on NeMo

  • Open In Colab

based on espnet

PyTorch implementation of DeepSpeech2 trained with the CTC objective.
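A model trained with the CTC objective emits one label per audio frame, including a special blank label, so its output has to be collapsed before it can be compared with a transcript. The following is a minimal sketch of greedy (best-path) CTC decoding in plain Python. The function name `ctc_greedy_decode`, the toy alphabet and the choice of index 0 for the blank are assumptions made for illustration; the decoder actually used by this repository may differ.

```python
from itertools import groupby


def ctc_greedy_decode(frame_label_ids, alphabet, blank_id=0):
    """Collapse a per-frame label sequence into text (best-path CTC decoding).

    frame_label_ids: the most likely label index for every frame.
    alphabet: maps label index to character (hypothetical, for illustration).
    blank_id: index of the CTC blank label (assumed to be 0 here).
    """
    # Step 1: merge consecutive repeats of the same label.
    collapsed = [label for label, _ in groupby(frame_label_ids)]
    # Step 2: drop the blank label.
    return "".join(alphabet[i] for i in collapsed if i != blank_id)


if __name__ == "__main__":
    toy_alphabet = ["_", "a", "b", "c"]  # "_" stands for the blank
    print(ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0], toy_alphabet))
```

The blank between the two runs of `a` is what allows a repeated character to survive the collapsing step.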

differences from deepspeech.pytorch

  • results after 8 epochs (24 hours of training) with Adam
    python evaluation.py --model epoch=8.ckpt --datasets test-clean
    2528 of 2620 samples are suitable for training
    100%|█████████████████████████████████████| 127/127 [02:12<00:00,  1.04s/it]
    Test Summary    Average WER 9.925       Average CER 3.239

    python evaluation.py --model epoch=8.ckpt --datasets test-other
    2893 of 2939 samples are suitable for training
    100%|███████████████████████████████████████| 145/145 [01:19<00:00,  1.83it/s]
    Test Summary    Average WER 27.879      Average CER 11.739
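For readers unfamiliar with the two metrics in the summaries above: WER and CER are edit distances between the reference transcript and the model output, counted in words and characters respectively, and divided by the reference length. The sketch below shows one common way to compute them for a single utterance. The function names are hypothetical, and two details are assumptions rather than facts about `evaluation.py`: spaces are removed before computing CER, and values are reported as percentages.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent for one utterance."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, ignoring spaces (an assumption)."""
    ref_chars = reference.replace(" ", "")
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return 100.0 * edit_distance(ref_chars, hypothesis.replace(" ", "")) / len(ref_chars)


if __name__ == "__main__":
    print(wer("the cat sat on the mat", "the cat sit on mat"))
    print(cer("the cat sat on the mat", "the cat sit on mat"))
```

Whether the reported "Average" is a mean over utterances or a corpus-level ratio is not stated here, so the sketch stops at the per-utterance values.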

Datasets

LibriSpeech

  • to download the data, see: https://github.com/dertilo/speech-to-text/corpora/download_corpora.py
  • splits
    datasets = [
        ("train", ["train-clean-100", "train-clean-360", "train-other-500"]),
        ("eval", ["dev-clean", "dev-other"]),
        ("test", ["test-clean", "test-other"]),
    ]
    
  • number of samples
    train got 281241 samples
    eval got 5567 samples
    test got 5559 samples
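The sample counts above can be sanity-checked against a local copy of the corpus. The sketch below reuses the split definition and counts the `.flac` files under each subset folder, relying on the standard LibriSpeech layout (`<subset>/<speaker>/<chapter>/*.flac`). The function name `count_samples` and the root path in the usage line are hypothetical; this is not the repository's own loading code.

```python
from pathlib import Path

# Split definition copied from the list above.
DATASETS = [
    ("train", ["train-clean-100", "train-clean-360", "train-other-500"]),
    ("eval", ["dev-clean", "dev-other"]),
    ("test", ["test-clean", "test-other"]),
]


def count_samples(root, datasets=DATASETS):
    """Count .flac utterances per split under an extracted LibriSpeech root."""
    root = Path(root)
    counts = {}
    for split, subsets in datasets:
        total = 0
        for subset in subsets:
            subset_dir = root / subset
            if subset_dir.is_dir():  # subsets that were not downloaded count as 0
                total += sum(1 for _ in subset_dir.rglob("*.flac"))
        counts[split] = total
    return counts


if __name__ == "__main__":
    # Hypothetical location of the extracted corpus.
    for split, n in count_samples("data/LibriSpeech").items():
        print(f"{split} got {n} samples")
```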
    
