Towards an end-to-end speech recognizer for Portuguese using deep neural networks

This repository contains the implementation of the SBRT 2017 paper entitled Towards an end-to-end speech recognizer for Portuguese using deep neural networks.

Training a character-based all-neural Brazilian Portuguese speech recognition model

The model was trained on four datasets: CSLU Spoltech (LDC2006S16), Sid, VoxForge, and LapsBM1.4. Only the CSLU dataset is paid; the other three are freely available.

Setting up the (partial) Brazilian Portuguese Speech Dataset (BRSD)

You can download the freely available datasets with the provided script (it may take a while):

$ cd data; sh download_datasets.sh

Next, you can preprocess the downloaded datasets into an HDF5 file. Click here for more information.

$ python -m extras.make_dataset --parser brsd

Training the network

You can train the network with the main.py script. For more usage information see this. To train with the default parameters:

$ python main.py train --dataset .datasets/brsd/data.h5

Pre-trained model

You can download an sbrt2017 model pre-trained on the full BRSD dataset (including the CSLU dataset):

$ cd data; sh download_model.sh

You can also evaluate the model against the BRSD test set:

$ python main.py eval --model data/models/sbrt2017.h5 --dataset .datasets/brsd/data.h5
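This README does not say which metric the eval command reports. For a character-based recognizer, a common choice is the character error rate: the edit distance between the predicted and reference transcriptions, divided by the reference length. The sketch below illustrates that metric only. The function names `edit_distance` and `character_error_rate` are hypothetical and are not taken from this repository.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    # previous[j] holds the distance between the reference prefix seen so far
    # and the first j items of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def character_error_rate(references, hypotheses):
    """Total edit distance divided by the total number of reference characters."""
    total_chars = sum(len(ref) for ref in references)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_errors = sum(edit_distance(ref, hyp)
                       for ref, hyp in zip(references, hypotheses))
    return total_errors / float(total_chars)
```

For example, `character_error_rate(["casa"], ["caso"])` counts one substitution over four reference characters.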

Requirements

  • Python 2.7
  • Numpy
  • Scipy
  • Pyyaml
  • HDF5
  • Unidecode
  • Librosa
  • Tensorflow
  • Keras
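A quick way to check the list above before training is to try importing each package. The following is a minimal sketch, not part of this repository. It assumes that Pyyaml is imported as `yaml` and that HDF5 is accessed through the `h5py` binding, and the helper name `missing_requirements` is hypothetical. It uses only `importlib.import_module`, so it should also run under Python 2.7.

```python
import importlib

# Requirement name -> importable module name.
# Assumptions: "Pyyaml" is imported as `yaml`, "HDF5" is used through `h5py`.
REQUIRED_MODULES = {
    "Numpy": "numpy",
    "Scipy": "scipy",
    "Pyyaml": "yaml",
    "HDF5": "h5py",
    "Unidecode": "unidecode",
    "Librosa": "librosa",
    "Tensorflow": "tensorflow",
    "Keras": "keras",
}


def missing_requirements(modules=None):
    """Return the sorted requirement names whose module fails to import."""
    if modules is None:
        modules = REQUIRED_MODULES
    missing = []
    for name, module in modules.items():
        try:
            importlib.import_module(module)
        except ImportError:
            missing.append(name)
    return sorted(missing)
```

Calling `missing_requirements()` returns an empty list when every package can be imported. It does not check package versions.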

Acknowledgements

License

See LICENSE for more information.
