MKaczkow/keyword_based_asr

Keyword Based ASR

Repository for the implementation of a keyword-based ASR system.

Code style: black

Setup

  • Create a virtual environment and install the requirements from requirements.txt.
  • NOTE: the NeMo toolkit is not supported on Windows, so WSL or a UNIX-based OS is required; see the NeMo docs or GitHub.
  • The minimal necessary data is already in the repo. To reproduce the training process and/or test other keywords, download the full datasets from this link (if the link doesn't work, please contact me via email: maciej.kaczkowski.stud@pw.edu.pl) and put them in the Data directory (check the Data README for the structure).
  • Modify the config file to match your setup: every path with the suffix DATA_DIR must point to a location on your machine.
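
Before running the notebooks, it can help to sanity-check the setup. The snippet below is a minimal sketch, not part of the repo: it assumes the config values can be loaded into a plain dictionary, and the helper names (`is_supported_platform`, `missing_data_dirs`) and the example keys are hypothetical. It checks that the OS is not native Windows and lists every `*DATA_DIR` entry that does not point to an existing directory.

```python
import platform
from pathlib import Path
from typing import Dict, List, Optional


def is_supported_platform(system: Optional[str] = None) -> bool:
    """Return False on native Windows, where NeMo is not supported.

    Under WSL, platform.system() reports "Linux", so WSL passes this check.
    """
    system = system or platform.system()
    return system != "Windows"


def missing_data_dirs(config: Dict[str, str]) -> List[str]:
    """Return the names of *DATA_DIR entries that are not existing directories."""
    return [
        name
        for name, value in config.items()
        if name.endswith("DATA_DIR") and not Path(value).is_dir()
    ]


if __name__ == "__main__":
    # Hypothetical entries; the real names live in the repo's config file.
    example_config = {
        "TRAIN_DATA_DIR": "Data/train",
        "SAMPLE_RATE": "16000",  # ignored: name does not end with DATA_DIR
    }
    if not is_supported_platform():
        print("Native Windows detected: use WSL or a UNIX-based OS for NeMo.")
    for name in missing_data_dirs(example_config):
        print(f"{name} does not point to an existing directory")
```

Only keys ending in `DATA_DIR` are inspected, which mirrors the naming convention described above; other config values are left alone.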

Repository structure

  • Data - directory with data
  • Utils - directory containing utility scripts and config files
  • Models - directory containing trained models' weights
  • notebooks in root directory
    • main
      • demo.ipynb - notebook demonstrating dual-model keyword-based speaker recognition system
      • speaker-recognition.ipynb - notebook with speaker recognition model training and evaluation
      • keyword-recognition.ipynb - notebook with keyword spotting model demonstration and evaluation
    • supplementary
      • get-data.ipynb - check audio files metadata
      • visualize-spectrograms.ipynb - visualize spectrograms of audio files
      • play-sound.ipynb - sanity-check audio files
