Diffstat (limited to 'readme.md')
-rw-r--r-- | readme.md | 20 |
1 files changed, 13 insertions, 7 deletions
@@ -2,14 +2,20 @@
 
 > spoken word recognition using CTC LSTMs
 
-## Installation
+## Instructions
 
-- `python -m venv venv`
-- `./venv/bin/pip install -r requirements.txt`
-- `./venv/bin/python main.py train`
-- `./venv/bin/python main.py test`
+- Create a virtual environment: `python -m venv venv`
+- Install the required packages:
+  `./venv/bin/pip install -r requirements.txt`
+- Train the model: `./venv/bin/python main.py train` (takes a few hours
+  and needs around 20GB disk and 5GB memory)
+  - or download my pre-trained model (25 epochs, **not good**) from
+    [here](https://marvinborner.de/model-final.ckpt) and move it to
+    `target/model-final.ckpt`
+- Test the final model: `./venv/bin/python main.py test`
+- Infer text from flac: `./venv/bin/python main.py infer audio.flac`
 
 ## Note
 
-- This is a proof-of-concept
-- Does not use CUDA but should be easy to implement
+- This is a proof-of-concept
+- Does not use CUDA but should be easy to implement
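
The added README lines offer a shortcut: download the pre-trained checkpoint instead of training. A minimal sketch of that path follows, assuming `curl` is available and the virtual environment already exists. The URL and target path come from the README; the `run` helper and the `DRY_RUN` switch are illustrative and not part of the project. By default the script only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch: fetch the pre-trained model and test it, skipping the training step.
set -eu

MODEL_URL="https://marvinborner.de/model-final.ckpt"  # URL given in the README
MODEL_PATH="target/model-final.ckpt"                  # destination given in the README

# Hypothetical helper: print the command unless DRY_RUN=0 is set,
# in which case the command is actually executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the target directory, download the checkpoint, then run the test step.
run mkdir -p "$(dirname "$MODEL_PATH")"
run curl -fL -o "$MODEL_PATH" "$MODEL_URL"
run ./venv/bin/python main.py test
```

Running it with `DRY_RUN=0` performs the download and the test; leaving the variable unset is a safe way to review the commands first.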