Diffstat (limited to 'readme.md')
-rw-r--r-- readme.md 20
1 files changed, 13 insertions, 7 deletions
diff --git a/readme.md b/readme.md
index 86ab694..1bf726f 100644
--- a/readme.md
+++ b/readme.md
@@ -2,14 +2,20 @@
> spoken word recognition using CTC LSTMs
-## Installation
+## Instructions
-- `python -m venv venv`
-- `./venv/bin/pip install -r requirements.txt`
-- `./venv/bin/python main.py train`
-- `./venv/bin/python main.py test`
+- Create a virtual environment: `python -m venv venv`
+- Install the required packages:
+ `./venv/bin/pip install -r requirements.txt`
+- Train the model: `./venv/bin/python main.py train` (takes a few hours
+ and needs around 20GB disk and 5GB memory)
+ - or download my pre-trained model (25 epochs, **not good**) from
+ [here](https://marvinborner.de/model-final.ckpt) and move it to
+ `target/model-final.ckpt`
+- Test the final model: `./venv/bin/python main.py test`
+- Infer text from flac: `./venv/bin/python main.py infer audio.flac`
## Note
-- This is a proof-of-concept
-- Does not use CUDA but should be easy to implement
+- This is a proof-of-concept
+- Does not use CUDA but should be easy to implement