There are two different parts to this: _using deepspeech at all_, and _training models to process new kinds of data_. The latter may need a much beefier system than the former. Training can happen much faster, with much less data, but information on how to do that easily, publicly, and as a matter of routine has not reached me yet.

Welcome to DeepSpeech’s documentation!

DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on [1]Baidu’s Deep Speech research paper. Project DeepSpeech uses Google’s [2]TensorFlow to make the implementation easier.

To install and use DeepSpeech all you have to do is:

    # Create and activate a virtualenv
    virtualenv -p python3 $HOME/tmp/deepspeech-venv/
    source $HOME/tmp/deepspeech-venv/bin/activate

    # Install DeepSpeech
    pip3 install deepspeech

    # Download pre-trained English model files
    curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.pbmm
    curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer

    # Download example audio files
    curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/audio-0.9.3.tar.gz
    tar xvf audio-0.9.3.tar.gz

    # Transcribe an audio file
    deepspeech --model deepspeech-0.9.3-models.pbmm --scorer deepspeech-0.9.3-models.scorer --audio audio/2830-3980-0043.wav

A pre-trained English model is available for use and can be downloaded following the instructions in [6]the usage docs. For the latest release, including pre-trained models and checkpoints, [7]see the GitHub releases page.

Quicker inference can be performed using a supported NVIDIA GPU on Linux. See the [8]release notes to find which GPUs are supported.
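
The transcription command above handles one file at a time. As a minimal sketch, it could be wrapped in a loop to cover every WAV file in a directory. The function name `transcribe_dir` and the `MODEL`, `SCORER`, and `DEEPSPEECH` variables are hypothetical names chosen for this example and are not part of DeepSpeech. Only the `--model`, `--scorer`, and `--audio` flags come from the documentation quoted here. The sketch also assumes that the CLI writes the transcript to standard output.

```shell
# Paths to the pre-trained files downloaded earlier; override via the environment.
MODEL="${MODEL:-deepspeech-0.9.3-models.pbmm}"
SCORER="${SCORER:-deepspeech-0.9.3-models.scorer}"
# Hypothetical override hook so a different executable can be substituted.
DEEPSPEECH="${DEEPSPEECH:-deepspeech}"

# Print "<file name><TAB><transcript>" for each .wav file in the given directory.
transcribe_dir() {
    dir="$1"
    for wav in "$dir"/*.wav; do
        # When the glob matches nothing it stays literal; skip that case.
        [ -e "$wav" ] || continue
        printf '%s\t' "$(basename "$wav")"
        "$DEEPSPEECH" --model "$MODEL" --scorer "$SCORER" --audio "$wav"
    done
}
```

With the example audio unpacked as described above, `transcribe_dir audio` would emit one tab-separated line per file, assuming the transcript is the only thing the CLI prints to standard output.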
To run deepspeech on a GPU, install the GPU-specific package:

    # Create and activate a virtualenv
    virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
    source $HOME/tmp/deepspeech-gpu-venv/bin/activate

    # Install DeepSpeech CUDA enabled package
    pip3 install deepspeech-gpu

    # Transcribe an audio file.
    deepspeech --model deepspeech-0.9.3-models.pbmm --scorer deepspeech-0.9.3-models.scorer --audio audio/2830-3980-0043.wav

Please ensure you have the required [9]CUDA dependencies.

See the output of deepspeech -h for more information on the use of deepspeech. (If you experience problems running deepspeech, please check [10]required runtime dependencies).

Introduction

* [11]Using a Pre-trained Model
  + ... ...

References

1. https://arxiv.org/abs/1412.5567
2. https://www.tensorflow.org/
3. https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.pbmm
4. https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer
5. https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/audio-0.9.3.tar.gz
6. https://deepspeech.readthedocs.io/en/r0.9/USING.html#usage-docs
7. https://github.com/mozilla/DeepSpeech/releases/latest
8. https://github.com/mozilla/DeepSpeech/releases/latest
9. https://deepspeech.readthedocs.io/en/r0.9/USING.html#cuda-inference-deps
10. https://deepspeech.readthedocs.io/en/r0.9/USING.html#runtime-deps
11. https://deepspeech.readthedocs.io/en/r0.9/USING.html