[wrong] Mozilla DeepSpeech stuff was Re: How to isolate/figure out whispers in audio clip?

Karl Semich 0xloem at gmail.com
Mon Jul 5 05:29:02 PDT 2021


There are two different parts to this: _using deepspeech at all_, and
_training models to process new kinds of data_.  The latter may need a much
beefier system than the former.

Training can happen much faster and with much less data, but information on
how to do that easily, publicly, and normally has not reached me yet.

Welcome to DeepSpeech’s documentation!
<https://deepspeech.readthedocs.io/en/r0.9/?badge=latest#welcome-to-deepspeech-s-documentation>

DeepSpeech is an open source Speech-To-Text engine, using a model trained
by machine learning techniques based on Baidu’s Deep Speech research paper
<https://arxiv.org/abs/1412.5567>. Project DeepSpeech uses Google’s
TensorFlow <https://www.tensorflow.org/> to make the implementation easier.

To install and use DeepSpeech all you have to do is:

# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-venv/
source $HOME/tmp/deepspeech-venv/bin/activate
# Install DeepSpeech
pip3 install deepspeech
# Download pre-trained English model files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.pbmm
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer
# Download example audio files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/audio-0.9.3.tar.gz
tar xvf audio-0.9.3.tar.gz
# Transcribe an audio file
deepspeech --model deepspeech-0.9.3-models.pbmm --scorer deepspeech-0.9.3-models.scorer --audio audio/2830-3980-0043.wav
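
The same transcription can also be driven from Python instead of the
command-line client.  What follows is only a minimal sketch.  It assumes the
deepspeech 0.9 Python package (Model, enableExternalScorer, sampleRate, stt)
and numpy are installed in the virtualenv, and that the model files
downloaded above are in the current directory.  The helper names
read_wav_pcm16 and transcribe are made up for illustration and are not part
of DeepSpeech:

```python
import wave


def read_wav_pcm16(path, expected_rate=16000):
    """Return the raw 16-bit mono PCM bytes of a WAV file, checking its format."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getframerate() != expected_rate:
            raise ValueError(
                "expected %d Hz, got %d Hz" % (expected_rate, wav.getframerate())
            )
        return wav.readframes(wav.getnframes())


def transcribe(model_path, scorer_path, wav_path):
    """Run speech-to-text on one WAV file and return the transcript string."""
    # Imported here so the WAV helper above works without these packages.
    import numpy as np
    from deepspeech import Model

    model = Model(model_path)
    model.enableExternalScorer(scorer_path)

    # Ask the model which sample rate it was trained on and require a match.
    pcm = read_wav_pcm16(wav_path, expected_rate=model.sampleRate())
    audio = np.frombuffer(pcm, dtype=np.int16)
    return model.stt(audio)
```

For example, transcribe("deepspeech-0.9.3-models.pbmm",
"deepspeech-0.9.3-models.scorer", "audio/2830-3980-0043.wav") should return
the transcript as a string.  The format check is there because, as far as I
know, the released English model expects 16 kHz mono 16-bit audio; this
sketch refuses other input rather than resampling it.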

A pre-trained English model is available for use and can be downloaded
following the instructions in the usage docs
<https://deepspeech.readthedocs.io/en/r0.9/USING.html#usage-docs>. For the
latest release, including pre-trained models and checkpoints, see the
GitHub releases page <https://github.com/mozilla/DeepSpeech/releases/latest>.

Quicker inference can be performed using a supported NVIDIA GPU on Linux.
See the release notes
<https://github.com/mozilla/DeepSpeech/releases/latest> to find which GPUs
are supported. To run deepspeech on a GPU, install the GPU-specific package:

# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
source $HOME/tmp/deepspeech-gpu-venv/bin/activate
# Install DeepSpeech CUDA enabled package
pip3 install deepspeech-gpu
# Transcribe an audio file.
deepspeech --model deepspeech-0.9.3-models.pbmm --scorer deepspeech-0.9.3-models.scorer --audio audio/2830-3980-0043.wav

Please ensure you have the required CUDA dependencies
<https://deepspeech.readthedocs.io/en/r0.9/USING.html#cuda-inference-deps>.

See the output of deepspeech -h for more information on the use of
deepspeech. (If you experience problems running deepspeech, please check
required runtime dependencies
<https://deepspeech.readthedocs.io/en/r0.9/USING.html#runtime-deps>).

Introduction

   - Using a Pre-trained Model
   <https://deepspeech.readthedocs.io/en/r0.9/USING.html>
      - ... ...