concept `LibriSpeech` in category `machine learning`


Machine Learning with TensorFlow, Second Edition MEAP V08

This is an excerpt from Manning's book Machine Learning with TensorFlow, Second Edition MEAP V08.

One set of open-source audiobook recordings is the LibriSpeech corpus, available from the Open Speech and Language Resources (OpenSLR) website. LibriSpeech is a set of short clips from audiobooks with a corresponding transcript for each clip. It includes roughly 1,000 hours of recorded English speech sampled at 16 kHz, along with metadata, the original mp3 files, and separate, aligned training sets of 100, 360, and 500 hours of speech. The dataset also includes a dev set for per-epoch validation and a test set for post-training testing.
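To make the pairing of clips and transcripts concrete, here is a minimal sketch of reading transcript lines into a lookup table. It assumes the layout used by the LibriSpeech download, in which each chapter folder holds a `*.trans.txt` file whose lines begin with an utterance ID followed by a space and the transcript text. The function name `parse_transcript_lines` and the sample lines are hypothetical, chosen only for illustration.

```python
def parse_transcript_lines(lines):
    """Map utterance IDs to transcript text.

    Assumes each non-empty line looks like '<utterance-id> <TRANSCRIPT TEXT>'.
    """
    transcripts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first space only: the ID comes first, the rest is text.
        utterance_id, _, text = line.partition(" ")
        transcripts[utterance_id] = text
    return transcripts


# Made-up sample lines in the assumed format (not real corpus content).
sample = [
    "19-198-0000 HELLO WORLD\n",
    "\n",
    "19-198-0001 A SECOND SHORT CLIP\n",
]
transcripts = parse_transcript_lines(sample)
print(transcripts["19-198-0001"])
```

Under this assumption, the utterance ID doubles as the audio file's base name, so the same key can be used to look up both the clip and its text.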

Unfortunately, the dataset isn’t directly usable in the deep speech model because the model expects audio in the Waveform Audio File Format (.wav) rather than the Free Lossless Audio Codec (.flac) format that LibriSpeech comes in. So, as usual, your first step for machine learning is (you guessed it) data preparation and cleaning!
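One way the FLAC-to-WAV conversion could be scripted is sketched below. This is not necessarily the approach the book takes: it assumes the `ffmpeg` command-line tool is installed and on your PATH, and the helper names (`wav_path_for`, `ffmpeg_command`, `convert_tree`) and the directory names are hypothetical.

```python
from pathlib import Path
import shutil
import subprocess


def wav_path_for(flac_path, src_root, dst_root):
    """Mirror a .flac path under src_root to a .wav path under dst_root."""
    relative = Path(flac_path).relative_to(src_root)
    return Path(dst_root) / relative.with_suffix(".wav")


def ffmpeg_command(flac_path, wav_path):
    """Build the ffmpeg argument list; -y overwrites an existing output file."""
    return ["ffmpeg", "-y", "-loglevel", "error",
            "-i", str(flac_path), str(wav_path)]


def convert_tree(src_root, dst_root):
    """Convert every .flac file under src_root, keeping the folder layout."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    converted = []
    for flac_path in sorted(Path(src_root).rglob("*.flac")):
        wav_path = wav_path_for(flac_path, src_root, dst_root)
        wav_path.parent.mkdir(parents=True, exist_ok=True)
        # check=True raises CalledProcessError if ffmpeg reports a failure.
        subprocess.run(ffmpeg_command(flac_path, wav_path), check=True)
        converted.append(wav_path)
    return converted


# Example of the path mapping alone (no audio is read or written here).
print(wav_path_for("LibriSpeech/train-clean-100/19/198/19-198-0001.flac",
                   "LibriSpeech", "LibriSpeech-wav"))
```

Calling `convert_tree("LibriSpeech", "LibriSpeech-wav")` would then walk the extracted corpus and write one .wav file per clip. Because FLAC is lossless, the conversion changes only the container format, not the audio samples.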

From chapter 17, LSTMs and automatic speech recognition.

Figure 17.1 The data cleaning and preparation process to transform the LibriSpeech OpenSLR data for the deep speech model.
