同步操作将从 PaddlePaddle/PaddleSpeech 强制同步,此操作会覆盖自 Fork 仓库以来所做的任何修改,且无法恢复!!!
确定后同步将在后台操作,完成时将刷新页面,请耐心等待。
DeepSpeech2 on PaddlePaddle accepts a textual manifest file as its data set interface. A manifest file summarizes a set of speech data, with each line containing some meta data (e.g. file path, transcription, duration) of one audio clip, in JSON format, such as:
{"audio_filepath": "/home/work/.cache/paddle/Libri/134686/1089-134686-0001.flac", "duration": 3.275, "text": "stuff it into you his belly counselled him"}
{"audio_filepath": "/home/work/.cache/paddle/Libri/134686/1089-134686-0007.flac", "duration": 4.275, "text": "a cold lucid indifference reigned in his soul"}
To use your custom data, you only need to generate such manifest files to summarize the dataset. Given such summarized manifests, training, inference and all other modules can be aware of where to access the audio files, as well as their meta data including the transcription labels.
For how to generate such manifest files, please refer to examples/librispeech/local/librispeech.py
, which will download data and generate manifest files for LibriSpeech dataset.
To perform z-score normalization (zero-mean, unit stddev) upon audio features, we have to estimate in advance the mean and standard deviation of the features, with some training samples:
python3 utils/compute_mean_std.py \
--num_samples 2000 \
--spectrum_type linear \
--manifest_path examples/librispeech/data/manifest.train \
--output_path examples/librispeech/data/mean_std.npz
It will compute the mean and standard deviations of the power spectrum feature with 2000 random sampled audio clips listed in examples/librispeech/data/manifest.train
and save the results to examples/librispeech/data/mean_std.npz
for further usage.
A vocabulary of possible characters is required to convert the transcription into a list of token indices for training, and in decoding, to convert from a list of indices back to the text again. Such a character-based vocabulary can be built with utils/build_vocab.py
.
python3 utils/build_vocab.py \
--count_threshold 0 \
--vocab_path examples/librispeech/data/eng_vocab.txt \
--manifest_paths examples/librispeech/data/manifest.train
It will write a vocabulary file examples/librispeech/data/vocab.txt
with all transcription text in examples/librispeech/data/manifest.train
, without vocabulary truncation (--count_threshold 0
).
此处可能存在不合适展示的内容,页面不予展示。您可通过相关编辑功能自查并修改。
如您确认内容无涉及 不当用语 / 纯广告导流 / 暴力 / 低俗色情 / 侵权 / 盗版 / 虚假 / 无价值内容或违法国家有关法律法规的内容,可点击提交进行申诉,我们将尽快为您处理。