Examples and Dependencies
Dimitrii Voronin edited this page Dec 8, 2021 · 25 revisions
We are keeping the colab examples up-to-date, but you can manually manage your dependencies:
- pytorch >= 1.9.0
- torchaudio >= 0.9.0 (used only for examples, IO and resampling; can be omitted in production)
The provided JIT-models can be run with other torch backends as well.
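If you manage dependencies by hand, a quick check of the installed versions against the minimums listed above can save some debugging. The following is a minimal sketch, not part of silero-vad: the helper names `parse_version` and `check_dependencies` are hypothetical, and it assumes the packages are installed under the pip names `torch` and `torchaudio`.

```python
import re
from importlib import metadata

# Minimum versions taken from the list above; keys are the assumed pip package names.
MINIMUM_VERSIONS = {'torch': '1.9.0', 'torchaudio': '0.9.0'}


def parse_version(text):
    """Turn a string such as '1.9.0+cu111' into a comparable tuple (1, 9, 0)."""
    numbers = []
    for piece in text.split('+')[0].split('.')[:3]:
        match = re.match(r'\d+', piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:  # pad '0.9' to (0, 9, 0)
        numbers.append(0)
    return tuple(numbers)


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Return a {package: status} report without importing the packages."""
    report = {}
    for package, minimum in minimums.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = 'not installed'
            continue
        if parse_version(installed) >= parse_version(minimum):
            report[package] = f'{installed} (ok)'
        else:
            report[package] = f'{installed} (needs >= {minimum})'
    return report


print(check_dependencies())
```

The check only reads package metadata, so it also works in an environment where torchaudio has been deliberately left out.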
Imports
SAMPLE_RATE = 16000
import glob
import torch
torch.set_num_threads(1)
from IPython.display import Audio
from pprint import pprint
# download the model and helper functions from torch hub
model, utils = torch.hub.load(repo_or_dir='snakers4/silero-vad',
                              model='silero_vad',
                              force_reload=True)
(get_speech_timestamps,
 save_audio,
 read_audio,
 VADIterator,
 collect_chunks) = utils
files_dir = torch.hub.get_dir() + '/snakers4_silero-vad_master/files'
Speech timestamps from full audio
wav = read_audio(f'{files_dir}/en.wav', sampling_rate=SAMPLE_RATE)
# get speech timestamps from full audio file
speech_timestamps = get_speech_timestamps(wav, model, sampling_rate=SAMPLE_RATE)
pprint(speech_timestamps)
# merge all speech chunks to one audio
save_audio('only_speech.wav',
           collect_chunks(speech_timestamps, wav), sampling_rate=SAMPLE_RATE)
Audio('only_speech.wav')
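The timestamps printed above are easier to read in seconds. The following is a minimal sketch that assumes each entry is a dict with `start` and `end` given as sample indices; the helper names and the sample values are illustrative only and are not output from `en.wav`.

```python
SAMPLE_RATE = 16000


def timestamps_to_seconds(speech_timestamps, sampling_rate=SAMPLE_RATE):
    """Convert sample-index timestamps to seconds, rounded to milliseconds."""
    return [
        {'start': round(ts['start'] / sampling_rate, 3),
         'end': round(ts['end'] / sampling_rate, 3)}
        for ts in speech_timestamps
    ]


def total_speech_seconds(speech_timestamps, sampling_rate=SAMPLE_RATE):
    """Sum the duration of all speech segments, in seconds."""
    samples = sum(ts['end'] - ts['start'] for ts in speech_timestamps)
    return samples / sampling_rate


# made-up timestamps, used only to show the conversion
example = [{'start': 0, 'end': 32000}, {'start': 48000, 'end': 56000}]
print(timestamps_to_seconds(example))
print(total_speech_seconds(example))
```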
Stream imitation example
## using VADIterator class
vad_iterator = VADIterator(model)
wav = read_audio(f'{files_dir}/en.wav', sampling_rate=SAMPLE_RATE)
window_size_samples = 1536 # number of samples in a single audio chunk
for i in range(0, len(wav), window_size_samples):
    speech_dict = vad_iterator(wav[i: i+ window_size_samples], return_seconds=True)
    if speech_dict:
        print(speech_dict, end=' ')
vad_iterator.reset_states() # reset model states after each audio
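In a streaming setup you usually want complete segments rather than a printed list of events. The following is a minimal sketch that assumes each call to the iterator yields either nothing, a dict with a `start` key, or a dict with an `end` key; `pair_events` is a hypothetical helper and the event values are made up.

```python
def pair_events(events):
    """Pair 'start' and 'end' events into (start, end) segments.

    Returns the finished segments and the start of a segment that is
    still open when the stream ends (or None if there is none).
    """
    segments = []
    current_start = None
    for event in events:
        if not event:  # no boundary detected in this chunk
            continue
        if 'start' in event:
            current_start = event['start']
        elif 'end' in event and current_start is not None:
            segments.append((current_start, event['end']))
            current_start = None
    return segments, current_start


# made-up event stream, used only to show the pairing
events = [None, {'start': 0.5}, None, {'end': 1.9}, {'start': 2.4}]
print(pair_events(events))
```

Returning the open start separately lets the caller decide how to close a segment when the audio ends mid-speech.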
## just probabilities
wav = read_audio(f'{files_dir}/en.wav', sampling_rate=SAMPLE_RATE)
speech_probs = []
window_size_samples = 1536
for i in range(0, len(wav), window_size_samples):
    speech_prob = model(wav[i: i+ window_size_samples], SAMPLE_RATE).item()
    speech_probs.append(speech_prob)
model.reset_states()
pprint(speech_probs[:100])
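Raw probabilities can be turned into rough segments with a fixed threshold. The following is a deliberately simplified sketch and not the algorithm used by `get_speech_timestamps`: the helper `probs_to_segments`, the threshold of 0.5 and the sample probabilities are all assumptions made for illustration.

```python
def probs_to_segments(speech_probs, window_size_samples=1536, threshold=0.5):
    """Group consecutive windows with probability >= threshold into segments.

    Segment boundaries are sample indices aligned to window starts, so the
    final 'end' may overshoot the audio if the last window was shorter.
    """
    segments = []
    start = None
    for index, prob in enumerate(speech_probs):
        position = index * window_size_samples
        if prob >= threshold and start is None:
            start = position  # speech begins in this window
        elif prob < threshold and start is not None:
            segments.append({'start': start, 'end': position})
            start = None
    if start is not None:  # speech continues to the end of the audio
        segments.append({'start': start,
                         'end': len(speech_probs) * window_size_samples})
    return segments


# made-up probabilities, used only to show the grouping
example_probs = [0.1, 0.8, 0.9, 0.2, 0.7]
print(probs_to_segments(example_probs))
```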