Training SoundStream doesn't result proper audio #113

amitaie · 2023-02-28T08:58:13Z

Hi, I'm looking for some help training SoundStream.

I'm training SoundStream with version 0.15.8, and the results sound really bad after 20K steps (samples attached below). I also noticed a few things that I'd like to share, to hear whether anyone else has run into them:

The EMA model's output during training is pure noise, while the regular model's output is very poor but at least sounds somewhat like speech.
The loss is very noisy, hovering around 5000-8000. In earlier results posted here the loss was much higher. I'm attaching TensorBoard graphs of the loss and would like to hear whether these values look reasonable.
I trained the model for more than 100K steps and the loss exploded (17B++). Has this happened to anyone?

EMA result:
https://user-images.githubusercontent.com/113421133/221514146-271b2c5f-6fb1-4f1d-be40-19637107f691.mp4

Model result:
https://user-images.githubusercontent.com/113421133/221514348-4055a652-521f-4621-bc72-bbf60a0ac637.mp4

Some technical details on my training: LibriTTS (24000 Hz sample rate, train-clean-360), model strides (3, 4, 5, 8), batch_size=4, grad_accum_every=8, and data_max_length_seconds=1.
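
For anyone comparing setups, here is a rough back-of-the-envelope sketch of what these settings imply. It assumes the encoder's total downsampling factor is the product of the strides and that gradient accumulation simply multiplies the per-step batch. Neither assumption is checked against the library code here, and the variable names are just illustrative.

```python
from math import prod

# Settings quoted in the post above.
strides = (3, 4, 5, 8)
sample_rate_hz = 24_000
batch_size = 4
grad_accum_every = 8
data_max_length_seconds = 1

# Assumption: total downsampling factor = product of the encoder strides.
hop = prod(strides)

# Latent frames produced per second of audio under that assumption.
frames_per_second = sample_rate_hz / hop

# Samples and latent frames in one training clip.
samples_per_clip = sample_rate_hz * data_max_length_seconds
frames_per_clip = samples_per_clip // hop

# Assumption: gradient accumulation multiplies the effective batch size.
effective_batch = batch_size * grad_accum_every
audio_seconds_per_update = effective_batch * data_max_length_seconds

print(f"hop size: {hop} samples")
print(f"frame rate: {frames_per_second} frames/s")
print(f"frames per clip: {frames_per_clip}")
print(f"effective batch: {effective_batch} clips "
      f"({audio_seconds_per_update} s of audio per optimizer update)")
```

Under these assumptions each optimizer update sees 32 one-second clips, and each clip is only 50 latent frames long.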

lzl1456 · 2023-03-01T03:10:30Z

About SoundStream: I'm training on Libri-Light, 50k steps so far, with data_max_length_seconds = 10.

    from audiolm_pytorch import SoundStream

    soundstream = SoundStream(
        codebook_size = 1024,
        target_sample_hz = 16000,
        rq_num_quantizers = 12,
        attn_window_size = 128,  # local attention receptive field at bottleneck
        attn_depth = 2           # 2 local attention transformer blocks - the soundstream folks were not experts with attention, so i took the liberty to add some. encodec went with lstms, but attention should be better
    ).cuda()

Has your training gone any better? At the moment I use the trained model to encode audio and then decode it straight back. Compared with the original audio, the loss is fairly large, and there is background noise mixed in (it sounds like machinery).
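
To put a number on "compared with the original audio", one option is a plain waveform signal-to-noise ratio between the original and the reconstruction. The sketch below is only an illustration: `snr_db` is a hypothetical helper, not part of audiolm-pytorch, and it assumes both signals are equal-length, time-aligned sequences of float samples. Waveform SNR is a crude measure for neural codecs, so treat it as a sanity check rather than a measure of perceived quality.

```python
import math

def snr_db(reference, estimate):
    """Waveform SNR in dB between a reference signal and its reconstruction.

    Hypothetical helper for illustration; expects two equal-length,
    time-aligned sequences of float samples.
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_power = sum(x * x for x in reference)
    noise_power = sum((x - y) ** 2 for x, y in zip(reference, estimate))
    if noise_power == 0:
        return math.inf   # perfect reconstruction
    if signal_power == 0:
        return -math.inf  # silent reference, nonzero error
    return 10 * math.log10(signal_power / noise_power)

# Toy example: a 440 Hz sine at 16 kHz and a copy with a small added hum.
sample_rate_hz = 16_000
original = [math.sin(2 * math.pi * 440 * n / sample_rate_hz) for n in range(1600)]
reconstructed = [
    x + 0.05 * math.sin(2 * math.pi * 50 * n / sample_rate_hz)
    for n, x in enumerate(original)
]

print(f"SNR: {snr_db(original, reconstructed):.1f} dB")
```

Running the same function on real original/decoded pairs at a few checkpoints would show whether the reconstruction is actually improving over training.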
