1 file changed: +6 -0
@@ -365,6 +365,12 @@ Second, a **transformer** is trained to sample from the codebook
The first and the second stages can be trained on the same or separate datasets as long as the process of spectrogram extraction is the same.
## Training a Spectrogram Codebook
+
+ > **Erratum**: when training with the default config, the code silently fails to load the checkpoint of
+ > the perceptual loss. As a result, the outputs are only as good as those obtained without the perceptual loss.
+ > For this reason, you may want to turn it off completely with `perceptual_weight=0.0` and benefit from faster
+ > iterations. For details, please refer to [Issue #13](https://github.com/v-iashin/SpecVQGAN/issues/13).
+
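To illustrate why a zero weight can speed up iterations, here is a minimal pure-Python sketch of a weighted loss sum. It is not the repository's implementation: `reconstruction_loss`, `total_loss`, and `perceptual_fn` are hypothetical names, and the only thing taken from the erratum is the `perceptual_weight` setting. The assumption is that the perceptual term enters the objective as `perceptual_weight * perceptual_fn(...)`, so it can be skipped entirely when the weight is `0.0`.

```python
def reconstruction_loss(x, x_rec):
    """Mean absolute error between two equal-length sequences of floats."""
    return sum(abs(a - b) for a, b in zip(x, x_rec)) / len(x)


def total_loss(x, x_rec, perceptual_fn, perceptual_weight=1.0):
    """Hypothetical objective: reconstruction term plus a weighted perceptual term.

    When perceptual_weight is 0.0 the (potentially expensive) perceptual
    function is never called, which is where the time saving would come from.
    """
    loss = reconstruction_loss(x, x_rec)
    if perceptual_weight > 0.0:
        loss += perceptual_weight * perceptual_fn(x, x_rec)
    return loss


# Stand-in for an expensive perceptual network; it counts how often it runs.
calls = {"n": 0}


def fake_perceptual(x, x_rec):
    calls["n"] += 1
    return 2.0


x, x_rec = [0.0, 1.0, 2.0], [0.5, 1.0, 1.0]
with_perceptual = total_loss(x, x_rec, fake_perceptual, perceptual_weight=1.0)
without_perceptual = total_loss(x, x_rec, fake_perceptual, perceptual_weight=0.0)
```

In this sketch the perceptual function runs once for the first call and not at all for the second. Whether the actual training code skips the computation in the same way is an assumption and should be checked against the repository.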
To train a spectrogram codebook, we tried two datasets: VAS and VGGSound.
We run our experiments on a relatively expensive hardware setup with four _40GB NVidia A100_ GPUs, but the models
can also be trained on one _12GB NVidia 2080Ti_ with a smaller batch size.