Training with SSL objectives
Monolingual data can be incorporated into training through an SSL objective by setting the ssl_task training configuration option to one of the following values:
mono_dae: mBART-style denoising objective
mono_lm: left-to-right language model objective on the decoder side (dummy encoder input)
mono_mixed_task: monolingual examples probabilistically split between the above (p=0.5)
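The probabilistic split behind mono_mixed_task can be illustrated with a minimal sketch. It only assumes what is stated above, namely that each monolingual example goes to one of the two objectives with p=0.5; the function name `assign_ssl_task`, its parameters, and the per-example sampling are hypothetical and not taken from the actual training code.

```python
import random


def assign_ssl_task(num_examples, p_dae=0.5, seed=0):
    """Assign each monolingual example to an SSL objective (hypothetical helper).

    With probability p_dae an example is used for the denoising objective
    (mono_dae); otherwise it is used for the decoder-side language model
    objective (mono_lm).
    """
    rng = random.Random(seed)  # seeded so the split is reproducible
    return [
        "mono_dae" if rng.random() < p_dae else "mono_lm"
        for _ in range(num_examples)
    ]


tasks = assign_ssl_task(10_000)
share_dae = tasks.count("mono_dae") / len(tasks)
print(f"mono_dae share: {share_dae:.3f}")
```

Over many examples the share assigned to each objective should be close to one half, which is what the p=0.5 setting implies.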
To train with SSL objectives, provide binarized monolingual data by setting the mono_num_shards and mono_data_prefix options in the dataset config. Note that we found the first of the objectives above (mono_dae) helpful for smaller models, and in particular for training back-translation models. However, SSL objectives did not provide additional benefits for the full model when applied to the same monolingual data that had been used for back-translation.
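As a rough illustration of how these options fit together, the sketch below checks a pair of plain dictionaries for the settings named above. The option names ssl_task, mono_num_shards, and mono_data_prefix come from the text; the dictionary layout, the example values, the `check_ssl_config` helper, and the assumption that an unset ssl_task means no SSL objective are all hypothetical and do not reflect the actual config schema.

```python
VALID_SSL_TASKS = {"mono_dae", "mono_lm", "mono_mixed_task"}


def check_ssl_config(train_cfg, dataset_cfg):
    """Return a list of problems with an SSL setup (hypothetical helper)."""
    problems = []
    task = train_cfg.get("ssl_task")
    if task is None:
        # Assumption: leaving ssl_task unset means no SSL objective is used.
        return problems
    if task not in VALID_SSL_TASKS:
        problems.append(f"unknown ssl_task: {task!r}")
    # Both monolingual data options must be set when an SSL objective is used.
    for key in ("mono_num_shards", "mono_data_prefix"):
        if not dataset_cfg.get(key):
            problems.append(f"dataset config is missing {key}")
    return problems


# Illustrative values only; the shard count and path are made up.
train_cfg = {"ssl_task": "mono_dae"}
dataset_cfg = {"mono_num_shards": 4, "mono_data_prefix": "/path/to/mono/shard"}
print(check_ssl_config(train_cfg, dataset_cfg))
```

A check like this could run before training starts, so that a missing monolingual data option is reported up front rather than during data loading.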