Support alpha cumulative product using shifted sigmas for Flux #1991
The sigmas are shifted when the noise_scheduler is initialized, so we can use the shifted sigmas directly, unless the scheduler uses dynamic shifting (based on resolution). I am not currently detecting whether the scheduler uses dynamic shifting, but we can add that clarification.
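To illustrate the idea, here is a minimal sketch of deriving an alpha_cumprod-like table from statically shifted sigmas. It is not the code in this PR. It assumes the rectified-flow interpolation x_t = (1 - sigma) * x_0 + sigma * noise, so that SNR = ((1 - sigma) / sigma)^2, and it picks alpha_cumprod so that alpha_cumprod / (1 - alpha_cumprod) matches that SNR. The function names, the linear sigma grid, and the shift value are all illustrative assumptions.

```python
from typing import List


def shift_sigma(sigma: float, shift: float) -> float:
    """Static time shift: sigma' = shift * sigma / (1 + (shift - 1) * sigma)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def sigma_to_alpha_cumprod(sigma: float) -> float:
    """Map a flow-matching sigma to a DDPM-style alpha_cumprod with the same SNR.

    Assumption: x_t = (1 - sigma) * x_0 + sigma * noise, hence
    SNR = (1 - sigma)^2 / sigma^2 and alpha_cumprod = SNR / (1 + SNR).
    """
    signal = (1.0 - sigma) ** 2
    noise = sigma ** 2
    return signal / (signal + noise)


def build_alpha_cumprod_table(num_train_timesteps: int = 1000,
                              shift: float = 1.0) -> List[float]:
    """Precompute one alpha_cumprod value per training timestep.

    Assumption: unshifted sigmas form a linear grid 1/N, 2/N, ..., 1.
    This only makes sense for a static shift; with dynamic
    (resolution-based) shifting the table would differ per sample.
    """
    table = []
    for i in range(num_train_timesteps):
        sigma = (i + 1) / num_train_timesteps
        table.append(sigma_to_alpha_cumprod(shift_sigma(sigma, shift)))
    return table
```

With `shift=1.0` the sigmas are unchanged. A larger shift pushes every sigma toward 1, which lowers alpha_cumprod (and the SNR) at each timestep.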
Noise schedulers from Diffusers have `index_to_timestep`, which we can use to stay aligned with future timestep handling. This supports timesteps in the ranges 0 to 1 and 1 to 1000.
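As a rough sketch of what supporting both ranges could look like, the hypothetical helper below maps either convention onto a 0-based table index. It is not a Diffusers API and not the code in this PR. Because the value 1 is ambiguous between the two ranges, the sketch assumes integers mean 1 to 1000 and floats mean 0 to 1.

```python
def timestep_to_index(timestep, num_train_timesteps: int = 1000) -> int:
    """Return a 0-based index into a per-timestep table (hypothetical helper).

    Assumption: ints are timesteps in 1..num_train_timesteps,
    floats are fractions in 0..1.
    """
    if isinstance(timestep, bool):
        raise TypeError("timestep must be an int or a float, not a bool")
    if isinstance(timestep, int):
        if not 1 <= timestep <= num_train_timesteps:
            raise ValueError("integer timestep out of range")
        return timestep - 1
    if not 0.0 <= timestep <= 1.0:
        raise ValueError("fractional timestep out of range")
    # Scale the fraction onto 1..N, then convert to a 0-based index.
    step = max(1, round(timestep * num_train_timesteps))
    return step - 1
```

Dispatching on the Python type is only one way to resolve the ambiguity at 1. An explicit flag would be safer when the timesteps arrive as tensors.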
Support in Flux network training:
--min_snr_gamma 5.0
--debiased_estimation
Flux huber loss using snr: --huber_schedule snr
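For context, the two SNR-based weightings named above can be sketched as follows once an SNR is available per timestep. This is a minimal illustration, not the PR's implementation. It assumes SNR = ((1 - sigma) / sigma)^2 for flow matching, uses the epsilon-prediction form of Min-SNR, min(SNR, gamma) / SNR, and uses 1 / sqrt(SNR) for debiased estimation. The clamping constants `eps` and `max_snr` are illustrative assumptions that guard against division by zero at sigma = 0 or sigma = 1.

```python
import math


def snr_from_sigma(sigma: float, eps: float = 1e-8) -> float:
    """SNR under the assumption x_t = (1 - sigma) * x_0 + sigma * noise."""
    sigma = min(max(sigma, eps), 1.0 - eps)  # keep SNR finite and non-zero
    return ((1.0 - sigma) / sigma) ** 2


def min_snr_weight(snr: float, gamma: float = 5.0) -> float:
    """Min-SNR loss weight (epsilon-prediction form): min(SNR, gamma) / SNR."""
    return min(snr, gamma) / snr


def debiased_weight(snr: float, max_snr: float = 1000.0) -> float:
    """Debiased-estimation loss weight: 1 / sqrt(SNR), with SNR capped."""
    return 1.0 / math.sqrt(min(snr, max_snr))
```

For example, at sigma = 0.5 the SNR is 1, so both weights are 1. At low-noise timesteps where the SNR exceeds gamma, the Min-SNR weight falls below 1 and down-weights the loss.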
v-pred is currently included in this PR, but I am not sure whether it is appropriate for Flux.
Still needed: testing of the parameters, and making sure there are no regressions for other models caused by changing the timestep processing to look up alpha_cumprod where needed. We might want to preload alpha_cumprod to reduce complexity where appropriate (this may not be appropriate when using dynamic shifting).
Related #1980