Support alpha cumulative product using shifted sigmas for Flux #1991

Draft
wants to merge 4 commits into
base: sd3

rockerBOO
Contributor

In many diffusion models, σ² = (1 − ᾱ)/ᾱ, where ᾱ is the cumulative product of the alphas. So we can derive alphas_cumprod from the sigmas as ᾱ = 1/(1 + σ²).
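A minimal sketch of that inversion, in plain Python so it runs without torch. The helper names `alphas_cumprod_from_sigmas` and `snr_from_alphas_cumprod` are illustrative and are not taken from the PR's code:

```python
def alphas_cumprod_from_sigmas(sigmas):
    """Invert sigma^2 = (1 - a) / a, which gives a = 1 / (1 + sigma^2)."""
    return [1.0 / (1.0 + s * s) for s in sigmas]


def snr_from_alphas_cumprod(alphas_cumprod):
    """SNR = a / (1 - a); under the relation above this equals 1 / sigma^2.

    Note: a == 1.0 (sigma == 0) would divide by zero, so callers need to
    clamp or skip that endpoint.
    """
    return [a / (1.0 - a) for a in alphas_cumprod]


# Illustrative sigma values only.
sigmas = [0.25, 0.5, 1.0]
alphas_cumprod = alphas_cumprod_from_sigmas(sigmas)
snr = snr_from_alphas_cumprod(alphas_cumprod)
```

With sigma = 1.0 the derived alpha is 0.5 and the SNR is 1, which is a quick sanity check on the relation.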

The sigmas are shifted when the noise_scheduler is initialized, so we can use the shifted sigmas directly, unless the scheduler uses dynamic shifting (based on resolution). I am not currently detecting whether the scheduler uses dynamic shifting, but that check can be added.
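For reference, a sketch of the static shift as I understand it from the Diffusers flow-matching scheduler: sigma' = shift · sigma / (1 + (shift − 1) · sigma). The `shift_sigma` helper and the shift value of 3.0 are only for illustration:

```python
def shift_sigma(sigma, shift):
    """Static timestep shift applied once at scheduler initialization."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


num_train_timesteps = 1000
# Unshifted sigmas are timesteps 1..N divided by N.
sigmas = [t / num_train_timesteps for t in range(1, num_train_timesteps + 1)]
# shift=3.0 is an illustrative value, not necessarily the one used in training.
shifted_sigmas = [shift_sigma(s, 3.0) for s in sigmas]
```

A shift of 1.0 leaves the sigmas unchanged, and the endpoint sigma = 1.0 maps to itself for any shift.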

Noise schedulers from Diffusers have index_to_timestep, which we can use to stay aligned with how timesteps are handled going forward.

Timesteps are supported in both the 0 to 1 and the 1 to 1000 ranges.
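One possible way to accept both ranges is to normalize everything to 0..1 before looking up sigmas. This is a hypothetical sketch, not the PR's implementation; the helper name and the "all values <= 1.0" heuristic are assumptions:

```python
def to_unit_interval(timesteps, num_train_timesteps=1000):
    """Map a batch of timesteps to the 0..1 range.

    Assumption: if every value is <= 1.0 the batch is already in 0..1.
    This is ambiguous for a batch that holds only timestep 1 from the
    1..1000 range, so real code may prefer an explicit flag.
    """
    if max(timesteps) <= 1.0:
        return [float(t) for t in timesteps]
    return [t / num_train_timesteps for t in timesteps]
```

For example, `to_unit_interval([250, 500])` and `to_unit_interval([0.25, 0.5])` both give `[0.25, 0.5]`.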

Support in Flux network training:

  • min SNR gamma: --min_snr_gamma 5.0
  • debiased estimation: --debiased_estimation
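Both options are per-sample loss weights computed from the SNR. A sketch of the usual formulas, written on scalars for readability (the function names, the `v_prediction` switch, and the SNR cap of 1000 are assumptions based on how these weights are commonly implemented, not a copy of the PR's code):

```python
import math


def min_snr_weight(snr, gamma=5.0, v_prediction=False):
    """Min-SNR-gamma weight: min(snr, gamma) / snr for noise prediction,
    min(snr, gamma) / (snr + 1) for v-prediction."""
    clipped = min(snr, gamma)
    return clipped / (snr + 1.0) if v_prediction else clipped / snr


def debiased_estimation_weight(snr, max_snr=1000.0):
    """Debiased estimation weight: 1 / sqrt(snr), with the SNR capped."""
    return 1.0 / math.sqrt(min(snr, max_snr))
```

With gamma = 5.0, a sample with SNR 10 gets weight 0.5, while any sample with SNR at or below 5 keeps weight 1.0.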

Flux Huber loss using SNR:

  • huber c SNR: --huber_schedule snr
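As a sketch of what the SNR schedule does: the Huber threshold c is interpolated between a base value and 1.0 depending on sigma, where sigma = sqrt((1 − ᾱ)/ᾱ) = 1/sqrt(SNR). The function name and the default base value of 0.1 are assumptions for illustration:

```python
def huber_c_from_snr(snr, huber_c=0.1):
    """SNR-based Huber c schedule.

    High SNR (sigma near 0) pushes c toward 1.0; low SNR (large sigma)
    pushes c toward the base huber_c.
    """
    sigma = (1.0 / snr) ** 0.5
    return (1.0 - huber_c) / (1.0 + sigma) ** 2 + huber_c
```

At SNR = 1 (sigma = 1) with a base of 0.1 this gives 0.9 / 4 + 0.1 = 0.325.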

v-pred support is currently in this PR, but I am not sure whether it is appropriate for Flux.


Still to do: test the parameters and make sure there are no regressions for other models, since the timestep process changed to look up alpha_cumprod where needed. We might want to preload alpha_cumprod to reduce complexity where appropriate (it may not be appropriate when using dynamic shifting).

Related #1980

@rockerBOO
Contributor Author

rockerBOO commented Mar 20, 2025

  • Updated the loss modification functions to take image_size for dynamically shifted timesteps (timestep_sampling == "flux_shift").
  • Added tests for loss modification in the custom train functions
  • Updated FlowMatchEulerDiscreteScheduler
    • Added FlowMatchEulerDiscreteScheduler.get_snr_for_timestep to get the SNR of a timestep, accounting for dynamic timestep shifts.
  • Updated the Flux noise scheduler setup to account for dynamic timestep shifts.
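A rough sketch of why image_size is needed: with dynamic shifting, the shift parameter mu depends on the image sequence length, so the sigma (and therefore the SNR) for the same timestep differs by resolution. The helper names, the 16-pixel patch assumption, and the interpolation endpoints (256 → 0.5, 4096 → 1.15) are assumptions based on the commonly used Flux shift schedule. This is not the actual signature of get_snr_for_timestep:

```python
import math


def lin_mu(seq_len, x1=256, y1=0.5, x2=4096, y2=1.15):
    """Linearly interpolate mu from the image sequence length."""
    slope = (y2 - y1) / (x2 - x1)
    return slope * (seq_len - x1) + y1


def time_shift(mu, t, power=1.0):
    """Resolution-dependent shift of t, valid for t in (0, 1]."""
    return math.exp(mu) / (math.exp(mu) + (1.0 / t - 1.0) ** power)


def snr_for_timestep(t, image_size):
    """Hypothetical helper: SNR at timestep t (0..1) for an (H, W) image.

    Assumes one token per 16x16 pixel patch and the relation
    SNR = 1 / sigma^2 used in the PR description.
    """
    height, width = image_size
    seq_len = (height // 16) * (width // 16)
    sigma = time_shift(lin_mu(seq_len), t)
    return 1.0 / (sigma * sigma)
```

Under these assumptions a 1024x1024 image gets a larger mu than a 512x512 image, so the same timestep maps to a larger sigma and a lower SNR.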

I still need to run some tests and maybe clean up the prepare function (it may not need to be duplicated).

@rockerBOO rockerBOO marked this pull request as draft March 20, 2025 21:00