API: Allow passing firstpass_image to the txt2img endpoint #16889

Open
wants to merge 1 commit into
base: dev

russjr08

Description

I noticed a while ago that the docs for the txt2img generation endpoint list a firstpass_image parameter. From what I could understand of the source, it is intended to effectively "skip" recreating the original image when using HiRes Fix, since the image has already been provided. That looked perfect for my application, but I couldn't figure out how to pass the image in. Since the docs list it as a string parameter, my first thought was to pass the base64 string returned by txt2img; then I tried a path to an existing image on disk. Both resulted in the following:

{
    "error": "ValueError",
    "detail": "",
    "body": "",
    "errors": "could not convert string to float: $INPUT"
}

(Where $INPUT is whatever was passed into the firstpass_image parameter)

After taking a closer look, it seemed to me that the parameter was never actually converted into an image before being used by processing.py. This PR adds a quick check when the txt2img endpoint is invoked: if the parameter was supplied, it is passed through the existing decode_base64_to_image function to turn it back into an image.
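The check described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the actual diff: `prepare_firstpass_image` and `decode_stub` are hypothetical names, and `decode_stub` only stands in for the existing decode_base64_to_image (it returns raw bytes rather than an image object).

```python
import base64
import binascii
from typing import Any, Callable, Dict


def decode_stub(encoding: str) -> bytes:
    """Hypothetical stand-in for decode_base64_to_image; returns raw bytes, not an image."""
    try:
        # Strip an optional data-URI prefix such as "data:image/png;base64,".
        if encoding.startswith("data:image/"):
            encoding = encoding.split(",", 1)[1]
        return base64.b64decode(encoding, validate=True)
    except (IndexError, binascii.Error, ValueError) as exc:
        raise ValueError("Invalid encoded image") from exc


def prepare_firstpass_image(
    args: Dict[str, Any],
    decode: Callable[[str], Any] = decode_stub,
) -> Dict[str, Any]:
    """Return a copy of the request args with firstpass_image decoded, if supplied."""
    prepared = dict(args)
    raw = prepared.get("firstpass_image")
    if raw:  # omitted, None, or empty: leave the args untouched
        prepared["firstpass_image"] = decode(raw)
    return prepared
```

A malformed string still fails in this sketch, because the decoder raises instead of passing the string through.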

I've been using this for a few months now (I wanted to be sure before submitting this PR, of course!) and have not discovered any ill effects from the change. Passing a malformed string to firstpass_image still results in a failure, just as it did before this patch. The only thing I noticed is that when HiRes Fix is used with this parameter, the console still assumes it needs to render the first pass, so the progress display shows original steps + HR steps as the total and appears to stop halfway. The /sdapi/v1/progress API endpoint returns the correct number of steps, though, so I do not consider this an issue.

The only other issue I can think of is that I may have the wrong idea of how this parameter is meant to be used, in which case this change would override the intended usage. I could not figure out how else the parameter is supposed to be used, though. Should that be the case, I think this would still be immensely valuable to have as a separate parameter.

Since the existing decode_base64_to_image function is used, in theory you could also pass an http/https link to an image and have it work, so long as api_enable_requests is turned on, which is a cool bonus. Admittedly I have not tested this, as it doesn't line up with my current use case: "replaying" a previous request with firstpass_image set to the image string returned by that request.
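For illustration, the "replay" use case might look something like this on the client side. It is only a sketch under assumptions: `build_hires_replay` is a hypothetical helper, the response is assumed to carry its base64 images in an `images` list, `enable_hr` is assumed to be the field that turns on HiRes Fix, and the prompt, seed, and image string are placeholder data. No request is actually sent.

```python
import json
from typing import Any, Dict


def build_hires_replay(
    original_payload: Dict[str, Any],
    previous_response: Dict[str, Any],
) -> Dict[str, Any]:
    """Build a second txt2img payload that reuses the first image as firstpass_image."""
    images = previous_response.get("images") or []
    if not images:
        raise ValueError("previous response contains no images")
    replay = dict(original_payload)        # same prompt, seed, steps, ...
    replay["enable_hr"] = True             # assumed name of the HiRes Fix switch
    replay["firstpass_image"] = images[0]  # base64 string from the first request
    return replay


# Placeholder data standing in for a real request/response pair.
first_request = {"prompt": "a lighthouse at dusk", "steps": 20, "seed": 1234}
first_response = {"images": ["aGVsbG8="]}

second_request = build_hires_replay(first_request, first_response)
body = json.dumps(second_request)  # would be POSTed to /sdapi/v1/txt2img
```

Copying the original payload rather than mutating it keeps the first request available for further replays.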

Lastly, regarding the checklist below: I ran both the linter and the test runner, and both reported issues, but none of them seem to be a result of my change (the output is the same before and after the patch). For the tests, I believe the failures are due to my PC using an AMD graphics card instead of an Nvidia card, given the HIP errors present. The following gist documents the results of both before and after.

Simple enough of a change, but if I've missed something, please let me know.

Screenshots/videos:

N/A (I believe?) - No UI tweaks are made with this PR.

Checklist:

This change allows passing in a base64 encoded image string to the
`/sdapi/v1/txt2img` endpoint via the `firstpass_image` parameter.

Can be useful when applications utilizing the API generate an initial
image and then need that image upscaled using HiRes Fix - by replaying
the request alongside the `firstpass_image` parameter with the base64
image returned by the original request, it bypasses the need to
regenerate the original image since it has already been passed in.

At least, that seems to be the original intent of this parameter (and
works great with this change).