add PAG support #7944
@asomoza can you test it out? |
cc @HyoungwonCho for awareness |
@yiyixuxu @asomoza Hello, I was impressed by the various experiments you conducted using PAG! Since the guidance framework of PAG itself is simple, it seems quite possible to use it in conjunction with other modules like the IP-Adapter you mentioned. However, we have not yet implemented and experimented with it directly, so we have not confirmed whether there is a significant performance improvement when used together. If possible, we will conduct additional experiments in the future. Thank you for your interest in our research. |
Thank you for the great work!

  File ".../.env/lib/python3.11/site-packages/diffusers/models/controlnet.py", line 798, in forward
    sample = sample + controlnet_cond
             ~~~~~~~^~~~~~~~~~~~~~~~~
RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 0

I solved it by adding a new parameter:

    if do_classifier_free_guidance and do_perturbed_attention_guidance and not guess_mode:
        image = torch.cat([image] * 3)
    elif do_classifier_free_guidance and not guess_mode:
        image = torch.cat([image] * 2)
    elif do_perturbed_attention_guidance and not guess_mode:
        image = torch.cat([image] * 2)
|
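The branching in the fix above amounts to repeating the ControlNet conditioning image once per guidance branch that is active in the batch. A minimal pure-Python sketch of that rule follows; `guidance_batch_multiplier` and `repeat_condition` are hypothetical helper names used only for illustration, and a plain list stands in for the tensor that the quoted code repeats with `torch.cat`.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def guidance_batch_multiplier(do_cfg: bool, do_pag: bool, guess_mode: bool = False) -> int:
    """Return how many copies of the conditioning image the batch needs.

    Mirrors the quoted if/elif chain: one base copy, plus one for
    classifier-free guidance and one for perturbed-attention guidance.
    In guess mode none of the branches fires, so the image is left as-is.
    """
    if guess_mode:
        return 1
    return 1 + int(do_cfg) + int(do_pag)


def repeat_condition(image_batch: Sequence[T], do_cfg: bool, do_pag: bool,
                     guess_mode: bool = False) -> List[T]:
    """List-based stand-in for `torch.cat([image] * n)` along the batch dimension."""
    n = guidance_batch_multiplier(do_cfg, do_pag, guess_mode)
    return list(image_batch) * n


# With both CFG and PAG active the latent batch has three branches,
# so the conditioning batch must be tripled to match it.
tripled = repeat_condition(["cond"], do_cfg=True, do_pag=True)
```

The size mismatch in the traceback (3 versus 2) is what one would expect if the conditioning image was only doubled while the sample carried three guidance branches.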
@KKIEEK |
Just leaving a brief report of my findings with PAG and Diffusers (I already had it integrated in my pipelines before this PR):
|
@jorgemcgomes thanks! |
Hello. I'm an author of PAG. Thank you for your insightful opinions and cool implementation. Is there anything currently in progress? We are excited to see that PAG is gaining popularity within the community and being utilized in various workflows. Especially in ComfyUI, PAG nodes are used in diverse workflows. (Some workflows using PAG in ComfyUI: …) However, in Diffusers, it seems somewhat challenging to try creative combinations as the pipelines are separated. Therefore, the MixIn approach taken in this PR appears to be a very effective solution. However, it seems a bit awkward to call … Additionally, since there are many users who want compatibility with IP-adapter, now I have time and would like to work on making it compatible with IPAdapter. I'm curious if there's any related progress about component design or IP-adapter compatibility. Thank you! |
@sunovivid thanks for the message! for IP-adapter, it will be super cool if we can make it work! I'm not aware of any related progress so would really appreciate if you are able to find time to work on this! maybe we can just pick one of the pipelines from this PR (with the mixin) and make it work with |
@sunovivid we will merge in and work on a new design for PAG once you upload the new change for ip-adapter :) for
|
Hi @yiyixuxu, Thank you for the feedback! I might have misunderstood something. Should I upload the new changes for the ip-adapter in this PR? How can I upload the changes? Should I attach files or use another approach? for
|
* fix compatability issue between PAG and IP-adapter
* fix compatibility issue between PAG and IP-adapter plus
You can create a PAG pipeline using the AutoPipeline API, for example to use PAG with StableDiffusionXLPipeline. You can also set the specific layers that PAG is applied to.
from_pipe also works, e.g. if you already have an SDXL img2img pipeline and want to switch to text2img, but with PAG.
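The snippets that accompanied this description are not preserved here, so the following is only a sketch of what such calls might look like, assuming the `enable_pag` and `pag_applied_layers` arguments of `AutoPipelineForText2Image` in diffusers. The checkpoint id and the helper names (`pag_load_kwargs`, `load_sdxl_pag_text2img`, `switch_to_pag_text2img`) are assumptions made for illustration, not taken from the PR.

```python
from typing import Any, Dict, List, Optional

# Assumed checkpoint id for illustration; the description does not name one.
MODEL_ID = "stabilityai/stable-diffusion-xl-base-1.0"


def pag_load_kwargs(pag_applied_layers: Optional[List[str]] = None) -> Dict[str, Any]:
    """Build the extra keyword arguments that request a PAG pipeline (hypothetical helper)."""
    kwargs: Dict[str, Any] = {"enable_pag": True}
    if pag_applied_layers is not None:
        # Restrict PAG to specific layers; valid layer names depend on the model.
        kwargs["pag_applied_layers"] = list(pag_applied_layers)
    return kwargs


def load_sdxl_pag_text2img(pag_applied_layers: Optional[List[str]] = None):
    """Load an SDXL text-to-image pipeline with PAG enabled (downloads weights when called)."""
    import torch
    from diffusers import AutoPipelineForText2Image

    return AutoPipelineForText2Image.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, **pag_load_kwargs(pag_applied_layers)
    )


def switch_to_pag_text2img(existing_pipeline):
    """Reuse the components of an existing SDXL pipeline (e.g. img2img) as text2img with PAG."""
    from diffusers import AutoPipelineForText2Image

    return AutoPipelineForText2Image.from_pipe(existing_pipeline, enable_pag=True)
```

A call such as `load_sdxl_pag_text2img(pag_applied_layers=["mid"])` would then request PAG on one group of layers; the layer name `"mid"` is an assumed example and should be checked against the pipeline's documentation.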
testing script for sd-xl
first row is Base (guidance_scale = 7.0, pag_scale = 0)
second row is PAG (guidance_scale = 7.0, pag_scale > 0)
Note that when pag_scale=0, PAG is disabled and the PAG pipeline works the same as its base SDXL pipeline, so this testing script will get the same results as the one above.

Works with IP-Adapter now, thanks to @sunovivid.
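The note on pag_scale=0 can be illustrated with a small numeric sketch. It assumes the commonly used way of combining PAG with classifier-free guidance, in which a PAG term `pag_scale * (text - perturbed)` is added on top of the usual CFG update; this formula is not spelled out in the thread, and the function names and plain-list inputs are hypothetical stand-ins for the noise-prediction tensors.

```python
from typing import List, Sequence


def cfg_only(uncond: Sequence[float], text: Sequence[float],
             guidance_scale: float) -> List[float]:
    """Plain classifier-free guidance: uncond + s * (text - uncond)."""
    return [u + guidance_scale * (t - u) for u, t in zip(uncond, text)]


def cfg_with_pag(uncond: Sequence[float], text: Sequence[float],
                 perturbed: Sequence[float], guidance_scale: float,
                 pag_scale: float) -> List[float]:
    """CFG plus an assumed PAG term: pag_scale * (text - perturbed)."""
    return [
        u + guidance_scale * (t - u) + pag_scale * (t - p)
        for u, t, p in zip(uncond, text, perturbed)
    ]


# Toy per-element "noise predictions" for the three branches.
uncond = [0.0, 1.0]
text = [1.0, 3.0]
perturbed = [0.5, 2.0]

base = cfg_only(uncond, text, guidance_scale=7.0)
pag_off = cfg_with_pag(uncond, text, perturbed, guidance_scale=7.0, pag_scale=0.0)
pag_on = cfg_with_pag(uncond, text, perturbed, guidance_scale=7.0, pag_scale=3.0)
```

Under this assumption the PAG term vanishes when `pag_scale` is 0, so `pag_off` matches `base` exactly, which is consistent with the PAG pipeline reproducing its base pipeline's results.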