Conference Paper: More Control for Free! Image Synthesis with Semantic Diffusion Guidance

Title: More Control for Free! Image Synthesis with Semantic Diffusion Guidance
Authors: Liu, Xihui; Park, Dong Huk; Azadi, Samaneh; Zhang, Gong; Chopikyan, Arman; Hu, Yuxiao; Shi, Humphrey; Rohrbach, Anna; Darrell, Trevor
Issue Date: 3-Jan-2023
Abstract

Controllable image synthesis models allow creation of diverse images based on text instructions or guidance from a reference image. Recently, denoising diffusion probabilistic models have been shown to generate more realistic imagery than prior methods, and have been successfully demonstrated in unconditional and class-conditional settings. We investigate fine-grained, continuous control of this model class, and introduce a novel unified framework for semantic diffusion guidance, which allows either language or image guidance, or both. Guidance is injected into a pretrained unconditional diffusion model using the gradient of image-text or image matching scores, without re-training the diffusion model. We explore CLIP-based language guidance as well as both content and style-based image guidance in a unified framework. Our text-guided synthesis approach can be applied to datasets without associated text annotations. We conduct experiments on FFHQ and LSUN datasets, and show results on fine-grained text-guided image synthesis, synthesis of images related to a style or content reference image, and examples with both textual and image guidance.
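For readers who want the mechanics, here is a minimal sketch of the sampling step the abstract describes: a classifier-guidance-style DDPM update in which the posterior mean predicted by a frozen, unconditional diffusion model is shifted by the gradient of a matching score (in the paper, a CLIP-based image-text or image-image similarity). The names here (guided_ddpm_step, match_score, guidance_scale) are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch (not the authors' code) of gradient-based semantic
# diffusion guidance: a frozen, unconditional DDPM is steered at sampling
# time by the gradient of a matching score, with no re-training.

import torch


def guidance_gradient(x_t, match_score):
    """Gradient of the matching score w.r.t. the current noisy sample."""
    x = x_t.detach().requires_grad_(True)
    with torch.enable_grad():
        score = match_score(x)  # scalar, e.g. CLIP similarity to a prompt or reference image
    return torch.autograd.grad(score, x)[0]


def guided_ddpm_step(model, x_t, t, alphas, alpha_bars, match_score,
                     guidance_scale=100.0):
    """One reverse-diffusion step with semantic guidance injected.

    model       -- pretrained unconditional noise-prediction network (frozen)
    x_t         -- current noisy sample, shape (B, C, H, W)
    t           -- integer timestep
    alphas      -- per-step alphas, 1-D tensor of length T
    alpha_bars  -- cumulative products of alphas, 1-D tensor of length T
    """
    with torch.no_grad():
        eps = model(x_t, t)  # predicted noise from the unconditional model
    beta_t = 1.0 - alphas[t]
    # Standard DDPM posterior mean for the unconditional model.
    mean = (x_t - beta_t / (1.0 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
    # Guidance: shift the mean along the score gradient, scaled by the step
    # variance; the diffusion model's weights are never updated.
    mean = mean + guidance_scale * beta_t * guidance_gradient(x_t, match_score)
    noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
    return mean + beta_t.sqrt() * noise
```

Running the full reverse process from t = T-1 down to 0 with this step, with match_score fixed to a text prompt or to a content/style reference image, would yield guided samples in the manner the abstract describes; the guidance scale controls how strongly the sample is pulled toward the condition.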


Persistent Identifier: http://hdl.handle.net/10722/333875

 

DC Field / Value
dc.contributor.author: Liu, Xihui
dc.contributor.author: Park, Dong Huk
dc.contributor.author: Azadi, Samaneh
dc.contributor.author: Zhang, Gong
dc.contributor.author: Chopikyan, Arman
dc.contributor.author: Hu, Yuxiao
dc.contributor.author: Shi, Humphrey
dc.contributor.author: Rohrbach, Anna
dc.contributor.author: Darrell, Trevor
dc.date.accessioned: 2023-10-06T08:39:48Z
dc.date.available: 2023-10-06T08:39:48Z
dc.date.issued: 2023-01-03
dc.identifier.uri: http://hdl.handle.net/10722/333875
dc.description.abstract: Controllable image synthesis models allow creation of diverse images based on text instructions or guidance from a reference image. Recently, denoising diffusion probabilistic models have been shown to generate more realistic imagery than prior methods, and have been successfully demonstrated in unconditional and class-conditional settings. We investigate fine-grained, continuous control of this model class, and introduce a novel unified framework for semantic diffusion guidance, which allows either language or image guidance, or both. Guidance is injected into a pretrained unconditional diffusion model using the gradient of image-text or image matching scores, without re-training the diffusion model. We explore CLIP-based language guidance as well as both content and style-based image guidance in a unified framework. Our text-guided synthesis approach can be applied to datasets without associated text annotations. We conduct experiments on FFHQ and LSUN datasets, and show results on fine-grained text-guided image synthesis, synthesis of images related to a style or content reference image, and examples with both textual and image guidance.
dc.language: eng
dc.relation.ispartof: Winter Conference on Applications of Computer Vision - WACV 2023 (03/01/2023-07/01/2023, Waikoloa, Hawaii)
dc.title: More Control for Free! Image Synthesis with Semantic Diffusion Guidance
dc.type: Conference_Paper
dc.identifier.doi: 10.48550/arXiv.2112.05744
