Conference Paper: More Control for Free! Image Synthesis with Semantic Diffusion Guidance

Title: More Control for Free! Image Synthesis with Semantic Diffusion Guidance
Authors: Liu, Xihui; Park, Dong Huk; Azadi, Samaneh; Zhang, Gong; Chopikyan, Arman; Hu, Yuxiao; Shi, Humphrey; Rohrbach, Anna; Darrell, Trevor
Issue Date: 3-Jan-2023
Abstract

Controllable image synthesis models allow creation of diverse images based on text instructions or guidance from a reference image. Recently, denoising diffusion probabilistic models have been shown to generate more realistic imagery than prior methods, and have been successfully demonstrated in unconditional and class-conditional settings. We investigate fine-grained, continuous control of this model class, and introduce a novel unified framework for semantic diffusion guidance, which allows either language or image guidance, or both. Guidance is injected into a pretrained unconditional diffusion model using the gradient of image-text or image matching scores, without re-training the diffusion model. We explore CLIP-based language guidance as well as both content and style-based image guidance in a unified framework. Our text-guided synthesis approach can be applied to datasets without associated text annotations. We conduct experiments on FFHQ and LSUN datasets, and show results on fine-grained text-guided image synthesis, synthesis of images related to a style or content reference image, and examples with both textual and image guidance.
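For readers who want the mechanics, here is a minimal sketch of the sampling step the abstract describes: a classifier-guidance-style DDPM update in which the posterior mean predicted by a frozen, unconditional diffusion model is shifted by the gradient of a matching score (in the paper, a CLIP-based image-text or image-image similarity). The names here (guided_ddpm_step, match_score, guidance_scale) are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch (not the authors' code) of gradient-based semantic
# diffusion guidance: a frozen, unconditional DDPM is steered at sampling
# time by the gradient of a matching score, with no re-training.

import torch


def guidance_gradient(x_t, match_score):
    """Gradient of the matching score w.r.t. the current noisy sample."""
    x = x_t.detach().requires_grad_(True)
    with torch.enable_grad():
        score = match_score(x)  # scalar, e.g. CLIP similarity to a prompt or reference image
    return torch.autograd.grad(score, x)[0]


def guided_ddpm_step(model, x_t, t, alphas, alpha_bars, match_score,
                     guidance_scale=100.0):
    """One reverse-diffusion step with semantic guidance injected.

    model       -- pretrained unconditional noise-prediction network (frozen)
    x_t         -- current noisy sample, shape (B, C, H, W)
    t           -- integer timestep
    alphas      -- per-step alphas, 1-D tensor of length T
    alpha_bars  -- cumulative products of alphas, 1-D tensor of length T
    """
    with torch.no_grad():
        eps = model(x_t, t)  # predicted noise from the unconditional model
    beta_t = 1.0 - alphas[t]
    # Standard DDPM posterior mean for the unconditional model.
    mean = (x_t - beta_t / (1.0 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt()
    # Guidance: shift the mean along the score gradient, scaled by the step
    # variance; the diffusion model's weights are never updated.
    mean = mean + guidance_scale * beta_t * guidance_gradient(x_t, match_score)
    noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
    return mean + beta_t.sqrt() * noise
```

Running the full reverse process from t = T-1 down to 0 with this step, with match_score fixed to a text prompt or to a content/style reference image, would yield guided samples in the manner the abstract describes; the guidance scale controls how strongly the sample is pulled toward the condition.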


Persistent Identifier: http://hdl.handle.net/10722/333875

 

DC Field / Value
dc.contributor.author: Liu, Xihui
dc.contributor.author: Park, Dong Huk
dc.contributor.author: Azadi, Samaneh
dc.contributor.author: Zhang, Gong
dc.contributor.author: Chopikyan, Arman
dc.contributor.author: Hu, Yuxiao
dc.contributor.author: Shi, Humphrey
dc.contributor.author: Rohrbach, Anna
dc.contributor.author: Darrell, Trevor
dc.date.accessioned: 2023-10-06T08:39:48Z
dc.date.available: 2023-10-06T08:39:48Z
dc.date.issued: 2023-01-03
dc.identifier.uri: http://hdl.handle.net/10722/333875
dc.description.abstract: Controllable image synthesis models allow creation of diverse images based on text instructions or guidance from a reference image. Recently, denoising diffusion probabilistic models have been shown to generate more realistic imagery than prior methods, and have been successfully demonstrated in unconditional and class-conditional settings. We investigate fine-grained, continuous control of this model class, and introduce a novel unified framework for semantic diffusion guidance, which allows either language or image guidance, or both. Guidance is injected into a pretrained unconditional diffusion model using the gradient of image-text or image matching scores, without re-training the diffusion model. We explore CLIP-based language guidance as well as both content and style-based image guidance in a unified framework. Our text-guided synthesis approach can be applied to datasets without associated text annotations. We conduct experiments on FFHQ and LSUN datasets, and show results on fine-grained text-guided image synthesis, synthesis of images related to a style or content reference image, and examples with both textual and image guidance.
dc.language: eng
dc.relation.ispartof: Winter Conference on Applications of Computer Vision - WACV 2023 (03/01/2023-07/01/2023, Waikoloa, Hawaii)
dc.title: More Control for Free! Image Synthesis with Semantic Diffusion Guidance
dc.type: Conference_Paper
dc.identifier.doi: 10.48550/arXiv.2112.05744
