MaskGAN: Towards diverse and interactive facial image manipulation

Lee, C; Liu, Z; Wu, L; Luo, P

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/CVPR42600.2020.00559
Scopus: eid_2-s2.0-85093100763
WOS: WOS:000620679505081

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Computer Science: Conference papers

Conference Paper: MaskGAN: Towards diverse and interactive facial image manipulation

Title	MaskGAN: Towards diverse and interactive facial image manipulation
Authors	Lee, C; Liu, Z; Wu, L; Luo, P
Keywords	face recognition; image resolution; learning (artificial intelligence); neural nets
Issue Date	2020
Publisher	IEEE Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000147
Citation	Proceedings of IEEE/CVF International Conference on Computer Vision and Pattern Recognition (CVPR 2020), Seattle, USA, 14-19 June 2020, p. 5548-5557. DOI: http://dx.doi.org/10.1109/CVPR42600.2020.00559
Abstract	Facial image manipulation has achieved great progress in recent years. However, previous methods either operate on a predefined set of face attributes or leave users little freedom to interactively manipulate images. To overcome these drawbacks, we propose a novel framework termed MaskGAN, enabling diverse and interactive face manipulation. Our key insight is that semantic masks serve as a suitable intermediate representation for flexible face manipulation with fidelity preservation. MaskGAN has two main components: 1) Dense Mapping Network (DMN) and 2) Editing Behavior Simulated Training (EBST). Specifically, DMN learns style mapping between a free-form user modified mask and a target image, enabling diverse generation results. EBST models the user editing behavior on the source mask, making the overall framework more robust to various manipulated inputs. Specifically, it introduces dual-editing consistency as the auxiliary supervision signal. To facilitate extensive studies, we construct a large-scale high-resolution face dataset with fine-grained mask annotations named CelebAMask-HQ. MaskGAN is comprehensively evaluated on two challenging tasks: attribute transfer and style copy, demonstrating superior performance over other state-of-the-art methods. The code, models, and dataset are available at https://github.com/switchablenorms/CelebAMask-HQ.
Description	Session: Poster 2.1 — 3D From Multiview and Sensors; Face, Gesture, and Body Pose; Image and Video Synthesis - Poster no. 69 ; Paper ID 2297
Persistent Identifier	http://hdl.handle.net/10722/284166
ISSN	1063-6919 (2023 SCImago Journal Rankings: 10.331)
ISI Accession Number ID	WOS:000620679505081

DC Field	Value	Language
dc.contributor.author	Lee, C	-
dc.contributor.author	Liu, Z	-
dc.contributor.author	Wu, L	-
dc.contributor.author	Luo, P	-
dc.date.accessioned	2020-07-20T05:56:36Z	-
dc.date.available	2020-07-20T05:56:36Z	-
dc.date.issued	2020	-
dc.identifier.citation	Proceedings of IEEE/CVF International Conference on Computer Vision and Pattern Recognition (CVPR 2020), Seattle, USA, 14-19 June 2020, p. 5548-5557	-
dc.identifier.issn	1063-6919	-
dc.identifier.uri	http://hdl.handle.net/10722/284166	-
dc.description	Session: Poster 2.1 — 3D From Multiview and Sensors; Face, Gesture, and Body Pose; Image and Video Synthesis - Poster no. 69 ; Paper ID 2297	-
dc.description.abstract	Facial image manipulation has achieved great progress in recent years. However, previous methods either operate on a predefined set of face attributes or leave users little freedom to interactively manipulate images. To overcome these drawbacks, we propose a novel framework termed MaskGAN, enabling diverse and interactive face manipulation. Our key insight is that semantic masks serve as a suitable intermediate representation for flexible face manipulation with fidelity preservation. MaskGAN has two main components: 1) Dense Mapping Network (DMN) and 2) Editing Behavior Simulated Training (EBST). Specifically, DMN learns style mapping between a free-form user modified mask and a target image, enabling diverse generation results. EBST models the user editing behavior on the source mask, making the overall framework more robust to various manipulated inputs. Specifically, it introduces dual-editing consistency as the auxiliary supervision signal. To facilitate extensive studies, we construct a large-scale high-resolution face dataset with fine-grained mask annotations named CelebAMask-HQ. MaskGAN is comprehensively evaluated on two challenging tasks: attribute transfer and style copy, demonstrating superior performance over other state-of-the-art methods. The code, models, and dataset are available at https://github.com/switchablenorms/CelebAMask-HQ.	-
dc.language	eng	-
dc.publisher	IEEE Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000147	-
dc.relation.ispartof	IEEE Conference on Computer Vision and Pattern Recognition. Proceedings	-
dc.rights	IEEE Conference on Computer Vision and Pattern Recognition. Proceedings. Copyright © IEEE Computer Society.	-
dc.rights	©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.	-
dc.subject	face recognition	-
dc.subject	image resolution	-
dc.subject	learning (artificial intelligence)	-
dc.subject	neural nets	-
dc.title	Maskgan: Towards diverse and interactive facial image manipulation	-
dc.type	Conference_Paper	-
dc.identifier.email	Luo, P: pluo@hku.hk	-
dc.identifier.authority	Luo, P=rp02575	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/CVPR42600.2020.00559	-
dc.identifier.scopus	eid_2-s2.0-85093100763	-
dc.identifier.hkuros	311027	-
dc.identifier.spage	5548	-
dc.identifier.epage	5557	-
dc.identifier.isi	WOS:000620679505081	-
dc.publisher.place	United States	-
dc.identifier.issnl	1063-6919	-
