Conference Paper: Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations

Title: Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations
Authors: Li, Quanzhou; Wang, Jingbo; Loy, Chen Change; Dai, Bo
Keywords: Algorithms: 3D computer vision; Algorithms: Generative models for image, video, 3D, etc.
Issue Date: 2024
Citation: Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024, 2024, p. 3023-3032
Abstract: Digital human motion synthesis is a vibrant research field with applications in movies, AR/VR, and video games. While many methods have been proposed to generate natural and realistic human motions, most focus only on modeling humans and largely ignore object movements. Generating task-oriented human-object interaction motions in simulation is challenging: depending on the intended use of an object, humans perform different motions, which requires the human to first approach the object and then move it consistently with the body rather than leaving it still. Moreover, for deployment in downstream applications, the synthesized motions should be flexible in length, offering options to personalize the predicted motions for various purposes. To this end, we propose TOHO: Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations, which generates full human-object interaction motions for specific tasks, given only the task type, the object, and a starting human status. TOHO generates human-object motions in four steps: 1) it first estimates the object's final position given the task intent; 2) it then generates keyframe poses grasping the object; 3) it infills the keyframes to produce continuous motion; 4) finally, it applies a compact closed-form object motion estimation to generate the object's trajectory. Our method produces continuous motions parameterized only by the temporal coordinate, which allows upsampling the sequence to an arbitrary number of frames and adjusting motion speed by designing the temporal coordinate vector. This work takes a step further toward general human-scene interaction simulation.
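The abstract's claim that motions parameterized only by a continuous temporal coordinate can be upsampled to arbitrary frame counts and re-timed can be sketched as follows. This is an illustrative sketch, not the paper's implementation: `motion_model` is a hypothetical stand-in (a sine curve) for TOHO's learned implicit pose decoder, and `sample_motion` is an assumed helper name.

```python
import numpy as np

def motion_model(t):
    # Hypothetical stand-in for the learned implicit motion network:
    # any continuous function of the temporal coordinate t in [0, 1].
    # A sine curve replaces the real pose decoder for illustration.
    return np.sin(2 * np.pi * t)

def sample_motion(num_frames, speed=1.0):
    # Build a temporal coordinate vector and query the implicit model.
    # Because the motion is a continuous function of t, changing
    # num_frames upsamples/downsamples the sequence, and scaling t
    # warps the playback speed.
    t = np.linspace(0.0, 1.0, num_frames) * speed
    return motion_model(np.clip(t, 0.0, 1.0))

coarse = sample_motion(30)                  # 30-frame sequence
dense = sample_motion(120)                  # same motion, upsampled to 120 frames
half = sample_motion(60, speed=0.5)         # first half of the motion over 60 frames
```

The key design point the abstract highlights: because the sequence is defined on a continuum rather than at fixed frame indices, resampling requires no retraining or interpolation post-processing, only a different temporal coordinate vector.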
Persistent Identifier: http://hdl.handle.net/10722/352425

 

DC Field: Value
dc.contributor.author: Li, Quanzhou
dc.contributor.author: Wang, Jingbo
dc.contributor.author: Loy, Chen Change
dc.contributor.author: Dai, Bo
dc.date.accessioned: 2024-12-16T03:58:52Z
dc.date.available: 2024-12-16T03:58:52Z
dc.date.issued: 2024
dc.identifier.citation: Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024, 2024, p. 3023-3032
dc.identifier.uri: http://hdl.handle.net/10722/352425
dc.language: eng
dc.relation.ispartof: Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024
dc.subject: Algorithms: 3D computer vision
dc.subject: Algorithms: Generative models for image, video, 3D, etc.
dc.title: Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations
dc.type: Conference_Paper
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.doi: 10.1109/WACV57701.2024.00301
dc.identifier.scopus: eid_2-s2.0-85189459735
dc.identifier.spage: 3023
dc.identifier.epage: 3032