File Download
There are no files associated with this item.
Links for fulltext (may require subscription):
- Publisher Website: https://doi.org/10.1109/WACV57701.2024.00301
- Scopus: eid_2-s2.0-85189459735
Citations:
- Scopus: 3
Appears in Collections:
Conference Paper: Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations
Field | Value
---|---
Title | Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations
Authors | Li, Quanzhou; Wang, Jingbo; Loy, Chen Change; Dai, Bo
Keywords | 3D; 3D computer vision; Algorithms; Generative models for image, video, etc.
Issue Date | 2024
Citation | Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024, 2024, p. 3023-3032
Abstract | Digital human motion synthesis is a vibrant research field with applications in movies, AR/VR, and video games. While many methods have been proposed to generate natural and realistic human motions, most focus only on modeling humans and largely ignore object movements. Generating task-oriented human-object interaction motions in simulation is challenging: depending on the intent of using an object, humans perform different motions, which requires the human first to approach the object and then make it move consistently with the human rather than stay still. Moreover, for deployment in downstream applications, the synthesized motions should be flexible in length, offering options to personalize the predicted motions for various purposes. To this end, we propose TOHO: Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations, which generates full human-object interaction motions that accomplish specific tasks, given only the task type, the object, and a starting human status. TOHO generates human-object motions in four steps: 1) it first estimates the object's final position given the task intent; 2) it then generates keyframe poses grasping the object; 3) it infills between the keyframes to produce continuous motions; 4) finally, it applies a compact closed-form object motion estimation to generate the object motion. Our method produces continuous motions parameterized only by the temporal coordinate, which allows upsampling the sequence to arbitrarily many frames and adjusting the motion speed by designing the temporal coordinate vector. This work takes a step further toward general human-scene interaction simulation.
Persistent Identifier | http://hdl.handle.net/10722/352425
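The abstract notes that TOHO's motions are parameterized only by the temporal coordinate, so a sequence can be queried at arbitrarily many frames and re-timed by redesigning the coordinate vector. Below is a minimal, hypothetical sketch of that idea; the `ImplicitMotion` network, `pose_dim`, and all other names are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class ImplicitMotion(nn.Module):
    """Hypothetical implicit motion representation: maps a temporal
    coordinate t in [0, 1] to a pose vector, so the sequence is a
    continuous function of time rather than a fixed array of frames."""

    def __init__(self, pose_dim: int = 72, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, pose_dim),  # one pose vector per time query
        )

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: (num_frames, 1) temporal coordinates -> (num_frames, pose_dim)
        return self.net(t)

motion = ImplicitMotion()

# Upsampling to an arbitrary frame count: simply query more coordinates.
t_coarse = torch.linspace(0.0, 1.0, steps=30).unsqueeze(-1)
t_dense = torch.linspace(0.0, 1.0, steps=120).unsqueeze(-1)
poses_coarse = motion(t_coarse)  # (30, pose_dim)
poses_dense = motion(t_dense)    # (120, pose_dim)

# Adjusting motion speed by designing the temporal coordinate vector:
# non-uniform spacing (quadratic here) plays the motion slow-then-fast.
t_warped = torch.linspace(0.0, 1.0, steps=60).pow(2).unsqueeze(-1)
poses_warped = motion(t_warped)  # (60, pose_dim)
```

In the actual method such a representation would presumably be fit per sequence and conditioned on the task type, object, and starting human status; the point of the sketch is only that frame count and timing become query-time choices rather than properties fixed at training time.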
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Li, Quanzhou | - |
dc.contributor.author | Wang, Jingbo | - |
dc.contributor.author | Loy, Chen Change | - |
dc.contributor.author | Dai, Bo | - |
dc.date.accessioned | 2024-12-16T03:58:52Z | - |
dc.date.available | 2024-12-16T03:58:52Z | - |
dc.date.issued | 2024 | - |
dc.identifier.citation | Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024, 2024, p. 3023-3032 | - |
dc.identifier.uri | http://hdl.handle.net/10722/352425 | - |
dc.description.abstract | Digital human motion synthesis is a vibrant research field with applications in movies, AR/VR, and video games. While many methods have been proposed to generate natural and realistic human motions, most focus only on modeling humans and largely ignore object movements. Generating task-oriented human-object interaction motions in simulation is challenging: depending on the intent of using an object, humans perform different motions, which requires the human first to approach the object and then make it move consistently with the human rather than stay still. Moreover, for deployment in downstream applications, the synthesized motions should be flexible in length, offering options to personalize the predicted motions for various purposes. To this end, we propose TOHO: Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations, which generates full human-object interaction motions that accomplish specific tasks, given only the task type, the object, and a starting human status. TOHO generates human-object motions in four steps: 1) it first estimates the object's final position given the task intent; 2) it then generates keyframe poses grasping the object; 3) it infills between the keyframes to produce continuous motions; 4) finally, it applies a compact closed-form object motion estimation to generate the object motion. Our method produces continuous motions parameterized only by the temporal coordinate, which allows upsampling the sequence to arbitrarily many frames and adjusting the motion speed by designing the temporal coordinate vector. This work takes a step further toward general human-scene interaction simulation. | -
dc.language | eng | - |
dc.relation.ispartof | Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024 | - |
dc.subject | 3D | - |
dc.subject | 3D computer vision | - |
dc.subject | Algorithms | -
dc.subject | Generative models for image, video, etc. | -
dc.title | Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations | - |
dc.type | Conference_Paper | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/WACV57701.2024.00301 | - |
dc.identifier.scopus | eid_2-s2.0-85189459735 | - |
dc.identifier.spage | 3023 | - |
dc.identifier.epage | 3032 | - |