Article: GeometryMotion-Net: A Strong Two-Stream Baseline for 3D Action Recognition

Title: GeometryMotion-Net: A Strong Two-Stream Baseline for 3D Action Recognition
Authors: Liu, Jiaheng; Xu, Dong
Keywords: 3D action recognition; Point cloud; two-stream
Issue Date: 2021
Citation: IEEE Transactions on Circuits and Systems for Video Technology, 2021, v. 31, n. 12, p. 4711-4721
Abstract: In this work, we propose a strong two-stream baseline method referred to as GeometryMotion-Net for 3D action recognition. For efficient 3D action recognition, we first represent each point cloud sequence as a limited number of randomly sampled frames, with each frame consisting of a sparse set of points. After that, we propose a new two-stream framework for effective 3D action recognition. For the geometry stream, we propose a new module to produce a virtual overall geometry point cloud by first merging all 3D points from these selected frames, and then we exploit local neighborhood information of each point in the feature space. In the motion stream, for any two neighboring point cloud frames, we also propose a new module to generate one virtual forward motion point cloud and one virtual backward motion point cloud. Specifically, for each point in the current frame, we first produce a set of 3D offset features relative to the neighboring points in the reference frame (i.e., the previous/subsequent frame) and then exploit local neighborhood information of this point in the offset feature space. Based on the newly generated virtual overall geometry point cloud and multiple virtual forward/backward motion point clouds, any existing point cloud analysis method (e.g., PointNet) can be readily adopted for extracting discriminant geometry and bidirectional motion features in the geometry and motion streams, respectively, which are further aggregated to make our two-stream network trainable in an end-to-end fashion. Comprehensive experiments on both large-scale datasets (i.e., NTU RGB+D 60 and NTU RGB+D 120) and small-scale datasets (i.e., N-UCLA and UWA3DII) demonstrate the effectiveness and efficiency of our two-stream network for 3D action recognition.
(An illustrative sketch of the input-construction steps described above appears after the record fields below.)
Persistent Identifier: http://hdl.handle.net/10722/321956
ISSN: 1051-8215
2023 Impact Factor: 8.3
2023 SCImago Journal Rankings: 2.299
ISI Accession Number ID: WOS:000725812500016
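
The abstract describes two input-construction steps: sampling a fixed number of sparse frames from each point cloud sequence, merging all sampled points into one virtual overall geometry point cloud, and, for each pair of neighboring frames, building forward/backward motion clouds from 3D offset features against nearest neighbors in the reference frame. Below is a minimal NumPy sketch of those steps under stated assumptions; the function names, the brute-force k-NN, the offset sign, and all sizes (8 frames, 512 points, k = 16) are illustrative and not taken from the paper.

import numpy as np


def sample_frames(sequence, num_frames=8, num_points=512, rng=None):
    """Represent a point cloud sequence as num_frames randomly sampled
    frames, each reduced to a sparse set of num_points points."""
    if rng is None:
        rng = np.random.default_rng()
    frame_ids = np.sort(rng.choice(len(sequence), size=num_frames, replace=False))
    frames = []
    for i in frame_ids:
        pts = sequence[i]
        idx = rng.choice(len(pts), size=num_points, replace=len(pts) < num_points)
        frames.append(pts[idx])
    return np.stack(frames)  # (T, N, 3)


def virtual_geometry_cloud(frames):
    """Geometry stream input: merge all 3D points from the sampled
    frames into one virtual overall geometry point cloud."""
    return frames.reshape(-1, 3)  # (T * N, 3)


def motion_offsets(current, reference, k=16):
    """For each point in the current frame, compute 3D offset features
    to its k nearest neighbors in the reference frame (the previous or
    subsequent frame). The offset direction is an assumption here."""
    d2 = ((current[:, None, :] - reference[None, :, :]) ** 2).sum(axis=-1)
    nn = np.argsort(d2, axis=1)[:, :k]           # (N, k) neighbor indices
    return reference[nn] - current[:, None, :]   # (N, k, 3) offsets


# Toy usage: a 30-frame sequence with ~1000 points per frame.
rng = np.random.default_rng(0)
seq = [rng.normal(size=(1000, 3)) for _ in range(30)]
frames = sample_frames(seq, rng=rng)             # (8, 512, 3)

geometry_input = virtual_geometry_cloud(frames)  # (4096, 3)
forward = [motion_offsets(frames[t], frames[t + 1]) for t in range(len(frames) - 1)]
backward = [motion_offsets(frames[t + 1], frames[t]) for t in range(len(frames) - 1)]
# geometry_input and the forward/backward offset clouds can then be fed
# to any point cloud backbone (e.g., PointNet) in the geometry and
# motion streams, respectively, as the abstract describes.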

 

DC Field | Value | Language
dc.contributor.author | Liu, Jiaheng | -
dc.contributor.author | Xu, Dong | -
dc.date.accessioned | 2022-11-03T02:22:37Z | -
dc.date.available | 2022-11-03T02:22:37Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | IEEE Transactions on Circuits and Systems for Video Technology, 2021, v. 31, n. 12, p. 4711-4721 | -
dc.identifier.issn | 1051-8215 | -
dc.identifier.uri | http://hdl.handle.net/10722/321956 | -
dc.description.abstract | (same as the Abstract above) | -
dc.language | eng | -
dc.relation.ispartof | IEEE Transactions on Circuits and Systems for Video Technology | -
dc.subject | 3D action recognition | -
dc.subject | Point cloud | -
dc.subject | two-stream | -
dc.title | GeometryMotion-Net: A Strong Two-Stream Baseline for 3D Action Recognition | -
dc.type | Article | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1109/TCSVT.2021.3101847 | -
dc.identifier.scopus | eid_2-s2.0-85112593687 | -
dc.identifier.volume | 31 | -
dc.identifier.issue | 12 | -
dc.identifier.spage | 4711 | -
dc.identifier.epage | 4721 | -
dc.identifier.eissn | 1558-2205 | -
dc.identifier.isi | WOS:000725812500016 | -