Links for fulltext (may require subscription):
- Publisher Website (DOI): 10.1109/TCSVT.2021.3101847
- Scopus: eid_2-s2.0-85112593687
- Web of Science: WOS:000725812500016
Article: GeometryMotion-Net: A Strong Two-Stream Baseline for 3D Action Recognition

Title | GeometryMotion-Net: A Strong Two-Stream Baseline for 3D Action Recognition |
---|---|
Authors | Liu, Jiaheng; Xu, Dong |
Keywords | 3D action recognition; Point cloud; two-stream |
Issue Date | 2021 |
Citation | IEEE Transactions on Circuits and Systems for Video Technology, 2021, v. 31, n. 12, p. 4711-4721 |
Abstract | In this work, we propose a strong two-stream baseline method referred to as GeometryMotion-Net for 3D action recognition. For efficient 3D action recognition, we first represent each point cloud sequence as a limited number of randomly sampled frames with each frame consisting of a sparse set of points. After that, we propose a new two-stream framework for effective 3D action recognition. For the geometry stream, we propose a new module to produce a virtual overall geometry point cloud by first merging all 3D points from these selected frames, and then we exploit local neighborhood information of each point in the feature space. In the motion stream, for any two neighboring point cloud frames, we also propose a new module to generate one virtual forward motion point cloud and one virtual backward motion point cloud. Specifically, for each point in the current frame, we first produce a set of 3D offset features relative to the neighboring points in the reference frame (i.e., the previous/subsequent frame) and then exploit local neighborhood information of this point in the offset feature space. Based on the newly generated virtual overall geometry point cloud and multiple virtual forward/backward motion point clouds, any existing point cloud analysis methods (e.g., PointNet) can be readily adopted for extracting discriminant geometry and bidirectional motion features in the geometry and motion streams, respectively, which are further aggregated to make our two-stream network trainable in an end-to-end fashion. Comprehensive experiments on both large-scale datasets (i.e., NTU RGB+D 60 and NTU RGB+D 120) and small-scale datasets (i.e., N-UCLA and UWA3DII) demonstrate the effectiveness and efficiency of our two-stream network for 3D action recognition. |
Persistent Identifier | http://hdl.handle.net/10722/321956 |
ISSN | 1051-8215 (2023 Impact Factor: 8.3; 2023 SCImago Journal Rankings: 2.299) |
ISI Accession Number ID | WOS:000725812500016 |
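The two input constructions described in the abstract — merging all sampled frames into one virtual overall geometry point cloud, and building forward/backward 3D offset features between neighboring frames — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame count, point count, neighborhood size `k`, and all function names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy input: T randomly sampled frames, each a sparse set of
# N 3-D points (the paper samples a limited number of frames per sequence).
T, N = 4, 128
frames = [rng.standard_normal((N, 3)) for _ in range(T)]

def virtual_geometry_cloud(frames):
    """Geometry-stream input: merge all 3-D points from the selected
    frames into one virtual overall geometry point cloud."""
    return np.concatenate(frames, axis=0)          # shape (T*N, 3)

def motion_offsets(cur, ref, k=8):
    """Motion-stream input (sketch): for each point in the current frame,
    3-D offsets to its k nearest neighbors in the reference frame
    (previous frame -> forward motion, next frame -> backward motion)."""
    # Pairwise squared distances between current and reference points.
    d2 = ((cur[:, None, :] - ref[None, :, :]) ** 2).sum(axis=-1)
    nn = np.argsort(d2, axis=1)[:, :k]             # k nearest reference points
    return ref[nn] - cur[:, None, :]               # offsets, shape (N, k, 3)

geometry = virtual_geometry_cloud(frames)                                  # (512, 3)
forward = [motion_offsets(frames[t], frames[t - 1]) for t in range(1, T)]  # T-1 clouds
backward = [motion_offsets(frames[t], frames[t + 1]) for t in range(T - 1)]

print(geometry.shape)    # (512, 3)
print(forward[0].shape)  # (128, 8, 3)
```

In the paper, these virtual geometry and motion point clouds are then fed to an off-the-shelf point cloud backbone (e.g., PointNet) in each stream; the sketch above only covers the input construction step.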
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Liu, Jiaheng | - |
dc.contributor.author | Xu, Dong | - |
dc.date.accessioned | 2022-11-03T02:22:37Z | - |
dc.date.available | 2022-11-03T02:22:37Z | - |
dc.date.issued | 2021 | - |
dc.identifier.citation | IEEE Transactions on Circuits and Systems for Video Technology, 2021, v. 31, n. 12, p. 4711-4721 | - |
dc.identifier.issn | 1051-8215 | - |
dc.identifier.uri | http://hdl.handle.net/10722/321956 | - |
dc.description.abstract | In this work, we propose a strong two-stream baseline method referred to as GeometryMotion-Net for 3D action recognition. For efficient 3D action recognition, we first represent each point cloud sequence as a limited number of randomly sampled frames with each frame consisting of a sparse set of points. After that, we propose a new two-stream framework for effective 3D action recognition. For the geometry stream, we propose a new module to produce a virtual overall geometry point cloud by first merging all 3D points from these selected frames, and then we exploit local neighborhood information of each point in the feature space. In the motion stream, for any two neighboring point cloud frames, we also propose a new module to generate one virtual forward motion point cloud and one virtual backward motion point cloud. Specifically, for each point in the current frame, we first produce a set of 3D offset features relative to the neighboring points in the reference frame (i.e., the previous/subsequent frame) and then exploit local neighborhood information of this point in the offset feature space. Based on the newly generated virtual overall geometry point cloud and multiple virtual forward/backward motion point clouds, any existing point cloud analysis methods (e.g., PointNet) can be readily adopted for extracting discriminant geometry and bidirectional motion features in the geometry and motion streams, respectively, which are further aggregated to make our two-stream network trainable in an end-to-end fashion. Comprehensive experiments on both large-scale datasets (i.e., NTU RGB+D 60 and NTU RGB+D 120) and small-scale datasets (i.e., N-UCLA and UWA3DII) demonstrate the effectiveness and efficiency of our two-stream network for 3D action recognition. | - |
dc.language | eng | - |
dc.relation.ispartof | IEEE Transactions on Circuits and Systems for Video Technology | - |
dc.subject | 3D action recognition | - |
dc.subject | Point cloud | - |
dc.subject | two-stream | - |
dc.title | GeometryMotion-Net: A Strong Two-Stream Baseline for 3D Action Recognition | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/TCSVT.2021.3101847 | - |
dc.identifier.scopus | eid_2-s2.0-85112593687 | - |
dc.identifier.volume | 31 | - |
dc.identifier.issue | 12 | - |
dc.identifier.spage | 4711 | - |
dc.identifier.epage | 4721 | - |
dc.identifier.eissn | 1558-2205 | - |
dc.identifier.isi | WOS:000725812500016 | - |