Dual-attention Network for View-invariant Action Recognition

View-invariant action recognition has been widely researched for applications such as visual surveillance and human–robot interaction. It remains challenging, however, because view changes cause action occlusions and information loss. Modeling the spatiotemporal dynamics of body joints and minimizing the representation discrepancy between different views is a promising route to view invariance. We therefore propose a Dual-Attention Network (DANet) that learns robust video representations for view-invariant action recognition. DANet is composed of a relation-aware spatiotemporal self-attention module and a spatiotemporal cross-attention module. The self-attention module learns representative and discriminative action features by capturing local and global long-range dependencies, as well as pairwise relations among human body parts and joints, in both the spatial and temporal domains. The cross-attention module learns view-invariant attention maps and generates discriminative features for semantic representations of actions across different views. We extensively evaluate the proposed approach on the large-scale, challenging NTU-60, NTU-120, and UESTC datasets under Cross-Subject, Cross-View, Cross-Set, and Arbitrary-view evaluation protocols. The experimental results demonstrate that our approach significantly outperforms state-of-the-art methods in view-invariant action recognition.
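As a rough illustration of how such a dual-attention design can be wired together, the sketch below pairs a self-attention block over flattened spatiotemporal joint tokens with a cross-attention block between two camera views. It is a minimal PyTorch approximation of the idea described in the abstract, not the authors' implementation; all module names, tensor shapes, and hyperparameters are assumptions.

# Illustrative sketch only (not the DANet authors' code): names, shapes,
# and hyperparameters are assumptions chosen to mirror the abstract.
import torch
import torch.nn as nn

class SpatioTemporalSelfAttention(nn.Module):
    """Self-attention over flattened (frame, joint) tokens, so attention can
    relate body joints across both the spatial and temporal domains."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, T * J, dim) -- T frames x J joints as one token sequence
        out, _ = self.attn(x, x, x)
        return self.norm(x + out)  # residual connection + layer norm

class CrossViewAttention(nn.Module):
    """Cross-attention: tokens from one view query the other view's tokens,
    pushing the two views toward shared (view-invariant) attention maps."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x_a: torch.Tensor, x_b: torch.Tensor) -> torch.Tensor:
        # queries from view A; keys/values from view B
        out, _ = self.attn(x_a, x_b, x_b)
        return self.norm(x_a + out)

if __name__ == "__main__":
    B, T, J, D = 2, 16, 25, 128          # batch, frames, joints, feature dim (assumed)
    view_a = torch.randn(B, T * J, D)    # features of an action seen from view A
    view_b = torch.randn(B, T * J, D)    # the same action seen from view B
    self_attn = SpatioTemporalSelfAttention(D)
    cross_attn = CrossViewAttention(D)
    fa, fb = self_attn(view_a), self_attn(view_b)
    fused = cross_attn(fa, fb)           # view-aligned representation for view A
    print(fused.shape)                   # torch.Size([2, 400, 128])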

Bibliographic Details
Main Author: Gedamu Alemu Kumie (19273711)
Other Authors: Maregu Assefa Habtie (19273714), Tewodros Alemu Ayall (19273717), Changjun Zhou (451444), Huawen Liu (840748), Abegaz Mohammed Seid (19170901), Aiman Erbad (14150589)
Published: 2023
Published in: Complex & Intelligent Systems
DOI: 10.1007/s40747-023-01171-8
Online Access: https://figshare.com/articles/journal_contribution/Dual-attention_Network_for_View-invariant_Action_Recognition/26421559
Publisher's Website: https://dx.doi.org/10.1007/s40747-023-01171-8
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0)
Access: Open access
Format: Text (journal contribution)
Subjects: Information and computing sciences; Computer vision and multimedia computation; Machine learning; Human action recognition; Self-attention; Cross-attention; Dual-attention; Attention transfer; View-invariant representation