Institutional Repository of Coll Elect & Informat Engn
ResLNet: deep residual LSTM network with longer input for action recognition
Wang, Tian1; Li, Jiakun2; Wu, Huai-Ning2; Li, Ce3; Snoussi, Hichem4; Wu, Yang5
2022-12
Journal | FRONTIERS OF COMPUTER SCIENCE |
ISSN | 2095-2228 |
Volume | 16 |
Issue | 6 |
Abstract | Action recognition is an important research topic in video analysis that remains very challenging. Effective recognition relies on learning a good representation of both spatial information (for appearance) and temporal information (for motion). These two kinds of information are highly correlated yet have quite different properties, so both cascading independent models (e.g., CNN-LSTM) and direct unbiased co-modeling (e.g., 3D CNN) yield unsatisfactory results. Moreover, a long-standing convention for this task with deep learning models is to use only 8 or 16 consecutive frames as input, which makes it hard to extract discriminative motion features. In this work, we propose a novel network structure called ResLNet (deep residual LSTM network), which accepts longer inputs (e.g., 64 frames) and lets convolutions collaborate with the LSTM more effectively under a residual structure, learning better spatial-temporal representations than previous methods without extra computational cost, thanks to the proposed embedded variable-stride convolution. The superiority of this proposal is demonstrated, along with an ablation study, on the three most popular benchmark datasets: Kinetics, HMDB51, and UCF101. The proposed network can be adopted for various input features, such as RGB and optical flow. Owing to the limited computation power of our experimental equipment and the real-time requirement, the network is evaluated on RGB input only and shows strong performance. |
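The abstract's efficiency claim is that a strided temporal convolution lets the network accept longer clips (e.g., 64 frames) without a proportional increase in downstream computation. The paper's exact "embedded variable stride convolution" is not specified in this record; the following is a minimal, hypothetical NumPy sketch of a plain strided temporal convolution that illustrates only the sequence-length-reduction idea (the function name, shapes, and stride value are illustrative assumptions, not the authors' design).

```python
import numpy as np

def strided_temporal_conv(x, weights, stride):
    """Strided 1D temporal convolution over a (T, C_in) feature sequence.

    With stride > 1 the temporal axis is shortened, so a longer clip
    (e.g. 64 frames) can be processed without a proportional increase
    in the computation of later layers.
    """
    k, c_in, c_out = weights.shape            # kernel size, channels in/out
    t = x.shape[0]
    out_len = (t - k) // stride + 1           # output time steps
    y = np.empty((out_len, c_out))
    for i in range(out_len):
        window = x[i * stride : i * stride + k]        # (k, c_in) slice
        y[i] = np.einsum('kc,kco->o', window, weights) # sum over time & channels
    return y

# A 64-frame clip with 8 feature channels, kernel size 4, stride 4:
clip = np.random.randn(64, 8)
w = np.random.randn(4, 8, 16)
out = strided_temporal_conv(clip, w, stride=4)
print(out.shape)  # (16, 16): 64 frames reduced to 16 time steps
```

With stride 4, the 64-step input collapses to 16 output steps, so whatever follows (e.g., an LSTM) sees a quarter of the sequence length.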
Keywords | action recognition; deep learning; neural network |
DOI | 10.1007/s11704-021-0236-9 |
Indexed By | SCIE ; EI |
Language | English |
WOS Research Area | Computer Science |
WOS Categories | Computer Science, Information Systems ; Computer Science, Software Engineering ; Computer Science, Theory & Methods |
WOS Accession Number | WOS:000745605300006 |
Publisher | HIGHER EDUCATION PRESS |
EI Accession Number | 20220511550497 |
EI Controlled Terms | Convolution |
EI Classification Codes | 461.4 Ergonomics and Human Factors Engineering ; 716.1 Information Theory and Signal Processing |
Source Database | WOS |
Document Type | Journal article |
Identifier | https://ir.lut.edu.cn/handle/2XXMBERH/154758 |
Collection | College of Electrical and Information Engineering |
Corresponding Author | Wu, Yang |
Affiliations | 1.Beihang Univ, Inst Artificial Intelligence, Beijing 100191, Peoples R China; 2.Beihang Univ, Sch Automat Sci & Elect Engn, Beijing 100191, Peoples R China; 3.Lanzhou Univ Technol, Coll Elect & Informat Engn, Lanzhou 730050, Peoples R China; 4.Univ Technol Troyes, Inst Charles Delaunay LM2S FRE CNRS 2019, F-10010 Troyes, France; 5.Nara Inst Sci & Technol, Inst Res Initiat, Nara 6300192, Japan |
Recommended Citation (GB/T 7714) | Wang, Tian, Li, Jiakun, Wu, Huai-Ning, et al. ResLNet: deep residual LSTM network with longer input for action recognition[J]. FRONTIERS OF COMPUTER SCIENCE, 2022, 16(6). |
APA | Wang, Tian, Li, Jiakun, Wu, Huai-Ning, Li, Ce, Snoussi, Hichem, & Wu, Yang. (2022). ResLNet: deep residual LSTM network with longer input for action recognition. FRONTIERS OF COMPUTER SCIENCE, 16(6). |
MLA | Wang, Tian, et al. "ResLNet: deep residual LSTM network with longer input for action recognition". FRONTIERS OF COMPUTER SCIENCE 16.6 (2022). |
Unless otherwise noted, all items in this repository are protected by copyright, with all rights reserved.