ResLNet: deep residual LSTM network with longer input for action recognition
Wang, Tian (1); Li, Jiakun (2); Wu, Huai-Ning (2); Li, Ce (3); Snoussi, Hichem (4); Wu, Yang (5)
2022-12
Journal: FRONTIERS OF COMPUTER SCIENCE
ISSN: 2095-2228
Volume: 16, Issue: 6
Abstract: Action recognition is an important research topic in video analysis that remains very challenging. Effective recognition relies on learning a good representation of both spatial information (for appearance) and temporal information (for motion). These two kinds of information are highly correlated but have quite different properties, so both connecting independent models (e.g., CNN-LSTM) and direct unbiased co-modeling (e.g., 3D CNN) give unsatisfying results. Moreover, a long-standing convention in deep learning models for this task is to use only 8 or 16 consecutive frames as input, which makes it hard to extract discriminative motion features. In this work, we propose a novel network structure called ResLNet (deep residual LSTM network), which can take longer inputs (e.g., 64 frames) and, thanks to the proposed embedded variable-stride convolution, lets convolutions collaborate with the LSTM more effectively under a residual structure to learn better spatial-temporal representations without extra computational cost. The superiority of this proposal and its ablation study are shown on the three most popular benchmark datasets: Kinetics, HMDB51, and UCF101. The proposed network can be adopted for various input features, such as RGB and optical flow. Due to the limited computational power of our experimental equipment and the real-time requirement, the proposed network is evaluated on RGB only and shows strong performance.
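Note: this record carries no code, and the exact ResLNet architecture (including its embedded variable-stride convolution) is described only in the paper itself. The block below is a minimal, hypothetical PyTorch sketch of the general idea stated in the abstract: a residual block in which a per-frame convolution (appearance) and an LSTM over a long clip (e.g., 64 frames, motion) cooperate. The class and parameter names (ResidualConvLSTMBlock, hidden_dim, the gating scheme) are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only -- NOT the authors' ResLNet code. It shows the
# general idea from the abstract: a residual block where a per-frame 2D
# convolution (appearance) and an LSTM over the frame axis (motion)
# cooperate on a long input clip (e.g., 64 frames). All names are hypothetical.
import torch
import torch.nn as nn


class ResidualConvLSTMBlock(nn.Module):
    def __init__(self, channels: int, hidden_dim: int):
        super().__init__()
        # Per-frame spatial convolution (appearance branch).
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.norm = nn.BatchNorm2d(channels)
        # LSTM over globally pooled frame descriptors (motion branch).
        self.lstm = nn.LSTM(input_size=channels, hidden_size=hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, channels, height, width), e.g., time = 64 frames.
        b, t, c, h, w = x.shape
        # Spatial branch: apply the convolution to every frame.
        spatial = self.norm(self.conv(x.reshape(b * t, c, h, w))).reshape(b, t, c, h, w)
        # Temporal branch: LSTM over global-average-pooled frame features.
        pooled = x.mean(dim=(3, 4))                 # (b, t, c)
        temporal, _ = self.lstm(pooled)             # (b, t, hidden_dim)
        gate = self.proj(temporal).reshape(b, t, c, 1, 1)
        # Residual combination: identity + conv branch modulated by the LSTM output.
        return x + spatial * torch.sigmoid(gate)


if __name__ == "__main__":
    block = ResidualConvLSTMBlock(channels=32, hidden_dim=64)
    clip = torch.randn(2, 64, 32, 56, 56)           # a 64-frame clip
    print(block(clip).shape)                        # torch.Size([2, 64, 32, 56, 56])
```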
Keywords: action recognition; deep learning; neural network
DOI: 10.1007/s11704-021-0236-9
Indexed by: SCIE; EI
Language: English
WOS Research Area: Computer Science
WOS Categories: Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods
WOS ID: WOS:000745605300006
Publisher: HIGHER EDUCATION PRESS
EI Accession Number: 20220511550497
EI Subject Terms: Convolution
EI Classification Codes: 461.4 Ergonomics and Human Factors Engineering; 716.1 Information Theory and Signal Processing
Source Database: WOS
Citation Statistics
Times Cited: 6 (WOS)
Document Type: Journal article
Identifier: https://ir.lut.edu.cn/handle/2XXMBERH/154758
Collection: School of Electrical and Information Engineering
Corresponding Author: Wu, Yang
Affiliations:
1. Beihang Univ, Inst Artificial Intelligence, Beijing 100191, Peoples R China;
2.Beihang Univ, Sch Automat Sci & Elect Engn, Beijing 100191, Peoples R China;
3.Lanzhou Univ Technol, Coll Elect & Informat Engn, Lanzhou 730050, Peoples R China;
4.Univ Technol Troyes, Inst Charles Delaunay LM2S FRE CNRS 2019, F-10010 Troyes, France;
5.Nara Inst Sci & Technol, Inst Res Initiat, Nara 6300192, Japan
Recommended Citation
GB/T 7714
Wang, Tian, Li, Jiakun, Wu, Huai-Ning, et al. ResLNet: deep residual LSTM network with longer input for action recognition[J]. FRONTIERS OF COMPUTER SCIENCE, 2022, 16(6).
APA Wang, Tian, Li, Jiakun, Wu, Huai-Ning, Li, Ce, Snoussi, Hichem, & Wu, Yang. (2022). ResLNet: deep residual LSTM network with longer input for action recognition. FRONTIERS OF COMPUTER SCIENCE, 16(6).
MLA Wang, Tian, et al. "ResLNet: deep residual LSTM network with longer input for action recognition". FRONTIERS OF COMPUTER SCIENCE 16.6 (2022).
Files in This Item:
No files associated with this item.