(41): Deep Learning for Video Captioning: A Review
- Handwritten Notes
- PPT Presentation Summary
- Abstract
- 1 Introduction
- 2 Problem Formulation
- 3 Video Representation
- 3.1 Multimodal Feature Extraction
- 3.2 Feature Aggregation Is Important
- 4 Caption Generation
- 4.1 Auxiliary Semantic Supervision
- 4.2 Addressing Objective Mismatch
- 4.3 Dense Captioning