(59): Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
- Abstract
- 1. Introduction
- 2. Related Work
- 3. Approach
- 3.1. Bottom-Up Attention Model
- 3.2. Captioning Model
- 3.2.1 Top-Down Attention LSTM
- 3.2.2 Language LSTM
- 3.2.3 Objective
- 3.3. VQA Model
- 4. Evaluation
- 4.1. Datasets
- 4.1.1 Visual Genome