Rethinking Skip Connection with Layer Normalization in Transformers and ResNets
- Abstract
- 1 Introduction
- 2 Related Work
- 3 Architecture
- 4 Empirical Study
- 4.1 Task and Settings
- 4.1.1 Image Classification
- 4.1.2 Machine Translation
- 4.2 Exploratory Results
- 4.3 Discussions
- 4.4 Validation Results
- 5 Conclusion and Fu