(Caffe face recognition) Principles of the CosineFace, ArcFace, MobileFaceNet, and Combined Margin losses and their layer implementations in Caffe

The four loss functions CosineFace, MobileFaceNet, ArcFace, and Combined Margin loss all modify the original softmax loss in order to improve classification performance in face recognition.

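Before these improvements, the earliest approaches were SphereFace based on W-Norm (cos mθ) and SphereFace based on both W-Norm and F-Norm (s·cos mθ). Those two are not covered here; the discussion starts directly from CosineFace.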

1. Caffe layer implementations of the loss-related layers

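Because these improved loss functions all move the classification task into angular space, new layers that perform the corresponding angular-space computations have to be added.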

CosineFace and MobileFaceNet both use the LabelSpecificAdd layer provided at https://github.com/xialuxi/AMSoftmax to perform the following operation from the formula:

cosθ - m
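The cosθ - m step can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic, not the layer's actual source: the function name `label_specific_add` and the list-of-lists layout are assumptions made for the example, and the default `bias=-0.35` simply mirrors the value used in the prototxt later in this post.

```python
def label_specific_add(cosines, labels, bias=-0.35):
    """Add `bias` to the target-class entry of each row; other entries stay unchanged.

    cosines: one row per sample, one cosine per class (hypothetical layout).
    labels:  ground-truth class index for each row.
    """
    out = [row[:] for row in cosines]  # copy so the input is not modified
    for i, label in enumerate(labels):
        out[i][label] += bias          # cos(theta) - m for the true class only
    return out


cosines = [[0.80, 0.10, -0.20],
           [0.05, 0.60, 0.30]]
labels = [0, 1]
print(label_specific_add(cosines, labels))
```

Only the entry belonging to the true class is lowered, so the network has to push that cosine at least m above the others before the softmax treats the sample as well classified.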

ArcFace uses the CosinAddm layer provided at https://github.com/xialuxi/arcface-caffe to perform the following operation from the formula:

cos(θ + m)
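As a rough illustration of cos(θ + m), the target-class cosine can be converted to an angle, shifted by m, and converted back. The function names below are hypothetical and the code is a sketch of the formula only; it does not treat the case θ + m > π specially, which a real training layer may need to handle.

```python
import math


def cos_add_m(cos_theta, m=0.5):
    """Return cos(theta + m) for the target class, given cos(theta)."""
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding error
    theta = math.acos(cos_theta)                # theta lies in [0, pi]
    return math.cos(theta + m)


def cos_add_m_expanded(cos_theta, m=0.5):
    """Same value via cos(a + b) = cos a cos b - sin a sin b, without calling acos."""
    cos_theta = max(-1.0, min(1.0, cos_theta))
    sin_theta = math.sqrt(1.0 - cos_theta * cos_theta)  # sin(theta) >= 0 on [0, pi]
    return cos_theta * math.cos(m) - sin_theta * math.sin(m)


for c in (0.9, 0.5, 0.0):
    print(c, cos_add_m(c), cos_add_m_expanded(c))
```

The two functions agree up to floating-point error; the expanded form is included because it shows that the margin can be applied using only cosθ and sinθ.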

Combined Margin loss uses the CombinedMargin layer provided at https://github.com/gehaocool/CombinedMargin-caffe to perform the following operation from the formula:

cos(m1*θ + m2) - m3
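The combined formula contains the two previous margins as special cases, which a short sketch makes easy to check. The function name `combined_margin` is an assumption for illustration, and the defaults m1=1, m2=0.3, m3=0.2 are taken from the prototxt shown later in this post.

```python
import math


def combined_margin(cos_theta, m1=1.0, m2=0.3, m3=0.2):
    """Return cos(m1 * theta + m2) - m3 for the target class, given cos(theta)."""
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding error
    theta = math.acos(cos_theta)
    return math.cos(m1 * theta + m2) - m3


c = 0.7
# m1 = 1, m2 = 0 leaves only the additive cosine margin: cos(theta) - m3
print(combined_margin(c, m1=1.0, m2=0.0, m3=0.35), c - 0.35)
# m1 = 1, m3 = 0 leaves only the additive angular margin: cos(theta + m2)
print(combined_margin(c, m1=1.0, m2=0.5, m3=0.0), math.cos(math.acos(c) + 0.5))
```

Each printed pair should match up to floating-point error, which is why a single CombinedMargin-style layer can in principle cover the CosineFace-style and ArcFace-style settings by choosing m1, m2, and m3.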

2. Caffe layer implementations of the other operations

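Besides the angular-space operations above, the papers require the weights to be normalized so that the softmax loss is computed in the angular space cosθ. This means modifying the source of the inner_product_layer that sits in front of the loss layer. (An implementation of this layer can be found at https://github.com/xialuxi/AMSoftmax.)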

layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "norm1"
  top: "fc2"
  param {
    lr_mult: 1
  }
  inner_product_param {
    num_output: 8631
    normalize: true  # normalize the weights so that the softmax loss classifies in angular space
    weight_filler {
      type: "xavier"
    }
    bias_term: false
  }
}

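In addition, following the NormFace paper, the feature x must also be normalized: a new input-feature normalization layer, Normalize, is placed in front of the inner_product_layer above. (An implementation of this layer can be found at https://github.com/xialuxi/AMSoftmax.)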

layer {
  name: "norm1"
  type: "Normalize"
  bottom: "fc5"
  top: "norm1"
}
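Why the two normalizations put the classifier in angular space can be seen from a small sketch: once both the feature x and each class weight vector are L2-normalized and the bias is dropped, the inner product is exactly cosθ. The helper names and the toy vectors below are assumptions made for the example.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm (eps avoids division by zero)."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / max(norm, eps) for a in v]


def cosine_logits(x, weights):
    """Inner product of the normalized feature with each normalized class weight vector."""
    x_hat = l2_normalize(x)
    return [sum(a * b for a, b in zip(x_hat, l2_normalize(w))) for w in weights]


x = [3.0, 4.0]
weights = [[6.0, 8.0],     # same direction as x
           [-4.0, 3.0],    # orthogonal to x
           [-1.5, -2.0]]   # opposite direction to x
print(cosine_logits(x, weights))  # approximately [1.0, 0.0, -1.0]
```

The magnitudes of x and of the weight vectors no longer affect the output, so every logit is confined to [-1, 1] and depends only on the angle between feature and class weight.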

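Finally, according to the formulas, whichever of cosθ - m, cos(θ + m), or cos(m1*θ + m2) - m3 is used, a Scale layer still has to be added after it, giving s(cosθ - m), s(cos(θ + m)), or s(cos(m1*θ + m2) - m3). This resolves the problem of training failing to converge.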
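One common way to see why the scale s matters is to bound the best softmax probability that is reachable when every logit is s·cosθ and therefore lies in [-s, s]. The sketch below computes that bound; the function name is hypothetical, and the class count 10575 and the scales 30 and 64 are the values used in the prototxt in this post.

```python
import math


def max_target_probability(s, num_classes):
    """Upper bound on the target softmax probability when all logits lie in [-s, s].

    Best case: the target cosine is 1 and every other class cosine is -1, so
    p = e^s / (e^s + (C - 1) * e^-s) = 1 / (1 + (C - 1) * e^(-2s)).
    """
    return 1.0 / (1.0 + (num_classes - 1) * math.exp(-2.0 * s))


for s in (1, 30, 64):
    p = max_target_probability(s, 10575)
    # The second number is the smallest cross-entropy loss reachable at this scale.
    print(s, p, -math.log(p))
```

Without scaling (s = 1) even a perfectly separated sample gets a target probability far below 1 when there are thousands of classes, so the loss has a large floor; with s = 30 or s = 64 the bound is numerically 1 and the loss can approach zero.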

3. Summary

Putting this together, the loss part of the training network for each of the three losses is as follows:

############# CosineFace/MobileFaceNet #############
layer {
  name: "norm1"
  type: "Normalize"
  bottom: "fc5"
  top: "norm1"
}
layer {
  name: "fc6_l2"
  type: "InnerProduct"
  bottom: "norm1"
  top: "fc6"
  param {
    lr_mult: 1
  }
  inner_product_param {
    num_output: 10575
    normalize: true
    weight_filler {
      type: "xavier"
    }
    bias_term: false
  }
}
layer {
  name: "label_specific_margin"
  type: "LabelSpecificAdd"
  bottom: "fc6"
  bottom: "label"
  top: "fc6_margin"
  label_specific_add_param {
    bias: -0.35
  }
}
layer {
  name: "fc6_margin_scale"
  type: "Scale"
  bottom: "fc6_margin"
  top: "fc6_margin_scale"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  scale_param {
    filler {
      type: "constant"
      value: 30
    }
  }
}
layer {
  name: "softmax_loss"
  type: "SoftmaxWithLoss"
  bottom: "fc6_margin_scale"
  bottom: "label"
  top: "softmax_loss"
  loss_weight: 1
}

############### Arc-Softmax Loss ##############
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "norm_fc5"
  top: "fc6"
  param {
    lr_mult: 1
  }
  inner_product_param {
    num_output: 10575
    normalize: true
    weight_filler {
      type: "xavier"
    }
    bias_term: false
  }
}
layer {
  name: "cosin_add_m"
  type: "CosinAddm"
  bottom: "fc6"
  bottom: "label"
  top: "fc6_margin"
  cosin_add_m_param {
    m: 0.5
  }
}
layer {
  name: "fc6_margin_scale"
  type: "Scale"
  bottom: "fc6_margin"
  top: "fc6_margin_scale"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  scale_param {
    filler {
      type: "constant"
      value: 64
    }
  }
}
layer {
  name: "softmax_loss"
  type: "SoftmaxWithLoss"
  bottom: "fc6_margin_scale"
  bottom: "label"
  top: "softmax_loss"
}

############### combined-margin Loss ##############
layer {
  name: "norm1"
  type: "Normalize"
  bottom: "fc5"
  top: "norm1"
}
layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "norm1"
  top: "fc2"
  param {
    lr_mult: 1
  }
  inner_product_param {
    num_output: 8631
    normalize: true
    weight_filler {
      type: "xavier"
    }
    bias_term: false
  }
}
layer {
  name: "combined_margin"
  type: "CombinedMargin"
  bottom: "fc2"
  bottom: "label"
  top: "fc2_margin"
  combined_margin_param {
    m1: 1
    m2: 0.3
    m3: 0.2
  }
}
layer {
  name: "fc2_margin_scale"
  type: "Scale"
  bottom: "fc2_margin"
  top: "fc2_margin_scale"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  scale_param {
    filler {
      type: "constant"
      value: 30
    }
  }
}
layer {
  name: "softmax_loss"
  type: "SoftmaxWithLoss"
  bottom: "fc2_margin_scale"
  bottom: "label"
  top: "softmax_loss"
  loss_weight: 1
}