Table of Contents
- 1. MXNet Finetune:
- 2. PyTorch Finetune:
1. MXNet Finetune:
Each layer has an lr_mult attribute, a learning-rate multiplier; setting it to a value other than 1 scales that layer's learning rate up or down.
Reference code:
_weight = mx.symbol.Variable("fc7_weight", shape=(args.num_classes, args.emb_size), lr_mult=1.0)
Example:
weight=mx.sym.var('fc1_voc12_c3_weight', lr_mult=0.1)
fc1_voc12_c3 = mx.symbol.Convolution(name='fc1_voc12_c3', data=conv_feat, weight=weight, kernel=(3, 3), no_bias=True, num_filter=num_classes, dilate=(24, 24))
lr_mult can be set both on a layer (e.g. the fully connected layer) and in the optimizer, but it only needs to be set in one of the two places.
Setting it in the optimizer: https://github.com/apache/incubator-mxnet/issues/3539
More examples: https://github.com/apache/incubator-mxnet/issues/3555
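A minimal optimizer-side sketch (assumes MXNet 1.x, where Optimizer.set_lr_mult is available; the parameter names below are placeholders matching the symbol example above):
import mxnet as mx
# map parameter names to learning-rate multipliers directly on the optimizer
opt = mx.optimizer.SGD(learning_rate=0.01, momentum=0.9, wd=1e-4)
opt.set_lr_mult({'fc1_voc12_c3_weight': 0.1, 'fc7_weight': 1.0})
# then hand the optimizer to the module, e.g. mod.init_optimizer(optimizer=opt)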
2. PyTorch Finetune:
Load the parameters of all layers:
checkpoint = torch.load(checkpoint_PATH)
# load all the parameters
model.load_state_dict(checkpoint['state_dict'])
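If the checkpoint was saved on GPU and is being loaded on a CPU-only machine, torch.load accepts a map_location argument; a small sketch (the file name is a placeholder):
checkpoint = torch.load('checkpoint.pth.tar', map_location=torch.device('cpu'))
model.load_state_dict(checkpoint['state_dict'])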
Load only the parameters of some layers:
# only load the params that exist in the pretrained model weights
pretrained_dict = checkpoint['state_dict']
model_dict = model.state_dict()
# 1. filter out unnecessary keys
pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict}
# 2. overwrite entries in the existing state dict
model_dict.update(pretrained_dict)
model.load_state_dict(model_dict)
print("=> loaded checkpoint '{}' (epoch {})".format(modelFile, checkpoint['epoch']))
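As an alternative sketch, when the overlapping keys already have matching shapes, load_state_dict with strict=False tolerates missing and unexpected keys, so the manual filtering above can often be skipped:
model.load_state_dict(checkpoint['state_dict'], strict=False)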
Freeze the layers whose parameters should stay fixed:
model_ft = models.resnet50(pretrained=True)  # automatically downloads the official pretrained model
# freeze all of the parameter layers
for param in model_ft.parameters():
    param.requires_grad = False
# print the fully connected layer's info
print(model_ft.fc)
num_fc_ftr = model_ft.fc.in_features  # get the input dimension of the fc layer
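To complete the finetune setup, a typical next step is to replace the fc head and build the optimizer over only the unfrozen parameters; a sketch (num_classes, the learning rate, and the loss are placeholders), which also defines the optimizer_ft and criterion used below:
import torch.nn as nn
import torch.optim as optim

num_classes = 10  # placeholder: set to your dataset's number of classes
model_ft.fc = nn.Linear(num_fc_ftr, num_classes)  # the new layer's params have requires_grad=True by default
# only pass the trainable (unfrozen) parameters to the optimizer
optimizer_ft = optim.SGD(filter(lambda p: p.requires_grad, model_ft.parameters()),
                         lr=0.001, momentum=0.9)
criterion = nn.CrossEntropyLoss()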
Set the learning rate decay schedule:
from torch.optim import lr_scheduler
# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler, num_epochs=25)
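Inside train_model the scheduler is typically advanced once per epoch after the optimizer updates; a minimal sketch of that pattern (the batch loop is elided):
for epoch in range(25):
    # ... run the training batches: forward, backward, optimizer_ft.step() ...
    exp_lr_scheduler.step()  # decay the LR according to the StepLR policy
    print('epoch', epoch, 'lr =', optimizer_ft.param_groups[0]['lr'])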
https://blog.csdn.net/qq_34914551/article/details/87699317
Set different learning rates for different layers:
# get fc params
fc_params = list(model.output_layer[3].parameters())  # materialize so the params can be reused below
ignored_params = list(map(id, fc_params))
# get not-fc params
base_params = filter(lambda p: id(p) not in ignored_params, model.parameters())
# set different learning rates for the two parameter groups
lrs = [conf['base_lr'], conf['fc_lr']]
print('Using LR', lrs)
params_list = [{'params': base_params, 'lr': lrs[0]}]
params_list.append({'params': fc_params, 'lr': lrs[1]})
optimizer = torch.optim.SGD(params_list, lrs[1], momentum=conf['momentum'], weight_decay=conf['weight_decay'])
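A quick sanity check (a sketch) to confirm each parameter group carries its intended learning rate:
for i, group in enumerate(optimizer.param_groups):
    print('group', i, 'lr =', group['lr'], 'num params =', len(group['params']))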