MXNet (2): Linear Regression, a Complete Walkthrough

0. Preparation

Dependencies:

python=3.6
mxnet-cu102==1.7.0
jupyter==1.0.0
plotly==4.10.0

Imports

import mxnet as mx
from mxnet import autograd, nd
import random
from plotly.offline import plot, init_notebook_mode
import plotly.graph_objs as go
init_notebook_mode(connected=False)

Using the GPU

ctx = mx.gpu() if (mx.context.num_gpus()) > 0 else mx.cpu()

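1. Data preparation

We simulate the equation z = 4.22a - 6.11b + 2.1. To mimic a real dataset, we add white noise drawn from a normal distribution with mean 0 and standard deviation 0.02.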

Ws_raw = [4.22, -6.11]
b_raw = 2.1
feature_num = len(Ws_raw)
sample_num = 1000

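Using the number of features and the number of samples, generate the feature data from a normal distribution with mean 0 and standard deviation 2

features = nd.random.normal(scale=2, shape=(sample_num, feature_num), ctx=ctx)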

Compute the labels from the formula, i.e. the value z in the equation

w1, w2 = Ws_raw
labels = w1*features[:, 0] + w2*features[:, 1] + b_raw + nd.random.normal(scale=0.02, shape=(sample_num,), ctx=ctx)

1.1 Visualizing the data

trace = go.Scatter3d(x=features[:, 0].asnumpy(), y=features[:, 1].asnumpy(),
                     z=labels.asnumpy(), mode='markers', marker=dict(size=2))
fig = go.Figure(data=[trace])
fig.show()

The 3D plot makes the relationship between the features and the labels easy to see

1.2 Splitting the data into batches

Write a function that returns the data one batch at a time

def data_iter(batch_size, features, labels):
    sample_num = len(features)
    indexes = list(range(sample_num))
    random.shuffle(indexes)  # visit the samples in random order
    for i in range(0, sample_num, batch_size):
        batch_indexes = nd.array(indexes[i: i + batch_size], ctx=ctx)
        yield features.take(batch_indexes), labels.take(batch_indexes)
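To make the batching behaviour concrete, here is a framework-free sketch of the same idea using only the Python standard library. The name batch_indices is hypothetical and is not part of the code in this post; it only shows that shuffled slices cover every sample exactly once and that the last batch is shorter when the sample count is not a multiple of the batch size.

```python
import random

def batch_indices(sample_num, batch_size, seed=None):
    """Yield shuffled index lists of at most batch_size elements (illustrative helper)."""
    rng = random.Random(seed)
    indexes = list(range(sample_num))
    rng.shuffle(indexes)
    for i in range(0, sample_num, batch_size):
        # Slicing past the end is safe, so the final batch may be shorter.
        yield indexes[i: i + batch_size]

batches = list(batch_indices(sample_num=25, batch_size=10, seed=0))
batch_sizes = [len(b) for b in batches]
seen = sorted(i for b in batches for i in b)
print(batch_sizes)  # [10, 10, 5]
```

In the post's own example 1000 samples divide evenly by a batch size of 10, so every batch is full.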

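2. Model preparation

2.1 Loss function

Define the loss function: it takes the predicted values and the true values and returns half of the squared error

def L2_loss(pre_y, y):
    # reshape the labels so they line up with the (batch, 1) predictions
    return (pre_y - y.reshape(pre_y.shape)) ** 2 / 2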
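A tiny worked example may help fix the scale of this loss. The sketch below is a plain-float version with a hypothetical name, squared_loss, and not the NDArray function above: a prediction of 3 against a label of 1 gives (3 - 1)^2 / 2 = 2, and a perfect prediction gives 0.

```python
def squared_loss(pred, label):
    """Half the squared error for a single prediction (plain-float version)."""
    return (pred - label) ** 2 / 2

example = squared_loss(3.0, 1.0)   # (3 - 1)^2 / 2
perfect = squared_loss(1.5, 1.5)
print(example, perfect)  # 2.0 0.0
```

The factor 1/2 is a convenience: it cancels the 2 that appears when the square is differentiated.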

2.2 Optimizer

Define the optimizer; lr is the learning rate

def sgd(Ws, lr, batch_size):
    for w in Ws:
        # in-place update with the batch-averaged gradient
        w[:] = w - w.grad / batch_size * lr
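The division by batch_size is there because, in MXNet's autograd, calling backward on a non-scalar loss sums the per-sample gradients, so dividing turns the sum into a mean. A one-parameter sketch in plain Python, assuming the toy model pred = w * x and the half squared error (sgd_step and the toy data are assumptions of this sketch, not part of the post's code):

```python
def sgd_step(w, grad_sum, lr, batch_size):
    """Return the updated parameter: w - lr * (summed gradient / batch size)."""
    return w - lr * grad_sum / batch_size

# Toy batch for the model pred = w * x with loss (pred - y)^2 / 2.
xs = [1.0, 2.0]
ys = [2.0, 4.0]          # generated by the true weight 2.0
w = 0.0
# d/dw of (w*x - y)^2 / 2 is (w*x - y) * x; sum it over the batch.
grad_sum = sum((w * x - y) * x for x, y in zip(xs, ys))
w_new = sgd_step(w, grad_sum, lr=0.1, batch_size=len(xs))
print(grad_sum, w_new)  # -10.0 0.5
```

Averaging keeps the effective step size independent of the batch size, which is why the same lr can be reused when batch_size changes.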

2.3 Defining the model

def LinearRegression(X, w, b):
    return nd.dot(X, w) + b

2.4 Creating the parameters and their gradients

Create the parameters and attach gradient buffers to them

Ws = nd.random.normal(scale=0.02, shape=(feature_num, 1), ctx=ctx)
b = nd.zeros(shape=(1,), ctx=ctx)
Ws.attach_grad()
b.attach_grad()

3. Training the model

Train for 5 epochs with a learning rate of 0.03

epochs = 5
lr = 0.03
model = LinearRegression
loss = L2_loss
optimizer = sgd
batch_size = 10

for epoch in range(epochs):
    for X, y in data_iter(batch_size, features, labels):
        with autograd.record():
            l = loss(model(X, Ws, b), y)
        l.backward()
        optimizer([Ws, b], lr, batch_size)
    train_loss = loss(model(features, Ws, b), labels).mean().asscalar()
    print("[epoch: {}] ---- loss: {:.6f} ------".format(epoch+1, train_loss))

The output is shown below.
The loss is essentially stable from the second epoch on

[epoch: 1] ---- loss: 0.005551 ------
[epoch: 2] ---- loss: 0.000226 ------
[epoch: 3] ---- loss: 0.000215 ------
[epoch: 4] ---- loss: 0.000217 ------
[epoch: 5] ---- loss: 0.000218 ------
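The plateau near 0.0002 is what the noise level predicts. With perfect parameters the residual is just the noise ε with standard deviation 0.02, and the expected value of ε²/2 is 0.02²/2 = 0.0002. A short standard-library check of that arithmetic, with a simulated estimate alongside (the seed and the sample count are arbitrary choices for this sketch):

```python
import random

sigma = 0.02
noise_floor = sigma ** 2 / 2          # expected value of eps^2 / 2

rng = random.Random(0)
samples = [rng.gauss(0.0, sigma) for _ in range(100000)]
empirical = sum(e * e / 2 for e in samples) / len(samples)
print(noise_floor, empirical)
```

Training for more epochs therefore cannot push the loss meaningfully below this floor; the remaining error is the noise in the labels.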

Checking the learned weights and bias shows they are almost identical to the values set earlier (4.22, -6.11 and 2.1)
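As a framework-free cross-check of this claim, the sketch below repeats the same recipe (1000 samples, batch size 10, learning rate 0.03, 5 epochs) in plain Python. It is not the MXNet code from this post: the seed, the variable names and the zero initialisation are assumptions of the sketch.

```python
import random

rng = random.Random(42)
true_w, true_b = [4.22, -6.11], 2.1

# Synthetic data: features ~ N(0, 2^2), labels from the linear formula plus N(0, 0.02^2) noise.
X = [[rng.gauss(0, 2), rng.gauss(0, 2)] for _ in range(1000)]
y = [true_w[0] * a + true_w[1] * b + true_b + rng.gauss(0, 0.02) for a, b in X]

w, bias = [0.0, 0.0], 0.0
lr, batch_size, epochs = 0.03, 10, 5

for epoch in range(epochs):
    order = list(range(len(X)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        batch = order[start: start + batch_size]
        g0 = g1 = gb = 0.0
        for i in batch:
            # Residual of the current linear model on sample i.
            err = w[0] * X[i][0] + w[1] * X[i][1] + bias - y[i]
            g0 += err * X[i][0]
            g1 += err * X[i][1]
            gb += err
        # Mean-gradient step, matching w - lr * grad / batch_size.
        w[0] -= lr * g0 / len(batch)
        w[1] -= lr * g1 / len(batch)
        bias -= lr * gb / len(batch)

print(w, bias)
```

The printed values should land close to 4.22, -6.11 and 2.1, differing only by a small amount caused by the label noise and the mini-batch updates.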

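4. Using Gluon

The example above shows the elements a deep learning model needs:

Dataset
Model
Loss function
Optimizer

As deep learning frameworks have matured, each of these elements can be handled easily through the framework's built-in APIs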

4.1 Data

Use mxnet.gluon.data to combine features and labels and split them into batches

from mxnet.gluon import data as gdata

batch_size = 10
dataset = gdata.ArrayDataset(features, labels)
data_iter = gdata.DataLoader(dataset, batch_size, shuffle=True)

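4.2 Model

Use mxnet.gluon.nn to build the model; "nn" is short for neural networks. A Sequential instance can be viewed as a container that chains layers together. As a single-layer neural network, linear regression connects every neuron in the output layer to every input, so its output layer is also called a fully connected layer. In Gluon a fully connected layer is a Dense instance; the PyTorch counterpart is Linear. Unlike Linear, Gluon does not require the input shape of each layer to be specified: the model infers the number of inputs of each layer automatically.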
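The idea of inferring the input size can be illustrated without Gluon. The toy class below, LazyDense, is hypothetical and only mirrors the concept: it postpones creating its weight matrix until it sees the first batch and reads the input width from it. Gluon's actual deferred initialization is more involved than this.

```python
import random

class LazyDense:
    """Toy fully connected layer that sizes its weights on the first call (illustration only)."""

    def __init__(self, units, sigma=0.02, seed=0):
        self.units = units
        self.sigma = sigma
        self.rng = random.Random(seed)
        self.weight = None            # created lazily, shape (in_units, units)
        self.bias = [0.0] * units

    def __call__(self, rows):
        if self.weight is None:
            in_units = len(rows[0])   # infer the input width from the data
            self.weight = [[self.rng.gauss(0.0, self.sigma) for _ in range(self.units)]
                           for _ in range(in_units)]
        # Each output is the dot product of a row with one weight column, plus the bias.
        return [[sum(x * self.weight[k][j] for k, x in enumerate(row)) + self.bias[j]
                 for j in range(self.units)]
                for row in rows]

layer = LazyDense(units=1)
out = layer([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
weight_shape = (len(layer.weight), len(layer.weight[0]))
print(weight_shape, len(out), len(out[0]))  # (2, 1) 3 1
```

Only the number of output units is given up front, just as nn.Dense(1) is given only the output size.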

from mxnet.gluon import nn
model = nn.Sequential()
model.add(nn.Dense(1))

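The parameters must be initialized before the model is run. Using mxnet.init, init.Normal(sigma=0.02) specifies that each weight element is sampled at initialization from a normal distribution with mean 0 and standard deviation 0.02; bias parameters are initialized to zero by default. This is equivalent to nd.random.normal(scale=0.02, shape=(feature_num, 1), ctx=ctx) above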

from mxnet import init
model.initialize(init.Normal(sigma=0.02), ctx=ctx)

4.3 Loss function

Get the loss function from mxnet.gluon.loss. As above we use the squared loss, which is the L2 loss

from mxnet.gluon import loss as gloss
loss = gloss.L2Loss()

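4.4 Optimizer

Define the optimization algorithm with mxnet.gluon.Trainer. Here model.collect_params() returns all parameters of model; gradient descent (sgd) is chosen as the optimization algorithm, with a learning rate of 0.03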

from mxnet.gluon import Trainer
trainer = Trainer(params=model.collect_params(), optimizer='SGD', optimizer_params={
    'learning_rate':0.03})

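4.5 Training

Train the model. The batch size is passed to step, which uses batch_size to average the gradient. When features and labels were combined into the dataset, labels was converted to a NumPy array, so when training on a GPU each batch of y must be copied to the GPU; otherwise an error is raised

epochs = 4

for epoch in range(epochs):
    for X, y in data_iter:
        y = y.as_in_context(ctx)  # move the labels to the same device as the model
        with autograd.record():
            l = loss(model(X), y)
        l.backward()
        trainer.step(batch_size)
    l = loss(model(features), labels).mean().asscalar()
    print("[epoch: {}] ---- loss: {:.6f} ------".format(epoch+1, l))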

4.6 Inspecting the weights

[Image: output showing the learned weights and bias]

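5. Saving and loading the model

5.1 Block

Saving and loading model parameters. A Block can save only the network parameters, not the network structure

Parameters of load_parameters:

allow_missing: when True, parameters that exist in the network but are missing from the parameter file are skipped
ignore_extra: when True, parameters that exist in the parameter file but not in the network are skipped
cast_dtype: when True, the dtype of each NDArray loaded from the checkpoint is cast to the dtype provided by the Parameter
dtype_source: must be 'current' or 'saved'; only valid when cast_dtype=True; specifies the source of the dtype used to cast the parameters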
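The first two flags are easiest to read as set comparisons between the parameter names in the network and those in the file. The sketch below imitates that logic with plain dictionaries. It is a toy, not MXNet's implementation: load_params, the parameter names and the choice of KeyError are all assumptions made for illustration.

```python
def load_params(net_params, file_params, allow_missing=False, ignore_extra=False):
    """Copy values from file_params into net_params, mimicking the two flags (toy version)."""
    missing = [name for name in net_params if name not in file_params]
    extra = [name for name in file_params if name not in net_params]
    if missing and not allow_missing:
        raise KeyError("missing from file: {}".format(missing))
    if extra and not ignore_extra:
        raise KeyError("not in network: {}".format(extra))
    for name in net_params:
        if name in file_params:
            net_params[name] = file_params[name]
    return net_params

# Hypothetical parameter names: the file lacks the bias and carries one unknown entry.
net = {"dense0_weight": None, "dense0_bias": None}
saved = {"dense0_weight": [4.22, -6.11], "old_layer_weight": [0.0]}

try:
    load_params(dict(net), saved)
    strict_failed = False
except KeyError:
    strict_failed = True

loaded = load_params(dict(net), saved, allow_missing=True, ignore_extra=True)
print(strict_failed, loaded)
```

With both flags left at False, as in the call in this section, any mismatch between the file and the network is treated as an error rather than silently skipped.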

model.save_parameters("linear_regression.params")
model.load_parameters("linear_regression.params", ctx=ctx, allow_missing=False,ignore_extra=False, cast_dtype=False, dtype_source='current')

5.2 HybridBlock

A HybridBlock can save both the network structure and the parameters: export them with export and load them back with SymbolBlock.imports

from mxnet.gluon import SymbolBlock

net = nn.HybridSequential()
net.add(nn.Dense(1))
net.initialize(init.Normal(sigma=0.02), ctx=ctx)
net.hybridize()
net(features)  # run one forward pass before exporting
net.export("net1", epoch=1)

net = SymbolBlock.imports(symbol_file='net1-symbol.json', input_names=['data'],
                          param_file='net1-0001.params', ctx=ctx)
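The file names passed to SymbolBlock.imports follow from the export call: the prefix net1 and epoch=1 show up as net1-symbol.json and net1-0001.params. A small sketch of that naming pattern, inferred from the two names in the example above (export_filenames is a hypothetical helper, not an MXNet function):

```python
def export_filenames(path, epoch):
    """Return the (symbol, params) file names matching the pattern seen above (hypothetical helper)."""
    return "{}-symbol.json".format(path), "{}-{:04d}.params".format(path, epoch)

symbol_file, param_file = export_filenames("net1", 1)
print(symbol_file, param_file)  # net1-symbol.json net1-0001.params
```

Keeping the prefix and epoch in variables and deriving both names from them avoids a mismatch between what export writes and what imports reads.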

6. References

https://mxnet.apache.org/api/python/docs/api/gluon/block.html

https://zh.d2l.ai/chapter_deep-learning-basics/linear-regression-gluon.html