Table of Contents
- Workflow
- DataSource
  - Transform
- DataSet & DataLoader
  - DataSet
  - DataLoader
- Build Model
- Save & Load Model
  - Saving Weights Only
  - Saving the Whole Model
- Exporting Model to ONNX
- Question
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda

training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)

train_dataloader = DataLoader(training_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)

class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
            nn.ReLU()
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork()

def train_loop(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    for batch, (X, y) in enumerate(dataloader):
        # Compute prediction and loss
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")

def test_loop(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    test_loss, correct = 0, 0

    with torch.no_grad():
        for X, y in dataloader:
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()

    test_loss /= size
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")

learning_rate = 1e-3
batch_size = 64
epochs = 5

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

epochs = 10
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train_loop(train_dataloader, model, loss_fn, optimizer)
    test_loop(test_dataloader, model, loss_fn)
print("Done!")
Workflow
- working with data
- creating models
- optimizing model parameters
- saving the trained models
DataSource
torchvision.datasets is the library of data sources that can be downloaded online: https://pytorch.org/vision/stable/datasets.html
from torchvision import datasets
from torchvision.transforms import ToTensor

# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)
- root: the path where the data is saved.
- train: specifies the training or test dataset.
- download: if True, the data is downloaded from the internet when it is not found locally; if it already exists, it is not downloaded again. If False, existing local data is used and an error is raised when nothing is found.
- transform: to modify the features.
- target_transform: to modify the labels.
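As a quick check (a small sketch, not part of the original notes), the downloaded dataset can be inspected directly:
print(len(training_data))     # 60000 training samples
print(len(test_data))         # 10000 test samples
print(training_data.classes)  # ['T-shirt/top', 'Trouser', 'Pullover', ...]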
Transform
For FashionMNIST:
- features: PIL Image format → normalized tensors, using ToTensor
- labels: integers → one-hot encoded tensors, using Lambda
from torchvision.transforms import ToTensor, Lambda

ds = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
    target_transform=Lambda(lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1))
)
Meaning:
- ToTensor: converts a PIL image or NumPy ndarray into a FloatTensor and scales the image's pixel intensity values into the range [0., 1.].
- Lambda: first creates a zero tensor of size 10 (the number of labels in our dataset), then calls scatter_, which assigns value=1 at the index given by the label y.
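For example (a minimal sketch, assuming the 10 FashionMNIST labels 0–9; the helper name to_one_hot is hypothetical), applying that Lambda expression to a single label produces the one-hot tensor:
import torch

to_one_hot = lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1)
print(to_one_hot(3))  # tensor([0., 0., 0., 1., 0., 0., 0., 0., 0., 0.])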
DataSet & DataLoader
DataSet
torch.utils.data.Dataset
The torchvision.datasets module contains Dataset objects.
Accessing a single sample:
img, label = training_data[sample_idx]
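Beyond the built-in datasets, a custom Dataset only needs __len__ and __getitem__. A minimal sketch (hypothetical example, not from the original notes):
import torch
from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, features, labels):
        self.features = features  # e.g. a tensor of shape (N, 28, 28)
        self.labels = labels      # e.g. a tensor of shape (N,)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]

ds = MyDataset(torch.rand(100, 28, 28), torch.randint(0, 10, (100,)))
img, label = ds[0]  # same access pattern as training_data[sample_idx]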
DataLoader
torch.utils.data.DataLoader
Purpose:
Pass the DataSet to the DataLoader; the DataLoader splits the DataSet into batches according to batch_size, and then yields one batch per iteration.
The original DataSet can be retrieved via test_dataloader.dataset.
batch_size = 100

# Create data loaders.
train_dataloader = DataLoader(training_data, batch_size=batch_size, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=batch_size, shuffle=True)

print(len(test_data))                # 10000
print(len(test_dataloader.dataset))  # 10000
print(len(test_dataloader))          # 100
- batch_size: the number of samples loaded per batch.
- shuffle: shuffles the data, which usually improves training.
- Each iteration returns batch_size features and the corresponding labels.
for X, y in test_dataloader:
    print(X, y)  # X are the features, y are the labels

# Must be batch, (X, y), not batch, X, y
for batch, (X, y) in enumerate(test_dataloader):
    print(batch, X, y)  # batch is the batch index, X the features, y the labels
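To see the batch shapes, one batch can be pulled out directly (a small sketch using the loaders defined above):
X, y = next(iter(test_dataloader))
print(X.shape)  # torch.Size([100, 1, 28, 28]) - batch_size images of 1x28x28
print(y.shape)  # torch.Size([100])            - one label per image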
Build Model
from torch import nn
import torch
import numpy as np
from torch import nn

# 3 images of size 28*28 as input
input_image = torch.rand(3,28,28)
print(input_image.size())  # torch.Size([3, 28, 28])

# Flatten the images
flatten = nn.Flatten()
flat_image = flatten(input_image)
print(flat_image.size())  # torch.Size([3, 784])

# 28*28 → 20
layer1 = nn.Linear(in_features=28*28, out_features=20)
hidden1 = layer1(flat_image)
print(hidden1.size())  # torch.Size([3, 20])

# Activation function
hidden1 = nn.ReLU()(hidden1)
# print(f"Before ReLU: {hidden1}\n\n", f"After ReLU: {hidden1}")

# Build the model
model = nn.Sequential(
    flatten,
    layer1,
    nn.ReLU(),
    nn.Linear(20, 10)
)

# Feed the input through the model
logits = model(input_image)
print(logits.size())  # torch.Size([3, 10])

softmax = nn.Softmax(dim=1)
pred_probab = softmax(logits)
print(pred_probab.size())  # torch.Size([3, 10])

print("Model structure: ", model, "\n\n")

for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \n")

''' Model structure:  Sequential(
  (0): Flatten(start_dim=1, end_dim=-1)
  (1): Linear(in_features=784, out_features=20, bias=True)
  (2): ReLU()
  (3): Linear(in_features=20, out_features=10, bias=True)
)

Layer: 1.weight | Size: torch.Size([20, 784]) | Values : tensor([[-0.0041,  0.0040,  0.0323,  ..., -0.0243, -0.0089, -0.0312],
        [ 0.0046, -0.0245,  0.0227,  ..., -0.0139, -0.0023,  0.0060]], grad_fn=<SliceBackward>)

Layer: 1.bias | Size: torch.Size([20]) | Values : tensor([-0.0293,  0.0076], grad_fn=<SliceBackward>)

Layer: 3.weight | Size: torch.Size([10, 20]) | Values : tensor([[ 0.1768,  0.1060,  0.1348,  0.0689,  0.0786,  0.0148, -0.1476, -0.0419,
          0.0726,  0.1158,  0.1353, -0.0281, -0.1502, -0.1685, -0.0727, -0.2090,
          0.0460,  0.0145, -0.0897,  0.1660],
        [ 0.1221,  0.1032,  0.1019,  0.1671,  0.1814,  0.0260,  0.2213,  0.1093,
          0.1125, -0.0259,  0.0538, -0.1434,  0.0762,  0.1919,  0.1000,  0.0774,
          0.1098,  0.1155,  0.0148, -0.1715]], grad_fn=<SliceBackward>)

Layer: 3.bias | Size: torch.Size([10]) | Values : tensor([0.1151, 0.0561], grad_fn=<SliceBackward>)
'''
- nn.Flatten(): flattens each sample into a 1-D vector
- nn.Linear(in_features=784, out_features=20, bias=True): fully connected (linear) layer
- nn.ReLU(): non-linear activation
- nn.Sequential(): an ordered container of modules
- nn.Softmax(dim=1): converts logits to probabilities along dimension 1
# Variant 1: Flatten as a module attribute
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
            nn.ReLU()
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

# Variant 2: Flatten applied inside forward
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
            nn.ReLU()
        )

    def forward(self, x):
        x = nn.Flatten()(x)
        logits = self.linear_relu_stack(x)
        return logits

# Variant 3: Flatten placed inside the Sequential
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
            nn.ReLU()
        )

    def forward(self, x):
        logits = self.linear_relu_stack(x)
        return logits
forward() must be overridden; when pred = model(X) is called, it internally invokes result = self.forward(*input, **kwargs).
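For instance (a minimal sketch using the model class defined above), calling the model directly runs forward():
model = NeuralNetwork()
X = torch.rand(1, 28, 28)
logits = model(X)        # goes through __call__, which calls forward(X)
pred = logits.argmax(1)  # index of the largest logit = predicted class
print(pred)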
Save & Load Model
Saving Weights Only
Save
import torchvision.models as models

torch.save(model.state_dict(), 'model_weights.pth')
Load
model.load_state_dict(torch.load('model_weights.pth'))
model.eval() # Failing to do this will yield inconsistent inference results
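Note that load_state_dict expects an already constructed model of the same architecture; a minimal sketch (assuming the NeuralNetwork class from above):
model = NeuralNetwork()  # recreate the architecture first; only the weights come from the file
model.load_state_dict(torch.load('model_weights.pth'))
model.eval()             # switch layers such as dropout/batchnorm to inference mode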
Saving the Whole Model
When loaded, both the structure and the weights are restored.
Save
torch.save(model, 'model.pth')
Load
model = torch.load('model.pth')
Note: the class definition must still be present in the code. This approach uses the Python pickle module when serializing the model, so it relies on the actual class definition being available when loading the model.
# Original
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
            nn.ReLU()
        )

    def forward(self, x):
        logits = self.linear_relu_stack(x)
        return logits

# Now: the class name must still be NeuralNetwork
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.linear_relu_stack = nn.Sequential()

    def forward(self, x):
        x = nn.Flatten()(x)  # the saved model does not keep the Flatten() from the Sequential, so apply it manually here
        logits = self.linear_relu_stack(x)
        return logits
# Original
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
            nn.ReLU()
        )

    def forward(self, x):
        x = nn.Flatten()(x)
        logits = self.linear_relu_stack(x)
        return logits

# Now: same as the "now" version in the example above
Exporting Model to ONNX
import torch.onnx as onnx

input_image = torch.zeros((1,3,224,224))
onnx.export(model, input_image, 'model.onnx')
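The dummy input must match the shape the model expects. If exporting the FashionMNIST model defined earlier instead, a sketch would look like this (the output file name is hypothetical):
dummy_input = torch.zeros((1, 28, 28))  # one 28x28 grayscale image
onnx.export(model, dummy_input, 'fashion_model.onnx')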
Question
- What is 'ReLU'?
- Is it necessary to use a class that inherits from nn.Module?
- What is "forward" with Flatten? Does it only appear at the input layer, or is it used in every layer's forward pass?
- What is a leaf of the computational graph?
- The parts about gradients
- artificial neural network, activation functions, backpropagation, convolutional neural networks (CNNs), data augmentation, transfer learning