Weekly Deep Learning Study Summary R1 (RNN: Heart Disease Prediction)
- 🍨 This post is my study log for the 🔗365-day deep learning training camp; the original lesson is numbered R3, and I renamed it R1 to keep my own notes organized.
- 🍖 Original author: K同学啊 | tutoring and custom projects available
Contents
- 0. Summary
- 1. Introduction to RNNs
  - a. What is an RNN?
    - Typical application scenarios of RNNs
  - b. Basic structure of the traditional RNN
    - Key characteristics
  - c. Strengths and limitations of RNNs
    - Strengths
    - Limitations and improvements
  - d. Common RNN variants: LSTM and GRU
    - LSTM (Long Short-Term Memory)
    - GRU (Gated Recurrent Unit)
  - e. Application examples of RNNs
  - f. Implementing RNNs in PyTorch
  - g. How to go further with RNNs?
  - h. Summary
- 2. Data Import
- 3. Data Preprocessing
- 4. Building the RNN Model
- 5. Initializing the Model and Optimizer
- 6. Training Function
- 7. Test Function
- 8. Training Loop
- 9. Visualizing the Results
0. Summary
Data import and preprocessing: in PyTorch we usually convert NumPy arrays to torch.Tensor, wrap them in a TensorDataset or a custom Dataset, and then load them in batches with a DataLoader.
Model construction: an RNN.
Hyperparameter setup: before training, define the loss function, the learning rate (optionally a dynamic learning-rate schedule), and an optimizer built from that learning rate (e.g. SGD), which updates the parameters during training to minimize the loss.
Training function: it takes four arguments: the prepared DataLoader, the model, the loss function, and the optimizer. Inside, the loss and accuracy are initialized to 0; then we loop over the DataLoader, fetch a batch, feed it through the model to get predictions, and compute the loss with the loss function. Next come backpropagation and the optimizer step; zeroing the gradients can go either before the backward pass or after the optimizer step, and by convention it is usually placed before the backward pass.
Test function: compared with the training function it drops the optimizer, so it only takes the DataLoader, the model, and the loss function. Apart from skipping gradient zeroing, backpropagation, and the optimizer step when processing each batch, everything else matches the training function.
Training loop: define the number of epochs (each epoch is one pass over the whole dataset) and initialize four empty lists to store the training and test accuracy and loss. Call model.train() to switch to training mode and run the training function to get its accuracy and loss; call model.eval() to switch to evaluation mode and run the test function to get its accuracy and loss. Finally, append these values to the corresponding lists and print them together, giving the overall accuracy and loss after each epoch.
Result visualization.
Saving, loading, and using the model: in PyTorch we usually save the parameters with torch.save(model.state_dict(), 'model.pth') and load them with model.load_state_dict(torch.load('model.pth')) (see the sketch after this list).
Things to improve: make sure the model and the data live on the same device (all on GPU or all on CPU); do not leave num_classes at the default 1000, set it according to the actual dataset, and remember to pass it when instantiating the model; also note that test inputs should use the shape (3, 224, 224), where 3 is the channel dimension, which differs from TensorFlow's (224, 224, 3) ordering, something to watch out for when porting code.
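A minimal sketch of the save/load pattern mentioned above, assuming the SimpleRNNModel defined later in this post (the file name model.pth is just an example):

# Save only the parameters (state_dict), not the whole model object
torch.save(model.state_dict(), 'model.pth')

# Recreate the architecture, then load the saved parameters
model = SimpleRNNModel().to(device)
model.load_state_dict(torch.load('model.pth', map_location=device))
model.eval()  # switch to evaluation mode before inference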
1. Introduction to RNNs
Below is a step-by-step, fairly plain-language introduction to RNNs (Recurrent Neural Networks), meant to help you understand what they are and where they are used.
a. What is an RNN?
RNN stands for Recurrent Neural Network. It is a family of neural networks designed specifically for sequential data; the biggest differences from feed-forward networks (fully connected MLPs, convolutional CNNs, etc.) are:
- Sequentiality: an RNN passes information between time steps of a sequence, giving it a form of "memory" of earlier inputs.
- Recurrent structure: at every time step, the network updates its hidden state based on the current input and the previous hidden state, and then produces an output.
Typical application scenarios of RNNs
- Natural language processing (NLP): sentiment analysis, text classification, machine translation, text generation, etc.
- Time-series forecasting: stock prices, temperature, signal processing, etc.
- Speech recognition or synthesis: processing audio sequences.
b. Basic structure of the traditional RNN
Below is a sketch of the most basic (classic) RNN structure:
┌────────┐    ┌────────┐    ┌────────┐
│ x(t-1) │    │  x(t)  │    │ x(t+1) │      ← input sequence
└───┬────┘    └───┬────┘    └───┬────┘
    │             │             │
┌───▼─────────────▼─────────────▼─────────────────────────┐
│                RNN cell (recurrent unit)                 │
│                                                          │
│  x(t), h(t-1) → linear op → activation f (e.g. tanh)     │
│                            → h(t) → activation g → y(t)  │
└──────────────────────────────────────────────────────────┘
                         ↑
                 passed through time
                 (hidden state h)
- Input sequence: x(1), x(2), …, x(T)
- Hidden state: h(t), the network's internal memory at time step t.
- Update rule (the simple, classic RNN form):

$$h(t) = \sigma\big(W_{hh} \cdot h(t-1) + W_{xh} \cdot x(t) + b_h\big)$$

where $\sigma$ is usually a nonlinear activation function such as $\tanh$ or ReLU.
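To make the update rule concrete, here is a minimal sketch (with toy dimensions chosen purely for illustration) of one recurrent step computed by hand in PyTorch:

import torch

input_size, hidden_size = 1, 4            # toy dimensions for illustration
W_xh = torch.randn(hidden_size, input_size)
W_hh = torch.randn(hidden_size, hidden_size)
b_h  = torch.zeros(hidden_size)

x_t    = torch.randn(input_size)          # input at time step t
h_prev = torch.zeros(hidden_size)         # hidden state from time step t-1

# h(t) = tanh(W_hh · h(t-1) + W_xh · x(t) + b_h)
h_t = torch.tanh(W_hh @ h_prev + W_xh @ x_t + b_h)
print(h_t.shape)                          # torch.Size([4])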
Key characteristics
- Recurrence
  - The RNN repeatedly feeds the previous hidden state h(t-1) back into the network, combining it with the current input x(t) to compute the new hidden state h(t); it is therefore "unrolled" along the time axis.
- Parameter sharing
  - The same set of weights (W_hh, W_xh, etc.) is used at every time step. This differs from an ordinary multilayer perceptron (MLP), where each layer has its own weights.
- Sequence modeling
  - Thanks to the hidden-state updates, an RNN can "remember" earlier inputs to some extent, making it suitable for tasks that depend on context or temporal order (e.g. language modeling, where each word is closely tied to the preceding words).
c. Strengths and limitations of RNNs
Strengths
- Suited to sequential data: compared with plain fully connected networks, RNNs handle variable-length sequence inputs better and capture temporal dependencies in the data.
- Parameter sharing: keeps the parameter count small and prevents the model from blowing up.
Limitations and improvements
- Long-term dependency problem: in the classic RNN, as sequences get longer, information from early inputs often fails to reach later time steps, leading to vanishing or exploding gradients.
- Training efficiency: because training requires unrolling the sequence and back-propagating through time (BPTT: Back Propagation Through Time), it is usually slower than highly parallel convolutional networks.
- Improved models:
  - LSTM (Long Short-Term Memory)
  - GRU (Gated Recurrent Unit)
  Both use gating mechanisms (forget gate, input gate, output gate, etc.) to alleviate or partially solve the long-term dependency problem, and are widely used in practice.
d. Common RNN variants: LSTM and GRU
Because the traditional RNN easily forgets early information when modeling long sequences, recurrent architectures with "gating" mechanisms were proposed; the most typical ones are LSTM and GRU. A short usage sketch follows the two descriptions below.
LSTM (Long Short-Term Memory)
- Uses a memory cell (cell state) and gating mechanisms (input gate, forget gate, output gate) to control the flow of information and preserve gradients over long ranges, mitigating the vanishing-gradient problem.
- On many NLP tasks, LSTMs mostly outperform the traditional RNN.
GRU (Gated Recurrent Unit)
- Structurally simpler than the LSTM, with only an update gate and a reset gate, yet it still retains a reasonable ability to model long-term dependencies.
- On some tasks, GRUs perform on par with LSTMs while training faster.
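A minimal sketch of swapping the recurrent layer between nn.RNN, nn.LSTM, and nn.GRU in PyTorch (the sizes mirror this post's data: 13 time steps, 1 feature; the hidden size of 200 matches the model built later):

import torch
import torch.nn as nn

x = torch.randn(32, 13, 1)                 # (batch_size, seq_len, input_size)

rnn  = nn.RNN(input_size=1, hidden_size=200, batch_first=True)
lstm = nn.LSTM(input_size=1, hidden_size=200, batch_first=True)
gru  = nn.GRU(input_size=1, hidden_size=200, batch_first=True)

out_rnn,  h_n        = rnn(x)              # h_n: final hidden state
out_lstm, (h_n, c_n) = lstm(x)             # the LSTM additionally returns the cell state c_n
out_gru,  h_n        = gru(x)

print(out_rnn.shape, out_lstm.shape, out_gru.shape)  # all: torch.Size([32, 13, 200])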
e. Application examples of RNNs
- Language modeling
  - Given the preceding words, predict the next word; or given a passage, generate the text that follows.
  - Early machine translation systems worked this way: the input sequence is the source-language words and the output sequence is the translated target-language words.
  - Nowadays Transformers, built on self-attention, are used far more, but the RNN remains an important foundational concept.
- Sequence classification
  - Classify a piece of text or speech, e.g. sentiment analysis (positive/negative) or speech recognition (which sentence was spoken).
- Time-series forecasting
  - For example stock prices, traffic, or weather: use data from the past several time steps to predict future values.
f. Implementing RNNs in PyTorch
In PyTorch, the most common recurrent layers are:
- nn.RNN: the classic RNN, with a choice of tanh or ReLU activation.
- nn.LSTM: the LSTM architecture.
- nn.GRU: the GRU architecture.
The input usually needs the shape (batch_size, seq_len, input_size) (when batch_first=True).
You choose what to do with the output:
- if you only need the last time step, take output[:, -1, :];
- if you need all time steps (e.g. when generating a sequence), use output directly;
- during training, remember to pass on or reset the hidden state (and the cell state) correctly.
A shape sanity check follows below.
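A minimal sketch, assuming batch_first=True, of the two output choices described above (for a single-layer, unidirectional RNN, the last time step of output equals the final hidden state):

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=1, hidden_size=200, batch_first=True)
x = torch.randn(32, 13, 1)                # (batch_size, seq_len, input_size)

output, h_n = rnn(x)                      # output: every time step; h_n: final hidden state
print(output.shape)                       # torch.Size([32, 13, 200]) -> all time steps
print(output[:, -1, :].shape)             # torch.Size([32, 200])     -> last time step only
print(h_n.shape)                          # torch.Size([1, 32, 200])  -> (num_layers, batch, hidden)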
g. How to go further with RNNs?
- Start with small examples:
  - Use an RNN on simple sequence-learning tasks (e.g. sine-wave prediction, a small character-level language model) and watch how the network iterates over time.
- Read papers and tutorials:
  - the original LSTM paper (Hochreiter & Schmidhuber, 1997)
  - GRU (Cho et al., 2014)
  - Dig into the gating mechanisms to understand why they help an RNN remember and forget information.
- Compare with Transformers:
  - Transformers now dominate most NLP tasks, but RNN ideas still underpin much research; understanding RNNs helps explain why attention works so well.
- Go into the framework implementation:
  - Read the source code or official documentation of nn.RNN, nn.LSTM, and nn.GRU in PyTorch to learn what the parameters mean and how the forward and backward computations work.
h. Summary
- Core idea: an RNN "recurrently" carries past information into the present, allowing it to capture dependencies in sequential data to some degree.
- Problems of the traditional RNN: gradients easily vanish or explode, making long-range dependencies hard to capture.
- Common improvements: gated architectures such as LSTM and GRU mitigate the long-term dependency problem and have become the workhorses of the RNN family.
- Current trend: fields like NLP now mostly use Transformers, but RNNs remain usable when sequences are not too long, and they are very helpful for beginners to understand a neural network's "memory".
If you are just starting out, you can:
- Debug hands-on: write small RNN programs, train them on simple sequence data, and watch how the loss and hidden states evolve.
- Draw diagrams: sketching the RNN unrolled over time on paper helps you understand how backpropagation flows.
- Sort tasks by model: know which tasks call for LSTM/GRU and which need a CNN or Transformer, along with each model's strengths and limitations.
Dataset description:
- age: (1) age
- sex: (2) sex
- cp: (3) chest pain type (4 values)
- trestbps: (4) resting blood pressure
- chol: (5) serum cholesterol (mg/dl)
- fbs: (6) fasting blood sugar > 120 mg/dl
- restecg: (7) resting electrocardiographic results (values 0, 1, 2)
- thalach: (8) maximum heart rate achieved
- exang: (9) exercise-induced angina
- oldpeak: (10) ST depression induced by exercise relative to rest
- slope: (11) slope of the peak exercise ST segment
- ca: (12) number of major vessels (0-3) colored by fluoroscopy
- thal: (13) 0 = normal; 1 = fixed defect; 2 = reversible defect
- target: (14) 0 = lower chance of a heart attack, 1 = higher chance of a heart attack
2. Data Import
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import copy
df = pd.read_csv("./data/heart.csv")
df
| | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 63 | 1 | 3 | 145 | 233 | 1 | 0 | 150 | 0 | 2.3 | 0 | 0 | 1 | 1 |
| 1 | 37 | 1 | 2 | 130 | 250 | 0 | 1 | 187 | 0 | 3.5 | 0 | 0 | 2 | 1 |
| 2 | 41 | 0 | 1 | 130 | 204 | 0 | 0 | 172 | 0 | 1.4 | 2 | 0 | 2 | 1 |
| 3 | 56 | 1 | 1 | 120 | 236 | 0 | 1 | 178 | 0 | 0.8 | 2 | 0 | 2 | 1 |
| 4 | 57 | 0 | 0 | 120 | 354 | 0 | 1 | 163 | 1 | 0.6 | 2 | 0 | 2 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 298 | 57 | 0 | 0 | 140 | 241 | 0 | 1 | 123 | 1 | 0.2 | 1 | 0 | 3 | 0 |
| 299 | 45 | 1 | 3 | 110 | 264 | 0 | 1 | 132 | 0 | 1.2 | 1 | 0 | 3 | 0 |
| 300 | 68 | 1 | 0 | 144 | 193 | 1 | 1 | 141 | 0 | 3.4 | 1 | 2 | 3 | 0 |
| 301 | 57 | 1 | 0 | 130 | 131 | 0 | 1 | 115 | 1 | 1.2 | 1 | 1 | 3 | 0 |
| 302 | 57 | 0 | 1 | 130 | 236 | 0 | 0 | 174 | 0 | 0.0 | 1 | 1 | 2 | 0 |
303 rows × 14 columns
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device
device(type='cuda')
# Check for missing values
df.isnull().sum()
age 0
sex 0
cp 0
trestbps 0
chol 0
fbs 0
restecg 0
thalach 0
exang 0
oldpeak 0
slope 0
ca 0
thal 0
target 0
dtype: int64
3. Data Preprocessing
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
X = df.iloc[:,:-1]
y = df.iloc[:,-1]
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size = 0.1,random_state = 1)
X_train.shape,y_train.shape
((272, 13), (272,))
# Standardize each feature column to a standard normal distribution; note that standardization is applied column by column
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
X_train = X_train.reshape(X_train.shape[0],X_train.shape[1],1)
X_test = X_test.reshape(X_test.shape[0],X_test.shape[1],1)
The code above does two main things:
- standardize the data with StandardScaler
- reshape the data to three dimensions

1. Standardization (StandardScaler)
- StandardScaler first computes the mean and standard deviation (std) of each feature on the training set (X_train) and transforms the data with

$$X_{\text{scaled}} = \frac{X - \mu}{\sigma}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation.
- sc.fit_transform(X_train)
  - fit(): computes the mean and std of each column (feature) of X_train and stores them inside the sc object.
  - transform(): subtracts each column's mean from that feature and divides by the column's std, producing the standardized data.
- sc.transform(X_test)
  - Applies the same mean and std to the test set (they are not recomputed, which avoids data leakage), so the training and test sets are standardized in exactly the same way.

2. Reshaping
- After standardization, X_train and X_test are 2-D arrays of shape (num_samples, num_features).
- X_train.shape[0] is the number of training samples and X_train.shape[1] is the number of features.
- X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1) turns the data from (num_samples, num_features) into (num_samples, num_features, 1).

Why do this?
- Some deep learning models (e.g. 1D-CNNs, RNN/LSTM) expect 3-D input: (batch size, time steps / features, channels).
- For single-channel data (like a grayscale image in image processing), a channel dimension of size 1 is appended as the last axis.

Afterwards, X_train and X_test are 3-D arrays, ready to be fed to convolutional or sequence models in deep learning frameworks such as Keras, TensorFlow, or PyTorch. A small sanity-check sketch follows.
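A minimal sketch, using a tiny made-up array, of the fit_transform/transform split and the reshape described above:

import numpy as np
from sklearn.preprocessing import StandardScaler

X_tr = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])   # toy training data (3 samples, 2 features)
X_te = np.array([[2.0, 25.0]])                              # toy test data

sc = StandardScaler()
X_tr_s = sc.fit_transform(X_tr)     # statistics are computed on the training data only
X_te_s = sc.transform(X_te)         # the same statistics are reused for the test data

print(sc.mean_)                     # per-column means: [ 2. 20.]
X_tr_s = X_tr_s.reshape(X_tr_s.shape[0], X_tr_s.shape[1], 1)
print(X_tr_s.shape)                 # (3, 2, 1) -> (samples, features, channel)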
# Create the PyTorch Dataset and DataLoader
"""
In PyTorch, we usually convert the NumPy arrays to torch.Tensor first,
wrap them in a TensorDataset or a custom Dataset, and then load them in batches with a DataLoader.
"""
import torch
from torch.utils.data import Dataset, DataLoader, TensorDataset
# For binary classification with Sigmoid + nn.BCELoss, the labels can be float32.
# For multi-class classification (e.g. softmax + CrossEntropy), the labels must be converted to long.
y_train = y_train.astype(np.float32)  # binary classification: float32
y_test = y_test.astype(np.float32)    # binary classification: float32
# Convert to tensors
X_train_tensor = torch.from_numpy(X_train).float()     # shape: [samples, 13, 1]
y_train_tensor = torch.from_numpy(y_train.to_numpy())  # shape: [samples]
X_test_tensor = torch.from_numpy(X_test).float()
y_test_tensor = torch.from_numpy(y_test.to_numpy())
# If training later uses a pred > 0.5 decision, keeping y with shape [samples] is fine.
# You could also reshape([-1, 1]) to match the network's output size, but that is not required.
# y_train_tensor = y_train_tensor.view(-1,1)
# y_test_tensor = y_test_tensor.view(-1,1)
# Wrap directly in a TensorDataset
train_dataset = TensorDataset(X_train_tensor, y_train_tensor)
test_dataset = TensorDataset(X_test_tensor, y_test_tensor)
# Create the DataLoaders
batch_size = 32
train_dl = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_dl = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
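A quick sanity-check sketch that prints the shape of one batch produced by the DataLoader above:

# Grab a single batch and confirm the shapes expected by the RNN
X_batch, y_batch = next(iter(train_dl))
print(X_batch.shape, y_batch.shape)   # expected: torch.Size([32, 13, 1]) torch.Size([32])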
4. Building the RNN Model
# -----------------------------
# 1. Define the model architecture
# -----------------------------
class SimpleRNNModel(nn.Module):
    def __init__(self):
        super(SimpleRNNModel, self).__init__()
        # In TensorFlow, input_shape=(13, 1), i.e. sequence length seq_len = 13 and feature dimension input_dim = 1.
        # With batch_first=True, the PyTorch RNN layer expects:
        #   input tensor shape:  (batch_size, seq_len, input_dim)
        #   output tensor shape: (batch_size, seq_len, hidden_size)
        self.rnn = nn.RNN(
            input_size=1,        # corresponds to TF's input_dim=1
            hidden_size=200,     # corresponds to TF's RNN(200)
            batch_first=True,
            nonlinearity='relu'  # corresponds to TF's activation='relu'
        )
        self.fc1 = nn.Linear(200, 100)  # corresponds to Dense(100, activation='relu')
        self.fc2 = nn.Linear(100, 1)    # corresponds to Dense(1, activation='sigmoid')
        self.sigmoid = nn.Sigmoid()
    def forward(self, x):
        # x: [batch_size, 13, 1]
        # The RNN returns: output, hidden
        #   output shape = [batch_size, seq_len, hidden_size]
        #   hidden shape = [num_layers, batch_size, hidden_size]
        out, hidden = self.rnn(x)
        # Take the output of the last time step, matching the default behavior of TensorFlow's SimpleRNN
        out = out[:, -1, :]                # shape: [batch_size, hidden_size]
        # Equivalent to Dense(100, relu)
        out = F.relu(self.fc1(out))        # [batch_size, 100]
        # Equivalent to Dense(1, sigmoid)
        out = self.sigmoid(self.fc2(out))  # [batch_size, 1]
        return out
Note:
If you want multi-class classification (say with N classes), change the last layer:

self.fc2 = nn.Linear(100, N)
# for multi-class output, return raw logits (for nn.CrossEntropyLoss),
# or return F.log_softmax(self.fc2(out), dim=1) and use nn.NLLLoss()

and change the loss function accordingly (nn.CrossEntropyLoss() when returning raw logits); accuracy is then reasonably computed with pred.argmax(1). A fuller sketch of this variant follows.
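A minimal sketch of the multi-class variant (MultiClassRNNModel and num_classes are illustrative names, mirroring the binary model above):

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiClassRNNModel(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.rnn = nn.RNN(input_size=1, hidden_size=200, batch_first=True, nonlinearity='relu')
        self.fc1 = nn.Linear(200, 100)
        self.fc2 = nn.Linear(100, num_classes)   # N output logits instead of a single sigmoid unit

    def forward(self, x):
        out, _ = self.rnn(x)
        out = F.relu(self.fc1(out[:, -1, :]))
        return self.fc2(out)                     # raw logits, suitable for nn.CrossEntropyLoss

# Usage sketch: labels must be integer class indices (dtype long)
# criterion = nn.CrossEntropyLoss()
# logits = model(X)                  # [batch_size, num_classes]
# loss = criterion(logits, y.long())
# pred = logits.argmax(1)            # predicted class per sample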
5. Initializing the Model and Optimizer
# -----------------------------
# 2. Initialize the model and optimizer
# -----------------------------
model = SimpleRNNModel().to(device)
print(model)
# Matches TF's loss='binary_crossentropy'; in PyTorch use BCE: nn.BCELoss
loss_fn = nn.BCELoss()
# For multi-class problems use nn.CrossEntropyLoss()
# criterion = nn.CrossEntropyLoss()
learn_rate = 1e-4
# learn_rate = 3e-4
lambda1 = lambda epoch: (0.92 ** (epoch // 2))
optimizer = torch.optim.Adam(model.parameters(), lr=learn_rate)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda1)  # choose the LR-adjustment scheme
SimpleRNNModel(
(rnn): RNN(1, 200, batch_first=True)
(fc1): Linear(in_features=200, out_features=100, bias=True)
(fc2): Linear(in_features=100, out_features=1, bias=True)
(sigmoid): Sigmoid()
)
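The LambdaLR factor above multiplies the base learning rate by 0.92 every two epochs. A small sketch of how the effective learning rate decays (which epoch each value shows up at in the training log depends on when scheduler.step() is called):

# Effective learning rate under the schedule: lr(epoch) = learn_rate * 0.92 ** (epoch // 2)
for epoch in range(6):
    print(epoch, f"{learn_rate * 0.92 ** (epoch // 2):.2E}")
# 0 1.00E-04
# 1 1.00E-04
# 2 9.20E-05
# 3 9.20E-05
# 4 8.46E-05
# 5 8.46E-05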
6. Training Function
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)   # size of the training set
    num_batches = len(dataloader)    # number of batches
    train_loss, train_acc = 0, 0
    for X, y in dataloader:
        X, y = X.to(device), y.to(device)
        # Compute the predictions
        pred = model(X).view(-1)     # [batch_size]
        loss = loss_fn(pred, y)
        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Accumulate accuracy and loss
        # Case 1: multi-class (N > 1): pred.shape = [batch_size, N], so use argmax(1).
        # train_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
        # Case 2: binary classification with a single sigmoid output: pred.shape = [batch_size, 1],
        # so threshold at 0.5 to convert to 0/1 before comparing:
        pred_label = (pred > 0.5).long()   # [batch_size]
        train_acc += (pred_label == y.long()).sum().item()
        train_loss += loss.item()
    train_acc /= size
    train_loss /= num_batches
    return train_acc, train_loss
7. Test Function
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    test_acc, test_loss = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            # Compute the predictions
            pred = model(X).view(-1)   # [batch_size]
            loss = loss_fn(pred, y)
            # Case 1: multi-class (N > 1):
            # test_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
            # Case 2: binary classification with a single output:
            pred_label = (pred > 0.5).long()   # [batch_size]
            # test_acc += (pred_label.view(-1) == y).type(torch.float).sum().item()
            test_acc += (pred_label == y.long()).sum().item()
            test_loss += loss.item()
    test_acc /= size
    test_loss /= num_batches
    return test_acc, test_loss
8. Training Loop
# -----------------------------
# Print information about available GPUs
# -----------------------------
if torch.cuda.is_available():
for i in range(torch.cuda.device_count()):
print(f"GPU {i}: {torch.cuda.get_device_name(i)}")
print(f"Initial Memory Allocated: {torch.cuda.memory_allocated(i)/1024**2:.2f} MB")
print(f"Initial Memory Reserved: {torch.cuda.memory_reserved(i)/1024**2:.2f} MB")
else:
print("No GPU available. Using CPU.")
# -----------------------------
# Main training loop
# -----------------------------
epochs = 60
train_acc_list = []
train_loss_list = []
test_acc_list = []
test_loss_list = []
best_acc = 0.0
best_model = None
for epoch in range(epochs):
    # Update the learning rate -- used with a hand-written learning-rate schedule
    # adjust_learning_rate(optimizer, epoch, learn_rate)
    # Switch to training mode
model.train()
epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, optimizer)
    # Update the learning rate
    scheduler.step()  # used when relying on the built-in dynamic learning-rate scheduler
    # Switch to evaluation mode
model.eval()
epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)
    # Save the best model
if epoch_test_acc > best_acc:
best_acc = epoch_test_acc
best_model = copy.deepcopy(model)
train_acc_list.append(epoch_train_acc)
train_loss_list.append(epoch_train_loss)
test_acc_list.append(epoch_test_acc)
test_loss_list.append(epoch_test_loss)
    # Current learning rate
lr = optimizer.state_dict()['param_groups'][0]['lr']
template = (
'Epoch:{:2d}, '
'Train_acc:{:.1f}%, Train_loss:{:.3f}, '
'Test_acc:{:.1f}%, Test_loss:{:.3f}, '
'Lr:{:.2E}'
)
print(template.format(
epoch+1,
epoch_train_acc*100, epoch_train_loss,
epoch_test_acc*100, epoch_test_loss,
lr
))
    # Monitor GPU usage in real time
if torch.cuda.is_available():
for i in range(torch.cuda.device_count()):
print(f"GPU {i} Usage:")
print(f" Memory Allocated: {torch.cuda.memory_allocated(i)/1024**2:.2f} MB")
print(f" Memory Reserved: {torch.cuda.memory_reserved(i)/1024**2:.2f} MB")
print(f" Max Memory Allocated: {torch.cuda.max_memory_allocated(i)/1024**2:.2f} MB")
print(f" Max Memory Reserved: {torch.cuda.max_memory_reserved(i)/1024**2:.2f} MB")
print('Done. Best test acc: ', best_acc)
GPU 0: NVIDIA GeForce RTX 4070 Laptop GPU
Initial Memory Allocated: 17.19 MB
Initial Memory Reserved: 34.00 MB
Epoch: 1, Train_acc:55.1%, Train_loss:0.689, Test_acc:48.4%, Test_loss:0.690, Lr:1.00E-04
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.26 MB
Max Memory Reserved: 36.00 MB
Epoch: 2, Train_acc:55.1%, Train_loss:0.685, Test_acc:48.4%, Test_loss:0.686, Lr:9.20E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch: 3, Train_acc:55.1%, Train_loss:0.683, Test_acc:48.4%, Test_loss:0.682, Lr:9.20E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch: 4, Train_acc:55.1%, Train_loss:0.681, Test_acc:48.4%, Test_loss:0.678, Lr:8.46E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch: 5, Train_acc:55.1%, Train_loss:0.678, Test_acc:48.4%, Test_loss:0.674, Lr:8.46E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch: 6, Train_acc:55.1%, Train_loss:0.675, Test_acc:51.6%, Test_loss:0.669, Lr:7.79E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch: 7, Train_acc:55.9%, Train_loss:0.671, Test_acc:64.5%, Test_loss:0.664, Lr:7.79E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch: 8, Train_acc:59.2%, Train_loss:0.669, Test_acc:74.2%, Test_loss:0.659, Lr:7.16E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch: 9, Train_acc:60.7%, Train_loss:0.666, Test_acc:77.4%, Test_loss:0.653, Lr:7.16E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:10, Train_acc:62.5%, Train_loss:0.661, Test_acc:77.4%, Test_loss:0.646, Lr:6.59E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:11, Train_acc:64.0%, Train_loss:0.655, Test_acc:77.4%, Test_loss:0.638, Lr:6.59E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:12, Train_acc:67.6%, Train_loss:0.651, Test_acc:80.6%, Test_loss:0.628, Lr:6.06E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:13, Train_acc:69.5%, Train_loss:0.641, Test_acc:77.4%, Test_loss:0.617, Lr:6.06E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:14, Train_acc:71.0%, Train_loss:0.631, Test_acc:77.4%, Test_loss:0.602, Lr:5.58E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:15, Train_acc:71.3%, Train_loss:0.620, Test_acc:80.6%, Test_loss:0.585, Lr:5.58E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:16, Train_acc:71.3%, Train_loss:0.611, Test_acc:77.4%, Test_loss:0.561, Lr:5.13E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:17, Train_acc:73.2%, Train_loss:0.593, Test_acc:80.6%, Test_loss:0.535, Lr:5.13E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:18, Train_acc:77.2%, Train_loss:0.572, Test_acc:83.9%, Test_loss:0.507, Lr:4.72E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:19, Train_acc:78.7%, Train_loss:0.552, Test_acc:83.9%, Test_loss:0.475, Lr:4.72E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:20, Train_acc:78.7%, Train_loss:0.527, Test_acc:83.9%, Test_loss:0.452, Lr:4.34E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:21, Train_acc:77.6%, Train_loss:0.507, Test_acc:80.6%, Test_loss:0.443, Lr:4.34E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:22, Train_acc:78.3%, Train_loss:0.484, Test_acc:80.6%, Test_loss:0.462, Lr:4.00E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:23, Train_acc:77.9%, Train_loss:0.472, Test_acc:80.6%, Test_loss:0.469, Lr:4.00E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:24, Train_acc:75.4%, Train_loss:0.465, Test_acc:80.6%, Test_loss:0.488, Lr:3.68E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:25, Train_acc:76.8%, Train_loss:0.474, Test_acc:80.6%, Test_loss:0.493, Lr:3.68E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:26, Train_acc:76.5%, Train_loss:0.451, Test_acc:80.6%, Test_loss:0.479, Lr:3.38E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:27, Train_acc:76.8%, Train_loss:0.450, Test_acc:80.6%, Test_loss:0.481, Lr:3.38E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:28, Train_acc:76.8%, Train_loss:0.455, Test_acc:80.6%, Test_loss:0.466, Lr:3.11E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:29, Train_acc:76.1%, Train_loss:0.459, Test_acc:80.6%, Test_loss:0.467, Lr:3.11E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:30, Train_acc:76.5%, Train_loss:0.456, Test_acc:80.6%, Test_loss:0.481, Lr:2.86E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:31, Train_acc:76.5%, Train_loss:0.458, Test_acc:80.6%, Test_loss:0.482, Lr:2.86E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:32, Train_acc:76.5%, Train_loss:0.445, Test_acc:80.6%, Test_loss:0.474, Lr:2.63E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:33, Train_acc:76.8%, Train_loss:0.442, Test_acc:80.6%, Test_loss:0.471, Lr:2.63E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:34, Train_acc:76.5%, Train_loss:0.447, Test_acc:80.6%, Test_loss:0.473, Lr:2.42E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:35, Train_acc:76.5%, Train_loss:0.434, Test_acc:80.6%, Test_loss:0.472, Lr:2.42E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:36, Train_acc:76.5%, Train_loss:0.459, Test_acc:80.6%, Test_loss:0.470, Lr:2.23E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:37, Train_acc:76.5%, Train_loss:0.439, Test_acc:80.6%, Test_loss:0.476, Lr:2.23E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:38, Train_acc:76.8%, Train_loss:0.445, Test_acc:80.6%, Test_loss:0.475, Lr:2.05E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:39, Train_acc:76.8%, Train_loss:0.449, Test_acc:80.6%, Test_loss:0.478, Lr:2.05E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:40, Train_acc:76.8%, Train_loss:0.439, Test_acc:80.6%, Test_loss:0.473, Lr:1.89E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:41, Train_acc:76.8%, Train_loss:0.451, Test_acc:80.6%, Test_loss:0.468, Lr:1.89E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:42, Train_acc:77.2%, Train_loss:0.446, Test_acc:80.6%, Test_loss:0.468, Lr:1.74E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:43, Train_acc:77.2%, Train_loss:0.442, Test_acc:80.6%, Test_loss:0.471, Lr:1.74E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:44, Train_acc:77.2%, Train_loss:0.439, Test_acc:80.6%, Test_loss:0.470, Lr:1.60E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:45, Train_acc:76.8%, Train_loss:0.450, Test_acc:80.6%, Test_loss:0.472, Lr:1.60E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:46, Train_acc:77.2%, Train_loss:0.445, Test_acc:80.6%, Test_loss:0.471, Lr:1.47E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:47, Train_acc:77.2%, Train_loss:0.442, Test_acc:80.6%, Test_loss:0.469, Lr:1.47E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:48, Train_acc:76.8%, Train_loss:0.443, Test_acc:80.6%, Test_loss:0.470, Lr:1.35E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:49, Train_acc:77.2%, Train_loss:0.444, Test_acc:80.6%, Test_loss:0.468, Lr:1.35E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:50, Train_acc:76.8%, Train_loss:0.434, Test_acc:80.6%, Test_loss:0.467, Lr:1.24E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:51, Train_acc:76.8%, Train_loss:0.430, Test_acc:80.6%, Test_loss:0.466, Lr:1.24E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:52, Train_acc:76.8%, Train_loss:0.427, Test_acc:80.6%, Test_loss:0.465, Lr:1.14E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:53, Train_acc:77.2%, Train_loss:0.434, Test_acc:80.6%, Test_loss:0.467, Lr:1.14E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:54, Train_acc:77.2%, Train_loss:0.442, Test_acc:80.6%, Test_loss:0.468, Lr:1.05E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:55, Train_acc:77.2%, Train_loss:0.453, Test_acc:80.6%, Test_loss:0.467, Lr:1.05E-05
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:56, Train_acc:77.2%, Train_loss:0.434, Test_acc:80.6%, Test_loss:0.466, Lr:9.68E-06
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:57, Train_acc:76.8%, Train_loss:0.438, Test_acc:80.6%, Test_loss:0.467, Lr:9.68E-06
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:58, Train_acc:76.8%, Train_loss:0.440, Test_acc:80.6%, Test_loss:0.467, Lr:8.91E-06
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:59, Train_acc:76.8%, Train_loss:0.431, Test_acc:80.6%, Test_loss:0.467, Lr:8.91E-06
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Epoch:60, Train_acc:76.8%, Train_loss:0.433, Test_acc:80.6%, Test_loss:0.467, Lr:8.20E-06
GPU 0 Usage:
Memory Allocated: 17.42 MB
Memory Reserved: 36.00 MB
Max Memory Allocated: 30.49 MB
Max Memory Reserved: 36.00 MB
Done. Best test acc: 0.8387096774193549
Key points to note
- Accuracy computation
  - For binary classification with a single output (Sigmoid), use (pred > 0.5) to convert the floating-point predictions into 0/1 integers before comparing with the labels.
  - For multi-class classification (output size > 1), use argmax(1) to get the predicted class and compare it with the labels.
  - If your template uses pred.argmax(1), make sure it matches your model's output; otherwise rewrite it with the (pred > 0.5) binary logic.
- Loss function (a minimal sketch of option (2) appears after this list)
  - Binary classification:
    - (1) keep sigmoid() in the output layer and use nn.BCELoss(), or
    - (2) drop sigmoid() from the output layer and use nn.BCEWithLogitsLoss().
  - Multi-class classification:
    - return raw logits from the output layer and use nn.CrossEntropyLoss() or F.cross_entropy
      (or return log-softmax from the output layer and use nn.NLLLoss()).
- Multi-GPU
  - For multi-GPU parallelism, use nn.DataParallel or DistributedDataParallel.
  - DataParallel usage example:

    if torch.cuda.device_count() > 1:
        print("Using", torch.cuda.device_count(), "GPUs!")
        model = nn.DataParallel(model)
    model = model.to(device)

  - Also keep an eye on each card's memory usage and load.
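A minimal sketch of option (2) above: drop the final sigmoid from the model and let nn.BCEWithLogitsLoss apply it internally (generally more numerically stable); the 0.5 threshold is then applied to the sigmoid of the logits:

import torch
import torch.nn as nn

loss_fn = nn.BCEWithLogitsLoss()          # expects raw logits, applies sigmoid internally

logits = torch.randn(8)                   # stand-in for model outputs with no sigmoid applied
labels = torch.randint(0, 2, (8,)).float()

loss = loss_fn(logits, labels)
pred_label = (torch.sigmoid(logits) > 0.5).long()   # apply sigmoid only for the 0/1 decision
print(loss.item(), pred_label)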
9. Visualizing the Results
epochs_range = range(epochs)
plt.figure(figsize=(12, 5))
# Accuracy curves
plt.subplot(1, 2, 1)
plt.plot(epochs_range, train_acc_list, label='Training Accuracy')
plt.plot(epochs_range, test_acc_list, label='Test Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
# Loss curves
plt.subplot(1, 2, 2)
plt.plot(epochs_range, train_loss_list, label='Training Loss')
plt.plot(epochs_range, test_loss_list, label='Test Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()
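As a final check, a small sketch that runs the best model kept during training on a single test sample (reusing best_model, X_test_tensor, and y_test_tensor defined above):

# Predict one test sample with the best model found during training
best_model.eval()
with torch.no_grad():
    sample = X_test_tensor[0:1].to(device)   # shape [1, 13, 1]
    prob = best_model(sample).item()         # sigmoid output in [0, 1]
    pred = int(prob > 0.5)
print(f"predicted probability: {prob:.3f}, predicted class: {pred}, true label: {int(y_test_tensor[0].item())}")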