How to Read the PyTorch Documentation and Common PyTorch Errors
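Contents
- How to Read the PyTorch Documentation and Common PyTorch Errors
- Example: Reading the PyTorch Documentation
- Common PyTorch Errors
- Tensors on different devices
- Mismatched dimensions
- CUDA out of memory
- Mismatched tensor types
- References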
The PyTorch documentation is at https://pytorch.org/docs/stable/
torch.nn -> building neural networks
torch.optim -> optimization algorithms
torch.utils.data -> data loading (the Dataset and DataLoader classes)
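To show how the three modules listed above fit together, here is a minimal sketch of a training loop. The toy data, the layer sizes, the learning rate, and all variable names are assumptions chosen for illustration; they do not come from the original tutorial.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# torch.utils.data: wrap toy tensors in a Dataset and batch them with a DataLoader
inputs = torch.randn(32, 3)
targets = inputs.sum(dim=1, keepdim=True)   # toy target: sum of the features
loader = DataLoader(TensorDataset(inputs, targets), batch_size=8, shuffle=True)

# torch.nn: define the model and the loss function
model = nn.Linear(3, 1)
loss_fn = nn.MSELoss()

# torch.optim: pick an optimization algorithm for the model parameters
optimizer = optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(20):
    total = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()          # clear gradients from the previous step
        loss = loss_fn(model(xb), yb)  # forward pass and loss
        loss.backward()                # backpropagate
        optimizer.step()               # update the parameters
        total += loss.item()
    losses.append(total)               # summed batch loss per epoch

print(losses[0], losses[-1])
```

The summed loss of the last epoch should be well below that of the first, since the toy target is an exact linear function of the inputs.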
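Example: Reading the PyTorch Documentation
Take torch.max as an example.
Some functions behave differently depending on their inputs.
Parameters (positional arguments): can be passed by position, without naming them.
Keyword Arguments: must be passed by name.
In the signature, the two groups are separated by *.
Arguments with default values: some arguments have a default (for example keepdim=False), so passing a value for them is optional.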
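The `*` separator in the documentation is ordinary Python syntax, so the rules above can be tried out without PyTorch. The sketch below uses a hypothetical function `scale` (not part of any library) whose signature has the same layout as the torch.max signature: positional arguments, one of them with a default, then `*`, then a keyword-only argument.

```python
def scale(values, factor=1.0, *, clip=None):
    """Multiply each value by factor; optionally cap the results at clip."""
    scaled = [v * factor for v in values]
    if clip is not None:
        scaled = [min(v, clip) for v in scaled]
    return scaled

# Arguments before * can be passed by position or by name
by_position = scale([1, 2, 3], 2.0)
by_name = scale(values=[1, 2, 3], factor=2.0)

# factor has a default value, so it may be omitted
defaulted = scale([1, 2, 3])

# Arguments after * must be passed by name
clipped = scale([1, 2, 3], 2.0, clip=4.0)

# Passing the keyword-only argument by position raises TypeError
try:
    scale([1, 2, 3], 2.0, 4.0)
    rejected = False
except TypeError:
    rejected = True

print(by_position, clipped, rejected)
```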
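The three input forms of torch.max
- Return the maximum of the entire tensor (torch.max(input) → Tensor)
import torch
x = torch.randn(4, 5)  # example input; the shape is an assumption
# 1. max of entire tensor (torch.max(input) → Tensor)
m = torch.max(x)
print(m)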
- Return the maximum along one dimension (torch.max(input, dim, keepdim=False, *, out=None) → (Tensor, LongTensor))
# 2. max along a dimension (torch.max(input, dim, keepdim=False, *, out=None) → (Tensor, LongTensor))
m, idx = torch.max(x, 0)
print(m)
print(idx)
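Positional arguments can be passed without their names; keyword arguments must be named. In the signature the two groups are separated by * (positional arguments, *, keyword arguments).
# 2-2 positional arguments may also be passed by name
m, idx = torch.max(input=x, dim=0)
print(m)
print(idx)
# 2-3 all three positional arguments passed by position
m, idx = torch.max(x, 0, False)
print(m)
print(idx)
# 2-4 dim and keepdim passed by name
m, idx = torch.max(x, dim=0, keepdim=True)
print(m)
print(idx)
# 2-5 out is keyword-only; the results are written into the tensors in p
p = (m, idx)
torch.max(x, 0, False, out=p)
print(p[0])
print(p[1])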
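- Select the element-wise maximum of two tensors (torch.max(input, other, *, out=None) → Tensor)
y = torch.randn(4, 5)  # second example tensor; same shape as x (assumed)
# 3. max(choose max) operators on two tensors (torch.max(input, other, *, out=None) → Tensor)
t = torch.max(x, y)
print(t)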
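The three forms are easier to compare on a small tensor with fixed values. The following is a sketch under that assumption; the tensor contents and variable names are chosen for illustration only.

```python
import torch

x = torch.tensor([[1.0, 5.0, 2.0],
                  [4.0, 3.0, 6.0]])
y = torch.tensor([[2.0, 2.0, 2.0],
                  [5.0, 5.0, 5.0]])

# Form 1: maximum of the whole tensor, returned as a 0-dim tensor
overall = torch.max(x)

# Form 2: maximum along dim 0, returns (values, indices)
values, indices = torch.max(x, dim=0)

# Form 2 with keepdim=True: the reduced dimension is kept with size 1
kept, _ = torch.max(x, dim=1, keepdim=True)

# Form 3: element-wise maximum of two tensors of the same shape
elementwise = torch.max(x, y)

print(overall)        # tensor(6.)
print(values)         # tensor([4., 5., 6.])
print(indices)        # tensor([1, 0, 1])
print(kept.shape)     # torch.Size([2, 1])
print(elementwise)
```

Note that only the second form returns indices, which is why the documentation lists a different return type for each signature.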
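Common PyTorch Errors
Tensors on different devices
Error message: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_mm)
Solution: move the input tensor to the GPU so that it is on the same device as the model.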
model = torch.nn.Linear(5, 1).to("cuda:0")  # assumed example model, already on the GPU
# 1. different device error (fixed)
x = torch.randn(5).to("cuda:0")
y = model(x)
print(y.shape)
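A common way to avoid this error is to pick the device once and move both the model and the inputs to it. The sketch below assumes a small `torch.nn.Linear` model as a stand-in and falls back to the CPU when no GPU is available, so it also runs on machines without CUDA.

```python
import torch

# Choose the device once; fall back to the CPU if CUDA is unavailable
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(5, 1).to(device)   # stand-in model (assumption)
x = torch.randn(5).to(device)              # input moved to the same device

y = model(x)
print(y.shape, y.device)

# When in doubt, compare the devices directly before calling the model
same_device = next(model.parameters()).device == x.device
print(same_device)
```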
Mismatched dimensions
Error message: RuntimeError: The size of tensor a (5) must match the size of tensor b (4) at non-singleton dimension 1
Solution: the tensor shapes do not line up. Use transpose, squeeze, or unsqueeze to align the dimensions.
x = torch.randn(4, 5)  # example shapes (assumed): x is 4x5, y is 5x4
y = torch.randn(5, 4)
# 2. mismatched dimensions error 1 (fixed by transpose)
y = y.transpose(0, 1)
z = x + y
print(z.shape)
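The example above covers transpose; squeeze and unsqueeze handle the case where a dimension of size 1 is missing or unwanted. The following sketch uses made-up shapes and names to illustrate both.

```python
import torch

x = torch.randn(4, 5)
col = torch.randn(4)   # one value per row of x

# Adding directly fails: the trailing dimensions are 5 and 4
try:
    x + col
    failed = False
except RuntimeError as err:
    failed = True
    print(err)

# unsqueeze adds a size-1 dimension: (4,) -> (4, 1), which broadcasts over (4, 5)
z = x + col.unsqueeze(1)
print(z.shape)

# squeeze removes a size-1 dimension: (4, 1) -> (4,)
pred = torch.randn(4, 1)
target = torch.randn(4)
diff = pred.squeeze(1) - target
print(diff.shape)
```

Without the squeeze, `pred - target` would not raise an error but would broadcast to shape (4, 4), which is rarely what is intended, so checking shapes is worthwhile even when no exception appears.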
CUDA out of memory
Error message: RuntimeError: CUDA out of memory. Tried to allocate 7.27 GiB (GPU 0; 4.00 GiB total capacity; 8.67 GiB already allocated; 0 bytes free; 8.69 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Solution: the batch is too large to fit in GPU memory, so reduce the batch size. Iterating over the data one sample at a time (batch size = 1) resolves the problem. You can also use a DataLoader.
# 3. cuda out of memory error (fixed, but it might take some time to execute)
# Assumes resnet18 is already on "cuda:0" and data is a CPU tensor of images.
for d in data:
    out = resnet18(d.to("cuda:0").unsqueeze(0))
print(out.shape)
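The DataLoader option mentioned above can be sketched as follows. A small linear layer stands in for a large network, and the data shape, batch size, and use of `torch.no_grad()` are assumptions for illustration; the point is only that one batch at a time is moved to the device.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(16, 4).to(device)   # stand-in for a large network
model.eval()
data = torch.randn(64, 16)                  # the full dataset stays on the CPU

# A smaller batch_size lowers peak GPU memory at the cost of more iterations
loader = DataLoader(TensorDataset(data), batch_size=8)

outputs = []
with torch.no_grad():                       # inference only, so skip autograd bookkeeping
    for (batch,) in loader:
        out = model(batch.to(device))       # only this batch lives on the device
        outputs.append(out.cpu())           # move results back to free device memory

result = torch.cat(outputs)
print(result.shape)
```

If the error persists with a given batch size, halving it and retrying is a reasonable first step.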
Mismatched tensor types
Error message: RuntimeError: expected scalar type Long but found Float
Solution: the label tensor must have dtype Long; convert it with .long() to fix the error.
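L = torch.nn.CrossEntropyLoss()  # assumed setup: loss function, logits, and float labels
outs = torch.randn(5, 5)
labels = torch.Tensor([1, 2, 3, 4, 0])  # torch.Tensor creates float32 values
# 4. mismatched tensor type (fixed)
labels = labels.long()
lossval = L(outs, labels)
print(lossval)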
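Checking `dtype` before computing the loss makes this kind of mismatch easy to spot. The sketch below assumes `torch.nn.CrossEntropyLoss` with class-index labels; the tensor values and names are illustrative.

```python
import torch

loss_fn = torch.nn.CrossEntropyLoss()
outs = torch.randn(5, 5)                            # logits: 5 samples, 5 classes
labels = torch.tensor([1.0, 2.0, 3.0, 4.0, 0.0])    # class indices stored as floats by mistake

print(labels.dtype)          # torch.float32

# Class-index targets must be integer (Long) tensors
labels = labels.long()
print(labels.dtype)          # torch.int64

lossval = loss_fn(outs, labels)
print(lossval)
```

Creating the labels with `torch.tensor([1, 2, 3, 4, 0])` (integer literals) would give dtype int64 from the start and avoid the conversion.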
References
torch.max — PyTorch 2.4 documentation
Hongyi_Lee_dl_homeworks/Warmup/Pytorch_Tutorial_2.pdf at master · huaiyuechusan/Hongyi_Lee_dl_homeworks (github.com): https://github.com/huaiyuechusan/Hongyi_Lee_dl_homeworks/blob/master/Warmup/Pytorch_Tutorial_2.pdf