当前位置：首页 > article >正文

PyTorch 中的 nn.ModuleList 是什么？与普通列表有啥区别？

article 2025/3/3 20:40:50

PyTorch 中的 `nn.ModuleList` 是什么？与普通列表有啥区别？

如果你在用 PyTorch 实现神经网络模型，尤其是涉及到多个子模块（比如专家网络、层列表）时，可能会遇到 nn.ModuleList。比如在 MixtureOfExperts 的代码中，你可能会看到：

self.experts = nn.ModuleList([
    Expert(config.expert_dim, config.hidden_dim, config.expert_dim)
    for _ in range(self.num_experts)
])

这时候你可能会好奇：为什么不用普通的 Python 列表（list）呢？nn.ModuleList 到底是个啥？今天我们就来聊聊它的作用、与普通列表的区别，以及为什么 PyTorch 设计了这个东西。

1. 先认识 `nn.ModuleList`

nn.ModuleList 是 PyTorch 提供的一个容器类，定义在 torch.nn 模块中。它的功能很简单：用来存储一组 nn.Module 的子模块（比如神经网络层、nn.Linear、nn.Conv2d 等）。从表面上看，它跟普通的 Python 列表差不多，可以用 append 添加元素、用索引访问内容，但它的特别之处在于它与 PyTorch 的 nn.Module 系统深度集成。

简单来说，nn.ModuleList 是一个“聪明”的列表，它能让 PyTorch 知道里面装的是模型的子模块，从而正确管理这些子模块的参数和行为。

2. 与普通列表的区别：一个简单的实验

我们先通过一个例子来看看 nn.ModuleList 和普通列表的区别。假设我们要定义一个简单的模型，包含多个全连接层：

import torch
import torch.nn as nn

# 用普通列表
class ModelWithList(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = [nn.Linear(10, 20), nn.Linear(20, 10)]

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

# 用 nn.ModuleList
class ModelWithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(10, 20), nn.Linear(20, 10)])

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

# 测试
model1 = ModelWithList()
model2 = ModelWithModuleList()

print("普通列表的参数：", list(model1.parameters()))
print("nn.ModuleList 的参数：", list(model2.parameters()))

运行这段代码，你会发现：

model1.parameters() 输出的是一个空列表。
model2.parameters() 输出的是 self.layers 中两个 nn.Linear 层的权重和偏置参数。

为什么会这样？答案在于 PyTorch 的参数注册机制。

3. 核心区别：参数注册

PyTorch 的 nn.Module 类有一个很重要的功能：它会自动跟踪所有属于模型的参数（weights 和 biases），并通过 .parameters() 方法返回这些参数。这些参数会被优化器（如 torch.optim.SGD）用来更新模型。

但 PyTorch 怎么知道哪些是“属于模型的参数”呢？规则是：

只有直接赋值给 nn.Module 子类的属性（attribute），并且这个属性是 nn.Parameter 或 nn.Module 的实例，才会被注册。
如果你把子模块放进一个普通 Python 列表（list），PyTorch 不会去“看”列表里面的内容，因为普通列表只是 Python 的数据结构，不是 PyTorch 的模块。

在上面的例子中：

ModelWithList 用普通列表 self.layers = [nn.Linear(...)]，nn.Linear 对象只是存在于列表中，没有直接作为类的属性注册，所以 PyTorch 找不到这些参数。
ModelWithModuleList 用 nn.ModuleList，它本身是一个 nn.Module 的子类，PyTorch 会识别它内部的子模块，并递归地注册所有参数。

4. 另一个区别：模型结构的打印

除了参数注册，nn.ModuleList 还会影响模型结构的显示。试试打印这两个模型：

print(model1)
print(model2)

输出可能是：

ModelWithList(
  (layers): [Linear(in_features=10, out_features=20, bias=True), Linear(in_features=20, out_features=10, bias=True)]
)
ModelWithModuleList(
  (layers): ModuleList(
    (0): Linear(in_features=10, out_features=20, bias=True)
    (1): Linear(in_features=20, out_features=10, bias=True)
  )
)

普通列表只是简单地显示为一个 Python 对象，PyTorch 不会解析它的内容。
nn.ModuleList 会被漂亮地格式化，显示每个子模块的细节。这是因为它是 PyTorch 生态的一部分，遵循 nn.Module 的打印规则。

5. 什么时候必须用 `nn.ModuleList`？

nn.ModuleList 主要用在需要动态管理多个子模块的场景，比如：

动态层数：比如你的模型层数由输入参数决定，用 nn.ModuleList 可以方便地添加任意数量的层。
专家网络：像 MixtureOfExperts 这样，每个专家是一个独立的子模块，需要统一管理。
循环结构：在某些复杂模型中，子模块需要被迭代调用。

如果你的模型很简单，只有一个固定的层（比如 self.fc = nn.Linear(10, 20)），直接赋值就行了，不需要 nn.ModuleList。

6. 注意事项：别混淆 `nn.ModuleList` 和 `nn.Sequential`

PyTorch 还有一个类似的工具 nn.Sequential，它也是用来管理多个层的，但它和 nn.ModuleList 有不同用途：

nn.ModuleList：只是一个容器，不会自动定义 forward 方法，你需要自己写逻辑来调用每个子模块。
nn.Sequential：不仅管理子模块，还会自动按顺序执行它们，适合简单的顺序模型。

比如：

layers = nn.ModuleList([nn.Linear(10, 20), nn.ReLU()])
# 需要手动写 forward
for layer in layers:
    x = layer(x)

seq = nn.Sequential(nn.Linear(10, 20), nn.ReLU())
# 自动执行
x = seq(x)