
[BERT Model Fine-Tuning]

article 2024/10/21 5:32:11

1. The attention_mask problem

https://blog.csdn.net/python_plus/article/details/136194789

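In attention_mask, valid (real-token) positions are 1 and invalid (padding) positions are 0.

But even if you flip it and mark the valid positions as 0, the model still fits!! The results are just much worse, and because nothing raises an error, this is really annoying to debug.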
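To see why an inverted mask fails silently, here is a minimal sketch of how a 0/1 mask is typically applied inside attention: it is turned into an additive bias on the attention scores before the softmax. The function name `masked_softmax` and the score values are made up for illustration, and the sketch uses -inf where real implementations usually use a very large negative number. It is plain Python, so it runs without a model.

```python
import math

def masked_softmax(scores, mask):
    """Softmax over `scores`, giving zero weight wherever mask == 0."""
    # Turn the 0/1 mask into an additive bias: keep the score, or push it to -inf.
    biased = [s if m == 1 else float("-inf") for s, m in zip(scores, mask)]
    peak = max(biased)  # subtract the max for numerical stability
    exps = [math.exp(b - peak) for b in biased]  # exp(-inf) is exactly 0.0
    total = sum(exps)
    return [e / total for e in exps]

# One query attending over 3 real tokens followed by 2 padding tokens.
scores = [2.0, 1.0, 0.5, 0.1, 0.1]
correct = masked_softmax(scores, [1, 1, 1, 0, 0])
inverted = masked_softmax(scores, [0, 0, 0, 1, 1])

print(correct)   # the two padding positions get weight 0.0
print(inverted)  # the three real tokens get weight 0.0, and no error is raised
```

With the inverted mask every query attends only to padding. One plausible reason the model can still fit is that each position's own embedding still reaches the output through the residual connections, but that is an interpretation and was not tested in the original note.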

2. Padding plus attention_mask gives a different result from no padding and no mask

A very strange problem, and I don't know whether I will be able to reproduce it later.

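import torch
from transformers import BertModel

# Assumption: `bert` is a Hugging Face BertModel; the original does not name the checkpoint.
bert = BertModel.from_pretrained("bert-base-chinese")

# Padded input with an attention mask
a = [101, 1, 2, 3, 102, 0, 0, 0, 0, 0]
a = torch.tensor(a).reshape(1, -1)
amask = torch.tensor([1, 1, 1, 1, 1, 0, 0, 0, 0, 0]).reshape(1, -1)
a = bert(input_ids=a, attention_mask=amask).last_hidden_state[:, :5]  # keep only the real tokens

# Same tokens, no padding and no mask
b = [101, 1, 2, 3, 102]
b = torch.tensor(b).reshape(1, -1)
b = bert(input_ids=b).last_hidden_state

assert torch.allclose(a, b, atol=1e-5)  # the author reports that this assertion fails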
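As a sanity check on what should be expected here, the sketch below runs a toy single-head self-attention in plain Python, with Q = K = V = x and no learned weights (a simplification, not BERT itself). It shows that, in exact terms, masking out right-padding leaves the outputs at the real positions unchanged. The helper `self_attention` and the vectors are hypothetical.

```python
import math

def self_attention(x, mask=None):
    """Toy single-head self-attention with Q = K = V = x (no learned weights)."""
    n, d = len(x), len(x[0])
    mask = mask or [1] * n
    out = []
    for q in x:
        # Scaled dot-product scores; masked keys get -inf so exp() gives 0.0.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d) if m else float("-inf")
            for k, m in zip(x, mask)
        ]
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w / total * v[j] for w, v in zip(weights, x)) for j in range(d)
        ])
    return out

real = [[0.1, 0.2], [0.4, -0.3], [-0.5, 0.6]]  # 3 "real" token vectors
pad = [[9.0, 9.0], [9.0, 9.0]]                 # 2 padding vectors

padded_out = self_attention(real + pad, mask=[1, 1, 1, 0, 0])
unpadded_out = self_attention(real)

# Compare only the real positions; the padded rows have no counterpart.
same = all(
    math.isclose(p, u, abs_tol=1e-12)
    for p_row, u_row in zip(padded_out[:3], unpadded_out)
    for p, u in zip(p_row, u_row)
)
print(same)  # True
```

In a real float32 model the two outputs would be expected to agree only up to rounding error. If the assertion still fails, possible causes worth checking (none verified against the original setup) are: comparing positions beyond the real tokens, dropout being active because the model is in train mode, or a tolerance that is too tight for float32.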
