NLP-transformer Learning: (7) evaluate in Practice
NLP-transformer Learning: (7) How to Use evaluate
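Build a solid foundation so that later study can go further.
This chapter is a standalone part of the NLP-transformer learning series and focuses on hands-on practice with evaluate. I have also recently uploaded my study code to https://github.com/MexWayne/mexwayne_transformers-code. Some details of the original author's code version no longer run as-is for me, so out of respect for the original author I have kept most of the content and noted the source. Everyone is welcome to learn together.
Note: the main body of the article follows; the examples below are for reference.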
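1 What is evaluate?
The evaluate library is a very simple machine learning evaluation library that wraps many of the functions we commonly use to evaluate models.
Address: https://huggingface.co/evaluate-metric
At this link we can see many metrics, such as bleu, sari, and precision.
2 Basic usage of evaluate
First, install it: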
pip install evaluate
2.1 Loading evaluate and its input parameters
Code:
import evaluate

if __name__ == "__main__":
    # list the evaluation modules that evaluate supports
    print(evaluate.list_evaluation_modules())
    # print(evaluate.list_evaluation_modules(include_community=False, with_details=True))
    # load the accuracy metric
    accuracy = evaluate.load("accuracy")
    # print a short description of the accuracy metric
    print(accuracy.description)
Result:
However, this description is too brief, and it does not tell us what the inputs should look like.
import evaluate

if __name__ == "__main__":
    # load the accuracy metric
    accuracy = evaluate.load("accuracy")
    # short description of the accuracy metric
    # print(accuracy.description)
    # detailed help on the expected inputs
    print(accuracy.inputs_description)
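Result:
This gives much more detail, for example:
Here, the predictions and the ground-truth references must each be a list of int. There are 6 items in total and only 3 predictions match their references, so the accuracy is 0.5.
If the samples carry different weights, weights can also be supplied, in which case the accuracy is computed as a weighted value.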
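To make the arithmetic above concrete, here is a minimal pure-Python sketch that recomputes the same accuracy by hand, without calling evaluate. The helper name `manual_accuracy` and the weight values are hypothetical, chosen only for illustration. The weighted branch assumes the usual definition of weighted accuracy (total weight of the correct predictions divided by the total weight), which the text above does not spell out.

```python
def manual_accuracy(references, predictions, sample_weight=None):
    """Fraction of predictions equal to their reference, optionally weighted."""
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")
    if sample_weight is None:
        # unweighted case: every sample counts equally
        sample_weight = [1.0] * len(references)
    # sum the weights of the samples that were predicted correctly
    correct = sum(
        w for r, p, w in zip(references, predictions, sample_weight) if r == p
    )
    return correct / sum(sample_weight)


references = [0, 1, 2, 0, 1, 2]
predictions = [0, 1, 1, 2, 1, 0]

# 3 of the 6 positions match (indices 0, 1 and 4)
plain = manual_accuracy(references, predictions)

# hypothetical weights, used only to show the weighted variant
weights = [0.5, 2, 0.7, 0.5, 9, 0.4]
weighted = manual_accuracy(references, predictions, weights)

print(plain)               # 0.5
print(round(weighted, 4))  # 0.8779
```

With these weights the three correct samples carry 11.5 of the total 13.1, so the weighted score is much higher than the plain 0.5.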
2.2 Several ways to compute with evaluate
(1) Global computation
import evaluate

if __name__ == "__main__":
    ######################################################## inputs and call
    # global accuracy: pass all references and predictions in one call
    accuracy = evaluate.load("accuracy")
    results = accuracy.compute(references=[0,1,2,0,1,2], predictions=[0,1,1,2,1,0])
    print(results)
    # iterative accuracy: add one (reference, prediction) pair at a time
    accuracy = evaluate.load("accuracy")
    for refs, preds in zip([0,1,2,3,4],[0,1,2,3,3]):
        accuracy.add(references=refs, predictions=preds)
    print(accuracy.compute())
Output of the first print:
Global computation on its own:
import evaluate

if __name__ == "__main__":
    ######################################################## inputs and call
    # global accuracy
    accuracy = evaluate.load("accuracy")
    results = accuracy.compute(references=[0,1,2,0,1,2], predictions=[0,1,1,2,1,0])
    print(results)
(2) Iterative computation:
import evaluate

if __name__ == "__main__":
    # iterative accuracy
    accuracy = evaluate.load("accuracy")
    for refs, preds in zip([0,1,2,3,4],[0,1,2,3,3]):
        accuracy.add(references=refs, predictions=preds)
    print("iterate way:")
    print(accuracy.compute())
Result:
(3) Batch iterative computation
import evaluate

if __name__ == "__main__":
    # batch iterative accuracy
    accuracy = evaluate.load("accuracy")
    for refs, preds in zip([[0,1,2,3,4], [0,1,2,3,3], [8,8,8,8,8]],   # refs batches
                           [[0,1,2,3,4], [0,1,2,3,5], [8,8,8,8,8]]):  # preds batches
        accuracy.add_batch(references=refs, predictions=preds)
    print("batch iterate way:")
    print(accuracy.compute())
Result:
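As a rough illustration of the accumulate-then-score pattern used in (2) and (3), the sketch below mimics it in plain Python. The class `RunningAccuracy` is a hypothetical stand-in written for this article, not the library's implementation. It only assumes that `add` stores one pair, `add_batch` stores several pairs, and `compute` scores everything stored so far.

```python
class RunningAccuracy:
    """Hypothetical stand-in for the add / add_batch / compute pattern."""

    def __init__(self):
        self.references = []
        self.predictions = []

    def add(self, references, predictions):
        # store a single (reference, prediction) pair
        self.references.append(references)
        self.predictions.append(predictions)

    def add_batch(self, references, predictions):
        # store a whole batch of pairs at once
        self.references.extend(references)
        self.predictions.extend(predictions)

    def compute(self):
        # score every pair accumulated so far
        if not self.references:
            raise ValueError("nothing has been added yet")
        correct = sum(r == p for r, p in zip(self.references, self.predictions))
        return {"accuracy": correct / len(self.references)}


# per-sample accumulation, same data as in (2)
iterative = RunningAccuracy()
for ref, pred in zip([0, 1, 2, 3, 4], [0, 1, 2, 3, 3]):
    iterative.add(references=ref, predictions=pred)
iterative_result = iterative.compute()

# per-batch accumulation, same data as in (3)
batched = RunningAccuracy()
for refs, preds in zip([[0, 1, 2, 3, 4], [0, 1, 2, 3, 3], [8, 8, 8, 8, 8]],
                       [[0, 1, 2, 3, 4], [0, 1, 2, 3, 5], [8, 8, 8, 8, 8]]):
    batched.add_batch(references=refs, predictions=preds)
batched_result = batched.compute()

print(iterative_result)  # {'accuracy': 0.8}
print(batched_result)    # 14 of 15 pairs match
```

In this sketch the score is taken over all accumulated samples rather than averaged per batch, which matters once batches have different sizes.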
(4) Multiple metrics
Here we choose accuracy, precision, f1, and recall.
import evaluate

if __name__ == "__main__":
    # combine several metrics into one object
    clf_metrics = evaluate.combine(["accuracy", "f1", "recall", "precision"])
    print(clf_metrics.compute(predictions=[0, 1, 1, 1, 1], references=[0, 1, 0, 1, 1]))
Output:
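For readers who want to check the four scores by hand, the following sketch recomputes them for the same inputs in plain Python. The function `binary_metrics` is a hypothetical helper, and it assumes a binary task in which label 1 is the positive class. The values are derived from the definitions of the metrics, not copied from the library's output.

```python
def binary_metrics(references, predictions, positive=1):
    """Accuracy, precision, recall and F1 for a binary task (sketch)."""
    pairs = list(zip(references, predictions))
    tp = sum(1 for r, p in pairs if r == positive and p == positive)
    fp = sum(1 for r, p in pairs if r != positive and p == positive)
    fn = sum(1 for r, p in pairs if r == positive and p != positive)
    accuracy = sum(1 for r, p in pairs if r == p) / len(pairs)
    # guard against empty denominators
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "f1": f1, "recall": recall, "precision": precision}


scores = binary_metrics(references=[0, 1, 0, 1, 1], predictions=[0, 1, 1, 1, 1])
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

For these inputs there are 3 true positives, 1 false positive, and no false negatives, which gives accuracy 0.8, precision 0.75, recall 1.0, and an F1 of about 0.857.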
3 Evaluation visualization
When several models each make predictions, we need a visual way to compare them.
import evaluate
from evaluate.visualization import radar_plot

if __name__ == "__main__":
    ##################################################### visual
    data = [
        {"accuracy": 0.99, "precision": 0.80, "f1": 0.95, "latency_in_seconds": 33.6 , "recall":0.5},
        {"accuracy": 0.98, "precision": 0.87, "f1": 0.91, "latency_in_seconds": 11.2 , "recall":0.5},
        {"accuracy": 0.98, "precision": 0.78, "f1": 0.88, "latency_in_seconds": 87.6 , "recall":0.6},
        {"accuracy": 0.88, "precision": 0.78, "f1": 0.81, "latency_in_seconds": 101.6, "recall":0.7},
        {"accuracy": 0.78, "precision": 0.78, "f1": 0.81, "latency_in_seconds": 100.0, "recall":0.9}
    ]
    model_names = ["Model 1", "Model 2", "Model 3", "Model 4", "Model 5"]
    plot = radar_plot(data=data, model_names=model_names)
    print(type(plot))
    plot.show()
    plot.savefig('radar.png')
Result:
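As a text-only complement to the radar chart, the sketch below picks the best model for each metric from the same data. The helper `best_model_per_metric` is hypothetical, and treating `latency_in_seconds` as a lower-is-better metric is my own assumption; every other metric is treated as higher-is-better.

```python
data = [
    {"accuracy": 0.99, "precision": 0.80, "f1": 0.95, "latency_in_seconds": 33.6, "recall": 0.5},
    {"accuracy": 0.98, "precision": 0.87, "f1": 0.91, "latency_in_seconds": 11.2, "recall": 0.5},
    {"accuracy": 0.98, "precision": 0.78, "f1": 0.88, "latency_in_seconds": 87.6, "recall": 0.6},
    {"accuracy": 0.88, "precision": 0.78, "f1": 0.81, "latency_in_seconds": 101.6, "recall": 0.7},
    {"accuracy": 0.78, "precision": 0.78, "f1": 0.81, "latency_in_seconds": 100.0, "recall": 0.9},
]
model_names = ["Model 1", "Model 2", "Model 3", "Model 4", "Model 5"]

# assumption: a smaller latency is better; all other metrics are larger-is-better
lower_is_better = {"latency_in_seconds"}


def best_model_per_metric(data, model_names, lower_is_better=()):
    """Return {metric: name of the model with the best value} (sketch)."""
    best = {}
    for metric in data[0]:
        values = [row[metric] for row in data]
        pick = min if metric in lower_is_better else max
        # on ties, the first model in the list wins
        best[metric] = model_names[values.index(pick(values))]
    return best


best = best_model_per_metric(data, model_names, lower_is_better)
for metric, name in best.items():
    print(f"{metric}: {name}")
```

No single model wins everywhere in this data: Model 1 leads on accuracy and f1, Model 2 on precision and latency, and Model 5 on recall, which is the kind of trade-off the radar chart is meant to show.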