
Evaluation Metrics for Classification, Clustering, and Regression

In `cross_validate` and `cross_val_score`, the `scoring` parameter selects the evaluation metric, and its valid values depend on whether the task is classification, clustering, or regression.

3.4.3. The scoring parameter: defining model evaluation rules

For the most common use cases, you can designate a scorer object with the `scoring` parameter via a string name; the tables below show all possible values. All scorer objects follow the convention that higher return values are better than lower return values. Thus metrics which measure the distance between the model and the data, like `metrics.mean_squared_error`, are available as `'neg_mean_squared_error'`, which returns the negated value of the metric.
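A minimal sketch of this convention (assuming scikit-learn is installed): cross-validating a linear model with `scoring='neg_mean_squared_error'` returns negated MSE values, so higher is still better and the ordinary MSE is recovered by flipping the sign.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Toy regression data
X, y = make_regression(n_samples=100, n_features=5, noise=0.1, random_state=0)

# MSE measures distance, so the scorer negates it: higher (closer to 0) is better.
scores = cross_val_score(LinearRegression(), X, y, cv=5,
                         scoring="neg_mean_squared_error")
print(scores)          # all values are <= 0
print(-scores.mean())  # recover the ordinary (positive) MSE
```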

1. Classification

| `scoring` string | Function | Definition |
|---|---|---|
| `accuracy` | `metrics.accuracy_score` | $\mathrm{accuracy}(y,\hat{y})=\frac{1}{n}\sum_{i=0}^{n-1}1(\hat{y}_i=y_i)$ |
| `balanced_accuracy` | `metrics.balanced_accuracy_score` | $\frac{1}{2}\left(\frac{TP}{TP+FN}+\frac{TN}{TN+FP}\right)$ |
| `top_k_accuracy` | `metrics.top_k_accuracy_score` | $\mathrm{top\text{-}k\ accuracy}(y,\hat{f})=\frac{1}{n}\sum_{i=0}^{n-1}\sum_{j=1}^{k}1(\hat{f}_{i,j}=y_i)$ |
| `average_precision` | `metrics.average_precision_score` | $AP=\sum_n (R_n-R_{n-1})P_n$ |
| `neg_brier_score` | `metrics.brier_score_loss` | $BS=\frac{1}{n}\sum_{i=0}^{n-1}(y_i-p_i)^2$, where $p_i=\mathrm{predict\_proba}(y_i=1)$ |
| `f1` | `metrics.f1_score` | $F1=\frac{2\,TP}{2\,TP+FP+FN}$ (`average`: 'binary', 'micro', 'macro', 'samples', 'weighted' or None; default 'binary') |
| `neg_log_loss` | `metrics.log_loss` | binary: $L_{\log}(y,p)=-\log\Pr(y\mid p)=-(y\log p+(1-y)\log(1-p))$; multiclass: $L_{\log}(Y,P)=-\frac{1}{N}\sum_{i=0}^{N-1}\sum_{k=0}^{K-1}y_{i,k}\log p_{i,k}$ |
| `precision` | `metrics.precision_score` | $P=\frac{TP}{TP+FP}$ |
| `recall` | `metrics.recall_score` | $R=\frac{TP}{TP+FN}$ |
| `jaccard` | `metrics.jaccard_score` | $J(y,\hat{y})=\frac{\lvert y\cap\hat{y}\rvert}{\lvert y\cup\hat{y}\rvert}$ |
| `roc_auc` | `metrics.roc_auc_score` | Area under the Receiver Operating Characteristic curve (ROC AUC), computed from prediction scores (`average`: 'micro', 'macro', 'samples', 'weighted' or None; default 'macro') |
| `d2_log_loss_score` | `metrics.d2_log_loss_score` | $D^2(y,\hat{y})=1-\frac{\mathrm{dev}(y,\hat{y})}{\mathrm{dev}(y,y_{\mathrm{null}})}$ |
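As a quick sketch, several of the classification metrics above can also be called directly as functions from `sklearn.metrics` on a toy binary problem (here TP=3, FP=0, FN=1, TN=2):

```python
from sklearn import metrics

y_true = [0, 1, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1]

# accuracy = fraction of exact matches = 5/6
acc = metrics.accuracy_score(y_true, y_pred)
# precision = TP / (TP + FP) = 3/3, recall = TP / (TP + FN) = 3/4
p = metrics.precision_score(y_true, y_pred)
r = metrics.recall_score(y_true, y_pred)
# F1 = 2*TP / (2*TP + FP + FN) = 6/7, the harmonic mean of precision and recall
f1 = metrics.f1_score(y_true, y_pred)
print(acc, p, r, f1)
```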

2. Clustering

| `scoring` string | Function | Definition |
|---|---|---|
| `mutual_info_score` | `metrics.mutual_info_score` | $MI(U,V)=\sum_{i=1}^{\lvert U\rvert}\sum_{j=1}^{\lvert V\rvert}\frac{\lvert U_i\cap V_j\rvert}{N}\log\frac{N\lvert U_i\cap V_j\rvert}{\lvert U_i\rvert\,\lvert V_j\rvert}$ |
| `adjusted_mutual_info_score` | `metrics.adjusted_mutual_info_score` | $AMI(U,V)=\frac{MI(U,V)-E[MI(U,V)]}{\mathrm{avg}(H(U),H(V))-E[MI(U,V)]}$ |
| `normalized_mutual_info_score` | `metrics.normalized_mutual_info_score` | $NMI(U,V)=\frac{2\,I(U;V)}{H(U)+H(V)}$ |
| `rand_score` | `metrics.rand_score` | $RI=\frac{a+b}{C_n^2}$, where $a$ is the number of sample pairs assigned to the same class in both the true and predicted labelings, and $b$ is the number of pairs assigned to different classes in both |
| `adjusted_rand_score` | `metrics.adjusted_rand_score` | $ARI=\frac{RI-E[RI]}{\max(RI)-E[RI]}$ |
| `completeness_score` | `metrics.completeness_score` | $c=1-\frac{H(K\mid C)}{H(K)}$ |
| `homogeneity_score` | `metrics.homogeneity_score` | $h=1-\frac{H(C\mid K)}{H(C)}$ |
| `v_measure_score` | `metrics.v_measure_score` | $v=\frac{(1+\beta)\cdot h\cdot c}{\beta\cdot h+c}$, with $h$ = homogeneity, $c$ = completeness |
| `fowlkes_mallows_score` | `metrics.fowlkes_mallows_score` | $FMI=\frac{TP}{\sqrt{(TP+FP)(TP+FN)}}$ |
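A minimal sketch of the clustering metrics above: they compare two labelings and are invariant to permutations of the cluster ids, so relabeling an identical partition still scores 1.0.

```python
from sklearn import metrics

labels_true = [0, 0, 1, 1, 2, 2]
labels_pred = [1, 1, 0, 0, 2, 2]  # same partition, cluster ids permuted

# All of these are permutation-invariant: identical partitions score 1.0
ari = metrics.adjusted_rand_score(labels_true, labels_pred)
nmi = metrics.normalized_mutual_info_score(labels_true, labels_pred)
h = metrics.homogeneity_score(labels_true, labels_pred)
c = metrics.completeness_score(labels_true, labels_pred)
v = metrics.v_measure_score(labels_true, labels_pred)
print(ari, nmi, h, c, v)
```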

3. Regression

| `scoring` string | Function | Definition |
|---|---|---|
| `explained_variance` | `metrics.explained_variance_score` | $1-\frac{\mathrm{Var}\{y-\hat{y}\}}{\mathrm{Var}\{y\}}$ |
| `neg_max_error` | `metrics.max_error` | $\max_i \lvert y_i-\hat{y}_i\rvert$ |
| `neg_mean_absolute_error` | `metrics.mean_absolute_error` | $MAE=\frac{1}{n}\sum_{i=0}^{n-1}\lvert y_i-\hat{y}_i\rvert$ |
| `neg_mean_squared_error` | `metrics.mean_squared_error` | $MSE=\frac{1}{n}\sum_{i=0}^{n-1}(y_i-\hat{y}_i)^2$ |
| `neg_root_mean_squared_error` | `metrics.root_mean_squared_error` | $RMSE=\sqrt{\frac{1}{n}\sum_{i=0}^{n-1}(y_i-\hat{y}_i)^2}$ |
| `neg_root_mean_squared_log_error` | `metrics.root_mean_squared_log_error` | $RMSLE=\sqrt{\frac{1}{n}\sum_{i=0}^{n-1}\big(\log_e(1+y_i)-\log_e(1+\hat{y}_i)\big)^2}$ |
| `neg_median_absolute_error` | `metrics.median_absolute_error` | $\mathrm{MedAE}=\mathrm{median}(\lvert y_1-\hat{y}_1\rvert,\dots,\lvert y_n-\hat{y}_n\rvert)$ |
| `r2` | `metrics.r2_score` | $R^2(y,\hat{y})=1-\frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{\sum_{i=1}^{n}(y_i-\bar{y})^2}$ |
| `neg_mean_poisson_deviance` | `metrics.mean_poisson_deviance` | mean Tweedie deviance with power $p=1$ (see below) |
| `neg_mean_gamma_deviance` | `metrics.mean_gamma_deviance` | mean Tweedie deviance with power $p=2$ (see below) |
| `neg_mean_absolute_percentage_error` | `metrics.mean_absolute_percentage_error` | $MAPE=\frac{1}{n}\sum_{i=0}^{n-1}\frac{\lvert y_i-\hat{y}_i\rvert}{\max(\epsilon,\lvert y_i\rvert)}$ |
| `d2_absolute_error_score` | `metrics.d2_absolute_error_score` | $D^2(y,\hat{y})=1-\frac{\sum_{i=1}^{n}\lvert y_i-\hat{y}_i\rvert}{\sum_{i=1}^{n}\lvert y_i-\mathrm{median}(y)\rvert}$ |

The mean Tweedie deviance for power $p$ is

$$
D(y,\hat{y})=\frac{1}{n}\sum_{i=0}^{n-1}
\begin{cases}
(y_i-\hat{y}_i)^2, & p=0\ \text{(Normal)}\\[2pt]
2\big(y_i\log(y_i/\hat{y}_i)+\hat{y}_i-y_i\big), & p=1\ \text{(Poisson)}\\[2pt]
2\big(\log(\hat{y}_i/y_i)+y_i/\hat{y}_i-1\big), & p=2\ \text{(Gamma)}\\[2pt]
2\left(\dfrac{\max(y_i,0)^{2-p}}{(1-p)(2-p)}-\dfrac{y_i\hat{y}_i^{\,1-p}}{1-p}+\dfrac{\hat{y}_i^{\,2-p}}{2-p}\right), & \text{otherwise.}
\end{cases}
$$
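As a sanity-check sketch, the regression metrics above can be verified against their closed-form definitions on a small example:

```python
import numpy as np
from sklearn import metrics

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

mae = metrics.mean_absolute_error(y_true, y_pred)   # mean of |y - y_hat|
mse = metrics.mean_squared_error(y_true, y_pred)    # mean of (y - y_hat)^2
r2 = metrics.r2_score(y_true, y_pred)               # 1 - SS_res / SS_tot

# Check against the formulas in the table
assert np.isclose(mae, np.mean(np.abs(y_true - y_pred)))
assert np.isclose(mse, np.mean((y_true - y_pred) ** 2))
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
assert np.isclose(r2, 1 - ss_res / ss_tot)
print(mae, mse, r2)  # 0.5, 0.375, ~0.9486
```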
