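Deriving the Negative Log-Likelihood of 1D and 2D Gaussian Distributions
Reference: the Wikipedia articles "Multivariate normal distribution" and "Normal distribution".
Basic formulas
Multivariate Gaussian density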
$$f(X)=f(x_1,x_2,\ldots,x_k)=\frac{1}{\sqrt{(2\pi)^k|\Sigma|}}e^{-\frac{1}{2}(X-\mu)^T\Sigma^{-1}(X-\mu)}$$
where
$$\mu=(\mu_{x_1},\mu_{x_2},\ldots,\mu_{x_k})$$
Setting $k=1$ gives the one-dimensional Gaussian density
$$f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu_x)^2}{2\sigma^2}}$$
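As a quick numerical illustration of the one-dimensional density, here is a minimal sketch using only the Python standard library. The function name `gaussian_pdf` and its parameter names are chosen for illustration and do not come from the original text.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a 1D Gaussian with mean mu and standard deviation sigma > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Normalizing constant 1 / sqrt(2 * pi * sigma^2)
    norm = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    # Exponent -(x - mu)^2 / (2 * sigma^2)
    return norm * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


# At x = mu the exponent vanishes, so the density equals the normalizing constant.
peak = gaussian_pdf(1.5, mu=1.5, sigma=2.0)
```

The density is symmetric about `mu`, so `gaussian_pdf(mu + d, mu, sigma)` and `gaussian_pdf(mu - d, mu, sigma)` agree for any `d`.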
Two-dimensional Gaussian density
$$f(x,y)=\frac{1}{2\pi\sqrt{|\Sigma|}}e^{-\frac{1}{2}\begin{pmatrix}x-\mu_x & y-\mu_y\end{pmatrix}\Sigma^{-1}\begin{pmatrix}x-\mu_x \\ y-\mu_y\end{pmatrix}}$$
To simplify the derivation, the centered deviations $d_x=x-\mu_x$ and $d_y=y-\mu_y$ are written simply as $x$ and $y$ in what follows.
Negative log-likelihood of the one-dimensional Gaussian
$$\begin{aligned}NLL(x,\mu|\sigma^2)&=-\log(f(x))=-\log\left(\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{x^2}{2\sigma^2}}\right)\\ &=-\left(-\log\left(\sqrt{2\pi\sigma^2}\right)-\frac{x^2}{2\sigma^2}\right)\\ &=\frac{\log(2\pi)}{2}+\log(\sigma)+\frac{x^2}{2\sigma^2}\end{aligned}$$
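The closed form can be checked numerically against the direct definition $-\log f(x)$. The sketch below assumes natural logarithms and uses the document's convention that the argument is the centered deviation $x-\mu_x$; the function names are illustrative only.

```python
import math


def gaussian_nll(d, sigma):
    """Closed-form NLL: log(2*pi)/2 + log(sigma) + d^2 / (2*sigma^2), with d = x - mu."""
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + d ** 2 / (2.0 * sigma ** 2)


def gaussian_nll_direct(d, sigma):
    """Same quantity computed as -log of the density, for comparison."""
    density = math.exp(-d ** 2 / (2.0 * sigma ** 2)) / math.sqrt(2.0 * math.pi * sigma ** 2)
    return -math.log(density)


# The two routes should agree up to floating-point rounding.
difference = abs(gaussian_nll(0.8, 1.3) - gaussian_nll_direct(0.8, 1.3))
```

For large deviations the direct route can underflow, because the density rounds to zero before the logarithm is taken. That is one practical reason to work with the closed form.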
Negative log-likelihood of the two-dimensional Gaussian
Expanding the two-dimensional Gaussian density gives
$$\begin{aligned}f(x,y)&=\frac{1}{2\pi\sqrt{|\Sigma|}}e^{-\frac{1}{2}\begin{pmatrix}x & y\end{pmatrix}\Sigma^{-1}\begin{pmatrix}x \\ y\end{pmatrix}}\\ &=\frac{1}{2\pi\sqrt{|\Sigma|}}e^{-\frac{1}{2}\frac{x^2\sigma_y^2-2\rho xy\sigma_x\sigma_y+y^2\sigma_x^2}{|\Sigma|}}\\ &=\frac{1}{2\pi\sqrt{|\Sigma|}}e^{-\frac{1}{2}\frac{x^2\sigma_y^2-2\rho xy\sigma_x\sigma_y+y^2\sigma_x^2}{(1-\rho^2)\sigma_x^2\sigma_y^2}}\\ &=\frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}e^{-\frac{1}{2(1-\rho^2)}\left(\frac{x^2}{\sigma_x^2}-\frac{2\rho xy}{\sigma_x\sigma_y}+\frac{y^2}{\sigma_y^2}\right)}\end{aligned}$$
where
$$\Sigma=\begin{pmatrix}\sigma_x^2 & \sigma_{xy}\\ \sigma_{xy} & \sigma_y^2\end{pmatrix}=\begin{pmatrix}\sigma_x^2 & \rho\sigma_x\sigma_y\\ \rho\sigma_x\sigma_y & \sigma_y^2\end{pmatrix}$$
$$\Sigma^{-1}=\frac{1}{|\Sigma|}\begin{pmatrix}\sigma_y^2 & -\rho\sigma_x\sigma_y\\ -\rho\sigma_x\sigma_y & \sigma_x^2\end{pmatrix},\qquad |\Sigma|=(1-\rho^2)\sigma_x^2\sigma_y^2$$
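The covariance matrix, its determinant, and its inverse can be sanity-checked numerically. The following is a small standard-library sketch; the helper names (`covariance`, `determinant`, `inverse`, `matmul2`) are hypothetical and exist only for this illustration. It assumes $\sigma_x,\sigma_y>0$ and $|\rho|<1$ so that the determinant is nonzero.

```python
def covariance(sigma_x, sigma_y, rho):
    """2x2 covariance matrix with off-diagonal entries rho * sigma_x * sigma_y."""
    c = rho * sigma_x * sigma_y
    return [[sigma_x ** 2, c], [c, sigma_y ** 2]]


def determinant(sigma_x, sigma_y, rho):
    """|Sigma| = (1 - rho^2) * sigma_x^2 * sigma_y^2."""
    return (1.0 - rho ** 2) * sigma_x ** 2 * sigma_y ** 2


def inverse(sigma_x, sigma_y, rho):
    """Inverse via the 2x2 adjugate formula: swap the diagonal, negate the off-diagonal."""
    det = determinant(sigma_x, sigma_y, rho)
    c = rho * sigma_x * sigma_y
    return [[sigma_y ** 2 / det, -c / det], [-c / det, sigma_x ** 2 / det]]


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


# Sigma times its inverse should be the identity, up to rounding.
product = matmul2(covariance(2.0, 0.5, 0.3), inverse(2.0, 0.5, 0.3))
```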
The negative log-likelihood is therefore
$$\begin{aligned}-\log(f(x,y))&=-\log\left(\frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}e^{-\frac{1}{2(1-\rho^2)}\left(\frac{x^2}{\sigma_x^2}-\frac{2\rho xy}{\sigma_x\sigma_y}+\frac{y^2}{\sigma_y^2}\right)}\right)\\ &=-\left(-\log(2\pi\sigma_x\sigma_y)-\frac{1}{2}\log(1-\rho^2)-\frac{1}{2(1-\rho^2)}\left(\frac{x^2}{\sigma_x^2}-\frac{2\rho xy}{\sigma_x\sigma_y}+\frac{y^2}{\sigma_y^2}\right)\right)\\ &=\log(2\pi\sigma_x\sigma_y)+\frac{1}{2}\log(1-\rho^2)+\frac{1}{2(1-\rho^2)}\left(\frac{x^2}{\sigma_x^2}-\frac{2\rho xy}{\sigma_x\sigma_y}+\frac{y^2}{\sigma_y^2}\right)\end{aligned}$$
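The final expression can be cross-checked against the matrix form $\log(2\pi)+\frac{1}{2}\log|\Sigma|+\frac{1}{2}d^T\Sigma^{-1}d$. The sketch below is illustrative only: the function names are assumptions, `dx` and `dy` stand for the centered deviations $x-\mu_x$ and $y-\mu_y$, natural logarithms are used, and $|\rho|<1$ is required.

```python
import math


def bivariate_nll(dx, dy, sigma_x, sigma_y, rho):
    """Closed-form NLL of a 2D Gaussian in terms of sigma_x, sigma_y and rho."""
    one_minus_rho2 = 1.0 - rho ** 2
    quad = (dx ** 2 / sigma_x ** 2
            - 2.0 * rho * dx * dy / (sigma_x * sigma_y)
            + dy ** 2 / sigma_y ** 2)
    return (math.log(2.0 * math.pi * sigma_x * sigma_y)
            + 0.5 * math.log(one_minus_rho2)
            + quad / (2.0 * one_minus_rho2))


def bivariate_nll_matrix(dx, dy, sigma_x, sigma_y, rho):
    """Same quantity from the matrix form, using the 2x2 determinant and adjugate."""
    c = rho * sigma_x * sigma_y
    det = sigma_x ** 2 * sigma_y ** 2 - c ** 2
    # d^T Sigma^{-1} d, expanded with the adjugate of Sigma
    maha = (dx * dx * sigma_y ** 2 - 2.0 * dx * dy * c + dy * dy * sigma_x ** 2) / det
    return math.log(2.0 * math.pi) + 0.5 * math.log(det) + 0.5 * maha


# Both routes should agree up to floating-point rounding.
gap = abs(bivariate_nll(0.4, -1.1, 2.0, 0.5, 0.3)
          - bivariate_nll_matrix(0.4, -1.1, 2.0, 0.5, 0.3))
```

With `rho = 0` the expression reduces to the sum of two one-dimensional negative log-likelihoods, which is a useful special case to test when using this as a loss function.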