
Python Backpropagation Map

article 2025/2/20 3:53:38

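🎯 Key points

Matrix-form gradient descent
Prediction with artificial neural networks
Gradient descent on a loss function
Gradient-based optimization of a neural network with three inputs and one output
Visualizing the propagation network with parallel coordinates
Learning features of forest-fire scenarios
Feedforward networks trained with stochastic gradient descent, SGD with momentum, and adaptive moment estimation (Adam)
A driving-detector model using multiple regression and automatic differentiation
Classifying the color of given points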

Backpropagation computation in Python

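Backpropagation is an efficient algorithm for training artificial neural networks, feedforward networks in particular. It determines how much each weight and bias contributes to the error, and therefore how each one should be adjusted to reduce the cost function. In every epoch the model learns by moving its weights and biases one step down the gradient of the error. Backpropagation itself only computes the gradients; the update is carried out by an optimizer, most commonly gradient descent or stochastic gradient descent. The gradients are obtained by applying the chain rule from calculus layer by layer, from the output back toward the input.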
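To make "moving down the gradient" concrete, here is a minimal sketch of plain gradient descent on a one-parameter loss. The function name `gradient_descent` and the quadratic loss $f(w)=(w-3)^2$ are illustrative assumptions, not part of the network discussed in this article.

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w

# Hypothetical loss f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_min = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_min)  # approaches the minimizer w = 3
```

In a neural network the same update is applied to every weight and bias at once, and backpropagation is what supplies the gradient values.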
[Figure: feedforward network diagram]

$h_{(1,1)}$ through $h_{(2,3)}$ are the hidden-layer nodes, and $O$ is the output layer.

The backpropagation algorithm works in two distinct passes: a forward pass and a backward pass.

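Suppose every neuron uses the sigmoid activation function, and we run one forward and one backward pass through the network. Suppose also that the target output $y$ is 0.5 and the learning rate is 1.
[Figure: example network with inputs $X_1$ and $X_2$, hidden nodes $h_1$ and $h_2$, and output node $O_3$]
Forward pass: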

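Before computing the forward pass, we need two formulas. The first is the weighted sum $a_j=\sum_i\left(w_{i,j} * x_i\right)$, where:

$a_j$ is the weighted sum of all inputs to node $j$,
$w_{i,j}$ is the weight on the connection from the $i^{\text{th}}$ input to the $j^{\text{th}}$ neuron,
$x_i$ is the value of the $i^{\text{th}}$ input.

The second is the activation $y_j=F\left(a_j\right)=\frac{1}{1+e^{-a_j}}$, where $y_j$ is the output value and $F$ is the activation function (the sigmoid here), which converts the weighted sum into the output value. To complete the forward pass we need the outputs $y_3$, $y_4$ and $y_5$.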

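In the figure above, $y_3$ is the output of $h_1$, $y_4$ is the output of $h_2$, and $y_5$ is the output of $O_3$.

To find $y_3$, apply $a_j=\sum_i\left(w_{i,j} * x_i\right)$ to the incoming edges of $h_1$, using their weights and inputs. Here the incoming edges come from $X_1$ and $X_2$.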

At node $h_1$: $\begin{aligned} a_1 & =\left(w_{1,1} x_1\right)+\left(w_{2,1} x_2\right) \\ & =(0.2 * 0.35)+(0.2 * 0.7) \\ & =0.21\end{aligned}$

With $a_1$ computed, we can find $y_3$:
$\begin{aligned} & y_j=F\left(a_j\right)=\frac{1}{1+e^{-a_j}} \\ & y_3=F(0.21)=\frac{1}{1+e^{-0.21}} \\ & y_3 \approx 0.552 \end{aligned}$
Similarly, we find $y_4$ at $h_2$ and $y_5$ at $O_3$:
$\begin{aligned} & a_2=\left(w_{1,2} * x_1\right)+\left(w_{2,2} * x_2\right)=(0.3 * 0.35)+(0.3 * 0.7)=0.315 \\ & y_4=F(0.315)=\frac{1}{1+e^{-0.315}} \approx 0.578 \\ & a_3=\left(w_{1,3} * y_3\right)+\left(w_{2,3} * y_4\right)=(0.3 * 0.552)+(0.9 * 0.578) \approx 0.686 \\ & y_5=F(0.686)=\frac{1}{1+e^{-0.686}} \approx 0.665 \approx 0.67 \end{aligned}$
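The hand calculation can be checked numerically. The sketch below recomputes the forward pass with NumPy, using the inputs and weights from the worked example; the array names `W_hidden` and `w_out` and their layout are assumptions made for this illustration. Results may differ from hand-rounded figures in the third decimal place.

```python
import numpy as np

def sigmoid(a):
    """Logistic activation F(a) = 1 / (1 + e^(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

# Inputs and weights taken from the worked example.
x = np.array([0.35, 0.7])                 # x_1, x_2
W_hidden = np.array([[0.2, 0.3],          # w_{1,1}, w_{1,2}
                     [0.2, 0.3]])         # w_{2,1}, w_{2,2}
w_out = np.array([0.3, 0.9])              # w_{1,3}, w_{2,3}

a_hidden = x @ W_hidden                   # weighted sums a_1, a_2
y_hidden = sigmoid(a_hidden)              # outputs y_3, y_4
a_out = y_hidden @ w_out                  # weighted sum a_3
y_out = sigmoid(a_out)                    # output y_5

print("a_1, a_2:", np.round(a_hidden, 3))
print("y_3, y_4:", np.round(y_hidden, 3))
print("a_3, y_5:", round(float(a_out), 3), round(float(y_out), 3))
```

Storing the weights as a matrix whose column $j$ holds the weights into neuron $j$ lets a single matrix product compute every weighted sum in a layer.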
Note that the target output is 0.5, but the network produced about 0.67. The error is computed as:

error $= y_{\text{target}} - y_5 = 0.5 - 0.67 = -0.17$. This error value is what the backward pass propagates.
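The backward pass for this example can be sketched as follows. This is a minimal illustration rather than a definitive derivation: it assumes the squared-error loss $\frac{1}{2}(y_{\text{target}}-y_5)^2$, uses the sigmoid slope identity $F'(a)=y(1-y)$, and applies the delta-rule update $w \leftarrow w + \eta \cdot \delta \cdot \text{input}$. The names `delta_out`, `delta_hidden`, `W_hidden` and `w_out` are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, W_hidden, w_out):
    """Return the hidden outputs (y_3, y_4) and the network output y_5."""
    y_hidden = sigmoid(x @ W_hidden)
    y_out = sigmoid(y_hidden @ w_out)
    return y_hidden, y_out

# Values from the worked example: inputs, weights, target, learning rate.
x = np.array([0.35, 0.7])
W_hidden = np.array([[0.2, 0.3], [0.2, 0.3]])   # column j feeds hidden node j
w_out = np.array([0.3, 0.9])
target, learning_rate = 0.5, 1.0

y_hidden, y_out = forward(x, W_hidden, w_out)

# Output delta: error scaled by the sigmoid slope y * (1 - y).
delta_out = (target - y_out) * y_out * (1.0 - y_out)
# Hidden deltas: the output delta sent back through the (old) output weights.
delta_hidden = delta_out * w_out * y_hidden * (1.0 - y_hidden)

# Each weight moves by learning_rate * delta * (the input on that edge).
w_out_new = w_out + learning_rate * delta_out * y_hidden
W_hidden_new = W_hidden + learning_rate * np.outer(x, delta_hidden)

_, y_out_new = forward(x, W_hidden_new, w_out_new)
print("error before:", target - y_out)
print("error after: ", target - y_out_new)
```

Because the output overshoots the target, the output delta is negative and the output weights shrink; rerunning the forward pass with the updated weights should give an output closer to 0.5.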

Code implementation

import numpy as np

class NeuralNetwork:
    def __init__(self, input_size, hidden_size, output_size):
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size

        # Initialize weights
        self.weights_input_hidden = np.random.randn(self.input_size, self.hidden_size)
        self.weights_hidden_output = np.random.randn(self.hidden_size, self.output_size)

        # Initialize the biases
        self.bias_hidden = np.zeros((1, self.hidden_size))
        self.bias_output = np.zeros((1, self.output_size))

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_derivative(self, x):
        # x must already be a sigmoid output s: d(sigmoid)/da = s * (1 - s)
        return x * (1 - x)

    def feedforward(self, X):
        # Input to hidden
        self.hidden_activation = np.dot(X, self.weights_input_hidden) + self.bias_hidden
        self.hidden_output = self.sigmoid(self.hidden_activation)

        # Hidden to output
        self.output_activation = np.dot(self.hidden_output, self.weights_hidden_output) + self.bias_output
        self.predicted_output = self.sigmoid(self.output_activation)

        return self.predicted_output

    def backward(self, X, y, learning_rate):
        # Compute the output layer error
        output_error = y - self.predicted_output
        output_delta = output_error * self.sigmoid_derivative(self.predicted_output)

        # Compute the hidden layer error
        hidden_error = np.dot(output_delta, self.weights_hidden_output.T)
        hidden_delta = hidden_error * self.sigmoid_derivative(self.hidden_output)

        # Update weights and biases
        self.weights_hidden_output += np.dot(self.hidden_output.T, output_delta) * learning_rate
        self.bias_output += np.sum(output_delta, axis=0, keepdims=True) * learning_rate
        self.weights_input_hidden += np.dot(X.T, hidden_delta) * learning_rate
        self.bias_hidden += np.sum(hidden_delta, axis=0, keepdims=True) * learning_rate

    def train(self, X, y, epochs, learning_rate):
        for epoch in range(epochs):
            output = self.feedforward(X)
            self.backward(X, y, learning_rate)
            if epoch % 4000 == 0:
                loss = np.mean(np.square(y - output))
                print(f"Epoch {epoch}, Loss:{loss}")

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

nn = NeuralNetwork(input_size=2, hidden_size=4, output_size=1)
nn.train(X, y, epochs=10000, learning_rate=0.1)

# Test the trained model
output = nn.feedforward(X)
print("Predictions after training:")
print(output)

Output:

Epoch 0, Loss:0.36270360966344145
Epoch 4000, Loss:0.005546947165311874
Epoch 8000, Loss:0.00202378766386817
Predictions after training:
[[0.02477654]
 [0.95625286]
 [0.96418129]
 [0.04729297]]
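One common way to sanity-check backpropagation code of this kind is a finite-difference gradient check. The sketch below is a standalone illustration, not part of the class above: it assumes a bias-free two-layer sigmoid network and a half-sum-of-squares loss, and the helper names `loss`, `backprop_grads` and `numeric_grad` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def loss(W1, W2, X, y):
    """Half sum of squared errors for a bias-free two-layer sigmoid network."""
    out = sigmoid(sigmoid(X @ W1) @ W2)
    return 0.5 * np.sum((y - out) ** 2)

def backprop_grads(W1, W2, X, y):
    """Analytic gradients of the loss with respect to W1 and W2 (chain rule)."""
    hidden = sigmoid(X @ W1)
    out = sigmoid(hidden @ W2)
    out_delta = (out - y) * out * (1 - out)
    hidden_delta = (out_delta @ W2.T) * hidden * (1 - hidden)
    return X.T @ hidden_delta, hidden.T @ out_delta

def numeric_grad(f, W, eps=1e-6):
    """Central finite differences, perturbing one entry of W at a time."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        old = W[idx]
        W[idx] = old + eps
        plus = f()
        W[idx] = old - eps
        minus = f()
        W[idx] = old                      # restore the original entry
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)            # fixed seed, chosen arbitrarily
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
W1 = rng.standard_normal((2, 4))
W2 = rng.standard_normal((4, 1))

g1, g2 = backprop_grads(W1, W2, X, y)
n1 = numeric_grad(lambda: loss(W1, W2, X, y), W1)
n2 = numeric_grad(lambda: loss(W1, W2, X, y), W2)
max_diff = max(np.abs(g1 - n1).max(), np.abs(g2 - n2).max())
print("largest gradient mismatch:", max_diff)
```

A mismatch many orders of magnitude smaller than the gradients themselves suggests the chain-rule code is correct. The class above adds `learning_rate` times the negative of these gradients, which is why its error is written as `y - self.predicted_output`.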