[Illustrated Machine Learning Series] 060. Feedforward Neural Networks (Feed Forward Neural Networks, FFNN)
Feedforward Neural Networks (Feed Forward Neural Networks, FFNN)
1. Definition
A feedforward neural network is the most basic artificial neural network architecture and a cornerstone of deep learning. Data enters at the input layer, passes through the hidden layers in order, and finally reaches the output layer; there are no loops or feedback connections. The "feedforward" in the name means that data flows through the network in one direction only: forward.
2. Structure
A feedforward neural network consists of three main parts:
- Input Layer:
  - Receives the features of the input data.
  - The dimensionality of the input data determines the number of nodes in the input layer.
- Hidden Layers:
  - One or more hidden layers, each containing a number of neurons (nodes).
  - Hidden layers introduce nonlinearity through activation functions, letting the network capture complex patterns in the data.
  - Adjacent layers are connected by weights and biases; the weights determine how much influence each feature has.
- Output Layer:
  - Produces the prediction.
  - The number of nodes depends on the task (see the sketch after this list):
    - Regression: 1 output node.
    - Binary classification: 1 output node (usually with a Sigmoid activation).
    - Multi-class classification: one node per class (usually with a Softmax activation).
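As a minimal Keras sketch of how these three choices look in code (the hidden-layer width of 64 and the feature/class counts are illustrative assumptions, not from the original):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

n_features, n_classes = 20, 5  # assumed input dimensionality and class count

# Regression: a single linear output node (no activation)
reg_model = Sequential([Dense(64, input_dim=n_features, activation='relu'),
                        Dense(1)])

# Binary classification: a single Sigmoid output node
bin_model = Sequential([Dense(64, input_dim=n_features, activation='relu'),
                        Dense(1, activation='sigmoid')])

# Multi-class classification: one Softmax node per class
multi_model = Sequential([Dense(64, input_dim=n_features, activation='relu'),
                          Dense(n_classes, activation='softmax')])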
3. How It Works
- Forward Propagation:
  - Input data passes through each layer in turn: each layer computes a weighted sum of its inputs and applies an activation function to produce its output (a combined NumPy sketch of all four steps follows this list).
  - Mathematically: $a^{(l)} = f(W^{(l)} a^{(l-1)} + b^{(l)})$
    where:
    - $W^{(l)}$: the weight matrix of layer $l$.
    - $a^{(l-1)}$: the activations of the previous layer.
    - $b^{(l)}$: the bias of layer $l$.
    - $f$: the activation function (e.g., ReLU, Sigmoid).
- Loss Computation:
  - The error is computed from the model's output and the true labels.
  - Common loss functions:
    - Regression: Mean Squared Error (MSE).
    - Binary classification: Binary Cross-Entropy.
    - Multi-class classification: Categorical Cross-Entropy.
- Backpropagation:
  - Gradients are computed from the loss layer by layer via the chain rule, yielding the gradient of every weight and bias.
  - Mathematically: $\frac{\partial L}{\partial W^{(l)}} = \frac{\partial L}{\partial a^{(l)}} \cdot \frac{\partial a^{(l)}}{\partial z^{(l)}} \cdot \frac{\partial z^{(l)}}{\partial W^{(l)}}$, where $z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}$ is the pre-activation of layer $l$.
- Parameter Update:
  - An optimization algorithm (e.g., gradient descent, Adam) updates the weights and biases to minimize the loss; for plain gradient descent, $W^{(l)} \leftarrow W^{(l)} - \eta \, \frac{\partial L}{\partial W^{(l)}}$ (see the sketch after this list).
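Putting the four steps together, here is a minimal NumPy sketch of one training step for a single-layer network with a sigmoid output and binary cross-entropy loss (the data, shapes, and learning rate are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=(20,))                # one sample with 20 features
y = 1.0                                   # its true label
W = rng.normal(size=(20,)) * 0.1          # weights W^{(1)}
b = 0.0                                   # bias b^{(1)}
lr = 0.1                                  # learning rate (eta)

# 1. Forward propagation: a = f(Wx + b) with f = sigmoid
z = W @ x + b
a = 1 / (1 + np.exp(-z))

# 2. Loss computation: binary cross-entropy
loss = -(y * np.log(a) + (1 - y) * np.log(1 - a))

# 3. Backpropagation via the chain rule; for sigmoid + BCE, dL/dz = a - y
dz = a - y
dW = dz * x                               # dL/dW = dL/dz * dz/dW
db = dz

# 4. Parameter update: plain gradient descent
W -= lr * dW
b -= lr * db
print(loss, a)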
4. Activation Functions
The activation functions in the hidden and output layers determine the network's capacity for nonlinearity. Common activation functions include:
- Sigmoid:
  - Squashes outputs into the range (0, 1).
  - Drawback: prone to vanishing gradients.
- ReLU (Rectified Linear Unit):
  - Nonlinear, computationally cheap, and mitigates the vanishing-gradient problem.
  - Drawback: can produce "dead neurons" that stop updating.
- Softmax:
  - Commonly used in the output layer for multi-class tasks.
  - Converts the raw outputs into a probability distribution (see the sketch after this list).
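Sigmoid appeared in the training-step sketch above; Softmax can be written as a short NumPy function (a minimal sketch, with illustrative input values):

import numpy as np

def softmax(z):
    # Subtracting the max keeps exp() from overflowing; the result is unchanged.
    e = np.exp(z - np.max(z))
    return e / e.sum()                    # entries are positive and sum to 1

print(softmax(np.array([2.0, 1.0, 0.1])))  # roughly [0.659 0.242 0.099]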
5. Optimization Algorithms
Optimization algorithms commonly used to train feedforward neural networks:
- Gradient Descent:
  - Computes the gradient over the full dataset for each parameter update.
  - Drawback: computationally expensive.
- Stochastic Gradient Descent (SGD):
  - Each update is based on the gradient of a single sample.
  - Converges relatively slowly, since single-sample gradients are noisy.
- Adam (Adaptive Moment Estimation):
  - Combines momentum with per-parameter adaptive learning rates.
  - Advantages: fast convergence; well suited to sparse data (see the sketch after this list).
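A minimal NumPy sketch of the Adam update rule (the hyperparameters are the commonly cited defaults; the gradient here is a placeholder):

import numpy as np

def adam_step(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: w = parameters, g = gradient, m/v = running
    first/second moment estimates, t = step count (starting at 1)."""
    m = beta1 * m + (1 - beta1) * g          # momentum (first moment)
    v = beta2 * v + (1 - beta2) * g**2       # adaptive scale (second moment)
    m_hat = m / (1 - beta1**t)               # bias correction
    v_hat = v / (1 - beta2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w = np.zeros(3)
m, v = np.zeros(3), np.zeros(3)
for t in range(1, 4):                        # three steps with a dummy gradient
    g = np.array([0.1, -0.2, 0.3])
    w, m, v = adam_step(w, g, m, v, t)
print(w)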
6. Applications
- Regression: e.g., house-price prediction, stock forecasting.
- Classification: e.g., image classification, text classification, spam detection.
- Feature extraction: dimensionality reduction or automatic feature generation.
7. Example Code
The following code demonstrates a simple feedforward neural network for a binary classification task:
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
# Generate sample data
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Build the feedforward neural network
model = Sequential()
model.add(Dense(64, input_dim=X_train.shape[1], activation='relu'))  # hidden layer 1
model.add(Dense(32, activation='relu'))  # hidden layer 2
model.add(Dense(1, activation='sigmoid'))  # output layer
# Compile the model
model.compile(optimizer=Adam(learning_rate=0.001),
loss='binary_crossentropy',
metrics=['accuracy'])
# Train the model
model.fit(X_train, y_train, epochs=50, batch_size=32, verbose=1)
# Evaluate the model
y_pred = (model.predict(X_test) > 0.5).astype(int)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
Output:
Epoch 1/50
25/25 [==============================] - 1s 1ms/step - loss: 0.6359 - accuracy: 0.6488
Epoch 2/50
25/25 [==============================] - 0s 1ms/step - loss: 0.4942 - accuracy: 0.8213
Epoch 3/50
25/25 [==============================] - 0s 1ms/step - loss: 0.4020 - accuracy: 0.8537
Epoch 4/50
25/25 [==============================] - 0s 1ms/step - loss: 0.3506 - accuracy: 0.8750
Epoch 5/50
25/25 [==============================] - 0s 1ms/step - loss: 0.3187 - accuracy: 0.8913
Epoch 6/50
25/25 [==============================] - 0s 1ms/step - loss: 0.2985 - accuracy: 0.8900
Epoch 7/50
25/25 [==============================] - 0s 1ms/step - loss: 0.2839 - accuracy: 0.8975
Epoch 8/50
25/25 [==============================] - 0s 1ms/step - loss: 0.2724 - accuracy: 0.8963
Epoch 9/50
25/25 [==============================] - 0s 1ms/step - loss: 0.2619 - accuracy: 0.9038
Epoch 10/50
25/25 [==============================] - 0s 1ms/step - loss: 0.2516 - accuracy: 0.9062
Epoch 11/50
25/25 [==============================] - 0s 1ms/step - loss: 0.2439 - accuracy: 0.9087
Epoch 12/50
25/25 [==============================] - 0s 1ms/step - loss: 0.2354 - accuracy: 0.9125
Epoch 13/50
25/25 [==============================] - 0s 1ms/step - loss: 0.2271 - accuracy: 0.9162
Epoch 14/50
25/25 [==============================] - 0s 2ms/step - loss: 0.2210 - accuracy: 0.9200
Epoch 15/50
25/25 [==============================] - 0s 2ms/step - loss: 0.2118 - accuracy: 0.9212
Epoch 16/50
25/25 [==============================] - 0s 1ms/step - loss: 0.2043 - accuracy: 0.9262
Epoch 17/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1983 - accuracy: 0.9337
Epoch 18/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1898 - accuracy: 0.9362
Epoch 19/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1829 - accuracy: 0.9400
Epoch 20/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1761 - accuracy: 0.9450
Epoch 21/50
25/25 [==============================] - 0s 2ms/step - loss: 0.1708 - accuracy: 0.9513
Epoch 22/50
25/25 [==============================] - 0s 2ms/step - loss: 0.1647 - accuracy: 0.9525
Epoch 23/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1577 - accuracy: 0.9500
Epoch 24/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1504 - accuracy: 0.9563
Epoch 25/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1447 - accuracy: 0.9575
Epoch 26/50
25/25 [==============================] - 0s 2ms/step - loss: 0.1401 - accuracy: 0.9600
Epoch 27/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1342 - accuracy: 0.9550
Epoch 28/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1306 - accuracy: 0.9575
Epoch 29/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1231 - accuracy: 0.9650
Epoch 30/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1179 - accuracy: 0.9675
Epoch 31/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1135 - accuracy: 0.9688
Epoch 32/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1077 - accuracy: 0.9688
Epoch 33/50
25/25 [==============================] - 0s 1ms/step - loss: 0.1033 - accuracy: 0.9725
Epoch 34/50
25/25 [==============================] - 0s 3ms/step - loss: 0.0980 - accuracy: 0.9737
Epoch 35/50
25/25 [==============================] - 0s 2ms/step - loss: 0.0934 - accuracy: 0.9787
Epoch 36/50
25/25 [==============================] - 0s 1ms/step - loss: 0.0891 - accuracy: 0.9800
Epoch 37/50
25/25 [==============================] - 0s 1ms/step - loss: 0.0848 - accuracy: 0.9812
Epoch 38/50
25/25 [==============================] - 0s 2ms/step - loss: 0.0817 - accuracy: 0.9800
Epoch 39/50
25/25 [==============================] - 0s 3ms/step - loss: 0.0760 - accuracy: 0.9837
Epoch 40/50
25/25 [==============================] - 0s 1ms/step - loss: 0.0724 - accuracy: 0.9875
Epoch 41/50
25/25 [==============================] - 0s 1ms/step - loss: 0.0686 - accuracy: 0.9862
Epoch 42/50
25/25 [==============================] - 0s 1ms/step - loss: 0.0643 - accuracy: 0.9850
Epoch 43/50
25/25 [==============================] - 0s 1ms/step - loss: 0.0613 - accuracy: 0.9912
Epoch 44/50
25/25 [==============================] - 0s 1ms/step - loss: 0.0589 - accuracy: 0.9912
Epoch 45/50
25/25 [==============================] - 0s 1ms/step - loss: 0.0542 - accuracy: 0.9937
Epoch 46/50
25/25 [==============================] - 0s 2ms/step - loss: 0.0515 - accuracy: 0.9937
Epoch 47/50
25/25 [==============================] - 0s 1ms/step - loss: 0.0482 - accuracy: 0.9950
Epoch 48/50
25/25 [==============================] - 0s 1ms/step - loss: 0.0467 - accuracy: 0.9950
Epoch 49/50
25/25 [==============================] - 0s 1ms/step - loss: 0.0429 - accuracy: 0.9962
Epoch 50/50
25/25 [==============================] - 0s 1ms/step - loss: 0.0399 - accuracy: 0.9962
7/7 [==============================] - 0s 997us/step
Accuracy: 0.82
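Note the gap between the final training accuracy (about 0.996) and the test accuracy (0.82): the model has overfit the training set. A quick way to watch for this is to hold out validation data during training, e.g. model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.1), and to stop early or regularize when the validation metrics plateau.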
8. Summary
The feedforward neural network is the most fundamental architecture in deep learning. Although its principle is simple, stacking multiple hidden layers allows it to capture complex patterns. Combined with suitable optimization algorithms and activation functions, feedforward networks perform well on a wide range of tasks and are an important foundation for understanding other neural networks, such as convolutional and recurrent neural networks.