
Deep Learning Basics, Case 5: VGG16 Face Recognition (Experiencing the Pain and Joy of Learning)

  • 🍨 This post is a study log from the 🔗365天深度学习训练营 (365-Day Deep Learning Training Camp)
  • 🍖 Original author: K同学啊

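Preface

  • My own target this time was 60% accuracy, but the model settled very stably at 40% 😢😢😢😢;
  • From last weekend until now, going from the initial 13% to the current 40%, I learned a lot on my own. I went through the cycle of hope turning into helplessness and helplessness turning back into hope: reading papers, changing the model, running it, and validating again and again. Along the way I thought of many things to write down, but now that I am actually writing, the words will not come 🤠🤠🤠🤠;
  • School has been busy lately and this week's assignment is almost due, so I will keep optimizing later. Suggestions from more experienced readers are very welcome. 😢😢😢😢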

Goal

Reach 20% accuracy on the test set

Results

Test accuracy is stable at 40%+
Did not reach 60%+ 😢😢😢😢😢😢😢😢😢😢😢😢

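Contents

  • 1. Model Overview
  • 2. Model Training and Optimization
    • 1. Loading the Data
      • 1. Importing Libraries
      • 2. Inspecting the Classes
      • 3. Displaying Sample Images
      • 4. Data Preprocessing
      • 5. Splitting and Loading the Data
    • 2. Building the VGG-16 Model
    • 3. Model Training
      • 1. Training Function
      • 2. Test Function
      • 3. Setting a Dynamic Learning Rate
      • 4. Running the Training
    • 4. Results
    • 5. Prediction
    • 6. Starting to Optimize
    • 7. Optimization 1
    • 8. Optimization 2
  • 3. Summary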

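1. Model Overview

VGG16 is a classic model that was once widely used for image classification and is still popular. It consists of 13 convolutional layers and 3 fully connected layers, with 5 max-pooling layers in between. This post uses VGG16 for face recognition (the dataset for this case has 1,800 images).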

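The VGG16 architecture is shown below:

[Image: VGG16 architecture diagram]

For this case, the architecture of the final optimized model is shown below (screenshot from a paper):

[Image: architecture of the final optimized model]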
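To make the layer counts concrete, the following is a minimal sketch that walks through the standard VGG16 layout in plain Python. The names `VGG16_CFG` and `summarize` are made up for this illustration; the list mirrors the channel layout of the model printed later in this post, and it assumes a 224x224 input.

```python
# VGG16 layout: numbers are conv output channels, "M" is a 2x2 max-pool.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def summarize(cfg, input_size=224, num_fc_layers=3):
    """Count layers and track the feature-map size through the conv stack."""
    num_conv = sum(1 for item in cfg if item != "M")
    num_pool = sum(1 for item in cfg if item == "M")
    size = input_size
    channels = 3
    for item in cfg:
        if item == "M":
            size //= 2          # each 2x2, stride-2 pool halves height and width
        else:
            channels = item     # a 3x3 conv with padding=1 keeps the spatial size
    return {
        "conv_layers": num_conv,
        "pool_layers": num_pool,
        "weight_layers": num_conv + num_fc_layers,
        "feature_map": (channels, size, size),
        "flattened": channels * size * size,
    }


summary = summarize(VGG16_CFG)
print(summary)
```

The "16" in the name counts the layers that carry weights (13 convolutional plus 3 fully connected), and the flattened size is the `in_features` of the first fully connected layer.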

2. Model Training and Optimization

1. Loading the Data

1. Importing Libraries

import torch 
import numpy as np 
import torch.nn as nn
import torchvision 
import warnings
import os, PIL, pathlib 

warnings.filterwarnings("ignore")             # suppress warning messages

device = ('cuda' if torch.cuda.is_available() else 'cpu')
device

Output:

'cuda'

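2. Inspecting the Classes

Get the class names from the subfolders of the data directory:

data_dir = './data/'
data_dir = pathlib.Path(data_dir)

# Use Path.name instead of splitting on "\\" so this also works outside Windows;
# sorting matches the class order that ImageFolder uses later
classnames = sorted(path.name for path in data_dir.iterdir() if path.is_dir())
classnames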

Output:

['Angelina Jolie',
 'Brad Pitt',
 'Denzel Washington',
 'Hugh Jackman',
 'Jennifer Lawrence',
 'Johnny Depp',
 'Kate Winslet',
 'Leonardo DiCaprio',
 'Megan Fox',
 'Natalie Portman',
 'Nicole Kidman',
 'Robert Downey Jr',
 'Sandra Bullock',
 'Scarlett Johansson',
 'Tom Cruise',
 'Tom Hanks',
 'Will Smith']

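3. Displaying Sample Images

import matplotlib.pyplot as plt 
from PIL import Image

# Folder to preview (the name must match the class folder exactly on case-sensitive systems)
data_look_dir = './data/Angelina Jolie/'
# Collect the image file names
data_path_list = [f for f in os.listdir(data_look_dir) if f.endswith(('jpg', 'png'))]

fig, axes = plt.subplots(2, 8, figsize=(16, 6))  # fig: the figure, axes: the subplots

# Show the first few images
for ax, img_file in zip(axes.flat, data_path_list):
    path_name = os.path.join(data_look_dir, img_file)  # build the full file path
    img = Image.open(path_name)         # open the image
    ax.imshow(img)
    ax.axis('off')
    
plt.show()

[Image: sample photos from the Angelina Jolie folder]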

4. Data Preprocessing

from torchvision import transforms, datasets 

# Preprocess every image into the same format
data_transform = transforms.Compose([
    transforms.Resize([224, 224]),      # VGG16 expects 224x224 inputs
    transforms.ToTensor(),              # PIL image -> float tensor in [0, 1]
    transforms.Normalize(               # per-channel normalization
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225] 
    )
])

data_all = './data/'

total_data = datasets.ImageFolder(root=data_all, transform=data_transform)

5. Splitting and Loading the Data

# Split the dataset into a training set and a test set (80/20)
train_size = int(len(total_data) * 0.8)
test_size = len(total_data) - train_size
train_data, test_data = torch.utils.data.random_split(total_data, [train_size, test_size])
train_data, test_data

Output:

(<torch.utils.data.dataset.Subset at 0x2bd8b922fd0>,
 <torch.utils.data.dataset.Subset at 0x2bd8b922ac0>)

# Load the data in batches
batch_size = 32 

train_dl = torch.utils.data.DataLoader(train_data,
                                       batch_size=batch_size,
                                       shuffle=True,
                                       num_workers=4)

test_dl = torch.utils.data.DataLoader(test_data,
                                      batch_size=batch_size,
                                      shuffle=True,
                                      num_workers=4)

# Check the shape of one processed batch (images first, then labels)
for imgs, labels in test_dl:
    print("format: ", imgs.shape)
    print("data: ", labels)
    break 

Output:

format:  torch.Size([32, 3, 224, 224])
data:  tensor([14, 12,  4, 13,  6,  3, 13,  4, 16,  0,  2,  4,  5,  6,  7,  2,  6,  0,
        14, 10, 13,  3,  8,  7, 10,  4, 12,  0,  4, 14,  3, 15])
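`random_split` draws a new random partition on every run, which makes accuracy numbers hard to compare between experiments. One option, sketched below, is to pass a seeded generator so the same images always land in the test set. The seed value 42 and the `range(100)` stand-in dataset are assumptions for illustration; in this post the first argument would be `total_data`.

```python
import torch
from torch.utils.data import random_split

# Stand-in for `total_data`: any object with a length works for illustration.
dataset = range(100)
train_size = int(len(dataset) * 0.8)
test_size = len(dataset) - train_size


def split(seed):
    """Split the dataset 80/20 using a generator with a fixed seed."""
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [train_size, test_size], generator=generator)


train_a, test_a = split(seed=42)
train_b, test_b = split(seed=42)

print(len(train_a), len(test_a))
print(train_a.indices == train_b.indices)  # same seed, same partition
```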

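2. Building the VGG-16 Model

from torchvision.models import vgg16

# Build VGG16 without pretrained weights, so every layer starts from random values
model = vgg16(pretrained=False).to(device)

# Note: freezing the backbone is only useful with pretrained weights (pretrained=True),
# and the attribute that controls it is `requires_grad`. This run trains the whole
# randomly initialized network, so no parameters are frozen.

# Replace the last fully connected layer so it outputs one score per class
model.classifier[6] = nn.Linear(4096, len(classnames))
model.to(device)
model

Output: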

VGG(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): ReLU(inplace=True)
    (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): ReLU(inplace=True)
    (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (13): ReLU(inplace=True)
    (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (15): ReLU(inplace=True)
    (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (18): ReLU(inplace=True)
    (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (20): ReLU(inplace=True)
    (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (22): ReLU(inplace=True)
    (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (25): ReLU(inplace=True)
    (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (27): ReLU(inplace=True)
    (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (29): ReLU(inplace=True)
    (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
  (classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=17, bias=True)
  )
)
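A note on freezing layers: PyTorch decides whether a parameter is updated through its `requires_grad` attribute. A misspelled name such as `required_grad` is accepted silently, because Python simply attaches a new attribute to the object, and nothing gets frozen. The sketch below shows the difference on a tiny stand-in network (not VGG16 itself); the variable names are made up for this illustration.

```python
import torch.nn as nn

# Tiny stand-in for a backbone plus classifier head (not VGG16 itself).
net = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))

# Misspelled attribute: Python silently creates a new attribute on each
# parameter, and autograd keeps tracking all of them.
for param in net.parameters():
    param.required_grad = False
trainable_after_typo = sum(p.requires_grad for p in net.parameters())

# Correct attribute: freeze the first layer only, keep the head trainable.
for param in net[0].parameters():
    param.requires_grad = False
trainable_after_fix = sum(p.requires_grad for p in net.parameters())

print(trainable_after_typo, trainable_after_fix)
```

Counting the parameters that still have `requires_grad` set, as done here, is a quick way to check that a freeze actually took effect before starting a long training run.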

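3. Model Training

1. Training Function

def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)   # number of samples
    num_batches = len(dataloader)    # number of batches
    
    train_acc, train_loss = 0, 0
    
    # Iterate over the batches
    for X, y in dataloader:
        X, y = X.to(device), y.to(device)
        
        # Forward pass
        pred = model(X)
        loss = loss_fn(pred, y)
        
        # Backward pass and parameter update
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        # Accumulate loss and number of correct predictions
        train_loss += loss.detach()  # detach so the running total carries no autograd graph
        train_acc += (pred.argmax(1) == y).type(torch.float64).sum().item()
    
    # Average over samples (accuracy) and over batches (loss)
    train_acc /= size
    train_loss /= num_batches
    
    return train_acc, train_loss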

2. Test Function

def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)   # number of samples
    num_batches = len(dataloader)    # number of batches
    
    test_acc, test_loss = 0, 0 
    
    # No gradients are needed for evaluation
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            
            pred = model(X)
            loss = loss_fn(pred, y)
            
            test_loss += loss.item()
            test_acc += (pred.argmax(1) == y).type(torch.float64).sum().item()
            
    test_acc /= size 
    test_loss /= num_batches
    
    return test_acc, test_loss

3. Setting a Dynamic Learning Rate

learn_rate = 1e-4 
loss_fn = nn.CrossEntropyLoss()
func = lambda epoch : (0.92 ** (epoch // 2))
optimizer = torch.optim.SGD(model.parameters(), lr=learn_rate)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=func)
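`LambdaLR` multiplies the base learning rate by the value of the lambda, so here the rate is scaled by 0.92 every two scheduler steps. The plain-Python sketch below reproduces the schedule without PyTorch; the helper name `lr_after_epoch` is made up for this illustration, and it assumes `scheduler.step()` is called once per epoch.

```python
learn_rate = 1e-4


def lr_after_epoch(epoch, base_lr=learn_rate, gamma=0.92, step=2):
    """Learning rate in effect after scheduler.step() has been called `epoch` times."""
    return base_lr * gamma ** (epoch // step)


# One value per epoch of a 40-epoch run.
schedule = [lr_after_epoch(e) for e in range(1, 41)]
for epoch in (1, 2, 3, 4, 40):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch - 1]:.2E}")
```

Over 40 epochs the rate drops from 1e-4 to about 1.9e-5, so the schedule shrinks an already small SGD step size by roughly a factor of five.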

4. Running the Training

import copy 

train_acc = []
train_loss = []
test_acc = []
test_loss = []

epoches = 40

best_acc = 0   # best test accuracy so far

for epoch in range(epoches):
    
    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, optimizer)
    
    # Update the learning rate
    scheduler.step()
    
    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)
    
    # Keep a copy of the model with the best test accuracy
    if epoch_test_acc > best_acc:
        best_acc = epoch_test_acc
        best_model = copy.deepcopy(model)
        
    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)
    
    # Read the current learning rate
    lr = optimizer.state_dict()['param_groups'][0]['lr']
    
    template = ('Epoch:{:2d}, Train_acc:{:.1f}%, Train_loss:{:.3f}, Test_acc:{:.1f}%, Test_loss:{:.3f}, Lr:{:.2E}')
    print(template.format(epoch+1, epoch_train_acc*100, epoch_train_loss,  epoch_test_acc*100, epoch_test_loss, lr))
    
# Save the weights of the best model (not the last one) to a file
path = './best_model.pth'
torch.save(best_model.state_dict(), path)

Output:
Epoch: 1, Train_acc:5.3%, Train_loss:2.835, Test_acc:5.0%, Test_loss:2.838, Lr:1.00E-04
Epoch: 2, Train_acc:6.0%, Train_loss:2.838, Test_acc:4.4%, Test_loss:2.835, Lr:9.20E-05
Epoch: 3, Train_acc:6.0%, Train_loss:2.837, Test_acc:4.4%, Test_loss:2.835, Lr:9.20E-05
Epoch: 4, Train_acc:5.5%, Train_loss:2.837, Test_acc:4.4%, Test_loss:2.835, Lr:8.46E-05
Epoch: 5, Train_acc:5.1%, Train_loss:2.837, Test_acc:4.4%, Test_loss:2.834, Lr:8.46E-05
Epoch: 6, Train_acc:6.6%, Train_loss:2.833, Test_acc:3.9%, Test_loss:2.834, Lr:7.79E-05
Epoch: 7, Train_acc:6.2%, Train_loss:2.836, Test_acc:3.9%, Test_loss:2.831, Lr:7.79E-05
Epoch: 8, Train_acc:6.8%, Train_loss:2.829, Test_acc:3.9%, Test_loss:2.834, Lr:7.16E-05
Epoch: 9, Train_acc:6.4%, Train_loss:2.834, Test_acc:3.9%, Test_loss:2.832, Lr:7.16E-05
Epoch:10, Train_acc:6.7%, Train_loss:2.833, Test_acc:3.9%, Test_loss:2.832, Lr:6.59E-05
Epoch:11, Train_acc:5.6%, Train_loss:2.833, Test_acc:3.9%, Test_loss:2.832, Lr:6.59E-05
Epoch:12, Train_acc:5.8%, Train_loss:2.835, Test_acc:3.9%, Test_loss:2.832, Lr:6.06E-05
Epoch:13, Train_acc:6.2%, Train_loss:2.832, Test_acc:4.2%, Test_loss:2.830, Lr:6.06E-05
Epoch:14, Train_acc:6.7%, Train_loss:2.832, Test_acc:4.2%, Test_loss:2.830, Lr:5.58E-05
Epoch:15, Train_acc:6.2%, Train_loss:2.834, Test_acc:4.2%, Test_loss:2.831, Lr:5.58E-05
Epoch:16, Train_acc:7.4%, Train_loss:2.828, Test_acc:5.0%, Test_loss:2.828, Lr:5.13E-05
Epoch:17, Train_acc:6.9%, Train_loss:2.834, Test_acc:6.1%, Test_loss:2.831, Lr:5.13E-05
Epoch:18, Train_acc:7.1%, Train_loss:2.834, Test_acc:8.1%, Test_loss:2.829, Lr:4.72E-05
Epoch:19, Train_acc:6.7%, Train_loss:2.833, Test_acc:8.9%, Test_loss:2.829, Lr:4.72E-05
Epoch:20, Train_acc:7.5%, Train_loss:2.829, Test_acc:9.2%, Test_loss:2.827, Lr:4.34E-05
Epoch:21, Train_acc:8.3%, Train_loss:2.831, Test_acc:11.1%, Test_loss:2.828, Lr:4.34E-05
Epoch:22, Train_acc:6.2%, Train_loss:2.830, Test_acc:11.7%, Test_loss:2.828, Lr:4.00E-05
Epoch:23, Train_acc:8.5%, Train_loss:2.828, Test_acc:12.5%, Test_loss:2.829, Lr:4.00E-05
Epoch:24, Train_acc:7.3%, Train_loss:2.830, Test_acc:11.9%, Test_loss:2.827, Lr:3.68E-05
Epoch:25, Train_acc:8.4%, Train_loss:2.827, Test_acc:11.9%, Test_loss:2.828, Lr:3.68E-05
Epoch:26, Train_acc:8.5%, Train_loss:2.830, Test_acc:12.8%, Test_loss:2.828, Lr:3.38E-05
Epoch:27, Train_acc:6.9%, Train_loss:2.831, Test_acc:13.1%, Test_loss:2.828, Lr:3.38E-05
Epoch:28, Train_acc:7.8%, Train_loss:2.828, Test_acc:13.1%, Test_loss:2.828, Lr:3.11E-05
Epoch:29, Train_acc:7.8%, Train_loss:2.826, Test_acc:13.1%, Test_loss:2.828, Lr:3.11E-05
Epoch:30, Train_acc:6.6%, Train_loss:2.831, Test_acc:13.1%, Test_loss:2.827, Lr:2.86E-05
Epoch:31, Train_acc:8.4%, Train_loss:2.828, Test_acc:13.1%, Test_loss:2.828, Lr:2.86E-05
Epoch:32, Train_acc:9.4%, Train_loss:2.828, Test_acc:13.1%, Test_loss:2.829, Lr:2.63E-05
Epoch:33, Train_acc:9.1%, Train_loss:2.829, Test_acc:13.1%, Test_loss:2.826, Lr:2.63E-05
Epoch:34, Train_acc:8.2%, Train_loss:2.828, Test_acc:13.1%, Test_loss:2.826, Lr:2.42E-05
Epoch:35, Train_acc:9.8%, Train_loss:2.829, Test_acc:13.1%, Test_loss:2.827, Lr:2.42E-05
Epoch:36, Train_acc:7.9%, Train_loss:2.827, Test_acc:13.1%, Test_loss:2.826, Lr:2.23E-05
Epoch:37, Train_acc:9.4%, Train_loss:2.827, Test_acc:13.1%, Test_loss:2.825, Lr:2.23E-05
Epoch:38, Train_acc:8.2%, Train_loss:2.829, Test_acc:13.1%, Test_loss:2.826, Lr:2.05E-05
Epoch:39, Train_acc:8.8%, Train_loss:2.827, Test_acc:13.1%, Test_loss:2.827, Lr:2.05E-05
Epoch:40, Train_acc:7.6%, Train_loss:2.826, Test_acc:13.1%, Test_loss:2.824, Lr:1.89E-05

The per-epoch training losses collected above are tensors that may still live on the GPU, because `train` accumulates the loss as a tensor rather than with `loss.item()`. They need to be converted before plotting:

# Convert every entry to a Python float, whether it is a tensor or already a number
train_loss = np.array([t.detach().cpu().item() if isinstance(t, torch.Tensor) else float(t)
                       for t in train_loss])

4. Results

import matplotlib.pyplot as plt
# Hide warnings
import warnings
warnings.filterwarnings("ignore")               # suppress warning messages
plt.rcParams['font.sans-serif']    = ['SimHei'] # font that can render Chinese labels
plt.rcParams['axes.unicode_minus'] = False      # render the minus sign correctly
plt.rcParams['figure.dpi']         = 100        # resolution


plt.figure(figsize=(12, 3))

epoch_range = range(epoches)


plt.subplot(1, 2, 1)
plt.plot(epoch_range, train_acc, label='Train Accuracy')
plt.plot(epoch_range, test_acc, label='Test Accuracy')
plt.title('Accuracy')
plt.legend(loc='lower right')

plt.subplot(1, 2, 2)
plt.plot(epoch_range, train_loss, label='Train Loss')
plt.plot(epoch_range, test_loss, label='Test Loss')
plt.title('Loss')
plt.legend(loc='upper right')

plt.show()

[Image: training and test accuracy and loss curves]

5. Prediction

from PIL import Image 

classes = list(total_data.class_to_idx) 

def predict_one_image(image_path, model, transform, classes):
    test_img = Image.open(image_path).convert('RGB')  # open as RGB
    plt.imshow(test_img)  # show the image being predicted
    
    # Apply the same preprocessing that was used for training
    test_img = transform(test_img)
    img = test_img.to(device).unsqueeze(0)  # add a batch dimension
    
    model.eval()
    with torch.no_grad():
        output = model(img)
    
    _, pred = torch.max(output, 1)
    pred_class = classes[pred.item()]
    
    print('Prediction:', pred_class)

predict_one_image('./data/Angelina Jolie/001_fe3347c0.jpg', model, data_transform, classes)

Output:

Prediction: Scarlett Johansson

[Image: the test photo that was passed to the model]

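6. Starting to Optimize

Summary of this experiment

  • Accuracy:

    • Training set: extremely low at first, and it barely changes afterwards
    • Test set: it does trend upward gradually, but it stays very low
  • Loss:

    • Training and test loss behave almost identically: both decrease gradually, but by very little (less than 0.1), so the result is poor
  • Overall: calling the stock VGG model directly gives extremely poor results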
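One way to read the numbers in the training log is to compare them with what a model that has learned nothing would score. The sketch below computes that baseline for the 17 classes in this dataset; it assumes the classes are roughly balanced, which the post does not state.

```python
import math

num_classes = 17

# Accuracy of guessing uniformly at random among the classes.
chance_accuracy = 1 / num_classes

# Cross-entropy when the model assigns probability 1/num_classes to every class.
uniform_loss = -math.log(1 / num_classes)

print(f"chance accuracy: {chance_accuracy:.1%}")
print(f"uniform-prediction loss: {uniform_loss:.3f}")
```

The uniform-prediction loss is ln(17), about 2.833, and the logged losses sit between roughly 2.82 and 2.84 for all 40 epochs. That suggests the network's outputs stayed close to uniform and it had barely started to learn, which fits a randomly initialized VGG16 trained with plain SGD at a learning rate of 1e-4.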

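7. Optimization 1

After reading papers and related blog posts, I made the following changes to the model above:

  • Fully connected layers: a much smaller classifier, with far fewer neurons per layer (1024 and 512 instead of 4096) and Dropout after each hidden layer. The modified code:

    model.classifier = nn.Sequential(
        nn.Linear(512 * 7 * 7, 1024),
        nn.ReLU(inplace=True),
        nn.Dropout(0.5, inplace=False),
        nn.Linear(1024, 512),
        nn.ReLU(inplace=True),
        nn.Dropout(0.5),
        nn.Linear(512, len(classnames)))  # the last layer outputs one score per target class

  • Learning rate: because accuracy often stopped changing during training, I changed the initial learning rate from 1e-4 to 1e-3 (a tenfold increase) and changed the decay factor to 0.88 ** (epoch // 2)

  • Optimizer: the papers I read suggested that the choice of optimizer also changes the result, and my own runs confirmed it, so I switched from SGD to Adam

  • Data augmentation: because the dataset is small, the images are randomly rotated to augment the data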

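Finally, after training, the results are shown below:

[Image: accuracy and loss curves after Optimization 1]

  • This is much better than the previous model: training accuracy reached 40%. Later on, however, training accuracy kept climbing to above 90% while test accuracy barely moved, and the test loss started to rise (a typical sign of overfitting), so the model is unstable.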

8. Optimization 2

I kept reading papers and eventually rewrote the whole model: I added BN layers, switched the activation function to LeakyReLU, and slimmed down the fully connected layers. The model then improved steadily. Training and test accuracy rise slowly and then plateau, but the result is much better than Optimization 1. The result plot:

[Image: accuracy and loss curves after Optimization 2]

Optimized code

class Work_Net(nn.Module):
    def __init__(self):
        super(Work_Net, self).__init__()
        
        # Block 1 (every conv in every block follows the same pattern)
        self.block1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),                 # change 1: BN layer added
            nn.LeakyReLU(negative_slope=0.01),  # change 2: LeakyReLU instead of ReLU
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.LeakyReLU(negative_slope=0.1),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        
        # Block 2
        self.block2 = nn.Sequential(
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(negative_slope=0.01),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(negative_slope=0.01),
            nn.MaxPool2d(kernel_size=2, stride=2)    
        )

        # Block 3
        self.block3 = nn.Sequential(
            nn.Conv2d(128, 256, kernel_size=3, padding=1),
            nn.BatchNorm2d(256),
            nn.LeakyReLU(negative_slope=0.01),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.BatchNorm2d(256),
            nn.LeakyReLU(negative_slope=0.01),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.BatchNorm2d(256),
            nn.LeakyReLU(negative_slope=0.01),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )

        # Block 4
        self.block4 = nn.Sequential(
            nn.Conv2d(256, 512, kernel_size=3, padding=1),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(negative_slope=0.01),
            nn.Conv2d(512, 512, kernel_size=3, padding=1),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(negative_slope=0.01),
            nn.Conv2d(512, 512, kernel_size=3, padding=1),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(negative_slope=0.01),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )

        # Block 5
        self.block5 = nn.Sequential(
            nn.Conv2d(512, 512, kernel_size=3, padding=1),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(negative_slope=0.01),
            nn.Conv2d(512, 512, kernel_size=3, padding=1),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(negative_slope=0.01),
            nn.Conv2d(512, 512, kernel_size=3, padding=1),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(negative_slope=0.01),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        
        # Classifier (fully connected layers)
        self.classifier = nn.Sequential(
            nn.Linear(512 * 7 * 7, 1024),
            nn.LeakyReLU(negative_slope=0.01),
            nn.Dropout(0.5),
            nn.Linear(1024, 512),
            nn.LeakyReLU(negative_slope=0.01),
            nn.Dropout(0.5),
            nn.Linear(512, len(classnames))
        )
        
    def forward(self, x):
        x = self.block1(x)
        x = self.block2(x)
        x = self.block3(x)
        x = self.block4(x)
        x = self.block5(x)
        x = x.view(x.size(0), -1)   # flatten to (batch, 512 * 7 * 7)
        x = self.classifier(x)
        
        return x
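The five blocks of `Work_Net` repeat one pattern, so they can also be generated by a small helper. The sketch below is one possible way to do that, not the code used for the results in this post: `conv_block` is a hypothetical helper, and it uses a slope of 0.01 everywhere, whereas the class above uses 0.1 for one activation in block 1. It also checks that a 224x224 input really produces the 512x7x7 feature map that the classifier expects.

```python
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch, num_convs, negative_slope=0.01):
    """num_convs x (Conv3x3 -> BatchNorm -> LeakyReLU), then a 2x2 max-pool."""
    layers = []
    for i in range(num_convs):
        layers += [
            nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(negative_slope=negative_slope),
        ]
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)


# Same channel layout as the five blocks of Work_Net.
features = nn.Sequential(
    conv_block(3, 64, 2),
    conv_block(64, 128, 2),
    conv_block(128, 256, 3),
    conv_block(256, 512, 3),
    conv_block(512, 512, 3),
)

# Run a dummy batch through the conv stack to confirm the output shape.
features.eval()
with torch.no_grad():
    out = features(torch.zeros(1, 3, 224, 224))
print(out.shape)
```

A shape check like this is cheap and catches a mismatch between the conv stack and the first `nn.Linear` before any training time is spent.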

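3. Summary

Although I did not reach my target this time (60% test accuracy), I now have a much clearer picture of several deep learning concepts:

  • Fully connected layers: they take the feature maps produced by the convolutional layers, flatten them, and classify them into the target classes. When the flattened input is very large, reducing the number of neurons and adding Dropout helps prevent overfitting, and helps avoid the situation where training or test accuracy and loss stop changing;
  • Activation function: ReLU is the usual choice, but with little data LeakyReLU is also worth trying, because it keeps a small gradient for negative inputs instead of zeroing them out;
  • Optimizer: different optimizers can give different results, so the choice should be based on how they actually perform on the data;
  • Learning rate: a learning rate that is too large converges too quickly and fails to extract useful information, while one that is too small easily leaves training and test accuracy or loss stuck;
  • Data augmentation: when data is scarce, it can be augmented by adding random transforms (such as random rotation) to the transforms.Compose pipeline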

http://www.kler.cn/a/316460.html
