当前位置: 首页 > article >正文

YOLO11改进-模块-引入CMUNeXt Block 增强全局信息

           在医学图像分割领域面临诸多问题,如 U 形架构卷积网络难以提取全局信息,混合架构因计算资源受限在实际医疗场景应用受阻,轻量化网络在保证性能与提取全局信息上存在矛盾。因此,设计了CMUNeXt Block,CMUNeXt Block 采用大核深度可分离卷积替代普通卷积来提取全局信息,凭借深度可分离卷积减少参数和计算成本以维持轻量化,同时综合利用卷积归纳偏置和全局信息提取能力,有效解决了这些问题。  

1. CMUNeXt Block结构介绍       

  1. 有效提取全局信息:通过使用大核深度可分离卷积替换普通卷积,先利用大核的深度可分离卷积提取每个通道的全局信息,再借助残差连接,解决了卷积网络难以提取全局信息的问题。
  2. 保持轻量化优势:深度可分离卷积本身相比普通卷积能有效减少网络参数和计算成本,并且通过合理设计,如倒置瓶颈设计等,在提取全局信息的同时保持了网络的轻量化,使得网络既能够提取全局信息,又适合在移动和边缘设备上部署。
  3. 综合利用归纳偏置和全局信息提取:利用卷积的归纳偏置更好地拟合稀缺医疗数据,同时解决了全局信息提取的问题,克服了传统卷积网络和 Transformer-based 网络在医疗图像分割应用中的不足。

2. YOLOv11与CMUNeXt Block的结合

        1. 本文将CMUNeXt Block替换传统的conv,增强模型全局信息的提取。

        2. 本文将CMUNeXt Block与C3K2相结合,使用CMUNeXt Block替换conv模块,构建C3k2_CMUNeXt模块。

3. CMUNeXt Block代码部分

import torch
import torch.nn as nn
from .block import C2f, C3k

class Residual(nn.Module):
    def __init__(self, fn):
        super().__init__()
        self.fn = fn

    def forward(self, x):
        return self.fn(x) + x


class CMUNeXtBlock(nn.Module):
    def __init__(self, ch_in, ch_out, k=3, s=2, depth=1):
        super(CMUNeXtBlock, self).__init__()
        self.block = nn.Sequential(
            *[nn.Sequential(
                Residual(nn.Sequential(
                    # deep wise
                    nn.Conv2d(ch_in, ch_in, kernel_size=(k, k), groups=ch_in, padding=(k // 2, k // 2)),
                    nn.GELU(),
                    nn.BatchNorm2d(ch_in)
                )),
                nn.Conv2d(ch_in, ch_in * 4, kernel_size=(1, 1)),
                nn.GELU(),
                nn.BatchNorm2d(ch_in * 4),
                nn.Conv2d(ch_in * 4, ch_in, kernel_size=(1, 1)),
                nn.GELU(),
                nn.BatchNorm2d(ch_in)
            ) for i in range(depth)]
        )
        self.up = Conv(ch_in, ch_out, k, s)

    def forward(self, x):
        x = self.block(x)
        x = self.up(x)
        return x


def autopad(k, p=None, d=1):  # kernel, padding, dilation
    """Pad to 'same' shape outputs."""
    if d > 1:
        k = d * (k - 1) + 1 if isinstance(k, int) else [d * (x - 1) + 1 for x in k]  # actual kernel-size
    if p is None:
        p = k // 2 if isinstance(k, int) else [x // 2 for x in k]  # auto-pad
    return p


class Conv(nn.Module):
    """Standard convolution with args(ch_in, ch_out, kernel, stride, padding, groups, dilation, activation)."""

    default_act = nn.SiLU()  # default activation

    def __init__(self, c1, c2, k=1, s=1, p=None, g=1, d=1, act=True):
        """Initialize Conv layer with given arguments including activation."""
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, s, autopad(k, p, d), groups=g, dilation=d, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity()

    def forward(self, x):
        """Apply convolution, batch normalization and activation to input tensor."""
        return self.act(self.bn(self.conv(x)))

    def forward_fuse(self, x):
        """Perform transposed convolution of 2D data."""
        return self.act(self.conv(x))

class Bottleneck_CMUNeX(nn.Module):
    """Standard bottleneck."""

    def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5):
        """Initializes a standard bottleneck module with optional shortcut connection and configurable parameters."""
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, k[0], 1)
        self.cv2 = CMUNeXtBlock(c_, c2, 3, 1)
        self.add = shortcut and c1 == c2

    def forward(self, x):
        """Applies the YOLO FPN to input data."""
        return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))


# 在c3k=True时,使用普通的Bottleneck,为false的时候我们使用Bottleneck_RepVGG提取特征
class C3k2_CMUNeX(C2f):
    """Faster Implementation of CSP Bottleneck with 2 convolutions."""

    def __init__(self, c1, c2, n=1, c3k=False, e=0.5, g=1, shortcut=True):
        """Initializes the C3k2 module, a faster CSP Bottleneck with 2 convolutions and optional C3k blocks."""
        super().__init__(c1, c2, n, shortcut, g, e)
        self.m = nn.ModuleList(
            C3k(self.c, self.c, 2, shortcut, g) if c3k else Bottleneck_CMUNeX(self.c, self.c, shortcut, g) for _ in range(n)
        )


if __name__ == '__main__':
    CMUN = CMUNeXtBlock(256,128,3,2)
    #创建一个输入张量
    batch_size = 8
    input_tensor=torch.randn(batch_size, 256, 64, 64 )
    #运行模型并打印输入和输出的形状
    output_tensor =CMUN(input_tensor)
    print("Input shape:",input_tensor.shape)
    print("0utput shape:",output_tensor.shape)

 4. 将CMUNeXt Block引入到YOLOv11中

第一: 将下面的核心代码复制到D:\bilibili\model\YOLO11\ultralytics-main\ultralytics\nn路径下,如下图所示。

第二:在task.py中导入CMUNeXtBlock, C3k2_CMUNeX包

第三:在task.py中的模型配置部分下面代码

        第一个改进需要修改的地方

         第二个改进需要修改的地方

第四:将模型配置文件复制到YOLOV11.YAMY文件中

        第一个改进的配置文件

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLO11 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolo11n.yaml' will call yolo11.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] # summary: 319 layers, 2624080 parameters, 2624064 gradients, 6.6 GFLOPs
  s: [0.50, 0.50, 1024] # summary: 319 layers, 9458752 parameters, 9458736 gradients, 21.7 GFLOPs
  m: [0.50, 1.00, 512] # summary: 409 layers, 20114688 parameters, 20114672 gradients, 68.5 GFLOPs
  l: [1.00, 1.00, 512] # summary: 631 layers, 25372160 parameters, 25372144 gradients, 87.6 GFLOPs
  x: [1.00, 1.50, 512] # summary: 631 layers, 56966176 parameters, 56966160 gradients, 196.0 GFLOPs

# YOLO11n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2, [256, False, 0.25]]
  - [-1, 1, CMUNeXtBlock, [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2, [512, False, 0.25]]
  - [-1, 1, CMUNeXtBlock, [512, 3, 2]] # 5-P4/16
  - [-1, 2, C3k2, [512, True]]
  - [-1, 1, CMUNeXtBlock, [1024, 3, 2]] # 7-P5/32
  - [-1, 2, C3k2, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9
  - [-1, 2, C2PSA, [1024]] # 10

# YOLO11n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 2, C3k2, [512, False]] # 13

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 2, C3k2, [256, False]] # 16 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 13], 1, Concat, [1]] # cat head P4
  - [-1, 2, C3k2, [512, False]] # 19 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]] # cat head P5
  - [-1, 2, C3k2, [1024, True]] # 22 (P5/32-large)

  - [[16, 19, 22], 1, Detect, [nc]] # Detect(P3, P4, P5)

        第二个改进的配置文件

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLO11 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolo11n.yaml' will call yolo11.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] # summary: 319 layers, 2624080 parameters, 2624064 gradients, 6.6 GFLOPs
  s: [0.50, 0.50, 1024] # summary: 319 layers, 9458752 parameters, 9458736 gradients, 21.7 GFLOPs
  m: [0.50, 1.00, 512] # summary: 409 layers, 20114688 parameters, 20114672 gradients, 68.5 GFLOPs
  l: [1.00, 1.00, 512] # summary: 631 layers, 25372160 parameters, 25372144 gradients, 87.6 GFLOPs
  x: [1.00, 1.50, 512] # summary: 631 layers, 56966176 parameters, 56966160 gradients, 196.0 GFLOPs

# YOLO11n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2_CMUNeX, [256, False, 0.25]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2_CMUNeX, [512, False, 0.25]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 2, C3k2_CMUNeX, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 2, C3k2_CMUNeX, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9
  - [-1, 2, C2PSA, [1024]] # 10

# YOLO11n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 2, C3k2_CMUNeX, [512, False]] # 13

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 2, C3k2_CMUNeX, [256, False]] # 16 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 13], 1, Concat, [1]] # cat head P4
  - [-1, 2, C3k2_CMUNeX, [512, False]] # 19 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]] # cat head P5
  - [-1, 2, C3k2_CMUNeX, [1024, True]] # 22 (P5/32-large)

  - [[16, 19, 22], 1, Detect, [nc]] # Detect(P3, P4, P5)

第五:运行成功


from ultralytics.models import NAS, RTDETR, SAM, YOLO, FastSAM, YOLOWorld

if __name__=="__main__":


    # 使用自己的YOLOv11.yamy文件搭建模型并加载预训练权重训练模型
    model = YOLO(r"D:\bilibili\model\YOLO11\ultralytics-main\ultralytics\cfg\models\11\yolo11_cmunex.yaml")\
        .load(r'D:\bilibili\model\YOLO11\ultralytics-main\yolo11n.pt')  # build from YAML and transfer weights

    results = model.train(data=r'D:\bilibili\model\ultralytics-main\ultralytics\cfg\datasets\VOC_my.yaml',
                          epochs=100, imgsz=640, batch=4)

 


http://www.kler.cn/a/385758.html

相关文章:

  • python-文件内容操作
  • 压缩指令的使用
  • c++ 类和对象(中)
  • Fish Agent V0.13B:Fish Audio的语音处理新突破,AI语音助手的未来已来!
  • Day44 | 动态规划 :状态机DP 买卖股票的最佳时机IV买卖股票的最佳时机III
  • 《基于Oracle的SQL优化》读书笔记
  • 树莓派上安装与配置 Nginx Web 服务器教程
  • 使用 axios 拦截器实现请求和响应的统一处理(附常见面试题)
  • OPPO开源Diffusion多语言适配器—— MultilingualSD3-adapter 和 ChineseFLUX.1-adapter
  • 【Android】ubutun 创建Androidstudio桌面快捷方式
  • 初始MQ(安装使用RabbitMQ,了解交换机)
  • HarmonyOs DevEco Studio小技巧28--部分鸿蒙生命周期详解
  • 陀螺仪原理探析
  • uni-app - - - - - 钉钉小程序 uni.showToast回调函数不执行问题(PC端钉钉小程序 接口API回调函数不执行)
  • 前端自动化测试selenium在最新探索使用
  • Chromium 中chrome.webRequest扩展接口定义c++
  • 稀土阻燃协效剂-氢氧化镁的应用
  • React核心概念与特点
  • python包管理工具pip和conda的使用对比
  • 如何编写有效的Prompt模板:提升大模型性能的关键
  • 数据结构---详解顺序表
  • python项目使用sqlalchemy的order_by方法报错‘Out of sort memory‘的解决方案
  • FP独立站引流革命:GG斗篷技术解锁流量新策略
  • Linux挖矿病毒(kswapd0进程使cpu爆满)
  • Linux环境基础开发工具的使用_yum源_vim_Git控制器
  • 【HGT】文献精讲:Heterogeneous Graph Transformer