
YOLOv5 Improvements (Part 10): The Lightweight MobileNetV4 Model

Table of Contents

  • 1. Principles
  • 2. Adding the MobileNetV4 Module to YOLOv5
    • 2.1 MobileNetV4 code implementation
    • 2.2 Adding the yaml file
    • 2.3 Registering the module
    • 2.4 Replacing the forward function
    • 2.5 Running training
    • 2.6 Complete code
  • 3. Object detection series articles

1. Principles


Paper: MobileNetV4 - Universal Models for the Mobile Ecosystem

Official code: MobileNetV4 code repository

**MobileNetV4 (MNv4)** is the latest generation of MobileNet, designed for efficient performance across the whole range of mobile hardware. It introduces the Universal Inverted Bottleneck (UIB) search block, a unified and flexible structure that merges the inverted bottleneck, ConvNext, the feed-forward network (FFN), and a new extra-depthwise (ExtraDW) variant. Alongside UIB, MobileNetV4 adds Mobile MQA, an attention block tailored to mobile accelerators that delivers a 39% inference speedup. With an improved neural architecture search (NAS) recipe, MobileNetV4 raises search efficiency and produces models that run well on CPUs, DSPs, GPUs, and dedicated accelerators such as the Apple Neural Engine and Google Pixel EdgeTPU. A novel distillation technique further improves accuracy. Overall, by combining UIB, Mobile MQA, and the refined NAS recipe, MobileNetV4 yields a family of models that balance computational efficiency and accuracy and are mostly Pareto-optimal across mobile platforms.

The key principles and components behind the architecture are as follows:

1) Universal Inverted Bottleneck (UIB)

  • Inverted bottleneck blocks: MobileNetV4 builds on the success of the inverted bottleneck (IB) block, the core building block of previous MobileNets.
  • Flexibility and unification: UIB unifies several micro-architectures (the inverted bottleneck, ConvNext, the feed-forward network (FFN), and the novel extra-depthwise (ExtraDW) variant), so a single block type can express different kinds of operations.
  • Optional depthwise convolutions: UIB includes two optional depthwise convolutions, which add flexibility and computational efficiency in how the block mixes spatial and channel information.

2) Mobile MQA (Mobile Multi-Query Attention)

  • Efficiency: this attention block is optimized specifically for mobile accelerators, delivering a 39% inference speedup over conventional multi-head self-attention (MHSA).
  • On-device performance: Mobile MQA is tailored to exploit the capabilities of mobile hardware, giving faster processing and lower latency.

3) Neural Architecture Search (NAS)

  • Refined search recipe: MobileNetV4 introduces an improved two-stage NAS pipeline, a coarse-grained search followed by a fine-grained search, which makes the search both more efficient and more effective at finding high-performing models.
  • Hardware awareness: the NAS recipe accounts for the constraints and capabilities of diverse mobile platforms, so the resulting models are optimized for everything from CPUs to dedicated accelerators such as the Apple Neural Engine and Google Pixel EdgeTPU.

4) Distillation technique

  • Accuracy gains: MobileNetV4 uses a novel distillation technique based on a mixed dataset with different augmentations and balanced in-class data, which improves generalization and accuracy.
  • Results: boosted by this distillation, the MNv4-Hybrid-Large model reaches an impressive 87% accuracy on ImageNet-1K with a runtime of only 3.8 ms on the Pixel 8 EdgeTPU.

5) Pareto optimality

  • Balanced performance: MobileNetV4 models are designed to be Pareto-optimal across hardware platforms, meaning they offer the best available trade-off between computational efficiency and accuracy.
  • Hardware-independent efficiency: the models are shown to perform well across hardware types, from CPUs and GPUs to DSPs and dedicated accelerators, making them versatile and broadly applicable.

6) Design considerations

  • Operational intensity: MobileNetV4 is designed with the operational intensity of different hardware in mind, balancing compute against memory bandwidth to maximize performance.
  • Layer placement: the initial layers are deliberately compute-heavy to increase model capacity and accuracy, while the final layers are designed to preserve accuracy even on hardware with a high ridge point (RP).

In summary, MobileNetV4 integrates these techniques and innovations to produce a family of models that are both efficient and accurate across a wide range of mobile devices. Its design principles emphasize flexibility, hardware-aware optimization, and a balance between computational cost and model performance.

2. Adding the MobileNetV4 Module to YOLOv5

The main stages MobileNetV4 uses to process an image can be summarized as follows:

1) Image preprocessing

Before entering the model, the image goes through a few preprocessing steps, typically:

  • **Resizing:** scale the image to the required input size (e.g. 224x224).
  • **Normalization:** normalize pixel values into a fixed range (e.g. 0 to 1, or -1 to 1).
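The two steps above can be sketched in a few lines of PyTorch; note that the 224x224 input size and the [-1, 1] range here are illustrative choices, not requirements of the model:

```python
import torch
import torch.nn.functional as F

def preprocess(img: torch.Tensor, size: int = 224) -> torch.Tensor:
    """Resize a (C, H, W) uint8 image and normalize pixels to [-1, 1]."""
    x = img.float().unsqueeze(0)  # add a batch dim -> (1, C, H, W)
    x = F.interpolate(x, size=(size, size), mode='bilinear', align_corners=False)
    return x / 127.5 - 1.0        # map [0, 255] -> [-1, 1]

x = preprocess(torch.randint(0, 256, (3, 480, 640), dtype=torch.uint8))
print(x.shape)  # torch.Size([1, 3, 224, 224])
```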

2) Initial convolution layer

The image first passes through a standard convolution layer with a relatively large kernel (e.g. 3x3 or 5x5) that captures low-level features such as edges and textures. This layer is usually compute-heavy so that more information is captured early.

3) Inverted Bottleneck Block

The inverted bottleneck (IB) block is one of MobileNetV4's core components. Each IB block performs the following sub-steps:

  • **Expansion convolution:** a 1x1 convolution first expands the number of channels in the feature map.
  • **Depthwise separable convolution:** a depthwise separable convolution then operates over the spatial dimensions.
  • **Projection convolution:** finally, a 1x1 convolution compresses the channel count back down.
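The three sub-steps can be sketched as a minimal PyTorch module (an illustrative re-implementation, not the repository code; the kernel size and expansion ratio are example values):

```python
import torch
import torch.nn as nn

class InvertedBottleneck(nn.Module):
    """Expand (1x1) -> depthwise (3x3) -> project (1x1), with a residual when shapes allow."""
    def __init__(self, inp, oup, stride=1, expand_ratio=4):
        super().__init__()
        hidden = inp * expand_ratio
        self.use_res = stride == 1 and inp == oup
        self.block = nn.Sequential(
            nn.Conv2d(inp, hidden, 1, bias=False), nn.BatchNorm2d(hidden), nn.ReLU6(),  # expansion
            nn.Conv2d(hidden, hidden, 3, stride, 1, groups=hidden, bias=False),         # depthwise
            nn.BatchNorm2d(hidden), nn.ReLU6(),
            nn.Conv2d(hidden, oup, 1, bias=False), nn.BatchNorm2d(oup),                 # linear projection
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_res else out

y = InvertedBottleneck(32, 32)(torch.randn(1, 32, 56, 56))
print(y.shape)  # torch.Size([1, 32, 56, 56])
```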

4) Universal Inverted Bottleneck (UIB)

MobileNetV4 introduces the UIB, a block that can be flexibly configured to match different compute budgets:

  • **Standard inverted bottleneck:** like the classic inverted bottleneck, but more configurable.
  • **ConvNext variant:** borrows the strengths of modern convolutional architectures.
  • **Feed-forward network (FFN):** suited to stages that need more pointwise, nonlinear transformation.
  • **Extra depthwise (ExtraDW):** adds an extra depthwise convolution to improve efficiency and feature extraction.
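A minimal sketch of how a single block can cover all four variants, assuming selection is driven purely by which of the two optional depthwise convolutions are enabled (this mirrors, in simplified form, the UniversalInvertedBottleneckBlock in the code later in this article):

```python
import torch
import torch.nn as nn

def dw(ch, k):
    """Depthwise conv + BN helper."""
    return nn.Sequential(nn.Conv2d(ch, ch, k, 1, k // 2, groups=ch, bias=False),
                         nn.BatchNorm2d(ch))

class UIB(nn.Module):
    """Simplified Universal Inverted Bottleneck. The two optional depthwise
    convs select the variant:
      start=0, middle>0 -> classic inverted bottleneck
      start>0, middle>0 -> ExtraDW
      start>0, middle=0 -> ConvNext-like
      start=0, middle=0 -> FFN (pointwise convs only)
    """
    def __init__(self, inp, oup, start_k=0, middle_k=3, expand=4):
        super().__init__()
        hidden = inp * expand
        self.start_dw = dw(inp, start_k) if start_k else None
        self.expand = nn.Sequential(nn.Conv2d(inp, hidden, 1, bias=False),
                                    nn.BatchNorm2d(hidden), nn.ReLU6())
        self.middle_dw = dw(hidden, middle_k) if middle_k else None
        self.proj = nn.Sequential(nn.Conv2d(hidden, oup, 1, bias=False), nn.BatchNorm2d(oup))

    def forward(self, x):
        if self.start_dw is not None:
            x = self.start_dw(x)
        x = self.expand(x)
        if self.middle_dw is not None:
            x = self.middle_dw(x)
        return self.proj(x)

x = torch.randn(1, 64, 32, 32)
for start_k, middle_k in [(0, 3), (3, 3), (3, 0), (0, 0)]:
    print(UIB(64, 96, start_k, middle_k)(x).shape)  # torch.Size([1, 96, 32, 32]) each
```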

5) Mobile MQA attention block

This module is an attention mechanism optimized specifically for mobile devices:

  • **Attention computation:** re-weights the feature map to amplify important features and suppress unimportant ones.
  • **Fast inference:** the block achieves fast inference on hardware accelerators, improving end-to-end throughput.
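The core idea behind MQA, many query heads sharing a single key/value head, can be sketched as follows. This is a simplified illustration only: the paper's Mobile MQA additionally downsamples the keys and values spatially, which is not shown here:

```python
import torch
import torch.nn as nn

class MultiQueryAttention(nn.Module):
    """Multi-query attention: all query heads share ONE key/value head,
    shrinking the K/V projections and memory traffic relative to MHSA."""
    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.h = num_heads
        self.dh = dim // num_heads
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * self.dh)  # one shared K/V head
        self.out = nn.Linear(dim, dim)

    def forward(self, x):  # x: (B, N, dim)
        B, N, _ = x.shape
        q = self.q(x).view(B, N, self.h, self.dh).transpose(1, 2)        # (B, h, N, dh)
        k, v = self.kv(x).split(self.dh, dim=-1)                         # (B, N, dh) each
        attn = (q @ k.transpose(-2, -1).unsqueeze(1)) / self.dh ** 0.5   # (B, h, N, N)
        x = attn.softmax(-1) @ v.unsqueeze(1)                            # (B, h, N, dh)
        return self.out(x.transpose(1, 2).reshape(B, N, -1))

y = MultiQueryAttention(64)(torch.randn(2, 49, 64))
print(y.shape)  # torch.Size([2, 49, 64])
```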

6) Global average pooling

After the final convolution layer, the model applies global average pooling (GAP), converting the feature maps into a fixed-length vector: taking the mean of each feature map removes the spatial dimensions and keeps only per-channel information.

7) Fully connected and classification layers

Finally, a fully connected layer followed by a softmax classification layer converts the feature vector into the final class predictions.
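Steps 6) and 7) amount to just a few tensor operations (the 1280 channels and 1000 classes below are example values):

```python
import torch
import torch.nn as nn

feat = torch.randn(1, 1280, 7, 7)       # final feature map
pooled = feat.mean(dim=(2, 3))          # global average pooling -> (1, 1280)
logits = nn.Linear(1280, 1000)(pooled)  # fully connected layer
probs = logits.softmax(dim=-1)          # softmax -> class probabilities
print(pooled.shape, probs.shape)
```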

Through these stages, MobileNetV4 processes and classifies input images efficiently, which makes it particularly well-suited to resource-constrained mobile devices while maintaining strong accuracy and performance.


2.1 MobileNetV4 code implementation

Create mobilenetv4.py under yolov5/models/ and paste in the following code:

"""
Creates a MobileNetV4 Model as defined in:
Danfeng Qin, Chas Leichner, Manolis Delakis, Marco Fornoni, Shixin Luo, Fan Yang, Weijun Wang, Colby Banbury, Chengxi Ye, Berkin Akin, Vaibhav Aggarwal, Tenghui Zhu, Daniele Moro, Andrew Howard. (2024).
MobileNetV4 - Universal Models for the Mobile Ecosystem
arXiv preprint arXiv:2404.10518.
"""from typing import Any, Callable, Dict, List, Mapping, Optional, Tuple, Unionimport torch
import torch.nn as nnMNV4ConvSmall_BLOCK_SPECS = {"conv0": {"block_name": "convbn","num_blocks": 1,"block_specs": [[3, 32, 3, 2]]},"layer1": {"block_name": "convbn","num_blocks": 2,"block_specs": [[32, 32, 3, 2],[32, 32, 1, 1]]},"layer2": {"block_name": "convbn","num_blocks": 2,"block_specs": [[32, 96, 3, 2],[96, 64, 1, 1]]},"layer3": {"block_name": "uib","num_blocks": 6,"block_specs": [[64, 96, 5, 5, True, 2, 3],[96, 96, 0, 3, True, 1, 2],[96, 96, 0, 3, True, 1, 2],[96, 96, 0, 3, True, 1, 2],[96, 96, 0, 3, True, 1, 2],[96, 96, 3, 0, True, 1, 4],]},"layer4": {"block_name": "uib","num_blocks": 6,"block_specs": [[96,  128, 3, 3, True, 2, 6],[128, 128, 5, 5, True, 1, 4],[128, 128, 0, 5, True, 1, 4],[128, 128, 0, 5, True, 1, 3],[128, 128, 0, 3, True, 1, 4],[128, 128, 0, 3, True, 1, 4],]},  "layer5": {"block_name": "convbn","num_blocks": 2,"block_specs": [[128, 960, 1, 1],[960, 1280, 1, 1]]}
}MNV4ConvMedium_BLOCK_SPECS = {"conv0": {"block_name": "convbn","num_blocks": 1,"block_specs": [[3, 32, 3, 2]]},"layer1": {"block_name": "fused_ib","num_blocks": 1,"block_specs": [[32, 48, 2, 4.0, True]]},"layer2": {"block_name": "uib","num_blocks": 2,"block_specs": [[48, 80, 3, 5, True, 2, 4],[80, 80, 3, 3, True, 1, 2]]},"layer3": {"block_name": "uib","num_blocks": 8,"block_specs": [[80,  160, 3, 5, True, 2, 6],[160, 160, 3, 3, True, 1, 4],[160, 160, 3, 3, True, 1, 4],[160, 160, 3, 5, True, 1, 4],[160, 160, 3, 3, True, 1, 4],[160, 160, 3, 0, True, 1, 4],[160, 160, 0, 0, True, 1, 2],[160, 160, 3, 0, True, 1, 4]]},"layer4": {"block_name": "uib","num_blocks": 11,"block_specs": [[160, 256, 5, 5, True, 2, 6],[256, 256, 5, 5, True, 1, 4],[256, 256, 3, 5, True, 1, 4],[256, 256, 3, 5, True, 1, 4],[256, 256, 0, 0, True, 1, 4],[256, 256, 3, 0, True, 1, 4],[256, 256, 3, 5, True, 1, 2],[256, 256, 5, 5, True, 1, 4],[256, 256, 0, 0, True, 1, 4],[256, 256, 0, 0, True, 1, 4],[256, 256, 5, 0, True, 1, 2]]},  "layer5": {"block_name": "convbn","num_blocks": 2,"block_specs": [[256, 960, 1, 1],[960, 1280, 1, 1]]}
}MNV4ConvLarge_BLOCK_SPECS = {"conv0": {"block_name": "convbn","num_blocks": 1,"block_specs": [[3, 24, 3, 2]]},"layer1": {"block_name": "fused_ib","num_blocks": 1,"block_specs": [[24, 48, 2, 4.0, True]]},"layer2": {"block_name": "uib","num_blocks": 2,"block_specs": [[48, 96, 3, 5, True, 2, 4],[96, 96, 3, 3, True, 1, 4]]},"layer3": {"block_name": "uib","num_blocks": 11,"block_specs": [[96,  192, 3, 5, True, 2, 4],[192, 192, 3, 3, True, 1, 4],[192, 192, 3, 3, True, 1, 4],[192, 192, 3, 3, True, 1, 4],[192, 192, 3, 5, True, 1, 4],[192, 192, 5, 3, True, 1, 4],[192, 192, 5, 3, True, 1, 4],[192, 192, 5, 3, True, 1, 4],[192, 192, 5, 3, True, 1, 4],[192, 192, 5, 3, True, 1, 4],[192, 192, 3, 0, True, 1, 4]]},"layer4": {"block_name": "uib","num_blocks": 13,"block_specs": [[192, 512, 5, 5, True, 2, 4],[512, 512, 5, 5, True, 1, 4],[512, 512, 5, 5, True, 1, 4],[512, 512, 5, 5, True, 1, 4],[512, 512, 5, 0, True, 1, 4],[512, 512, 5, 3, True, 1, 4],[512, 512, 5, 0, True, 1, 4],[512, 512, 5, 0, True, 1, 4],[512, 512, 5, 3, True, 1, 4],[512, 512, 5, 5, True, 1, 4],[512, 512, 5, 0, True, 1, 4],[512, 512, 5, 0, True, 1, 4],[512, 512, 5, 0, True, 1, 4]]},  "layer5": {"block_name": "convbn","num_blocks": 2,"block_specs": [[512, 960, 1, 1],[960, 1280, 1, 1]]}
}MNV4HybridConvMedium_BLOCK_SPECS = {}MNV4HybridConvLarge_BLOCK_SPECS = {}MODEL_SPECS = {"MobileNetV4ConvSmall": MNV4ConvSmall_BLOCK_SPECS,"MobileNetV4ConvMedium": MNV4ConvMedium_BLOCK_SPECS,"MobileNetV4ConvLarge": MNV4ConvLarge_BLOCK_SPECS,"MobileNetV4HybridMedium": MNV4HybridConvMedium_BLOCK_SPECS,"MobileNetV4HybridLarge": MNV4HybridConvLarge_BLOCK_SPECS,
}def make_divisible(value: float,divisor: int,min_value: Optional[float] = None,round_down_protect: bool = True,) -> int:"""This function is copied from here "https://github.com/tensorflow/models/blob/master/official/vision/modeling/layers/nn_layers.py"This is to ensure that all layers have channels that are divisible by 8.Args:value: A `float` of original value.divisor: An `int` of the divisor that need to be checked upon.min_value: A `float` of  minimum value threshold.round_down_protect: A `bool` indicating whether round down more than 10%will be allowed.Returns:The adjusted value in `int` that is divisible against divisor."""if min_value is None:min_value = divisornew_value = max(min_value, int(value + divisor / 2) // divisor * divisor)# Make sure that round down does not go down by more than 10%.if round_down_protect and new_value < 0.9 * value:new_value += divisorreturn int(new_value)def conv_2d(inp, oup, kernel_size=3, stride=1, groups=1, bias=False, norm=True, act=True):conv = nn.Sequential()padding = (kernel_size - 1) // 2conv.add_module('conv', nn.Conv2d(inp, oup, kernel_size, stride, padding, bias=bias, groups=groups))if norm:conv.add_module('BatchNorm2d', nn.BatchNorm2d(oup))if act:conv.add_module('Activation', nn.ReLU6())return convclass InvertedResidual(nn.Module):def __init__(self, inp, oup, stride, expand_ratio, act=False):super(InvertedResidual, self).__init__()self.stride = strideassert stride in [1, 2]hidden_dim = int(round(inp * expand_ratio))self.block = nn.Sequential()if expand_ratio != 1:self.block.add_module('exp_1x1', conv_2d(inp, hidden_dim, kernel_size=1, stride=1))self.block.add_module('conv_3x3', conv_2d(hidden_dim, hidden_dim, kernel_size=3, stride=stride, groups=hidden_dim))self.block.add_module('red_1x1', conv_2d(hidden_dim, oup, kernel_size=1, stride=1, act=act))self.use_res_connect = self.stride == 1 and inp == oupdef forward(self, x):if self.use_res_connect:return x + self.block(x)else:return self.block(x)class 
UniversalInvertedBottleneckBlock(nn.Module):def __init__(self, inp, oup, start_dw_kernel_size, middle_dw_kernel_size, middle_dw_downsample,stride,expand_ratio):super().__init__()# Starting depthwise conv.self.start_dw_kernel_size = start_dw_kernel_sizeif self.start_dw_kernel_size:            stride_ = stride if not middle_dw_downsample else 1self._start_dw_ = conv_2d(inp, inp, kernel_size=start_dw_kernel_size, stride=stride_, groups=inp, act=False)# Expansion with 1x1 convs.expand_filters = make_divisible(inp * expand_ratio, 8)self._expand_conv = conv_2d(inp, expand_filters, kernel_size=1)# Middle depthwise conv.self.middle_dw_kernel_size = middle_dw_kernel_sizeif self.middle_dw_kernel_size:stride_ = stride if middle_dw_downsample else 1self._middle_dw = conv_2d(expand_filters, expand_filters, kernel_size=middle_dw_kernel_size, stride=stride_, groups=expand_filters)# Projection with 1x1 convs.self._proj_conv = conv_2d(expand_filters, oup, kernel_size=1, stride=1, act=False)# Ending depthwise conv.# this not used# _end_dw_kernel_size = 0# self._end_dw = conv_2d(oup, oup, kernel_size=_end_dw_kernel_size, stride=stride, groups=inp, act=False)def forward(self, x):if self.start_dw_kernel_size:x = self._start_dw_(x)# print("_start_dw_", x.shape)x = self._expand_conv(x)# print("_expand_conv", x.shape)if self.middle_dw_kernel_size:x = self._middle_dw(x)# print("_middle_dw", x.shape)x = self._proj_conv(x)# print("_proj_conv", x.shape)return xdef build_blocks(layer_spec):if not layer_spec.get('block_name'):return nn.Sequential()block_names = layer_spec['block_name']layers = nn.Sequential()if block_names == "convbn":schema_ = ['inp', 'oup', 'kernel_size', 'stride']args = {}for i in range(layer_spec['num_blocks']):args = dict(zip(schema_, layer_spec['block_specs'][i]))layers.add_module(f"convbn_{i}", conv_2d(**args))elif block_names == "uib":schema_ =  ['inp', 'oup', 'start_dw_kernel_size', 'middle_dw_kernel_size', 'middle_dw_downsample', 'stride', 'expand_ratio']args = {}for 
i in range(layer_spec['num_blocks']):args = dict(zip(schema_, layer_spec['block_specs'][i]))layers.add_module(f"uib_{i}", UniversalInvertedBottleneckBlock(**args))elif block_names == "fused_ib":schema_ = ['inp', 'oup', 'stride', 'expand_ratio', 'act']args = {}for i in range(layer_spec['num_blocks']):args = dict(zip(schema_, layer_spec['block_specs'][i]))layers.add_module(f"fused_ib_{i}", InvertedResidual(**args))else:raise NotImplementedErrorreturn layersclass MobileNetV4(nn.Module):def __init__(self, model):# MobileNetV4ConvSmall  MobileNetV4ConvMedium  MobileNetV4ConvLarge# MobileNetV4HybridMedium  MobileNetV4HybridLarge"""Params to initiate MobilenNetV4Args:model : support 5 types of models as indicated in "https://github.com/tensorflow/models/blob/master/official/vision/modeling/backbones/mobilenet.py"        """super().__init__()assert model in MODEL_SPECS.keys()self.model = modelself.spec = MODEL_SPECS[self.model]# conv0self.conv0 = build_blocks(self.spec['conv0'])# layer1self.layer1 = build_blocks(self.spec['layer1'])# layer2self.layer2 = build_blocks(self.spec['layer2'])# layer3self.layer3 = build_blocks(self.spec['layer3'])# layer4self.layer4 = build_blocks(self.spec['layer4'])# layer5   self.layer5 = build_blocks(self.spec['layer5'])self.features = nn.ModuleList([self.conv0, self.layer1, self.layer2, self.layer3, self.layer4, self.layer5])     self.channel = [i.size(1) for i in self.forward(torch.randn(1, 3, 640, 640))]def forward(self, x):input_size = x.size(2)scale = [4, 8, 16, 32]features = [None, None, None, None]for f in self.features:x = f(x)if input_size // x.size(2) in scale:features[scale.index(input_size // x.size(2))] = xreturn featuresdef MobileNetV4ConvSmall():model = MobileNetV4('MobileNetV4ConvSmall')return modeldef MobileNetV4ConvMedium():model = MobileNetV4('MobileNetV4ConvMedium')return modeldef MobileNetV4ConvLarge():model = MobileNetV4('MobileNetV4ConvLarge')return modeldef MobileNetV4HybridMedium():model = 
MobileNetV4('MobileNetV4HybridMedium')return modeldef MobileNetV4HybridLarge():model = MobileNetV4('MobileNetV4HybridLarge')return model
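The wrapper's forward pass keeps only the feature maps whose downsampling factor is 4, 8, 16, or 32, and `self.channel` is filled by probing a dummy 640x640 input once at construction time. The same idiom, demonstrated on a toy backbone (the layers here are hypothetical stand-ins, for illustration only):

```python
import torch
import torch.nn as nn

class ToyBackbone(nn.Module):
    """Mimics the MobileNetV4 wrapper: run the stages, keep features whose
    downsampling factor is 4/8/16/32, and probe output channels once."""
    def __init__(self):
        super().__init__()
        chs = [16, 32, 64, 128, 256]
        self.features = nn.ModuleList(
            nn.Conv2d(i, o, 3, 2, 1) for i, o in zip([3] + chs[:-1], chs))  # each stage halves H/W
        self.channel = [f.size(1) for f in self.forward(torch.randn(1, 3, 640, 640))]

    def forward(self, x):
        input_size = x.size(2)
        scale = [4, 8, 16, 32]
        features = [None] * 4
        for f in self.features:
            x = f(x)
            if input_size // x.size(2) in scale:
                features[scale.index(input_size // x.size(2))] = x
        return features

m = ToyBackbone()
print(m.channel)  # [32, 64, 128, 256]
```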

2.2 Adding the yaml file

Create **yolov5_MobileNetv4.yaml** under yolov5/models/ and copy the code below into it:

# YOLOv5 🚀 by Ultralytics, AGPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, MobileNetV4ConvSmall, []],  # 4
   [-1, 1, SPPF, [1024, 5]],  # 5
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 3], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 9
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 2], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 13 (P3/8-small)
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 16 (P4/16-medium)
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 5], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 19 (P5/32-large)
   [[13, 16, 19], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

2.3 Registering the module

Replace the parse_model function in yolo.py with the version below. Also make sure the new module is imported in yolo.py (e.g. from models.mobilenetv4 import MobileNetV4ConvSmall), otherwise eval cannot resolve the name from the yaml file:

def parse_model(d, ch):  # model_dict, input_channels(3)
    # Parse a YOLOv5 model.yaml dictionary
    LOGGER.info(f"\n{'':>3}{'from':>18}{'n':>3}{'params':>10}  {'module':<40}{'arguments':<30}")
    anchors, nc, gd, gw, act = d['anchors'], d['nc'], d['depth_multiple'], d['width_multiple'], d.get('activation')
    if act:
        Conv.default_act = eval(act)  # redefine default activation, i.e. Conv.default_act = nn.SiLU()
        LOGGER.info(f"{colorstr('activation:')} {act}")  # print
    na = (len(anchors[0]) // 2) if isinstance(anchors, list) else anchors  # number of anchors
    no = na * (nc + 5)  # number of outputs = anchors * (classes + 5)

    is_backbone = False
    layers, save, c2 = [], [], ch[-1]  # layers, savelist, ch out
    for i, (f, n, m, args) in enumerate(d['backbone'] + d['head']):  # from, number, module, args
        try:
            t = m
            m = eval(m) if isinstance(m, str) else m  # eval strings
        except:
            pass
        for j, a in enumerate(args):
            with contextlib.suppress(NameError):
                try:
                    args[j] = eval(a) if isinstance(a, str) else a  # eval strings
                except:
                    args[j] = a

        n = n_ = max(round(n * gd), 1) if n > 1 else n  # depth gain
        if m in {Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, MixConv2d, Focus, CrossConv,
                 BottleneckCSP, C3, C3TR, C3SPP, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x}:
            c1, c2 = ch[f], args[0]
            if c2 != no:  # if not output
                c2 = make_divisible(c2 * gw, 8)

            args = [c1, c2, *args[1:]]
            if m in {BottleneckCSP, C3, C3TR, C3Ghost, C3x}:
                args.insert(2, n)  # number of repeats
                n = 1
        elif m is nn.BatchNorm2d:
            args = [ch[f]]
        elif m is Concat:
            c2 = sum(ch[x] for x in f)
        # TODO: channel, gw, gd
        elif m in {Detect, Segment}:
            args.append([ch[x] for x in f])
            if isinstance(args[1], int):  # number of anchors
                args[1] = [list(range(args[1] * 2))] * len(f)
            if m is Segment:
                args[3] = make_divisible(args[3] * gw, 8)
        elif m is Contract:
            c2 = ch[f] * args[0] ** 2
        elif m is Expand:
            c2 = ch[f] // args[0] ** 2
        elif isinstance(m, str):
            t = m
            m = timm.create_model(m, pretrained=args[0], features_only=True)
            c2 = m.feature_info.channels()
        elif m in {MobileNetV4ConvSmall}:
            m = m(*args)
            c2 = m.channel
        else:
            c2 = ch[f]
        if isinstance(c2, list):
            is_backbone = True
            m_ = m
            m_.backbone = True
        else:
            m_ = nn.Sequential(*(m(*args) for _ in range(n))) if n > 1 else m(*args)  # module
            t = str(m)[8:-2].replace('__main__.', '')  # module type
        np = sum(x.numel() for x in m_.parameters())  # number params
        m_.i, m_.f, m_.type, m_.np = i + 4 if is_backbone else i, f, t, np  # attach index, 'from' index, type, number params
        LOGGER.info(f'{i:>3}{str(f):>18}{n_:>3}{np:10.0f}  {t:<40}{str(args):<30}')  # print
        save.extend(x % (i + 4 if is_backbone else i) for x in ([f] if isinstance(f, int) else f) if x != -1)  # append to savelist
        layers.append(m_)
        if i == 0:
            ch = []
        if isinstance(c2, list):
            ch.extend(c2)
            for _ in range(5 - len(ch)):
                ch.insert(0, 0)
        else:
            ch.append(c2)
    return nn.Sequential(*layers), sorted(save)

2.4 Replacing the forward function

Replace the _forward_once function in yolo.py with:

    def _forward_once(self, x, profile=False, visualize=False):
        y, dt = [], []  # outputs
        for m in self.model:
            if m.f != -1:  # if not from previous layer
                x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers
            if profile:
                self._profile_one_layer(m, x, dt)
            if hasattr(m, 'backbone'):
                x = m(x)
                for _ in range(5 - len(x)):
                    x.insert(0, None)
                for i_idx, i in enumerate(x):
                    if i_idx in self.save:
                        y.append(i)
                    else:
                        y.append(None)
                x = x[-1]
            else:
                x = m(x)  # run
                y.append(x if m.i in self.save else None)  # save output
            if visualize:
                feature_visualization(x, m.type, m.i, save_dir=visualize)
        return x

2.5 Running training

  • Replace --data with your own dataset yaml (cfg/classification.yaml here).
  • Replace --weights with your own model weights (weights/yolov5s.pt here).

nohup python train.py --batch-size 16 --epochs 200 --cfg models/yolov5_MobileNetv4.yaml --data cfg/classification.yaml --weights weights/yolov5s.pt --device 0 > myout.file 2>&1 &

2.6 Complete code

Download the complete code for this article

3. Object detection series articles

  1. The YOLOv5s network model explained (at a glance)
  2. Household garbage dataset (YOLO format)
  3. How to train YOLOv5 on your own dataset
  4. Bidirectional servo control (Raspberry Pi)
  5. Deploying YOLOv5 object detection on a Raspberry Pi (detailed)
  6. YOLO_Tracking in practice (environment setup & case test)
  7. Object detection: dataset splitting & converting XML datasets to YOLO labels
  8. DeepSort pedestrian/vehicle recognition system (detection + tracking + counting)
  9. Complete YOLOv5 parameter reference (parse_opt)
  10. YOLOv5 Improvements (1): a lightweight YOLOv5s model
  11. YOLOv5 Improvements (2): detection optimizations (adding a small-object detection head)
  12. YOLOv5 Improvements (3): introducing the Focaler-IoU loss
  13. YOLOv5 Improvements (4): the lightweight ShuffleNetv2 model
  14. YOLOv5 Improvements (5): the lightweight MobileNetv3 model
  15. YOLOv5 Improvements (6): adopting the C2F module from YOLOv8
  16. YOLOv5 Improvements (7): improved loss functions EIoU, Alpha-IoU, SIoU, Focal-EIOU
  17. YOLOv5 Improvements (8): introducing Soft-NMS non-maximum suppression
  18. YOLOv5 Improvements (9): introducing the BiFPN module
  19. Vehicle counting, tracking, and speed estimation based on YOLOv10
  20. A first look at YOLOv8 (training parameters explained)
  21. Comparing YOLOv8 model variants and ONNX deployment in detail
  22. Training YOLOv8 on your own dataset && 3 model-loading scenarios
  23. YOLOv8 Improvements (1): the lightweight ShuffleNetV2 model
  24. Using LabelImg to inspect already-annotated YOLO datasets
  25. Network architecture diagrams for YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv9, YOLOv10, YOLOv11, and YOLOv12
