
Design, Performance Optimization, and Implementation of a YOLO11-Based UAV Small-Object Detection System

news 2025/7/9 16:04:47

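Project Summary

Object detection: design and performance optimization of a YOLO11-based small-object detection system for UAV imagery

Project Name

Design and Performance Optimization of a YOLO11-Based UAV Small-Object Detection System

Project Description

This project aims to build a YOLO11-based object detection system for UAV imagery that identifies and localizes small targets in real time. Because objects captured from a UAV are usually small, the system applies targeted tuning strategies to improve detection precision and recall on small objects.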

Data

Dataset download

https://github.com/VisDrone/VisDrone-Dataset

Data preprocessing

1. Generate YOLO-format bounding-box labels from the dataset annotations

Training set

from pathlib import Path

from PIL import Image
from tqdm import tqdm


def labelsplit(split):
    """Convert the VisDrone annotations of one split into YOLO-format label files."""
    # Dataset folders are resolved relative to this script,
    # e.g. d:\huaqing\code\detect_plane_project_4\datasets\<split>\...
    root = Path(__file__).resolve().parent / 'datasets' / split
    txt_dir = root / 'annotations'
    img_dir = root / 'images'
    labels_dir = root / 'labels'

    # Fail early if the inputs are missing; create the output folder if needed
    if not (txt_dir.is_dir() and img_dir.is_dir()):
        raise FileNotFoundError(f'annotations/ or images/ not found under {root}')
    labels_dir.mkdir(parents=True, exist_ok=True)

    # Iterate over every annotation file of the split
    pbar = tqdm(sorted(txt_dir.glob('*.txt')), desc=f'Converting {txt_dir}')
    for f in pbar:
        # Locate the matching image and read its size (width, height)
        img_file = (img_dir / f.name).with_suffix('.jpg')
        if not img_file.exists():
            print(f'Warning: image file not found: {img_file}')
            continue
        with Image.open(img_file) as img:
            img_w, img_h = img.size

        lines = []
        with open(f, 'r') as src:  # read annotation.txt
            for row in [x.split(',') for x in src.read().strip().splitlines()]:
                # The fifth field (score) is 0 for boxes that must be ignored, so skip them
                if row[4] == '0':  # VisDrone 'ignored regions' class 0
                    continue
                # Shift the 1-based VisDrone category to a 0-based class index
                cls = int(row[5]) - 1
                # VisDrone stores left, top, width, height in pixels
                left, top, box_w, box_h = (float(v) for v in row[:4])
                # YOLO expects the box center and size, normalized by the image size
                center_x = (left + box_w / 2) / img_w
                center_y = (top + box_h / 2) / img_h
                lines.append(f'{cls} {center_x} {center_y} {box_w / img_w} {box_h / img_h}')

        # Write the label file once per image, after all rows have been collected
        with open(labels_dir / f'{f.stem}.txt', 'w') as dst:
            dst.write(''.join(line + '\n' for line in lines))


if __name__ == '__main__':
    labelsplit('VisDrone2019-DET-train')

Validation set (only the path argument changes; the rest of the code is identical)

if __name__ == '__main__':
    labelsplit('VisDrone2019-DET-val')
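
To make the coordinate conversion above easier to check by hand, here is a minimal sketch that converts a single annotation row. The function name `visdrone_row_to_yolo` and the sample row are illustrative assumptions, not part of the original script; the arithmetic follows the same left/top/width/height to normalized center/size mapping.

```python
def visdrone_row_to_yolo(row, img_w, img_h):
    """Convert one VisDrone annotation row to a YOLO label line.

    Returns None when the score field is 0 (box to be ignored).
    """
    fields = row.strip().split(',')
    if fields[4] == '0':
        return None
    left, top, box_w, box_h = (float(v) for v in fields[:4])
    cls = int(fields[5]) - 1  # 1-based VisDrone category -> 0-based class index
    center_x = (left + box_w / 2) / img_w
    center_y = (top + box_h / 2) / img_h
    return (f'{cls} {center_x:.6f} {center_y:.6f} '
            f'{box_w / img_w:.6f} {box_h / img_h:.6f}')


# Hypothetical row: a 40 x 20 px box at (100, 200) in a 1000 x 500 px image
print(visdrone_row_to_yolo('100,200,40,20,1,4,0,0', 1000, 500))
# 3 0.120000 0.420000 0.040000 0.040000
```

The box center is (100 + 40/2, 200 + 20/2) = (120, 210), which normalizes to (0.12, 0.42), and category 4 (car) becomes class index 3.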

Model training

Write the dataset configuration file

# Ultralytics YOLO 🚀, AGPL-3.0 license
# VisDrone2019-DET dataset https://github.com/VisDrone/VisDrone-Dataset by Tianjin University
# Documentation: https://docs.ultralytics.com/datasets/detect/visdrone/
# Example usage: yolo train data=VisDrone.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── VisDrone  ← downloads here (2.3 GB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ./datasets # dataset root dir
train: VisDrone2019-DET-train # train images (relative to 'path')  6471 images
val: VisDrone2019-DET-val # val images (relative to 'path')  548 images
test: VisDrone2019-DET-test-dev # test images (optional)  1610 images

# Classes
names:
  0: pedestrian
  1: people
  2: bicycle
  3: car
  4: van
  5: truck
  6: tricycle
  7: awning-tricycle
  8: bus
  9: motor
  10: other # class index 10 = VisDrone category 11 (others), since cls = category - 1

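Set the number of detection classes

The model's number of output classes (nc) must match the classes defined in the dataset configuration. It is set to 10 here, one per VisDrone object class from pedestrian to motor.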

nc: 10  # number of classes

Modify the model architecture

# 1. [Add a network layer]
# YOLO11n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2, [256, False, 0.25]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2, [512, False, 0.25]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 2, C3k2, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 2, C3k2, [1024, True]]
  - [-1, 2, CBAM, [1024, 7]] # 9: new CBAM layer; every later layer index shifts by +1
  - [-1, 1, SPPF, [1024, 5]] # 10 (was 9)
  - [-1, 2, C2PSA, [1024]] # 11 (was 10)

# 2. [Update the layer references: CBAM was inserted at index 9, so every absolute reference to layer 9 or later increases by 1]
# YOLO11n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 2, C3k2, [512, False]] # 14 (was 13)

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 2, C3k2, [256, False]] # 17 (P3/8-small, was 16)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 14], 1, Concat, [1]] # cat head P4, 13 -> 14
  - [-1, 2, C3k2, [512, False]] # 20 (P4/16-medium, was 19)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 11], 1, Concat, [1]] # cat head P5, 10 -> 11
  - [-1, 2, C3k2, [1024, True]] # 23 (P5/32-large, was 22)

  - [[17, 20, 23], 1, Detect, [nc]] # Detect(P3, P4, P5), 16 -> 17, 19 -> 20, 22 -> 23
# 3. [The Ultralytics code base already contains a CBAM implementation, so it only needs to be exported and imported level by level through the package]
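
The renumbering in the head is easy to get wrong by hand. The following is a small sketch of the rule, assuming layers are written as `[from, repeats, module, args]` lists: absolute `from` indices at or after the insertion point move up by one, while relative indices such as `-1` stay as they are. The helper name `shift_from_refs` is hypothetical, and the sample entries are the head rows whose references change.

```python
def shift_from_refs(layers, insert_at):
    """Return a copy of `layers` with absolute `from` indices >= insert_at shifted by +1.

    Negative indices are relative to the current layer and are left unchanged.
    """
    def shift(idx):
        return idx + 1 if idx >= insert_at else idx

    shifted = []
    for frm, repeats, module, args in layers:
        new_from = [shift(i) for i in frm] if isinstance(frm, list) else shift(frm)
        shifted.append([new_from, repeats, module, args])
    return shifted


# Head entries as they appear before the CBAM layer is inserted at index 9
original_head = [
    [[-1, 6], 1, 'Concat', [1]],
    [[-1, 13], 1, 'Concat', [1]],
    [[-1, 10], 1, 'Concat', [1]],
    [[16, 19, 22], 1, 'Detect', ['nc']],
]
for layer in shift_from_refs(original_head, insert_at=9):
    print(layer[0])
# [-1, 6]
# [-1, 14]
# [-1, 11]
# [17, 20, 23]
```

References to layers before the insertion point (6 and 4) are untouched, which matches the configuration above.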

Train the model weights

yolo detect train data=cfg\datasets\VisDrone.yaml model=ultralytics\ultralytics\cfg\models\11\yolo11m.yaml epochs=30 imgsz=640

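Training visualization

Line charts of the training metrics across epochs

Plot 1: number of boxes per class. Cars account for the most boxes and motorcycles for the fewest.
Plot 2: box shapes. Small and medium-sized targets dominate.
Plot 3: distribution of the box center coordinates.
Plot 4: distribution of the box widths and heights.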
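
The per-class counts and the share of small boxes described above can also be computed directly from YOLO label files. The sketch below is one way to do it under stated assumptions: the helper names and the `small_side=0.05` threshold (normalized width and height both under 5% of the image) are illustrative choices, not values from the project.

```python
from collections import Counter
from pathlib import Path


def iter_label_lines(labels_dir):
    """Yield every non-empty line from the YOLO label files in a directory."""
    for txt in sorted(Path(labels_dir).glob('*.txt')):
        for line in txt.read_text().splitlines():
            if line.strip():
                yield line


def label_stats(lines, small_side=0.05):
    """Count boxes per class, plus boxes whose normalized width and height
    are both below `small_side`."""
    per_class = Counter()
    small = 0
    for line in lines:
        cls, _cx, _cy, w, h = line.split()
        per_class[int(cls)] += 1
        if float(w) < small_side and float(h) < small_side:
            small += 1
    return per_class, small


# Hypothetical label lines: class, center x, center y, width, height (normalized)
demo = [
    '3 0.50 0.50 0.04 0.03',
    '3 0.20 0.70 0.10 0.08',
    '0 0.80 0.10 0.01 0.02',
]
per_class, small = label_stats(demo)
print(dict(per_class), small)  # {3: 2, 0: 1} 2
```

On a real split, the same function could be fed with `iter_label_lines('datasets/VisDrone2019-DET-train/labels')`.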

Model validation

yolo detect val model=.\runs\detect\train23\weights\best.pt data=.\data\VisDrone.yaml
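
Validation matches predictions to ground-truth boxes by intersection over union (IoU), and mAP50 counts a match only at IoU of at least 0.5. The sketch below illustrates why small targets are harder under this criterion: the same 2-pixel localization error costs far more IoU on a small box than on a large one. The box sizes and the helper `shifted_iou` are illustrative assumptions.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shifted_iou(size, shift):
    """IoU between a size x size box and the same box moved by `shift` px in x and y."""
    gt = (0, 0, size, size)
    pred = (shift, shift, size + shift, size + shift)
    return iou(gt, pred)


print(round(shifted_iou(10, 2), 3))   # 0.471, below the 0.5 threshold
print(round(shifted_iou(100, 2), 3))  # 0.924
```

For the 10 px box the overlap is 8 x 8 = 64 px² against a union of 136 px², so a 2-pixel offset alone turns a correct detection into a miss at IoU 0.5.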

Model application

Load the model and run inference

from ultralytics import YOLO

model = YOLO(r'..\runs\detect\train22\weights\best.pt')
model.predict(r"..\datasets\VisDrone2019-DET-test-challenge\images\0000000_02309_d_0000006.jpg", save=True)
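
`model.predict` returns a list of result objects whose `boxes` attribute exposes `xyxy`, `conf`, and `cls`. As a rough sketch of how the output could be inspected for small targets, the helper below counts boxes under 32 x 32 px, the COCO convention for "small" objects. The function name, the threshold choice, and the sample detections are assumptions for illustration only.

```python
def count_small(detections, max_area=32 * 32):
    """Return (small, total) for boxes given as (x1, y1, x2, y2, conf, cls) in pixels."""
    total = 0
    small = 0
    for x1, y1, x2, y2, _conf, _cls in detections:
        total += 1
        if (x2 - x1) * (y2 - y1) < max_area:
            small += 1
    return small, total


# With Ultralytics, the tuples could be built from the first result like this:
# r = model.predict(image_path)[0]
# detections = [(*xyxy, conf, cls) for xyxy, conf, cls in
#               zip(r.boxes.xyxy.tolist(), r.boxes.conf.tolist(), r.boxes.cls.tolist())]

# Hypothetical detections, used only to illustrate the helper
demo = [
    (10, 10, 30, 25, 0.81, 3),     # 20 x 15 = 300 px^2 -> small
    (100, 50, 220, 140, 0.92, 3),  # 120 x 90 = 10800 px^2 -> not small
    (5, 5, 37, 37, 0.40, 0),       # 32 x 32 = 1024 px^2 -> not small (strict threshold)
]
print(count_small(demo))  # (1, 3)
```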