Step 1: Install Darknet
Reference: "Building the Darknet framework environment on Windows for YOLOv4 with GPU" (CSDN blog).
The Darknet source documentation also briefly explains how to train the network on your own dataset.
Step 2: Create a VOC-format dataset
Collect the dataset you need online, or shoot relevant videos yourself and extract frames as images.
Most public datasets already come with annotated xml files.
Recommended public datasets in the transportation domain (covering autonomous driving, traffic signs, and vehicle detection): "[Intelligent transportation datasets] A collection of datasets in the field of intelligent transportation (I)" (PaddlePaddle AI Studio community).
Some datasets do not provide annotation files in xml format; these must be converted to xml by other means.
For example, the BITVehicle dataset provides annotation files in .mat format, so you need to write a program that reads the data and generates the corresponding xml file for each image.
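A minimal sketch of generating a VOC-style xml annotation from parsed box data. Reading the BITVehicle .mat file itself (e.g. with scipy.io.loadmat) is assumed and not shown; the file name and box values below are illustrative only.

```python
# Sketch: build a VOC-format xml annotation from a list of boxes.
# Loading the .mat annotations (e.g. scipy.io.loadmat) is assumed elsewhere.
import xml.etree.ElementTree as ET

def make_voc_xml(filename, width, height, objects):
    """objects: list of (class_name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.Element('annotation')
    ET.SubElement(root, 'filename').text = filename
    size = ET.SubElement(root, 'size')
    ET.SubElement(size, 'width').text = str(width)
    ET.SubElement(size, 'height').text = str(height)
    ET.SubElement(size, 'depth').text = '3'
    for name, xmin, ymin, xmax, ymax in objects:
        obj = ET.SubElement(root, 'object')
        ET.SubElement(obj, 'name').text = name
        bnd = ET.SubElement(obj, 'bndbox')
        ET.SubElement(bnd, 'xmin').text = str(xmin)
        ET.SubElement(bnd, 'ymin').text = str(ymin)
        ET.SubElement(bnd, 'xmax').text = str(xmax)
        ET.SubElement(bnd, 'ymax').text = str(ymax)
    return ET.tostring(root, encoding='unicode')

# Hypothetical example: one sedan in a 1600x1200 image
xml_text = make_voc_xml('vehicle_0000001.jpg', 1600, 1200,
                        [('Sedan', 100, 200, 400, 500)])
```

Write the returned string to Annotations/<image name>.xml for each image.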
For datasets without labels, or for images you captured yourself, use the labelImg tool to annotate them and generate xml files. Install it with pip:
pip install labelimg
Then launch it with:
labelimg
Create a new VOCdevkit folder and, following the directory layout of the VOC dataset, create the subfolders inside it (they are empty at this point)
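The empty layout can be created in one go; the folder names below follow the standard VOC2007 convention used by the scripts later in this guide.

```python
# Create the empty VOC2007 directory layout (standard VOC folder names).
import os

for sub in ('Annotations', 'JPEGImages', 'ImageSets/Main'):
    os.makedirs(os.path.join('VOCdevkit', 'VOC2007', sub), exist_ok=True)
```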
- Put all the dataset images into the JPEGImages folder
- Put the xml annotation file for each image into the Annotations folder
- Generate the related txt files and put them into the Main folder (using gen_files.py)
The script comes from the blog post "YOLOv4 trains its own dataset" (CSDN blog).
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# file: gen_files.py
# Generate the txt files required for training
import os
import random

root_path = './VOCdevkit/VOC2007'
xmlfilepath = root_path + '/Annotations'
txtsavepath = root_path + '/ImageSets/Main'

if not os.path.exists(root_path):
    print("cannot find such directory: " + root_path)
    exit()

if not os.path.exists(txtsavepath):
    os.makedirs(txtsavepath)

trainval_percent = 0.9  # proportion of the train+val split
train_percent = 0.8     # proportion of the training set within train+val

total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(range(num), tv)
train = random.sample(trainval, tr)

print("train and val size:", tv)
print("train size:", tr)

ftrainval = open(txtsavepath + '/trainval.txt', 'w')
ftest = open(txtsavepath + '/test.txt', 'w')
ftrain = open(txtsavepath + '/train.txt', 'w')
fval = open(txtsavepath + '/val.txt', 'w')

for i in range(num):
    name = total_xml[i][:-4] + '\n'
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
- Generate the final txt files and the labels folder (using voc_label.py)
The script comes from the scripts folder under darknet.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# file: voc_label.py
# Generate the final txt files and the labels folder
import xml.etree.ElementTree as ET
import os
from os import getcwd
import platform

sets = [('2007', 'train'), ('2007', 'val'), ('2007', 'test')]
classes = ["Bus", "Microbus", "Minivan", "Sedan", "SUV", "Truck"]

def convert(size, box):
    # Convert (xmin, xmax, ymin, ymax) pixel coordinates to the
    # normalized (x_center, y_center, width, height) YOLO format
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return x, y, w, h

def convert_annotation(year, image_id):
    in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml' % (year, image_id))
    out_file = open('VOCdevkit/VOC%s/labels/%s.txt' % (year, image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        # difficult = obj.find('difficult').text
        cls = obj.find('name').text
        # if cls not in classes or int(difficult) == 1:
        #     continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
             float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

wd = getcwd()

for year, image_set in sets:
    if not os.path.exists('VOCdevkit/VOC%s/labels/' % year):
        os.makedirs('VOCdevkit/VOC%s/labels/' % year)
    image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt' % (year, image_set)).read().strip().split()
    list_file = open('%s_%s.txt' % (year, image_set), 'w')
    for image_id in image_ids:
        list_file.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg\n' % (wd, year, image_id))
        print("Processing image: %s" % image_id)
        convert_annotation(year, image_id)
    list_file.close()

# if platform.system().lower() == 'windows':
#     os.system("type 2007_train.txt 2007_val.txt > train.txt")
#     os.system("type 2007_train.txt 2007_val.txt 2007_test.txt > train.all.txt")
# elif platform.system().lower() == 'linux':
#     os.system("cat 2007_train.txt 2007_val.txt > train.txt")
#     os.system("cat 2007_train.txt 2007_val.txt 2007_test.txt > train.all.txt")

print("done")
- Copy 2007_test.txt, 2007_train.txt, and 2007_val.txt to the data folder; also find voc.data and voc.names under darknet and copy them into the data folder
- When training yolov4-tiny, find yolov4-tiny.conv.29 and yolov4-tiny-custom.cfg under darknet and copy them into the data folder
Note: these file locations are not mandatory. You can choose your own layout, as long as you know what each file is for and specify the correct paths in the configuration steps that follow.
Step 3: Configure the network structure and training parameters
- xxx.names file
Class-name file: voc.names
Replace its contents with the class names of your own dataset, one per line, with no empty lines.
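For the six vehicle classes used in voc_label.py above, voc.names would contain:

```
Bus
Microbus
Minivan
Sedan
SUV
Truck
```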
- xxx.data file
Data-path file: voc.data
classes: the number of classes
train: the txt file listing the training image paths
valid: the txt file listing the validation image paths
test: the txt file listing the test image paths
names: the class-name file to read
backup: the path where training weight files are saved
For example:
classes = 6
train = D:\DataSet\data/2007_train.txt
valid = D:\DataSet\data/2007_val.txt
test = D:\DataSet\data/2007_test.txt
names = D:\DataSet\data/voc.names
backup = D:\DataSet\data/backup
- yoloxxx.cfg
Network and training configuration: yolov4-tiny-custom.cfg
The main modifications:
[net]:
batch: the batch size; the stronger the graphics card, the higher this value can be set. Otherwise use the default or reduce it appropriately
subdivisions: how many equal parts each batch is split into (batch/subdivisions images are processed at a time)
[yolo]:
classes: the number of classes
anchors: the anchor box sizes
Note: yolov4 has three [yolo] layers, so this must be changed in three places; yolov4-tiny has two [yolo] layers, so it must be changed in two places
[convolutional]
filters: the [convolutional] section immediately above each [yolo] layer must have filters = (classes + 5) × 3. With 6 classes, the value is 33
Note: yolov4 has three [yolo] layers with three corresponding [convolutional] sections, so change three places; yolov4-tiny has two [yolo] layers with two corresponding [convolutional] sections, so change two places
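The filters formula can be sanity-checked with a one-line helper: each of the 3 anchor boxes predicted per cell encodes 4 box coordinates, 1 objectness score, and one score per class.

```python
# filters for the [convolutional] layer before each [yolo] layer:
# 3 boxes per cell x (4 coords + 1 objectness + num_classes scores)
def yolo_filters(num_classes, boxes_per_cell=3):
    return (num_classes + 5) * boxes_per_cell

print(yolo_filters(6))  # 33, matching the 6-class example above
```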
- yoloxxx.conv.xx
Pre-trained weight file for yolov4: yolov4.conv.137
Pre-trained weight file for yolov4-tiny: yolov4-tiny.conv.29
- Create a backup folder
This is where the network training weights are saved
Step 4: Training
1. Recalculate the anchor boxes
The k-means algorithm is used to compute anchor box sizes suited to your dataset:
darknet.exe detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416
In the command, data/obj.data is the path of your xxx.data file, and -num_of_clusters is the number of anchor boxes: 9 for yolov4, 6 for yolov4-tiny. For example:
D:\darknet\build\darknet\x64\darknet.exe detector calc_anchors data/voc.data -num_of_clusters 6 -width 416 -height 416
After it runs, an anchors.txt file is generated; copy its contents into the anchors entries under [yolo] in the yoloxxx.cfg file
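Instead of pasting by hand, the copy step can be scripted. This is a hypothetical helper, not part of darknet; it assumes anchors.txt contains a single line of anchor values and that "anchors = ..." lines appear only in the [yolo] sections (as in the standard cfg files).

```python
# Hypothetical helper: copy the contents of anchors.txt into every
# "anchors = ..." line of the cfg file.
import re

def patch_anchors(cfg_path, anchors_path):
    with open(anchors_path) as f:
        anchors = f.read().strip()
    with open(cfg_path) as f:
        cfg = f.read()
    # replace each "anchors = ..." line (one per [yolo] section)
    cfg = re.sub(r'(?m)^anchors\s*=.*$', 'anchors = ' + anchors, cfg)
    with open(cfg_path, 'w') as f:
        f.write(cfg)
```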
2. Start training
darknet.exe detector train data/obj.data yolo-obj.cfg yolov4.conv.137
In the command, data/obj.data is your xxx.data file, yolo-obj.cfg is the network configuration file, and yolov4.conv.137 is the pre-trained weight file. For example (the -map flag additionally plots the mAP on the training chart):
D:\darknet\build\darknet\x64\darknet.exe detector train data/voc.data data/yolov4-tiny-custom.cfg data/yolov4-tiny.conv.29 -map