Problems encountered when converting VOC-format xml annotations to YOLO labels

Training on the DIOR remote sensing dataset with YOLOv5:

I plan to train YOLOv5 on the DIOR remote sensing dataset. The DIOR annotations are in VOC format (in fact there are two annotation versions, VOC and OBB, both stored as xml files; because I could not log in to the official website, I downloaded the data from a Baidu Cloud link shared on CSDN, so the labels may contain errors). The VOC annotations therefore have to be converted to YOLO format before training can start.
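
For reference, a YOLO label line has the form "class x_center y_center width height", with the four box values divided by the image size. The following is only a minimal sketch of that arithmetic, not the script discussed below: the function name voc_to_yolo and the example numbers are made up for illustration, and it assumes 0-based pixel coordinates (so it does not apply the -1 offset that the original script uses).

```python
def voc_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    """Convert a VOC corner box (pixels) to a normalized YOLO box."""
    # Clamp the corners to the image so that a slightly out-of-range label
    # cannot produce values outside [0, 1].
    xmin, xmax = max(0.0, xmin), min(float(img_w), xmax)
    ymin, ymax = max(0.0, ymin), min(float(img_h), ymax)
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# Illustrative values only: an 800x800 image with one box.
print(voc_to_yolo(800, 800, 100, 200, 300, 400))
# -> (0.25, 0.375, 0.25, 0.25)
```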

Problem Description:

On CSDN I found a script, shared by an experienced user, that converts VOC annotations to YOLO format by extracting the coordinates from the xml tags. The link address is: Voc_label.py. The original code is as follows:

# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET
import os
from os import getcwd

sets = ['train', 'val', 'test']
classes = ["a", "b"]   # Change to your own category
abs_path = os.getcwd()
print(abs_path)

def convert(size, box):
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return x, y, w, h

def convert_annotation(image_id):
    in_file = open('/home/trainingai/zyang/yolov5/paper_data/Annotations/%s.xml' % (image_id), encoding='UTF-8')
    out_file = open('/home/trainingai/zyang/yolov5/paper_data/labels/%s.txt' % (image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        # difficult = obj.find('difficult').text
        difficult = obj.find('Difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
             float(xmlbox.find('ymax').text))
        b1, b2, b3, b4 = b
        # Clamp boxes that extend past the right or bottom edge of the image
        if b2 > w:
            b2 = w
        if b4 > h:
            b4 = h
        b = (b1, b2, b3, b4)
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    # Close both files so that every label is flushed to disk
    in_file.close()
    out_file.close()

wd = getcwd()
for image_set in sets:
    if not os.path.exists('/home/trainingai/zyang/yolov5/paper_data/labels/'):
        os.makedirs('/home/trainingai/zyang/yolov5/paper_data/labels/')
    image_ids = open('/home/trainingai/zyang/yolov5/paper_data/ImageSets/Main/%s.txt' % (image_set)).read().strip().split()
    list_file = open('paper_data/%s.txt' % (image_set), 'w')
    for image_id in image_ids:
        list_file.write(abs_path + '/paper_data/images/%s.jpg\n' % (image_id))
        convert_annotation(image_id)
    list_file.close()
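
The script above hard-codes absolute Linux paths in several places. One possible way to make it portable between Ubuntu and Windows is to build every path from a single root with pathlib. This is only a sketch under assumptions: DATA_ROOT and dataset_paths are hypothetical names, and the folder names are taken from the paths used in the script.

```python
from pathlib import Path

# Hypothetical dataset root; point this at your own paper_data folder.
DATA_ROOT = Path("paper_data")


def dataset_paths(root, image_id):
    """Return (xml_path, label_path, image_path) for one image id."""
    root = Path(root)
    xml_path = root / "Annotations" / f"{image_id}.xml"
    label_path = root / "labels" / f"{image_id}.txt"
    image_path = root / "images" / f"{image_id}.jpg"
    return xml_path, label_path, image_path


# "00001" is an illustrative image id.
xml_path, label_path, image_path = dataset_paths(DATA_ROOT, "00001")
print(xml_path.name, label_path.name, image_path.name)
```

With this layout only DATA_ROOT needs to change when the dataset moves to another machine, because pathlib inserts the separator of the current operating system.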

Problem:
  1. Path problem. If you run the script on Windows, change the paths to Windows-style paths.
  2. An error occurs when parsing the 'difficult' tag.
  3. The script fails with the error shown in the figure:

Cause analysis:

1. The Ubuntu paths in the script do not match the paths on Windows. Modify them to suit your own machine.
2. Parsing the 'difficult' tag raises an error. In my case the annotation files do not contain a 'difficult' tag at all, so I commented that line out (the int(difficult) == 1 check a few lines later then has to be removed as well). In another case, the tag name written in the original code has a capitalized initial letter ('Difficult'), as shown in the figure:

If your annotations use the lowercase 'difficult' tag, uncomment the first line and comment out the second one, which reads 'Difficult'. Modify this according to the tags used in your own dataset.
3. Don't forget to change the classes list so that the detection categories match your own dataset.
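
Instead of commenting lines in and out for each dataset, the tag lookup can be made tolerant of both spellings and of a missing tag. The helper below is a sketch, not part of the original script: read_difficult is a hypothetical name, and treating a missing tag as 0 (not difficult) is an assumption you may want to change.

```python
import xml.etree.ElementTree as ET


def read_difficult(obj, default=0):
    """Return the object's difficult flag, or `default` if the tag is absent."""
    for tag in ("difficult", "Difficult"):
        node = obj.find(tag)
        # Test against None explicitly; truth-testing an Element is unreliable.
        if node is not None and node.text is not None:
            return int(node.text)
    return default


with_flag = ET.fromstring("<object><name>a</name><Difficult>1</Difficult></object>")
without_flag = ET.fromstring("<object><name>a</name></object>")
print(read_difficult(with_flag), read_difficult(without_flag))
# -> 1 0
```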

Solution:

The error highlighted in red points back to line 45 of the script, that is, to the parsing of the 'bndbox' tag. Printing the parse result for each object gives the output shown in the following figure:


The parse result on the last line is empty: its type is NoneType, which is why the error is raised.
Opening the annotation file shows the following:

The code parses 'bndbox', but this annotation file uses 'robndbox' instead. Because the data was not downloaded from the official website, some files may be wrong. This tag probably comes from the other (OBB) annotation version, although I don't know exactly what it means.
The fix is either to replace the tag in the affected annotation files, or to modify the code so that it also checks for the 'robndbox' tag when looking up the box.
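
The second option could look roughly like the sketch below. The helper name find_box and the sample xml are made up for illustration. Since I don't know what the children of 'robndbox' contain, the sketch only detects and reports such objects instead of guessing how to read their coordinates.

```python
import xml.etree.ElementTree as ET


def find_box(obj):
    """Return (tag, element) for the first box tag found, else (None, None)."""
    for tag in ("bndbox", "robndbox"):
        node = obj.find(tag)
        if node is not None:
            return tag, node
    return None, None


# Made-up annotation with one normal object and one 'robndbox' object.
sample = ET.fromstring(
    "<annotation>"
    "<object><name>a</name><bndbox><xmin>1</xmin></bndbox></object>"
    "<object><name>b</name><robndbox/></object>"
    "</annotation>"
)
for obj in sample.iter("object"):
    tag, node = find_box(obj)
    if tag != "bndbox":
        # Report the object rather than crash on a None result.
        print("skipping", obj.find("name").text, "box tag:", tag)
# -> skipping b box tag: robndbox
```

Running a check like this over all annotation files first gives a list of the files that need their tags replaced.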

Tags: Python xml

Posted on Fri, 22 Oct 2021 09:31:59 -0400 by Bengo