Creating a 21-keypoint hand dataset with Labelme and converting it to COCO format

     Because my advisor's project is hand pose estimation, after some preliminary technical research I chose MMPose from OpenMMLab as the base framework. First, a few words on the advantages of OpenMMLab:

1. High-quality code: it is highly modular and easy to extend.
2. Support for many research directions: there are now more than a dozen, including MMCV (foundation library for computer vision research), MMDetection (object detection toolbox), MMDetection3D (3D object detection), MMSegmentation (semantic segmentation toolbox), MMClassification (classification toolbox), MMPose (pose estimation toolbox), MMAction2 (action understanding toolbox), MMFashion (visual fashion analysis toolbox), MMEditing (image and video editing toolbox), MMFlow (optical flow toolbox), and so on.
3. Active community: discussion mostly happens in QQ or WeChat groups, and questions are usually answered fairly quickly. Community volunteers and the official maintainers also offer ideas and suggestions for business use cases, which is very helpful.
4. High performance: the off-the-shelf SOTA algorithms and pretrained models it provides are reported to outperform the original official implementations.
5. Rich documentation: the official docs include detailed Chinese and English tutorials for each research direction, so beginners can get started quickly by following them. Thumbs up for that.

     During the preliminary research I also looked at Baidu's PaddlePaddle (PP) and Megvii's MegEngine. Both are very well done, but they support fewer research directions.
     The experimental process and some caveats are described below. I hope it helps anyone who happens to find it o(￣▽￣)o

1. Environment preparation

1.1 Basic environment

Operating system: Windows 10
Python: 3.8
Anaconda: 4.10.1 (the exact version is not important)

Anaconda must be installed first. If it is not installed yet, please search for instructions and install it yourself, or refer to my previous installation tutorial (Windows 10: installing Anaconda + CUDA + cuDNN).

1.2 Installing Labelme

  • Please refer to the official Labelme installation instructions.

1.2.1: Create an Anaconda virtual environment

conda create -n labelme python=3.8

1.2.2: Activate the virtual environment

conda activate labelme

1.2.3: Install Labelme's dependencies

conda install pillow
conda install pyqt

1.2.4: Install Labelme

pip install labelme
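
The conversion script in section 3 imports a few packages besides Labelme itself (tqdm and scikit-learn, for example), and the steps above do not install them explicitly. As a rough sketch, the check below reports which of them cannot be found in the active environment. The helper name `missing_modules` and the `REQUIRED_MODULES` list are my own, not part of Labelme or of the script.

```python
import importlib.util

# Import names used by the conversion script in section 3.
# The import name can differ from the pip package name
# (for example "sklearn" is installed as scikit-learn, "PIL" as pillow).
REQUIRED_MODULES = ["numpy", "PIL", "tqdm", "labelme", "sklearn"]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Anything reported as missing can be installed into the same environment with pip or conda before running the conversion.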

1.2.5: Run Labelme

labelme


2. Annotate images

The annotation process is simple: for each hand, draw its bounding box as a rectangle and mark the 21 keypoints of the hand as points.
After finishing, it looks like this:

3. Label conversion

     The popular annotation formats are COCO and VOC. Because OpenMMLab supports COCO better, the labels are converted to a COCO-format dataset.
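
Before converting, it may be worth checking that every Labelme file really contains the expected shapes. The conversion script assigns points to boxes purely by annotation order (the first `join_num` points go to the first rectangle, and so on), so a missing or extra point silently shifts everything after it. The sketch below is only an illustration: the function names and the hand-written `example` dictionary are hypothetical, and it relies only on the `shapes` and `shape_type` fields that the script itself reads.

```python
import json


def check_labelme_shapes(labelme_json, num_joints=21):
    """Count rectangles and points in one parsed Labelme annotation.

    Returns (num_boxes, num_points, ok). ok is True when there is at least
    one rectangle and the total number of points is num_boxes * num_joints.
    """
    shapes = labelme_json.get("shapes", [])
    num_boxes = sum(1 for s in shapes if s.get("shape_type") == "rectangle")
    num_points = sum(1 for s in shapes if s.get("shape_type") == "point")
    ok = num_boxes > 0 and num_points == num_boxes * num_joints
    return num_boxes, num_points, ok


def check_labelme_file(path, num_joints=21):
    """Load one Labelme JSON file from disk and check it."""
    with open(path, "r", encoding="utf-8") as f:
        return check_labelme_shapes(json.load(f), num_joints)


# A tiny hand-written stand-in for a Labelme file (hypothetical values).
example = {
    "shapes": [
        {"label": "hand", "shape_type": "rectangle",
         "points": [[10.0, 20.0], [110.0, 140.0]]},
    ] + [
        {"label": str(i + 1), "shape_type": "point",
         "points": [[15.0 + i, 25.0 + i]]}
        for i in range(21)
    ]
}

print(check_labelme_shapes(example))  # (1, 21, True)
```

Note that this only checks the totals per file. It cannot tell whether the points were annotated in a consistent order, which still has to be taken care of while labeling.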

The conversion script is as follows, with detailed comments.

import os
import sys
import glob
import json
import shutil
import argparse
import numpy as np
import PIL.Image
import os.path as osp
from tqdm import tqdm
from labelme import utils
from sklearn.model_selection import train_test_split

class Labelme2coco_keypoints():
    def __init__(self, args):
        """
        Constructor for the Labelme-keypoints-to-COCO dataset converter.

        Args:
            args: arguments entered on the command line
                - class_name: root class name
        """
        self.classname_to_id = {args.class_name: 1}
        self.images = []
        self.annotations = []
        self.categories = []
        self.ann_id = 0
        self.img_id = 0

    def save_coco_json(self, instance, save_path):
        # Use a context manager so the file is flushed and closed
        with open(save_path, 'w', encoding='utf-8') as f:
            json.dump(instance, f, ensure_ascii=False, indent=1)

    def read_jsonfile(self, path):
        with open(path, "r", encoding='utf-8') as f:
            return json.load(f)

    def _get_box(self, points):
        min_x = min_y = np.inf
        max_x = max_y = 0
        for x, y in points:
            min_x = min(min_x, x)
            min_y = min(min_y, y)
            max_x = max(max_x, x)
            max_y = max(max_y, y)
        return [min_x, min_y, max_x - min_x, max_y - min_y]

    def _get_keypoints(self, points, keypoints, num_keypoints):
        """
        Parse one raw labelme point and append it to the COCO keypoint list.

        For example:
            "keypoints": [
                67.06149888292556,  # Value of x
                122.5043507571318,  # Value of y
                1,                  # Takes the place of a z value. For 2D keypoints, 0: invisible, 1: visible.
                ...
            ]
        """
        # Note: the COCO format itself uses v=2 for "labeled and visible";
        # this script writes 1 for every labeled point.
        if points[0] == 0 and points[1] == 0:
            visible = 0
        else:
            visible = 1
            num_keypoints += 1
        keypoints.extend([points[0], points[1], visible])
        return keypoints, num_keypoints

    def _image(self, obj, path):
        """
        Parse a labelme obj and generate the COCO image object.

        The generated object has 4 attributes: id, file_name, height, width.

        For example:
            {
                "file_name": "training/rgb/00031426.jpg",
                "height": 224,
                "width": 224,
                "id": 31426
            }
        """
        image = {}

        # Decode the imageData attribute of the labelme file into an array using labelme's helper
        img_x = utils.img_b64_to_arr(obj['imageData'])
        # The array shape is (height, width[, channels]); [:2] also works for grayscale images
        image['height'], image['width'] = img_x.shape[:2]

        # self.img_id = int(os.path.basename(path).split(".json")[0])
        self.img_id = self.img_id + 1
        image['id'] = self.img_id

        # NOTE: ".jpg" is assumed here, while the copy step at the end of the script
        # copies ".bmp" files; adjust whichever does not match your images.
        image['file_name'] = os.path.basename(path).replace(".json", ".jpg")

        return image

    def _annotation(self, bboxes_list, keypoints_list, json_path):
        """
        Generate the COCO annotations for one image.

        Args:
            bboxes_list: rectangle (bounding box) shapes
            keypoints_list: keypoint shapes
            json_path: path of the labelme JSON file
        """
        # `args` is the module-level namespace parsed in the __main__ block
        if len(keypoints_list) != args.join_num * len(bboxes_list):
            print('Missing {} keypoint(s) in file {}'.format(args.join_num * len(bboxes_list) - len(keypoints_list), json_path))
            print('Please check !!!')
        for i, box_shape in enumerate(bboxes_list):
            annotation = {}
            keypoints = []
            num_keypoints = 0

            label = box_shape['label']
            bbox = box_shape['points']
            annotation['id'] = self.ann_id
            annotation['image_id'] = self.img_id
            annotation['category_id'] = int(self.classname_to_id[label])
            annotation['iscrowd'] = 0
            # Placeholder value; COCO's OKS evaluation scales by area, so the bbox area may be a better choice
            annotation['area'] = 1.0
            annotation['segmentation'] = [np.asarray(bbox).flatten().tolist()]
            annotation['bbox'] = self._get_box(bbox)

            # Points are matched to boxes by annotation order: box i takes points [i * join_num, (i + 1) * join_num)
            for keypoint in keypoints_list[i * args.join_num: (i + 1) * args.join_num]:
                point = keypoint['points']
                annotation['keypoints'], num_keypoints = self._get_keypoints(point[0], keypoints, num_keypoints)
            annotation['num_keypoints'] = num_keypoints

            self.ann_id += 1
            self.annotations.append(annotation)  # Store the finished annotation

    def _init_categories(self):
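        """
        Initialize the COCO label categories.

        For example:
        "categories": [
            {
                "supercategory": "hand",
                "id": 1,
                "name": "hand",
                "keypoints": [...],
                "skeleton": [...]
            }
        ]
        """
        for name, cat_id in self.classname_to_id.items():
            category = {}

            category['supercategory'] = name
            category['id'] = cat_id
            category['name'] = name
            # 21 key point data. The explicit name list (starting with "wrist") and the
            # skeleton are truncated in this post, so the generic numbered names from the
            # commented-out alternative are used instead.
            category['keypoints'] = [str(i + 1) for i in range(args.join_num)]
            self.categories.append(category)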

    def to_coco(self, json_path_list):
        """
        Convert the original labelme labels to the COCO dataset format, including labels and images.

        Args:
            json_path_list: paths of the original labelme JSON files
        """
        self._init_categories()  # Initialize the categories

        for json_path in tqdm(json_path_list):
            obj = self.read_jsonfile(json_path)  # Parse a label file
            self.images.append(self._image(obj, json_path))  # Parse the image
            shapes = obj['shapes']  # Read the labelme shape annotations

            bboxes_list, keypoints_list = [], []
            for shape in shapes:
                if shape['shape_type'] == 'rectangle':  # bboxes
                    bboxes_list.append(shape)
                elif shape['shape_type'] == 'point':    # keypoints
                    keypoints_list.append(shape)

            self._annotation(bboxes_list, keypoints_list, json_path)

        keypoints = {}
        keypoints['info'] = {'description': 'Monitor Dataset', 'version': 1.0, 'year': 2020}
        keypoints['licenses'] = ['Acer']  # the COCO format names this key "licenses"
        keypoints['images'] = self.images
        keypoints['annotations'] = self.annotations
        keypoints['categories'] = self.categories
        return keypoints

def init_dir(base_path):
    """
    Initialize the folder structure of the COCO dataset:
    coco - annotations  # Label file path
         - train        # Training data set
         - val          # Validation data set

    Args:
        base_path: root path where the dataset is placed
    """
    for sub_dir in ("annotations", "train", "val"):
        os.makedirs(os.path.join(base_path, "coco", sub_dir), exist_ok=True)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("--class_name", "--n", help="class name", type=str, required=True)
    parser.add_argument("--input", "--i", help="json file path (labelme)", type=str, required=True)
    parser.add_argument("--output", "--o", help="output file path (coco format)", type=str, required=True)
    parser.add_argument("--join_num", "--j", help="number of joints (keypoints) per instance", type=int, required=True)
    parser.add_argument("--ratio", "--r", help="fraction of files held out for validation", type=float, default=0.12)
    args = parser.parse_args()

    labelme_path = args.input
    saved_coco_path = args.output

    init_dir(saved_coco_path)  # Initializes the folder structure of the COCO dataset

    json_list_path = glob.glob(labelme_path + "/*.json")
    train_path, val_path = train_test_split(json_list_path, test_size=args.ratio)
    print('{} for training'.format(len(train_path)),
          '\n{} for validation'.format(len(val_path)))
    print('Starting conversion, please wait ...')

    l2c_train = Labelme2coco_keypoints(args)  # Construct dataset generation class

    # Generate training set
    train_keypoints = l2c_train.to_coco(train_path)
    l2c_train.save_coco_json(train_keypoints, os.path.join(saved_coco_path, "coco", "annotations", "keypoints_train.json"))

    # Generate validation set
    l2c_val = Labelme2coco_keypoints(args)
    val_instance = l2c_val.to_coco(val_path)
    l2c_val.save_coco_json(val_instance, os.path.join(saved_coco_path, "coco", "annotations", "keypoints_val.json"))

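    # Copy the original labelme images to the training and validation sets.
    # The images are assumed to be ".bmp" files stored next to the JSON files.
    # os.path.splitext swaps only the extension, whereas str.replace("json", "bmp")
    # would also rewrite any "json" elsewhere in the path.
    for file in train_path:
        shutil.copy(os.path.splitext(file)[0] + ".bmp", os.path.join(saved_coco_path, "coco", "train"))
    for file in val_path:
        shutil.copy(os.path.splitext(file)[0] + ".bmp", os.path.join(saved_coco_path, "coco", "val"))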
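
The script is run from the command line with the four required arguments, for example `--class_name hand --input <labelme_dir> --output <out_dir> --join_num 21` (the directories are placeholders). Afterwards it may be worth checking the generated files under coco/annotations. The sketch below is my own helper, not part of the script above or of any library: it checks that each annotation holds `3 * join_num` keypoint values (flat x, y, v triplets), that `num_keypoints` matches the number of labeled points, and that each `image_id` exists. The `example` data is hand-written for illustration.

```python
def find_keypoint_problems(coco, num_joints=21):
    """Return a list of problems found in a COCO-style keypoint dict."""
    problems = []
    image_ids = {img["id"] for img in coco.get("images", [])}
    for ann in coco.get("annotations", []):
        ann_id = ann.get("id")
        if ann.get("image_id") not in image_ids:
            problems.append("annotation {}: unknown image_id".format(ann_id))
        kpts = ann.get("keypoints", [])
        if len(kpts) != 3 * num_joints:
            problems.append("annotation {}: expected {} values, got {}".format(
                ann_id, 3 * num_joints, len(kpts)))
            continue
        # keypoints is a flat list [x1, y1, v1, x2, y2, v2, ...];
        # every third value is the visibility flag.
        labeled = sum(1 for v in kpts[2::3] if v > 0)
        if labeled != ann.get("num_keypoints"):
            problems.append("annotation {}: num_keypoints is {}, counted {}".format(
                ann_id, ann.get("num_keypoints"), labeled))
    return problems


# Hand-written example data (hypothetical values, one hand with 21 keypoints).
example = {
    "images": [{"id": 1, "file_name": "0001.bmp", "height": 224, "width": 224}],
    "annotations": [{
        "id": 0,
        "image_id": 1,
        "category_id": 1,
        "bbox": [10.0, 20.0, 100.0, 120.0],
        "keypoints": [c for i in range(21) for c in (15.0 + i, 25.0 + i, 1)],
        "num_keypoints": 21,
    }],
}

print(find_keypoint_problems(example))  # []
```

To check a real output file, load coco/annotations/keypoints_train.json (or keypoints_val.json) with `json.load` and pass the resulting dictionary to the function. An empty list means none of these three checks failed.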

The code is a little messy; I will tidy it up over the weekend. O(∩_∩)O haha~
If you have questions, feel free to leave a comment for discussion. I am new to this myself, and my deep learning "alchemy" journey has only just begun. Let's keep going~~~~



Posted on Fri, 19 Nov 2021 05:37:29 -0500 by Wardy7