Because my supervisor's project involves hand pose estimation, after some technical pre-research I finally selected Open-MMLab's MMPose as the base framework for the underlying architecture. First, the advantages of OpenMMLab:
1. High-quality code: the modularity is excellent and the framework is easy to extend (see the config sketch after this list)
2. Support for many research directions: there are now more than a dozen, including MMCV (foundational computer vision library), MMDetection (object detection toolbox), MMDetection3D (3D object detection), MMSegmentation (semantic segmentation toolbox), MMClassification (classification toolbox), MMPose (pose estimation toolbox), MMAction2 (action understanding toolbox), MMFashion (visual fashion analysis toolbox), MMEditing (image and video editing toolbox), MMFlow (optical flow toolbox), etc.
3. Active community: communication usually happens through QQ or WeChat groups, and questions get answered fairly promptly. For business-specific ideas, community volunteers or the officials will also offer suggestions, which is very helpful.
4. High performance: the off-the-shelf SOTA algorithms and pre-trained models provided are significantly better than the original official implementations
5. Rich documentation: the official docs have very detailed Chinese and English tutorials for each research direction, so beginners can get started quickly by following them. Big thumbs up.
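To give a taste of the modularity in point 1, here is a hypothetical, simplified sketch of the MMLab-style config pattern; the file path and parameter values are my own illustrative assumptions, not copied from a real MMPose config. Components are plain dicts that a registry resolves by their type key, and whole configs can be inherited and overridden via _base_:

# A hypothetical, simplified MMLab-style config, only to illustrate the modular pattern.
_base_ = ['../_base_/default_runtime.py']  # assumed path; configs inherit via _base_

model = dict(
    type='TopDown',                          # estimator class looked up in a registry
    backbone=dict(type='ResNet', depth=50),  # swapping architectures means editing one dict
    keypoint_head=dict(
        type='TopdownHeatmapSimpleHead',     # head name as in MMPose 0.x, used illustratively
        in_channels=2048,
        out_channels=21,                     # one heatmap per hand keypoint
    ),
)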
During the technical pre-research I also looked at Baidu's PaddlePaddle (PP) and Megvii's MegEngine (Tianyuan). They are both very well done, but support fewer research directions.
The experimental process and some precautions are shown below. I hope this helps whoever comes across it, o( ̄▽ ̄)o
1. Environment preparation
1.1 Basic environment
Operating system: Windows 10
Python: 3.8
Anaconda: 4.10.1 (the exact version doesn't matter)
If Anaconda is not installed yet, please search for a guide and install it yourself, or refer to my earlier tutorial (Windows 10 installation of Anaconda + CUDA + cuDNN).
1.2 Installing Labelme
- Please refer to the official installation guide: https://github.com/wkentaro/labelme
1.2.1: Create an Anaconda virtual environment
conda create -n labelme python=3.8
1.2.2: Activate the virtual environment
conda activate labelme
1.2.3: Install Labelme's dependencies
conda install pillow
conda install pyqt
1.2.4: Install Labelme
pip install labelme
1.2.5: Run it
labelme
2. Annotate the images
The annotation process is very simple: draw the bounding box of the hand, then mark the 21 hand keypoints
After finishing, it looks like this:
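Before moving on to the conversion, it can be worth sanity-checking the Labelme output. Below is a minimal sketch, assuming each image contains exactly one hand rectangle with 21 point shapes, and that the json files sit in a hypothetical ./labelme_json directory:

import glob
import json

# Quick sanity check on Labelme output: every annotation file should contain
# one 'rectangle' shape (the hand box) and 21 'point' shapes (the keypoints).
for path in glob.glob("./labelme_json/*.json"):  # hypothetical directory
    with open(path, encoding="utf-8") as f:
        shapes = json.load(f)["shapes"]
    n_boxes = sum(s["shape_type"] == "rectangle" for s in shapes)
    n_points = sum(s["shape_type"] == "point" for s in shapes)
    if n_points != 21 * n_boxes:
        print(f"{path}: {n_boxes} box(es) but {n_points} keypoint(s), please recheck")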
3. Label conversion
The popular annotation formats are COCO and VOC. Because OpenMMLab has better support for COCO, the annotations are converted to a COCO-format dataset here.
The code (adapted from m5823779/labelme2coco-keypoints, reference [4]) is as follows, with detailed comments.
import os
import sys
import glob
import json
import shutil
import argparse
import numpy as np
import PIL.Image
import os.path as osp
from tqdm import tqdm
from labelme import utils
from sklearn.model_selection import train_test_split


class Labelme2coco_keypoints():
    def __init__(self, args):
        """
        Constructor for converting a Labelme keypoint dataset to a COCO dataset.

        Args:
            args: parameters entered on the command line
                - class_name: root class name
        """
        self.classname_to_id = {args.class_name: 1}
        self.images = []
        self.annotations = []
        self.categories = []
        self.ann_id = 0
        self.img_id = 0

    def save_coco_json(self, instance, save_path):
        json.dump(instance, open(save_path, 'w', encoding='utf-8'),
                  ensure_ascii=False, indent=1)

    def read_jsonfile(self, path):
        with open(path, "r", encoding='utf-8') as f:
            return json.load(f)

    def _get_box(self, points):
        # Compute a COCO-style bbox [x, y, width, height] from the rectangle corners
        min_x = min_y = np.inf
        max_x = max_y = 0
        for x, y in points:
            min_x = min(min_x, x)
            min_y = min(min_y, y)
            max_x = max(max_x, x)
            max_y = max(max_y, y)
        return [min_x, min_y, max_x - min_x, max_y - min_y]

    def _get_keypoints(self, points, keypoints, num_keypoints):
        """
        Parse the raw Labelme data and generate COCO-style keypoint entries.
        For example:
            "keypoints": [
                67.06149888292556,  # x value
                122.5043507571318,  # y value
                1,                  # visibility flag: 0 = not visible, 1 = visible
                82.42582269256718,
                109.95672933232304,
                1,
                ...
            ],
        """
        if points[0] == 0 and points[1] == 0:
            visible = 0
        else:
            visible = 1
            num_keypoints += 1
        keypoints.extend([points[0], points[1], visible])
        return keypoints, num_keypoints

    def _image(self, obj, path):
        """
        Parse a Labelme obj and generate a COCO image object with four
        attributes: id, file_name, height, width.
        Example:
            {
                "file_name": "training/rgb/00031426.jpg",
                "height": 224,
                "width": 224,
                "id": 31426
            }
        """
        image = {}
        # Take the imageData attribute of the original Labelme annotation and
        # convert it to an array with Labelme's utility method
        img_x = utils.img_b64_to_arr(obj['imageData'])
        image['height'], image['width'] = img_x.shape[:-1]  # image height and width

        # self.img_id = int(os.path.basename(path).split(".json")[0])
        self.img_id = self.img_id + 1
        image['id'] = self.img_id
        image['file_name'] = os.path.basename(path).replace(".json", ".jpg")
        return image

    def _annotation(self, bboxes_list, keypoints_list, json_path):
        """
        Generate the COCO annotations.

        Args:
            bboxes_list: rectangular annotation boxes
            keypoints_list: keypoints
            json_path: path of the json file
        """
        if len(keypoints_list) != args.join_num * len(bboxes_list):
            print('missing {} keypoint(s) in file {}'.format(
                args.join_num * len(bboxes_list) - len(keypoints_list), json_path))
            print('Please check !!!')
            sys.exit()
        i = 0
        for object in bboxes_list:
            annotation = {}
            keypoints = []
            num_keypoints = 0

            label = object['label']
            bbox = object['points']
            annotation['id'] = self.ann_id
            annotation['image_id'] = self.img_id
            annotation['category_id'] = int(self.classname_to_id[label])
            annotation['iscrowd'] = 0
            annotation['area'] = 1.0
            annotation['segmentation'] = [np.asarray(bbox).flatten().tolist()]
            annotation['bbox'] = self._get_box(bbox)

            for keypoint in keypoints_list[i * args.join_num: (i + 1) * args.join_num]:
                point = keypoint['points']
                annotation['keypoints'], num_keypoints = self._get_keypoints(point[0], keypoints, num_keypoints)
            annotation['num_keypoints'] = num_keypoints

            i += 1
            self.ann_id += 1
            self.annotations.append(annotation)

    def _init_categories(self):
        """
        Initialize the COCO label categories.
        For example:
            "categories": [
                {
                    "supercategory": "hand",
                    "id": 1,
                    "name": "hand",
                    "keypoints": ["wrist", "thumb1", "thumb2", ...],
                    "skeleton": []
                }
            ]
        """
        for name, id in self.classname_to_id.items():
            category = {}
            category['supercategory'] = name
            category['id'] = id
            category['name'] = name
            # The 21 hand keypoints (COCO expects these under the key 'keypoints')
            category['keypoints'] = [
                "wrist",
                "thumb1", "thumb2", "thumb3", "thumb4",
                "forefinger1", "forefinger2", "forefinger3", "forefinger4",
                "middle_finger1", "middle_finger2", "middle_finger3", "middle_finger4",
                "ring_finger1", "ring_finger2", "ring_finger3", "ring_finger4",
                "pinky_finger1", "pinky_finger2", "pinky_finger3", "pinky_finger4"]
            # category['keypoints'] = [str(i + 1) for i in range(args.join_num)]
            self.categories.append(category)

    def to_coco(self, json_path_list):
        """
        Convert the original Labelme labels to the COCO dataset format,
        including labels and images.

        Args:
            json_path_list: list of the original annotation files
        """
        self._init_categories()
        for json_path in tqdm(json_path_list):
            obj = self.read_jsonfile(json_path)              # parse one label file
            self.images.append(self._image(obj, json_path))  # parse the image
            shapes = obj['shapes']                           # read the labelme shape annotations

            bboxes_list, keypoints_list = [], []
            for shape in shapes:
                if shape['shape_type'] == 'rectangle':       # bboxes
                    bboxes_list.append(shape)
                elif shape['shape_type'] == 'point':         # keypoints
                    keypoints_list.append(shape)
            self._annotation(bboxes_list, keypoints_list, json_path)

        keypoints = {}
        keypoints['info'] = {'description': 'Monitor Dataset', 'version': 1.0, 'year': 2020}
        keypoints['license'] = ['Acer']
        keypoints['images'] = self.images
        keypoints['annotations'] = self.annotations
        keypoints['categories'] = self.categories
        return keypoints


def init_dir(base_path):
    """
    Initialize the folder structure of the COCO dataset:
        coco
        - annotations  # annotation files
        - train        # training set
        - val          # validation set

    Args:
        base_path: root path where the dataset is placed
    """
    if not os.path.exists(os.path.join(base_path, "coco", "annotations")):
        os.makedirs(os.path.join(base_path, "coco", "annotations"))
    if not os.path.exists(os.path.join(base_path, "coco", "train")):
        os.makedirs(os.path.join(base_path, "coco", "train"))
    if not os.path.exists(os.path.join(base_path, "coco", "val")):
        os.makedirs(os.path.join(base_path, "coco", "val"))


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("--class_name", "--n", help="class name", type=str, required=True)
    parser.add_argument("--input", "--i", help="json file path (labelme)", type=str, required=True)
    parser.add_argument("--output", "--o", help="output file path (coco format)", type=str, required=True)
    parser.add_argument("--join_num", "--j", help="number of joints", type=int, required=True)
    parser.add_argument("--ratio", "--r", help="train and test split ratio", type=float, default=0.12)
    args = parser.parse_args()

    labelme_path = args.input
    saved_coco_path = args.output

    init_dir(saved_coco_path)  # initialize the COCO dataset folder structure
    json_list_path = glob.glob(labelme_path + "/*.json")
    train_path, val_path = train_test_split(json_list_path, test_size=args.ratio)
    print('{} for training'.format(len(train_path)),
          '\n{} for testing'.format(len(val_path)))
    print('Start transform please wait ...')

    l2c_train = Labelme2coco_keypoints(args)  # build the dataset generation class
    # Generate the training set
    train_keypoints = l2c_train.to_coco(train_path)
    l2c_train.save_coco_json(train_keypoints,
                             os.path.join(saved_coco_path, "coco", "annotations", "keypoints_train.json"))

    # Generate the validation set
    l2c_val = Labelme2coco_keypoints(args)
    val_instance = l2c_val.to_coco(val_path)
    l2c_val.save_coco_json(val_instance,
                           os.path.join(saved_coco_path, "coco", "annotations", "keypoints_val.json"))

    # Copy the original Labelme images into the training and validation sets.
    # Note: the extension here is .bmp while file_name above assumes .jpg;
    # adjust one of them to match your actual image format.
    for file in train_path:
        shutil.copy(file.replace("json", "bmp"), os.path.join(saved_coco_path, "coco", "train"))
    for file in val_path:
        shutil.copy(file.replace("json", "bmp"), os.path.join(saved_coco_path, "coco", "val"))
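If the script is saved as, say, labelme2coco_keypoints.py (the file name is my own choice), it can be run like this, with 21 as the number of hand keypoints and the Labelme json files under a hypothetical ./labelme_json directory:

python labelme2coco_keypoints.py --class_name hand --input ./labelme_json --output ./ --join_num 21 --ratio 0.12

The converted annotations end up in ./coco/annotations/keypoints_train.json and keypoints_val.json, and the images are copied into ./coco/train and ./coco/val.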
The code is a little messy; I'll tidy it up over the weekend. O(∩_∩)O haha~
Students with questions are welcome to leave a comment for discussion. I've only just entered the pit, and the road of alchemy (model training) has just begun. Keep going~~~~
References
[1]. labelme batch json_to_dataset conversion
[2]. A new chapter in OpenMMLab
[3]. Labelme tutorial
[4]. m5823779/labelme2coco-keypoints