Mask detection with an impressive recognition rate: an open-source Python project


Yesterday I came across an interesting open-source project on GitHub that detects whether people are wearing masks. After testing it, I found the recognition rate is very high and it adapts well to different environments, so I am sharing it here.
First of all, thanks to AIZOOTech for open-sourcing the FaceMaskDetection project.

Testing environment

We use:

  • Windows system;
  • Software: PyCharm;
  • Framework: TensorFlow.
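
If you need to set up a similar environment, the commands below are a reasonable starting point (a hedged sketch: the exact package versions are assumptions, not pinned by the project):

pip install tensorflow==1.15      # any 1.x release works with the code below
pip install opencv-python pillow numpy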

First look at the effect:

The handsome Hu Ge is detected without a mask. The red frame circles the face, and the label above reads NoMask with a confidence of 1 (i.e., the model is 100% sure there is no mask).

Can it handle multiple people? See the figure below.

Not bad, this model can detect multiple people at the same time, and has high accuracy.

What about a mix of masked and unmasked people? Can it detect that?

Wow, this model is great: it finds the man wearing a mask and the two people without masks.

Next, let's analyze this project in detail:

  • It supports 5 mainstream deep learning frameworks (PyTorch, TensorFlow, MXNet, Keras, and Caffe), with inference interfaces already written for each; you can pick whichever framework suits your environment, such as TensorFlow. All models live under the models folder (a loading example follows this list).
  • Nearly 8,000 face mask images, together with trained models, have been made public. The data comes from the WIDER Face and MAFA datasets; the annotations were modified and verified (mainly because MAFA and WIDER Face define face positions differently) and have also been open-sourced.
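
As a concrete example, loading the TensorFlow model takes only two lines (these calls appear verbatim in the inference script later in this article; the other frameworks have analogous loaders, an assumption based on the project description above):

from load_model.tensorflow_loader import load_tf_model, tf_inference

sess, graph = load_tf_model('models/face_mask_detection.pb')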

Model structure

This project uses the SSD architecture. To let the model run in real time in the browser and on edge devices, it is designed to be very small, with only 1.015 million parameters. The model structure is described below.

The input size of the model is 260x260, and the backbone network has only eight convolution layers. Including the localization and classification layers, there are only 24 layers in total (with channel counts of 32, 64, and 128 per stage), so the model is very small, with only about 1,015,000 parameters. It detects ordinary faces well, but for small faces the detection is naturally not as good as a large model's.
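
To make that scale concrete, here is a minimal, illustrative sketch of a backbone of this size. It is not the exact AIZOO architecture: the per-layer channel split and pooling positions are assumptions, chosen so that eight convolution layers with 32/64/128 channels shrink a 260x260 input through the 33/17/9/5/3 feature map sizes used below.

import tensorflow as tf
from tensorflow.keras import layers, models

inputs = layers.Input(shape=(260, 260, 3))
x = inputs
# Assumed channel plan for the eight conv layers: 32, 32, 64, 64, 128, 128, 128, 128.
for i, filters in enumerate((32, 32, 64, 64, 128, 128, 128, 128)):
    x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
    if i < 7:
        # Seven 2x2 poolings: 260 -> 130 -> 65 -> 33 -> 17 -> 9 -> 5 -> 3;
        # SSD heads would attach at the 33/17/9/5/3 feature maps.
        x = layers.MaxPool2D(2, padding='same')(x)
backbone = models.Model(inputs, x)
backbone.summary()  # parameter count lands in the same ballpark as the ~1.015 million described above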

The web page uses the TensorFlow.js library, so the model runs entirely in the browser; speed depends on your computer's configuration.

The model attaches localization and classification heads to five convolutional feature maps; their sizes and anchor settings are shown in the table below.
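
The values below are taken from the anchor configuration in the inference script shown later in this article:

Feature map    Size       Anchor sizes (relative)    Aspect ratios
1              33 x 33    0.04, 0.056                1, 0.62, 0.42
2              17 x 17    0.08, 0.11                 1, 0.62, 0.42
3              9 x 9      0.16, 0.22                 1, 0.62, 0.42
4              5 x 5      0.32, 0.45                 1, 0.62, 0.42
5              3 x 3      0.64, 0.72                 1, 0.62, 0.42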

Structure of the project directory

After downloading and decompressing the FaceMaskDetection package, you get the project directory: the models folder (pretrained models), the load_model and utils packages (loading and inference helpers), the img directory (sample images), and the inference scripts such as tensorflow_infer.py.

How to run the program?

Take the TensorFlow model as an example; the code assumes TensorFlow 1.x.

If you are running TensorFlow 2.x, change the corresponding calls to tf.compat.v1.xxxx so that the 1.x-style code still works.
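
A minimal sketch of that change, using TensorFlow's official compatibility module:

import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # restore 1.x graph-and-session semantics under TensorFlow 2.x
sess = tf.Session()       # 1.x-style calls such as tf.Session() now work again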

If you want to run on a picture:

python tensorflow_infer.py --img-path /path/to/your/img

For example, the author has placed some pictures in the img directory; let's choose demo2.jpg.

python tensorflow_infer.py --img-path img/demo2.jpg

The result:

If you want to run on a video:

python tensorflow_infer.py --img-mode 0 --video-path /path/to/video

/path/to/video is the full path to the video file, including the file name.

If you want to use the camera to detect in real time:

python tensorflow_infer.py --img-mode 0 --video-path 0

Here, 0 is the camera device index; by default, 0 is the computer's built-in camera.

If you want to use an external camera (for example, a USB camera), change it to 1; see the probe sketch below.
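
If you are unsure which index an attached camera uses, a quick probe with standard OpenCV calls can help (a hedged sketch; device numbering varies by system):

import cv2

for index in range(3):  # probe the first few device indices
    cap = cv2.VideoCapture(index)
    if cap.isOpened():
        print("camera found at index", index)
    cap.release()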

Now let's take a look at the code in tensorflow_infer.py:

# -*- coding:utf-8 -*-
import cv2
import time
import argparse

import numpy as np
from PIL import Image
from utils.anchor_generator import generate_anchors
from utils.anchor_decode import decode_bbox
from utils.nms import single_class_non_max_suppression
from load_model.tensorflow_loader import load_tf_model, tf_inference


# sess, graph = load_tf_model('FaceMaskDetection-master/models/face_mask_detection.pb')
# Use forward slashes: in a plain string, '\f' is a form-feed escape and breaks the path.
sess, graph = load_tf_model('models/face_mask_detection.pb')
# anchor configuration
feature_map_sizes = [[33, 33], [17, 17], [9, 9], [5, 5], [3, 3]]
anchor_sizes = [[0.04, 0.056], [0.08, 0.11], [0.16, 0.22], [0.32, 0.45], [0.64, 0.72]]
anchor_ratios = [[1, 0.62, 0.42]] * 5

# generate anchors
anchors = generate_anchors(feature_map_sizes, anchor_sizes, anchor_ratios)

# For inference the batch size is 1 and the model output shape is [1, N, 4], so expand the anchors' dim to [1, anchor_num, 4]
anchors_exp = np.expand_dims(anchors, axis=0)
id2class = {0: 'Mask', 1: 'NoMask'}


def inference(image, conf_thresh=0.5, iou_thresh=0.4, target_shape=(160, 160), draw_result=True, show_result=True):
    ''' Main inference function.
    :param image: 3D numpy array of the image (RGB)
    :param conf_thresh: minimum classification-probability threshold
    :param iou_thresh: IOU threshold for non-max suppression
    :param target_shape: model input size
    :param draw_result: whether to draw the bounding box on the image
    :param show_result: whether to display the image
    '''
    # image = np.copy(image)
    output_info = []
    height, width, _ = image.shape
    image_resized = cv2.resize(image, target_shape)
    image_np = image_resized / 255.0  # Normalized to 0 ~ 1
    image_exp = np.expand_dims(image_np, axis=0)
    y_bboxes_output, y_cls_output = tf_inference(sess, graph, image_exp)

    # Remove the batch dimension; the batch is always 1 for inference.
    y_bboxes = decode_bbox(anchors_exp, y_bboxes_output)[0]
    y_cls = y_cls_output[0]
    # To speed up, perform a single class NMS instead of a multi class NMS.
    bbox_max_scores = np.max(y_cls, axis=1)
    bbox_max_score_classes = np.argmax(y_cls, axis=1)

    # keep_idxs holds the indices of the bounding boxes kept after NMS.
    keep_idxs = single_class_non_max_suppression(y_bboxes, bbox_max_scores, conf_thresh=conf_thresh, iou_thresh=iou_thresh)
    for idx in keep_idxs:
        conf = float(bbox_max_scores[idx])
        class_id = bbox_max_score_classes[idx]
        bbox = y_bboxes[idx]
        # Crop the coordinates to avoid exceeding the image boundary.
        xmin = max(0, int(bbox[0] * width))
        ymin = max(0, int(bbox[1] * height))
        xmax = min(int(bbox[2] * width), width)
        ymax = min(int(bbox[3] * height), height)

        if draw_result:
            if class_id == 0:
                color = (0, 255, 0)  # green box: Mask
            else:
                color = (255, 0, 0)  # red box: NoMask (the image is RGB at this point)
            cv2.rectangle(image, (xmin, ymin), (xmax, ymax), color, 2)
            cv2.putText(image, "%s: %.2f" % (id2class[class_id], conf), (xmin + 2, ymin - 2),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, color)
        output_info.append([class_id, conf, xmin, ymin, xmax, ymax])

    if show_result:
        Image.fromarray(image).show()
    return output_info


def run_on_video(video_path, output_video_name, conf_thresh):
    cap = cv2.VideoCapture(video_path)
    # Check that the video opened before querying its properties.
    if not cap.isOpened():
        raise ValueError("Video open failed.")
    height = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
    width = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
    fps = cap.get(cv2.CAP_PROP_FPS)
    fourcc = cv2.VideoWriter_fourcc(*'XVID')
    # writer = cv2.VideoWriter(output_video_name, fourcc, int(fps), (int(width), int(height)))
    total_frames = cap.get(cv2.CAP_PROP_FRAME_COUNT)
    status = True
    idx = 0
    while status:
        start_stamp = time.time()
        status, img_raw = cap.read()
        read_frame_stamp = time.time()
        if status:
            # Convert BGR -> RGB only after confirming a frame was read (avoids a crash at end of video).
            img_raw = cv2.cvtColor(img_raw, cv2.COLOR_BGR2RGB)
            inference(img_raw,
                      conf_thresh,
                      iou_thresh=0.5,
                      target_shape=(260, 260),
                      draw_result=True,
                      show_result=False)
            cv2.imshow('image', img_raw[:, :, ::-1])
            cv2.waitKey(1)
            inference_stamp = time.time()
            # writer.write(img_raw)
            write_frame_stamp = time.time()
            idx += 1
            print("%d of %d" % (idx, total_frames))
            print("read_frame:%f, infer time:%f, write time:%f" % (read_frame_stamp - start_stamp,
                                                                   inference_stamp - read_frame_stamp,
                                                                   write_frame_stamp - inference_stamp))
    # writer.release()


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Face Mask Detection")
    parser.add_argument('--img-mode', type=int, default=1, help='set 1 to run on image, 0 to run on video.')  # 1: detect a picture; 0: detect a video stream (file or camera)
    parser.add_argument('--img-path', type=str, help='path to your image.')
    parser.add_argument('--video-path', type=str, default='0', help='path to your video, `0` means to use camera.')
    # parser.add_argument('--hdf5', type=str, help='keras hdf5 file')
    args = parser.parse_args()
    if args.img_mode:
        imgPath = args.img_path
        img = cv2.imread(imgPath)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        inference(img, show_result=True, target_shape=(260, 260))
    else:
        video_path = args.video_path
        if args.video_path == '0':
            video_path = 0
        run_on_video(video_path, '', conf_thresh=0.5)

Test set PR curve

Because WIDER Face is a challenging dataset and the model is deliberately very small, the PR curve on faces is not spectacular. If needed, a larger model can be designed to improve detection of small faces.
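
For reference, a PR curve like this is produced by sweeping the confidence threshold over the test-set predictions. Below is a minimal sketch with scikit-learn and matplotlib (an assumption for illustration; the project's own evaluation scripts are not shown in this article, and a full detection PR curve also requires IOU-based matching of detections to ground truth):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve

# Hypothetical data: 1 if a detection matched a ground-truth face, 0 otherwise,
# plus the detector's confidence score for each detection.
y_true = np.array([1, 1, 0, 1, 0, 1, 0, 1])
scores = np.array([0.98, 0.92, 0.85, 0.77, 0.60, 0.55, 0.40, 0.30])

precision, recall, _ = precision_recall_curve(y_true, scores)
plt.plot(recall, precision)
plt.xlabel('Recall')
plt.ylabel('Precision')
plt.title('Test set PR curve')
plt.show()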

Thanks again to AIZOOTech for open-sourcing FaceMaskDetection.


Original release time: March 8, 2020
Author: a small tree x
This article comes from the "AI technology base" official account.

