Human image segmentation based on the image segmentation network HRNet

Portrait segmentation is a very common application in the field of image segmentation. PaddleSeg has released PPSeg portrait segmentation models trained on large-scale portrait data to cover a variety of scenarios on server, mobile, and web. This tutorial provides a complete guide from training to deployment, together with hands-on examples of video-stream portrait segmentation and background replacement. The newly released ultra-lightweight portrait segmentation model supports real-time segmentation in web and mobile scenarios.

Recently, Baidu's video conferencing product launched a virtual background feature that supports background switching and background blurring during web video calls. The portrait segmentation behind it uses our ultra-lightweight PPSeg-Lite model. You are welcome to try it in the lower right corner of the Baidu homepage!

If you find this case helpful, please star the repository so it is easy to find again:
https://github.com/PaddlePaddle/awesome-DeepLearning

1. Solution design

In this tutorial, we take images or videos uploaded by the user as input, run portrait segmentation with either a model trained on the Supervisely Person dataset or the ready-made inference model, and return the segmentation results to the user in real time.

2. Environment setup and preparation

  1. Install PaddlePaddle

Version requirements

  • PaddlePaddle >= 2.0.2

  • Python >= 3.7

Because image segmentation models are computationally expensive, we recommend using PaddleSeg with the GPU version of PaddlePaddle, with CUDA 10.0 or above installed. For installation instructions, see the PaddlePaddle official website. A quick installation check is shown after the steps below.

  2. Install the PaddleSeg package
!pip install paddleseg
  3. Download the PaddleSeg repository
# If you are running in a local environment, use git to download PaddleSeg code
#!git clone https://github.com/PaddlePaddle/PaddleSeg.git

# For quick experience, we have downloaded the PaddleSeg code here. After decompression, you can directly execute the subsequent code
!unzip PaddleSeg.zip
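
Before moving on, you can optionally verify the installation. The following minimal sketch (assuming PaddlePaddle 2.x) prints the version, checks whether the GPU build is available, and runs PaddlePaddle's built-in self-check:

# Minimal installation check (assumes PaddlePaddle 2.x).
import paddle

print(paddle.__version__)              # should be >= 2.0.2
print(paddle.is_compiled_with_cuda())  # True if the GPU build is installed
paddle.utils.run_check()               # PaddlePaddle's built-in sanity check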

In the following case, all commands are executed in the PaddleSeg/contrib/HumanSeg directory.

%cd PaddleSeg/contrib/HumanSeg

3. Data processing

This tutorial experiments on the Supervisely Person portrait segmentation dataset published by supervise.ly. The dataset contains 5,711 images and 6,884 portrait annotations, all finely labeled.


(image source [1])

In this tutorial, we randomly select a small subset of the Supervisely Person dataset and convert it into a format that PaddleSeg can load directly. We also provide video_test.mp4, a portrait video shot with a phone's front camera. Download everything by running the following code:

Reference

[1] Releasing "Supervisely Person" dataset for teaching machines to segment humans. Supervise.ly.

!python data/download_data.py
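
After the download completes, the data follows the file-list format that PaddleSeg loads: each line of a train/val list holds an image path and its label path separated by a space. The sketch below builds such a list; the directory names are illustrative assumptions, so check the actual layout created by download_data.py:

# Minimal sketch: build a PaddleSeg-style file list.
# NOTE: the directory names below are assumptions for illustration;
# check the actual folders created by download_data.py.
import os

image_dir = "data/mini_supervisely/Images"       # hypothetical path
label_dir = "data/mini_supervisely/Annotations"  # hypothetical path

with open("train_list.txt", "w") as f:
    for name in sorted(os.listdir(image_dir)):
        stem = os.path.splitext(name)[0]
        # each line: "<image path> <label path>", space separated
        f.write(f"{image_dir}/{name} {label_dir}/{stem}.png\n")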

4. Model construction

Currently, PaddleSeg provides the following models for portrait segmentation tasks: generic human segmentation models and a bust (Portrait) segmentation model.

  • Generic human segmentation model

PPSeg releases three portrait models trained on large-scale portrait data to cover a variety of scenarios on server, mobile, and web.

| Model name | Model description | Checkpoint | Inference Model |
| --- | --- | --- | --- |
| PPSeg-Server | High-precision model for portrait scenes with complex backgrounds on server GPUs. Model structure: Deeplabv3+/ResNet50; input size (512, 512) | ppseg_server_ckpt | ppseg_server_inference |
| PPSeg-Mobile | Lightweight model for front-camera scenes on mobile or server CPUs. Model structure: HRNet_w18_small_v1; input size (192, 192) | ppseg_mobile_ckpt | ppseg_mobile_inference |
| PPSeg-Lite | Ultra-lightweight model for real-time segmentation on web or mobile, such as phone selfies and web video conferencing. Model structure: self-developed by Baidu; input size (192, 192) | ppseg_lite_ckpt | ppseg_lite_inference |

NOTE:

  • Checkpoint is the model weights, used for fine-tuning.

  • The inference model is for prediction deployment; it contains model.pdmodel (the computation graph), model.pdiparams (the model parameters), and deploy.yaml (the deployment configuration).

  • The inference model is suitable for CPU and GPU prediction deployment on the server side, and can be deployed to edge devices such as mobile phones via Paddle Lite. For details, see the Paddle Lite documentation.

Model performance

| Model name | Input Size | FLOPs | Parameters | Inference time | Model size |
| --- | --- | --- | --- | --- | --- |
| PPSeg-Server | 512x512 | 114G | 26.8M | 37.96ms | 103MB |
| PPSeg-Mobile | 192x192 | 584M | 1.54M | 13.17ms | 5.9MB |
| PPSeg-Lite | 192x192 | 121M | 137K | 10.51ms | 543KB |

Test environment: a single Nvidia Tesla V100 GPU.

  • Bust segmentation model

For bust (Portrait) segmentation scenes, PPSeg has released a bust segmentation model, which is already used in Baidu video conferencing.

| Model name | Model description | Checkpoint | Inference Model |
| --- | --- | --- | --- |
| PPSeg-Lite | Ultra-lightweight model for real-time segmentation on web or mobile, such as phone selfies and web video conferencing. Model structure: self-developed by Baidu; recommended input size (398, 224) | ppseg_lite_portrait_ckpt | ppseg_lite_portrait_inference |

Model performance

| Model name | Input Size | FLOPs | Parameters | Inference time | Model size |
| --- | --- | --- | --- | --- | --- |
| PPSeg-Lite | 398x224 | 266M | 137K | 23.49ms | 543KB |
| PPSeg-Lite | 288x162 | 138M | 137K | 15.62ms | 543KB |

Test environment: the model graph was optimized with the Paddle.js converter and deployed in the browser; the GPU is an AMD Radeon Pro 5300M 4 GB.

Execute the following script to quickly download all checkpoints for use as pre-trained models.

!python pretrained_model/download_pretrained_model.py

5. Model training

In this case, we fine-tune the model pre-trained on the large-scale data above using the extracted subset of the Supervisely Person dataset. So that you can quickly experience the portrait segmentation model, this tutorial uses the relatively lightweight HRNet_w18_small_v1 model. The training command is as follows:

!export CUDA_VISIBLE_DEVICES=0 # Make one GPU card visible
# On Windows, execute the following command instead
# set CUDA_VISIBLE_DEVICES=0
!python train.py \
--config configs/fcn_hrnetw18_small_v1_humanseg_192x192_mini_supervisely.yml \
--save_dir saved_model/fcn_hrnetw18_small_v1_humanseg_192x192_mini_supervisely \
--save_interval 100 --do_eval --use_vdl

NOTE:

If you want to change the training configuration, modify the parameters in the configs/fcn_hrnetw18_small_v1_humanseg_192x192_mini_supervisely.yml configuration file.

More command line help can be viewed by running the following command:

!python train.py --help
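
Since --use_vdl is enabled in the training command above, metrics are logged for VisualDL. Assuming VisualDL is installed (pip install visualdl if not), you can monitor training in a browser with:

!visualdl --logdir saved_model/fcn_hrnetw18_small_v1_humanseg_192x192_mini_supervisely --port 8040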

6. Model evaluation

Here we evaluate the trained model on the validation set.

!python val.py \
--config configs/fcn_hrnetw18_small_v1_humanseg_192x192_mini_supervisely.yml \
--model_path saved_model/fcn_hrnetw18_small_v1_humanseg_192x192_mini_supervisely/best_model/model.pdparams

7. Model prediction

Here we run model prediction with the following command; the results are saved in the ./output/result/ folder by default.

!python predict.py \
--config configs/fcn_hrnetw18_small_v1_humanseg_192x192_mini_supervisely.yml \
--model_path saved_model/fcn_hrnetw18_small_v1_humanseg_192x192_mini_supervisely/best_model/model.pdparams \
--image_path data/human_image.jpg
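
For a quick visual check, the predicted mask can be blended with the original image. Below is a minimal sketch using OpenCV; note that the mask file name is an assumption, so check the actual name written under ./output/result/:

# Minimal sketch: overlay the predicted mask on the input image.
# NOTE: the mask path below is an assumption; check the actual
# file name produced under ./output/result/.
import cv2

image = cv2.imread("data/human_image.jpg")
mask = cv2.imread("output/result/human_image.png")
mask = cv2.resize(mask, (image.shape[1], image.shape[0]))

overlay = cv2.addWeighted(image, 0.6, mask, 0.4, 0)  # 60/40 alpha blend
cv2.imwrite("overlay.jpg", overlay)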

8. Model export

Here you can export the trained model as a static graph model, and also export the PPSeg-Lite model, to facilitate subsequent deployment.

  • Export the trained model as a static graph model

Make sure you are in the PaddleSeg/contrib/HumanSeg directory and execute the following script:

!export CUDA_VISIBLE_DEVICES=0 # Make one GPU card visible
# On Windows, execute the following command instead
# set CUDA_VISIBLE_DEVICES=0
!python ../../export.py \
--config configs/fcn_hrnetw18_small_v1_humanseg_192x192_mini_supervisely.yml \
--model_path saved_model/fcn_hrnetw18_small_v1_humanseg_192x192_mini_supervisely/best_model/model.pdparams \
--save_dir export_model/fcn_hrnetw18_small_v1_humanseg_192x192_mini_supervisely_with_softmax \
--without_argmax --with_softmax
  • Export the PPSeg-Lite model
!python ../../export.py \
--config ../../configs/ppseg_lite/ppseg_lite_export_398x224.yml \
--save_dir export_model/ppseg_lite_portrait_398x224_with_softmax \
--model_path pretrained_model/ppseg_lite_portrait_398x224/model.pdparams \
--without_argmax --with_softmax

Export script parameters

| Parameter name | Purpose | Required | Default value |
| --- | --- | --- | --- |
| config | Configuration file | yes | - |
| save_dir | Root path for saving the model and VisualDL log files | no | output |
| model_path | Path of the pre-trained model parameters | no | The value specified in the configuration file |
| with_softmax | Add a softmax operator at the end of the network. Since PaddleSeg networks return logits by default, set this to True if you want the deployed model to output probability values | no | False |
| without_argmax | Do not add an argmax operator at the end of the network. Since PaddleSeg networks return logits by default, an argmax operator is added at the end by default so that the deployed model returns prediction labels directly | no | False |
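
To make the effect of these two flags concrete, here is a small numpy illustration (the values are made up) of logits versus softmax probabilities versus argmax labels:

# Illustration of the export flags: the network produces logits;
# --with_softmax appends softmax (probabilities), while by default
# argmax is appended (class labels). The values are made up.
import numpy as np

logits = np.array([[2.0, 0.5]])  # per-pixel scores for 2 classes
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax
label = logits.argmax(axis=1)    # argmax -> predicted class id

print(probs)  # ~[[0.82 0.18]] -> soft foreground probability
print(label)  # [0] -> hard segmentation label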

Result files

output
  ├── deploy.yaml            # Deployment configuration file
  ├── model.pdiparams        # Static graph model parameters
  ├── model.pdiparams.info   # Additional parameter information; usually can be ignored
  └── model.pdmodel          # Static graph model file
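
Below is a minimal sketch of running the exported static graph model with the Paddle Inference Python API. The preprocessing is deliberately simplified (the real pipeline is described by deploy.yaml), and the 192x192 input size follows the config used above:

# Minimal sketch: run the exported static graph model with Paddle Inference.
# Preprocessing is simplified; the real pipeline follows deploy.yaml.
import cv2
import numpy as np
from paddle.inference import Config, create_predictor

model_dir = "export_model/fcn_hrnetw18_small_v1_humanseg_192x192_mini_supervisely_with_softmax"
config = Config(model_dir + "/model.pdmodel", model_dir + "/model.pdiparams")
predictor = create_predictor(config)

img = cv2.imread("data/human_image.jpg")
img = cv2.resize(img, (192, 192)).astype("float32") / 255.0
img = img.transpose(2, 0, 1)[np.newaxis, ...]  # HWC -> NCHW

input_handle = predictor.get_input_handle(predictor.get_input_names()[0])
input_handle.copy_from_cpu(img)
predictor.run()
output_handle = predictor.get_output_handle(predictor.get_output_names()[0])
result = output_handle.copy_to_cpu()  # per-class probabilities (softmax export)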

9. Model deployment

  • Web deployment

See the web deployment tutorial.

  • Mobile deployment

See the mobile deployment tutorial.

10. Quick experience

Here we provide trained inference models so you can quickly try out the portrait segmentation features.

  • Download the inference models

Execute the following script to quickly download all inference models:

!python export_model/download_export_model.py
  • Video stream portrait segmentation

Video-stream portrait segmentation is improved by combining the model's segmentation results with the DIS (Dense Inverse Search) optical flow algorithm, which keeps the masks temporally consistent across frames.
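
For reference, OpenCV ships a DIS optical flow implementation. The sketch below computes the flow between two consecutive frames of the test video; how the flow is then used to warp and blend the previous mask is handled inside bg_replace.py, so this only illustrates the DIS step itself:

# Minimal sketch: DIS optical flow between two consecutive frames.
# bg_replace.py uses such a flow field to propagate the previous mask;
# this snippet only illustrates the DIS computation itself.
import cv2

cap = cv2.VideoCapture("data/video_test.mp4")
_, prev_frame = cap.read()
_, curr_frame = cap.read()
cap.release()

prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
curr_gray = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)

dis = cv2.DISOpticalFlow_create(cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
flow = dis.calc(prev_gray, curr_gray, None)  # HxWx2 displacement field
print(flow.shape)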

# Real-time segmentation through the computer camera
!python bg_replace.py \
--config export_model/ppseg_lite_portrait_398x224_with_softmax/deploy.yaml

# Segmentation of portrait video
!python bg_replace.py \
--config export_model/deeplabv3p_resnet50_os8_humanseg_512x512_100k_with_softmax/deploy.yaml \
--video_path data/video_test.mp4

The video segmentation results are as follows:

  • Video stream background replacement

Replace the background with the selected background, which can be an image or a video.
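
Under the hood, background replacement is alpha compositing: out = alpha * frame + (1 - alpha) * background, where alpha is the soft foreground mask predicted by the model. A minimal numpy sketch (the placeholder mask stands in for the model output):

# Minimal sketch of alpha compositing for background replacement.
# `alpha` stands in for the soft foreground mask from the model.
import cv2
import numpy as np

frame = cv2.imread("data/human_image.jpg").astype("float32")
bg = cv2.imread("data/background.jpg").astype("float32")
bg = cv2.resize(bg, (frame.shape[1], frame.shape[0]))

alpha = np.zeros(frame.shape[:2], dtype="float32")[:, :, np.newaxis]  # placeholder

composed = alpha * frame + (1.0 - alpha) * bg
cv2.imwrite("composed.jpg", composed.astype("uint8"))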

# Real-time background replacement through the computer camera; pass a background video via --background_video_path
!python bg_replace.py \
--config export_model/ppseg_lite_portrait_398x224_with_softmax/deploy.yaml \
--input_shape 224 398 \
--bg_img_path data/background.jpg

# Background replacement for a portrait video; a background video can also be passed in via --background_video_path
!python bg_replace.py \
--config export_model/deeplabv3p_resnet50_os8_humanseg_512x512_100k_with_softmax/deploy.yaml \
--bg_img_path data/background.jpg \
--video_path data/video_test.mp4

# Background replacement for a single image
!python bg_replace.py \
--config export_model/ppseg_lite_portrait_398x224_with_softmax/deploy.yaml \
--input_shape 224 398 \
--img_path data/human_image.jpg

The background replacement results are as follows:

NOTE:

Video segmentation takes a few minutes. Please wait patiently.

The Portrait model is designed for landscape (wide) shots; results on vertical (portrait-orientation) footage will be slightly worse.

Resources

For more resources, please refer to:

Data source

The dataset used in this case comes from: https://supervise.ly/
