Face animation -- AnimeGAN quick start

Recently I came across a trending project on GitHub, so I cloned it and gave it a try.
Project name: animegan2-pytorch
Project address: https://github.com/bryandlee/animegan2-pytorch

Project introduction

AnimeGAN is a study from Wuhan University and Hubei University of Technology that combines neural style transfer with a generative adversarial network (GAN).

AnimeGAN was first released last year on the TensorFlow framework. The project's second-generation version has since been developed and now also supports PyTorch.

Original paper: AnimeGAN: A Novel Lightweight GAN for Photo Animation
https://github.com/TachibanaYoshino/AnimeGAN/blob/master/doc/Chen2020_Chapter_AnimeGAN.pdf

Network structure (the figure is excerpted from the original paper; I won't go into the principle in detail for now):

The project can turn real-life portrait photos into anime-style images, as shown in the animated image below:

Project start

File Preview

Because downloading from overseas servers is slow, I have backed up all of the project's files along with the other files that need to be downloaded. See the end of this post for how to get them.

samples: some sample pictures
weights: weight files
colab_demo/demo.ipynb: two usage examples
convert_weights.py: converts the TensorFlow models in the weights folder into PyTorch model files
hubconf.py: entry points for loading the pretrained models through torch.hub

Only two of these are needed in practice:
1. demo.ipynb
2. the weights folder (containing four models that differ in input size, degree of stylization, robustness, etc.)

Environment installation

First, open demo.ipynb in Jupyter Lab, then install the packages required by its import statements.

PyTorch installation

The project needs the PyTorch framework. For installing it, see my earlier blog post:
A super simple PyTorch (GPU version) installation tutorial (personally tested)
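A quick way to check that the installation worked (a minimal sketch; the printed version and the CUDA result will depend on your machine):

import torch

print(torch.__version__)           # installed PyTorch version
print(torch.cuda.is_available())   # True if the GPU build can see your graphics card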

cmake installation

If you try to install the dlib library directly, it fails with the following error:

CMake must be installed to build the following extensions: dlib

So CMake needs to be installed first.
Installation method:

pip install cmake

face_recognition installation

After installing cmake, don't rush to install dlib. Install face_recognition first; it bundles a version of dlib.
Installation method:

pip install face_recognition

dlib check and installation

After installing face_recognition, a version of dlib comes with it. At this point you can try running the second cell of the demo; if dlib raises no errors, you can skip the steps below.
If an error is reported, the dlib version may not match your Python version, in which case it needs to be uninstalled and reinstalled.
Uninstall command:

pip uninstall dlib

Installing dlib with pip will most likely fail as well, so you need to download the dlib package and install it locally.

The correspondence between dlib versions and Python versions is as follows:

Python 3.5: dlib 19.4.0
Python 3.6: dlib 19.6.0
Python 3.7: dlib 19.14.0
Python 3.8: dlib 19.19.0

My installed Python version is 3.8, so I need to download dlib 19.19.0.
Download link: https://pypi.org/project/dlib/19.19.0/#files

Download the file in .tar.gz format.
After downloading, unzip it, switch to the unzipped folder on the command line, and run:

python setup.py install

Once it finishes running, the matching dlib version is installed.
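Before moving on, a minimal check like the one below should confirm that dlib imports and works; if either line raises an error, the version mismatch described above is the likely cause:

import dlib

print(dlib.__version__)                       # e.g. 19.19.0 for Python 3.8
detector = dlib.get_frontal_face_detector()   # constructing the detector verifies the build is usable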

scipy installation

scipy is a scientific computing toolkit with many numerical routines; the face-alignment code later in this post uses it. Installation is simple:

pip install scipy

requests installation (optional)

Anyone who has written web crawlers knows that requests makes it easy to send HTTP requests to a server. Install this library only if you need to fetch images from the network; for loading local images it is not required.
Installation method:

pip install requests
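If you do load pictures from the network, the pattern is the same one used by the commented-out line in the demo's third cell; a minimal sketch (the URL is the example photo from the demo):

import requests
from PIL import Image

url = "https://upload.wikimedia.org/wikipedia/commons/8/85/Elon_Musk_Royal_Society_%28crop1%29.jpg"
img = Image.open(requests.get(url, stream=True).raw).convert("RGB")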

Facial landmark model download

The dlib installed earlier can detect faces, but it does not ship with the facial landmark model file, so the following error is reported at runtime:

Unable to open shape_predictor_68_face_landmarks.dat

Solution:
Place the shape_predictor_68_face_landmarks.dat file in the same folder as the project.
This file is included in the backup files mentioned at the end of this post.
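A quick check that the file is in the right place (a minimal sketch, assuming it is run from the project folder):

import os
import dlib

predictor_path = "shape_predictor_68_face_landmarks.dat"
print(os.path.isfile(predictor_path))                    # should print True
shape_predictor = dlib.shape_predictor(predictor_path)   # raises an error if the file is missing or corrupt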

Weight file placement

Run the first code cell. It uses torch.hub.load to download the model files. If your network is slow, the download is likely to be interrupted and raise an error. The downloaded files are actually the same as the cloned project files, but they are cached under torch.hub's cache directory on drive C.
The cache location is shown in the figure:

It creates two folders under that directory:

If the download fails, it doesn't matter; just copy the corresponding files from the project into place.
Contents of the first folder (i.e., the project files):

Contents of the second folder (which stores the weight files):

Once everything is in place, running the first code cell again will not download anything online; it will simply report that the files already exist.
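If you are unsure where that cache directory is, or would rather skip torch.hub's download entirely, the following is a minimal sketch: it prints the hub cache location and loads a weight file directly, assuming the cloned repo's model.py defines the Generator class and that weights/face_paint_512_v2.pt is one of the four weight files (adjust the name to whichever model you want):

import torch
from model import Generator   # model.py from the cloned animegan2-pytorch project (assumed layout)

print(torch.hub.get_dir())    # directory where torch.hub caches downloaded repos and weights

device = "cuda" if torch.cuda.is_available() else "cpu"
model = Generator().to(device).eval()
# assumed path: one of the four .pt files in the project's weights folder
state_dict = torch.load("weights/face_paint_512_v2.pt", map_location=device)
model.load_state_dict(state_dict)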

Code walkthrough

The first code cell loads the models and sets up the environment.

#@title Load Face2Paint model

import torch 
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.hub.load("bryandlee/animegan2-pytorch:main", "generator", device=device).eval()
face2paint = torch.hub.load("bryandlee/animegan2-pytorch:main", "face2paint", device=device, side_by_side=True)

The second cell is the core code: it uses dlib to detect faces and their 68 facial landmarks, plots the annotated landmarks, and crops and aligns the face region.

#@title Face Detector & FFHQ-style Alignment

# https://github.com/woctezuma/stylegan2-projecting-images

import os
import dlib
import collections
from typing import Union, List
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt


def get_dlib_face_detector(predictor_path: str = "shape_predictor_68_face_landmarks.dat"):

    if not os.path.isfile(predictor_path):
        model_file = "shape_predictor_68_face_landmarks.dat.bz2"
        os.system(f"wget http://dlib.net/files/{model_file}")
        os.system(f"bzip2 -dk {model_file}")

    detector = dlib.get_frontal_face_detector()
    shape_predictor = dlib.shape_predictor(predictor_path)

    def detect_face_landmarks(img: Union[Image.Image, np.ndarray]):
        if isinstance(img, Image.Image):
            img = np.array(img)
        faces = []
        dets = detector(img)
        for d in dets:
            shape = shape_predictor(img, d)
            faces.append(np.array([[v.x, v.y] for v in shape.parts()]))
        return faces
    
    return detect_face_landmarks


def display_facial_landmarks(
    img: Image, 
    landmarks: List[np.ndarray],
    fig_size=[15, 15]
):
    plot_style = dict(
        marker='o',
        markersize=4,
        linestyle='-',
        lw=2
    )
    pred_type = collections.namedtuple('prediction_type', ['slice', 'color'])
    pred_types = {
        'face': pred_type(slice(0, 17), (0.682, 0.780, 0.909, 0.5)),
        'eyebrow1': pred_type(slice(17, 22), (1.0, 0.498, 0.055, 0.4)),
        'eyebrow2': pred_type(slice(22, 27), (1.0, 0.498, 0.055, 0.4)),
        'nose': pred_type(slice(27, 31), (0.345, 0.239, 0.443, 0.4)),
        'nostril': pred_type(slice(31, 36), (0.345, 0.239, 0.443, 0.4)),
        'eye1': pred_type(slice(36, 42), (0.596, 0.875, 0.541, 0.3)),
        'eye2': pred_type(slice(42, 48), (0.596, 0.875, 0.541, 0.3)),
        'lips': pred_type(slice(48, 60), (0.596, 0.875, 0.541, 0.3)),
        'teeth': pred_type(slice(60, 68), (0.596, 0.875, 0.541, 0.4))
    }

    fig = plt.figure(figsize=fig_size)
    ax = fig.add_subplot(1, 1, 1)
    ax.imshow(img)
    ax.axis('off')

    for face in landmarks:
        for pred_type in pred_types.values():
            ax.plot(
                face[pred_type.slice, 0],
                face[pred_type.slice, 1],
                color=pred_type.color, **plot_style
            )
    plt.show()



# https://github.com/NVlabs/ffhq-dataset/blob/master/download_ffhq.py

import PIL.Image
import PIL.ImageFile
import numpy as np
import scipy.ndimage


def align_and_crop_face(
    img: Image.Image,
    landmarks: np.ndarray,
    expand: float = 1.0,
    output_size: int = 1024, 
    transform_size: int = 4096,
    enable_padding: bool = True,
):
    # Parse landmarks.
    # pylint: disable=unused-variable
    lm = landmarks
    lm_chin          = lm[0  : 17]  # left-right
    lm_eyebrow_left  = lm[17 : 22]  # left-right
    lm_eyebrow_right = lm[22 : 27]  # left-right
    lm_nose          = lm[27 : 31]  # top-down
    lm_nostrils      = lm[31 : 36]  # top-down
    lm_eye_left      = lm[36 : 42]  # left-clockwise
    lm_eye_right     = lm[42 : 48]  # left-clockwise
    lm_mouth_outer   = lm[48 : 60]  # left-clockwise
    lm_mouth_inner   = lm[60 : 68]  # left-clockwise

    # Calculate auxiliary vectors.
    eye_left     = np.mean(lm_eye_left, axis=0)
    eye_right    = np.mean(lm_eye_right, axis=0)
    eye_avg      = (eye_left + eye_right) * 0.5
    eye_to_eye   = eye_right - eye_left
    mouth_left   = lm_mouth_outer[0]
    mouth_right  = lm_mouth_outer[6]
    mouth_avg    = (mouth_left + mouth_right) * 0.5
    eye_to_mouth = mouth_avg - eye_avg

    # Choose oriented crop rectangle.
    x = eye_to_eye - np.flipud(eye_to_mouth) * [-1, 1]
    x /= np.hypot(*x)
    x *= max(np.hypot(*eye_to_eye) * 2.0, np.hypot(*eye_to_mouth) * 1.8)
    x *= expand
    y = np.flipud(x) * [-1, 1]
    c = eye_avg + eye_to_mouth * 0.1
    quad = np.stack([c - x - y, c - x + y, c + x + y, c + x - y])
    qsize = np.hypot(*x) * 2

    # Shrink.
    shrink = int(np.floor(qsize / output_size * 0.5))
    if shrink > 1:
        rsize = (int(np.rint(float(img.size[0]) / shrink)), int(np.rint(float(img.size[1]) / shrink)))
        img = img.resize(rsize, PIL.Image.ANTIALIAS)
        quad /= shrink
        qsize /= shrink

    # Crop.
    border = max(int(np.rint(qsize * 0.1)), 3)
    crop = (int(np.floor(min(quad[:,0]))), int(np.floor(min(quad[:,1]))), int(np.ceil(max(quad[:,0]))), int(np.ceil(max(quad[:,1]))))
    crop = (max(crop[0] - border, 0), max(crop[1] - border, 0), min(crop[2] + border, img.size[0]), min(crop[3] + border, img.size[1]))
    if crop[2] - crop[0] < img.size[0] or crop[3] - crop[1] < img.size[1]:
        img = img.crop(crop)
        quad -= crop[0:2]

    # Pad.
    pad = (int(np.floor(min(quad[:,0]))), int(np.floor(min(quad[:,1]))), int(np.ceil(max(quad[:,0]))), int(np.ceil(max(quad[:,1]))))
    pad = (max(-pad[0] + border, 0), max(-pad[1] + border, 0), max(pad[2] - img.size[0] + border, 0), max(pad[3] - img.size[1] + border, 0))
    if enable_padding and max(pad) > border - 4:
        pad = np.maximum(pad, int(np.rint(qsize * 0.3)))
        img = np.pad(np.float32(img), ((pad[1], pad[3]), (pad[0], pad[2]), (0, 0)), 'reflect')
        h, w, _ = img.shape
        y, x, _ = np.ogrid[:h, :w, :1]
        mask = np.maximum(1.0 - np.minimum(np.float32(x) / pad[0], np.float32(w-1-x) / pad[2]), 1.0 - np.minimum(np.float32(y) / pad[1], np.float32(h-1-y) / pad[3]))
        blur = qsize * 0.02
        img += (scipy.ndimage.gaussian_filter(img, [blur, blur, 0]) - img) * np.clip(mask * 3.0 + 1.0, 0.0, 1.0)
        img += (np.median(img, axis=(0,1)) - img) * np.clip(mask, 0.0, 1.0)
        img = PIL.Image.fromarray(np.uint8(np.clip(np.rint(img), 0, 255)), 'RGB')
        quad += pad[:2]

    # Transform.
    img = img.transform((transform_size, transform_size), PIL.Image.QUAD, (quad + 0.5).flatten(), PIL.Image.BILINEAR)
    if output_size < transform_size:
        img = img.resize((output_size, output_size), PIL.Image.ANTIALIAS)

    return img

The third cell sets how the image is loaded (the first line reads a picture from the network, the second loads a local file), then calls the functions defined in the second cell and outputs the result.

import requests

# img = Image.open(requests.get("https://upload.wikimedia.org/wikipedia/commons/8/85/Elon_Musk_Royal_Society_%28crop1%29.jpg", stream=True).raw).convert("RGB")
img = Image.open(r"C:\Users\hp\Desktop\bidao.png").convert("RGB")

face_detector = get_dlib_face_detector()
landmarks = face_detector(img)

display_facial_landmarks(img, landmarks, fig_size=[5, 5])

for landmark in landmarks:
    face = align_and_crop_face(img, landmark, expand=1.3)
    display(face2paint(model=model, img=face, size=512))
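display() only shows the result inside the notebook. Since face2paint appears to return a PIL image (the side_by_side option puts the original and the anime version next to each other), a small variation of the loop, sketched below with an arbitrary output filename, saves the results to disk as well:

for i, landmark in enumerate(landmarks):
    face = align_and_crop_face(img, landmark, expand=1.3)
    out = face2paint(model=model, img=face, size=512)   # PIL image: original and anime result side by side
    out.save(f"face2paint_result_{i}.png")              # hypothetical filename; save instead of only displaying
    display(out)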

Viewing the results

After the long and tedious environment configuration, you can finally import a picture and see the effect.
Let me take a picture of Bi Dao as an example.
Original picture:

After face detection and landmark annotation:

Anime conversion comparison:

The result is quite amazing!
However, because the original image has a low resolution, the anime conversion doesn't look fully complete. For example, the ear still keeps its three-dimensional look, which is still some way from a truly two-dimensional style.

Resource acquisition

I have packed all of this project's files on my WeChat official account; reply "anime" there to get them.

Tags: Python AI GAN
