Recently, I came across a popular project on GitHub, so I cloned it and gave it a try.
Project name: animegan2-pytorch
Project address: https://github.com/bryandlee/animegan2-pytorch
Project introduction
AnimeGAN is a study from Wuhan University and Hubei University of Technology that combines neural style transfer with a generative adversarial network (GAN).
AnimeGAN was proposed last year with a TensorFlow implementation; the second-generation version of the project has since been developed and also supports the PyTorch framework.
Original paper: AnimeGAN: A Novel Lightweight GAN for Photo Animation
https://github.com/TachibanaYoshino/AnimeGAN/blob/master/doc/Chen2020_Chapter_AnimeGAN.pdf
Network structure (the figure is taken from the original paper; I won't go into the principles in detail here):
The project can turn real-life portraits into anime-style images, as shown in the animated image below:
Project start
File Preview
Because downloads from overseas servers are slow, I have backed up all of the project files and the other files that need to be downloaded. See the end of this post for how to get them.
samples: a few sample images
weights: the weight files
colab_demo/demo.ipynb: two usage examples
convert_weights.py: converts the TensorFlow models in the weights folder into PyTorch model files
hubconf.py: lets torch.hub fetch the preloaded model
In use, only two files are needed:
1. demo.ipynb
2. the weights folder (which contains four models corresponding to different image sizes, degrees of stylization, robustness, and so on; see the sketch below for how to pick one of these models by name)
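For reference, the upstream project lets you request a specific pretrained variant by name through torch.hub. A minimal sketch, assuming the variant names match the .pt files shipped in the weights folder (e.g. "face_paint_512_v2"):

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Assumption: pretrained variant names correspond to the weight files in the
# weights folder, e.g. "face_paint_512_v2", "celeba_distill", "paprika".
model = torch.hub.load(
    "bryandlee/animegan2-pytorch:main",
    "generator",
    pretrained="face_paint_512_v2",
    device=device,
).eval()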
Environment installation
First, open demo.ipynb in Jupyter Lab, then set up the environment according to its import statements.
PyTorch installation
The project relies on the PyTorch framework. For installing it, see my earlier post:
Super simple PyTorch (GPU version) installation tutorial (personally tested and working)
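If you only need a quick start and do not care about a specific CUDA build, a plain pip install is usually enough (a generic example; pick the exact command for your CUDA version from the official PyTorch site):

pip install torch torchvision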
cmake installation
Installing the dlib library directly will fail with the error:
CMake must be installed to build the following extensions: dlib
so cmake has to be installed first.
Installation method:
pip install cmake
face_recognition installation
After installing cmake, don't rush to install dlib. Install face_recognition first; it bundles a version of dlib.
Installation method:
pip install face_recognition
dlib inspection and installation
After face_recognition is installed, a dlib comes with it. At this point you can try running the second cell of the demo; if dlib raises no errors, you can skip the steps below.
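A quick sanity check from the command line to confirm that the bundled dlib imports cleanly and to see its version:

python -c "import dlib; print(dlib.__version__)"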
If an error is reported, the dlib version probably does not match your Python version, so it needs to be uninstalled and reinstalled.
Uninstall method:
pip uninstall dlib
If you install dlib with pip, it will most likely fail with an error, so you need to download the dlib source package and install it locally.
The correspondence between dlib versions and Python versions is as follows:
Python 3.5: dlib 19.4.0
Python 3.6: dlib 19.6.0
Python 3.7: dlib 19.14.0
Python 3.8: dlib 19.19.0
My installed Python version is 3.8, so I need to download dlib 19.19.0.
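If you are not sure which Python version your environment is running, check it first:

python --version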
Download link: https://pypi.org/project/dlib/19.19.0/#files
Download the file in .tar.gz format:
After downloading, extract it, switch to the extracted folder on the command line, and run:
python setup.py install
After running, the corresponding dlib version is installed.
scipy installation
scipy is a scientific computing toolkit that contains many numerical routines which will be used later. Installation is simple:
pip install scipy
requests installation (optional)
Anyone who has played with web crawlers knows that requests makes it easy to send requests to a server. You only need this library if you want to fetch images from the network; it is not required for loading local images.
Installation method:
pip install requests
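For reference, this is the pattern the demo's network-loading branch uses: requests streams the image bytes and PIL decodes them (the URL below is just a placeholder):

import requests
from PIL import Image

# Placeholder URL; replace it with the image you actually want to fetch.
url = "https://example.com/some_face.jpg"
img = Image.open(requests.get(url, stream=True).raw).convert("RGB")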
Facial landmark model download
The dlib installed earlier can detect faces, but it does not ship with the 68-point landmark model itself, so running the code raises:
Unable to open shape_predictor_68_face_landmarks.dat
Solution:
Place the shape_predictor_68_face_landmarks.dat file in the same folder as the project.
This file has been added to the file I backed up at the end of the document.
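You can verify the file is in the right place by loading it directly; dlib raises a RuntimeError with the same "Unable to open" message if the .dat file is missing:

import dlib

# Fails with "Unable to open shape_predictor_68_face_landmarks.dat" if the file is not found.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
print("landmark model loaded")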
Weight file placement
Run the first cell of code. It uses torch.hub.load to download the model files, and if your connection is slow the download will probably be interrupted and raise an error. The downloaded files are actually the same as the cloned project files, except that they are stored in a cache directory on drive C.
The cache location is shown in the figure:
Two folders are created under that directory:
It does not matter if the download fails; just copy the corresponding files from the project into them.
Contents of the first folder (i.e. the project files)
Contents of the second folder (where the weight files go)
Once the files are in place, running the first cell of code again will not download anything online; it will simply report that the files already exist.
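If you are unsure where this cache lives on your machine, torch can tell you directly (on Windows it is usually under your user profile on drive C):

import torch

# Print the directory torch.hub.load uses to cache downloaded repos and weights.
print(torch.hub.get_dir())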
Interpretation of code function
The first cell of code mainly sets up the environment and loads the models.
#@title Load Face2Paint model
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.hub.load("bryandlee/animegan2-pytorch:main", "generator", device=device).eval()
face2paint = torch.hub.load("bryandlee/animegan2-pytorch:main", "face2paint", device=device, side_by_side=True)
The second cell is the core code: it uses dlib to crop out the face region, detects all facial landmarks, annotates them, and generates the output images.
#@title Face Detector & FFHQ-style Alignment

# https://github.com/woctezuma/stylegan2-projecting-images

import os
import dlib
import collections
from typing import Union, List
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt


def get_dlib_face_detector(predictor_path: str = "shape_predictor_68_face_landmarks.dat"):

    if not os.path.isfile(predictor_path):
        model_file = "shape_predictor_68_face_landmarks.dat.bz2"
        os.system(f"wget http://dlib.net/files/{model_file}")
        os.system(f"bzip2 -dk {model_file}")

    detector = dlib.get_frontal_face_detector()
    shape_predictor = dlib.shape_predictor(predictor_path)

    def detect_face_landmarks(img: Union[Image.Image, np.ndarray]):
        if isinstance(img, Image.Image):
            img = np.array(img)
        faces = []
        dets = detector(img)
        for d in dets:
            shape = shape_predictor(img, d)
            faces.append(np.array([[v.x, v.y] for v in shape.parts()]))
        return faces

    return detect_face_landmarks


def display_facial_landmarks(
    img: Image,
    landmarks: List[np.ndarray],
    fig_size=[15, 15]
):
    plot_style = dict(
        marker='o',
        markersize=4,
        linestyle='-',
        lw=2
    )
    pred_type = collections.namedtuple('prediction_type', ['slice', 'color'])
    pred_types = {
        'face': pred_type(slice(0, 17), (0.682, 0.780, 0.909, 0.5)),
        'eyebrow1': pred_type(slice(17, 22), (1.0, 0.498, 0.055, 0.4)),
        'eyebrow2': pred_type(slice(22, 27), (1.0, 0.498, 0.055, 0.4)),
        'nose': pred_type(slice(27, 31), (0.345, 0.239, 0.443, 0.4)),
        'nostril': pred_type(slice(31, 36), (0.345, 0.239, 0.443, 0.4)),
        'eye1': pred_type(slice(36, 42), (0.596, 0.875, 0.541, 0.3)),
        'eye2': pred_type(slice(42, 48), (0.596, 0.875, 0.541, 0.3)),
        'lips': pred_type(slice(48, 60), (0.596, 0.875, 0.541, 0.3)),
        'teeth': pred_type(slice(60, 68), (0.596, 0.875, 0.541, 0.4))
    }

    fig = plt.figure(figsize=fig_size)
    ax = fig.add_subplot(1, 1, 1)
    ax.imshow(img)
    ax.axis('off')

    for face in landmarks:
        for pred_type in pred_types.values():
            ax.plot(
                face[pred_type.slice, 0],
                face[pred_type.slice, 1],
                color=pred_type.color, **plot_style
            )
    plt.show()


# https://github.com/NVlabs/ffhq-dataset/blob/master/download_ffhq.py

import PIL.Image
import PIL.ImageFile
import numpy as np
import scipy.ndimage


def align_and_crop_face(
    img: Image.Image,
    landmarks: np.ndarray,
    expand: float = 1.0,
    output_size: int = 1024,
    transform_size: int = 4096,
    enable_padding: bool = True,
):
    # Parse landmarks.
    # pylint: disable=unused-variable
    lm = landmarks
    lm_chin          = lm[0  : 17]  # left-right
    lm_eyebrow_left  = lm[17 : 22]  # left-right
    lm_eyebrow_right = lm[22 : 27]  # left-right
    lm_nose          = lm[27 : 31]  # top-down
    lm_nostrils      = lm[31 : 36]  # top-down
    lm_eye_left      = lm[36 : 42]  # left-clockwise
    lm_eye_right     = lm[42 : 48]  # left-clockwise
    lm_mouth_outer   = lm[48 : 60]  # left-clockwise
    lm_mouth_inner   = lm[60 : 68]  # left-clockwise

    # Calculate auxiliary vectors.
    eye_left     = np.mean(lm_eye_left, axis=0)
    eye_right    = np.mean(lm_eye_right, axis=0)
    eye_avg      = (eye_left + eye_right) * 0.5
    eye_to_eye   = eye_right - eye_left
    mouth_left   = lm_mouth_outer[0]
    mouth_right  = lm_mouth_outer[6]
    mouth_avg    = (mouth_left + mouth_right) * 0.5
    eye_to_mouth = mouth_avg - eye_avg

    # Choose oriented crop rectangle.
    x = eye_to_eye - np.flipud(eye_to_mouth) * [-1, 1]
    x /= np.hypot(*x)
    x *= max(np.hypot(*eye_to_eye) * 2.0, np.hypot(*eye_to_mouth) * 1.8)
    x *= expand
    y = np.flipud(x) * [-1, 1]
    c = eye_avg + eye_to_mouth * 0.1
    quad = np.stack([c - x - y, c - x + y, c + x + y, c + x - y])
    qsize = np.hypot(*x) * 2

    # Shrink.
    shrink = int(np.floor(qsize / output_size * 0.5))
    if shrink > 1:
        rsize = (int(np.rint(float(img.size[0]) / shrink)), int(np.rint(float(img.size[1]) / shrink)))
        img = img.resize(rsize, PIL.Image.ANTIALIAS)
        quad /= shrink
        qsize /= shrink

    # Crop.
    border = max(int(np.rint(qsize * 0.1)), 3)
    crop = (int(np.floor(min(quad[:,0]))), int(np.floor(min(quad[:,1]))), int(np.ceil(max(quad[:,0]))), int(np.ceil(max(quad[:,1]))))
    crop = (max(crop[0] - border, 0), max(crop[1] - border, 0), min(crop[2] + border, img.size[0]), min(crop[3] + border, img.size[1]))
    if crop[2] - crop[0] < img.size[0] or crop[3] - crop[1] < img.size[1]:
        img = img.crop(crop)
        quad -= crop[0:2]

    # Pad.
    pad = (int(np.floor(min(quad[:,0]))), int(np.floor(min(quad[:,1]))), int(np.ceil(max(quad[:,0]))), int(np.ceil(max(quad[:,1]))))
    pad = (max(-pad[0] + border, 0), max(-pad[1] + border, 0), max(pad[2] - img.size[0] + border, 0), max(pad[3] - img.size[1] + border, 0))
    if enable_padding and max(pad) > border - 4:
        pad = np.maximum(pad, int(np.rint(qsize * 0.3)))
        img = np.pad(np.float32(img), ((pad[1], pad[3]), (pad[0], pad[2]), (0, 0)), 'reflect')
        h, w, _ = img.shape
        y, x, _ = np.ogrid[:h, :w, :1]
        mask = np.maximum(1.0 - np.minimum(np.float32(x) / pad[0], np.float32(w-1-x) / pad[2]), 1.0 - np.minimum(np.float32(y) / pad[1], np.float32(h-1-y) / pad[3]))
        blur = qsize * 0.02
        img += (scipy.ndimage.gaussian_filter(img, [blur, blur, 0]) - img) * np.clip(mask * 3.0 + 1.0, 0.0, 1.0)
        img += (np.median(img, axis=(0,1)) - img) * np.clip(mask, 0.0, 1.0)
        img = PIL.Image.fromarray(np.uint8(np.clip(np.rint(img), 0, 255)), 'RGB')
        quad += pad[:2]

    # Transform.
    img = img.transform((transform_size, transform_size), PIL.Image.QUAD, (quad + 0.5).flatten(), PIL.Image.BILINEAR)
    if output_size < transform_size:
        img = img.resize((output_size, output_size), PIL.Image.ANTIALIAS)

    return img
The third cell sets how the image is loaded (the first, commented-out line reads a picture from the network; the second loads a local file), then calls the functions defined in the second cell and produces the output.
import requests

# img = Image.open(requests.get("https://upload.wikimedia.org/wikipedia/commons/8/85/Elon_Musk_Royal_Society_%28crop1%29.jpg", stream=True).raw).convert("RGB")
img = Image.open(r"C:\Users\hp\Desktop\bidao.png").convert("RGB")

face_detector = get_dlib_face_detector()
landmarks = face_detector(img)
display_facial_landmarks(img, landmarks, fig_size=[5, 5])

for landmark in landmarks:
    face = align_and_crop_face(img, landmark, expand=1.3)
    display(face2paint(model=model, img=face, size=512))
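display() is an IPython/Jupyter helper, so it only works inside a notebook. If you run the script elsewhere, you can save the results to disk instead; a small sketch, assuming face2paint returns a PIL image as it does in this demo:

for i, landmark in enumerate(landmarks):
    face = align_and_crop_face(img, landmark, expand=1.3)
    out = face2paint(model=model, img=face, size=512)
    out.save(f"animegan_result_{i}.png")  # one side-by-side comparison per detected face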
View effect
Well, after a long and tedious environment configuration, you can finally see the effect by loading an image.
Let me take Bi Dao's picture as an example:
Original image:
After face extraction and annotation:
Animation comparison:
The result is quite amazing!
However, because the original image has a low resolution, the animation effect is not fully complete. For example, the ears still keep a three-dimensional look, so the result is still some distance from a true 2D anime style.
Resource acquisition
I have packaged the files for this project on my WeChat official account; reply "anime" to get them.