AR face recognition Three.js + tensorflow.js

Summary

In this phase, Three.js and TensorFlow.js are used to build an AR face recognition project, with Vue as the main front-end framework. The main feature is face replacement: once the camera detects a face, it is overlaid with a prepared face model, and as our face moves and changes, the model follows it. On top of the model we can draw lines and logos, draw freely, undo the previous step, and delete what we drew.
Let's look at the demo video first:

AR face recognition

Introduction to tensorflow.js

I won't introduce the three.js library here; I believe you are already familiar with it. Let's focus on the tensorflow.js library.
TensorFlow.js is a library that grew out of deeplearn.js and lets you build deep learning models directly in the browser. With it you can create CNNs (convolutional neural networks), RNNs (recurrent neural networks) and so on in the browser, and train these models using the device's own GPU, so a server GPU is often not required. Here we mainly use TensorFlow.js for face recognition.
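As a quick taste of the API (this snippet is not part of the project and uses the umbrella @tensorflow/tfjs package for brevity), here is how you would pick the WebGL backend and run a tiny tensor operation in the browser:

import * as tf from '@tensorflow/tfjs';

async function demo() {
  // Run operations on the GPU via WebGL
  await tf.setBackend('webgl');
  await tf.ready();

  // Element-wise square of a small vector
  const a = tf.tensor1d([1, 2, 3]);
  a.square().print(); // prints [1, 4, 9]
}

demo();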

Project construction

The front-end framework is Vue 2.6 and the three.js version is r124. On the TensorFlow side we use @tensorflow-models/face-landmarks-detection, @tensorflow/tfjs-backend-webgl, @tensorflow/tfjs-converter and @tensorflow/tfjs-core.
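In code these packages are typically imported like this (a sketch based on the library's published usage; @tensorflow/tfjs-converter only needs to be installed, since it is used internally when the models are loaded):

import * as THREE from 'three';
import * as tf from '@tensorflow/tfjs-core';
// Registers the WebGL backend as a side effect
import '@tensorflow/tfjs-backend-webgl';
import * as faceLandmarksDetection from '@tensorflow-models/face-landmarks-detection';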

Initialize camera

The front-end interface is very simple: a small icon is enough, plus a little styling.

  1. Call the init function in the Vue file
let useCamera = true, video;
export async function init() {
  if (useCamera) {
    // Open the camera and start playback
    await setupCamera();
    video.play();
    // Lock the element size to the actual video resolution
    video.width = video.videoWidth;
    video.height = video.videoHeight;
    // Initialize the facemesh / tensorflow module (see below)
    await facemesh.init(video);
  }
}
  1. Set camera parameters
async function setupCamera() {
  video = document.createElement('video');
  // Simple mobile check (assumption: the original project detects this elsewhere)
  const mobile = /Mobi|Android|iPhone/i.test(navigator.userAgent);

  // navigator.mediaDevices.getUserMedia prompts the user for permission to use media input
  // and resolves to a MediaStream containing tracks of the requested media types.
  const stream = await navigator.mediaDevices.getUserMedia({
    // No audio needed
    audio: false,
    video: {
      // On mobile devices, prefer the front camera
      facingMode: 'user',
      // On mobile let the browser pick the size; on desktop use 640 x 640
      width: mobile ? undefined : 640,
      height: mobile ? undefined : 640
    }
  });

  video.srcObject = stream;
  return new Promise((resolve) => {
    // Resolve once the video metadata (dimensions etc.) has loaded
    video.onloadedmetadata = () => {
      resolve(video);
    };
  });
}
  1. Initialize tensorflow
    tf.setBackend sets the backend (cpu, webgl, wasm) responsible for creating tensors and performing operations on them. Here we use webgl, and the maximum number of faces to detect is 1.
const state = {
  backend: 'webgl',
  maxFaces: 1
};
let model;

async function init(video) {
  // tf.setBackend sets the backend (cpu, webgl, wasm, etc.) responsible for creating tensors and performing operations on them
  await tf.setBackend(state.backend);

  // Load the MediaPipe Facemesh package; maxFaces caps how many faces are detected
  model = await faceLandmarksDetection.load(
    faceLandmarksDetection.SupportedPackages.mediapipeFacemesh,
    {
      maxFaces: state.maxFaces,
      // Optional: custom face detector (BlazeFace) model url or tf.io.IOHandler object
      detectorModelUrl: 'model/blazeface/model.json',
      // Optional: custom iris model url or tf.io.IOHandler object
      irisModelUrl: 'model/iris/model.json',
      // Optional: custom facemesh model url or tf.io.IOHandler object
      modelUrl: 'model/facemesh/model.json'
    }
  );
}

There is a big pitfall here: if detectorModelUrl, irisModelUrl and modelUrl are not set, the library automatically fetches the face detection models from https://tfhub.dev/, but that site is not accessible from mainland China, so we need to change the paths. We can download the models we need from https://hub.tensorflow.google.cn/ instead, unpack the files, and point the options above at the local paths.
Specific instructions can be found on this link: https://www.npmjs.com/package/@tensorflow-models/face-landmarks-detection
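If the project was created with Vue CLI, one simple option is to drop the downloaded files into the public/ folder so they are served as static assets at the paths used in the code above. The layout below is only a sketch; the exact weight shard file names depend on the packages you download:

public/
  model/
    blazeface/
      model.json
      group1-shard1of1.bin
    facemesh/
      model.json
      group1-shard1of1.bin
    iris/
      model.json
      group1-shard1of1.bin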

At this point, when we run the page we should see that the camera has been turned on.
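For completeness, here is a minimal sketch of the Vue 2 side of this wiring; the component name and module path are assumptions, only the init() call comes from the code above:

// ArFace.vue, <script> section
import { init } from './facemesh-demo';

export default {
  name: 'ArFace',
  async mounted() {
    // Ask for camera permission and load the TensorFlow models as soon as the component mounts
    await init();
  }
};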

Initialize three.js

Here is the routine step of initializing three.js. We create a three.js scene the same size as the video.

threeEl = await three.init(video);
async function init(video) {
  width = video ? video.width : 640;
  const height = video ? video.height : 640;
  const ratio = width / height;
  const fov = 50;
  const near = 1;
  const far = 5000;
  camera = new THREE.PerspectiveCamera(fov, ratio, near, far);
  // Center the camera on the video plane, which will be placed at z = 0
  camera.position.z = height;
  camera.position.x = -width / 2;
  camera.position.y = -height / 2;

  renderer = new THREE.WebGLRenderer({ antialias: true, alpha: true });
  renderer.setPixelRatio(window.devicePixelRatio);
  renderer.setSize(width, height);
  scene = new THREE.Scene();
  if (video) {
    // Create a video map
    addVideoSprite(video);
  }

  // Initialize ray
  raycaster = new THREE.Raycaster();

  // Add face model
  await addObjMesh();

  return renderer.domElement;
}
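init returns renderer.domElement, so all the caller has to do to make the composited scene visible is attach that canvas to the page. A minimal sketch (where exactly it is mounted depends on the Vue template; document.body is just a placeholder):

// threeEl is the canvas returned by three.init(video) above
document.body.appendChild(threeEl);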

The key step is to attach the video as a texture to a sprite, so the video can be displayed inside three.js. This is the core idea of the whole project: grab the video stream through a video tag, then use that stream as a texture on the corresponding mesh. From that point on we can do whatever we want: add models, add meshes, anything three.js can do.
Add mesh

function addVideoSprite(video) {
  // Wrap the video element in a texture; it will be refreshed every frame
  videoTexture = new THREE.Texture(video);
  videoTexture.minFilter = THREE.LinearFilter;
  // Sprites expect a SpriteMaterial
  const videoSprite = new THREE.Sprite(
    new THREE.SpriteMaterial({
      map: videoTexture,
      depthWrite: false
    })
  );
  const width = video.width;
  const height = video.height;
  videoSprite.center.set(0.5, 0.5);
  // Scale the sprite to the full video size and place it at z = 0, in front of the camera
  videoSprite.scale.set(width, height, 1);
  videoSprite.position.copy(camera.position);
  videoSprite.position.z = 0;
  scene.add(videoSprite);
}

Add face model

// OBJLoader ships with the three.js examples:
// import { OBJLoader } from 'three/examples/jsm/loaders/OBJLoader';
function addObjMesh() {
  const loader = new OBJLoader();
  return new Promise((resolve, reject) => {
    loader.load('model/facemesh.obj', (obj) => {
      obj.traverse((child) => {
        if (child instanceof THREE.Mesh) {
          // In debug mode the face mesh is visible (normal material); otherwise it is fully transparent
          const mat = new THREE.MeshNormalMaterial({
            side: THREE.DoubleSide
          });
          if (!params.debug) {
            mat.transparent = true;
            mat.opacity = 0;
          }
          baseMesh = new THREE.Mesh(child.geometry, mat);
          scene.add(baseMesh);
          resolve();
        }
      });
    // Surface loading errors instead of leaving the promise hanging
    }, undefined, reject);
  });
}

With the scene and the video sprite in place, we only need to render. When rendering in the window.requestAnimationFrame(update) loop, we must refresh the video texture by setting its needsUpdate flag to true, otherwise the video frame will not update.

function update(facemesh) {
  // Pull the latest frame from the video element into the texture
  if (videoTexture) {
    videoTexture.needsUpdate = true;
  }

  // facemesh will receive the face landmark predictions that drive the face model (covered in the next article)
  renderer.render(scene, camera);
}
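The article never shows the loop that drives update, so here is a minimal sketch of what it could look like, assuming the model and video variables from the earlier snippets are in scope; model.estimateFaces is the real API of this version of @tensorflow-models/face-landmarks-detection:

async function renderLoop() {
  // Detect face landmarks on the current video frame
  const predictions = await model.estimateFaces({ input: video });

  // Refresh the video texture and redraw the scene, passing the predictions along
  update(predictions);

  requestAnimationFrame(renderLoop);
}

renderLoop();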

At this point, we should be able to see real-time video in our interface

OK, let's do an experiment and add a rotating box in the middle of the video. The code is simple:

var geometry = new THREE.BoxBufferGeometry( 200, 200, 200 );
var material = new THREE.MeshBasicMaterial( { color: 0x00ff00, depthTest: false } );
cube = new THREE.Mesh( geometry, material );
cube.position.copy(camera.position);
cube.position.z = 0;
scene.add( cube );
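To make the box actually rotate, spin it a little on every frame inside the update function shown earlier; a minimal sketch:

if (cube) {
  cube.rotation.x += 0.01;
  cube.rotation.y += 0.01;
}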

With that, a box appears in the middle of our video. Now think of the AR applications on our phones, where models, animations or picture introductions appear at a fixed position: we can already build something like that.


That's all for this article. In the next article, we'll introduce how to generate a face model~

If anything is unclear, you can add me on QQ: 2781128388

Tags: Javascript TensorFlow AR three
