3 Progressive Growing of GANs (PGGAN)

1. Import libraries

import numpy as np
import tensorflow as tf
from tensorflow import keras as K

2. Progressive growing and smooth fade-in of higher-resolution layers (the first innovation)

In technical terms, training proceeds from a few low-resolution convolutional layers to many high-resolution ones. The early layers are trained first, and higher-resolution layers are introduced afterwards. However, even adding one layer at a time is a significant shock to training. What PGGAN does is fade these layers in smoothly, giving the system time to adapt to the higher resolution.

Instead of jumping straight to the new resolution, the new high-resolution layer is blended in smoothly through a parameter alpha (between 0 and 1).

As shown in the figure, once the 16 * 16 resolution has been trained for enough iterations, another transposed convolution is added to the generator (G) and another convolution to the discriminator (D), bringing the resolution to 32 * 32. During the transition there are two paths: the output of the previous layer, upscaled by nearest-neighbor interpolation and multiplied by (1 - alpha), and the output of the new transposed convolution, multiplied by alpha. The two are added together.

def upscale_layer(layer, upscale_factor):
    """
    Upscales a layer (tensor) by the factor (int), where
    the tensor is [batch, height, width, channels].
    """
    height, width = layer.get_shape()[1:3]
    size = (upscale_factor * int(height), upscale_factor * int(width))
    # Nearest-neighbor resize (TF 2.x API)
    upscaled_layer = tf.image.resize(layer, size, method="nearest")
    return upscaled_layer

def smoothly_merge_last_layer(list_of_layers, alpha):
    """
    Smoothly merges in the newest layer based on the threshold alpha.
    This function assumes that all layers are already in RGB.
    This is the function for the generator.
    :list_of_layers : items should be tensors ordered by resolution
    :alpha : float in (0, 1)
    """
    # Hint!
    # If you are using pure TensorFlow rather than Keras, remember the scope
    last_fully_trained_layer = list_of_layers[-2]
    # Upscale the last fully trained layer to the new resolution
    last_layer_upscaled = upscale_layer(last_fully_trained_layer, 2)

    # The new, larger layer is not fully trained yet
    larger_native_layer = list_of_layers[-1]

    # Make sure the two tensors can be merged
    assert larger_native_layer.get_shape() == last_layer_upscaled.get_shape()

    # Broadcasting applies the scalar alpha to every element
    new_layer = (1 - alpha) * last_layer_upscaled + larger_native_layer * alpha

    return new_layer
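
To see how the fade-in behaves over the course of training, here is a small NumPy-only sketch. The linear ramp in `fade_in_alpha` and the names `fade_steps`, `upscale_nearest` and `blend` are assumptions made for illustration; the text above only says that alpha moves the output smoothly from the old layer to the new one.

```python
import numpy as np

def fade_in_alpha(step, fade_steps):
    """Hypothetical linear schedule: alpha ramps from 0 to 1 over fade_steps."""
    return float(min(max(step / fade_steps, 0.0), 1.0))

def upscale_nearest(images, factor=2):
    """Nearest-neighbor upscaling of a [batch, height, width, channel] array."""
    return images.repeat(factor, axis=1).repeat(factor, axis=2)

def blend(old_small, new_large, alpha):
    """(1 - alpha) * upscaled old output + alpha * new high-resolution output."""
    return (1.0 - alpha) * upscale_nearest(old_small) + alpha * new_large

# Stand-ins for the RGB outputs of the 16x16 path and the 32x32 path
old_rgb = np.zeros((1, 16, 16, 3), dtype=np.float32)
new_rgb = np.ones((1, 32, 32, 3), dtype=np.float32)

for step in (0, 500, 1000):
    alpha = fade_in_alpha(step, fade_steps=1000)
    merged = blend(old_rgb, new_rgb, alpha)
    print(step, alpha, merged.shape, float(merged.mean()))
```

At alpha = 0 the merged output is just the upscaled 16 * 16 image, and at alpha = 1 it comes entirely from the new 32 * 32 path.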

3. Minibatch standard deviation (the second innovation)

Before looking at minibatch standard deviation, we need to understand mode collapse: some modes or classes are poorly represented in the generated samples. For example, a GAN trained on the MNIST dataset might never generate the digit 8.

Essentially, one extra scalar statistic is computed for the discriminator: the standard deviation of all pixels across a minibatch, whether that minibatch comes from the generator or from real data. If generated minibatches show much less variation than real ones, the discriminator can pick up on it, which pushes the generator towards more varied samples.

def minibatch_std_layer(layer, group_size=4):
    """
    Calculates the minibatch standard deviation for a layer.
    Should be used within a prespecified tf scope / Keras model.
    Assumes the layer is of float32 data type; otherwise validation/casting is needed.
    Assumes channels-last input: [batch (N), height (H), width (W), channel (C)].
    Note: there is a more efficient way to do this in Keras, but for
    clarity and consistency with the major implementations (for understanding)
    it is done more explicitly here. Try the efficient version as an exercise.
    """
    # Hint!
    # If you are using pure TensorFlow rather than Keras, always remember the scope
    # The minibatch size must be divisible by (or <=) group_size
    group_size = tf.minimum(group_size, tf.shape(layer)[0])

    # Static shape information, used as shorthand below
    _, height, width, channels = layer.shape

    # Reshape so that we can operate at the group level:
    # [group (G), minibatch (M), height (H), width (W), channel (C)]
    # Note that other implementations use the Theano-specific
    # channels-first ordering instead
    minibatch = tf.reshape(layer, (group_size, -1, height, width, channels))

    # Center over the group [G, M, H, W, C]
    minibatch -= tf.reduce_mean(minibatch, axis=0, keepdims=True)
    # Variance over the group [M, H, W, C]
    minibatch = tf.reduce_mean(tf.square(minibatch), axis=0)
    # Standard deviation over the group [M, H, W, C]
    minibatch = tf.sqrt(minibatch + 1e-8)
    # Average over pixels and feature maps [M, 1, 1, 1]
    minibatch = tf.reduce_mean(minibatch, axis=[1, 2, 3], keepdims=True)
    # Replicate over the group and over all pixels [N, H, W, 1]
    minibatch = tf.tile(minibatch, [group_size, height, width, 1])
    # Append as a new feature map
    return tf.concat([layer, minibatch], axis=-1)
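
As a sanity check on what this statistic measures, the following NumPy sketch computes a simplified version of it: a single scalar for the whole batch, without the grouping used in the function above. The helpers `batch_std_statistic` and `append_std_channel` are hypothetical names introduced only for this illustration, and the input is assumed to be channels-last.

```python
import numpy as np

def batch_std_statistic(batch, eps=1e-8):
    """Mean over all positions of the standard deviation across the batch.

    batch: array of shape [batch, height, width, channel].
    """
    std_per_position = np.sqrt(batch.var(axis=0) + eps)  # [H, W, C]
    return float(std_per_position.mean())

def append_std_channel(batch, eps=1e-8):
    """Append the scalar statistic as one extra constant feature map."""
    stat = batch_std_statistic(batch, eps)
    extra = np.full(batch.shape[:3] + (1,), stat, dtype=batch.dtype)
    return np.concatenate([batch, extra], axis=-1)

rng = np.random.default_rng(0)
# A batch with four different samples
varied = rng.normal(size=(4, 8, 8, 3)).astype(np.float32)
# A "collapsed" batch: the same sample repeated four times
collapsed = np.repeat(varied[:1], 4, axis=0)

print(batch_std_statistic(varied))
print(batch_std_statistic(collapsed))
print(append_std_channel(varied).shape)
```

The collapsed batch yields a statistic close to zero while the varied batch does not, which is exactly the signal the discriminator can use to detect a generator that keeps producing near-identical samples.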

4. Equalized learning rate (the third innovation)

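A piece of dark magic that nobody can fully explain. In practice, the weights are initialized from a standard normal distribution and then scaled at run time by a per-layer constant taken from He's initializer.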

def equalize_learning_rate(shape, gain, fan_in=None):
    """
    This adjusts the weights of every layer by the constant from He's
    initializer, so that the variance is adjusted for the dynamic range
    of the different features.
    shape  : shape of the tensor (layer), i.e. the dimensions of each layer.
             For example, [4,4,48,3]. In this case,
             [kernel_size, kernel_size, number_of_filters, feature_maps].
             This will depend slightly on your implementation.
    gain   : typically sqrt(2)
    fan_in : number of incoming connections, as in Xavier's / He's initialization
    """
    # Defaults to the product of all shape dimensions except the feature-map
    # dimension; this gives the number of incoming connections per neuron
    if fan_in is None:
        fan_in = np.prod(shape[:-1])
    # He's initialization constant
    std = gain / np.sqrt(fan_in)
    # Create a constant that stays outside of the trainable weights
    wscale = tf.constant(std, name='wscale', dtype=np.float32)
    # Create the weights from N(0, 1) and scale them through broadcasting
    adjusted_weights = tf.Variable(
        tf.random.normal(shape), name='layer') * wscale
    return adjusted_weights
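
A rough NumPy sketch of the same idea may make the scaling easier to follow: the stored weights keep unit variance, and the constant gain / sqrt(fan_in) is applied whenever the weights are used. The helper `he_scale` and the variable names are hypothetical and exist only for this illustration.

```python
import numpy as np

def he_scale(shape, gain=np.sqrt(2.0), fan_in=None):
    """Per-layer constant gain / sqrt(fan_in); fan_in defaults to prod(shape[:-1])."""
    if fan_in is None:
        fan_in = np.prod(shape[:-1])
    return gain / np.sqrt(fan_in)

rng = np.random.default_rng(0)
shape = (4, 4, 48, 3)  # the example shape from the comments above

# Stored parameters are drawn from N(0, 1), regardless of the layer size
stored = rng.standard_normal(shape)

# The scale is applied when the weights are used, not baked into the parameters
effective = stored * he_scale(shape)

print(he_scale(shape))
print(stored.std(), effective.std())
```

One common reading of why this helps: optimizers such as Adam normalize each update by a running estimate of the gradient magnitude, so keeping every stored parameter on the same scale is intended to give all layers a comparable effective learning speed.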

5. Pixelwise feature normalization in the generator

The motivation is training stability: one of the early signs of diverging training is an explosion in feature magnitudes.

Every pixel of a feature map is treated as a vector across the channels, and that vector is then normalized.

Pixelwise feature normalization is used only in the generator.
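
Written out, the normalization takes the following form. Here $a_{x,y}$ is the original feature vector at pixel $(x, y)$, $b_{x,y}$ is the normalized one, $N$ is the number of feature maps and $\epsilon = 10^{-8}$; the symbol names are a notation choice for this note rather than something defined in the text above.

```latex
b_{x,y} = \frac{a_{x,y}}{\sqrt{\dfrac{1}{N} \sum_{j=0}^{N-1} \left( a_{x,y}^{j} \right)^{2} + \epsilon}}
```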

def pixelwise_feat_norm(inputs, **kwargs):
    """
    Pixelwise feature normalization as proposed by
    Krizhevsky et al. 2012. Returns the normalized input.
    :inputs : Keras / TF layer
    """
    normalization_constant = tf.sqrt(tf.reduce_mean(
        inputs**2, axis=-1, keepdims=True) + 1.0e-8)
    return inputs / normalization_constant
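
The effect can be checked quickly with NumPy. The sketch below, using a hypothetical helper `pixelwise_feat_norm_np` and random channels-last data, shows that after normalization the mean square over the channels is about 1 at every pixel, no matter how large the incoming features were.

```python
import numpy as np

def pixelwise_feat_norm_np(inputs, eps=1e-8):
    """Divide each pixel's feature vector by its root mean square over channels."""
    rms = np.sqrt(np.mean(inputs ** 2, axis=-1, keepdims=True) + eps)
    return inputs / rms

rng = np.random.default_rng(0)
# Feature maps with large magnitudes, [batch, height, width, channel]
features = rng.normal(scale=50.0, size=(2, 4, 4, 16))

normalized = pixelwise_feat_norm_np(features)
mean_square = np.mean(normalized ** 2, axis=-1)  # one value per pixel

print(mean_square.min(), mean_square.max())
```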

6. Run PGGAN

import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_hub as hub

with tf.Graph().as_default():
    # Load the PGGAN module downloaded from TFHub beforehand
    module = hub.Module("progan-128_1")
    # Dimensionality of the latent space sampled at run time
    latent_dim = 512

    # Change the seed to get different faces
    latent_vector = tf.random.normal([1, latent_dim], seed=1337)

    # Use the module to generate an image from the latent space
    interpolated_images = module(latent_vector)

    # Run the TensorFlow session to get an image of shape (1, 128, 128, 3)
    with tf.compat.v1.Session() as session:
        session.run(tf.compat.v1.global_variables_initializer())
        image_out = session.run(interpolated_images)

# Display the generated face
plt.imshow(image_out.reshape(128, 128, 3))
plt.show()


The pretrained model is hosted on TFHub at https://tfhub.dev/google/progan-128/1; download it from there and load it with hub.Module as in the code above.

Running the code produces a 128 * 128 face image.

Tags: AI TensorFlow

Posted on Fri, 19 Nov 2021 05:49:35 -0500 by jeffkee