Tensorflow Faster RCNN Source Parsing (TFFRCNN) network.py Common Network Layer Processing (including Decorators, etc.)

This blog is on github CharlesShang/TFFRCNN Edition Source Parsing Series Code Notes

--------------------------------------------------------------------------------------------------------------------------------------------

Author: Wu Jiang

------ Originally published on Blog Garden (cnblogs) ------

Unless stated otherwise, this post assumes the VGGnet_test network used in the test phase. The default padding mode in the code is DEFAULT_PADDING = 'SAME' (as opposed to 'VALID').

1. The two decorators (illustrated with the execution of conv1_1)

@layer
def conv(self, input, k_h, k_w, c_o, s_h, s_w, name, biased=True,relu=True, padding=DEFAULT_PADDING, trainable=True):

Note that @layer is a decorator, equivalent to conv = layer(conv).

@include_original
def layer(op):

This line is equivalent to layer = include_original(layer).

def include_original(dec):
    """ Meta decorator, which make the original function callable (via f._original() )"""
    def meta_decorator(f):
        decorated = dec(f)
        decorated._original = f
        return decorated
    return meta_decorator

The way include_original wraps the layer decorator is not fully clear to me. From its docstring, include_original makes the original (undecorated) function still callable via f._original(). Two further questions: what does the self passed into layer_decorated refer to (an instance of which class), and why does the layer decorator return self at the end?

Therefore, when self.conv(3, 3, 64, 1, 1, name='conv1_1', trainable=False) is called in VGGnet_test.py, it is equivalent to layer(conv)(3, 3, 64, 1, 1, name='conv1_1', trainable=False).

layer(conv) returns the layer_decorated function (with op bound to conv), so the call is equivalent to executing layer_decorated(3, 3, 64, 1, 1, name='conv1_1', trainable=False).

Here args is (3, 3, 64, 1, 1) and kwargs is {'name': 'conv1_1', 'trainable': False}.

@include_original
def layer(op):
    def layer_decorated(self, *args, **kwargs):
        # Automatically set a name if not provided.
        # setdefault: if kwargs already contains a 'name' key (e.g. 'conv1_1'), return its value;
        # otherwise insert the default self.get_unique_name(op.__name__) and return that.
        name = kwargs.setdefault('name', self.get_unique_name(op.__name__))
        # Figure out the layer inputs.
        # In VGGnet_test.py, self.feed('data') is called before conv1_1, so self.inputs is not empty.
        # Here self.inputs holds a single element: self.layers['data'] = self.data =
        # tf.placeholder(tf.float32, shape=[None, None, None, 3]), i.e. the scaled image data.
        if len(self.inputs)==0:              # no input
            raise RuntimeError('No input variables found for layer %s.'%name)
        elif len(self.inputs)==1:            # single input
            layer_input = self.inputs[0]
        else:
            layer_input = list(self.inputs)  # multiple inputs
        # Perform the operation and get the output.
        # Run the actual layer operation, e.g. conv.
        layer_output = op(self, layer_input, *args, **kwargs)
        # Add to layer LUT.
        # Record this layer's output in the self.layers dictionary.
        self.layers[name] = layer_output
        # This output is now the input for the next layer.
        # feed() pushes the output into self.inputs, so it becomes the next layer's input.
        self.feed(layer_output)
        # Return self for chained calls.
        return self
    return layer_decorated

Note that self.inputs is a list whose contents are filled by feed() to serve as the next layer's input, and self.layers is a dictionary recording the output of each layer. Also note the case where a layer takes multiple inputs.
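
To see the whole mechanism in isolation, here is a minimal self-contained sketch (ToyNet, double and add_one are invented names, not TFFRCNN code) of how the layer decorator, feed() and the layers dictionary cooperate to allow chained calls:

def layer(op):
    def layer_decorated(self, *args, **kwargs):
        name = kwargs.setdefault('name', op.__name__)      # default name if none given
        layer_input = self.inputs[0] if len(self.inputs) == 1 else list(self.inputs)
        layer_output = op(self, layer_input, *args, **kwargs)
        self.layers[name] = layer_output                    # record this layer's output
        self.feed(layer_output)                             # it becomes the next layer's input
        return self                                         # enable chained calls
    return layer_decorated

class ToyNet(object):
    def __init__(self):
        self.inputs = []
        self.layers = {}

    def feed(self, *args):
        # strings are looked up in self.layers, anything else is used directly
        self.inputs = [self.layers[a] if isinstance(a, str) else a for a in args]
        return self

    @layer
    def double(self, input, name):
        return input * 2

    @layer
    def add_one(self, input, name):
        return input + 1

net = ToyNet()
net.layers['data'] = 10
net.feed('data').double(name='d1').add_one(name='a1')
print(net.layers)      # contains 'data': 10, 'd1': 20, 'a1': 21

This is why VGGnet_test.py can be written as a chain such as self.feed('data').conv(3, 3, 64, 1, 1, name='conv1_1', trainable=False): every decorated layer method records its output and returns self.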

def include_original(dec):
    """ Meta decorator, which make the original function callable (via f._original() )"""
    def meta_decorator(f):
        decorated = dec(f)
        decorated._original = f
        return decorated
    return meta_decorator

Looking again at the include_original decorator: it makes the original function callable via f._original(), presumably so that the undecorated conv (and the other layer methods) can still be invoked directly when needed. This part of the code is not fully understood here.
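
To make this concrete, here is a small self-contained sketch (shout and greet are invented names, not repo code) of what include_original provides: the decorated name behaves as the wrapped version, while the raw function stays reachable through _original:

def include_original(dec):
    def meta_decorator(f):
        decorated = dec(f)
        decorated._original = f
        return decorated
    return meta_decorator

@include_original
def shout(f):                       # a stand-in for the layer decorator
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs).upper()
    return wrapper

@shout
def greet(name):
    return 'hello ' + name

print(greet('net'))                 # HELLO NET   (decorated behaviour)
print(greet._original('net'))       # hello net   (raw function, via _original)

By analogy, after both decorators are applied in network.py, conv behaves as the layer-wrapped version, while conv._original is the raw, undecorated conv.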

2. The Network Base Class

class Network(object):
    def __init__(self, inputs, trainable=True):  # Constructor
        self.inputs = []
        self.layers = dict(inputs)
        self.trainable = trainable  
        self.setup()

self.inputs is a list filled by feed() with the input for the next layer; self.layers is a dictionary recording the output of each layer. Each network processing layer is defined as a method of this class; these are covered in Section 3.

------------------------------------------------- The other seven functions -------------------------------------------------

def setup(self): if the Network class is used without being subclassed (i.e. setup() is not overridden), an exception is raised; it is called from __init__.

    def setup(self):
        raise NotImplementedError('Must be subclassed.')
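
A tiny self-contained illustration of this contract (Base and Child are invented names): the constructor calls setup(), so only subclasses that override it can be used:

class Base(object):
    def __init__(self):
        self.setup()                 # same pattern as Network.__init__

    def setup(self):
        raise NotImplementedError('Must be subclassed.')

class Child(Base):
    def setup(self):
        print('building layers here')

Child()       # prints "building layers here"
# Base()      # would raise NotImplementedError('Must be subclassed.')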

def load(self, data_path, session, ignore_missing=False) loads a pre-trained model, e.g. an ImageNet .npy file.

    def load(self, data_path, session, ignore_missing=False):  # load layer parameters from an ImageNet pre-trained model
        data_dict = np.load(data_path).item()                  # .item() extracts the stored dict from the 0-d object array
        for key in data_dict:
            with tf.variable_scope(key, reuse=True):
                for subkey in data_dict[key]:                  # data_dict[key] is itself a dict with 'weights' and 'biases' entries
                    try:
                        var = tf.get_variable(subkey)
                        session.run(var.assign(data_dict[key][subkey]))
                        print "assign pretrain model "+subkey+ " to "+key
                    except ValueError:
                        print "ignore "+key
                        if not ignore_missing:
                            raise

The .item() call was puzzling at first: np.load() on a .npy file that stores a pickled dict returns a 0-d object array, and ndarray.item() extracts the stored Python dict. tf.variable_scope(key, reuse=True) re-enters the existing variable scope so that tf.get_variable(subkey) returns the already-created variable, and session.run(var.assign(...)) runs an assign op that copies the pretrained values into it.
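
For reference, a minimal sketch of the nested-dict layout that load() iterates over (the file name and shapes below are invented, not the real ImageNet weights; newer numpy versions also require allow_pickle=True in np.load):

import numpy as np

# a fake pretrained file with the same nested-dict layout that load() expects
fake = {'conv1_1': {'weights': np.zeros((3, 3, 3, 64), np.float32),
                    'biases':  np.zeros((64,),         np.float32)}}
np.save('fake_pretrain.npy', fake)

data_dict = np.load('fake_pretrain.npy').item()   # add allow_pickle=True on newer numpy
for key in data_dict:
    for subkey in data_dict[key]:
        print('%s/%s: %s' % (key, subkey, data_dict[key][subkey].shape))
# conv1_1/weights: (3, 3, 3, 64)
# conv1_1/biases: (64,)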

def feed(self, *args) collects the previous layer's output(s) as the input of the next layer; it is called both by the layer decorator and by the network definition files (such as VGGnet_test.py).

    # self.layers is a dict, self.inputs is a list
    def feed(self, *args):
        assert len(args)!=0  # raise an error if feed() is called without arguments
        self.inputs = []
        for layer in args:
            if isinstance(layer, basestring):  # is the argument a str/unicode layer name?
                try:
                    layer = self.layers[layer] # look up the recorded output by name
                    print layer
                except KeyError:
                    print self.layers.keys()
                    raise KeyError('Unknown layer name fed: %s'%layer)
            self.inputs.append(layer)   # store the retrieved output (the upper layer's output) in self.inputs as the next layer's input
        return self

def get_output(self, layer) retrieves the output of the named layer from self.layers (the dictionary of per-layer outputs); it is called by test.py and others.

    def get_output(self, layer):
        try:
            layer = self.layers[layer]  
        except KeyError:
            print self.layers.keys()
            raise KeyError('Unknown layer name fed: %s'%layer)
        return layer

def get_unique_name(self, prefix) automatically generates a unique layer name when none is specified (for example, if two entries in self.layers already start with 'conv', the new name is 'conv_3'). It is called by the layer decorator, but since the network definition files always pass an explicit name, in practice this function is not actually used.

    def get_unique_name(self, prefix):   # count existing layers whose names start with prefix (e.g. 'conv'); produces names like conv_1, conv_2, ...
        id = sum(t.startswith(prefix) for t,_ in self.layers.items())+1
        return '%s_%d'%(prefix, id)
name = kwargs.setdefault('name', self.get_unique_name(op.__name__))   # in the layer decorator; op is conv, max_pool, etc.
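
A quick illustration of the counting logic, using a hypothetical layers dict:

layers = {'conv_1': 'out1', 'conv_2': 'out2', 'pool_1': 'out3'}
prefix = 'conv'
id = sum(t.startswith(prefix) for t, _ in layers.items()) + 1   # two existing 'conv*' layers
print('%s_%d' % (prefix, id))                                   # conv_3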

def make_var(self, name, shape, initializer=None, trainable=True, regularizer=None) creates a variable via tf.get_variable and is called by the layer methods (such as conv()).

    def make_var(self, name, shape, initializer=None, trainable=True, regularizer=None): # create a new variable via tf.get_variable
        return tf.get_variable(name, shape, initializer=initializer, trainable=trainable, regularizer=regularizer)

def validate_padding(self, padding) only allows padding to be 'SAME' or 'VALID' and otherwise raises an assertion error; it is called by the layer methods that take a padding argument (such as conv() and upconv()).

    def validate_padding(self, padding): # padding must be 'SAME' or 'VALID', otherwise raise an assertion error
        assert padding in ('SAME', 'VALID')

3. Network Processing Layers

----------------------------------------------------- Convolution -----------------------------------------------------

def conv(...)

@layer
    # e.g. in VGGnet_test.py: self.conv(3, 3, 64, 1, 1, name='conv1_1', trainable=False)
    # The layer decorator executes layer_output = op(self, layer_input, *args, **kwargs),
    # which amounts to layer_output = conv(self, layer_input, 3, 3, 64, 1, 1, name='conv1_1',
    # trainable=False, biased=True, relu=True, padding=DEFAULT_PADDING)
    def conv(self, input, k_h, k_w, c_o, s_h, s_w, name, biased=True,relu=True, padding=DEFAULT_PADDING, trainable=True):
        """ contribution by miraclebiu, and biased option"""
        self.validate_padding(padding)  # padding must be SAME or VALID
        # input is [batch, in_height, in_width, in_channels]
        c_i = input.get_shape()[-1]  # number of input channels, i.e. number of feature maps
        # [1, s_h, s_w, 1] is the stride; the first and last entries must be 1
        # (no striding over the batch or depth dimensions).
        # The lambda below is called later in this method; i is the input, k the kernel
        # (a 4-D tensor: kernel height, kernel width, input channels, output channels).
        convolve = lambda i, k: tf.nn.conv2d(i, k, [1, s_h, s_w, 1], padding=padding)
        with tf.variable_scope(name) as scope: # variable scope to manage parameter naming
            # init_weights = tf.truncated_normal_initializer(0.0, stddev=0.001)  # alternative: truncated-normal weight init
            init_weights = tf.contrib.layers.variance_scaling_initializer(factor=0.01, mode='FAN_AVG', uniform=False)
            init_biases = tf.constant_initializer(0.0)
            # create the variables through tf.get_variable (via make_var)
            kernel = self.make_var('weights', [k_h, k_w, c_i, c_o], init_weights, trainable, \
                                   regularizer=self.l2_regularizer(cfg.TRAIN.WEIGHT_DECAY))  # 0.0005
            if biased:
                biases = self.make_var('biases', [c_o], init_biases, trainable)
                conv = convolve(input, kernel)   # convolution result
                if relu:
                    bias = tf.nn.bias_add(conv, biases)  # add the bias
                    return tf.nn.relu(bias)  # result after ReLU
                return tf.nn.bias_add(conv, biases)
            else:
                conv = convolve(input, kernel)  # convolution result
                if relu:
                    return tf.nn.relu(conv)
                return conv

cfg.TRAIN.WEIGHT_DECAY = 0.0005

The convolution kernel is a 4-D tensor [k_h, k_w, c_i, c_o]. The TensorFlow convolution op is tf.nn.conv2d(i, k, [1, s_h, s_w, 1], padding=padding).
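
A quick shape check of such a call (toy shapes, TF 1.x graph mode assumed):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 224, 224, 3])     # [batch, in_height, in_width, c_i]
k = tf.get_variable('toy_w', [3, 3, 3, 64])              # [k_h, k_w, c_i, c_o]
y = tf.nn.conv2d(x, k, [1, 1, 1, 1], padding='SAME')
print(y.get_shape())                                     # (?, 224, 224, 64): SAME keeps H and W at stride 1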

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

def upconv(...)

@layer
    def upconv(self, input, shape, c_o, ksize=4, stride = 2, name = 'upconv', biased=False, relu=True, padding=DEFAULT_PADDING,
             trainable=True):
        """ up-conv"""
        self.validate_padding(padding)      # padding must be SAME or VALID

        c_in = input.get_shape()[3].value   # number of input channels
        in_shape = tf.shape(input)          # dynamic input shape
        if shape is None:
            # h = ((in_shape[1] - 1) * stride) + 1
            # w = ((in_shape[2] - 1) * stride) + 1
            h = ((in_shape[1] ) * stride)
            w = ((in_shape[2] ) * stride)
            new_shape = [in_shape[0], h, w, c_o]  # the four output dimensions (see the questions below)
        else:
            new_shape = [in_shape[0], shape[1], shape[2], c_o]
        output_shape = tf.stack(new_shape)  # tf.stack packs the list of scalars into a 1-D shape tensor

        filter_shape = [ksize, ksize, c_o, c_in]  # filter shape; note c_o comes before c_in for conv2d_transpose

        with tf.variable_scope(name) as scope:
            # init_weights = tf.truncated_normal_initializer(0.0, stddev=0.01)
            init_weights = tf.contrib.layers.variance_scaling_initializer(factor=0.01, mode='FAN_AVG', uniform=False)  # weight initializer
            filters = self.make_var('weights', filter_shape, init_weights, trainable, \
                                   regularizer=self.l2_regularizer(cfg.TRAIN.WEIGHT_DECAY))
            deconv = tf.nn.conv2d_transpose(input, filters, output_shape,
                                            strides=[1, stride, stride, 1], padding=DEFAULT_PADDING, name=scope.name)  # transposed-convolution (deconvolution) result
            # coz de-conv losses shape info, use reshape to re-gain shape
            deconv = tf.reshape(deconv, new_shape)

            if biased:
                init_biases = tf.constant_initializer(0.0)
                biases = self.make_var('biases', [c_o], init_biases, trainable)
                if relu:
                    bias = tf.nn.bias_add(deconv, biases)  # add the bias
                    return tf.nn.relu(bias)                # return after ReLU
                return tf.nn.bias_add(deconv, biases)
            else:
                if relu:
                    return tf.nn.relu(deconv)
                return deconv

The deconvolution function in tensorflow is tf.nn.conv2d_transpose(...).

Open questions: how are the four output dimensions determined, and what is the principle behind deconvolution? What does tf.stack do here? Why does the filter shape have four parameters, and what is the relationship between output_shape and new_shape?
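
Some of these can be checked with a small sketch (toy shapes, TF 1.x assumed): output_shape is simply the stacked 1-D tensor form of the new_shape list, the filter layout for conv2d_transpose is [ksize, ksize, c_o, c_in] (output channels before input channels, the reverse of conv2d), and with SAME padding and stride 2 the spatial size doubles:

import tensorflow as tf

x = tf.placeholder(tf.float32, [1, 32, 32, 512])               # [N, h, w, c_in]
f = tf.get_variable('toy_up_w', [4, 4, 256, 512])              # [ksize, ksize, c_o, c_in]
y = tf.nn.conv2d_transpose(x, f, output_shape=[1, 64, 64, 256],
                           strides=[1, 2, 2, 1], padding='SAME')
print(y.get_shape())                                           # (1, 64, 64, 256)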

------------------------------------------------- (Max / Average) Pooling -------------------------------------------------

def max_pool(...)

def avg_pool(...)

    @layer
    def max_pool(self, input, k_h, k_w, s_h, s_w, name, padding=DEFAULT_PADDING):
        self.validate_padding(padding)
        return tf.nn.max_pool(input,
                              ksize=[1, k_h, k_w, 1],
                              strides=[1, s_h, s_w, 1],
                              padding=padding,
                              name=name)

    @layer
    def avg_pool(self, input, k_h, k_w, s_h, s_w, name, padding=DEFAULT_PADDING):
        self.validate_padding(padding)
        return tf.nn.avg_pool(input,
                              ksize=[1, k_h, k_w, 1],
                              strides=[1, s_h, s_w, 1],
                              padding=padding,
                              name=name)

The (maximum/average) pooling functions in tensorflow are tf.nn.max_pool(...) and tf.nn.avg_pool(...).
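
For example, a 2x2, stride-2 max pooling (as used throughout the VGG part of the network) halves the spatial resolution; a quick check with toy shapes (TF 1.x assumed):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 224, 224, 64])
y = tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
print(y.get_shape())    # (?, 112, 112, 64)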

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

def relu(...)

    @layer
    def relu(self, input, name):
        return tf.nn.relu(input, name=name)

-------------------------------------------------roi_pool------------------------------------------------------

def roi_pool(...)

    # e.g. in VGGnet_test.py: self.roi_pool(7, 7, 1.0 / 16, name='pool_5')
    # This layer is preceded by self.feed('conv5_3', 'rois'), so its input is the list built from 'conv5_3' and 'rois'
    @layer
    def roi_pool(self, input, pooled_height, pooled_width, spatial_scale, name):
        # only use the first input
        if isinstance(input[0], tuple):
            input[0] = input[0][0]
        if isinstance(input[1], tuple):
            input[1] = input[1][0]
        print input
        return roi_pool_op.roi_pool(input[0], input[1],
                                    pooled_height,
                                    pooled_width,
                                    spatial_scale,
                                    name=name)[0]

Taking VGGnet_test.py as an example, the input of this layer is the list built from 'conv5_3' and 'rois', and the work is done by roi_pool_op.roi_pool(...) (declared in roi_pooling_layer/roi_pooling_op.py and actually implemented by the compiled roi_pooling.so). The trailing [0] keeps only the first of the op's return values.

Why does the code check isinstance(input[0], tuple)? Presumably because some upstream layers store a tuple of several tensors in self.layers, so only the first element of such a tuple is taken as the actual input here.

-------------------------------------------------psroi_pool------------------------------------------------------

def psroi_pool(...)

    @layer
    def psroi_pool(self, input, output_dim, group_size, spatial_scale, name): 
        """contribution by miraclebiu"""
        # only use the first input
        if isinstance(input[0], tuple):
            input[0] = input[0][0]

        if isinstance(input[1], tuple):
            input[1] = input[1][0]

        return psroi_pooling_op.psroi_pool(input[0], input[1],
                                           output_dim=output_dim,
                                           group_size=group_size,
                                           spatial_scale=spatial_scale,
                                           name=name)[0]

This is executed by psroi_pooling_op.psroi_pool(...) (declared in psroi_pooling_layer/psroi_pooling_op.py and actually implemented by the compiled psroi_pooling.so).

-------------------------------------------------proposal_layer---------------------------------------------------------

tf.py_func() takes tensors, converts them into numpy arrays, feeds them to an ordinary Python function, and finally converts that function's numpy output back into tensors.
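
A minimal sketch of this pattern (_double is an invented stand-in function; TF 1.x assumed):

import numpy as np
import tensorflow as tf

def _double(x):                                    # an ordinary Python/numpy function
    return (x * 2).astype(np.float32)

inp = tf.placeholder(tf.float32, [None, 4])
out = tf.py_func(_double, [inp], tf.float32)       # tensor in -> numpy in/out -> tensor back
with tf.Session() as sess:
    print(sess.run(out, feed_dict={inp: np.ones((2, 4), np.float32)}))

In the proposal_layer method this is how the pure-Python proposal computation is plugged into the TensorFlow graph.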

4. Other

How is data actually fed in through sess.run? What is the mechanism relating feed_dict to the corresponding variables in the VGGnet_test class (which only defines placeholders)?
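
A minimal sketch of that mechanism (toy graph, TF 1.x assumed): feed_dict maps each placeholder tensor to a concrete numpy array, and sess.run substitutes those values when evaluating the requested outputs:

import numpy as np
import tensorflow as tf

data = tf.placeholder(tf.float32, shape=[None, None, None, 3])   # like self.data in VGGnet_test
mean = tf.reduce_mean(data)
with tf.Session() as sess:
    blob = np.random.rand(1, 600, 800, 3).astype(np.float32)     # a scaled image blob
    print(sess.run(mean, feed_dict={data: blob}))                 # roughly 0.5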
