A recurrent neural network (RNN) is a class of neural networks that takes sequence data as input, recurses along the direction in which the sequence evolves, and links all nodes (recurrent units) in a chain. This post introduces the RNN layer in TensorFlow/Keras and then walks through a minimal NumPy implementation.
```python
tf.keras.layers.RNN(
    cell,
    return_sequences=False,
    return_state=False,
    go_backwards=False,
    stateful=False,
    unroll=False,
    time_major=False,
    **kwargs
)
```

- `cell`: an RNN cell instance, or a list of RNN cell instances. An RNN cell is a class with the following:
  - A `call(input_at_t, states_at_t)` method that returns `(output_at_t, states_at_t_plus_1)`. The cell's `call` method can also take an optional `constants` argument (see the notes on passing external constants below).
  - A `state_size` attribute. This can be a single integer (single state), in which case it is the size of the recurrent state. It can also be a list/tuple of integers (one size per state). `state_size` can also be a list/tuple of `TensorShape` to represent a high-dimensional state.
  - An `output_size` attribute. This can be a single integer or a `TensorShape`, representing the shape of the output. For backward-compatibility reasons, if this attribute is not available on the cell, the value is inferred from the first element of `state_size`.
- `return_sequences`: Boolean (default `False`). Whether to return the full output sequence or only the last output in the sequence.
- `return_state`: Boolean (default `False`). Whether to return the last state in addition to the output.
- `go_backwards`: Boolean (default `False`). If `True`, the input sequence is processed backwards and the reversed sequence is returned.
- `stateful`: Boolean (default `False`). If `True`, the last state of each sample at index `i` in a batch is used as the initial state of the sample at index `i` in the following batch.
- `unroll`: Boolean (default `False`). If `True`, the network is unrolled; otherwise a symbolic loop is used. Unrolling can speed up an RNN, although it tends to use more memory, so it is only suitable for short sequences.
- `time_major`: the shape format of the `inputs` and `outputs` tensors. If `True`, inputs and outputs have shape `(timesteps, batch, ...)`; if `False` (the default), they have shape `(batch, timesteps, ...)`. `time_major = True` is slightly more efficient because it avoids transpositions at the beginning and end of the RNN computation. However, most TensorFlow data is batch-major, so by default this layer accepts input and emits output in batch-major form.
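The difference between the two layouts is just a transposition of the first two axes, which can be sketched in plain NumPy (the concrete shapes below are illustrative, not from the original text):

```python
import numpy as np

# Batch-major data: (batch, timesteps, features) -- the Keras default.
batch, timesteps, features = 4, 10, 8
x_batch_major = np.random.randn(batch, timesteps, features)

# Time-major layout: (timesteps, batch, features), as used when time_major=True.
x_time_major = np.transpose(x_batch_major, (1, 0, 2))

print(x_batch_major.shape)  # (4, 10, 8)
print(x_time_major.shape)   # (10, 4, 8)
```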
```python
# Now that we have the formulas, let's see how to implement an RNN with NumPy.
import numpy as np

class RNN:
    def __init__(self, word_dim, hidden_dim=100, output_dim=50):
        self.word_dim = word_dim
        self.hidden_dim = hidden_dim
        self.output_dim = output_dim
        # U multiplies the input data, so its shape is (input dim, number of units)
        self.U = np.random.uniform(-np.sqrt(1. / word_dim), np.sqrt(1. / word_dim),
                                   (word_dim, hidden_dim))    # d*h
        # W multiplies the recurrent state, so its shape is (number of units, number of units)
        self.W = np.random.uniform(-np.sqrt(1. / hidden_dim), np.sqrt(1. / hidden_dim),
                                   (hidden_dim, hidden_dim))  # h*h
        # V computes the output, so its shape is (number of units, output dim)
        self.V = np.random.uniform(-np.sqrt(1. / hidden_dim), np.sqrt(1. / hidden_dim),
                                   (hidden_dim, output_dim))  # h*q

    def forward_propagation(self, x):
        # x has shape (N, T, word_dim): batch size N, T time steps
        N, T = x.shape[0], x.shape[1]
        # hidden states, initialized to 0
        s = np.zeros((N, T, self.hidden_dim))
        # outputs, initialized to 0
        o = np.zeros((N, T, self.output_dim))
        # for each time step:
        for t in range(T):
            # Here is the code implementation of the calculation formula
            # (at t = 0, s[:, t-1, :] reads the all-zero last slot, i.e. the zero initial state)
            s[:, t, :] = np.tanh(x[:, t, :].dot(self.U) + s[:, t - 1, :].dot(self.W))  # n*h
            o[:, t, :] = self.softmax(s[:, t, :].dot(self.V))                          # n*q
        return [o, s]

    def softmax(self, x):
        exp_x = np.exp(x)
        return exp_x / np.sum(exp_x, axis=-1, keepdims=True)
```
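Before wrapping the recurrence in a class, it helps to check the dimensions of a single time step. The following self-contained sketch (the sizes `n`, `d`, `h`, `q` are illustrative) computes one step of `s_t = tanh(x_t U + s_{t-1} W)` and `o_t = softmax(s_t V)`:

```python
import numpy as np

# n = batch size, d = word_dim, h = hidden_dim, q = output_dim (illustrative sizes)
n, d, h, q = 2, 5, 4, 3
rng = np.random.default_rng(0)

U = rng.uniform(-1, 1, (d, h))   # input  -> hidden, d*h
W = rng.uniform(-1, 1, (h, h))   # hidden -> hidden, h*h
V = rng.uniform(-1, 1, (h, q))   # hidden -> output, h*q

x_t = rng.standard_normal((n, d))   # input at time t
s_prev = np.zeros((n, h))           # previous hidden state (zero at t = 0)

# One recurrent step:
s_t = np.tanh(x_t @ U + s_prev @ W)                            # n*h
logits = s_t @ V                                               # n*q
o_t = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

print(s_t.shape, o_t.shape)  # (2, 4) (2, 3)
```

Each row of `o_t` is a probability distribution over the `q` output classes, so it sums to 1.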
An RNN cell implementation is also given in the official TensorFlow documentation:
```python
# This is a bit simpler than the NumPy implementation. Note: the hard part of
# understanding RNNs is mostly keeping track of each parameter's dimensions and
# how the data's shape changes; once that is clear, RNNs are easy to understand.
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.layers import RNN

# First, let's define an RNN cell, as a Layer subclass.
class MinimalRNNCell(keras.layers.Layer):

    def __init__(self, units, **kwargs):
        self.units = units
        self.state_size = units
        super(MinimalRNNCell, self).__init__(**kwargs)

    def build(self, input_shape):
        self.kernel = self.add_weight(shape=(input_shape[-1], self.units),
                                      initializer='uniform',
                                      name='kernel')
        self.recurrent_kernel = self.add_weight(
            shape=(self.units, self.units),
            initializer='uniform',
            name='recurrent_kernel')
        self.built = True

    def call(self, inputs, states):
        prev_output = states[0]
        h = K.dot(inputs, self.kernel)
        output = h + K.dot(prev_output, self.recurrent_kernel)
        return output, [output]

# Let's use this cell in an RNN layer:
cell = MinimalRNNCell(32)
x = keras.Input((None, 5))
layer = RNN(cell)
y = layer(x)

# Here's how to use the cell to build a stacked RNN:
cells = [MinimalRNNCell(32), MinimalRNNCell(64)]
x = keras.Input((None, 5))
layer = RNN(cells)
y = layer(x)
```
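What stacking does can be sketched without Keras: at every time step, the first cell's output becomes the second cell's input, and each cell keeps its own state. The following NumPy sketch (weight shapes mirror `build()` above; the step function mirrors `call()`, but the random weights and sizes are illustrative) makes the data flow explicit:

```python
import numpy as np

def minimal_cell_step(x_t, state, kernel, recurrent_kernel):
    # Mirrors MinimalRNNCell.call: output = x_t . kernel + state . recurrent_kernel;
    # the output doubles as the next state.
    output = x_t @ kernel + state @ recurrent_kernel
    return output, output

rng = np.random.default_rng(0)
batch, timesteps, features = 2, 7, 5
units1, units2 = 32, 64  # like MinimalRNNCell(32) and MinimalRNNCell(64)

# Weights for the two stacked cells (shapes as in build()).
k1 = rng.standard_normal((features, units1))
rk1 = rng.standard_normal((units1, units1))
k2 = rng.standard_normal((units1, units2))
rk2 = rng.standard_normal((units2, units2))

x = rng.standard_normal((batch, timesteps, features))
s1 = np.zeros((batch, units1))
s2 = np.zeros((batch, units2))

for t in range(timesteps):
    out1, s1 = minimal_cell_step(x[:, t, :], s1, k1, rk1)
    out2, s2 = minimal_cell_step(out1, s2, k2, rk2)  # cell 2 consumes cell 1's output

print(out2.shape)  # (2, 64): the stack's last output has the top cell's width
```

This is why `RNN(cells)` with a list of cells reports the output size of the last cell in the list.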