In the previous article, we preprocessed the ECG signal to smooth out noise ahead of classification. In this article, we start using deep learning to classify and recognize ECG signals.

## Convolutional neural network

Both traditional machine learning and deep learning classify data based on the features that distinguish one category from another. To classify and recognize, we need to extract these features, but the two approaches do so differently. In traditional machine learning, features are designed and extracted manually by designers or domain experts, whereas deep learning extracts features from the data automatically. For a convolutional neural network (CNN), the key to automatic feature extraction is the convolution operation. The features extracted by convolution are often redundant, and stacking many convolutions leaves the network with too many parameters to train easily, so a CNN usually places a pooling layer after a convolution layer. After several rounds of convolution and pooling, low-level features gradually combine into higher-level features. Finally, the network classifies based on the extracted high-level features.
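To make the convolution-then-pooling idea concrete, here is a minimal, dependency-free sketch on a toy one-dimensional signal. The signal, the kernel values, and the helper names `conv1d` and `max_pool1d` are made up for illustration; a real CNN learns its kernel weights during training.

```python
def conv1d(signal, kernel):
    """Slide the kernel over the signal (no padding, stride 1)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def max_pool1d(values, pool_size, stride):
    """Keep the largest value in each window (no padding)."""
    return [
        max(values[i:i + pool_size])
        for i in range(0, len(values) - pool_size + 1, stride)
    ]


# Toy signal with a single sharp peak, loosely resembling an R wave
signal = [0, 0, 1, 5, 1, 0, 0, 0]
# Hypothetical kernel that responds strongly to a narrow peak
kernel = [-1, 2, -1]

features = conv1d(signal, kernel)
pooled = max_pool1d(features, pool_size=2, stride=2)

print(features)  # [-1, -3, 8, -3, -1, 0]
print(pooled)    # [-1, 8, 0]
```

The convolution output is largest where the kernel lines up with the peak, and pooling halves the length while keeping that strong response.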

We should also explain why a CNN can be used for ECG classification. The reason is that the convolution operation of a CNN has two properties: local connection and weight sharing.

- Local connection: the features needed to tell different kinds of pictures apart occupy only local regions of the whole picture, so the convolution kernel (receptive field) can cover just a small region rather than being as large as the whole picture (which would amount to full connection). This not only expresses different features better but also reduces the number of parameters. For example, in the figure below, the left side shows a fully connected network, and the right side shows a network using locally connected convolution kernels.

- Weight sharing: pictures of the same class have similar features, but the position of those features may shift from picture to picture. For example, the position of the eyes varies between face photos, and two photos rarely have the eyes in exactly the same place. When a picture is convolved, multiple convolution kernels can be used to extract different features, but the weights of any one kernel stay the same as it slides across the picture (the weights of different kernels are, of course, not shared). This makes feature extraction independent of position and also reduces the number of parameters.

Although the ECG signal is one-dimensional, its characteristics also satisfy the conditions for local connection and weight sharing, so we can use a convolutional neural network to classify it.
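The parameter savings from local connection and weight sharing can be checked with simple arithmetic. The sketch below compares a hypothetical fully connected layer that maps a 300-point heartbeat to 300 outputs with a convolution layer of four kernels of length 21; the helper names are illustrative only.

```python
def dense_params(n_inputs, n_outputs):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_outputs + n_outputs


def conv1d_params(kernel_size, in_channels, filters):
    """Weights plus biases of a 1D convolution layer.

    Each filter has kernel_size * in_channels weights and one bias,
    and reuses them at every position along the signal.
    """
    return kernel_size * in_channels * filters + filters


# A 300-point heartbeat mapped to 300 outputs by full connection
full = dense_params(300, 300)
# Four kernels of length 21 sliding over the same single-channel heartbeat
conv = conv1d_params(kernel_size=21, in_channels=1, filters=4)

print(full, conv)  # 90300 88
```

Because each kernel is small and is reused at every position, the convolution layer needs orders of magnitude fewer parameters than the fully connected one.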

## Building deep learning data set

As the saying goes, even the cleverest cook cannot make a meal without rice. Although we have preprocessed the ECG data, it still cannot be used directly for classification and learning, so we need to build a data set that fits the deep learning model. The conversion consists of cutting the heartbeats that meet the requirements out of each ECG signal as samples, converting the Python lists to numpy arrays, and finally shuffling and splitting them to form the data set for deep learning. The interface provided by tf.keras accepts numpy arrays directly, without conversion to a TensorFlow Dataset object, which also makes the training process simpler.

Heartbeat segmentation requires locating the QRS peak. Since we only want to train the network model, we directly use the manual annotations provided with the MIT-BIH data set, and take 99 signal points before the peak and 200 signal points after it (300 points in total, including the peak itself) to form a complete heartbeat. Classifying real measured signals would require designing a heartbeat detection algorithm, which I may work on later.
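The window around each peak can be sketched with plain slicing. In the example below the "signal" is just a list of sample indexes and the peak position is made up, so that the boundaries of the cut are easy to read off; `cut_heartbeat` is a hypothetical helper, and the same slice works on a numpy array.

```python
def cut_heartbeat(signal, r_index, before=99, after=201):
    """Return the samples from r_index - before up to, but not including, r_index + after."""
    return signal[r_index - before:r_index + after]


# Stand-in for a denoised ECG record: each value is its own sample index
signal = list(range(1000))
# Hypothetical R-peak position
r_index = 400

beat = cut_heartbeat(signal, r_index)

print(len(beat))  # 300
print(beat[0], beat[99], beat[-1])  # 301 400 600
```

The peak ends up at position 99 of the 300-point sample, with 99 points before it and 200 points after it.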

Data sets are divided by purpose into a training set, a validation set, and a test set. The training set is used to train the model parameters, the validation set is used to check accuracy and error (the loss function) during training, and the test set is used for the final evaluation after training. They can be compared to studying, quizzes, and the final exam. The three have the same data structure but different content. Each set contains two parts: data and labels. The data is a list of heartbeats cut out after preprocessing, and each label is the ECG type of the corresponding heartbeat sample.

First, encapsulate the preprocessing steps from the previous article in a function:

```python
import numpy as np
import pywt


# Wavelet denoising preprocessing
def denoise(data):
    # Wavelet decomposition
    coeffs = pywt.wavedec(data=data, wavelet='db5', level=9)
    cA9, cD9, cD8, cD7, cD6, cD5, cD4, cD3, cD2, cD1 = coeffs

    # Threshold denoising
    threshold = (np.median(np.abs(cD1)) / 0.6745) * (np.sqrt(2 * np.log(len(cD1))))
    # Discard the two finest detail levels entirely
    cD1.fill(0)
    cD2.fill(0)
    # Threshold the remaining detail levels cD9..cD3
    for i in range(1, len(coeffs) - 2):
        coeffs[i] = pywt.threshold(coeffs[i], threshold)

    # Inverse wavelet transform to obtain the denoised signal
    rdata = pywt.waverec(coeffs=coeffs, wavelet='db5')
    return rdata
```
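Read directly from the code, the threshold computed in `denoise` can be written as a formula. The symbols $\hat{\sigma}$, $\lambda$, and $N$ are introduced here for illustration; the form matches what is commonly called the universal threshold, with the noise level estimated from the median absolute value of the finest detail coefficients:

$$
\hat{\sigma} = \frac{\operatorname{median}\left(\lvert cD_1 \rvert\right)}{0.6745},
\qquad
\lambda = \hat{\sigma}\,\sqrt{2 \ln N},
\qquad
N = \operatorname{len}(cD_1)
$$

In the code, $\lambda$ is the value stored in `threshold` and passed to `pywt.threshold` for the levels $cD_9$ through $cD_3$.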

Then, reading the data and annotations and segmenting the heartbeats are encapsulated in another function:

```python
import wfdb


# Read ECG data and the corresponding labels, and denoise the data with wavelets
def getDataSet(number, X_data, Y_data):
    ecgClassSet = ['N', 'A', 'V', 'L', 'R']

    # Read the ECG data record
    print("Reading ECG record " + number + "...")
    record = wfdb.rdrecord('ecg_data/' + number, channel_names=['MLII'])
    data = record.p_signal.flatten()
    # Wavelet denoising
    rdata = denoise(data=data)

    # Get the R-wave positions and their labels from the annotations
    annotation = wfdb.rdann('ecg_data/' + number, 'atr')
    Rlocation = annotation.sample
    Rclass = annotation.symbol

    # Skip the unstable data at the beginning and end of the record
    start = 10
    end = 5
    i = start
    j = len(annotation.symbol) - end

    # Only the five beat types N, A, V, L, R are used, so beats with
    # other labels are discarded.
    # X_data collects the 300 data points around each R wave
    # Y_data maps N, A, V, L, R to 0, 1, 2, 3, 4
    while i < j:
        try:
            lable = ecgClassSet.index(Rclass[i])
            x_train = rdata[Rlocation[i] - 99:Rlocation[i] + 201]
            X_data.append(x_train)
            Y_data.append(lable)
            i += 1
        except ValueError:
            # The label is not one of the five selected types
            i += 1
    return
```

Note that the function above does not return a value. The lists X_data and Y_data, which collect the heartbeat samples and their labels, are passed in from outside the function, and the heartbeats extracted from each record are appended to the end of them. Calling the function with the record number, X_data, and Y_data therefore accumulates the required data in X_data and Y_data.

Next, read the heartbeats from all records (records 102 and 104 are excluded because they have no MLII lead) into the two lists dataSet and lableSet. After the function above has run, dataSet and lableSet are both lists of length 92192. Each element of dataSet is a numpy array holding the 300 signal points of one heartbeat, and each element of lableSet is the label of the corresponding array in dataSet (N, A, V, L, R map to 0, 1, 2, 3, 4). After conversion and reshaping, dataSet becomes an array of shape (92192, 300) and lableSet an array of shape (92192, 1). The two arrays are then shuffled without breaking the correspondence between each heartbeat and its label. The idea is to stack the two arrays horizontally into a single array train_ds, shuffle its rows, and then separate the shuffled data X and labels Y.

Because tf.keras can automatically split the input data into a training set and a validation set, we only need to split off the test set ourselves. The idea is to generate a random permutation of the 92192 heartbeat indexes (the total number of heartbeats), take the first 30% of it as indexes, and use those indexes to select from the data set and the label set, giving the test data set X_test and the test label set Y_test.

```python
import numpy as np

RATIO = 0.3  # fraction of the heartbeats used for the test set


# Load and preprocess the data set
def loadData():
    numberSet = ['100', '101', '103', '105', '106', '107', '108', '109', '111', '112',
                 '113', '114', '115', '116', '117', '119', '121', '122', '123', '124',
                 '200', '201', '202', '203', '205', '208', '210', '212', '213', '214',
                 '215', '217', '219', '220', '221', '222', '223', '228', '230', '231',
                 '232', '233', '234']
    dataSet = []
    lableSet = []
    for n in numberSet:
        getDataSet(n, dataSet, lableSet)

    # Convert to numpy arrays and shuffle data and labels together
    dataSet = np.array(dataSet).reshape(-1, 300)
    lableSet = np.array(lableSet).reshape(-1, 1)
    train_ds = np.hstack((dataSet, lableSet))
    np.random.shuffle(train_ds)

    # Data set and its label set
    X = train_ds[:, :300].reshape(-1, 300, 1)
    Y = train_ds[:, 300]

    # Test set and its label set (drawn from X and Y, so it overlaps them)
    shuffle_index = np.random.permutation(len(X))
    test_length = int(RATIO * len(shuffle_index))
    test_index = shuffle_index[:test_length]
    X_test, Y_test = X[test_index], Y[test_index]
    return X, Y, X_test, Y_test
```

After the function above, X and Y are the complete data set and label set, and X_test and Y_test are the test data set and label set. The training and validation sets are split automatically by the tf.keras interface. With this, the data set for deep learning is ready.
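In `loadData`, the test samples are selected from X and Y, which are later passed to `model.fit` in full. If a test set that the model never sees during training is wanted, one possible variant is to split the shuffled indexes into two disjoint parts. The sketch below uses small stand-in arrays instead of real heartbeats; `split_train_test` and its `seed` parameter are hypothetical names, not part of the original code.

```python
import numpy as np


def split_train_test(X, Y, test_ratio=0.3, seed=0):
    """Shuffle the indexes once, then split them into disjoint training and test parts."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(len(X))
    test_length = int(test_ratio * len(X))
    test_index = shuffled[:test_length]
    train_index = shuffled[test_length:]
    return X[train_index], Y[train_index], X[test_index], Y[test_index]


# Stand-in data: 20 "heartbeats" of 300 points each, labels 0..4
X = np.arange(20 * 300, dtype=float).reshape(20, 300, 1)
Y = np.arange(20) % 5

X_train, Y_train, X_test, Y_test = split_train_test(X, Y)
print(X_train.shape, X_test.shape)
```

With this variant, `X_train` and `Y_train` would be passed to `model.fit` instead of X and Y, and the validation split would then be taken from the training part only.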

## Deep learning recognition classification

Programming the training process of a deep neural network is generally complex, but the high-level tf.keras interface makes it convenient to build a deep learning model.

First, we build the network structure, as shown in the following figure:

```python
import tensorflow as tf


# Build the CNN model
def buildModel():
    newModel = tf.keras.models.Sequential([
        tf.keras.layers.InputLayer(input_shape=(300, 1)),
        # First convolution layer: 4 kernels of size 21x1
        tf.keras.layers.Conv1D(filters=4, kernel_size=21, strides=1, padding='same', activation='relu'),
        # First pooling layer: max pooling, window size 3, stride 2
        tf.keras.layers.MaxPool1D(pool_size=3, strides=2, padding='same'),
        # Second convolution layer: 16 kernels of size 23x1
        tf.keras.layers.Conv1D(filters=16, kernel_size=23, strides=1, padding='same', activation='relu'),
        # Second pooling layer: max pooling, window size 3, stride 2
        tf.keras.layers.MaxPool1D(pool_size=3, strides=2, padding='same'),
        # Third convolution layer: 32 kernels of size 25x1
        tf.keras.layers.Conv1D(filters=32, kernel_size=25, strides=1, padding='same', activation='relu'),
        # Third pooling layer: average pooling, window size 3, stride 2
        tf.keras.layers.AvgPool1D(pool_size=3, strides=2, padding='same'),
        # Fourth convolution layer: 64 kernels of size 27x1
        tf.keras.layers.Conv1D(filters=64, kernel_size=27, strides=1, padding='same', activation='relu'),
        # Flatten layer, so the fully connected layers can process the features
        tf.keras.layers.Flatten(),
        # Fully connected layer, 128 nodes
        tf.keras.layers.Dense(128, activation='relu'),
        # Dropout layer, rate 0.2
        tf.keras.layers.Dropout(rate=0.2),
        # Fully connected output layer, 5 nodes (one per class)
        tf.keras.layers.Dense(5, activation='softmax')
    ])
    return newModel
```
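The sizes flowing through `buildModel` can be worked out by hand, which is a useful cross-check against the output of `model.summary()`. The sketch below assumes the usual rule that a layer with 'same' padding and stride s turns a length n into ceil(n / s); the helper names are illustrative only.

```python
import math


def same_pool_length(length, stride):
    """Output length of a pooling layer with 'same' padding."""
    return math.ceil(length / stride)


def conv1d_params(kernel_size, in_channels, filters):
    """Weights plus biases of a 1D convolution layer."""
    return kernel_size * in_channels * filters + filters


def dense_params(n_inputs, n_outputs):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_outputs + n_outputs


length, channels = 300, 1
total = 0
# (filters, kernel_size, followed_by_pooling) for the four convolution layers
conv_layers = [(4, 21, True), (16, 23, True), (32, 25, True), (64, 27, False)]
for filters, kernel_size, pooled in conv_layers:
    # 'same' padding with stride 1 keeps the length unchanged
    total += conv1d_params(kernel_size, channels, filters)
    channels = filters
    if pooled:
        length = same_pool_length(length, stride=2)

# Flatten, then the two fully connected layers
flat = length * channels
total += dense_params(flat, 128) + dense_params(128, 5)

print(length, flat, total)  # 38 2432 381837
```

The length shrinks from 300 to 150, 75, and 38 across the three pooling layers, and most of the parameters sit in the first fully connected layer after flattening.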

Then compile the model with model.compile(); train it with model.fit() for 30 epochs with a batch size of 128 and a validation split of 0.3, with a callback set to save the training logs; save the model with model.save(); and predict with model.predict_classes(). The complete code is available in my GitHub repository; the address is given in article (1).

```python
import os

import tensorflow as tf


# model_path and log_dir are assumed to be defined elsewhere in the complete script
def main():
    # X, Y are the complete data set and label set
    # X_test, Y_test are the split-off test set and its labels
    X, Y, X_test, Y_test = loadData()

    if os.path.exists(model_path):
        # Load the trained model
        model = tf.keras.models.load_model(filepath=model_path)
    else:
        # Build the CNN model
        model = buildModel()
        model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
        model.summary()
        # Define the TensorBoard callback
        tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
        # Train and validate
        model.fit(X, Y, epochs=30, batch_size=128, validation_split=RATIO,
                  callbacks=[tensorboard_callback])
        model.save(filepath=model_path)

    # Predict
    # (predict_classes was removed in TensorFlow 2.6; on newer versions,
    # np.argmax(model.predict(X_test), axis=-1) gives the same class indexes)
    Y_pred = model.predict_classes(X_test)
```
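The loss named in `model.compile` can be illustrated for a single sample. With integer labels and a softmax output, sparse categorical cross-entropy is minus the logarithm of the probability the network assigns to the true class. The probabilities below are made up, and the function is a simplified stand-in for illustration; Keras additionally averages the loss over the batch.

```python
import math


def sparse_categorical_crossentropy(label, probabilities):
    """Loss for one sample: minus the log of the probability given to the true class."""
    return -math.log(probabilities[label])


# Hypothetical softmax output for one heartbeat over the classes N, A, V, L, R
probabilities = [0.90, 0.02, 0.05, 0.02, 0.01]

# Loss if the true class is N (index 0): the prediction is confident and correct
loss_if_n = sparse_categorical_crossentropy(0, probabilities)
# Loss if the true class is V (index 2): the prediction is confident and wrong
loss_if_v = sparse_categorical_crossentropy(2, probabilities)

print(round(loss_if_n, 3), round(loss_if_v, 3))
```

The loss is small when the true class receives a high probability and grows quickly as that probability approaches zero.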

At this point, the recognition rate of ECG signal is about 99%.
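A recognition rate like this can be computed by comparing the predicted classes with the true labels, and a confusion matrix shows which classes get mixed up. The sketch below uses made-up labels rather than the real Y_test and Y_pred, and the helper names are illustrative; it is one straightforward way to do the calculation, not necessarily the one used for the figure above.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred, n_classes=5):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[int(t)][int(p)] += 1
    return matrix


# Made-up labels for illustration (0..4 stand for N, A, V, L, R)
y_true = [0, 0, 1, 2, 3, 4, 4, 2]
y_pred = [0, 0, 1, 2, 3, 4, 0, 2]

acc = accuracy(y_true, y_pred)
matrix = confusion_matrix(y_true, y_pred)

print(acc)  # 0.875
for row in matrix:
    print(row)
```

Because the classes in an ECG data set are usually imbalanced, the per-class rows of the confusion matrix are worth reading alongside the overall accuracy.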