Anticipating Atrial Fibrillation Signal Using Efficient Algorithm

Abstract: Atrial Fibrillation (AF) is one of the common types of arrhythmia, and it may cause death to patients. Correctly diagnosing heart problems by examining the Electrocardiogram (ECG) signal leads to prescribing the right treatment for a patient. This study proposes a system that distinguishes between normal and AF ECG signals. First, this work provides a novel algorithm for segmenting the ECG signal to extract single heartbeats. The algorithm uses techniques of low computational cost to segment the ECG signal. Then, preprocessing and feature extraction methods are suggested. Two classifiers, Support Vector Machine (SVM) and Multilayer Perceptron (MLP), are used separately to evaluate the two proposed algorithms. With the proposed segmentation method, the performance of the latter method improves by about 19% and 17% for the SVM and MLP, reaching 96.2% and 97.5%, respectively.


Introduction
Various kinds of heart disease are diagnosed through ECG signals, but these signals are disturbed by several types of noise, which may lead to incorrect interpretations. Hence, eliminating noise from ECG signals has attracted significant attention, with the aim of obtaining clean signals from which undisturbed features can be extracted with an accuracy acceptable for medical purposes [3,4]. AF is an abnormal cardiac rhythm (arrhythmia) that affects five million people in the USA. Recent studies estimated that 700,000 people in the USA may have undiagnosed AF. Therefore, more efforts and studies are needed to develop accurate non-invasive technology for anticipating AF signals. These technologies enable patients to

Fig. 1. An ECG signal with its typical intervals [7].

The suggested algorithm
This algorithm covers preprocessing and feature extraction and involves five stages: wavelet-based denoising, normalization, cross-correlation, feature extraction, and classification using MLP or SVM. Figure 2 illustrates the suggested preprocessing and feature extraction procedure.

Fig. 2. Block diagram of the ECG classification procedure
Firstly, the ECG signal is denoised using the wavelet transform. The ECG signal is decomposed into 7 levels using the Daubechies 2 (Db2) filter. Then, the ECG signal is reconstructed after excluding the coefficients of level 7, see Figure 3.

Fig. 3. Denoised ECG signal and noisy signal
Secondly, normalization is applied: the signal is divided by its absolute maximum value. Thirdly, cross-correlation is applied by correlating the ECG signal with a reference normal ECG signal. Fourthly, ten statistical features are extracted: min, max, mean, median, mode, 1st quartile, 2nd quartile, Standard Deviation (SD), range, and entropy. Finally, the two ECG classes are classified using one of the two classifiers, MLP or SVM.
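As a rough illustration of the normalization and feature extraction stages, the sketch below computes the ten listed features with NumPy. The text does not say how the mode and the entropy are computed for a continuous-valued signal, so the histogram-based mode and the Shannon entropy used here are assumptions, as are the function names and the `bins` parameter.

```python
import numpy as np

def normalize(signal):
    """Divide the signal by its absolute maximum value."""
    signal = np.asarray(signal, dtype=float)
    peak = np.max(np.abs(signal))
    return signal / peak if peak > 0 else signal

def extract_features(signal, bins=32):
    """Return the ten statistical features named in the text.

    Assumption: mode and entropy are estimated from a histogram with
    `bins` bins; the paper does not specify how they are obtained.
    """
    signal = np.asarray(signal, dtype=float)
    counts, edges = np.histogram(signal, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    probs = counts[counts > 0] / counts.sum()
    return {
        "min": float(signal.min()),
        "max": float(signal.max()),
        "mean": float(signal.mean()),
        "median": float(np.median(signal)),
        "mode": float(centers[np.argmax(counts)]),      # most populated bin
        "q1": float(np.percentile(signal, 25)),
        "q2": float(np.percentile(signal, 50)),          # equals the median
        "sd": float(signal.std()),
        "range": float(signal.max() - signal.min()),
        "entropy": float(-np.sum(probs * np.log2(probs))),  # Shannon, in bits
    }
```

Note that the 2nd quartile coincides with the median by definition, so the list contains nine distinct quantities under this reading.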

Proposed ECG segmentation algorithm
Segmenting an ECG signal into single heartbeats involves determining the correct heartbeat interval and its peaks. Since the R point is the peak value of the ECG signal, both the traditional segmentation algorithms and this algorithm use it to determine the boundaries of each ECG heartbeat. In the proposed algorithm, a thresholding technique is used to determine the R point and QRS interval of the ECG signal. The threshold estimation is given by the following equation.

Fig. 4. Flowchart of the proposed ECG segmentation algorithm
where s(t) is the ECG signal used as an exponent of the base 1.1 and σ is the mean plus the standard deviation (SD) of s(t). The following steps illustrate the algorithm (see Figure 4):
1. Initialize the size of the segmentation window to 360 samples (1 sec.) and the variable (be) to 0.
2. Construct the current ECG segment from the ECG signal.
3. Raise 1.1 to the power of each sample of the segment.
4. Apply Eq. (1) to find the samples greater than the threshold.
5. Take the medial sample of the first run of successive above-threshold samples as the R point and save its index.
6. Set (be) to the index of this R point plus 180 (360/2).
7. Check whether the ECG segmentation is finished. If not, go to step 2; otherwise continue.
8. Compute the periods between successive R points to find the mean interval of the whole ECG dataset.
9. Use the extracted information (the indices of the R points and the mean interval) to segment the original ECG signal so that the R point lies at the center of each ECG segment.
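The steps above can be sketched in Python as follows. Eq. (1) is not reproduced in the text, so this sketch assumes it marks the samples where s(t) exceeds the local mean plus SD of s(t). It also assumes that (be) is the start index of the current window and that a window without any above-threshold sample is simply skipped. The function names are illustrative.

```python
import numpy as np

def find_r_points(ecg, window=360, base=1.1):
    """Locate R points with a sliding window and a local threshold (steps 1-7)."""
    ecg = np.asarray(ecg, dtype=float)
    r_points = []
    be = 0                                    # assumed: start index of the window
    while be < len(ecg):
        s = base ** ecg[be:be + window]       # step 3: signal used as exponent
        threshold = s.mean() + s.std()        # assumed form of the threshold
        above = np.flatnonzero(s > threshold) # step 4
        if above.size == 0:                   # assumption: skip empty windows
            be += window
            continue
        # Step 5: keep only the first run of consecutive indices.
        gaps = np.flatnonzero(np.diff(above) > 1)
        first_run = above[:gaps[0] + 1] if gaps.size else above
        r_index = int(be + first_run[len(first_run) // 2])  # medial sample
        r_points.append(r_index)
        be = r_index + window // 2            # step 6
    return np.array(r_points, dtype=int)

def segment_beats(ecg, r_points):
    """Cut R-centred beats of the mean R-R length (steps 8-9)."""
    ecg = np.asarray(ecg, dtype=float)
    mean_interval = int(round(np.mean(np.diff(r_points))))
    half = mean_interval // 2
    beats = [ecg[r - half:r + half] for r in r_points
             if r - half >= 0 and r + half <= len(ecg)]
    return np.array(beats), mean_interval
```

Beats whose window would extend past either end of the recording are dropped here; the source does not state how such edge beats are handled.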

Support vector machine
The SVM is a machine learning classifier that has been exploited in different fields, as in [16]. The goal of the SVM is to determine a hyperplane in N-dimensional space (N is the number of features) that separates the data of the classes. For example, to separate the data of two classes, infinitely many hyperplanes could be chosen, but only one hyperplane has the maximum margin between the classes, as shown in Figure 5 [17][18][19]. In this paper, the radial basis function (RBF) kernel with an automatic kernel scale (KernelScale) is used for the SVM.
(a) Before the classification (b) After the classification [11].
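To make the role of the kernel scale concrete, the sketch below evaluates an RBF kernel matrix with NumPy. It assumes one common parameterization, in which the features are divided by the kernel scale before the squared Euclidean distance is taken. The source does not state the exact form, and the automatic selection of the scale is not reproduced here.

```python
import numpy as np

def rbf_kernel(X, Y, kernel_scale=1.0):
    """K[i, j] = exp(-||X[i]/s - Y[j]/s||^2), with s = kernel_scale (assumed form)."""
    Xs = np.asarray(X, dtype=float) / kernel_scale
    Ys = np.asarray(Y, dtype=float) / kernel_scale
    sq_dist = (
        np.sum(Xs ** 2, axis=1)[:, None]
        + np.sum(Ys ** 2, axis=1)[None, :]
        - 2.0 * Xs @ Ys.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0))
```

A larger `kernel_scale` makes distant feature vectors look more similar, which smooths the decision boundary; a smaller one makes the kernel more local.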

Multilayer neural network
The MLP is a type of multilayer NN. Typically, these networks consist of an input layer and at least two computational layers: the output layer and at least one hidden layer. Figure 6 shows an MLP network [20,21]. The learning algorithm updates the connection weights of the computational layers. A relatively simple learning algorithm for a predefined neural network is backpropagation, the most popular among hundreds of different learning algorithms. The algorithm has two phases. First, the training set is supplied to the input layer of the MLP, and the network propagates it forward from layer to layer until the output layer generates the output pattern. In the second phase, the output pattern is compared with the desired output to calculate the output error. The error (if any) is then propagated backward from the output layer to the input layer, and the weights of the network are updated as it passes. This procedure is repeated until the error is reduced to a specified range [20], [22,23]. This study used the MLP classifier with one hidden layer, trained with the Levenberg-Marquardt backpropagation algorithm with a target error of 0.001 and Mean Square Error (MSE) as the evaluation function.
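The two phases of backpropagation can be sketched for a one-hidden-layer MLP as below. This is not the training method of the study: plain full-batch gradient descent is swapped in for Levenberg-Marquardt to keep the example short. The activation functions, learning rate, and weight initialization are assumptions; only the 22 hidden nodes, the MSE criterion, and the 0.001 target come from the text.

```python
import numpy as np

def train_mlp(X, y, hidden=22, lr=0.5, epochs=2000, goal=1e-3, seed=0):
    """Train a one-hidden-layer MLP with gradient-descent backpropagation."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    history = []
    for _ in range(epochs):
        # Phase 1: forward pass from the input layer to the output layer.
        h = np.tanh(X @ W1 + b1)
        out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
        err = out - y
        mse = float(np.mean(err ** 2))
        history.append(mse)
        if mse <= goal:                       # stop once the target is reached
            break
        # Phase 2: propagate the error backward and update the weights.
        d_out = (2.0 * err / len(X)) * out * (1.0 - out)
        d_hid = (d_out @ W2.T) * (1.0 - h ** 2)
        W2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)
    return (W1, b1, W2, b2), history

def predict(params, X):
    """Return class labels (0 or 1) for the rows of X."""
    W1, b1, W2, b2 = params
    h = np.tanh(np.asarray(X, dtype=float) @ W1 + b1)
    out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    return (out.ravel() >= 0.5).astype(int)
```

The returned `history` holds the MSE per epoch, which makes it easy to check that the error shrinks as training proceeds.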

Discrete Wavelet Transform (DWT)
DWT is widely used in biomedical engineering because it reveals the time-frequency components of a signal. Various types of wavelet filters can be used to decompose a given signal into detail and approximation coefficients [24]. The signal is passed through a low-pass filter and then downsampled by two to produce the approximation coefficients of the next level. The detail coefficients are produced by passing the signal through a high-pass filter and downsampling the output by two. The detail coefficients are kept, and the approximation can be further decomposed to the next level by passing it through the same low-pass and high-pass filters and downsampling the outputs by 2. This process can be repeated to decompose the signal to the target scale. Figure 7 shows two levels of DWT decomposition [25,26]. For reconstruction, the reverse procedure is applied: the coefficients are first upsampled by 2, and then the approximation and detail components at the larger scale are fed back through the low-pass and high-pass filters, respectively, see Figure 8 [25,27].
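The decomposition and reconstruction described above can be sketched with NumPy using the Db2 filter pair. The periodic boundary handling is an assumption made to keep the transform exactly invertible, and the signal length must be divisible by 2 raised to the number of levels. The last function shows one possible reading of "excluding the coefficients of level 7" in the denoising stage, namely zeroing the level-7 approximation; the source does not say which level-7 coefficients are dropped.

```python
import numpy as np

_S3 = np.sqrt(3.0)
# Db2 low-pass filter and its quadrature-mirror high-pass filter.
LOW = np.array([1 + _S3, 3 + _S3, 3 - _S3, 1 - _S3]) / (4.0 * np.sqrt(2.0))
HIGH = np.array([LOW[3], -LOW[2], LOW[1], -LOW[0]])

def _taps(n):
    """Sample indices seen by each output coefficient (periodic wrap-around)."""
    return (2 * np.arange(n // 2)[:, None] + np.arange(4)[None, :]) % n

def dwt_step(x):
    """One level: filter and downsample by two."""
    idx = _taps(len(x))
    return x[idx] @ LOW, x[idx] @ HIGH       # approximation, detail

def idwt_step(approx, detail):
    """Inverse of dwt_step: upsample by two and filter."""
    n = 2 * len(approx)
    x = np.zeros(n)
    np.add.at(x, _taps(n), np.outer(approx, LOW) + np.outer(detail, HIGH))
    return x

def wavedec(x, levels):
    """Multi-level decomposition; details[0] is level 1 (finest scale)."""
    approx = np.asarray(x, dtype=float)
    if len(approx) % (2 ** levels) != 0:
        raise ValueError("length must be divisible by 2**levels")
    details = []
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        details.append(detail)
    return approx, details

def waverec(approx, details):
    """Reconstruct from the coarsest level back to the signal."""
    for detail in reversed(details):
        approx = idwt_step(approx, detail)
    return approx

def drop_level7_approximation(x, levels=7):
    """Assumed denoising reading: zero the coarsest approximation, then rebuild."""
    approx, details = wavedec(x, levels)
    return waverec(np.zeros_like(approx), details)
```

Under this reading the step removes the slowest-varying part of the signal, such as a constant offset, while all detail levels are kept.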

Correlation
The degree of similarity between two signals is measured by correlation, which has numerous applications such as sonar, radar, and geology. Let x(n) and y(n) be two different signals; their cross-correlation (rxy) can be calculated as:

\[ r_{xy}(m) = \sum_{n=-\infty}^{\infty} x(n)\, y(n-m), \qquad m = 0, \pm 1, \pm 2, \ldots \tag{2} \]

where m is the time shift (lag) parameter and the subscript xy of the cross-correlation indicates the sequences being correlated. That is, in Eq. 2 only the signal y(n) is shifted, from right to left with respect to the signal x(n), as m changes from positive to negative. In the special case x(n)=y(n), the operation is called autocorrelation.

sinus rhythm, AF … etc. For each class, sessions of at least 10 seconds were collected. The recordings are digitized at 360 Hz with a gain of 200 adu / mV. The ECG signals were derived from one lead and all were stored in mat format (Matlab) [29]. This work distinguishes between only two ECG classes: normal sinus rhythm and AF, which belong to 19 and 6 subjects, respectively. However, the data acquisition of one subject of the second class had a problem, so it was discarded. Therefore, the data of only 5 subjects were taken for each of the two classes to balance the dataset. Table 1 illustrates the records used in each ECG class. After segmenting the ECG, 100 single heartbeats are extracted from each subject, giving 500 heartbeats from the 5 subjects of each class. 10-fold cross-validation was used in this work to evaluate the classification rates of the classifiers.
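For the cross-correlation described in the Correlation section, a short NumPy sketch is given here. It assumes the standard definition for finite sequences, r_xy(m) = Σ x(n) y(n-m), which `numpy.correlate` computes in "full" mode. The reference pulse and the function name are illustrative.

```python
import numpy as np

def cross_correlation(x, y):
    """Return (lags, r) with r[i] = sum over n of x(n) * y(n - lags[i])."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.correlate(x, y, mode="full")
    lags = np.arange(-(len(y) - 1), len(x))
    return lags, r

# Toy example: a pulse and a copy of it delayed by three samples.
reference = np.array([0, 0, 1, 3, 1, 0, 0, 0, 0, 0], dtype=float)
delayed = np.roll(reference, 3)
lags, r = cross_correlation(delayed, reference)
best_lag = int(lags[np.argmax(r)])   # lag at which the two signals align
```

The lag of the correlation peak recovers the delay between the two signals, which is why correlating each beat with a reference normal beat yields a signal that reflects their similarity.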

Results
The performance evaluation of the classifiers is based on the ratio of correctly predicted testing tuples to all testing tuples, according to the following equation:

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]

where true positive (TP) is the number of data tuples that are predicted correctly as the positive class, while false positive (FP) is the number of data tuples that are predicted wrongly as the positive class. True negative (TN) is the number of data tuples that are predicted correctly as the negative class, while false negative (FN) is the number of data tuples that are predicted wrongly as the negative class.
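A minimal sketch of this evaluation, assuming the usual accuracy definition (correct predictions divided by all predictions), is shown below. Which class is treated as positive is not stated in the text; label 1 is used here as an assumption.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP and FN for a two-class problem."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of testing tuples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)
```

For example, with true labels `[1, 1, 0, 0, 1]` and predictions `[1, 0, 0, 1, 1]` the counts are TP=2, TN=1, FP=1, FN=1, giving an accuracy of 0.6.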
For the MLP classifier, the number of neurons in the hidden layer was determined experimentally. Several experiments were carried out to find the number of neurons that produces the best performance (best classification accuracy with low SD). Each experiment was repeated 10 times to obtain a more reliable result. Figure 9 illustrates the mean accuracies and SDs of these experiments. It is clear that a hidden layer of 22 nodes attained the best performance.
http://www.i-joe.org
The classifications were repeated ten times for both classifiers in order to obtain more reliable results. Figure 10 highlights the classification rates for both the SVM and MLP classifiers. This figure shows that the MLP recorded a classification rate about 1.4% higher, but its results vary more across the ten runs (SD=1.4). Therefore, SVM provides a more stable performance.

Fig. 10. Classification rate for SVM and MLP
The proposed ECG segmentation procedure affects the classification performance. The algorithm found that the mean interval of the ECG signal is 294 samples, 147 on each side of the R point. This is 66 samples fewer than the expected interval (i.e., 360 samples). Figure 11 shows the difference between normal segmentation (a single beat every one second) and the proposed ECG segmentation algorithm. The classification rates of the two classifiers (SVM and MLP) were also examined and compared. The normal ECG segmentation procedure degrades the performance of our procedure to 77% and 80% for the SVM and MLP, respectively. This low performance was caused by the lack of synchronization between the single heartbeats (i.e., the heartbeats in the set have different phases), see Figure 11 (a). The proposed method improves the performance of the two classifiers by about 19% and 17% for the SVM and MLP, respectively. Figure 12 highlights the performance before and after applying the proposed segmentation algorithm.

In the proposed ECG segmentation algorithm, the mean and SD are used to determine the threshold value, but the ECG signal is corrupted with various artifacts and noise that influence its mean and SD, see Figure 13. Therefore, the algorithm uses the local mean and SD of the signal by windowing it with a 360-sample (1 sec.) window. The R point is determined by choosing the first single sample, or the medial sample of the first run of successive samples, that passes the threshold value. The ECG signal is used as an exponent of 1.1 in order to increase only the positive values of the signal exponentially, which increases the chance that the R point crosses the threshold value. This also prevents the negative values from disturbing the procedure, unlike other methods such as raising the signal to the power of 2. A small base (1.1) was used to avoid overflow of programming language variables.
After determining the indices of the R points for the whole dataset and the mean interval of a single heartbeat, the algorithm can segment the ECG signal correctly and make the R points lie at the center of each heartbeat.
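The effect of using the signal as an exponent of 1.1, rather than squaring it, can be seen on a toy beat. The sample values below are invented for illustration only: a positive R peak followed by a deeper negative deflection.

```python
import numpy as np

# Hypothetical samples: R peak at index 3, deep negative deflection at index 4.
beat = np.array([0.0, 0.1, -0.2, 1.0, -1.2, 0.2, 0.0])

exp_mapped = 1.1 ** beat   # monotonic mapping: negative samples stay below 1
squared = beat ** 2        # sign is lost: large negative samples become large

r_from_exp = int(np.argmax(exp_mapped))   # 3, the true R peak
r_from_square = int(np.argmax(squared))   # 4, the negative deflection

# With a base of 1.1 the result stays finite even for a sample value of 1000
# (a hypothetical raw amplitude chosen for illustration).
stays_finite = bool(np.isfinite(1.1 ** 1000.0))
```

Because the exponential mapping preserves the ordering of the samples, the largest positive sample remains the largest after the mapping, whereas squaring lets a large negative sample overtake the R peak.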

Conclusion
In real-time ECG systems, using the simplest methods to process the ECG signal and extract features for classification is a crucial priority, especially when ECG applications need a quick response to save human life. Our preprocessing procedure uses traditional analysis methods such as cross-correlation and the wavelet transform. Straightforward statistical measures (median, maximum … etc.) were used to extract features from the preprocessed ECG signal. The extracted features were then tested using SVM and MLP. SVM provides a reliable classification rate, as shown by its low classification error (SD=0.131), but its performance (96.2%) is about 1.4% lower than that of the MLP. Segmenting the ECG signal also affects the classification performance of SVM and MLP: excluding the proposed ECG segmentation algorithm degrades the performance by about 19% and 17% for SVM and MLP, respectively.