Real-Time Concept Drift Detection and Its Application to ECG Data

Prediction of cardiac disease is one of the most crucial topics in medical data analysis. The stochastic nature and time-varying behavior of electrocardiogram (ECG) signals make their characteristics difficult to investigate. Because such signals evolve over time, they require a dynamic predictive model. In the presence of concept drift, model performance degrades, so learning algorithms need an appropriate adaptive mechanism to handle drifting data streams accurately. This paper proposes a novel approach, the Corazon Concept Drift Detection Method (Corazon CDDM), to detect drifts in electrocardiogram signals and adapt to them in real time. The proposed methodology achieves competitive results compared with methods from the literature on synthetic, real-world, and time-series datasets.


Introduction
In the real world, data is normally nonstationary. In many demanding data analysis applications, data evolves over time and needs to be analyzed in near real time [1]. This causes problems because predictions may become less accurate as time passes, or opportunities to improve accuracy may be missed. Concept drift refers to scenarios in which the relationship between the input data and the target variable that the model is trying to predict changes over time [2]. The problem of concept drift is of growing importance as more and more data is organized as data streams rather than static databases, and it is unrealistic to expect that the data distribution stays stable over a long period. It is not surprising that concept drift is studied in several research areas, including machine learning, data mining, information retrieval, and recommender systems [3].
Let us consider a data stream as a sequence of samples $(X_k, y_k)$, $k = 1, 2, 3, \ldots$, where $X_k$ is a feature vector and $y_k$ is its class label. In evolving data streams, the data distribution changes over time. Concept drift between times $t_0$ and $t_1$ is defined as $\exists X : P_{t_0}(X, y) \neq P_{t_1}(X, y)$, where $P_t$ denotes the joint probability distribution of the input variables $X$ and the target variable $y$ at time $t$.
Concept drifts are generally distinguished by their behavior and speed of change. The drift speed [4] is the inverse of the length of the transition period between adjacent concepts. A high speed produces abrupt concept drift, whereas a low speed produces gradual concept drift. Moreover, previously seen concepts may reoccur. Concept drift adaptation methods are divided into active and passive approaches [5]. The active approach first finds the change points in the input data stream and then adapts the model, which is trigger-based adaptation. The passive approach continuously adapts the model every time new instances arrive. Drift detectors are the mechanisms that identify the change points. Concept drift detection methods are classified into the following categories [6]: a) sequential analysis methods: SPRT [7], PH [8], and CUSUM [8]; b) error rate-based detection methods: DDM [9], EDDM [10], ECDD [11], and RDDM [12]; c) data distribution-based detection methods: ADWIN [13] and FHDDM [14].
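The difference between abrupt and gradual drift can be illustrated with a short sketch. It assumes a sigmoid transition whose width plays the role of the transition period; the function names and the sigmoid form are illustrative choices, not taken from the paper.

```python
import math
import random


def new_concept_probability(t, drift_position, width):
    """Probability that the instance at time t comes from the new concept.

    Assumption: a sigmoid centred on drift_position; width is the length of
    the transition period. width <= 1 is treated as an abrupt switch.
    """
    if width <= 1:
        return 1.0 if t >= drift_position else 0.0
    return 1.0 / (1.0 + math.exp(-4.0 * (t - drift_position) / width))


def concept_sequence(n, drift_position, width, seed=0):
    """Return n flags: 0 = instance from the old concept, 1 = from the new one."""
    rng = random.Random(seed)
    return [
        int(rng.random() < new_concept_probability(t, drift_position, width))
        for t in range(n)
    ]


# Abrupt drift: the concept switches at once. Gradual drift: both concepts
# are mixed during a transition period of roughly `width` instances.
abrupt = concept_sequence(1000, drift_position=500, width=1)
gradual = concept_sequence(1000, drift_position=500, width=200)
```

In the gradual case, instances from both concepts are interleaved around the drift position, which is why slow drifts are typically harder to detect than abrupt ones.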

Related Work
This section reviews several drift detection algorithms and discusses the framework and logic of each.

Drift Detection Method (DDM)
DDM was proposed by Gama et al. [9]. It is the most widely referenced drift detection algorithm and monitors the error rate of the base learner. DDM defines two levels for concept drift detection: a warning level and a drift level. At time $t$, the error rate $p_t$ and its standard deviation $s_t = \sqrt{p_t (1 - p_t) / t}$ are estimated. The variables $p_{min}$ and $s_{min}$ are updated if $p_t + s_t < p_{min} + s_{min}$. DDM signals the warning state when $p_t + s_t \geq p_{min} + 2 s_{min}$ and the drift state when $p_t + s_t \geq p_{min} + 3 s_{min}$. Each time a drift is observed, the algorithm rebuilds its model from the stored instances.
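A minimal sketch of a DDM-style detector follows. It assumes the commonly cited thresholds (warning when the error rate plus its standard deviation exceeds the recorded minimum by two minimum standard deviations, drift at three) and a 30-instance warm-up; the class name `SimpleDDM` and its interface are hypothetical.

```python
import math


class SimpleDDM:
    """Minimal DDM-style detector over a stream of 0/1 prediction errors."""

    def __init__(self, min_instances=30, warning_factor=2.0, drift_factor=3.0):
        self.min_instances = min_instances    # warm-up length (assumption)
        self.warning_factor = warning_factor
        self.drift_factor = drift_factor
        self.reset()

    def reset(self):
        self.t = 0
        self.p = 0.0                          # running error rate p_t
        self.p_min = float("inf")
        self.s_min = float("inf")

    def update(self, error):
        """error is 1 for a misclassification, 0 otherwise. Returns the state."""
        self.t += 1
        self.p += (error - self.p) / self.t
        s = math.sqrt(self.p * (1.0 - self.p) / self.t)
        if self.t < self.min_instances:
            return "stable"
        if self.p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, s
        # Strict inequalities are used here so that an error-free stream
        # (p = s = 0) never triggers a signal.
        if self.p + s > self.p_min + self.drift_factor * self.s_min:
            self.reset()
            return "drift"
        if self.p + s > self.p_min + self.warning_factor * self.s_min:
            return "warning"
        return "stable"
```

Feeding the detector a stream with a 10% error rate followed by a run of consecutive errors should move it from "stable" through "warning" to "drift", after which its statistics are reset.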

Early Drift Detection Method (EDDM)
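To overcome the limitations of DDM, Baena-García et al. [10] proposed EDDM, which monitors the distance between two adjacent classification errors rather than the error rate. At time $t$, the mean distance $p_t$ and its standard deviation $s_t$ are estimated, and the variables $p_{max}$ and $s_{max}$ are updated if $p_t + 2 s_t > p_{max} + 2 s_{max}$. The warning state is raised when $(p_t + 2 s_t) / (p_{max} + 2 s_{max}) < \alpha$ and the drift state when $(p_t + 2 s_t) / (p_{max} + 2 s_{max}) < \beta$, where $\alpha > \beta$ are thresholds (0.95 and 0.90 in [10]). After a concept drift is detected, the stored instances are used to learn a new model.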
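The distance-based idea behind EDDM can be sketched as follows. The sketch assumes thresholds of 0.95 (warning) and 0.90 (drift) and a minimum of 30 observed errors before signalling; the class name `SimpleEDDM` and its interface are hypothetical.

```python
import math


class SimpleEDDM:
    """Minimal EDDM-style detector based on the distance between errors."""

    def __init__(self, alpha=0.95, beta=0.90, min_errors=30):
        self.alpha = alpha            # warning threshold
        self.beta = beta              # drift threshold
        self.min_errors = min_errors  # errors required before signalling
        self.reset()

    def reset(self):
        self.t = 0
        self.last_error_t = 0
        self.n_errors = 0
        self.mean = 0.0               # running mean distance between errors
        self.m2 = 0.0                 # running sum of squared deviations
        self.max_level = 0.0          # largest mean + 2 * std seen so far

    def update(self, error):
        """error is 1 for a misclassification, 0 otherwise. Returns the state."""
        self.t += 1
        if not error:
            return "stable"
        distance = self.t - self.last_error_t
        self.last_error_t = self.t
        self.n_errors += 1
        delta = distance - self.mean  # Welford update of mean and variance
        self.mean += delta / self.n_errors
        self.m2 += delta * (distance - self.mean)
        std = math.sqrt(self.m2 / self.n_errors)
        level = self.mean + 2.0 * std
        if level > self.max_level:
            self.max_level = level
            return "stable"
        if self.n_errors < self.min_errors:
            return "stable"
        ratio = level / self.max_level
        if ratio < self.beta:
            self.reset()
            return "drift"
        if ratio < self.alpha:
            return "warning"
        return "stable"
```

Because the statistic is updated only when an error occurs, the detector reacts to errors arriving closer together, which is what makes this family of detectors suited to slow, gradual changes.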

ADaptive WINdowing (ADWIN)
ADWIN is a drift detection approach based on two sub-windows, proposed by Bifet et al. [13]. Rather than fixing the window size in advance, ADWIN maintains a window $W$ of variable total size $n$. It then considers splits of $W$ into two sub-windows, $w_{old}$ and $w_{new}$, and computes the variation rate between them.
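The sub-window comparison can be sketched with a naive implementation. It assumes the usual ADWIN cut condition, in which the window is cut when the means of the two sub-windows differ by at least $\sqrt{\ln(4n/\delta) / (2m)}$, with $m$ the harmonic-mean term of the two sub-window sizes. It scans every split explicitly and drops the whole old sub-window at once, so it is a simplification for illustration rather than the bucket-based algorithm; `NaiveAdwin` is a hypothetical name.

```python
import math
from collections import deque


class NaiveAdwin:
    """Naive ADWIN-style detector: explicit scan over all window splits."""

    def __init__(self, delta=0.002, max_window=1000):
        self.delta = delta                       # confidence parameter
        self.window = deque(maxlen=max_window)   # bounded for simplicity

    def _find_cut(self):
        """Return the size of w_old at the first significant split, or 0."""
        n = len(self.window)
        total = sum(self.window)
        log_term = math.log(4.0 * n / self.delta)
        old_sum = 0.0
        for n0, x in enumerate(self.window, start=1):
            if n0 == n:
                break
            old_sum += x
            n1 = n - n0
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            eps_cut = math.sqrt(log_term / (2.0 * m))
            if abs(old_sum / n0 - (total - old_sum) / n1) >= eps_cut:
                return n0
        return 0

    def update(self, value):
        """Add a value in [0, 1]; return True if the window was cut."""
        self.window.append(value)
        detected = False
        while len(self.window) >= 2:
            cut = self._find_cut()
            if cut == 0:
                break
            for _ in range(cut):                 # discard w_old
                self.window.popleft()
            detected = True
        return detected
```

After a cut, the window contains only data from the newer sub-window, so the window length itself adapts to the rate of change in the stream.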

Accurate Concept Drift Detection Method (ACDDM)
This algorithm, proposed by M.M.W. Yan [15] in 2020, detects inconsistency in the error rates by applying Hoeffding's inequality to the prequential error rates. The current error rate of the base learner is used to trigger concept drift detection. Consider a sequence of instances $(X_i, y_i)$, $i = 1, 2, 3, \ldots, n$, used to test the model. The prequential error is calculated as $E = \frac{1}{n} \sum_{i=1}^{n} L(\hat{y}_i, y_i)$, where $L$ is the loss function, $\hat{y}_i$ is the prediction, and $y_i$ is the observed value.
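The following sketch shows how a prequential error and a Hoeffding bound could be combined to flag inconsistent error rates. It is not the ACDDM algorithm of [15]: the 0-1 loss, the comparison of an older and a newer error estimate, and all function names are assumptions made for illustration.

```python
import math


def zero_one_loss(y_pred, y_true):
    """Loss L: 0 for a correct prediction, 1 otherwise."""
    return 0.0 if y_pred == y_true else 1.0


def prequential_error(predictions, labels, loss=zero_one_loss):
    """Average loss over instances that were tested before being trained on."""
    return sum(loss(p, y) for p, y in zip(predictions, labels)) / len(labels)


def hoeffding_bound(n, delta=0.05, value_range=1.0):
    """Deviation of a mean of n bounded values that holds with prob. 1 - delta."""
    return value_range * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def error_rates_inconsistent(err_old, n_old, err_new, n_new, delta=0.05):
    """Flag a change when the newer error rate exceeds the older one by more
    than the sum of both Hoeffding bounds (a conservative, assumed test)."""
    margin = hoeffding_bound(n_old, delta) + hoeffding_bound(n_new, delta)
    return err_new - err_old > margin
```

For example, an increase from 10% error over 500 instances to 40% over 200 instances exceeds the margin and would be flagged, whereas an increase to 15% would not.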

Corazon Concept Drift Detection Method (Corazon CDDM)
Fig. 1 describes the proposed Corazon Concept Drift Detection Method (Corazon CDDM) for detecting and adapting to drift in input data in real time. The drift detector is responsible for detecting variations in the stream data. After a drift is detected, a feedback mechanism is applied to retrain the model. When the drift detector reports a change, the adaptation process, labeled the model retraining manager, is activated. The retraining manager takes the previously learned model into account before proceeding with retraining. The objective is not to discard the previously learned model but to fit it partially during retraining.
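The feedback loop described above can be sketched as a test-then-train loop in which a detected drift causes the current model to be cloned and training to continue incrementally on the clone. Everything named here is a stand-in: `OnlinePerceptron` replaces the actual base learners, `make_window_detector` replaces the actual drift detectors, and `run_stream` is a hypothetical name for the retraining manager.

```python
import copy
from collections import deque


class OnlinePerceptron:
    """Tiny stand-in for any incremental learner with predict/partial_fit."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        score = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if score >= 0 else 0

    def partial_fit(self, x, y):
        err = y - self.predict(x)
        if err:
            self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * err


def make_window_detector(window=4, threshold=0.5):
    """Stand-in detector: signals when the recent error rate reaches threshold."""
    recent = deque(maxlen=window)

    def update(error):
        recent.append(error)
        if len(recent) == window and sum(recent) / window >= threshold:
            recent.clear()
            return True
        return False

    return update


def run_stream(stream, model, detector_update):
    """Test-then-train loop with clone-and-continue adaptation."""
    correct, drifts, snapshots = 0, [], []
    for t, (x, y) in enumerate(stream):
        y_hat = model.predict(x)
        correct += int(y_hat == y)
        if detector_update(int(y_hat != y)):
            drifts.append(t)
            snapshots.append(model)        # previously learned model is kept
            model = copy.deepcopy(model)   # clone starts from learned weights
        model.partial_fit(x, y)            # partial (incremental) retraining
    return model, correct / max(len(stream), 1), drifts, snapshots
```

The design choice illustrated is that adaptation starts from the learned parameters rather than from a freshly initialized model, so knowledge that is still valid after the drift is not thrown away.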

Dataset descriptions
Synthetic datasets are created using the Agrawal generator in Massive Online Analysis (MOA). The following real-world and synthetically generated datasets are used in the experiments.
Agrawal. This stream generator [16] has 6 numeric and 3 categorical attributes that describe randomly generated loan applications. Ten different functions are available to decide whether a loan should be approved, and switching between functions produces concept drift. We used six of the functions with 5% perturbation noise.
Electricity. This widely used dataset [17] was collected from the Australian New South Wales Electricity Market. It contains a total of 45,312 instances, each referring to a period of 30 minutes. The class label reflects the change in price (increase or decrease) relative to the average of the last 24 hours.
MIT-BIH Arrhythmia Dataset. This database [18] contains 48 heartbeat records sampled at 360 Hz from 47 patients. Each record contains two ECG leads. We generated the synthetic streams by introducing drifts at regular intervals over the total number of instances. The details are given in Table 1. The advantage of generating evolving data streams in this way is that it introduces heterogeneous concept changes with variable durations.
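Placing drifts at regular intervals can be sketched as below. The even spacing and the example of 60,000 instances split across six concepts are assumptions chosen for illustration; the actual positions used in the experiments are those listed in Table 1.

```python
from bisect import bisect_right


def drift_positions(n_instances, n_concepts):
    """Indices at which a new concept starts, assuming evenly spaced drifts."""
    segment = n_instances // n_concepts
    return [k * segment for k in range(1, n_concepts)]


def concept_at(t, positions):
    """Index of the concept that is active at time step t."""
    return bisect_right(positions, t)


# Illustrative values only: 60,000 instances and six concepts.
positions = drift_positions(60_000, 6)
```

A generator would then pick its labeling function with `concept_at(t, positions)` for each time step `t`.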

Algorithm implementation
Algorithm 1 describes the overall model used for training on the input data. The model m is trained on the input data through the fit(x, y) function. Algorithm 2 detects drift in the input stream and adapts to it in real time. Once a drift is detected, the previously learned model is cloned and relearning continues on the newly arriving data.

SGD, Naive Bayes, and VFDT [19] are used as base learners. For the synthetic datasets, the drift detection program is run 5 times, each with a different random sequence. Whenever concept drift is found, the algorithm relearns the model as described in Algorithm 2, and the precision, recall, and prequential accuracy are then computed. Precision and recall are two significant model evaluation metrics. Precision, also known as positive predictive value, is the fraction of instances predicted as positive that are actually positive, while recall, also known as sensitivity, is the fraction of actual positive instances that the algorithm classifies correctly.

Table 2 and Table 3 list the results when SGD is used as the base classifier. ACDDM gives the highest accuracy on the Agrawal-60,000, Agrawal-600,000, and MIT-BIH datasets, while ADWIN performs better on the Electricity dataset.

Table 4 and Table 5 list the results when Naive Bayes is used as the base classifier. Overall, Naive Bayes does not produce satisfactory results except on the Electricity dataset. For the Agrawal-60,000 and Agrawal-600,000 datasets, all drift detection methods reach almost the same accuracy of 68.64%. Because MIT-BIH is a multiclass dataset, Naive Bayes does not perform well on it.

Table 6 and Table 7 describe the results of the experiments using VFDT as the base learner. For the Agrawal-60,000, Agrawal-600,000, and Electricity datasets, ACDDM achieves the highest accuracy, but DDM also shows competitive results. For the MIT-BIH and Electricity datasets, the results of DDM and ACDDM are the same.
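The evaluation metrics named above can be computed as in the following sketch, which assumes binary labels with a designated positive class; the function names are illustrative.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def prequential_accuracy(y_true, y_pred):
    """Running accuracy after each instance, where every prediction is made
    before the corresponding label is used for training."""
    correct, curve = 0, []
    for i, (t, p) in enumerate(zip(y_true, y_pred), start=1):
        correct += int(t == p)
        curve.append(correct / i)
    return curve


y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 0, 1, 0, 1]
precision, recall = precision_recall(y_true, y_pred)
accuracy_curve = prequential_accuracy(y_true, y_pred)
```

For a multiclass dataset such as MIT-BIH, the per-class values would have to be averaged over the classes; which averaging scheme the experiments used is not stated here.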
Fig. 2 shows the accuracy results on the Electricity dataset. Real-world data streams are prone to frequent changes in data distribution. For the Electricity dataset, the ADWIN drift detector with SGD as the base learner produces the highest accuracy, 85.28%. Overall, the Corazon CDDM method with the ADWIN drift detector gives slightly higher accuracy than the other drift detectors reported in the literature. For the MIT-BIH dataset, as shown in Fig. 3, the ACDDM detector achieves an accuracy of 94.97% and the DDM detector 89.11% when VFDT is used as the base learner with the Corazon CDDM approach.
After analyzing all experimental results on the Agrawal synthetic dataset, we found that the Corazon CDDM methodology with the ACDDM drift detector achieves the maximum accuracy, precision, and recall. Compared with the ACDDM drift detector performance reported in [15], Corazon CDDM shows a marked improvement in accuracy when VFDT is used as the base learner. Table 8 gives a comparative summary of these results. The accuracy of the ACDDM drift detector increased from 76.02 ± 1.09% to 95.45 ± 1.86% on the Agrawal-60,000 dataset and from 84.85 ± 1.27% to 95.54 ± 1.92% on the Agrawal-600,000 dataset. In summary, the proposed approach performs well on the synthetic datasets and the real-world MIT-BIH dataset, because these datasets contain a sufficient number of instances for learning. In contrast, a real-world dataset such as Electricity did not reach the expected accuracy with the proposed approach, owing to the lack of sufficient learning instances. Overall, the combination of the incremental learning process of VFDT and the feedback mechanism of Corazon CDDM is well suited to both synthetic and real-world stream datasets.

Conclusion and Future Work
In this paper, a novel approach to concept drift detection and adaptation in time-series data, namely MIT-BIH, is introduced and evaluated with various detectors. To date, ECG signals had not been analyzed for the presence of drift, and thus no methodology had been proposed for detecting drift in them. The Corazon CDDM approach uses a feedback mechanism after a drift is detected and takes the previously learned model into account while proceeding with retraining. Experimental results on the synthetic datasets and the MIT-BIH dataset show the effectiveness of the proposed adaptive approach. Future work will focus on evaluating the adaptive approach with deep learning algorithms as base learners to improve performance.