Seizure Detection in Epileptic EEG Using Short-Time Fourier Transform and Support Vector Machine

Abstract: Epilepsy is one of the most common neurological diseases. The electroencephalogram (EEG) is the primary tool for observing epilepsy. The detection and prediction of seizures in EEG signals require multi-domain analysis, in which time-domain analysis is combined with other approaches for feature extraction. In this study, a method for detecting seizures in epileptic EEG is proposed that analyzes the distribution of the signal spectrum over time. The EEG signal, which includes normal, inter-ictal, and ictal classes, is transformed into the time-frequency domain using the Short-Time Fourier Transform (STFT). To find the highest detection accuracy, simulations were carried out with varying window length, overlap, and number of FFT points. The frequency distribution and first-order statistics were then calculated as feature vectors for the classification process. A support vector machine (SVM) was employed to evaluate the proposed method. The simulation results showed a highest accuracy of 92.3% using 25-20-512 STFT and a quadratic SVM. The proposed method is expected to serve as a basis for detecting and predicting seizures in long-term EEG recordings or real-time EEG monitoring of epilepsy.


Introduction
Electroencephalogram (EEG) signal recording is one of the main tools in diagnosing epilepsy [1]. Observation of the EEG signal in cases of epilepsy is an essential part of detecting, analyzing, and determining treatment [2]. The localization of the epileptogenic zone can also be analyzed using EEG [3]. The main task is to distinguish seizure from non-seizure EEG signals, where the non-seizure signals include normal, pre-ictal, and inter-ictal. Each type of EEG signal has a distinctive shape and rhythm that allows it to be observed visually [4]. However, visual inspection becomes difficult for long-term EEG recordings, especially if the observation is multi-channel. Therefore, a system that is able to detect and evaluate the onset of seizures automatically is required [5].
Numerous algorithms and methods for epileptic EEG classification and detection have been proposed and reported in related studies over the past few decades. Time-domain analysis, frequency-domain analysis, and signal complexity analysis have been used for feature extraction and combined with machine learning in the classification stage. A time-domain approach to epileptic EEG classification was reported in [6]-[8]. In those studies, first-order statistical calculations were used for feature extraction, and the simulation results showed excellent classification accuracy.
On the other hand, feature extraction based on frequency analysis was reported in [9]-[12]. Seizure or ictal EEG has dominant spectral power at low frequencies compared to normal EEG. However, observations in the frequency domain are susceptible to both low- and high-frequency noise, which commonly contaminates EEG signals.
Recently, feature extraction using signal complexity analysis or signal dynamics has been widely applied to EEG signals [13]. The disruption of electrical activity in the brain causes changes in the degree of complexity of the EEG signal. Decreased signal complexity is often associated with impaired brain function. Estimation of signal complexity using entropy and fractal dimension is becoming a popular method for feature extraction. Entropy-based feature extraction methods for detecting EEG seizures have been reported in [14]-[17].
Meanwhile, fractal calculations for epileptic EEG detection were simulated in [18]-[20]. According to these studies, complexity features combined with various classifiers were able to achieve high accuracy. Multi-domain analysis, such as the combination of power spectral and complexity features, was also reported in [5], [21].
Although these studies were successful in epileptic EEG classification, research has since moved toward detecting and predicting seizures in long-term EEG recordings [19], [22]. This requires a time-based analysis method combined with other feature extraction methods, so that the features can be observed as they evolve over time t. In this paper, a seizure detection method for EEG signals using the Short-Time Fourier Transform is proposed. With the proposed method, information on the power spectral distribution is obtained over the observed time. First-order statistics are calculated as a feature vector, which is then classified using a support vector machine. The proposed system is expected to be applicable to detecting and predicting ictal EEG in the EEG recordings of epilepsy patients.
As a guide, this paper is organized as follows: Section I discusses the issues regarding epileptic EEG, and Section II explains the EEG dataset and the methods used in this study. The results and discussion of the simulations are presented in Section III. Section IV contains an overview and conclusion of the research, followed by implications and future work.

Material and methods
Figure 1 shows the proposed system for the classification of seizure EEG signals using STFT and statistical calculations. The process includes preprocessing, windowing, transformation, feature extraction, and validation. The details of each process are described in the following subsections.

EEG dataset
The EEG signal data used in this study were obtained from the University of Bonn [23]. The signals have a sampling frequency of 173.61 Hz and a spectral bandwidth of 0.5-85 Hz. This study used three of the five available data classes: O (normal), N (interictal), and S (seizure). Each class consists of 100 EEG signals. Class O was recorded on the surface of the scalp, class N was recorded with intracranial electrodes from epileptic patients during seizure-free intervals, and class S was recorded from epileptic patients during seizure activity. Examples of each class are presented in Figure 2.

Preprocessing the signal
In the initial stage, the EEG signal was preprocessed by amplitude normalization and mean removal, following Equations (1) and (2), where X(n) is the n-th EEG signal sample and N is the total number of samples. This process yielded a signal with a mean of 0 and a range of −1 to 1.
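A minimal sketch of this preprocessing step is given below. Equations (1) and (2) are not reproduced here, so the exact form is an assumption: the mean is removed first and the result is divided by its largest absolute value, which yields a zero-mean signal bounded by −1 and 1. The function name `preprocess` and the synthetic test signal are illustrative only.

```python
import numpy as np

def preprocess(x):
    """Remove the mean, then scale so the largest absolute sample equals 1.

    Assumed order of operations; the paper's Equations (1)-(2) may differ.
    """
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()              # mean removal
    peak = np.max(np.abs(centered))      # largest absolute amplitude
    if peak == 0:                        # constant signal: nothing to scale
        return centered
    return centered / peak               # amplitude normalization

# Synthetic stand-in for one EEG record (not real data)
rng = np.random.default_rng(0)
signal = 50.0 + 20.0 * rng.standard_normal(1000)
clean = preprocess(signal)
```

Removing the mean before scaling keeps both properties at once; scaling first and centering afterwards could push samples outside the −1 to 1 range.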

Short-time Fourier transform
The Short-Time Fourier Transform (STFT), or spectrogram, transforms signals from the time domain to the time-frequency domain. In this algorithm, the signal is segmented at a particular time interval, and each segment is transformed into the frequency domain using the FFT. Equation (3) represents the mathematical expression of the STFT, where x(k) is the time-series signal, g(k) is the window function used in the STFT, and $e^{-j2\pi nk/L}$ represents the FFT process.
The STFT is illustrated in Figure 3. The STFT samples the signal within the specified window and performs an FFT on each segment. The results are then arranged into a representation of the signal in the time-frequency domain. $X_{STFT}/\max(X_{STFT})$ was then calculated to obtain the characteristics of the signal.
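The segmentation, windowing, and FFT steps can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the implementation used in the study: the Kaiser shape parameter `beta`, the function names, and the synthetic sinusoid are hypothetical, and the normalization simply divides the magnitude matrix by its global maximum. How the normalized matrix is reduced to a single vector is not covered by this sketch.

```python
import numpy as np

def stft_magnitude(x, win_len=25, overlap=20, n_fft=512, beta=5.0):
    """Magnitude STFT; rows are frequency bins, columns are time segments."""
    x = np.asarray(x, dtype=float)
    hop = win_len - overlap                   # window shift in samples
    window = np.kaiser(win_len, beta)         # beta is an assumed value
    n_seg = 1 + (len(x) - win_len) // hop     # segments that fit entirely
    frames = np.stack([x[i * hop:i * hop + win_len] * window
                       for i in range(n_seg)])
    # rfft zero-pads each frame to n_fft and keeps the one-sided spectrum
    spec = np.fft.rfft(frames, n=n_fft, axis=1)
    return np.abs(spec).T                     # shape (n_fft // 2 + 1, n_seg)

def normalize_by_max(mag):
    """Divide the magnitude matrix by its global maximum."""
    return mag / mag.max()

fs = 173.61                                   # sampling frequency from the dataset
t = np.arange(1000) / fs
x = np.sin(2 * np.pi * 10.0 * t)              # synthetic 10 Hz tone, not EEG
mag = stft_magnitude(x, win_len=25, overlap=20, n_fft=512)
norm = normalize_by_max(mag)
```

Because the input is real, only the one-sided spectrum is kept, so a 512-point FFT gives 257 frequency rows in this sketch.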

First-order statistics
The output of the previous process was a 1 × N matrix, where N is the number of FFT points. Since N was still quite large, it was reduced by calculating first-order statistics, which were then used as the feature vector [24]. These features were mean, variance, skewness, kurtosis, and entropy. Thus, regardless of the length of the signal, five features were generated. The mathematical expressions of the five statistical features are given in Equations (4)-(8),
where $Y_i$ is a data point, N is the number of data points, and pdf(y) is the probability of the value y in the EEG signal.
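The five features can be sketched as below. Equations (4)-(8) are not reproduced here, so this uses common textbook definitions as an assumption: population (divide-by-N) moments, non-excess kurtosis, and Shannon entropy in bits estimated from a histogram. The bin count `n_bins` and the function name are hypothetical choices.

```python
import numpy as np

def first_order_features(y, n_bins=32):
    """Return [mean, variance, skewness, kurtosis, entropy] of a 1-D array."""
    y = np.asarray(y, dtype=float)
    mean = y.mean()
    var = y.var()                          # population variance (divides by N)
    std = np.sqrt(var)
    if std > 0:
        z = (y - mean) / std               # standardized values
        skew = np.mean(z ** 3)             # third standardized moment
        kurt = np.mean(z ** 4)             # fourth standardized moment
    else:
        skew, kurt = 0.0, 0.0              # constant input: moments undefined
    counts, _ = np.histogram(y, bins=n_bins)
    p = counts[counts > 0] / counts.sum()  # empirical probabilities
    entropy = -np.sum(p * np.log2(p))      # Shannon entropy in bits
    return np.array([mean, var, skew, kurt, entropy])

features_short = first_order_features(np.arange(1.0, 6.0))
features_long = first_order_features(np.linspace(0.0, 1.0, 257))
```

Both calls return five values even though the inputs differ in length, which is the property the text relies on.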

Support vector machine
In this study, the Support Vector Machine (SVM) was used to classify the signal based on the statistical parameters as input. SVM works by using a hyperplane to separate data between classes. The data points or features closest to the hyperplane are called support vectors. The distance between the hyperplane and the support vectors is called the margin. In SVM, the key task is to find the hyperplane that maximizes the margin to the support vectors [25].
As a reminder, the epileptic EEG signals are classified into three classes: normal, interictal, and ictal. In its basic form, SVM separates the classes with a linear hyperplane. In some cases, it is necessary to modify the hyperplane using kernel functions to get the highest classification accuracy [26]. In this study, therefore, a variety of kernel functions were used to get the best performance. The linear kernel is the simplest kernel function and is often equivalent to its non-kernel counterpart. It is given by the inner product $x_i^T x_j$ plus an optional constant c, as shown in Equation (9) [12].
Then, the quadratic kernel can be expressed in Equation (10).
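As a hedged illustration of the kernels discussed in this subsection, the sketch below uses the common polynomial form $(x_i^T x_j + c)^d$, with d = 2 for the quadratic kernel and d = 3 for the cubic one. Equations (9)-(11) are not reproduced here, so this form, the default constants, and the function names are assumptions rather than the paper's exact definitions.

```python
import numpy as np

def linear_kernel(xi, xj, c=0.0):
    """Inner product of two feature vectors plus an optional constant c."""
    return float(np.dot(xi, xj) + c)

def polynomial_kernel(xi, xj, degree, c=1.0):
    """Polynomial kernel (xi . xj + c) ** degree; the value of c is assumed."""
    return float((np.dot(xi, xj) + c) ** degree)

# Two toy feature vectors (illustrative values only)
a = np.array([1.0, 2.0])
b = np.array([3.0, 0.5])

k_linear = linear_kernel(a, b)            # 1*3 + 2*0.5 = 4
k_quadratic = polynomial_kernel(a, b, 2)  # (4 + 1) ** 2 = 25
k_cubic = polynomial_kernel(a, b, 3)      # (4 + 1) ** 3 = 125
```

A higher degree lets the decision boundary bend more in the original feature space, at the cost of a greater risk of overfitting on small datasets.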
Meanwhile, the cubic kernel can be expressed in Equation (11).

Results and discussion
Figure 4 shows example STFT plots of a seizure EEG signal using a 256-point FFT with three sets of windowing parameters. The three plots differ mainly in the detail at high frequencies. At a resolution of 25-20-256, the signal is segmented with a window length of 25 samples and an overlap of 20, so the window shifts by only 5 samples and the STFT plot looks very dense. This contrasts with 200-100-256: a window length of 200 with an overlap of 100 samples gives a window shift of 100 samples, which makes the 200-100-256 plot the lowest in resolution of the three.
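The window shift and the resulting number of segments follow directly from the window length and overlap, as the short calculation below illustrates. The record length of 4097 samples is an assumption about the dataset that is not stated in the text, and the function names are hypothetical.

```python
def window_shift(win_len, overlap):
    """Number of samples the window advances between segments."""
    return win_len - overlap

def n_segments(n_samples, win_len, overlap):
    """Number of full windows that fit into a record (no padding)."""
    hop = window_shift(win_len, overlap)
    return 1 + (n_samples - win_len) // hop

n_samples = 4097                              # assumed record length

shift_fine = window_shift(25, 20)             # 5 samples
shift_coarse = window_shift(200, 100)         # 100 samples
segs_fine = n_segments(n_samples, 25, 20)     # 815 segments
segs_coarse = n_segments(n_samples, 200, 100) # 39 segments
```

Under this assumption the 25-20 setting produces roughly twenty times more time segments than 200-100, which is why its plot appears much denser.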
The result of the $X_{STFT}/\max(X_{STFT})$ calculation is shown in Figure 5. The main difference is that with a 256-point FFT the graph lies closer to the vertical axis than with a 512-point FFT. This is related to the mapping of Fs/2 onto the N points of the FFT process in the STFT. The $X_{STFT}/\max(X_{STFT})$ output is quite large in dimension, so statistical parameters are calculated from it and used as the features of each class of EEG signal. These statistical parameters serve as a data reduction method that makes the classification process simpler and faster. Figure 6 shows boxplots of the five statistical parameters that are input to the SVM. In general, the seizure EEG signal produces a broader range for all statistical parameters except entropy. This is because the EEG signal in the seizure condition fluctuates more than in the other two data classes. The parameter ranges of the classes tend to overlap, so the accuracy is not expected to reach 100%. The classification accuracy of the tests using the 512-point FFT and the 256-point FFT can be seen in Figure 7 and Figure 8, respectively.
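The effect of the FFT size on the frequency axis can be illustrated with a short calculation. The sampling frequency is taken from the dataset description; the 30 Hz probe frequency and the function names are arbitrary choices for illustration.

```python
fs = 173.61  # sampling frequency in Hz

def bin_spacing(fs, n_fft):
    """Frequency distance between adjacent FFT bins in Hz."""
    return fs / n_fft

def bin_index(freq_hz, fs, n_fft):
    """Index of the FFT bin nearest to a given frequency."""
    return round(freq_hz * n_fft / fs)

spacing_256 = bin_spacing(fs, 256)   # about 0.678 Hz per bin
spacing_512 = bin_spacing(fs, 512)   # about 0.339 Hz per bin
idx_256 = bin_index(30.0, fs, 256)   # 30 Hz falls near bin 44
idx_512 = bin_index(30.0, fs, 512)   # 30 Hz falls near bin 88
```

The same physical frequency lands at half the bin index with 256 points, so a curve plotted against bin index sits closer to the vertical axis.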
The highest accuracy is 92.3%, obtained using 25-20-512 with the quadratic SVM. The windowing parameter 25-20-N produces the highest accuracy for every SVM kernel used, whereas the number of FFT points N has little effect on the highest accuracy achieved. In general, this indicates a correlation between STFT resolution and accuracy, where a more detailed STFT yields higher accuracy.
The number of features generated by the proposed method is not affected by the length of the input EEG signal data. Regardless of the length of the input data, five statistical features will be generated from the feature extraction process. The dimensions of each process are shown in Figure 9.
In Figure 9, K is the length of the EEG signal data, N is the number of FFT points, and L is the number of signal segments resulting from the windowing process. The $\max(X_{STFT})$ matrix has dimension L × 1, so that $X_{STFT}/\max(X_{STFT})$ has dimension N × 1. The calculation of statistical features at the end of the process produces a 1 × 5 vector. This is an advantage of the proposed method because the resulting features do not depend on the signal length. Meanwhile, the accuracy is determined by the STFT resolution used, as shown in Figure 7 and Figure 8.
Table 1 compares the proposed method with several previous studies that used three data classes. With a comparable number of features, the accuracy of the proposed method is the lowest among the compared studies. However, the proposed method still has room for improvement. The selection of the window function and the resolution for signal segmentation can be explored further. In this study, the Kaiser window was used with three resolutions; trials using different window functions are expected to produce higher accuracy. In addition, the STFT was computed only with 256 and 512 FFT points, and other numbers of FFT points are expected to produce different accuracy.

Conclusion
The EEG signal is non-stationary, so analysis in the time-frequency domain is well suited to it. In this study, an STFT-based feature extraction method was proposed to classify seizure EEG signals. The magnitude of the signal was normalized, and the statistical parameters were calculated. Using this method, five features were obtained regardless of the length of the input EEG signal data. The highest accuracy achieved was 92.3% using the quadratic SVM. Although this accuracy is lower than that of the previous studies compared in Table 1, the method requires only five features regardless of signal length. The STFT is highly dependent on the selection of the window type, the window length, and the number of FFT points used. This research shows that an STFT with a higher resolution produces higher accuracy. Time-frequency domain analysis has the potential to be applied to seizure prediction in real-time EEG recording. Furthermore, to our knowledge, the selection of the window function and the appropriate STFT resolution for EEG signal processing has not been studied in depth. This is an interesting research topic for the future.

Sugondo Hadiyoso received the Master's degree in Electrical-Telecommunication Engineering from Telkom University, Bandung, Indonesia, in March 2012. His research interests are wireless sensor networks, embedded systems, logic design on FPGA, and biomedical engineering. Since 2018, he has been a doctoral student in electrical engineering at the Bandung Institute of Technology. The focus of his doctoral research is signal processing and analysis of EEG waves. (email: sugondo@telkomuniversity.ac.id).