Methods of Extracting Feature from Photoplethysmogram Waveform for Non-Invasive Diagnostic Applications

This paper presents a bibliographical survey of recently published research on techniques for extracting features from photoplethysmograms (PPG). These techniques have been implemented to increase the accuracy of disease detection. Several aspects of PPG waveform analysis, including feature extraction techniques, the parameters involved, and performance comparisons, are discussed. This review is intended to serve as a comparative study and reference for researchers working on PPG waveforms in healthcare applications.


Introduction
Photoplethysmography (PPG) is a noninvasive optical technique for determining changes in blood volume or blood flow with each heartbeat [1]. It is performed using a pulse oximeter (POx), in which light-emitting diodes transmit light through tissue to photodiodes [2] that measure changes in light absorption. PPG can also describe human vascular hemodynamics. The cardiovascular system includes the heart, whose rhythmic activity is represented by systole and diastole [3]. PPG signals consist of two components: a pulsatile part (alternating current [AC] component) from arterial and venous blood and a quasi-static non-pulsatile part (direct current [DC] component) attributed to tissue and venous blood [4]. The pulsatile component of a PPG signal may carry descriptive data on vascular health, such as heart rate (HR) variability, blood pressure (BP), and respiration [5]. The technology is used in commercially available medical devices, such as POx, which is now part of standard patient observations [6]. One pulse of a PPG waveform is composed of a systolic wave and a diastolic wave, the latter also known as the reflected wave [7]. Figure 1 shows the characteristics of PPG and the position of the peaks in a single PPG waveform. Blood volume increases during systole because of the contraction of the heart ventricle, which reduces light transmission through the peripheral vasculature; the opposite occurs during diastole [8]. The amount of light detected by the receiver therefore decreases as blood volume increases during systole. The received light signal is then inverted and defined as the systolic wave.
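The split into AC and DC components described above can be illustrated with a short sketch. It assumes that the DC component is approximated by a moving average over one pulse period; this is a common simplification and not necessarily the filter used in the cited studies. The function name `split_ac_dc`, the window length, and the synthetic signal are illustrative assumptions.

```python
import math

def split_ac_dc(signal, window):
    """Split a signal into AC and DC parts using a moving average.

    DC is the mean over `window` consecutive samples; AC is the sample at
    the window centre minus that mean. Only positions with a full window
    are returned, so both lists have len(signal) - window + 1 entries.
    """
    if window < 1 or window > len(signal):
        raise ValueError("window must be between 1 and len(signal)")
    half = window // 2
    ac, dc = [], []
    for start in range(len(signal) - window + 1):
        mean = sum(signal[start:start + window]) / window
        dc.append(mean)
        ac.append(signal[start + half] - mean)
    return ac, dc

# Synthetic stand-in for a PPG recording (not real data): a constant
# baseline of 2.0 plus a small pulsatile part with a 100-sample period.
signal = [2.0 + 0.1 * math.sin(2.0 * math.pi * k / 100) for k in range(300)]
ac, dc = split_ac_dc(signal, window=100)
```

With the window equal to one pulse period, the pulsatile part averages out, so `dc` stays near the baseline and `ac` retains the oscillation.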
The wave contour of a PPG signal appears simple but has rarely been analyzed directly because changes in the phase of its inflections are difficult to detect; thus, researchers have introduced the first (FDPPG) and second (SDPPG) derivatives of the PPG signal to simplify the interpretation of the original PPG wave [9]. The waveforms of the original PPG, FDPPG, and SDPPG are shown in Figures 2a, 2b, and 2c, respectively. The use of PPG signal derivatives has been proposed to highlight and locate inflection points because the derivatives make these points easier to locate [10]. PPG signifies blood movement in the vessels; thus, the FDPPG represents the velocity of blood, and the SDPPG represents the acceleration of blood [11]. PPG is a simple technique and, unlike the electrocardiogram (ECG), does not require the attachment of electrodes to a patient's chest. The application of this technique in the healthcare industry can be of great help in chronic disease management [12]. The aim of this article is to study several methods for analyzing raw PPG waveforms and their derivatives in terms of feature extraction techniques, the parameters involved, and their performance. This review also discusses the overall challenges that need to be overcome in future work and the multiple techniques for assessing PPG waveforms for healthcare applications.
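As a minimal sketch of how the FDPPG and SDPPG can be obtained from a sampled PPG, the following code applies central finite differences twice. The function name `central_derivative`, the sampling frequency `fs`, and the sinusoidal test signal are assumptions made for illustration; the surveyed studies may use other differentiators and usually filter the signal first.

```python
import math

def central_derivative(signal, fs):
    """Approximate the time derivative of a uniformly sampled signal.

    Central differences are used in the interior and one-sided
    differences at the two end points. `fs` is the sampling rate in Hz.
    """
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples")
    dt = 1.0 / fs
    out = [0.0] * n
    out[0] = (signal[1] - signal[0]) / dt
    out[-1] = (signal[-1] - signal[-2]) / dt
    for i in range(1, n - 1):
        out[i] = (signal[i + 1] - signal[i - 1]) / (2.0 * dt)
    return out

# Synthetic stand-in for a PPG pulse train (not real data): a 1 Hz sine.
fs = 100.0
ppg = [math.sin(2.0 * math.pi * i / fs) for i in range(200)]
fdppg = central_derivative(ppg, fs)    # first derivative ("velocity")
sdppg = central_derivative(fdppg, fs)  # second derivative ("acceleration")
```

Differentiation amplifies high-frequency noise, so in practice a smoothing step before each derivative is usually needed.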

Photoplethysmogram Waveform
Researchers have studied PPG morphology because it can be obtained noninvasively and at much lower cost than other techniques. This section reviews a number of methods that use PPG morphology to estimate disease and to study the factors that change PPG morphology. Categorizing these methods can be difficult because they focus on different parameters, but they can be classified as follows: peak analysis, Gaussian method, cardioid-based (CB) graph, and fiducial point analysis. Table 1 compares the techniques used in assessing PPG morphology.
Based on Table 1, most researchers tend to use peak analysis in analyzing basic PPG signals. Some researchers combined PPG with other biomedical devices, such as ECG and electromechanical film (EMFi). The peak analysis technique is simple because it focuses only on the basic features of PPG morphology, such as systolic and diastolic peaks. This technique involves preprocessing steps before feature extraction to remove motion artifacts (MA), which enhances the accuracy of this type of algorithm in detecting diseases. Systolic and diastolic peaks are determined in this technique because these features provide several important pieces of information. Other features can also be obtained through this technique, such as area ratio (AR), HR, delta time (DT), crest time (CT), area time, amplitude, area under curve (AUC), and the valley and peak of a PPG signal, each of which provides different information about the hemodynamic system. The combination of PPG and ECG is often applied in this research field because together the two measurements provide further informative features, such as pulse arrival time (PAT), pulse transit time (PTT), and inter-beat interval (IBI). Disease detection and prediction in the biomedical field can be enhanced by combining PPG and ECG [30]. Researchers have also used EMFi devices in detecting diseases. EMFi sensors have been used in several studies [14] - [15] as dynamic pressure pulse wave (PW) recorders placed on the wrist, cubital fossa, and ankle. PPG, ECG, and EMFi have been used in combination to relate the reflection index (RI) to atherosclerosis.
*Abbreviations used: PA (peak analysis); AR (area ratio); RI (reflection index); HR (heart rate); Sys_T (duration of systolic); Dia_T (duration of diastolic); DT (delta time); CT (crest time); AT (area time); AMP (amplitude); AF (atrial fibrillation); NMS (neurally mediated syncope); PAT (pulse arrival time); AUC (area under curve); PTT (pulse transit time); PPI (peak to peak interval); T (time interval); Ar (area); RP (Raynaud's phenomenon); SSc (systemic sclerosis); IBI (inter-beat interval); A (systolic position); B (reflected wave position); AW (systolic width); BW (reflected wave width); DST (distance successive troughs); BR (breathing rate); HRV (heart rate variability); CBG (cardioid based graph); Avg. (average); RMSSD (root mean square of successive difference); RR (respiratory rate); ACC (accuracy); MAE (mean absolute error); SN (sensitivity); SP (specificity); PR (precision); Rpre (onset difference); SD (standard deviation); PED (peak error detection); ECG (electrocardiogram); EMFi (electromechanical film); F (finger); T (toe); S (scalp); E (earlobe); NM (not mentioned); W (wrist).
AR is defined as the ratio between the area opu2 (AUC of the descending limb of PPG but above the baseline) and the area of opu2 (area of triangle). The influence of stimuli on the morphology of a PPG can be analyzed through the AR of the PPG signal; Figure 3 illustrates how the AR is obtained. HR can be extracted from PPG morphology by monitoring the peak-to-peak (P-P) interval of a PPG signal. HR is an important feature for detecting cardiac diseases [13]. DT is the time difference between the systolic and diastolic peaks and is related to the time taken for the pressure wave to propagate from the heart to the periphery and back [31]. CT is the time from the foot of the PPG waveform to its systolic peak. This feature is age-dependent and shows clinically significant variations between health and diseases such as arteriosclerosis, hypertension, and various dermatoses [32]. The region of one PPG pulse that contains its systolic and diastolic peaks is also known as area time. This parameter is measured from the foot of the PPG pulse to the systolic or diastolic peak. PPG amplitude is small if vascular compliance is low during increased sympathetic tone. PPG amplitude decreases when arterial BP considerably increases because of an increase in sympathetic tone [33]. Modir [16] used HR, DT, CT, area time, and amplitude features from PPG morphology to detect seizures. All the extracted features were evaluated with a t-test to determine whether they differed significantly under seizure conditions. The extracted features showed a significant difference for epileptic seizure prediction at a p value of <0.05.
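Once the fiducial points of a pulse are known, the timing features defined above reduce to simple arithmetic on sample indices. The sketch below assumes that the foot, systolic peak, and diastolic peak have already been located by some detector; the function names, the sampling frequency, and the example indices are hypothetical.

```python
def heart_rate_bpm(peak_indices, fs):
    """Mean heart rate (beats per minute) from systolic peak indices."""
    if len(peak_indices) < 2:
        raise ValueError("need at least two peaks")
    intervals = [(b - a) / fs for a, b in zip(peak_indices, peak_indices[1:])]
    return 60.0 / (sum(intervals) / len(intervals))

def crest_time(foot_index, systolic_index, fs):
    """CT: time from the foot of the pulse to its systolic peak (s)."""
    return (systolic_index - foot_index) / fs

def delta_time(systolic_index, diastolic_index, fs):
    """DT: time between the systolic and diastolic peaks (s)."""
    return (diastolic_index - systolic_index) / fs

# Hypothetical fiducial indices for a recording sampled at 100 Hz.
fs = 100.0
hr = heart_rate_bpm([30, 110, 190, 270], fs)  # peaks 0.8 s apart
ct = crest_time(10, 30, fs)
dt = delta_time(30, 60, fs)
```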
A study [8] detected dyslipidemia by using PPG to replace an expensive ultrasound flow-mediated dilation device. The DC component was first removed, and the AC component was extracted and normalized for the estimation of the AUC through the peak analysis technique. After normalization, the features were extracted, and the threshold for each subject was defined. The PPG percentage AC change showed a sharp slope in both directions in healthy subjects, whereas the response was blunted in pathologic subjects. AUC was the feature extracted from the PPG morphology by the peak analysis technique. Two types of AUC were distinguished: the AUC before the peak time, which indicates the increase in microvessel diameter, and the AUC after the peak, which reflects the decrease in microvessel diameter toward the baseline. In this research [8], the subjects with dyslipidemia differed remarkably from the healthy subjects in terms of the shape of the signal (AUC). The peak analysis method can also be used to determine the valleys and peaks of PPG morphology in the time domain, which are important for delimiting the systolic and diastolic phases of a PPG signal. The systolic region starts with a valley that indicates the beginning of the PW and ends with the PW systolic peak, whereas the end of the PW is marked by another valley at the end of the diastolic region. According to [19], the systolic region is also called rise time and, compared with the PW duration, varies only in a narrow range that is inversely proportional to the HR. Potential valleys and peaks are determined by calculating an adaptive threshold with a moving average filter whose span is 75% of the last valid PW duration. Then, the absolute maximum and minimum relative to the threshold are taken as the potential PW peak and valley, respectively [19].
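The moving-average threshold idea can be sketched as follows. This is a simplified illustration and not the algorithm of [19]: the span is fixed at 75% of an assumed pulse length instead of being updated after every valid pulse, and a peak (valley) is taken as the extreme sample of each run that lies above (below) the threshold. The function names and the test signal are assumptions.

```python
import math

def moving_average(signal, span):
    """Centered moving average; the window is truncated at the edges."""
    half = span // 2
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

def peaks_and_valleys(signal, pulse_len):
    """Return (peaks, valleys) as sample indices.

    The threshold is a moving average spanning 75% of `pulse_len`.
    Runs that touch the start or end of the record are skipped because
    they may contain only part of a pulse.
    """
    span = max(1, int(0.75 * pulse_len))
    threshold = moving_average(signal, span)
    above = [s > t for s, t in zip(signal, threshold)]
    peaks, valleys = [], []
    start = 0
    for i in range(1, len(signal)):
        if above[i] != above[i - 1]:
            if start > 0:  # skip the incomplete run at the beginning
                run = range(start, i)
                if above[start]:
                    peaks.append(max(run, key=lambda k: signal[k]))
                else:
                    valleys.append(min(run, key=lambda k: signal[k]))
            start = i
    return peaks, valleys  # the final, unclosed run is never recorded

# Synthetic stand-in for a PPG (not real data): period of 100 samples.
ppg = [math.sin(2.0 * math.pi * k / 100) for k in range(300)]
peaks, valleys = peaks_and_valleys(ppg, pulse_len=100)
```

Because the span is shorter than one pulse, the threshold follows slow baseline changes while still lying between the peaks and valleys of each pulse.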
PPG and ECG have been widely used in combination in the biomedical field. PAT can be extracted through peak analysis because it is defined as the time between the R-peak in the ECG and the onset of the PW in a PPG. The first step in extracting the PAT parameter is to determine the R-peak in an ECG signal. Then, the first minimum of the PPG PW succeeding the R-peak, the absolute maximum of the PPG PW, and the foot point of the PPG are determined. PAT is the sum of the cardiac pre-ejection period, which covers isovolumetric ventricular contraction, and the PTT, which reflects vascular components such as vessel tone and stiffness [17]. PTT can be used in calculating the stiffness index (SI) by dividing the height of a subject by the PTT. As age increases, the diastolic peak tends to move closer to the systolic peak, which reduces PTT and increases SI [34]. Wu [20] hypothesized that PTT fluctuation is correlated with respiratory rhythms. In that study, PTT was obtained by using the R wave in an ECG as the starting point of a PW and the peak of the PPG as the end point of transmission, as shown in Figure 4. PTT can thus be computed by determining the R-peak and the peak of the PPG. In addition, two studies [14] - [15] applied three devices, namely, PPG, ECG, and EMFi, to calculate the RI. RI is the ratio of the pulse inflection peak amplitude (second peak, b) to the pulse maximum amplitude (first peak, a) and can be an indicator for vascular assessment. The "a" component can be used as a measure of arterial compliance because it decreases with age [34]. Figure 4 shows the locations of components "a" and "b" in a PPG waveform. For the calculation of RI, a linear trend fixed to the end points was first subtracted from each individual PW. Next, the maximum of the individual PW was determined, and a notch point divided the PW into systolic and diastolic parts. Then, RI was calculated as described above.
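The definitions of PAT, SI, and RI given above translate directly into small helper functions. The sketch below follows the definitions as stated in this section; the function names and the numeric values are hypothetical, and the fiducial times and amplitudes are assumed to come from an earlier detection step.

```python
def pulse_arrival_time(r_peak_time, ppg_onset_time):
    """PAT: time from the ECG R-peak to the onset of the PPG pulse (s)."""
    if ppg_onset_time <= r_peak_time:
        raise ValueError("PPG onset must follow the R-peak")
    return ppg_onset_time - r_peak_time

def stiffness_index(height_m, ptt_s):
    """SI: subject height divided by PTT (m/s), as defined in the text."""
    if ptt_s <= 0:
        raise ValueError("PTT must be positive")
    return height_m / ptt_s

def reflection_index(a, b):
    """RI: amplitude of the second peak (b) over the first peak (a)."""
    if a == 0:
        raise ValueError("first-peak amplitude must be non-zero")
    return b / a

# Hypothetical values for one beat of one subject.
pat = pulse_arrival_time(r_peak_time=0.10, ppg_onset_time=0.32)
si = stiffness_index(height_m=1.70, ptt_s=0.25)
ri = reflection_index(a=1.0, b=0.6)
```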
The measurement of RI is important for the detection of atherosclerosis under the hypothesis that this disease affects arterial wall properties and causes differences in the measured arterial PW. IBI is the time difference between the onset times of two consecutive pulses. Peak analysis was applied in detecting fiducial points in PPG waveforms, such as the onset of a pulse. The reference beat labels were aligned based on the IBI sequence from PPG and the ECG reference beat times. The force-interval relationship can be assessed during atrial fibrillation with the IBIs captured by PPG [23]. Banerjee [7] and Li [24] used the Gaussian technique to extract features from PPG pulses. Banerjee [7] performed a two-step Gaussian modeling of PPG pulses to detect coronary artery disease, estimating the approximate systolic wave and the residual in order to determine the systolic position and width. Li [24] used the Gaussian technique to extract HR and BR. First, all Gaussian bases were extracted. Then, instantaneous frequencies and phases were extracted from the Gaussian bases and the residue through Hilbert spectral analysis, in which the Hilbert transform calculates the conjugate pair of each Gaussian basis. PPG pulses can also be analyzed through segmentation and a CB graph. In the segmentation process, the distance between successive pulses is found by identifying the starting and ending points of each PPG pulse. Then, the CB graph is drawn as an XY scatter plot, where the x-axis indicates the amplitude of the PPG pulses and the y-axis indicates the differentiated PPG data set. The area of the CB graph indicates the heart abnormality of a person [25].
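One possible way to compute the area of a CB graph for a single segmented pulse is sketched below. It assumes that the differentiated PPG is a first difference between neighboring samples and that the loop is closed and measured with the shoelace formula; neither choice is confirmed by [25], and the function name and test pulse are illustrative.

```python
import math

def cardioid_area(pulse):
    """Area enclosed by the loop (amplitude, first difference) of a pulse.

    x is the pulse amplitude and y is the difference to the next sample.
    The polygon through these points is closed and its area is computed
    with the shoelace formula.
    """
    if len(pulse) < 4:
        raise ValueError("need at least four samples")
    xs = pulse[:-1]
    ys = [b - a for a, b in zip(pulse, pulse[1:])]
    n = len(xs)
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n  # wrap around to close the loop
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(twice_area) / 2.0

# Synthetic stand-in for one segmented pulse (not real data): one full
# sine period, for which the loop is an ellipse.
n = 100
pulse = [math.sin(2.0 * math.pi * k / n) for k in range(n + 1)]
area = cardioid_area(pulse)
```

For this test pulse the loop is an ellipse with semi-axes 1 and sin(2π/100), so the result can be checked against the ellipse area.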
Wavelet transform (WT) uses a fixed function known as the mother wavelet, and the choice of mother wavelet determines which features of a signal are extracted. A wavelet decomposition is adequately described as a combination of scaling and wavelet functions; the scaling functions provide an overview of the signal, and the wavelets represent fine details at various scales [35]. Wavelet analysis is computationally complex but resistant to noise and uncorrelated artifacts [26]. Table 1 shows the use of wavelets in extracting features from PPG in recent studies. Continuous wavelet transform (CWT) maps a continuous signal into a highly redundant function of two continuous variables, translation and scale. When properly sampled, the CWT is a WT with a continuous-time mother wavelet, a continuous dilation (scale) parameter, and a discrete translation parameter [36]. The CWT technique was used in identifying the location of the PW within PPG and ECG signals. The Mexican hat mother wavelet was used because the scaling factors are large and this wavelet was found to be the most appropriate. The effect of wavelet shape is challenging to evaluate because PW morphology can vary strongly between individuals. The detection of peak locations for calculating PAT and PTT is influenced by the morphology or the change of the PW [26]. The mother wavelets considered in the CWT analysis were Meyer, Morlet, and Mexican hat.
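To make the idea of locating a pulse with a Mexican hat CWT concrete, the sketch below evaluates one scale of the transform by direct summation and takes the translation with the largest coefficient as the pulse location. It is a didactic sketch and not the implementation of [26]; the scale, the function names, and the Gaussian test pulse are assumptions, and a practical system would use an optimized wavelet library and several scales.

```python
import math

def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet with unit energy."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-0.5 * t * t)

def cwt_row(signal, scale):
    """CWT coefficients at one scale for every integer translation b.

    W(scale, b) = (1 / sqrt(scale)) * sum_k x[k] * psi((k - b) / scale)
    """
    n = len(signal)
    row = []
    for b in range(n):
        acc = 0.0
        for k in range(n):
            acc += signal[k] * mexican_hat((k - b) / scale)
        row.append(acc / math.sqrt(scale))
    return row

# Synthetic stand-in for a single pulse (not real data): a Gaussian bump
# centered at sample 60.
signal = [math.exp(-0.5 * ((k - 60) / 5.0) ** 2) for k in range(120)]
row = cwt_row(signal, scale=8.0)
pulse_location = max(range(len(row)), key=lambda b: row[b])
```

The direct summation costs a number of operations proportional to the square of the record length for each scale, which illustrates why the text calls the wavelet approach computationally complex.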
The CWT technique is generally used in detecting the peaks of PPG signals segmented into heartbeats [27]. The resulting CWT coefficients contain patterns of peaks and troughs, which can be used to determine the position and quality of peaks. The first Gaussian derivative was used as a mother wavelet, called the "Ridger" wavelet, because its CWT yields the upward and downward slopes of a peak. The Ridger peak detection algorithm starts by converting the data signal into its WT at different scales. Next, the local maxima and minima are obtained from each transform, and these points are linked to nearby maxima/minima at neighboring scales to produce series of local maxima/minima along the scales. Each series of local maxima is then paired with its nearest series of local minima. A peak in the signal is characterized in the CWT coefficients as a pair of local maximum/minimum [37]. Li [21] proposed a hybrid wavelet method that enhances signal quality and peak identification. This hybrid method consists of suppression and peak identification. Suppression reduces the low-frequency noise of PPG signals and mitigates the amplitude changes caused by baseline drift and partial MAs. Wavelet multiresolution analysis can be applied to low-frequency noise: the PPG signal is decomposed, and the approximation component at a high decomposition level is used to approximate the low-frequency noise. This method requires the selection of a mother wavelet. Symlet was chosen because this family provides orthogonal wavelets with minimal asymmetry and the highest number of vanishing moments for a given support width. In addition, the symlet scaling function is similar in shape to the PPG signal. After the selection, the next steps were decomposition level determination, noise estimation, and signal reconstruction.
Peak identification consists of wavelet decomposition with a quadratic spline wavelet, threshold setting, modulus maximum sequence calculation, pair selection, and peak identification. The peak was identified by searching for the maximum value in the original signal near the zero-crossing position that corresponds to the peak [28].
Multiscale principal component analysis (MSPCA), which combines the wavelet method with principal component analysis (PCA), extracts features from PPG signals for the detection of respiratory activity. The WT method uses long windows at low frequencies and short windows at high frequencies, whereas the short-time Fourier transform uses a single analysis window. Wavelet analysis is a robust tool because biomedical signals are quasiperiodic in nature. Discrete WT was used to decompose the signal into multilevel hierarchical frequency bands, analogous to filter banks. Implementing MSPCA on PPG signals involves several steps. First, wavelet decomposition was performed for each column in the PPG data matrix. Second, the covariance matrix of wavelet coefficients was computed for each scale. Next, PCA loadings and the scores of the wavelet coefficients were computed. An appropriate number of loadings was selected, together with the wavelet coefficients larger than an appropriate threshold. Then, the PCAs computed at all scales were combined. Finally, the approximate data matrix was reconstructed from the selected and thresholded scores at each scale [29].

Derivative of PPG
In this section, the order of derivatives and the parameters extracted are briefly discussed. The first and second derivatives are widely used in analyzing PPG waveforms, but several researchers have used higher orders in assessing stress. Table 2 describes the focus of each study and the order of the derivative used.
The systolic peak and foot points of a PPG waveform can be extracted from the first derivative: a foot point corresponds to the zero-crossing point before a maximal inflection, and the systolic peak corresponds to the zero-crossing point after the inflection. Systolic amplitude can be found once the foot and peak are determined because this amplitude is the change from the foot of the wave to the following maximum point; it responds to hand elevation [31]. The FDPPG was used in a study that replaced the main peak in common PPG data with the point at which maximum ejection acceleration occurs. A feature sequence is produced from this point instead of from a peak or trough. Maximum ejection acceleration can be recognized on the ascending line of the raw data and the FDPPG and also indicates an increasing volume of blood in the tissue [39]. The SDPPG waveform can be greatly useful in recovering the locations of the peaks. McDuff [40] used SDPPG to locate the diastolic inflection point from video images and obtained a high correlation in the mean systolic-diastolic P-P time. SDPPG has five waves, namely, waves a, b, c, and d in the systolic part and wave e in the diastolic part, as shown in Figure 5. The first step in the assessment of arterial stiffness and other cardiovascular parameters was to detect waves a and b accurately. The acceleration plethysmogram (APG), another name for the SDPPG, represents the double differentiation of the original PPG waveform; it is obtained by creating a noncausal filter and a three-point center derivative with a delay of only two samples [41]. The second peak of PPG signals can be obtained by combining FDPPG and SDPPG because, unlike in healthy young persons, this peak is not noticeable in older people. Figure 6 shows the combination of the derivatives for the detection of the second peak [34]. The five consecutive waves are useful in determining the points of interest, and the peak positions of SDPPG provide information about erectile dysfunction from the PPG [32], [45].
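The zero-crossing rule for the foot and systolic peak can be sketched as follows. The code assumes a single, reasonably clean pulse and uses a first difference as the FDPPG; the function name and the test pulse are illustrative, and a real recording would need filtering and beat segmentation first.

```python
import math

def foot_and_systolic_peak(pulse):
    """Locate the foot and systolic peak of one pulse from its FDPPG.

    The steepest upslope (maximum of the first difference) is found
    first. The foot is the sample where the first difference turns
    positive before that point, and the systolic peak is the sample
    where it stops being positive after that point.
    """
    d = [b - a for a, b in zip(pulse, pulse[1:])]
    m = max(range(len(d)), key=lambda i: d[i])  # steepest upslope
    foot = m
    while foot > 0 and d[foot - 1] > 0:
        foot -= 1
    peak = m
    while peak < len(d) and d[peak] > 0:
        peak += 1
    return foot, peak

# Synthetic stand-in for one pulse (not real data): minimum at sample 20
# and maximum at sample 70.
pulse = [-math.cos(2.0 * math.pi * (k - 20) / 100) for k in range(120)]
foot, peak = foot_and_systolic_peak(pulse)
systolic_amplitude = pulse[peak] - pulse[foot]
```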
Combining zero-crossing detection with the local minima and maxima of the FDPPG and SDPPG can increase the accuracy of the algorithms. First, zero-crossing detects the peak and onset points of the PPG waveform, but the first detected points may be offset because the high-pass filter introduces a time delay and changes the waveform. Thus, the final detection points were corrected by using the local maxima and minima in the raw PPG near the first detected points. PTT and the reflection point can be detected from the second zero-crossing, which changes with respiration rate (RR) [42].
The five consecutive waves of the SDPPG can describe the cardiovascular state of an individual. Three groups of ratios characterize cardiovascular conditions: the b/a ratio is an indicator of arterial stiffness and increases with increasing arterial stiffness; the c/a, d/a, and e/a ratios are indicators of arterial stiffness that decrease with increasing arterial stiffness; and the (b−c−d−e)/a or (b−e)/a ratios are aging indexes for the assessment of vascular aging and arteriosclerotic disease [44]. The three basic features in a single PPG waveform are the systolic peak, the diastolic peak, and the dicrotic notch. The dicrotic notch, also known as the inflection point, can be found from the FDPPG and SDPPG waveforms. This point is determined as the first local maximum of the FDPPG and can be confirmed by the second positive-to-negative zero-crossing of the SDPPG. The b/a ratio can then be calculated from the extracted parameters to detect arterial stiffness [46]. In addition, the PPG waveform and its derivatives (FDPPG and SDPPG), calculated by two forward mathematical derivatives, can be used in extracting PAT [48]. Mental stress greatly affects arterial stiffness. Several features have been identified from the PW for the assessment of arterial stiffness. Fiducial points, namely, the maximum upslope, the five consecutive waves, and the early and late systolic components, have been detected from the first, second, and third derivatives of a PPG waveform, respectively [49]. The third derivative of a PPG waveform is shown in Figure 7, where p1 is defined as the early systolic component and p2 as the late systolic component.
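Given the amplitudes of the five SDPPG waves, the ratios listed above are straightforward to compute. In the sketch below, the function name, the dictionary keys, and the example amplitudes are hypothetical; detecting the waves themselves is assumed to have been done beforehand.

```python
def sdppg_indices(a, b, c, d, e):
    """Ratios of the SDPPG wave amplitudes described in the text."""
    if a == 0:
        raise ValueError("wave a amplitude must be non-zero")
    return {
        "b/a": b / a,
        "c/a": c / a,
        "d/a": d / a,
        "e/a": e / a,
        "aging_index": (b - c - d - e) / a,
        "aging_index_short": (b - e) / a,
    }

# Hypothetical wave amplitudes for one beat (arbitrary units).
indices = sdppg_indices(a=1.0, b=-0.6, c=-0.1, d=-0.3, e=0.2)
```

Normalizing by wave a makes the indices independent of the overall signal gain, which varies with sensor placement and contact pressure.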

Fig. 7. Third derivatives of PPG waveform [49]
Basic differentiation of the PPG signal was used to examine the utility of employing the derivatives of a signal. Equation 1 was used in the derivative analysis, where T is the sampling interval, equal to the reciprocal of the sampling frequency; n is the number of data points; the derivative step takes values in [1, 20]; and S0 is the unfiltered PPG signal. The extracted features were energy and Shannon entropy, which were normalized.

Discussion
This section discusses the advantages and disadvantages of the two groups of techniques described above. It also compares how different researchers have used the original PPG waveform and its derivatives.

Photoplethysmogram waveform
Information can be extracted directly from a PPG waveform and used in the biosignal processing field. Table 3 compares the different methods previously discussed in Section 2.
AR is one of the features extracted from a PPG signal. This feature is affected by nociceptive stimuli, as shown by the highly significant difference (p < 0.001) between AR values before and during intubation [13]. Atherosclerotic disease can be detected by examining the reflection index and the delays between the peaks of a PPG. Two studies extensively compared the classification performance of different PW-derived parameters and their repeatability [14] - [15]. Recording all the signals from all subjects was not successful; hence, the data used in comparing the parameters did not come entirely from the same subject population. Even though this was the first study to do so, the use of ankle-to-brachial pressure index measurements has limitations [51]. In the earlier study from the same researchers [15], the lack of dependency on a healthy subject population does not confirm that the parameters can be used to discriminate whether patients have atherosclerosis. That research also needed to scale the parameters with respect to HR, although the variation in PW parameters cannot be fully explained by HR. The PPG signal can be used in detecting or predicting epileptic attacks, and information on epilepsy is available in the dynamics of this signal [16]. The photodiode and receiver were configured to keep sensitivity to ambient light as low as possible so that the receiver could capture as much of the emitted light as possible. HR is another parameter that can be extracted from the P-P interval of the PPG signal. HR can be a crucial parameter in predicting neurally mediated syncope [17]. The problem with this prediction is the small data sample. PPG has been widely used in the medical field because of its simple morphological features and low-cost setup [8]. Other risk factors should be studied for the diagnosis of vascular health. Mohan [18] proposed an accurate HR measurement (peak interval) in steady conditions.
Its performance was comparable with that of other methods; the measurement uses minimal hardware resources, performs well in real-time monitoring, and is easily implemented on real-time embedded platforms. Another researcher pursued the same objective and provided an algorithm that works well in real-time monitoring. A microcontroller implementation is possible because the algorithm uses only simple filters and decision lists [19]. The drawback of this research was that the diastolic peaks of PWs, which can support the calculation of pulse propagation time, were not detected.
RR is a vital physiological parameter for the detection and diagnosis of respiratory dysfunction. RR must be monitored continuously over a period of time so that clinically relevant information can be obtained. The RR parameter can be estimated using PPG [52] or ECG [53]. RR can also be estimated from the PTT parameter produced from the combination of PPG and ECG waveforms [20]. The PTT must be derived carefully because it can be affected when subjects hold their breath. The PPG waveform can be analyzed in two domains, namely, the time and frequency domains. Frequency domain variability indices are preferable to time domain variability indices for exploring inherent changes in short-term series of less than five minutes [21]. Frequency domain indices, such as low- and high-frequency powers and their ratio in the power spectral density, are calculated with an auto-regressive model. According to the findings in [21], more robust data must be tested in the future to verify the results because changes in the morphological and time features of PPG are subtle.
The weakness of the ECG-PPG method is its dependence on the pre-ejection period, which reduces the accuracy of diagnosis [54]. Multisite PPG is an alternative to ECG in measuring PTT. PTT can be obtained from the elapsed time between PPG waveforms for the assessment of patients with systemic sclerosis and pre-Raynaud's phenomenon by placing PPG sensors at the right and left earlobes, index finger, and great toe [22]. This technique is practical and inexpensive, but many aspects, such as the balance between the specificity and sensitivity of the test, treatment changes for patients, and validation of the work, must be considered. IBI is an unobtrusive parameter in diagnostic studies [23]. MA must be reduced to monitor the IBI from PPG continuously while the subject is in bed. Gaussian modeling was used in assessing PPG morphology for the study of psychological condition changes [7] and the measurement of BR and HR [24]. The studies produced positive results, but the amplitudes of the systolic and reflected waves and the RI were not included in the Gaussian function because they did not coincide with the systolic and reflected parts of the PPG pulses. The CB graph of a PPG is effective and precise in detecting heart abnormality. This graph can provide statistical parameters for the classification of heart abnormality [25]. Real PPG data must be included to ensure the reliability of this technique.
Usually, the combination of ECG and PPG is used in extracting PTT. Alternatively, PPG alone can be used to extract the PTT feature: with PPG sensors placed at the wrist and finger, the time difference between the two systolic peaks can be defined as PTT. The wavelet method is used to extract the PTT feature because it is relatively impervious to noise and uncorrelated with it, even though it is more computationally complex than normal peak detection [26]. The problem with using PPG is that the measurement apparatus is shaken and causes disturbances when tremors and contractions are provoked in the subject. The wavelet technique has recently been applied to a wristband and compared with a commercially available finger clip POx in producing accurate measurements of vital signs. This type of wearable device is preferable to commercially available finger clip POx because it is non-invasive, convenient for the patient, and engaging in nature [27]. The problem with this device is that rounded peaks and valleys require more sophisticated algorithms to extract physiological measures. In addition, the quality of the PPG signal is affected by multiple factors, including device malposition, ambient light, pressure on the skin, and biological factors (tissue composition and skin temperature). Better identification accuracy can be obtained by using a wavelet technique based on the principle of wavelet multiresolution [21]. This technique corrects the morphology of the signal and optimizes the quality of peak identification. However, the sample was small, with only ten samples. MSPCA is a powerful combination of PCA with wavelets for the processing of multivariate statistical data. PCA is intended for the interpretation of huge data sets, which can be decomposed into smaller blocks or matrices. This technique is suitable for many signal processing problems faced in the field of biomedical instrumentation [29].

Derivative of PPG
In this sub-section, the uses of the different derivative orders of PPG in assessing health conditions are compared. Table 4 compares the different methods that use the derivatives of PPG, which were previously discussed in Section 3.
Apart from the original PPG pulses, the derivatives of the PPG pulse are widely used in diagnosing health problems and demonstrate excellent results. Morphological PPG changes are influenced by changes in downstream venous resistance, such as different positions of the hand, whereas PPG monitors blood volume change [31]. A rounded dicrotic peak is difficult to determine and is not automatically detected because it is not visible to the eye. Moreover, the use of the diastolic peak or inflection point may cause errors when comparing PPG waveforms of different morphologies. Takayasu's arteritis (TA) is a rare disease that causes inflammation and intimal proliferation, which lead to wall thickening, stenotic or occlusive lesions, and thrombosis [55]. Bilateral dissimilarity in the morphological parameters of multi-site peripheral signals in patients with TA can be used to find the pathological site. This research focused on using the FDPPG waveform in predicting the severity of TA in patients. The main problem in this study was the small size of the patient groups because of the rarity of TA [38]. The derivative of PPG can also be applied to imaging PPG (IPPG). A charge-coupled complementary metal oxide semiconductor device [39] or a digital camera [40] was used instead of a photodiode. The IPPG approach provides new feature sequences and avoids the use of main peaks, which are highly susceptible to interference and variation in IPPG measurements. In addition, this technique increases the reliability and usefulness of IPPG applications in providing accurate assessments of users' physiological conditions. The disadvantages of this technique are the unrefined precision of existing cardiovascular indicators, the trade-off between camera lens power and distance, the lack of real-time implementation, the difficulty of capturing waveform morphology in ambient light, and the sensitivity of a remote camera to MA.
SDPPG provides waves a and b, which are crucial for the assessment of arterial stiffness and other cardiovascular parameters [41], [46]. The detection algorithm for waves a and b in one study [41] achieved a very high sensitivity (99.78%) for both waves. Despite this high sensitivity, the study had several limitations: the proposed algorithm was tested only on normotensive young subjects, only waves a and b were detected, the sample size was small, the data set was not diverse, and the PPG signals were sampled at a low rate.
The inflection point area combined with the d wave peak index provides an important indicator of vascular aging. This finding is supported by the close correlation between peripheral stiffening and brachial-ankle PW velocity [33]. The usefulness of the proposed index was validated by a follow-up study. Another study suggested that vascular aging can affect the dicrotic notch and inflection point of the PPG waveform. PPG techniques were preferred because they provide noninvasive circulatory monitoring, as PPG reflects the changes in blood volume with each heartbeat [32], [34], [45]. However, these studies share the same limitations of low population diversity and small sample size.
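To make the role of waves a and b concrete, the following minimal sketch computes an SDPPG by central differences and locates the two waves in a single beat. It is not the detection algorithm of [41]; the heuristic (a is the first sufficiently large local maximum, b is the first local minimum after it), the function names, the threshold, and the synthetic test beat are all assumptions made for illustration.

```python
import math


def synthetic_beat(fs=100):
    """Hypothetical one-second test beat: a systolic bump plus a smaller diastolic bump."""
    beat = []
    for i in range(fs):
        t = i / fs
        systolic = math.exp(-0.5 * ((t - 0.25) / 0.07) ** 2)
        diastolic = 0.4 * math.exp(-0.5 * ((t - 0.55) / 0.10) ** 2)
        beat.append(systolic + diastolic)
    return beat


def second_derivative(signal, fs):
    """Central-difference second derivative; the two end samples are left at 0."""
    dt = 1.0 / fs
    d2 = [0.0] * len(signal)
    for i in range(1, len(signal) - 1):
        d2[i] = (signal[i + 1] - 2.0 * signal[i] + signal[i - 1]) / (dt * dt)
    return d2


def find_a_b_waves(sdppg, rel_threshold=0.5):
    """Return (a_index, b_index); an entry is None when that wave is not found.

    Assumed heuristic: a is the first local maximum that reaches
    rel_threshold * max(sdppg); b is the first local minimum after a.
    """
    threshold = rel_threshold * max(sdppg)
    a = None
    for i in range(1, len(sdppg) - 1):
        is_local_max = sdppg[i] > sdppg[i - 1] and sdppg[i] >= sdppg[i + 1]
        if is_local_max and sdppg[i] >= threshold:
            a = i
            break
    if a is None:
        return None, None
    for i in range(a + 1, len(sdppg) - 1):
        if sdppg[i] < sdppg[i - 1] and sdppg[i] <= sdppg[i + 1]:
            return a, i
    return a, None
```

Calling `find_a_b_waves(second_derivative(synthetic_beat(), 100))` returns the sample indices of the two waves. On recorded signals, smoothing before differentiation and beat segmentation would be needed, because differentiation amplifies noise.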
Analysis is a crucial part of any such study. At a minimum, time-domain analysis should include the standard deviation of the R-R peak intervals, and frequency-domain analysis should include the low- and high-frequency components and the ratio between them [42]. Pressure index (PI) is defined as the arrival time or velocity-related parameter of the reflected wave of PPG. As an alternative to the inconvenient cuff-based technique, PPG has recently shown a statistically significant correlation between PI and BP [43]. The limitations of this study were that the respiratory variability observed in continuous BP monitoring was not validated, the number of subjects was small, and PPG morphology may be affected by aging, vessel stiffness, cardiovascular disease, and other hemodynamic properties. Personal health monitoring systems that support the treatment and prevention of associated complications can help reduce mortality rates. Such a system can automatically identify cardiovascular features and display the relevant information to the user. Hence, it can provide useful information about health conditions and indicate whether the user needs to visit a doctor [44]. However, this system is not a gold standard for cardiovascular diagnosis.
Mechanical alternans or pulsus alternans (PA) is a condition of alternating strong and weak beats as measured by pulse or BP [56]. Pacing-induced PA and its magnitude can be detected noninvasively in patients using PPG [47]. However, a small study population remains a main problem in conducting biomedical research. One way to overcome a small dataset is to use data from an online database, such as the Medical Information Mart for Intensive Care (MIMIC). PAT derived from the MIMIC waveform database is a potential parameter for continuous BP monitoring [48]. However, without a synchronicity check, this database is not recommended for examining synchronicity-dependent features, such as the PAT interval, for predicting BP.
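As a rough illustration of why synchronicity matters, the sketch below pairs each ECG R-peak time with the next PPG fiducial point in the same cardiac cycle and applies a crude plausibility test to the resulting delays. It is not the check used in [48]; the function names, the plausibility window of 0.1 to 0.5 s, and the 90% acceptance fraction are hypothetical values chosen only for the example.

```python
from bisect import bisect_right


def pulse_arrival_times(r_peaks, ppg_points):
    """Pair each R-peak time with the first PPG point before the next R peak.

    Both inputs are sorted lists of timestamps in seconds.
    Returns a list of (r_time, delay) tuples.
    """
    pats = []
    for i, r in enumerate(r_peaks):
        j = bisect_right(ppg_points, r)  # first PPG point strictly after r
        if j == len(ppg_points):
            break
        next_r = r_peaks[i + 1] if i + 1 < len(r_peaks) else float("inf")
        if ppg_points[j] < next_r:
            pats.append((r, ppg_points[j] - r))
    return pats


def looks_synchronized(pats, lo=0.1, hi=0.5, min_fraction=0.9):
    """Crude check: most delays fall inside an assumed plausible window."""
    if not pats:
        return False
    plausible = sum(1 for _, delay in pats if lo <= delay <= hi)
    return plausible / len(pats) >= min_fraction
```

A constant offset between the ECG and PPG channels shifts every delay by the same amount, so a record can fail this test even when both waveforms are individually clean.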
Analyzing a heterogeneous population can be difficult because different populations produce different sets of PPG waveforms. Performance on a heterogeneous population can be improved by combining the CT feature with the slope of waves b and c of the SDPPG, which is influenced by left ventricular ejection time and HR [49]. However, the conclusions were not verified. Beyond the FDPPG and SDPPG, Elgendi [50] found that the seventh derivative yields the highest accuracy among the first to twentieth derivatives of the PPG waveform for heat stress measurement. The seventh derivative of the PPG waveform can therefore be used to detect heat stress without measuring body core temperature. This study had several limitations: a moderate number of subjects, the unavailability of a PPG database containing data measured in tropical conditions or after heat stress, the possibility that some participants cooled down while queuing for measurement, and the use of the standard definition of entropy to measure the randomness of PPG morphology regardless of whether the PPG waveform was noisy with gain or fluctuation.
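Higher-order derivatives such as those compared in [50] can be approximated numerically by differentiating repeatedly. The sketch below is one simple way to do so, assuming a uniformly sampled signal; it is not the implementation used in [50], and the function names are hypothetical.

```python
def gradient(signal, fs):
    """First derivative: central differences inside, one-sided at the two ends."""
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples")
    d = [0.0] * n
    d[0] = (signal[1] - signal[0]) * fs
    d[-1] = (signal[-1] - signal[-2]) * fs
    for i in range(1, n - 1):
        d[i] = (signal[i + 1] - signal[i - 1]) * fs / 2.0
    return d


def nth_derivative(signal, fs, order):
    """Apply gradient() `order` times; order 0 returns a copy of the input."""
    out = list(signal)
    for _ in range(order):
        out = gradient(out, fs)
    return out
```

Each pass amplifies high-frequency noise and makes one more sample at each end unreliable, so a seventh derivative of a recorded PPG would normally be preceded by filtering.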

Challenges
Despite the promising results obtained using the PPG waveform and its derivatives, some challenges in the clinical application of the PPG waveform to healthcare remain unsolved. In particular, the following key issues have been highlighted:
Sample size: A reliable diagnostic algorithm requires a large amount of medical data. The ideal sample covers various population types, which eases validation and supports accurate diagnostic analysis. Machine learning is already commercially available in the medical field [57], but combining a diagnostic algorithm with machine learning for disease prediction requires a large sample size. Healthcare, however, differs from other domains: a large part of the approximately 7.7 billion people in the world (as of January 2020) has no access to primary healthcare. Consequently, a large number of patients cannot easily be gathered. Moreover, understanding diseases and their variability is much more complicated than other tasks, such as speech or image recognition. New methods based on data augmentation have been developed to increase the sample size. Data augmentation synthetically increases the amount of training data to improve classification accuracy [58], for example by synthesizing new samples from the given data using interpolation and extrapolation [59].
Complexity of case groups: Problems in biomedicine and healthcare are more difficult than those in other applications. Diseases are highly dissimilar, and most have unknown causes and mechanisms. In addition, the number of patients is usually limited in a practical clinical scenario, and researchers cannot recruit as many patients as the research needs.
Morphology: The PPG waveform changes as a person ages. For example, the diastolic peak of the PPG waveform in an older person is more rounded than that of a younger person. A peak that has almost vanished is difficult to detect and requires a very complex algorithm or analysis.
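The interpolation and extrapolation mentioned under the sample size issue can be illustrated with feature vectors. The sketch below shows one common formulation and is not necessarily the exact scheme of [59]; the function names, the mixing factor `lam`, and the random pairing of samples are assumptions, and in practice pairs would be drawn from the same class.

```python
import random


def interpolate(sample, neighbour, lam=0.5):
    """New point between the two samples: a + lam * (b - a)."""
    return [a + lam * (b - a) for a, b in zip(sample, neighbour)]


def extrapolate(sample, neighbour, lam=0.5):
    """New point beyond `sample`, away from `neighbour`: a + lam * (a - b)."""
    return [a + lam * (a - b) for a, b in zip(sample, neighbour)]


def augment(samples, n_new, lam=0.5, seed=0):
    """Create n_new synthetic feature vectors from randomly chosen pairs."""
    rng = random.Random(seed)
    new = []
    for _ in range(n_new):
        a, b = rng.sample(samples, 2)
        operation = rng.choice((interpolate, extrapolate))
        new.append(operation(a, b, lam))
    return new
```

For example, `interpolate([0.0, 0.0], [2.0, 4.0])` gives the midpoint of the two vectors, whereas `extrapolate` moves a sample further from its neighbour. Synthetic samples enlarge the training set but cannot replace the population diversity discussed above.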

Conclusion
Various PPG-based algorithms have been reported in the literature. PPG has become a preferred technique for diagnosing disease because of its low cost, mobility, and noninvasiveness. The two main techniques discussed were analysis of the PPG waveform and analysis of its derivatives. The PPG waveform is easy to analyze because the analysis relies on the basic morphology of the PPG itself. The problem is that older people tend to have a less noticeable diastolic peak, which complicates the analysis. Wavelets can provide accurate results, but the process is more complex than analysis using PPG waveforms or derivatives. The derivative of PPG can overcome the unnoticeable diastolic peak because it produces local maxima and zero-crossing points, which indicate the inflection point (diastolic peak). The great potential of PPG waveform analysis will only be realized through close collaboration among researchers, clinicians, and industrial partners.