Discrimination of the Skin Microcirculatory Status Using Photoacoustic Technique and Long Short-term Memory Network

audrey@uthm.edu.my

Abstract: Measuring blood oxygen with standard imaging methods is challenging. Most conventional imaging systems present the outcomes of microcirculatory change measurements as signals of complex form, whose features are complicated and visually unnoticeable, which limits their analysis. This motivates exploring the photoacoustic (PA) method together with deep learning for the task. This work presents a deep network containing long short-term memory (LSTM) units for temporal feature extraction and classification of skin microcirculatory status. The model was trained using a limited number of PA signals. A one-way ANOVA test was used to evaluate changes in the PA signals collected under different experimental conditions. The results showed a statistically significant difference between the means of the two groups (p < 0.05). The mean ± standard deviation (SD) final validation accuracy of the trained model was 95.60 ± 0.47 % with the inclusion of augmented data, better than that obtained without augmentation. On the testing set, the model achieved a classification accuracy, specificity, and sensitivity of 97.6 %, 100 %, and 83.3 %, respectively. Future work includes improving the network architecture to include more convolutional layers for finding patterns in the features.


Introduction
A cardiovascular system is responsible for delivering nutrition and oxygen through the bloodstream to all of the body's cells. It is made up of the heart and a closed system of vessels that transports blood throughout the body. The key function of the microcirculation is to ensure adequate oxygen transport to meet the oxygen demands of every cell within an organ. In order to accomplish this, a healthy and well perfused microvasculature will respond to variations in metabolic demand or blood flow to the organ. Tissue hypoxia may develop if the microvasculature is malfunctioning, or under the condition of impeded blood flow to the affected area.
Hypoxia is a medical condition in which the supply of oxygen is insufficient for normal living functions, while hypoxemia is a condition in which the arterial oxygen supply is inadequate. When blood is unable to carry enough oxygen to the tissues, hypoxemia (low oxygen levels in the blood) can lead to hypoxia (low oxygen in the tissues). Shortness of breath, rapid breathing, and a rapid heart rate are the most prevalent acute symptoms. The blood oxygen level of a hypoxemic patient is about 92 % or lower. Late treatment of hypoxia may lead to coma and even death. Very often, hypoxia cannot be reliably predicted through a physical examination using oxygen sensors, such as a pulse oximeter.
Blood oxygen saturation level is an indicator commonly used to evaluate the level of oxygen in the blood and the effectiveness of clinical treatment of hypoxia-related diseases [1]. Blood oxygen saturation measurement technology has gained increasing interest among the scientific community. Many works have estimated blood oxygen level using photoplethysmography (PPG) [2] and spectroscopy technology [3]. The state of peripheral microcirculation in patients can be assessed using PPG, which measures volumetric changes in blood. This system consists of a probe containing an infrared light source and optical sensors; the measured intensity of light reflected from the medium is correlated with changes in blood oxygen levels. According to [4], PPG offers a greater depth of discrimination than laser Doppler flowmetry (LDF) in the detection of skin microcirculatory flow change. However, measurement using the PPG signal requires a database, a look-up table, or a calibration model [5]. Spectroscopy is an alternative technique for the evaluation of tissue oxygen saturation level. In [6], this approach is shown to produce considerably good prediction sensitivity and specificity of 94 % and 72 %, respectively, for measurement of oxygen saturation (SO2) in patients with chronic mesenteric ischemia. However, the measurement is adversely affected by skin pigmentation [7]. The photoacoustic (PA) technique is commonly used in the study of pressure variation in a closed system, and very recently its use has been extended to biomedical applications [8]. This method is based on the absorption of light, which causes thermal expansion within the medium. The latter is responsible for acoustic wave generation. The effect of scattering on acoustic waves is two to three orders of magnitude lower than on light signals, and thus a higher imaging resolution and contrast can be achieved using this hybrid optical-ultrasonic approach [9].
The findings in [10] suggested PA as a suitable approach for the study of microcirculation in acupoints, as it allows determination of various blood circulation-related parameters, such as total hemoglobin concentration, blood oxygenation level, blood flow velocity, oxygen metabolism level, vasoconstriction, vasodilation, and hemodynamics of a target vessel in real time.
Artificial intelligence (AI) technology is a popular and efficient tool for classification and decision-making problems. It is widely used in various fields, such as the oil and gas industry [11,12], clinical diagnosis [13][14][15], and agriculture [16]. AI has made great progress in the advancement of system development and performance, making it a technological reality. Deep learning has evolved from artificial neural networks (ANN), which are webs of interconnected "nodes". Unlike traditional machine learning, which requires feature engineering to extract useful features from a large amount of data before decisions can be made, deep learning is able to learn representative features and make decisions by recognizing significant patterns directly from the data. Popular deep learning models used for data mining include the Convolutional Neural Network (CNN), Deep Neural Network (DNN), and Recurrent Neural Network (RNN). A CNN is designed to exploit "spatial correlation" in data, and it works well on images and speech. In a feedforward network such as a DNN, information is passed from the input to the output layer without travelling backward. An RNN adds recurrent connections that carry information from previous time steps forward, making it an excellent choice for modelling sequence data, such as text and video. LSTM is a special form of RNN that mitigates the vanishing gradient problem of standard RNNs and specializes in classifying, processing, and predicting time series (one-dimensional) events. This model uses a gating mechanism that controls the memorizing process. The three gates in an LSTM cell are the forget gate, the input gate, and the output gate. The forget gate determines which information in the cell state is kept or discarded. The input gate decides which relevant information should be added to the cell state (long-term memory), and the output gate finalizes the value of the next hidden state.
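The gating mechanism described above can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not the network used in this work: it assumes a scalar input per time step, a small hidden size, and randomly initialized weights, and the names `init_params`, `lstm_step`, and `run_lstm` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(hidden_size, seed=0):
    """Random weights for the four LSTM transforms (scalar input per time step)."""
    rng = random.Random(seed)

    def rand_vec():
        return [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]

    params = {}
    for name in ("forget", "input", "candidate", "output"):
        params[name] = {
            "w_x": rand_vec(),                                # input weights
            "w_h": [rand_vec() for _ in range(hidden_size)],  # recurrent weights
            "b": [0.0] * hidden_size,                         # biases
        }
    return params


def lstm_step(x_t, h_prev, c_prev, params):
    """Advance the cell by one time step and return (h_t, c_t)."""
    def pre_activation(name, j):
        p = params[name]
        recurrent = sum(w * h for w, h in zip(p["w_h"][j], h_prev))
        return p["w_x"][j] * x_t + recurrent + p["b"][j]

    h_t, c_t = [], []
    for j in range(len(h_prev)):
        f = sigmoid(pre_activation("forget", j))       # how much old memory to keep
        i = sigmoid(pre_activation("input", j))        # how much new information to admit
        g = math.tanh(pre_activation("candidate", j))  # candidate values to write
        o = sigmoid(pre_activation("output", j))       # how much memory to expose
        c_j = f * c_prev[j] + i * g                    # updated cell state (long-term memory)
        c_t.append(c_j)
        h_t.append(o * math.tanh(c_j))                 # next hidden state
    return h_t, c_t


def run_lstm(sequence, hidden_size=4, seed=0):
    """Feed a 1-D sequence through the cell and return the final hidden state."""
    params = init_params(hidden_size, seed)
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x_t in sequence:
        h, c = lstm_step(x_t, h, c, params)
    return h
```

In a classifier, the final hidden state returned by `run_lstm` would be passed to fully connected layers; in practice the weights are learned by backpropagation through time rather than drawn at random.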
Diagnostic methods able to provide detailed information on microcirculatory functional status and tissue oxygen saturation level have come into the limelight in recent years for evaluating the effectiveness of intensive care treatment, especially for hypoxia [17]. In [18], a total of 420 microcirculation images were collected using an optical imaging system for diagnosis using a DNN model. The study reported an accuracy of 92 % for microcirculation identification. It was reported that the performance of the model was affected by the signal-to-noise ratio (SNR) of the collected images, where low-SNR images exhibit complex capillary morphological patterns. In another work [19], a self-developed model named CapillaryNet was used to predict microcirculation parameters (i.e., red blood cell flow velocity) within the capillary network using microscope video data as its input. The results showed an improved accuracy of 93 % in the detection and classification of red blood cell flow velocity as compared to the results in [20]. The performance of this model was limited by the capillary type, tissue type, and imaging system quality. A deep learning multi-cell tracking model named CycleTrack, modified from the Deep Layer Aggregation (DLA) model, was used in [21] to non-invasively investigate the features of capillary blood cell flow from capillaroscopy videos with a good accuracy of 96.58 %. Another related work [22] used 2D CNN algorithms for classification of microcirculation images from septic and non-septic patients, and reported an accuracy of 89.45 % and a precision of 92 %. However, variability in the quality of the images captured using the CytoCam incident dark field imaging camera system and background noise were reported as potential confounding factors precluding clinical evaluation. Two other works in this direction are by Mason et al. [23] and Ossama et al. [24].
The authors in [23] suggested the use of a generative adversarial network (GAN) to estimate both uncorrected and profile-corrected tissue oxygenation, and showed an accuracy of 96.5 %. Meanwhile, the method in [24] includes a two-step training of a CNN model to classify microscopic images for functional blood flow, and achieved an accuracy of 83 %. Other field-based and laboratory-based approaches for detection of microcirculatory parameters include the use of a personal computer and oscilloscope to display the electrical signals acquired using readily available devices such as PPG [25]. These signals can be complex, and their variations with microcirculatory change can be visually indiscernible from one another, which limits their analysis. This calls for the use of AI for automatic extraction of important features in the classification of microcirculatory status, similar to that demonstrated in the works presented above. In [26], deep spectral learning (DSL) was used in the design of an oximetry method that is robust to changes in experimental conditions, including different setups, imaging protocols, speeds, and other possible longitudinal variations. This approach also provides uncertainty measures of the prediction results. In [27], an LSTM technique and an inexpensive fingertip sensor were used to predict the oxygen saturation level in individuals with signs of approaching hypoxia. The prediction was more accurate than that of anaesthesiologists in the operating room. However, no work has been found in the literature combining the PA technique and a time series deep learning model for field application purposes. This work aims to investigate the performance of LSTM for the prediction of microcirculatory status (i.e., at rest and perfusion occlusion) using PA signals. It is also our objective to demonstrate improvement in the model's classification performance with the inclusion of a signal augmentation strategy in training.

Materials and methods
This section describes the research methods used in this work. The first two subsections describe the experimental system and procedures, followed by the preparation of data for model training and testing. Lastly, the designed deep model is described in the subsection on model training and hyper parameters tuning.

Experimental system
Shown in Figure 1 is the PA system assembled in this study. This system comprises a laser source producing light of wavelength 633 nm (R-30993, Newport Corp.) and an ultrasonic flaw detector (EPOCH 650, Olympus Corp, Japan) in its detection arm. The continuous laser beam passes through an 80 MHz acousto-optic modulator (AOM, Gooch & Housego 2910 series) to produce a modulated light signal. The latter was controlled by a radiofrequency (RF) driver with a carrier frequency of 15 MHz. The modulated light is absorbed by absorbers in the sample, resulting in thermal expansion in the cavity. This causes pressure changes that propagate through the medium and are measured using a flat acoustic transducer (V232-SU/2.25 MHz, Olympus NDR) connected to the flaw detector. This device plots the amplitude of the ultrasonic echo as a function of time. The penetration depth of visible light in skin is in the range of 4-5 mm [28], so the measured acoustic signals are likely from subcutaneous tissue. An ultrasonic gel was applied on the imaged skin area to provide good contact between the skin and the transducer head during the measurement.

Experimental procedures and subjects
In this exploratory study, four healthy participants (one male and three females between the ages of 26 and 27, identified as subjects A, B, C and D) were recruited. The Research Ethics Committee of Universiti Tun Hussein Onn Malaysia approved the experimental procedures. Prior to the experiment, the volunteers were briefed on the technique, purpose, possible complications, and expected benefits. All subjects reported no known major medical conditions. Upon enrolment, they were asked to sign an informed consent form. The experiment was carried out under two conditions to represent different microcirculatory statuses, i.e., at rest (well perfused microcirculatory system) and under systolic occlusion (impeded skin blood flow). The application of occlusion is intended to block blood flow in the examined arm, and hence to vary the local hemodynamic conditions. A finger pulse oximeter (model no. 1805) was used to confirm changes in the oxygen levels during these experiments. The results showed a drop in the average pulsating blood oxygen saturation from 99 % to 93 % for measurement on the index finger as the inflow of oxygenated blood was impeded. The experiment began with the at rest condition, where each subject was told to place the examined site under modulated beam illumination. Three measurements were taken with a ten-second gap between them. Following that, a systolic pressure of 120 mmHg was applied on the upper left forearm for 60 seconds before the first signal was captured. A total of 18 signals were collected under occlusion conditions, with a waiting time of 10 seconds between measurements. The transducer head was in contact with the skin during the test, as illustrated in Figure 1, and the PA signals were captured using the ultrasonic flaw detector at the time points mentioned above. These signals were saved on a microSD memory card before they were processed and analyzed offline.
The CONSORT diagram of the study is presented in Figure 2.

Data handling and signal augmentation
The collected photoacoustic signals are used to train and test the designed model. There are a total of 105 PA signals collected under both at rest and occlusion conditions, i.e., 3 signals and 18 signals/volunteer from at rest and under occlusion condition, respectively. They are randomly split for training, validation and testing using a ratio of 40/20/40 %. For ease of data handling, measurements from three volunteers (subjects A-C) are used for training and validation (i.e., 60 %), while those from subject D are used for testing. Initially, there are 42 signals available for training the model and 21 for validation. To increase the number of data available for training and validation, data augmentation is used in this work. White Gaussian noise is added to these sets using the AWGN function in MATLAB 2020b. This process produces four additional signals with SNR of 20 dB, 30 dB, 40 dB and 50 dB for each signal, so each original signal yields five signals in total. Hence 210 signals (42 × 5, including augmented data) are available for the training of the model, and 105 (21 × 5) for network validation. The distribution of signals for the training, testing and validation processes is shown in Table 1.
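The noise-based augmentation described above can be sketched as follows. This is an illustrative Python stand-in for the MATLAB AWGN function, not the code used in this work. It assumes that the noise power is set relative to the measured power of each signal, which may differ from the MATLAB options actually used, and the names `add_awgn` and `augment` are hypothetical.

```python
import math
import random


def add_awgn(signal, snr_db, rng):
    """Return a copy of `signal` with white Gaussian noise at the requested SNR (dB)."""
    # Assumption: SNR is defined against the measured mean power of the signal.
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def augment(signals, snr_levels_db=(20, 30, 40, 50), seed=0):
    """Keep each original signal and append one noisy copy per SNR level."""
    rng = random.Random(seed)
    augmented = []
    for signal in signals:
        augmented.append(list(signal))
        for snr_db in snr_levels_db:
            augmented.append(add_awgn(signal, snr_db, rng))
    return augmented
```

Applied to 42 training signals, `augment` returns 42 × 5 = 210 signals, matching the training set size stated above.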

Model training and hyper parameters tuning
This study used the LSTM model for the classification of microcirculatory status owing to its ability to deal with sequential data and learn time series information, and its ease of implementation. The network containing LSTM units shown in Figure 3 is used for transfer learning for the PA signal classification task. This network was validated using the validation set to provide evidence of over- or under-fitting of the model during the training session. The training of the proposed model was performed on a DELL laptop with 64-bit Windows 7 and an Intel® Core™ i3-3110M CPU @ 2.40 GHz. The model was trained using the Adaptive Moment Estimation (ADAM) optimizer with manually tuned hyper parameter values. The epoch number was varied from 100 to 600, while the initial learning rate was increased from 0.0001 to 1 in tenfold steps. Other fixed hyper parameters include a mini-batch size of 8 and a gradient threshold of 0.0001. The model consists of one input of size 378 (i.e., X1, …, X378) feeding into a network of LSTM cells with 30 hidden layers to extract the important temporal features. This is followed by three fully connected (FC) layers of size 30, 15 and 2, respectively. The output of the FC layers is fed to a Softmax layer to classify the input into two classes: class 1 (well perfused skin) and class 2 (impeded blood flow).
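The manual tuning procedure described above amounts to a small grid search. The sketch below shows one way to enumerate the settings and summarize repeated runs. It assumes an epoch step of 100 (only the range 100 to 600 is stated), three repeats per setting, and the sample standard deviation; `train_and_validate` is a hypothetical callable that trains the model once and returns its validation accuracy.

```python
import statistics


def build_grid(epochs=(100, 200, 300, 400, 500, 600),
               learning_rates=(0.0001, 0.001, 0.01, 0.1, 1.0)):
    """All (epoch number, initial learning rate) pairs; the epoch step is assumed."""
    return [(e, lr) for e in epochs for lr in learning_rates]


def summarize_grid(train_and_validate, grid, repeats=3):
    """Run every setting `repeats` times; return {setting: (mean, SD)} of accuracy."""
    summary = {}
    for epochs, lr in grid:
        scores = [train_and_validate(epochs, lr, run) for run in range(repeats)]
        summary[(epochs, lr)] = (statistics.mean(scores), statistics.stdev(scores))
    return summary


def best_setting(summary):
    """Setting with the highest mean validation accuracy."""
    return max(summary, key=lambda setting: summary[setting][0])
```

With the assumed grid there are 6 × 5 = 30 settings, so three repeats correspond to 90 training runs.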

Results and analysis
An example of the PA signal (under at rest condition) and its corresponding noise-added simulated signals at different SNR levels are shown in Figure 4. In this study, the epoch number and initial learning rate were manually chosen to train the model. The model was trained three times for each combination of learning rate and epoch number. The mean ± standard deviation (SD) of the model's percent validation accuracy and the average computing time obtained from varying these parameters are shown in Table 2. The highest validation accuracy of our trained model is 95.60 ± 0.47 %, highlighted in blue. The training progress of this best model is shown in Figure 5(a). Using the optimal hyperparameter set (i.e., epoch number of 400 and learning rate of 0.0001) chosen from Table 2, we evaluated the classification performance of the model trained without the augmentation strategy described in the data handling and signal augmentation section. The training progress of this model is shown in Figure 5(b).
One-way ANOVA test (SPSS 26, IBM® SPSS® Statistics) with a confidence level of 95 % was used to evaluate the differences in the (unaugmented) PA signals collected under the different experimental conditions. A statistically significant difference (p = 0.006) was observed between the two groups. The best performing model chosen from Table 2 is used for the subsequent analysis. The classification performance of this model, chosen based on the highest validation accuracy and short training time in Figure 5(a), is evaluated using the testing dataset, which has no role in the training. The considered evaluation metrics are accuracy (Ac), specificity (Sp) and sensitivity (Sn) given in Equation (1). Figure 6 shows the confusion matrix of the prediction results. Using information from this matrix, the model's accuracy, specificity and sensitivity in the classification of the testing dataset are calculated as 97.6 %, 100 % and 83.3 %, respectively.
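Assuming Equation (1) uses the standard confusion-matrix definitions, the three metrics can be computed as in the sketch below. The counts are illustrative assumptions rather than values read from Figure 6: they are chosen to be consistent with a 42-signal testing set and the reported percentages, with the at rest class treated as the positive class.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, specificity and sensitivity from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    specificity = tn / (tn + fp)   # true negative rate
    sensitivity = tp / (tp + fn)   # true positive rate
    return accuracy, specificity, sensitivity


# Assumed counts (not taken from Figure 6): 6 positive and 36 negative test
# signals, with one positive signal misclassified.
ac, sp, sn = classification_metrics(tp=5, tn=36, fp=0, fn=1)
print(f"Ac = {ac:.1%}, Sp = {sp:.1%}, Sn = {sn:.1%}")
# prints: Ac = 97.6%, Sp = 100.0%, Sn = 83.3%
```

Under these assumed counts, a single misclassified signal out of 42 accounts for both the 97.6 % accuracy and the 83.3 % sensitivity.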
We could find very few studies in the literature that deal with microcirculatory changes in human skin using PA signals and deep learning methods. In Table 3, we report state-of-the-art techniques used for the same purpose as ours. We also include the measurement technique, the deep model considered, and the measurement site to facilitate the comparison of results.

Discussion
The number of epochs and the initial learning rate, which are identified as the most significant parameters in deep learning neural networks [29], were tuned in this study. Table 2 revealed that a low learning rate, complemented by a large epoch number, is needed for better model learning. Similar to findings reported in the past, the validation accuracy in Table 2 decreases as the learning rate increases. It can be clearly observed that the larger the number of epochs, the longer the training time, because the learning algorithm must work through the entire training data set more times. With a low learning rate, each update changes the weights only slightly, so more training epochs are required for the weights of each node to change sufficiently. Even so, there are disadvantages to the use of a large epoch number and a small learning rate in model training, namely the risk of the model overfitting to the training data and a reduced generalization performance. Thus far, no rule of thumb exists for selecting the epochs, hidden layers, learning rate and gradient value for optimal model training. However, this study identified a large epoch number and a small initial learning rate as preferable hyper parameters to ensure that the proposed model is sufficiently trained.
The validation accuracy decreased to the lowest mean of 3.60 ± 0.47 % for a learning rate of 1 and an epoch number of 600. This might be due to the model converging too quickly to a suboptimal solution and then failing to update its weights usefully at each iteration. As a result, the model fails to extract useful temporal features from the data, giving lower accuracy on the validation set. We identified an epoch number of 400 and an initial learning rate of 0.0001 in Table 2 as the optimal set; a further increase in epoch number was shown to reduce the classification performance. The same pattern is hypothesized to hold in the case of a reduced learning rate. The model trained with the limited dataset in Figure 5(b) produced a considerably low validation accuracy of 40 %. The loss function showed a decreasing trend until 120 epochs, after which significant variation in its value is observed. Because of the limited dataset, the model failed to fit the data and did not recognize any patterns. This shows that the data augmentation strategy adopted herein is an acceptable approach for improving the classification performance of the model. In addition, the variation in the predictions in Table 2 is generally low, suggesting good consistency in the model evaluation process.
A statistically significant difference (p < 0.05) was observed between the at rest and occlusion groups. This supports our hypothesis that the PA signal differs between rest and occlusion conditions. Conventional tissue oxygen measurement can be a tedious and demanding process because of the noisy, overlapping and almost indiscernible microcirculation signals. Our results in Figure 6, however, suggest the feasibility of our method for the task of recognising microcirculation performance. We observed a considerably good accuracy of 97.6 % and a sensitivity of 83.3 % in this two-class classification problem. An investigation into the results showed that the misclassified signal has considerably large amplitude values. Even though the signal was collected during the at rest measurement, it has a pattern comparable to those collected during the occlusion experiment. This work does not rule out the possibility that this observation is affected by skin pigmentation. The use of isosbestic wavelengths of hemoglobin may minimize the pigmentation effect while increasing the phase contrast caused by hemodynamic activity.
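The one-way ANOVA comparison between the at rest and occlusion groups can be sketched as follows. This is a minimal illustration of the F statistic only, not the SPSS procedure used in this work. It assumes that each PA signal has first been reduced to a single scalar value, which the text does not specify, and the function name `one_way_anova_f` is hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F statistic, between-group df, within-group df) for one-way ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the samples around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

The p-value is then obtained from the F distribution with these degrees of freedom, as reported by a statistics package; the difference is declared significant at the 95 % confidence level when p < 0.05.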
Meanwhile, a comparison with the state-of-the-art techniques in Table 3 showed the superior performance of our diagnostic method. Unlike the study in [23], our technique is unaffected by ambient light owing to the use of a high power laser source. In addition, spatial frequency domain imaging (SFDI) is reported in [23] to suffer from limited imaging depth and to require rigid control of technique. Our method is also a considerably less computationally expensive and more straightforward approach when compared to [24], where a 3D CNN was used to predict blood flow in rat microvessels. The authors reported an accuracy of 83 %, likely due to optical artefacts unrelated to blood flow, which led to the misclassification of this noise as perfused vessels. Unlike the earlier works in [18][19][20][21][22], which considered long-established 2D CNNs, this work demonstrated the unprecedented use of a 1D LSTM for this prediction task. Even though the performance of our method is comparable or superior to that of the existing methods, we do not preclude the possibility of motion artifacts during scanning, which might affect the quality of the training and classification. We recognize that the latter are also influenced by the insufficiency of labelled data, leading to poor network generalization. Future work would be to gather more signals and to include more convolutional layers in the network for the task of evaluating microcirculation changes in patients.

Conclusion
This study demonstrated the use of the LSTM model and the PA technique for the classification of skin microcirculatory status. Our experiment showed that the proposed model, trained with a limited dataset and our augmentation strategy, produces a considerably good classification accuracy of 97.6 %. This study has also identified several limitations, such as the effect of the limited number of labelled data on the classification results. In addition, an insufficient number of network layers for extracting important features could also have compromised the model performance. Future work includes modification of the network architecture to include more convolutional layers to extract the features required for classification. The improved classifier would be useful as a tool for rapid microcirculatory status classification in oxygen saturation research.