Top-K Human Activity Recognition Dataset

The availability of Smartphones has increased the possibility of self-monitoring to increase physical activity and support behavior change to prevent obesity. Self-monitoring on a Smartphone comes with challenges such as the unavailability of lightweight classification algorithms and of personalized datasets that capture bodily postures, subject sensitivity, and limited storage and computational power. Moreover, most classification algorithms, such as Support Vector Machines, C4.5, Naïve Bayes and K Nearest Neighbor, rely on large datasets to accurately predict human activities. In this paper, we present a compressed top-k personalized dataset collected from 13 participants to reduce computational cost while increasing accuracy. We benchmarked our dataset and found that it is suitable for tree-oriented algorithms such as Random Forest, C4.5 and Boosted Tree, with accuracy and precision of 100%, unlike KNN, Support Vector Machines and Naïve Bayes.


Introduction
In Human Activity Recognition (HAR), it is important to select a classification algorithm that suits the collected dataset in order to improve classification accuracy and precision relative to computational cost. Normally, HAR is divided into three steps: first, sensory data collection from accelerometer and gyroscope sensors, and labeling of the collected sensor data as human activities; second, feature extraction to remove noise and missing values from the dataset, also known as pre-processing; and third, classification based on machine learning algorithms such as K Nearest Neighbor (KNN), Support Vector Machine (SVM), C4.5, Naïve Bayes, Random Forest and Boosted Tree [1] [2]. Most algorithms (KNN, SVM, C4.5 and Naïve Bayes) rely on larger datasets, which has implications for resource-constrained environments such as the Smartphone. The availability of sensory datasets for benchmarking real-time HAR systems is still lagging behind due to the shortage of personalized datasets [3][4] [5].
Recently, some HAR researchers collected and published their datasets online to allow other researchers to benchmark their HAR systems [6][7] [8]. However, some of the datasets lack personal attributes (e.g., age, height, weight, body mass index) and are in a high-dimensional space, which requires high computational power and large memory [4]. The latter poses a challenge in providing HAR systems on resource-constrained platforms such as the Smartphone. Recent HAR models are mindful of the resource-constrained Smartphone vis-à-vis classification requirements [2]. However, most of the datasets lack a common feature that completely captures bodily posture to accurately classify static and dynamic human activities; hence the subject-sensitivity challenge still persists. In this paper, we present a compressed top-k personalized dataset based on the harmonic motion principle, which mimics walking transitions, to augment the Signal Magnitude Vector and Tilt Angle proposed in [9] and [10]. Our contribution is threefold. First, we propose a novel gravimeter filtering technique to select the best usable classification features. Second, our use of a compressed top-k dataset leads to optimal use of the Smartphone's limited resources. Third, the inclusion of harmonic motion to augment the signal magnitude vector (SMV) opens up the possibility of using a single HAR training dataset for multiple subjects. The rest of this paper is structured as follows: in Section II, we present related work. In Section III, the conceptual model to collect the top-k personalized dataset is presented. The selection and benchmarking are presented in Section IV. Lastly, the conclusion and future work are presented in Section V.

Related Work
HAR has been studied for decades, but there is still a limited number of publicly available HAR datasets for machine learning. Researchers in [6] investigated and established that there is a lack of baseline HAR datasets. To close the gap, they collected and published a dataset called the Physical Activity Monitoring for Aging People dataset (PAMAP2), obtained from 9 subjects. They used 3 Colibri wireless Inertial Measurement Unit (IMU) devices, attached to the dominant arm wrist, the ankle and the chest of each participant, at a frequency rate of 100 hertz. Researchers in [7] collected and published their dataset, called the University of Southern California Human Activity dataset (USC-HAD), from 14 subjects wearing a Motion Node device on their waist, which is connected to a laptop, at a frequency rate similar to that in [6]. On the other hand, the authors in [8] and [11] found that although public datasets are gradually being introduced, there is still a lack of personalized Smartphone datasets. To address this problem, the researchers in [8] and [11] published datasets collected using a Samsung Galaxy S II Smartphone from 30 subjects at a frequency rate of 50 hertz. However, their dataset suffers from a dimensionality limitation and is thus inappropriate for the resource-constrained Smartphone platform [4]. A similar personalized dataset was proposed and published in the UCI Machine Learning Repository by [12].
The dataset, named dataset-har-PUC-Rio-Ugulino (PUCRO), was collected from 4 healthy subjects and consists of 5 classes (sitting-down, standing-up, standing, walking, and sitting). Four sensors were used to collect 165634 rows of data consisting of 6 personal attributes (name, gender, age, bmi, height, weight) and 13 groups of tri-axial values (X, Y, Z). A more representative personalized dataset called Wireless Sensor Data Mining (WISDM) was proposed by [13]. The WISDM dataset was donated and published by [14]. They used two sensors (accelerometer and gyroscope) on a Smartphone and a Smartwatch to collect the dataset from 51 recruited subjects performing 18 different human activities using a Samsung Galaxy S5, a Google Nexus 5 and an LG G Watch. Each of the 51 collected files consists of 6 attributes (subject code, activity label, time-stamp, X, Y and Z tri-axial values) and is partitioned as (Non-hand-oriented activities: {walking, jogging, stairs, standing, kicking}, Hand-oriented activities (General): {dribbling, playing catch, typing, writing, clapping, brushing teeth, folding clothes} and Hand-oriented activities (eating): {eating pasta, eating soup, eating sandwich, eating chips, drinking}). However, it lacks features capable of capturing bodily postures to mimic walking patterns completely [10]. Similar to our study in [9] is the work of [15], who found that learning new activities to adapt to new users' needs is challenging due to the shortage of annotated datasets. They proposed Feature-Based and Attribute-Based learning to leverage the relationship between existing and new activities to compensate for the shortage of datasets. A Radial Basis Function was used to feed the SVM algorithm with 11 attributes for classification [15]. They evaluated their technique and found that it outperforms other traditional HAR models in recognizing new activities using a limited training dataset.
However, the technique does not address the shortage of personalized datasets; it only detects new activities from an existing dataset. More recent is the technique proposed in [16] and [17], which allows each subject to enter height, weight and BMI and employs a machine learning algorithm to filter social media using these values to recommend physical activity plans. We present a personalized top-k dataset that relies on harmonic motion to capture bodily posture and so address the shortage of personalized datasets.

A Conceptual Model to Collect Top-K Personalized Dataset
We present our personalized model, given in Fig 1, to collect a top-k personalized dataset. Convenience sampling was followed to recruit 13 staff and students to participate in the dataset collection process. The model consists of (a) sensory data collection, (b) pre-processing (filtering and data segmentation, and feature extraction and annotation), (c) top-k personalized data collection, and (d) selection of a suitable classification algorithm.

Sensory data collection process
In this process, we present sensory data collection using the cost-effective Smartphone accelerometer, which is available throughout the day in close proximity to subjects [10]. All subjects are expected to carry the Smartphone inside the front pocket, similar to [13] [14], as portrayed in Fig 2. Using the front pocket, we can properly capture human leg postures during standing and the back-and-forth swinging of the legs, which resembles simple harmonic motion [10], as shown in Fig 2 and Fig 3. We describe harmonic motion as a uniform projection of circular motion along a diameter of a circle whose center of mass is given as the pivot point of a hanging bob relative to the restoring gravitational force. Hence, we fix the Smartphone orientation based on the tilting angle (TA) by treating the Smartphone as a mass hanging in a weightless front pocket.
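For reference, the textbook relations behind this pendulum picture can be written as below. This is a sketch of standard simple-harmonic-motion formulas rather than a reproduction of the paper's numbered equations; the symbols A (amplitude), ω (angular velocity), φ (phase), T (period), L (pendulum length) and g (gravity) follow common usage, and pairing cosine with x and sine with y is an assumed convention.

```latex
\begin{align}
x(t) &= A\cos(\omega t + \varphi), & y(t) &= A\sin(\omega t + \varphi), \\
\omega &= \frac{2\pi}{T}, & T &= 2\pi\sqrt{\frac{L}{g}}
\;\Longleftrightarrow\; L = \frac{g}{\pi^{2}}\cdot\frac{T^{2}}{4}.
\end{align}
```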

Pre-processing
In this process, we solve the Smartphone orientation (Rest, Portrait or Landscape) to extract signal values using both the signal magnitude vector (SMV) and TA, because SMV alone is inadequate to capture different bodily postures [9] [10]. We determine the Smartphone orientation from the size of the arc produced by the restoring gravity force at that time. The gravity force causes the bob to shift in (x, y) from equilibrium (P0) to and from the extreme positions (P1:P2; P3:P4) due to leg transitions, as depicted in Fig. 3 and Fig. 4. We compute the radius as a magnitude using the SMV equation (1), based on the accelerometer tri-axial values as point R (x, y, z).
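The magnitude and orientation step can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: SMV is taken as the usual Euclidean norm of the tri-axial sample, and the tilt angle of each axis is assumed to be arcsin(axis / SMV), so that the axis aligned with gravity approaches 1.570 rad. The function names and the "largest absolute angle wins" decision are hypothetical choices made for this example.

```python
import math

HALF_PI = math.pi / 2  # about 1.570 rad, the upper bound quoted in the text


def signal_magnitude_vector(x, y, z):
    """Magnitude of one tri-axial accelerometer sample R(x, y, z)."""
    return math.sqrt(x * x + y * y + z * z)


def tilt_angles(x, y, z):
    """Return (alpha, beta, gamma) in radians for the X, Y and Z axes.

    ASSUMPTION: each angle is arcsin(axis / SMV); the paper's own
    tilt-angle equation is not reproduced here.
    """
    smv = signal_magnitude_vector(x, y, z)
    if smv == 0.0:
        raise ValueError("a zero-magnitude sample has no orientation")
    # Clamp guards against floating-point overshoot outside [-1, 1].
    return tuple(
        math.asin(max(-1.0, min(1.0, axis / smv))) for axis in (x, y, z)
    )


def orientation(x, y, z):
    """Pick the orientation whose absolute tilt angle is largest."""
    alpha, beta, gamma = (abs(angle) for angle in tilt_angles(x, y, z))
    largest = max(alpha, beta, gamma)
    if largest == alpha:
        return "Landscape"
    if largest == beta:
        return "Portrait"
    return "Rest"
```

With gravity reported entirely on the y-axis, for example `orientation(0.0, 9.8, 0.0)`, the sketch classifies the phone as Portrait.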
The accelerometer measures an acceleration due to gravity of about 9.8 m/s² when the phone is in portrait position around the y-axis [19]. Thus, the angle of interest (α, β or γ) is determined as the largest theta to find the phone orientation associated with SMV and gravity, similar to the Euler triple angles shown in Fig. 4 [19] [20]. TA is defined as the angle between the positive z-axis and the gravitational vector g if the phone is in Portrait. Thus, the largest theta is given as the size of the arc in radians, less than $\pi/2 \approx 1.570$ [19][20], defined by equation (2), where axis is X, Y or Z depending on the orientation of the tri-axial values around the α, β and γ angles. That is, if the absolute value around alpha (α) approximates 1.570, the orientation is Landscape; else if beta (β) approximates 1.570, the orientation is Portrait; otherwise the phone is in Rest around gamma (γ).
a) Filtering and data segmentation: Raw tri-axial features from the accelerometer are not free of the restoring gravity force [3] [20]. Based on the restoring force relative to the maximum TA around (α, β and γ) [19], we propose a unique gravimeter filtering technique associated with the harmonic motion period defined by equation (3), where L is a magnitude expressed as $L = \frac{g}{\pi^2} \times \frac{T^2}{4}$ in the presence of the gravity approximation of 9.8 m/s², for which $g/\pi^2 \approx 1$ (0.994). We therefore rearranged equation (3) into equation (4). Consequently, if the gravimeter approximates 9.8 m/s², so that $g/\pi^2 \approx 1$ (0.994), the x and y extreme positions of the harmonic motion are sinusoidal in time t (period) and are computed by equations (5) and (6) [10].
In equations (5) and (6), (ωt + φ) is the maximum TA as theta, A is the magnitude (L) given by SMV, and ω (omega) is the angular velocity of rotation from equilibrium (P0) to and from positions (P1:P2) and (P3:P4) (see Fig. 3), defined by equation (7). Therefore, we collect the confident harmonic motion values (period, omega, x and y) as confident features into an N×4 matrix, where all null values are discarded using equation (4). Equations (2) to (7) are combined into our unique gravity filtering technique, defined by equation (8) and expanded in equation (9), where the gravimeter filtering approximation function is based on gravity relative to the maximum TA at a specific period, and N is the window segment that stores the group of confident features (period_{i,j}, omega_{i,j}, x_{i,j} and y_{i,j}) in row i until the maximum of N Hertz (Hz) is reached. We combine equation (8) and equation (9) into Gravimeter Filtering Technique Algorithm 1, which requires accelerometer sensor values (accel.x, accel.y, accel.z) as input in line 1. During sensory data collection, the while loop in line 5 of Algorithm 1 tests whether the row count has reached the window size. If the window size has not been reached, the SMV, the TA around (α, β, γ), the gravity pull and theta are calculated based on the maximum TA in lines 11 to 21. The harmonic motion attributes (period, omega, X and Y) are computed from the tri-axial accelerometer values in lines 6 to 10 to capture postural features of the human legs. The harmonic motion attributes are stored in the 50×4 matrix only if their gravimeter approximates gravity ($g/\pi^2$); otherwise they are discarded in lines 29 to 34 of Algorithm 1.
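The window-filling logic described above can be illustrated with a short Python sketch. It is written under stated assumptions and is not Algorithm 1 itself: the period of each candidate sample is taken as a given input (how the paper derives it per sample is not recoverable here), the gravimeter is the standard pendulum relation solved for g, and the tolerance of 0.5 m/s² and all function names are hypothetical.

```python
import math
from typing import Iterable, List, Optional, Tuple

G = 9.8           # m/s^2, gravity approximation used in the text
WINDOW_SIZE = 50  # rows per window (the 50x4 matrix)

Row = Tuple[float, float, float, float]  # (period, omega, x, y)


def gravimeter(length: float, period: float) -> float:
    """Pendulum relation T = 2*pi*sqrt(L/g) solved for g."""
    return 4.0 * math.pi ** 2 * length / period ** 2


def harmonic_row(length: float, period: float, theta: float,
                 tolerance: float = 0.5) -> Optional[Row]:
    """Return (period, omega, x, y), or None if the sample is rejected.

    ASSUMPTION: a sample is kept only when its gravimeter estimate lies
    within `tolerance` of G; the tolerance value is hypothetical.
    """
    if length <= 0.0 or period <= 0.0:
        return None
    if abs(gravimeter(length, period) - G) > tolerance:
        return None
    omega = 2.0 * math.pi / period
    # x and y use the magnitude as amplitude and theta as the phase angle.
    return (period, omega, length * math.cos(theta), length * math.sin(theta))


def fill_window(samples: Iterable[Tuple[float, float, float]],
                window_size: int = WINDOW_SIZE) -> List[Row]:
    """Collect accepted rows until the window holds `window_size` rows."""
    window: List[Row] = []
    for length, period, theta in samples:
        if len(window) >= window_size:
            break
        row = harmonic_row(length, period, theta)
        if row is not None:
            window.append(row)
    return window
```

Rejected samples are simply skipped, so the window only ever contains rows whose gravimeter estimate is close to gravity.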

Algorithm 1 (excerpt): 5. Set row ← 0

[…] [25]. Hence, we extracted them from the 50×4 matrix for each feature (period, omega, X, Y) as valley, peak and average [24]. We compress every 50 instances in K iterations into the top-k best classification features, because a reduced dataset improves training and classification accuracy [6][8] [10]. We set the size of K to 20 instead of storing every 50-by-4 real-time instance in K iterations. We extract K time-domain features as the Human Activity Model (HAM), consequently producing the top-k compressed features per human activity as a K×13 matrix. We also included the physical activity intensity level for each human activity, known as MET, published by [26]. The K×13 matrix is automatically annotated with the commonly used predefined static and dynamic human activities listed in Table 1 by Algorithm 2, which requires a set of human activity labels with corresponding MET values given as a 12×2 matrix. The subject selects each human activity he or she wants to create as a HAM in line 4. Subsequently, the subject slides the Smartphone into the front pocket and performs the selected human activity for 2 minutes. In line 7, Algorithm 1 is called and returns the 50×4 matrix of harmonic attributes; subsequently the maximum, minimum and mean are extracted per K iteration and stored.

Algorithm 2 (excerpt): Set JSONTraining empty, append […]
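The compression of each 50×4 window into one 13-value row can be sketched as follows. This is an illustrative reading of the description above, not the authors' Algorithm 2: the column order (valley, peak, average for each of period, omega, X and Y, followed by MET) and the function names are assumptions, and the MET value in the usage note is a made-up placeholder.

```python
from statistics import fmean
from typing import List, Sequence, Tuple

Row = Tuple[float, float, float, float]  # (period, omega, x, y)


def compress_window(window: Sequence[Row], met: float) -> List[float]:
    """Reduce one window to [valley, peak, average] per feature plus MET.

    Four features times three statistics plus MET gives 13 values.
    ASSUMPTION: the column order is hypothetical.
    """
    if not window:
        raise ValueError("cannot compress an empty window")
    compressed: List[float] = []
    for column in zip(*window):  # iterate over period, omega, x, y columns
        compressed.extend((min(column), max(column), fmean(column)))
    compressed.append(met)
    return compressed


def human_activity_model(windows: Sequence[Sequence[Row]], met: float,
                         k: int = 20) -> List[List[float]]:
    """Build a top-k model: one compressed row for each of the first k windows."""
    return [compress_window(window, met) for window in windows[:k]]
```

For instance, `human_activity_model(windows, met=3.5)` would return at most 20 rows of 13 values each, instead of 20 × 50 raw rows of 4 values.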

Top-K personalized dataset collection
We randomly selected 13 subjects to collect a personalized HAM every 2 minutes, similar to [6] [7]. We implemented Algorithm 1 and Algorithm 2 as part of our already developed training prototype published in [10]. The training prototype User Interface (UI) screens are sequenced from number 1 to number 7 to guide a subject in collecting a small personalized dataset (see Fig. 5). We installed our user-friendly personalized training prototype on a Samsung Galaxy Grand Prime+ Smartphone. All the commonly used human activities listed in Table 1 are preloaded as a dropdown menu, as shown in UI-1 and UI-4 in Fig 5.
Fig. 5. Top-k data collection prototype developed by [10]
We implemented a 10-second delay, indicated in UI-4 in Fig 5, before recording each HAM activity. Every time, the selected and performed human activity is removed from the pre-loaded menu to prevent replication. The subject must place the Smartphone inside his or her right front pocket under the supervision of the researcher (see Fig. 6). On every occasion, the researcher selects a specific human activity from the dropdown menu. The subject slots the Smartphone into the front right pocket (see Fig 9), and the start sound is triggered after 10 seconds. A subject is given a 4-minute break between human activities.
The subject starts to perform the selected activity to collect HAM features until the stop sound is triggered after 2 minutes. The subject then stops and gives the Smartphone to the researcher, and our prototype prompts the researcher for the next human activity, as indicated in Fig 8 UI-5. The researcher selects the next human activity until all the pre-loaded human activities are exhausted. Thereafter, all generated HAM features with labels are automatically written to the SD card (see Appendix A). All 13 collected top-k dataset files were transferred and merged into a single Comma Separated Values (CSV) file called Real-time Personalized dataset, with 2860 rows as listed in Table 2.
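Merging the per-subject files into one CSV can be sketched as below. This is a minimal illustration that assumes every subject file carries the same header row; the column names in the usage note are hypothetical and do not reflect the layout in Appendix A. As a consistency check on the figures in the text, K = 20 rows per activity for each of 13 subjects gives the 260 rows per human activity quoted in the benchmarking section.

```python
import csv
import io
from typing import Iterable, List, Optional


def merge_subject_files(files: Iterable[str]) -> str:
    """Merge CSV texts that share one header into a single CSV text.

    ASSUMPTION: each input is the full text of one subject's file and
    starts with the same header row; mismatched headers are rejected.
    """
    merged_header: Optional[List[str]] = None
    rows: List[List[str]] = []
    for text in files:
        reader = csv.reader(io.StringIO(text))
        header = next(reader, None)
        if header is None:
            continue  # skip empty files
        if merged_header is None:
            merged_header = header
        elif header != merged_header:
            raise ValueError("subject files do not share the same header")
        rows.extend(row for row in reader if row)
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    if merged_header is not None:
        writer.writerow(merged_header)
    writer.writerows(rows)
    return output.getvalue()
```

Calling it with two texts such as `"activity,period_mean\nwalking,1.2\n"` and `"activity,period_mean\nsitting,0.0\n"` yields one header followed by both data rows.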

Selection of Suitable Classification Algorithm
In this section, we selected 6 state-of-the-art classification algorithms (C4.5, KNN, Support Vector Machines (SVM), Boosted Trees (BT), Random Forest (RF) and Naïve Bayes (NB)) to benchmark our proposed top-k dataset in terms of accuracy and precision and to select the most suitable classification algorithm, similar to [6]. The preliminary results presented by [6] using their own personalized PAMAP2 dataset on 4 algorithms (C4.5, KNN, SVM and Naïve Bayes) revealed accuracies of 85.03%, 87.62%, 62.31% and 74.14% respectively. In this study, we selected the R programming language because it is free, easy to use, and allows researchers to use the predefined algorithms and confusion matrix [26][27] listed in Table 3. The benchmarking algorithm R-Source-Code G.1 proposed by [27][28] is used to simulate and compare the existing R machine learning algorithms presented in Table 3. Firstly, we loaded the libraries required by the classification models described in Table 3 in lines 3 to 7 of the implemented benchmarking code. We used 10-fold cross validation, since the dataset is limited to 260 instances per human activity, to determine the reliability of each of the 6 selected models [4]. For reproducibility and future comparison, we implemented k-fold cross validation, given as R-Source-Code G.1, to compare and select the best classification algorithm: our collected personalized dataset is partitioned into K equal subsets, such that all simulated algorithms use K-1 subsets as the training dataset and 1 subset as the testing dataset. The cross-validation results are given in precision and accuracy and summarized in accuracy and precision graphs. We ran the implemented R-Source-Code G.1 for each of the 6 algorithms (SVM, C4.5, KNN, NB, RF and BT) listed in Table 3. The work in [30] indicates that tree-oriented algorithms perform far better than simpler algorithms, with accuracy and precision of 98% using a small personalized dataset [29]. The results are summarized as precision and accuracy in Fig. 7 and Fig. 8.
The results presented in Fig. 7 and Fig. 8 show improved accuracy and precision of 100% with the RF and C4.5 algorithms in all static and complex human activities, with 100% reported TP in all human activities and no TN, FP or FN. RF outperformed its predecessor BT, which scored precision and accuracy of 80% and 90% respectively. Our results confirm that tree-oriented algorithms (RF, C4.5 and BT) are more suitable for the smallest training datasets than SVM, KNN and Naïve Bayes. Compared to [6][23] [29], the results show a significant drop from 87.62% to 40% and from 74.14% to 60% in accuracy and precision on simpler algorithms with a small dataset, and confirm the existing gap between tree-oriented and simpler algorithms (SVM, KNN and Naïve Bayes) reported in [6][23] [29]. These results are similar to the results reported by [23], where Naïve Bayes scored accuracy and precision of 42.30% and 47.61% respectively using a smaller and reduced dataset. However, Fig. 7 and Fig. 8 show improved accuracy and precision above 90% on static human activities in all simpler algorithms due to regularized harmonic features. Hence, personalized static human activities can be replicated to multiple subjects.
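The K-fold partitioning described above can be illustrated with a short Python sketch. It is not the R-Source-Code G.1 used in the study; it only shows how row indices are split into K near-equal subsets so that each subset serves once as the testing set while the other K-1 form the training set. The function name, the fixed seed and the unstratified shuffle are assumptions made for this example.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_rows: int, k: int = 10,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for each of the k folds.

    ASSUMPTION: rows are shuffled once with a fixed seed and split
    without stratifying by activity label.
    """
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Striding gives k folds whose sizes differ by at most one row.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

With the 2860-row dataset and K = 10, each fold holds 286 testing rows and 2574 training rows.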
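For readers who want to reproduce the two reported metrics from raw predictions, a minimal Python sketch follows. It uses the standard definitions (accuracy as the share of correct predictions, per-class precision as TP / (TP + FP)) and is not the confusion-matrix routine used in the R benchmarking code; the function names and the activity labels in the usage note are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def accuracy(actual: Sequence[str], predicted: Sequence[str]) -> float:
    """Share of instances whose predicted label equals the actual label."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def precision_per_class(actual: Sequence[str],
                        predicted: Sequence[str]) -> Dict[str, float]:
    """Per-class precision: TP / (TP + FP) for every predicted label."""
    if len(actual) != len(predicted):
        raise ValueError("inputs must be of equal length")
    true_positive: Counter = Counter()
    predicted_total: Counter = Counter()
    for a, p in zip(actual, predicted):
        predicted_total[p] += 1  # TP + FP for label p
        if a == p:
            true_positive[p] += 1
    return {label: true_positive[label] / predicted_total[label]
            for label in predicted_total}
```

For example, with actual labels `["walking", "walking", "sitting", "sitting"]` and predictions `["walking", "sitting", "sitting", "sitting"]`, accuracy is 0.75, precision for walking is 1.0 and precision for sitting is 2/3.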

Conclusion and Future Work
In this paper, we presented a personalized model to collect a top-k personalized dataset and to select a suitable classification algorithm. Harmonic motion based on the simple pendulum was used to augment the Signal Magnitude Vector to capture human bodily postures. We proposed a novel filtering technique based on a gravimeter to remove noise in order to collect a personalized dataset from 13 subjects; the dataset was benchmarked using state-of-the-art machine learning algorithms. We found that our dataset is suitable for tree-oriented algorithms such as RF, C4.5 and BT, because each feature creates tree-like hierarchical structures. However, the dataset is not suitable for simpler algorithms such as KNN and Naïve Bayes. In future, we intend to propose a hybrid model combining KNN and Naïve Bayes for the smallest datasets.

Department at University of the Western Cape in Cape Town, South Africa from 2004 to 2009. He was a Lecturer at University of Nairobi in Nairobi, Kenya from 1999 to 2000. He is a member of IITPSA, IEEE and IAENG. He has been an NRF-rated researcher since 2015. He has published three books, 26 journal articles, 7 book chapters, over 60 refereed conference papers, and one patent.