Software Development Framework for Real-Time Face Detection and Recognition in Mobile Devices

With the rapid adoption of Android OS in mobile devices and related products, face recognition has become an essential feature that gives mobile devices strong personal identity authentication. In this paper, we propose an Android-based software development framework for real-time face detection and recognition using the OpenCV library, which is applicable to a range of mobile applications. First, Gaussian smoothing and gray-scale transformation are applied to preprocess the source image. Then, the Haar-like feature matching method is used to describe the characteristics of the operator and obtain the face characteristic value. Finally, a normalized matching method is used to match the face against the face database. To achieve face recognition on the Android platform, JNI (Java Native Interface) is used to call the native OpenCV library. The proposed system is tested in real time on two different brands of smart phones, and the results show that the average success rate across both devices is 95% for face detection and 80% for face recognition.

Keywords: Authentication, Image processing, Wearable, Framework, JNI, OpenCV, Personal identity, Smart phones.


Introduction
over the Internet. Face recognition is an emerging biometric technology, mainly used in intelligent robots, smart homes, military security systems, and similar applications. The major challenge for a face recognition system is the identification problem: the human face has dynamic biological features, so the faces of different people can have similar structures, and the same face varies with the observation angle [4]. In addition, factors such as occlusion (masking) and the degree of user cooperation during authentication make face recognition more difficult. As an intrinsic human attribute, facial features show strong individual differences and are easy to collect. They form a basis for identity verification and authentication that is safer and more reliable than traditional methods. Face recognition and authentication are useful in several applications related to multimedia processing, where cameras and mobile devices are widely used. They can save valuable time and resources and improve the user experience, especially during cumbersome registration processes and when logging into mobile terminals remotely.
In several cases, a small variation in facial condition affects the accuracy of a facial recognition system, especially when a heavily populated database is used. This is why facial recognition is less widely used in security systems than other biometric systems such as fingerprint or iris recognition. Face recognition is an identification method based on physiological characteristics: a computer extracts facial features, and authentication is decided according to the extracted characteristics. Over the past several years, researchers have developed a large number of applications for real-time object detection, face recognition, text recognition, and currency bill identification [5,6]. Earlier work focused on popular methods for object and face detection. Chen and Yuille [7] described object detection using a cascade structure and AdaBoost classifiers based on Haar basis functions, the approach introduced by Viola and Jones [8]; Eigenfaces are based on the Turk and Pentland model [9]. Other researchers implemented Principal Component Analysis (PCA), or Eigenfaces, for face recognition in order to perceive facial expressions, emotions, and gestures. In recent years, researchers have also developed face detection and face recognition algorithms for use by visually impaired people [10,11]. Most face detection methods use the algorithm proposed by Viola and Jones [8]. In these works, Haar cascades and Principal Component Analysis with Eigenfaces were used to achieve the detection and recognition objectives.
Recently, work on real-time face detection and its role in mobile applications has become widely popular among programmers, developers, and researchers. The authors of [12] proposed a client-server framework in which a face detection and tracking application is designed for Android mobile devices. In [13], an algorithm is designed to identify facial features on an Android mobile platform; it is based on an anthropometric face model and box-blur filtering. Similarly, other researchers proposed methods for emotion recognition on Android smart phones based on the users' heart rate and speech, obtained using the built-in camera and microphone, respectively [14]. An emotion recognition framework that analyzes facial expressions is presented in [15]. The main focus of that work is to identify emotions in complex environments, such as those with variation in lighting and device movement. A related application that recognizes and analyzes the user's audio and its characteristics on smart phones is presented in [16]. This application is developed on the Android platform with the primary goal of efficient real-time emotion recognition. Regarding the development environment, several researchers have developed applications on the Android platform since early 2010. An experience report on a development environment for Android applications based on the Eclipse IDE and open source tools can be found in [17]. The development of a face detection and recognition application for Raspberry Pi and Android is described in [18]. However, most earlier works on Android, face recognition, and emotion recognition fail to generalize the approach into an Android-based software framework for face recognition that can be applied to several applications and devices for privacy protection, user security, user authentication, and fraud detection.
Considering these developments, in this paper we explore the tools and methods necessary for implementing a generalized Android-based software development framework for real-time face detection and recognition in mobile applications. First, face detection based on AdaBoost and Haar features is performed. Then the eigenface extraction algorithm of OpenCV is used to extract the features, and the extracted eigenface is compared with the saved eigenface. If the similarity exceeds the threshold, the face is identified as belonging to the same person.
The paper is organized as follows. Section 2 introduces the proposed system framework and detailed design. The image preprocessing, Gaussian smoothing, gray-scale transformation, and binarization steps are described in that section, along with Haar-like features and the integral image. The details of specific face recognition are also presented there. Section 3 presents the development environment and implementation, including the setup of the Android development environment and details of OpenCV, JNI, and the NDK. Section 4 presents the results obtained for face detection and recognition. Finally, the conclusion is presented in Section 5.

System Framework and Detailed Design
The face recognition system proposed in this paper includes the steps shown in Figure 1 and Figure 2. First, the image is captured by a camera capable of capturing either video or photographs; a camera that captures photographs is more suitable for accurate recognition. Second, because face detection is in general a complex process, the system tries to standardize the captured image so that it has characteristics similar to the images previously stored in the gallery. This is required because most captured images contain a random background or other faces. Third, feature extraction produces a mathematical representation called the biometric reference. This step forms the basis for face recognition. The final step is model comparison, where the biometric reference is compared with the models of known faces in the gallery. Declaring an identity means establishing a close correspondence between two references, which is often carried out by a human.
Figure 1 gives a flowchart of the overall system, and Figure 2 shows the flowchart of the steps involved in face detection and recognition. The process starts by capturing a dynamic image with the system camera and converting it to a static image. The system then locates the face position in the obtained image using the contour symmetry detection method. The image region containing the face is filtered out, and several steps follow: Haar-like feature matching is applied to extract the facial feature information [19], the extracted feature data is compared with the information in the face database, and the normalized squared difference matching method in the OpenCV library is used to perform specific face recognition, as shown in Figure 2.

Image preprocessing
To reduce image noise, which may hinder feature extraction, detection, and recognition in later stages, the image is preprocessed step by step using a combination of algorithms (Figure 3). The preprocessing involves four steps: Gaussian smoothing, gray-scale transformation, contrast enhancement, and binarization. The image is de-noised with a Gaussian smoothing filter and converted to grayscale with the weighted average method; contrast enhancement is then realized with the local mean and standard deviation algorithm. Finally, the local adaptive binarization method is applied to the processed grayscale image.

Fig. 3. Steps involved in image preprocessing: Gaussian smoothing, gray-scale transformation, contrast enhancement, and binarization.
The reasons for choosing Gaussian smoothing are as follows. Interference from external factors, such as irregularities and noise, is practically inevitable when acquiring video or images. It leads to loss of image information and data corruption during preprocessing, and affects image quality in subsequent steps such as feature extraction, detection, and identification. Image noise filtering is therefore essential. Compared with smoothing methods such as median filtering and adaptive filtering, the Gaussian smoothing filter eliminates noise in both the spatial and the frequency domain. Gaussian smoothing not only enhances the significant low-frequency information but also retains the edge contours of the image.
Contrast enhancement is applied as follows. Based on the brightness level of each point and the statistics of the pixel levels, the pixel information at each point is compared using a clustering method. The method compares bright and dark points and increases the difference between pixels, so that the contrast is enhanced. Using the local mean and standard deviation algorithm for contrast enhancement avoids excessive contrast in other high-frequency parts.
The Haar feature algorithm, SIFT, and SURF all operate on grayscale images, whereas the generalized Hough transform is more suitable for detecting the whole face. Haar features are better suited to face detection, and SIFT and SURF are more complicated than Haar; thus the Haar feature algorithm is used for face recognition in this paper.
Gaussian smoothing: Image acquisition is easily distorted by irregular noise resulting from various environmental factors, so the image is processed with a Gaussian smoothing filter. The smoothing process reduces image noise while retaining the main edge contours. The two-dimensional Gaussian function used for smoothing is
$$G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right) \tag{1}$$
where x and y are the pixel coordinates within the template and σ is the smoothness parameter. The larger σ is, the stronger the smoothing. The resulting filter kernel also exhibits Gaussian distribution characteristics, and the processed image has enhanced low-frequency information.
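As an illustration of the smoothing step, the following minimal Python sketch samples the two-dimensional Gaussian function on a square template and applies it to a grayscale image stored as a list of rows. The function names `gaussian_kernel` and `smooth`, the border handling (edge pixels are replicated), and the normalization of the kernel to unit sum are assumptions made for this example; they are not taken from the implementation described in this paper, which relies on OpenCV.

```python
import math


def gaussian_kernel(size, sigma):
    """Sample exp(-(x^2 + y^2) / (2 sigma^2)) on a size x size grid and
    normalize it so that the weights sum to 1 (the constant factor
    1 / (2 pi sigma^2) cancels in the normalization)."""
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd integer")
    half = size // 2
    kernel = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def smooth(image, kernel):
    """Filter a 2-D list with the kernel, replicating border pixels."""
    h, w = len(image), len(image[0])
    half = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr = min(max(r + dr, 0), h - 1)  # clamp row index
                    cc = min(max(c + dc, 0), w - 1)  # clamp column index
                    acc += image[rr][cc] * kernel[dr + half][dc + half]
            out[r][c] = acc
    return out
```

A larger `sigma` spreads the weights over more neighbours and therefore smooths more strongly, at the cost of blurring fine detail.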
Gray-scale transformation: The color image is converted to a grayscale image to simplify subsequent processing and face detection. The R (red), G (green), and B (blue) values of each pixel are converted to a gray value using the weighted average method:
$$\mathrm{Gray} = 0.299\,R + 0.587\,G + 0.114\,B \tag{2}$$
where Gray is the gray value.
Contrast enhancement: Contrast enhancement processes the grayscale value of every point in the smoothed image separately. By comparing grayscale values, computing levels, and comparing differences, the difference in grayscale value among the points is made more significant. The local mean and standard deviation algorithm computes the local average as the low-frequency part and amplifies the remaining high-frequency part. With x(i, j) denoting the gray value at point (i, j), the local mean over a (2n+1) × (2n+1) window centred at (i, j) is
$$m_x(i, j) = \frac{1}{(2n+1)^{2}} \sum_{k=i-n}^{i+n} \sum_{l=j-n}^{j+n} x(k, l) \tag{3}$$
and the local variance is defined as
$$\sigma_x^{2}(i, j) = \frac{1}{(2n+1)^{2}} \sum_{k=i-n}^{i+n} \sum_{l=j-n}^{j+n} \left[x(k, l) - m_x(i, j)\right]^{2} \tag{4}$$
where σ_x(i, j) is the local standard deviation. The enhanced pixel value f(i, j) corresponding to x(i, j) can be expressed as
$$f(i, j) = m_x(i, j) + G(i, j)\left[x(i, j) - m_x(i, j)\right] \tag{5}$$
The gain function G(i, j) is not arbitrary. Under normal circumstances, G(i, j) is greater than 1 so that the high-frequency component [x(i, j) - m_x(i, j)] is enhanced. Usually G(i, j) is set to a constant C with C > 1, which turns Equation 5 into
$$f(i, j) = m_x(i, j) + C\left[x(i, j) - m_x(i, j)\right] \tag{6}$$
Because every high-frequency part of the image is then amplified by essentially the same factor, some high-frequency components may be over-enhanced.
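The gray-scale conversion and the constant-gain contrast enhancement can be sketched together in a few lines of Python. The sketch uses the standard weights 0.299, 0.587, and 0.114 for the weighted average and computes the local mean and standard deviation over a (2n+1) × (2n+1) window. The names `to_gray`, `local_stats`, and `enhance`, the clipping of the window at the image border, and the clamping of the result to the range 0 to 255 are assumptions made for this illustration.

```python
def to_gray(r, g, b):
    """Weighted-average gray value of one RGB pixel."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def local_stats(img, i, j, n):
    """Local mean and standard deviation around (i, j).

    The window is (2n+1) x (2n+1) in the interior and is clipped at the
    image border, so fewer pixels are averaged near the edges."""
    h, w = len(img), len(img[0])
    vals = [img[k][l]
            for k in range(max(i - n, 0), min(i + n, h - 1) + 1)
            for l in range(max(j - n, 0), min(j + n, w - 1) + 1)]
    mean = sum(vals) / len(vals)
    var = sum((v - mean) ** 2 for v in vals) / len(vals)
    return mean, var ** 0.5


def enhance(img, n=1, gain=2.0):
    """Constant-gain enhancement: f = m + C * (x - m), clamped to [0, 255]."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            mean, _ = local_stats(img, i, j, n)
            value = mean + gain * (img[i][j] - mean)
            out[i][j] = min(max(value, 0.0), 255.0)
    return out
```

With a constant gain, flat regions are left unchanged while every deviation from the local mean is amplified by the same factor, which is the source of the over-enhancement risk mentioned above; a gain that decreases with the local standard deviation returned by `local_stats` would be one way to limit it.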
Binarization: In grayscale binarization, the gray value of each image pixel, which lies between 0 and 255, is mapped to one of two values, so that the whole image contains only black and white and the grayscale features can be handled easily and quickly [20]. In this system, local adaptive binarization is used: the grayscale image is cut into N modules according to a specified rule, a threshold T is set for each module, and each pixel in the module is assigned 0 or 255 according to T, which completes the binarization of that module. The remaining N-1 modules are binarized in the same way. In local adaptive binarization, the threshold T of each module is computed from a combination of local characteristics, as shown in Equation 7:
$$T = aE + bP + cQ \tag{7}$$
where a, b, and c are free parameters, E is the average of the module's pixels, P is the squared difference between the pixels, and Q is the root mean square between the pixels. Choosing the threshold per module makes the binarized image more pronounced.
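A minimal Python sketch of the per-module threshold T = a*E + b*P + c*Q is given below. The paper does not spell out P and Q in detail, so the sketch assumes that P is the mean squared deviation of the module's pixels from E and that Q is the root mean square of the pixel values. The square, non-overlapping modules, the names `block_threshold` and `binarize`, and the default parameters are likewise illustrative choices.

```python
def block_threshold(pixels, a, b, c):
    """Threshold T = a*E + b*P + c*Q for one module.

    Assumed definitions: E is the mean, P the mean squared deviation
    from E, and Q the root mean square of the pixel values."""
    n = len(pixels)
    e = sum(pixels) / n
    p = sum((v - e) ** 2 for v in pixels) / n
    q = (sum(v * v for v in pixels) / n) ** 0.5
    return a * e + b * p + c * q


def binarize(img, block=2, a=1.0, b=0.0, c=0.0):
    """Binarize each block x block module with its own threshold."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for top in range(0, h, block):
        for left in range(0, w, block):
            rows = range(top, min(top + block, h))
            cols = range(left, min(left + block, w))
            t = block_threshold(
                [img[r][col] for r in rows for col in cols], a, b, c)
            for r in rows:
                for col in cols:
                    out[r][col] = 255 if img[r][col] > t else 0
    return out
```

For an image whose left half is much brighter than its right half, a single global threshold would turn the darker half entirely black, whereas the per-module threshold preserves structure in both halves.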
Haar-like features: Face detection methods are currently divided into two main types: those based on existing knowledge and those based on statistics. Haar-like features belong to the latter category. In 2004, Viola and Jones proposed face detection using Haar-like features and the integral image method [8]. Later, Rainer Lienhart and Jochen Maydt extended the Haar-like features, eventually forming the Haar classifier currently used by OpenCV [21]. Haar-like features have been commonly used in applications such as object detection and face recognition [19]. Haar includes four types of feature templates as the basic elements of decision: edge features, linear features, centre features, and diagonal features. A feature template consists of black and white rectangles, and the eigenvalue F of a template is the pixel sum of the white rectangles, Sw, minus the pixel sum of the black rectangles, Sb, that is, F = Sw - Sb. By varying the template category, location, and size of the rectangles within the sub-window, a large number of eigenvalues can be collected from the image. For example, a detection window of 24 × 24 pixels yields over 160,000 rectangular features.
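The figure of more than 160,000 features can be checked by counting every position and scale at which a template fits into the detection window. The Python sketch below does this under the assumption that the feature set consists of the five upright two-, three-, and four-rectangle templates of the basic Viola and Jones set (two edge templates, two line templates, and one diagonal template); the centre template and rotated templates of the extended set are not counted. The names `count_features`, `TEMPLATES`, and `count_all` are hypothetical.

```python
def count_features(window, base_w, base_h):
    """Number of placements of a template whose width and height are
    integer multiples of (base_w, base_h) inside a square window."""
    total = 0
    for w in range(base_w, window + 1, base_w):
        for h in range(base_h, window + 1, base_h):
            total += (window - w + 1) * (window - h + 1)
    return total


# Base sizes (width, height) in pixels of the assumed template set.
TEMPLATES = {
    "edge (horizontal)": (2, 1),
    "edge (vertical)": (1, 2),
    "line (horizontal)": (3, 1),
    "line (vertical)": (1, 3),
    "diagonal": (2, 2),
}


def count_all(window=24):
    """Total number of features over all assumed templates."""
    return sum(count_features(window, bw, bh)
               for bw, bh in TEMPLATES.values())
```

For a 24 × 24 window, `count_all()` gives 162,336 features under these assumptions, which is consistent with the order of magnitude quoted above and shows why a fast way of evaluating rectangle sums is needed.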

Integral image building algorithm
The integral image requires only one traversal of the image, after which the pixel sum of any area in the image can be calculated. This fast algorithm greatly improves the efficiency of eigenvalue calculation. The construction principle is that the value I(x, y) of the integral image at position (x, y) is the sum of all pixels of the original image above and to the left of (x, y), inclusive. The details of the algorithm are as follows.
Let i(x, y) denote the pixel value of the original image at (x, y). S(x, y) represents the cumulative sum in the row direction, initialized with S(x, -1) = 0; the integral image is represented by I(x, y), initialized with I(-1, y) = 0. By progressively scanning the image, the cumulative sum S(x, y) and the integral image value I(x, y) of each pixel point (x, y) are calculated as
$$S(x, y) = S(x, y-1) + i(x, y) \tag{8}$$
$$I(x, y) = I(x-1, y) + S(x, y) \tag{9}$$
The image is traversed once; when the bottom-right pixel is reached, the construction of the integral image I is complete. After construction, for any rectangular area P in the image with corner points α (top left), β (top right), γ (bottom left), and δ (bottom right), the pixel sum of P can be calculated as
$$\sum_{(x, y) \in P} i(x, y) = I(\alpha) + I(\delta) - I(\beta) - I(\gamma) \tag{10}$$
A Haar-like feature value is essentially the difference between the pixel sums of two rectangles, and can therefore be computed in a very short time.
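The construction and use of the integral image can be sketched in Python as follows. The sketch stores the image as `img[y][x]`, builds the integral image in a single pass with the two recurrences, and evaluates a rectangle sum with four lookups. Lookups that fall outside the image (index -1) are treated as 0, which corresponds to the initial conditions of the recurrences. The function names and the choice of a two-rectangle edge feature with the white rectangle on the left are assumptions made for this example.

```python
def integral_image(img):
    """Return ii with ii[y][x] = sum of img over rows <= y and cols <= x."""
    h, w = len(img), len(img[0])
    s = [[0] * w for _ in range(h)]   # cumulative sums S(x, y)
    ii = [[0] * w for _ in range(h)]  # integral image I(x, y)
    for y in range(h):
        for x in range(w):
            # S(x, y) = S(x, y-1) + i(x, y), with S(x, -1) = 0
            s[y][x] = (s[y - 1][x] if y > 0 else 0) + img[y][x]
            # I(x, y) = I(x-1, y) + S(x, y), with I(-1, y) = 0
            ii[y][x] = (ii[y][x - 1] if x > 0 else 0) + s[y][x]
    return ii


def rect_sum(ii, left, top, right, bottom):
    """Pixel sum of the inclusive rectangle, using four lookups."""
    def at(x, y):
        return ii[y][x] if x >= 0 and y >= 0 else 0
    return (at(right, bottom) + at(left - 1, top - 1)
            - at(right, top - 1) - at(left - 1, bottom))


def haar_edge_feature(ii, left, top, width, height):
    """Two-rectangle edge feature: white left half minus black right half."""
    half = width // 2
    white = rect_sum(ii, left, top, left + half - 1, top + height - 1)
    black = rect_sum(ii, left + half, top, left + width - 1, top + height - 1)
    return white - black
```

Each rectangle sum costs four lookups regardless of the rectangle size, so the cost of a Haar-like feature does not depend on its scale.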

Specific face recognition
The key face recognition step in the system is template matching, which uses an algorithm provided in OpenCV: the normalized squared difference matching method. Template matching means finding the area of a source image that is most similar to a known template image. The corresponding OpenCV function is matchTemplate. It computes the similarity between the template image T and the source image I at each position of I, and stores the result in the result matrix R. The brightness of each point of R represents the degree of matching at that position; for the squared difference method, smaller values indicate a better match. The normalized squared difference matching formula is shown in Equation 11:
$$R(x, y) = \frac{\sum_{x', y'} \left[T(x', y') - I(x + x', y + y')\right]^{2}}{\sqrt{\sum_{x', y'} T(x', y')^{2} \cdot \sum_{x', y'} I(x + x', y + y')^{2}}} \tag{11}$$
Here, T is the template image, I is the source image, and R is the matching result.
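To make the matching step concrete, the following Python sketch evaluates the normalized squared difference at every position where the template fits inside the source image. It is a direct, unoptimized transcription of the formula for small grayscale images stored as lists of rows and is not the OpenCV implementation. The names `match_sqdiff_normed` and `best_match`, and the handling of an all-zero denominator, are assumptions made for this illustration.

```python
def match_sqdiff_normed(source, template):
    """Result matrix R of the normalized squared difference.

    R[y][x] compares the template with the source patch whose top-left
    corner is at (x, y); 0 means a perfect match."""
    sh, sw = len(source), len(source[0])
    th, tw = len(template), len(template[0])
    t_energy = sum(v * v for row in template for v in row)
    result = []
    for y in range(sh - th + 1):
        row = []
        for x in range(sw - tw + 1):
            diff = 0.0
            i_energy = 0.0
            for dy in range(th):
                for dx in range(tw):
                    t = template[dy][dx]
                    i = source[y + dy][x + dx]
                    diff += (t - i) ** 2
                    i_energy += i * i
            denom = (t_energy * i_energy) ** 0.5
            if denom > 0:
                row.append(diff / denom)
            else:
                # Assumed convention for a zero denominator.
                row.append(0.0 if diff == 0 else 1.0)
        result.append(row)
    return result


def best_match(result):
    """Return (x, y, value) of the smallest entry of R."""
    value, x, y = min((v, x, y)
                      for y, row in enumerate(result)
                      for x, v in enumerate(row))
    return x, y, value
```

A recognition decision can then be made by comparing the best value against a threshold chosen for the application; how the value is mapped to a percentage similarity is left open here.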

Development Environment and Implementation
In this section, the design of the development environment for the proposed system is described. The steps include building the Android development environment [22] and using the OpenCV libraries, JNI (Java Native Interface) technology [23], the Android NDK (Native Development Kit) [24], and Android SDK [25] components. The interconnection of these tools is shown in Figure 4 [26,27]. The Android NDK supports native C/C++ development: it allows programmers to develop parts of Android applications in C/C++ and to call libraries such as OpenCV on the Android platform. Using OpenCV in the standard Android development environment therefore requires both the Android NDK and the SDK. The Android NDK is a tool set with an integrated Android cross-compilation environment that helps developers quickly build C/C++ shared libraries. The face recognition algorithm is implemented in C and calls libraries available in OpenCV, which provides higher execution efficiency than using Java alone. The Android application layer is written in Java and provides the JNI interface, so that an Android program can easily call C code. JNI sits between the native library and the Java framework, as shown in Figure 4.

OpenCV library
OpenCV is an open source library of image processing algorithms written in C and C++. Its advantages include the following: it is cross-platform and independent of the operating system, hardware, and graphics manager; it is free of charge for both non-commercial and commercial applications; it is fast and easy to use; and it scales well, offering low-level and high-level application development kits with common modules for loading, saving, and capturing images or video [21]. Because devices running Android OS can take advantage of the OpenCV library, which also supports real-time operation, it is used together with an image preprocessing algorithm to form the face recognition system.

JNI framework
Currently, most Android development is done in Java, and the platform also supports code written in C/C++. C/C++ is a native language that can interact with the hardware directly, whereas Java must run on the JVM (Java Virtual Machine). System libraries are therefore written in C/C++, and Java is used to call functions related to multimedia, SQLite, OpenGL, and so on, which are bundled as system packages. In this paper, the image preprocessing, face detection, and face classification and authentication algorithms are implemented with OpenCV library functions and compiled as a library, and the JNI framework is used to call them. The steps involved in a JNI call are as follows:
1) Create a new JNI folder in the project's root directory.
2) Create a C file in the root folder and add the header file to the folder.
3) In the Java code, declare a local (native) method.
4) In the C file, implement the method so that it returns a string. Define the string in C: char *cstr = "hello from c"; then convert the C string into a Java string and return it: jstring jstr = (*env)->NewStringUTF(env, cstr); return jstr;
5) In the JNI folder, create Android.mk and fill it out in the following format:
   LOCAL_PATH := $(call my-dir)
   include $(CLEAR_VARS)
   LOCAL_MODULE := ""
   LOCAL_SRC_FILES := ""
   include $(BUILD_SHARED_LIBRARY)
6) Run ndk-build.cmd in the JNI folder, with the ndk-build environment variable configured, and refresh the project; the compiled library appears in the armeabi folder under the project's libs directory.
7) Use Java code to load the .so library and call the local method, passing the string parameter.
8) Set the supported architecture, such as x86, by setting the APP_ABI parameter in the JNI build configuration.

Android native development kit (NDK)
The Android NDK is a set of components based on C/C++ that can be used to write some modules of an application in C/C++; these modules run as native code alongside the Java code executing in the Android virtual machine. The NDK helps meet higher performance requirements and is often used in applications that require higher security, because native code is harder to decompile. The NDK also makes it easy to reuse existing C/C++ modules and to call C++ libraries. In this paper the C/C++ version of OpenCV is used, which, through the NDK, offers more powerful functions than OpenCV for Android. The NDK includes the API, cross compilers for ARM, x86, and MIPS, a debugger, Java native support, and build tools with dynamic link libraries. The face recognition algorithm presented in this paper is implemented in C and calls the OpenCV library, which gives higher execution efficiency than using Java alone. The Android application layer uses Java, but the Android system also provides the JNI interface, so that an Android program can easily call C code. JNI is located between the native library and the Java framework layer, as shown in Figure 4.

Implementation details
The implementation is a two-step process: face detection followed by face recognition. In this section, we describe the algorithms behind both steps. Face detection is the first step, where it is decided whether the captured image is a face or contains information related to a human face. It is implemented as follows. OpenCV has its own custom camera view control, org.opencv.android.JavaCameraView, which cyclically grabs frames from the camera. In the callback method, we obtain the matrix data of each frame and pass it to the OpenCV native method. To detect whether there is a face in the obtained image, the method returns an array of rectangles enclosing the detected faces, as shown in Figure 5. The code responsible for this process is shown below:
Face recognition is the major step in this process. It evaluates similarity by obtaining the eigenvalues of the faces and comparing the two sets of eigenvalues. In OpenCV, the eigenvalues are represented as an image; after face detection in the callback method, we extract the eigenvalues from the matrix data and then compare them. To improve identification accuracy, the face must first be detected and confirmed as a human face, then converted to grayscale and normalized. The major sections of program code responsible for this process are shown below:

Testing and Results
Testing was done with two different smart phones to evaluate the accuracy and acceptability of the implemented face detection and recognition methods. In the first case, testing was carried out on a Huawei Mate 9 smart phone running Android 7.0, which has a 20-megapixel camera and a Kirin 960 CPU. In the second case, testing was carried out on a Xiaomi 5 smart phone running Android 6.0, which has a 16-megapixel camera and an MSM8996 CPU. Face information of ten individuals was added to the database, and face detection and face recognition were tested in real time. We extracted different facial eigenvalues and compared their similarity. The similarity of face eigenvalues from the same person was found to be significantly higher than that from different people. Table 1 and Table 2 show the results obtained on the Huawei Mate 9 and Xiaomi 5 smart phones, respectively. They also show the overall success rate, the time taken for face recognition, and the RAM usage in megabytes.
The success rate in each case reflects successful matching when testing with 20 images. Figure 6 shows a sample display of face recognition results acquired from a Huawei Mate 10 smart phone. As shown, the similarity level is 88.15% and 88.97% for successful matches, and 31.14% for an unsuccessful match.

Conclusion
Currently, cameras and microphones are miniature and can easily be integrated into wearable devices. Securing such devices against hackers and confirming authorized users is very important for personal data protection and for avoiding misuse of such information by third parties. Methods such as speech recognition and face identification are therefore essential. They have advantages over password-based authentication systems and ensure that only the user can access the device. Face recognition systems perform well only in limited circumstances, showing their best performance with frontal images and constant illumination. Currently, the majority of face recognition algorithms fail under the varied conditions in which people need to use this technology.
Next-generation facial recognition systems should have tools to recognize human faces in real time and under limited and unexpected circumstances.
In this paper, a software development framework for a real-time face identification system for mobile devices, based on the Android platform and OpenCV, is presented. We used supporting software tools such as the Java Native Interface (JNI) and the NDK (Native Development Kit) along with OpenCV to implement the proposed system. For face detection and recognition, Gaussian smoothing and gray-scale transformation are applied to preprocess the image. Then, the Haar-like feature matching method is used to describe the characteristics of the operator and obtain the face characteristic value. The proposed system was tested in real time on two different brands of smart phones, and the results show that the average success rate across both devices is 95% for face detection and 80% for face recognition.