Intelligent Botnet Detection Approach in Modern Applications

Innovative applications are employed to enhance the human lifestyle. The Internet of Things (IoT) has recently been used in designing these environments, so security and privacy are essential to deploying successful intelligent environments. However, most IoT protection systems are vulnerable to various types of attacks. Hence, intrusion detection systems (IDSs) have become a crucial requirement for any modern design. In this paper, a new detection system is proposed to secure sensitive information of IoT devices. It is heavily based on deep learning networks and can provide a secure environment for IoT. To prove the efficiency of the proposed approach, the system was tested using two datasets: a normal dataset and a fuzzification dataset. The accuracy rate was 99.30% for the normal testing dataset and 99.42% for the fuzzification testing dataset. The experimental results of the proposed system reflect its robustness, reliability, and efficiency.
Keywords: IDS, IoT, deep neural networks, DDoS, Bot-IoT


Introduction
Internet of Things (IoT) provides a technological movement towards effective physical-digital convergence. It enables the development of intelligent systems interconnecting physical things surrounding us over the Internet. Such technological advancements have resulted in a broad scope of IoT applications in varying domains. Examples of these applications include smart homes, e-healthcare, smart grid, and intelligent transportation systems. The IoT technology has also been applied in various industrial environments for effective monitoring, smart control, and intelligent automation. As a result, real deployments of IoT systems with an increasing number of IoT-enabled objects and devices have been growing during recent years. Accordingly, various IoT networks increasingly emerged along with a massive amount of IoT data communicated over the Internet. This poses serious challenges concerning IoT security and privacy, which can delay the effective deployment of IoT systems, particularly for data-sensitive applications.
In reality, IoT systems are vulnerable to a wide range of cyber-attacks that can cause serious damage at different levels. This is more evident in critical industrial and military applications with large IoT setups. Therefore, current and future IoT deployments need adequate practical security support and effective resilience against IoT attacks. IoT security solutions that detect malicious activities at an early stage are critical to protecting IoT systems. This gives rise to the need to develop Intrusion Detection Systems (IDSs) oriented towards IoT systems.
IDSs monitor network traffic and observe user activities to detect abnormal actions or system abuse. The IDS functionality is based on detecting and responding to malicious traffic that has made its way through the firewall. IDSs can be implemented to detect both misuse and anomalies. They can also be developed for host-based and network-based detection, following either passive or reactive approaches. However, traditional IDS-based security solutions require further improvement to be effective in IoT environments. Most IoT systems are deployed on physical resources with limited computational and storage capabilities. Cybersecurity solutions therefore need lightweight yet efficient intrusion detection to provide adequate practical security support in IoT environments. Figure 1 shows the basic structure of a traditional detection system in IoT networks.
In this paper, a new IDS is proposed to secure IoT resources against cyber-attacks. The proposed solution is heavily based on deep learning networks to increase efficiency in providing resilient and secure environments for various IoT applications. The recent IoT dataset, Bot-IoT, was adopted for the development of the proposed IDS. The experimental results reflect the robustness, reliability, and efficiency of the deep learning IDS.

Related works
Intrusion detection in IoT environments is crucial to provide adequate protection from the various critical attacks that would compromise the security of IoT resources. Several IDSs have been introduced to secure IoT systems against different attacks. For example, effective detection of the Denial of Service (DoS) attack was addressed in [1][2][3][4] to prevent the exhaustion of IoT resources. Another common IoT attack is Distributed DoS (DDoS), which was considered in [5][6]. Other network IDSs were proposed for mitigating routing attacks, including Wormhole [7][8], Sinkhole [9], and Sybil [10][11] attacks. In other research proposals, more effective IDSs were introduced to detect multiple routing attacks, such as Blackhole and Selective Forwarding attacks, in addition to other attacks [12][13][14]. Moreover, the IDS proposed in [15] provided a solution to detect DoS, DDoS, surveillance, and information theft attacks. In [16][17], the focus was on developing IDSs to secure IoT networks against DoS, DDoS, Remote-to-Local (R2L), User-to-Root (U2R), and probe attacks.
The research community has approached the detection of such IoT attacks in different ways. Many IoT-oriented IDSs based on machine learning methods have been proposed in the literature. These include the solutions proposed in [13,18], which introduced Support Vector Machine (SVM) models for IoT intrusion detection. The Artificial Neural Network (ANN) method was also adopted in various IDS proposals to secure IoT resources against different attacks [6][19][20][21]. Other machine learning methods such as Naïve Bayes [5,22], random forest [23][24], optimum-path forest [25], and logistic regression [13] were also considered.
More effective IDS approaches, such as those based on deep learning, were also proposed for securing IoT environments. In [26], the performance of two deep learning models for IoT intrusion detection was compared, and the effects of adversarial samples on these models were examined. In [16], the need for optimal feature selection to build an effective deep learning IDS was addressed using the spider monkey optimization (SMO) algorithm, together with a stacked-deep polynomial network (SDPN) to enhance detection. In [17], a deep learning IDS model incorporating a self-taught learning (STL) technique was introduced to support innovative home IoT applications. In [27], the focus was on supporting IoT Fog security with a fully automated deep learning IDS based on cascaded filtering that can be adaptively tuned to improve the detection of specific IoT attacks. In [28], a deep learning method was combined with a shallow learning engine to build an IDS for IoT applications. In [29], a deep learning model was combined with the Dendritic Cell Algorithm (DCA) for a better feature selection process with less complexity. In [30], another combination was proposed to enhance the adaptive selection of the hyper-parameter values of a deep learning model using the Particle Swarm Optimization (PSO) method. In [31], the researchers proposed an IDS framework that combines network visualization and deep learning for large-scale IoT networks. In [32], the proposed IDS approach was based on the Deep Belief Network, a type of deep learning algorithm. In [33], an IDS based on a deep migration learning model was proposed for IoT smart city applications.
On the other hand, the effectiveness of such IDSs relies on the quality of the dataset adopted to develop the proposed model. In this regard, several publicly available datasets have been considered by the research community. Among the most widely adopted are KDD99 [34], NSL-KDD [35], and UNSW-NB15 [36]. For example, NSL-KDD was considered in [2,17,20,21,24,27,37] to develop different machine learning and deep learning-based IDSs. UNSW-NB15 was also a common dataset among many IoT-oriented IDS solutions [19,22,28][38][39]. Other examples include CICIDS2017 [40], which contains traces of network flows and was used to build different machine learning IDSs in [41][42].
However, such datasets were not oriented towards IoT systems, and few IoT-based datasets are currently available to the research community. Thus, few IDS models developed using IoT-based datasets have been proposed in the literature. One recent IoT dataset is the Bot-IoT dataset [43], which has drawn the attention of some researchers. For example, the dataset was used to develop the IDS solutions in [44][45][46][47]. The work presented in this paper provides effective intrusion detection based on deep learning using the Bot-IoT dataset.

The methodology of proposed system
A new security system is proposed in this paper to provide a secure environment for IoT applications. It is expected to play a vital role in deploying various innovative trends. The proposed system is heavily based on a dataset collected from IoT devices. In more detail, this dataset reflects the internal and external behaviours of various devices/nodes that are directly connected to local/global networks [1]. The phases of the proposed system are explained below:

Dataset source
The intrusion detection system proposed in this paper is heavily based on dataset features [1]. These features describe the events of devices in IoT. The IXIA PerfectStorm tool is employed to produce raw network packets of UNSW-NB.
This dataset is used to evaluate the performance of the security system proposed in this paper. It contains both normal and malicious behaviours of the network packets. Table 1 lists the names of the 46 dataset features used to evaluate the efficiency and effectiveness of the proposed system.

Intelligent detection system
The intelligent security system uses deep learning networks to distinguish between normal and abnormal connections in IoT. It is trained and tested with more than 700,000 connections that describe the normal and abnormal behaviors of various IoT devices. The dataset is divided into three subsets: training (50%), testing (25%), and validation (25%).
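The 50/25/25 division described above can be sketched as follows. This is a minimal illustration, assuming a random shuffle before splitting (the text does not say how records are assigned to subsets); the function name `split_dataset` and the fixed seed are hypothetical.

```python
import random

def split_dataset(records, train_frac=0.50, test_frac=0.25, seed=0):
    """Shuffle records and split them into training, testing, and validation subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # assumed: random assignment to subsets
    n_train = int(len(shuffled) * train_frac)
    n_test = int(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # the remainder, about 25%
    return train, test, validation

# Placeholder record IDs stand in for real connection records.
train, test, validation = split_dataset(range(1000))
print(len(train), len(test), len(validation))
```

Keeping the validation subset separate from the testing subset lets the stopping decision be tuned without touching the data used for the final accuracy figure.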
The learning phase of the deep learning network stops when the squared error between the actual and desired outputs reaches its target value. The network also has a second stopping criterion: a maximum of 500 epochs. The basic structure of the deep network is shown in Figure 2. According to Figure 2, the deep neural network is composed of three types of layers: input, hidden, and output. The input layer takes the 46 features, whereas the output layer has a single output that indicates a normal or abnormal connection. The network has two hidden layers; the first has eight neurons and the second has five neurons. The number of hidden layers and the number of neurons in each layer are significant issues when designing a detection system based on artificial intelligence tools. Thus, in this paper, trial-and-error is used to select the number of hidden layers and the number of neurons in each layer.
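To make the layer structure concrete, the following sketch propagates one record through a 46-8-5-1 fully connected network. It is an illustration under assumptions, not the authors' implementation: the sigmoid activation, the random weight initialization, and the 0.5 decision threshold are not stated in the text, and the helper names are hypothetical.

```python
import math
import random

LAYER_SIZES = [46, 8, 5, 1]  # input, two hidden layers, output (sizes from the text)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))  # assumed activation function

def init_network(sizes, seed=0):
    """Return one (weights, biases) pair per layer, with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(layers, features):
    """Propagate one feature vector through every layer in turn."""
    activations = features
    for weights, biases in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations

network = init_network(LAYER_SIZES)
sample = [0.5] * 46  # placeholder for one normalized connection record
score = forward(network, sample)[0]
label = "abnormal" if score >= 0.5 else "normal"  # threshold is an assumption
print(round(score, 3), label)
```

With untrained weights the label is meaningless; the point is only the shape of the computation, in which each layer reduces the 46 inputs to 8, then 5, then a single score.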
The primary parameters of the neural network are an essential part of designing an intelligent network. Table 2 shows the initial parameters.
Table 2. Initial parameters of the neural network
Parameter               Value
TrainParam.goal         0
TrainParam.min_grad     1 × 10^-16
The proposed system was simulated on a machine with an Intel Core i3 processor (2.53 GHz).
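The stopping parameters can be read as three independent checks: an error goal, a minimum gradient, and an epoch limit. The sketch below is a hedged illustration of that logic; the function `stop_reason` is hypothetical, and the toy gradient descent on f(w) = w² merely stands in for network training.

```python
def stop_reason(epoch, squared_error, grad_norm,
                goal=0.0, min_grad=1e-16, max_epochs=500):
    """Return why training should stop, or None to continue."""
    if squared_error <= goal:
        return "goal reached"
    if grad_norm < min_grad:
        return "gradient too small"
    if epoch >= max_epochs:
        return "epoch limit reached"
    return None

# Toy usage: gradient descent on f(w) = w**2 stands in for network training.
w, learning_rate, epoch = 1.0, 0.1, 0
while True:
    error, grad = w * w, 2.0 * w
    reason = stop_reason(epoch, error, abs(grad))
    if reason:
        break
    w -= learning_rate * grad
    epoch += 1
print(epoch, reason)
```

A goal of 0 is rarely met exactly in floating-point arithmetic, so in practice the gradient or epoch check is what usually ends training.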

The model of intrusion detection system
In general, security systems are critically important in open wireless environments [48][49]. The security system is composed of three main phases: the dataset collection and preprocessing phase, the training phase, and the testing phase. All of these phases are shown in Figure 3.
• Data source and preprocessing: in this phase, the dataset is prepared for training and testing by dividing it into the three subsets mentioned above. In addition, the dataset needs some preprocessing operations, such as uniform distribution and normalization.
• Training phase: the initial structure of the deep neural network is trained with the features of the dataset. In this phase, a stopping condition is applied to obtain the best training rate, based on a threshold value of 99%.
• Testing phase: the proposed system is tested with an IoT dataset. There are two options: testing with the same subset used for training, and testing with a different subset to prove the efficiency of the proposed system.
At this step, the detection system must be able to detect various types of attacks. The hybrid detection system is based on a deep neural network that can learn from normal and abnormal behaviors. Its low cost and real-time response encourage us to apply this type of tool to the design of security systems.
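The normalization step of the preprocessing phase can be sketched as below. The text does not say which normalization is used, so min-max scaling to [0, 1] is an assumption here, and the function name and sample values are purely illustrative.

```python
def min_max_normalize(rows):
    """Scale every feature column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]

# Placeholder records with three made-up numeric features each.
raw = [[10.0, 200.0, 1.0], [20.0, 400.0, 1.0], [30.0, 300.0, 1.0]]
normalized = min_max_normalize(raw)
print(normalized)
```

Scaling features to a common range keeps attributes with large magnitudes, such as byte counts, from dominating the weighted sums inside the network.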

Experimental setup
In this stage, the Bot-IoT dataset was used to further study the performance of the proposed system. It consists of pcap files 69.3 GB in size, which contain more than 72,000,000 records. The attacks considered in Bot-IoT include DDoS, DoS, OS and Service Scan, Keylogging, and Data exfiltration attacks. A subset of the original dataset, comprising 3 million records with a size of 1.07 GB, was extracted using MySQL select queries. The Bot-IoT dataset was divided into three subsets: training (50%), testing (25%), and validation (25%).
To build an effective deep learning system, a deep neural network with three hidden layers was developed with an input of 24 features. The system classifies each record into one of five distinct classes. The first hidden layer has six neurons, the second has eight neurons, and the last has five neurons. This setup was chosen by trial-and-error to determine the number of hidden layers and the number of neurons in each layer.
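As a rough indication of how lightweight these networks are, the number of trainable parameters can be counted from the layer sizes. The sketch assumes fully connected layers with one bias per neuron and an output layer with one neuron per class; neither detail is stated in the text, and `count_parameters` is a hypothetical helper.

```python
def count_parameters(layer_sizes):
    """Count the weights and biases of a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

multi_class = [24, 6, 8, 5, 5]  # 24 inputs, hidden layers of 6/8/5, 5 classes (assumed output size)
binary = [46, 8, 5, 1]          # the earlier two-hidden-layer network
print(count_parameters(multi_class), count_parameters(binary))
```

Under these assumptions both networks have only a few hundred parameters, which is consistent with the goal of running intrusion detection on resource-constrained IoT hardware.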

Results and discussion
In this work, the proposed system is tested with two datasets. There are two testing options: testing with the same subset used for training, and testing with a different subset to measure and prove the efficiency of the proposed system. First, the performance metrics of training and testing the deep neural network with the normal data are calculated to evaluate efficiency. The performance of the classification and the number of records used are shown in Table 3. The accuracy rate is calculated according to Equation 1.

\[
\text{Accuracy} = \frac{\text{Number of correctly classified patterns}}{\text{Total number of patterns}} \tag{1}
\]
Moreover, the training and testing phases used the fuzzification dataset to calculate the accuracy rate of the proposed system. The training and testing accuracy rates were 99.20% and 99.42%, respectively. The performance of the classification and the number of records used are shown in Table 4.
To measure and evaluate the performance of the proposed system, four types of alarms are measured: True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN). Alarm rates are calculated to evaluate the efficiency of the proposed system. Equations (2), (3), (4), and (5) show the alarm rate calculations, and the results of these calculations are presented in Table 5. Table 6 shows the alarm rates when training and testing the deep neural network with the two datasets. First, the normal dataset was used, and then the fuzzification dataset. Secondly, the proposed system is tested with an IoT dataset, and some performance metrics are calculated. The results for the four types of alarms are presented in Table 6.
We compare the accuracy rate of the proposed system with recent works in Table 7 to distinguish our results from others.
Table 7. Comparing the accuracy rate of the proposed system with the latest works

Security Systems                            Accuracy Rate (%)
[52]                                        98.22
[53]                                        98.071
Our proposal with a normal dataset          99.46
Our proposal with fuzzification dataset     99.54
As Table 7 shows, our system is more accurate than the others. In addition, the proposed security system can be adapted to detect and identify various attacks, such as Sybil and wormhole attacks.
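The accuracy in Equation 1 and the four alarm rates can be computed from the TP, FP, TN, and FN counts as sketched below. Equations (2) to (5) are not reproduced in this text, so the commonly used definitions of the rates are assumed; the counts in the example are made up for illustration and are not results from this work.

```python
def alarm_rates(tp, fp, tn, fn):
    """Return accuracy and the four alarm rates as fractions in [0, 1]."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,  # correctly classified / all patterns
        "tpr": tp / (tp + fn),          # true positive rate (assumed definition)
        "fpr": fp / (fp + tn),          # false positive rate (assumed definition)
        "tnr": tn / (fp + tn),          # true negative rate (assumed definition)
        "fnr": fn / (tp + fn),          # false negative rate (assumed definition)
    }

# Made-up counts for illustration only; they are not results from the paper.
rates = alarm_rates(tp=90, fp=5, tn=95, fn=10)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")
```

Reporting the false positive and false negative rates alongside accuracy matters for intrusion detection, because a high accuracy on imbalanced traffic can hide a large share of missed attacks.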

Conclusion
Internet use continues to grow in everyday life, and with it the number of Internet of Things (IoT) devices connected to the Internet. For this reason, finding a robust security model for the IoT environment is a big challenge. This paper proposes a new intrusion detection system that uses a deep neural network as a classifier. To measure and prove the efficiency of the proposed approach, the system was tested on two datasets. First, the performance metrics of training and testing the deep neural network with normal data were calculated to evaluate the efficiency. The overall training and testing accuracy was 98.60% and 99.30%, respectively.
Moreover, the training and testing phases used the fuzzification dataset to calculate the accuracy rate of the proposed system. The training and testing accuracy was 99.20% and 99.42%, respectively. In the second case, the suggested system was tested with an IoT dataset and the performance metrics were calculated. For future work, the DNN method can be compared with other machine learning methods, including fuzzy logic systems, genetic algorithms, and swarm algorithms. Moreover, the proposed DNN-based model can be applied to potential online applications for network/service providers.