A Time Series Modeling and Prediction of Wireless Network Traffic

Network traffic is determined by the number of users and their network utilization. Accurate and timely estimation of network traffic is becoming increasingly important for achieving guaranteed Quality of Service (QoS) in a wireless network. When the network traffic is known in advance, better QoS can be maintained through admission control and inter- or intra-network handovers. Here, wireless network traffic is modeled as a nonlinear and nonstationary time series. Within this framework, network traffic is predicted using neural network and statistical methods, and the results of both methods are compared at different time scales (time granularities). The Neural Network (NN) architectures used in this study are the Recurrent Radial Basis Function Network (RRBFN) and the Echo State Network (ESN). The statistical model used in this work is the Fractional Auto Regressive Integrated Moving Average (FARIMA) model. The traffic prediction accuracy of the neural network and statistical models is in the range of 96.4% to 98.3% and 78.5% to 80.2%, respectively.


INTRODUCTION
In the recent past, technological breakthroughs in the field of wireless systems have enabled the pervasive acceptance and deployment of wireless networks. Still, a wireless system is not the preferred choice over its wired counterpart. The reason is that a wireless network cannot ensure guaranteed QoS due to the unpredictable behavior of network traffic. This unpredictability arises from various factors such as user mobility, arrival pattern and the diverse network requirements of user applications. This leads to nonlinear and time-varying traffic flow between the wireless network infrastructure, such as wireless Access Points (AP) or Base Stations (BS), and the set of wireless devices using that infrastructure. It is becoming increasingly important to provide guaranteed QoS to the users of the network. This can be achieved by one or more remedial measures such as admission control, load balancing and intra- or inter-network handovers [1]; for these measures, accurate and timely forecasting of network traffic is the critical factor. The remedial measures in turn improve the QoS of a wireless network and hence make the wireless network more reliable.
The nucleus of any prediction mechanism is to monitor the past and present behavior of the system and to establish a statistical relationship between the set of inputs and the set of outputs over a given time scale or time granularity. This relationship is either linear or nonlinear, and either time invariant (stationary) or time variant (nonstationary). Wireless network traffic is a nonlinear and nonstationary system [32]. The number of users and their network requirements vary nonlinearly due to application characteristics and user behavior [2]. The time-varying behavior of the traffic is due to terminal mobility and network characteristics [3]. Network applications are classified into four classes, namely conversational, streaming, interactive and background. The network requirement of the first two classes is relatively constant and predictable. The network requirement of the third class is based on user behavior in the form of mouse clicks. The network requirement of the fourth and last class always depends on the state of the network [4]. The number of users and the amount of traffic generated by them have no correlation; hence the network traffic is nonlinear in the number of users served by the wireless network [5]. The number of users served by each AP or BS is dynamic due to user preference, load balancing, inter- or intra-network handover, admission control policy and network characteristics such as fading and time-variant noise [3] [6]. Neural networks are efficient methods to model, evaluate and predict the behavior of nonlinear and nonstationary systems [7]. In this work, the data traffic time series is first extracted from a realistic wireless trace [8]; the missing values of the time series are estimated using a Radial Basis Function based interpolation and function approximation method [9]; the traffic time series is then modeled and predicted using neural network architectures; and finally the results are compared with a statistical model. The rest of the paper is organized as follows.
Section 2 presents a review of related work in the field of network traffic prediction and estimation. In Section 3, the system is modeled from the traffic, prediction, statistical and neural network viewpoints. The system model is validated with a set of forecasting trials in Section 4. Finally, the paper is concluded in Section 5.

RELATED WORK
There are several research works on network traffic modeling and prediction for wireless networks. Most of them use network traces for the study; the trace data are collected by three techniques: SNMP polling, Syslog, or tcpdump from network sniffers [18]. An SNMP trace specifies the total bytes transferred, the number of bytes transferred with errors and layer-specific information about the network [19]. Syslog traces record statistics of user behavior in terms of mobility and network usage pattern. A sniffer provides detailed information about packet size, packet interarrival time and interface-specific information about the network. The time granularity of SNMP, Syslog and tcpdump is in the order of a few seconds, a second and microseconds, respectively. In [20], tcpdump and SNMP traces were used to study the variance in the number of users, packet latency and handoffs between APs and laptops over a period of eight days. In [12], SNMP traces were used to characterize WLAN usage, user arrival and session duration, and statistical models were proposed for network usage, arrival and session duration. In a study conducted at Dartmouth College [17], SNMP, Syslog and tcpdump traces were used to analyze the usage pattern and user distribution in a large wireless network. In [20], a tcpdump trace was used to show the traffic characteristics from the AP perspective, and the behavior of the traffic was also studied at the physical, data link, IP and transport layers. More recently [21], an SNMP trace was used to analyze traffic characteristics such as aggregate network load and its periodicity.
Some works have also been reported in the field of network traffic prediction. The prediction process can be classified as long term or short term. In long term prediction, the time granularity is in the order of weeks and months. In short term prediction, the time granularity is in terms of milliseconds, seconds, minutes and hours. Aggregated IP backbone traffic has been modeled at a large time scale (weekly basis) and accurately forecast over a period of six months by a low order ARIMA model [22]. In recent studies, the FARIMA process has been used to model crucial features of self-similar traffic, such as the long range dependence (LRD) and short range dependence (SRD) of internet and broadband traffic; short term traffic forecasting at millisecond to second granularity was also reported in these works [23] [24]. In [21], a normalized ARIMA process and historical means of recent traffic were used to forecast the hourly and daily trend of wireless traffic. The perceptron neural network model with the backpropagation learning algorithm has been used to model self-similar traffic patterns [25]. Recently, the Generalized Mean Neuron (GMN) model was used to forecast the weekly internet traffic of a WAN router [26]. The above works are limited either to modeling of wireless traffic or to forecasting of short term and long term trends in wireline or wireless network traffic. For wireless networks, the forecasting granularity is limited to the hourly trend of the traffic and, because of the large spikes and nonstationary behavior of the traffic time series, the series is smoothed and converted into a stationary series by applying a power transformation technique. In the proposed work, the wireless traffic is modeled as a set of random processes. For forecasting, wireless traces are used to construct time series, and these time series are used to forecast traffic for single-step and multistep prediction at second to minute time granularity.

Traffic Model
In any stationary time series system, the entire series can be modeled by a single function f(.). However, this is not true for a nonstationary time series: the entire series cannot be determined by a single function f(.), but instead by a set of functions f1, f2, ..., fk, where k is the number of factors that determine each element of the time series. Hence the nonstationary time series is modeled in terms of these functions, where each fi(t) is a random process, that is, by a set of random processes [14]. Since the traffic at an AP or BS is a nonstationary time series, it is determined by the number of users and their arrival pattern, the number of sessions (applications) of each individual user, the session inter-arrival pattern and the size of each session. All of these are random processes [10].
Here f1(t) is the random process of the number of users and the user arrival pattern; f1(t) is a discrete-value, continuous-time process. The user distribution is either a uniform or a lognormal process, and the user arrival pattern is a time-varying Poisson process [18][10] [11]. f2(t) is the random process of the number of sessions and the session inter-arrival pattern; f2(t) is a discrete-value, continuous-time random process. The number of user sessions is a lognormal process or a time-varying Poisson process, and the session inter-arrival is a BiPareto, Weibull, Markov Modulated Poisson Process (MMPP) or time-varying Poisson process [10] [12] [13]. f3(t) is the size of each session; f3(t) is a discrete-value, continuous-time process and is either a BiPareto or a lognormal random process [10] [13].
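As an illustration only, the three random processes described above could be combined into a synthetic traffic series as in the following minimal sketch. The constant Poisson arrival rate (the text describes a time-varying one), the lognormal parameters, and the function names `poisson_sample` and `synthetic_traffic` are assumptions made for this example and are not taken from the trace. Session inter-arrival times are ignored for brevity, so all bytes of a session are counted in the second in which the user arrives.

```python
import math
import random

def poisson_sample(rng, rate):
    """Draw a Poisson-distributed count (Knuth's method, suitable for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def synthetic_traffic(seconds, arrival_rate=0.5, seed=0):
    """Return a list of bytes per second built from three random processes.

    All distribution parameters are hypothetical placeholders.
    """
    rng = random.Random(seed)
    series = [0.0] * seconds
    for t in range(seconds):
        # f1: number of users arriving in this second (Poisson, constant rate here)
        users = poisson_sample(rng, arrival_rate)
        for _ in range(users):
            # f2: number of sessions opened by this user (lognormal, at least one)
            sessions = max(1, round(rng.lognormvariate(0.5, 0.5)))
            for _ in range(sessions):
                # f3: size of each session in bytes (lognormal)
                series[t] += rng.lognormvariate(8.0, 1.5)
    return series

traffic = synthetic_traffic(100)
```

Replacing the constant `arrival_rate` with a function of `t` would give a time-varying Poisson arrival process closer to the one described in the text.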

FARIMA Model
There is a well established theory for statistical modeling and forecasting of stationary time series by the Auto Regressive Moving Average (ARMA) process. As the order of the Auto Regression (AR) increases, the system can capture the seasonal trend of the time series, which leads to a better approximation of the stationary time series, while the Moving Average (MA) term tries to remove the effect of the unknown initial value of the series [15]. The combination of the AR and MA processes leads to high prediction accuracy for stationary time series, with the dependency of the series on unknown initial values removed. However, for a nonstationary time series such an approximation is not possible with a simple ARMA process, because the time series represents only one realisation of a set of stochastic processes. For better forecasting and approximation, the time series can be decomposed into permanent (nonstationary) and transitory (stationary) parts [16].
$$y(t) = y_p(t) + y_k(t)$$
where $y_p(t)$ denotes the permanent component and $y_k(t)$ the transitory component. Such a decomposition helps in determining the Long Range Dependence (LRD) and Short Range Dependence (SRD) properties of the time series separately [15]. A nonstationary process y(t) is transformed into a stationary process by differencing the process d times; the parameter d determines the LRD properties of the process.
$$(1 - L)^d \, y(t) = x(t)$$
where L is the backward shift operator, δ = [y(t) - y(t-1)] is the first difference (1 - L)y(t), and x(t) is an ARMA process; the parameters p and q of the ARMA process determine the SRD properties of the process. The original nonstationary process is denoted as ARIMA(p,d,q). The accuracy of nonstationary time series prediction depends on the estimation of the differencing term d. If the degree of differencing d is allowed to take nonintegral values in the range -½ < d < ½, the process is called a FARIMA(p,d,q) process [15]. Once the nonstationary time series is modeled by a FARIMA(p,d,q) process, the FARIMA(p,d,q) model can be used to predict the future steps of the time series [23].
In the prediction expression, Γ(.) is the gamma function and ŷ(.) is the predicted time series, that is, the n-step prediction from the estimated FARIMA(p,d,q) process for the nonstationary time series.
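The fractional differencing step described in this section can be sketched as follows. The sketch expands (1 - L)^d as a power series in L, whose coefficients follow a simple recursion, and applies a truncated version of that filter to the series. The truncation length `n_terms` and the function names are assumptions made for this example; the source does not state how the filter is truncated.

```python
def frac_diff_weights(d, n_terms):
    """Coefficients w_k of (1 - L)^d = sum_k w_k L^k, from w_0 = 1 and
    w_k = w_{k-1} * (k - 1 - d) / k."""
    weights = [1.0]
    for k in range(1, n_terms):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights

def frac_diff(y, d, n_terms=100):
    """Apply the truncated fractional differencing filter to the series y.

    Values before the start of the series are treated as zero (an assumption).
    """
    w = frac_diff_weights(d, min(n_terms, len(y)))
    x = []
    for t in range(len(y)):
        acc = 0.0
        for k in range(min(t + 1, len(w))):
            acc += w[k] * y[t - k]
        x.append(acc)
    return x

# Sanity check with an integer order: d = 1 reduces to the ordinary first difference.
first_diff = frac_diff([1.0, 3.0, 6.0, 10.0], 1.0)
# A fractional order inside the FARIMA range -1/2 < d < 1/2.
weights = frac_diff_weights(0.3, 5)
```

For d = 1 the weights collapse to 1, -1, 0, 0, ..., which is the ordinary difference y(t) - y(t-1); for fractional d the weights decay slowly, which is how the filter accounts for long range dependence.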

Neural Network Model
A feedforward neural network has the ability to map any nonlinear and nonstationary function to an arbitrary degree of accuracy [27]. One popular feedforward network is the radial basis function (RBF) network, a feedforward network with a single hidden layer. Each node in the hidden layer has a parameter vector called its center. These centers are compared with the network input to produce a radially symmetrical response. The responses are scaled by the connection weights of the output layer to produce the network output; a Gaussian basis function is used here. The RRBFN is a recurrent, globally feedforward neural network [29]. In the RRBFN output for the Gaussian basis function, ŷ(.) is the predicted time series, n is the number of prediction steps and j is the number of neurons in the input layer of the RRBFN system. The architecture of the RRBFN model is shown in Figure 1.
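The forward pass of a plain (non-recurrent) RBF network, as described above, can be sketched as follows. The Gaussian form exp(-||x - c||^2 / (2 σ^2)), the per-node width parameter and the function names are assumptions made for this illustration; the recurrent input layer of the RRBFN is not included.

```python
import math

def gaussian_rbf(x, center, width):
    """Radially symmetric response of one hidden node (assumed Gaussian form)."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def rbf_output(x, centers, widths, weights, bias=0.0):
    """Network output: hidden responses scaled by the output-layer weights."""
    activations = [gaussian_rbf(x, c, s) for c, s in zip(centers, widths)]
    return bias + sum(w * a for w, a in zip(weights, activations))

# Two hidden nodes with hypothetical centers, widths and output weights.
y_hat = rbf_output(
    x=[0.0, 0.0],
    centers=[[0.0, 0.0], [1.0, 0.0]],
    widths=[1.0, 1.0],
    weights=[2.0, 1.0],
)
```

The first node sits exactly on the input and therefore responds with 1, while the second responds with exp(-0.5); the output is the weighted sum of the two responses.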
A recurrent neural network trained with standard gradient descent algorithms provides good function approximation for short time steps. For longer temporal dependencies, the gradient vanishes as the error signal is propagated back through time, so the network weights are never adjusted correctly and the system fails to predict longer and more complex time series steps. To deal with this, the echo state network was proposed [30]. It consists of two parts: a dynamical system with a rich set of dynamics, followed by a memoryless output readout function, as shown in Figure 2.
The dynamical system consists of a large number of neurons that are randomly interconnected and self-connected, and these connections are fixed. This dynamical system is also called the reservoir, and finding the optimal connections of neurons inside the reservoir always requires a number of trials. During the training process, only the weights of the memoryless output function are changed, through an offline linear regression process or by online methods such as recursive least squares (RLS). The general state of the reservoir is given in terms of the following quantities: ϕ is the sigmoidal activation function, y(n) is the current input vector, x(n-1) is the internal state of the reservoir at time step n-1,
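A minimal sketch of a reservoir state update of the kind described above is given below. It assumes the commonly used form x(n) = ϕ(W_in y(n) + W x(n-1)) with tanh as the sigmoidal function; the matrix names `w_in` and `w_res`, the weight ranges, the connection density and the reservoir size are hypothetical choices for this example. Training of the output readout is not shown.

```python
import math
import random

def make_reservoir(n_inputs, n_neurons, density=0.75, seed=0):
    """Create fixed random input and recurrent weights (illustrative ranges)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.4, 0.4) for _ in range(n_inputs)]
            for _ in range(n_neurons)]
    # Each recurrent connection exists with probability `density`; others are zero.
    w_res = [[rng.uniform(-0.6, 0.6) if rng.random() < density else 0.0
              for _ in range(n_neurons)]
             for _ in range(n_neurons)]
    return w_in, w_res

def update_state(w_in, w_res, y_n, x_prev):
    """One reservoir step: combine current input y(n) with previous state x(n-1)."""
    new_state = []
    for i in range(len(x_prev)):
        drive = sum(w * u for w, u in zip(w_in[i], y_n))
        drive += sum(w * s for w, s in zip(w_res[i], x_prev))
        new_state.append(math.tanh(drive))  # tanh used as the sigmoidal function
    return new_state

# Drive a small reservoir with a short, made-up input sequence.
w_in, w_res = make_reservoir(n_inputs=1, n_neurons=20)
state = [0.0] * 20
for sample in [0.1, 0.5, 0.3, 0.8]:
    state = update_state(w_in, w_res, [sample], state)
```

Because `w_in` and `w_res` stay fixed after creation, only a readout that maps `state` to the predicted value would need to be fitted, for example by linear regression over the collected states. In practice the recurrent matrix is usually rescaled to a chosen spectral radius, which this sketch omits.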

TRAFFIC PREDICTION
The traffic prediction process involves two major phases: the first phase is the extraction of the data traffic time series, and the second phase is the prediction of the wireless data traffic time series using neural network and statistical models. For time series extraction, the wireless trace available at the CRAWDAD data repository is used [8]. The original dataset contains the traffic trace of 476 wireless APs over a period of 11 weeks [17]; here the traffic prediction is limited to a single AP at time granularities of one second, ten seconds and one minute. The original trace file is filtered to obtain the traffic of a single AP; for this, the SNMP trace is used to extract the detailed traffic from the tcpdump file. The filtered traffic trace is further processed to obtain the amount of traffic in bytes per unit time. The filtered trace contains the traffic between the AP and the set of wireless users over a period of one day. This traffic is combined and arranged to form time series at the different time scales of 1 second, 10 seconds and 1 minute. Once the time series is extracted from the network trace, the traffic is forecast for different time steps, where a step refers to the unit of time. The filtered traffic has to be normalized for the purpose of n-step ahead prediction [31].
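The extraction of a bytes-per-unit-time series at a chosen granularity, followed by normalization, could look like the sketch below. The input format (a list of `(timestamp_seconds, size_bytes)` pairs), the function names and the choice of min-max normalization are assumptions made for this example; the source does not specify the trace parsing or the normalization method.

```python
import math

def bytes_per_interval(packets, interval, duration):
    """Sum packet sizes into consecutive bins of `interval` seconds.

    packets: iterable of (timestamp_seconds, size_bytes), timestamps relative
    to the start of the trace (hypothetical input format).
    """
    n_bins = int(math.ceil(duration / interval))
    series = [0] * n_bins
    for timestamp, size in packets:
        index = int(timestamp // interval)
        if 0 <= index < n_bins:
            series[index] += size
    return series

def min_max_normalize(series):
    """Scale the series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0] * len(series)
    return [(v - lo) / (hi - lo) for v in series]

# Made-up packets; the same trace binned at 1 second and 10 second granularity.
packets = [(0.2, 100), (0.9, 50), (1.5, 200), (12.0, 10)]
per_second = bytes_per_interval(packets, interval=1.0, duration=3.0)
per_ten_seconds = bytes_per_interval(packets, interval=10.0, duration=20.0)
normalized = min_max_normalize(per_second)
```

Calling `bytes_per_interval` with `interval=60.0` would give the one minute series in the same way.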
Neural network based traffic prediction involves training and testing of the RRBFN and ESN predictors. The traffic time series extracted in the first phase is used for training and testing the predictors. The training and testing samples are randomly picked from a sample of size 1000. The RRBFN network has three layers: input, hidden and output. The input layer has 300 neurons with sigmoidal activation function and recurrent connections; the recurrent weights range from -1 to +1. The hidden RBF layer has 175 neurons with RBF activation, and the output layer has a single neuron with linear activation. The ESN network has 350 neurons in the reservoir with 75% recurrent connections, with weights in the range -1 to +1. Input weights are in the range of -0.40 to +0.40 and recurrent weights are in the range of -0.6 to +0.6. The spectral radius is set to 0 and the neurons in all layers have sigmoidal activation functions. The predicted step sizes are 1 and 10 for both RRBFN and ESN.
In FARIMA based traffic prediction, the first step is to build a FARIMA(p,d,q) model that describes the wireless trace, and the next step is to predict the network traffic at different time scales using the FARIMA model. The process of building the FARIMA model involves four stages [23]. The first stage is to preprocess the normalized traffic to obtain a zero mean time series. The next is to approximate the value of d, where d = H - ½ and H is the Hurst parameter; the Hurst parameter is estimated from the variance-time plot of the normalized trace. The third is to obtain the exact value of d by fractionally differencing y(t) to obtain x(t) of equation 3, where x(t) is the ARMA model of the original time series. The final stage is to estimate the ARMA process parameters p and q: initially p and q should start from the smallest values, and the best (p, q) is obtained by the model identification and diagnostic checking process [15]. The model parameters for the various time granularities are listed in Table 1. From the estimated FARIMA process and the current and past values of the time series, the future steps of the time series are predicted. The predicted step sizes are 1 and 10.
The Complementary-CDF (CCDF) plots for the trace and the predicted samples for 1 step and 10 step prediction at 1 second granularity are shown in Figure 9 and 10. The CCDF plots for the model output and the trace at 10 second granularity are shown in Figure 11 and 12. Figure 13 and 14 show the 1 and 10 step predicted output at 1 minute granularity. The Quantile-Quantile plots (Q-Q plots) of the RRBFN, ESN and FARIMA model outputs and the trace for 1 and 10 steps at 1 second granularity are shown in Figure 14 and 15. Figure 16 and 17 are the Q-Q plots for 1 and 10 steps at 10 second granularity. The Q-Q plots for 1 and 10 step prediction at 1 minute granularity are shown in Figure 18 and 19.
The normalized Auto Correlation Function (ACF) plots for the trace and the predicted samples for 1 step and 10 step prediction at 1 second granularity are shown in Figure 21 and 22. The ACF plots for the model output and the trace at 10 second granularity are shown in Figure 23 and 24. Figure 25 and 26 show the ACF plots for the 1 and 10 step prediction output at 1 minute granularity. From the CCDF plots, the functional behavior of the trace and the model output can be visually determined. From the Q-Q and ACF plots, the first and second order statistical relationships between the normalized trace and the predicted output of the neural and statistical models can be visually determined [10].
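The estimation of the Hurst parameter from a variance-time plot, and the resulting initial value d = H - ½, can be sketched as follows. The sketch aggregates the series over blocks of increasing size m, fits a straight line to log(variance) against log(m), and uses H = 1 + slope/2. The block sizes, the white-noise test series and the function names are assumptions made for this example, not the settings used in the paper.

```python
import math
import random

def block_means(series, m):
    """Average the series over non-overlapping blocks of length m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def hurst_variance_time(series, block_sizes):
    """Estimate H from the slope of log(variance of block means) vs log(m)."""
    log_m, log_var = [], []
    for m in block_sizes:
        aggregated = block_means(series, m)
        if len(aggregated) < 2:
            continue  # not enough blocks to compute a variance
        var = variance(aggregated)
        if var > 0.0:
            log_m.append(math.log(m))
            log_var.append(math.log(var))
    if len(log_m) < 2:
        raise ValueError("need at least two usable block sizes")
    mean_x = sum(log_m) / len(log_m)
    mean_y = sum(log_var) / len(log_var)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(log_m, log_var))
             / sum((x - mean_x) ** 2 for x in log_m))
    return 1.0 + slope / 2.0

# Uncorrelated noise as a stand-in for a normalized trace; H should be near 0.5.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(8192)]
H = hurst_variance_time(noise, [1, 2, 4, 8, 16, 32])
d = H - 0.5  # initial differencing parameter for the FARIMA model
```

For a long range dependent trace the variance of the block means decays more slowly than 1/m, so the fitted slope is closer to zero and the estimate of H moves toward 1.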

CONCLUSION
In this paper, wireless network traffic at short time steps is modeled as a time series, and future values of the time series are predicted by the RRBFN, ESN and FARIMA models. The prediction accuracy of both neural network models is found to be superior at 1 second, 10 second and 1 minute time granularity. The prediction accuracy of the RRBFN model across the three time granularities ranges from 96.4% to 98.3%. The prediction accuracy of the ESN model across the three time granularities is found to be 97.8% to 98.1%. The prediction accuracy of the FARIMA model across the three time granularities is less satisfactory compared to the neural models; it is in the range of 78.5% to 80.2%. The FARIMA prediction accuracy increases slightly at 1 minute granularity, because time aggregation makes the wireless trace exhibit a high degree of self-similarity; the Hurst parameter H estimated from the variance-time plot is 0.8457. At shorter time granularities, the time series behavior is nonlinear and nonstationary due to burstiness in the network traffic and user behavior.
Among the neural network models, the ESN based predictor is slightly better in terms of prediction accuracy, but the network requires more training samples and a longer training time than RRBFN. During the training phase, the training Mean Squared Error (MSE) for ESN and RRBFN is 0.01254 and 0.0162, respectively.
In general, the results show that the neural network predictors perform better than the statistical model. The performance of the FARIMA model depends on parameter estimation; the process of parameter estimation is trial and error, and this affects the performance of the FARIMA model to a large extent. The burstiness of wireless network traffic is better captured by the neural network predictors: the long range dependency is captured by adjusting the weights of the neurons during the training phase, while the short range dependency of the input is handled by the recurrent architecture of the network.
Future work includes applying NN models to long term prediction of wireless traffic. Furthermore, the NN based predictors will be applied to traffic prediction and Bit Error Rate (BER) prediction for the purpose of inter- and intra-network handovers and admission control in heterogeneous wireless network infrastructure.