An Efficient Extreme Learning Machine Based on Fuzzy Information Granulation

In order to improve the learning efficiency and generalization ability of the extreme learning machine (ELM), an efficient extreme learning machine based on fuzzy information granulation (FIG) is put forward. The new approach not only improves the speed of the basic ELM algorithm when many hidden nodes are used, but also overcomes the basic ELM's weakness of low learning efficiency and generalization ability by getting rid of redundant information in the observed values. Experimental results on several regression and classification problems show that the proposed method is effective and produces desirable generalization performance in most cases.

Keywords—extreme learning machine (ELM), fuzzy information granulation (FIG), neural networks, support vector machine (SVM)


Introduction
Extreme learning machine (ELM) [1], proposed by Huang in 2004, works for "generalized" single-hidden-layer feed-forward networks (SLFNs) [1, 2], but there is no need to tune the hidden layer (also called the feature mapping) in ELM [3]. Because ELM's input weights and hidden neurons' biases do not need to be adjusted during training and may simply be assigned random values, the learning phase of many applications can be completed within seconds [4]. Compared with classical methods such as neural networks [5] and support vector regression (SVR) [6], ELM has been shown to require fewer optimization constraints and to result in simpler implementation, faster learning, and better generalization performance.
Owing to the simplicity of its implementation, ELM has been used extensively in classification and regression applications. However, because some of ELM's parameters (the input weights and hidden biases) are chosen randomly, its learning efficiency and generalization ability cannot be guaranteed. There are usually two ways to address this problem. One is to search for the best parameters, which some scholars have done with intelligent optimization algorithms. The other is to increase the number of hidden units so that the randomly generated parameters approach the best parameters [7]. Nevertheless, increasing the number of hidden units adds to the amount of computation and is inconvenient for the implementation.
In recent years, information granulation (IG) has been used increasingly as an effective technique for removing redundant information from observed values. IG is the process of forming meaningful entities of information, and fuzzy modeling can conveniently be adopted for information granulation, i.e. fuzzy information granulation (FIG). The FIG approach transforms the primary data into a sequence of granules by setting the size of the granulation window, generating the granulated sets [8]. Historical observation data are selected for fuzzy information granulation, and this process reduces the training time, improves the training efficiency and preserves the test accuracy.
Enlightened by the idea of the FIG-SVM (support vector machine based on fuzzy information granulation) algorithm [9], this paper puts forward an efficient extreme learning machine based on fuzzy information granulation (FIG-ELM). The paper is organized as follows. In Section 2, the extreme learning machine (ELM) is introduced. The efficient extreme learning machine based on fuzzy information granulation (FIG-ELM) is discussed in Section 3, and in Section 4 simulation results on some recognized benchmark regression and classification problems are given. In Section 5, conclusions are drawn.

Extreme Learning Machine
The typical single hidden layer feed-forward network (SLFN) can be expressed as

    y_k = Σ_{j=1}^{l} ω_jk g( Σ_{i=1}^{n} W_ij x_i + b_j ),  k = 1, 2, …, m

where x and y respectively represent the input and the output, W_ij is the weight connecting the i-th input node and the j-th hidden node, and ω_jk is the weight connecting the j-th hidden node and the k-th output node.

For Q arbitrary distinct samples (x_j, t_j), where g(x) is the activation function and b is the threshold, ELM's mathematical model is

    Σ_{i=1}^{l} β_i g( w_i · x_j + b_i ) = t_j,  j = 1, 2, …, Q    (1)

where w_i = [w_i1, w_i2, …, w_in]^T is the weight vector connecting the input nodes and the i-th hidden node, x_j = [x_1j, x_2j, …, x_nj]^T is the j-th input vector, β_i = [β_i1, β_i2, …, β_im]^T is the weight vector connecting the i-th hidden node and the output nodes, and b_i is the threshold of the i-th hidden node, i = 1, 2, …, l. Eq. (1) can be rewritten compactly as

    Hβ = T'    (2)

where T' is the transposition of matrix T and H is called the hidden layer output matrix of the network.

According to Eq. (1) and Eq. (2), matrix H is known once the activation function g(x) is given and the parameter values (w, b) are randomly assigned. The output weight vector β can therefore be solved from Eq. (2), and the solution is

    β = H⁺T'    (3)

where H⁺ is the Moore-Penrose generalized inverse of matrix H.
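The ELM training procedure described above can be sketched in a few lines of NumPy: randomly assign the input weights and biases, build the hidden layer output matrix H, and solve for the output weights with the Moore-Penrose pseudo-inverse. This is an illustrative sketch (function names, the sigmoid activation and the uniform [-1, 1] initialization are assumptions, not prescriptions from the paper):

```python
import numpy as np

def elm_train(X, T, n_hidden, rng=None):
    """Train a basic ELM.

    X : (Q, n) input samples; T : (Q, m) targets.
    Returns (w, b, beta): random input weights, random biases,
    and output weights beta = H+ T (Eq. (3)).
    """
    rng = np.random.default_rng(rng)
    n = X.shape[1]
    # Randomly assign input weights w_i and hidden biases b_i; they are never tuned.
    w = rng.uniform(-1.0, 1.0, size=(n_hidden, n))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Hidden layer output matrix H with a sigmoid activation g.
    H = 1.0 / (1.0 + np.exp(-(X @ w.T + b)))
    # Output weights via the Moore-Penrose generalized inverse.
    beta = np.linalg.pinv(H) @ T
    return w, b, beta

def elm_predict(X, w, b, beta):
    """Evaluate the trained SLFN on new inputs."""
    H = 1.0 / (1.0 + np.exp(-(X @ w.T + b)))
    return H @ beta
```

Because the only learned quantity is the linear least-squares solution for β, training reduces to a single pseudo-inverse computation, which is the source of ELM's speed.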
An Efficient Extreme Learning Machine Based on Fuzzy Information Granulation (FIG-ELM)

Fuzzy Information Granulation (FIG)
The concept of fuzzy information granulation (FIG) was suggested by Lotfi A. Zadeh in the 1970s. The FIG approach transforms the original data into a sequence of granules, giving a more general view of the data that retains only the most dominant components of the original temporal series [8].
For a given time series, the whole series X can be regarded as a window for fuzzification. The task of fuzzification is to establish a fuzzy granule P on X which can reasonably describe a fuzzy concept G of X [9]. Fuzzification is essentially the process of determining a function A, where A is the membership function of G. Generally speaking, there are several common forms of fuzzy granules: the triangular, trapezoidal, Gaussian and parabolic types, among others. In this paper the triangular type is chosen. The structure of the triangular granule is shown in Figure 2, and its membership function can be constructed as

    A(x) = 0,                  x < a
           (x − a)/(m − a),    a ≤ x ≤ m
           (b − x)/(b − m),    m < x ≤ b
           0,                  x > b

where the parameters a, m and b correspond to the three fuzzy particles Low, R and Up of the granule.

ELM need not spend much time tuning the input weights and hidden biases of the SLFN, since these parameters are chosen randomly. Because the output weights are computed from the input weights and hidden biases, a set of non-optimal or unnecessary input weights and hidden biases inevitably exists. Thus, in order to make the randomly generated parameters approach the best parameters, the number of hidden units has to be increased at the cost of a longer training time.
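The triangular membership function described above can be evaluated as follows. This is a minimal sketch; the handling of the breakpoints (strict vs. non-strict inequalities) and the requirement Low < R < Up are assumptions:

```python
import numpy as np

def triangular_membership(x, low, r, up):
    """Membership degree A(x) of a triangular fuzzy granule.

    low, r, up are the three fuzzy particles (Low < R < Up);
    A rises linearly from low to r, falls linearly from r to up,
    and is zero outside [low, up].
    """
    x = np.asarray(x, dtype=float)
    a = np.zeros_like(x)
    rising = (x >= low) & (x <= r)
    falling = (x > r) & (x <= up)
    a[rising] = (x[rising] - low) / (r - low)
    a[falling] = (up - x[falling]) / (up - r)
    return a
```

The membership peaks at 1 for x = R and decays to 0 at the two endpoints Low and Up, which is exactly the piecewise-linear shape shown in Figure 2.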
In view of the situation above, an efficient approach named FIG-ELM, combining ELM with FIG, is proposed. In order to reduce ELM's training time and improve its learning efficiency and generalization ability, FIG is first exploited to process the original data set, i.e. to remove redundant information from the observed values, and then ELM is trained on the granulated data for prediction.
The whole algorithmic process is shown in Figure 3.
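The granulation step of the FIG-ELM pipeline can be sketched as below. The concrete construction of the three particles per window (Low = window minimum, R = window median, Up = window maximum) is an illustrative assumption; the paper's exact fitting procedure for the triangular granule may differ:

```python
import numpy as np

def fig_granulate(series, window):
    """Split a series into non-overlapping windows of the given size and
    granulate each window into a triangular fuzzy granule (Low, R, Up).

    Assumed construction: Low = min, R = median, Up = max of each window.
    Any incomplete trailing window is dropped.
    """
    series = np.asarray(series, dtype=float)
    n_win = len(series) // window
    low, r, up = [], [], []
    for k in range(n_win):
        w = series[k * window : (k + 1) * window]
        low.append(w.min())
        r.append(np.median(w))
        up.append(w.max())
    return np.array(low), np.array(r), np.array(up)
```

As in the experiments of Section 4, only the Low sequence would then be fed to ELM for prediction, so Q raw observations shrink to Q/window granulated training samples.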

Experimental Results and Analyses
In order to demonstrate the capabilities of the proposed approach, some regression and classification problems are selected. All programs are run in the MATLAB 2010a environment. In every simulation, three fuzzy particles are obtained after granulation: Low, R and Up; to simplify the process, only the fuzzy particle Low is chosen for prediction. The training data set and testing data set are randomly generated, with 1500 groups of data used for training and 500 groups for testing. Figure 4 and Figure 5 show the original data and the new data processed by FIG, respectively. Figure 6 shows the fitting results obtained by the four learning algorithms. Table 1 summarizes the results of the real-world regression problem with regard to the training time, maximum absolute error, least absolute error and average error; it can be seen from Table 1 that the proposed algorithm's performance is prominent.

Nonlinear Function Approximation: The comparison results are shown in Table 2. Figure 7 shows the true values of the nonlinear function and the approximated values obtained by the four learning algorithms. As can be seen from Figure 7 (a) and (b), the testing errors of the BP and SVM learning algorithms are obvious. From Figure 7 and Table 2 we can see that FIG-ELM has the best prediction performance: it achieves not only the shortest training time but also the minimum MSE.

Small Size Classification Problem-Breast Cancer: After granulation, a smaller number of granulated training groups can represent the original 500 groups of training data. One hundred trials have been conducted to compare the performance of the algorithms. Simulation results, including the average training accuracy, the average testing accuracy and the training time, are shown in Table 3. It is easy to see from Table 3 that SVM's testing accuracy is the lowest even though it obtains the best training accuracy. BP performs very poorly in this case, with a testing accuracy of only 0.739 and the longest training time. The LVQ and DT algorithms obtain better training time and training and testing accuracy, but are still not satisfactory. By contrast, the FIG-ELM algorithm runs the fastest among the five algorithms and obtains the best performance.
Medium Size Classification Problem-Character Recognition: The total data comprise 6000 groups; 5000 groups are randomly selected for training and the remaining 1000 groups for testing. In addition, the number of FIG granulation windows is 500. One hundred trials have been conducted to compare the performance of the algorithms. Simulation results, including the average training accuracy, the average testing accuracy, the training time and the number of hidden nodes, are shown in Table 4. As observed from Table 4, FIG-ELM again obtains the best overall performance.

Conclusions
This paper combines the extreme learning machine (ELM) with fuzzy information granulation (FIG). The new approach not only improves the speed of the basic ELM algorithm when many hidden nodes are used, but also overcomes the basic ELM's weakness of low learning efficiency and generalization ability. The effectiveness is demonstrated by simulating four examples: two regression problems (a real-world regression data set and a nonlinear function approximation) and two classification problems (Breast Cancer and Character Recognition). Experimental results show that the FIG-ELM algorithm has a high potential for enhancing predictive accuracy and robustness while reducing the training time.
Furthermore, efforts will be made to achieve a higher level of efficiency. We also plan to investigate improved variants of the ELM algorithm, such as online sequential ELM.