Application of Teaching Quality Assessment Based on Parallel Genetic Support Vector Algorithm in the Cloud Computing Teaching System

With the rapid development of China's economy in recent years, student enrollment has expanded steadily, which has led to many new problems, including shortfalls in the quality and number of teachers and insufficient teaching facilities. The assessment of teaching quality is one of the most important aspects of teaching management and has come to the attention of every university; it has therefore become a current focus of research on university teaching. At the same time, traditional methods of teaching quality assessment cannot cope with big data in the field of education. As a new technology, cloud computing provides broad scope for developing new models of hardware environment construction, software resource development, network teaching implementation and personal knowledge management. In order to deal effectively with the challenges of big data processing in education, this paper proposes a GA-SVM teaching quality assessment algorithm based on MapReduce. First, through the design of a map function and a reduce function, the paper parallelizes the GA-SVM algorithm and the selection of its main parameters. Second, it uses a genetic algorithm to optimize the penalty coefficient $C$ and the kernel parameter $\sigma^2$ of the SVM, which addresses the difficulty of determining the SVM parameters. In addition, we improve the sensitivity of the search through a logarithmic transformation and speed up the convergence of the GA model. Finally, we compare the parallel algorithm with the serial algorithm on the Hadoop platform. The experimental results show that the MapReduce-based GA-SVM is suitable for teaching quality assessment in a big data environment.


I. INTRODUCTION
With the rapid development of China's economy in recent years, student enrollment has expanded steadily, which has led to many new problems, including shortfalls in the quality and number of teachers and insufficient teaching facilities. The assessment of teaching quality is one of the most important aspects of teaching management and has come to the attention of every university; it has therefore become a current focus of research on university teaching. At the same time, traditional methods of teaching quality assessment cannot cope with big data in the field of education. As a new technology, cloud computing provides broad scope for developing new models of hardware environment construction, software resource development, network teaching implementation and personal knowledge management.
In view of the problems of teaching quality assessment, many scholars have conducted thorough research and put forward many effective methods. The traditional methods of teaching quality assessment mainly include linear regression, partial least squares, multivariate statistical analysis, grey relational analysis and the analytic hierarchy process [1][2][3][4]. These methods assume a linear relationship between teaching quality and the assessment indexes. However, the relationship between the assessment indexes and teaching quality is nonlinear, so a linear model cannot describe the problem accurately, which leads to a significant difference between the assessment results and the actual values [5,6]. In recent years, data mining technology has matured, and nonlinear methods such as cluster analysis, fuzzy methods, support vector machines and neural networks have been applied to teaching quality assessment [7]. These data mining methods capture the mapping between input and output well and improve the accuracy of teaching quality assessment. The support vector machine is a machine learning method suited to small-sample, nonlinear and high-dimensional data. Compared with other learning methods, it has better generalization capability and is one of the most widely used methods in data mining.
Research on the support vector machine algorithm focuses mainly on the following aspects. (1) Decomposition algorithms. Processing big data usually takes a long time, so decomposition algorithms are introduced to improve efficiency. Commonly used decomposition algorithms include the chunking algorithm [8], the fixed working sample set algorithm [9], and the SMO algorithm [10]. (2) Online learning algorithms. These are divided into online incremental learning and online reduced learning [11][12][13]. When new data is added to the training data set, the support vector machine model does not need to be retrained, which simplifies the training process. (3) Multi-class classification algorithms. The support vector machine is usually used to solve two-class problems; multi-class classification is a generalization of the two-class support vector machine [14]. (4) Privacy protection algorithms. With the growing emphasis on privacy, privacy-preserving support vector machines have been proposed. These methods can train an accurate support vector machine on the data without disclosing the private data of the parties involved [15][16][17][18][19].
To sum up, in order to deal effectively with the challenges of big data processing in the field of education, this paper proposes a GA-SVM teaching quality assessment algorithm based on MapReduce. First, through the design of a map function and a reduce function, the paper parallelizes the GA-SVM algorithm and the selection of its main parameters. Second, it uses a genetic algorithm to optimize the penalty coefficient $C$ and the kernel parameter $\sigma^2$ of the SVM, which addresses the difficulty of determining the SVM parameters. In addition, we improve the sensitivity of the search through a logarithmic transformation and speed up the convergence of the GA model. Finally, we compare the parallel algorithm with the serial algorithm on the Hadoop platform. The experimental results show that the MapReduce-based GA-SVM is suitable for teaching quality assessment in a big data environment.

II. CLOUD COMPUTING PLATFORM OF TEACHING QUALITY ASSESSMENT DATA BASED ON HADOOP
Next, we introduce the architecture of the cloud computing education platform. The platform is composed of a data layer, model layer and application layer.
Data layer: The data layer sits at the bottom of the platform and is responsible for data storage and access. For example, Google uses the Google File System (GFS), and the open-source Hadoop project uses the Hadoop Distributed File System (HDFS).
Model layer: Computing power is an important indicator of cloud computing. The cloud computing platform must provide a simple and convenient computing model that delivers high-quality, highly reliable computing power. The computing model of the cloud platform belongs to the category of parallel computing. Since the cloud computing data center is highly centralized, the node failure problem encountered with MPI does not arise. At present, MapReduce is the computing model most commonly used in cloud computing.
Application layer: The application layer provides various teaching-related application software. This layer mainly contains: (1) a platform for teaching management, an academic management system and an office system; (2) a teaching quality assessment system, an examination system and a score inquiry system; (3) document processing software and courseware planning and presentation software; (4) a virtual computing environment; (5) a virtual computing laboratory based on cloud computing.

A. SVM model
The support vector machine is derived from statistical learning theory and is a supervised learning method. Its basic idea is to map the input space to a high-dimensional feature space by a nonlinear transformation, and then to find the optimal linear classification surface in the new space. The nonlinear transformation is realized by the kernel function, so the selection of the kernel function is key to the SVM algorithm. The basic principle of SVM is shown in Figure 1, where "+" and "-" represent the two classes of training samples, $x_1$ and $x_2$ are the two feature items of a sample, and $H$ is the decision boundary. The SVM model must minimize the risk, so the optimal classification surface must not only classify the two classes of data correctly but also maximize the classification margin. This gives the support vector machine better generalization capability.
When the SVM algorithm is used to estimate a regression function, it maps the data $x$ of the input space to a high-dimensional feature space by a nonlinear mapping $\varphi$, and linear regression is then performed in the high-dimensional space. Assume a set of data points $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i$ is the input vector, $y_i$ is the expected value, and $n$ is the total number of data points.
SVM uses the following formula to estimate the function:
$$f(x) = w \cdot \varphi(x) + b \qquad (1)$$
where $\varphi$ is a nonlinear mapping from the input space to the high-dimensional feature space, and the coefficients $w$ and $b$ are estimated by minimizing the following formula:
$$R(w, b) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} L_\varepsilon\big(y_i, f(x_i)\big) \qquad (2)$$
where $L_\varepsilon$ is the $\varepsilon$-insensitive loss function, $L_\varepsilon\big(y, f(x)\big) = \max\big(0, |y - f(x)| - \varepsilon\big)$, and $\varepsilon$ is the insensitivity parameter.
Because $L_\varepsilon$ is the $\varepsilon$-insensitive loss function, formula (2) can be translated into the following convex quadratic programming problem, with slack variables $\xi_i$ and $\xi_i^*$:
$$\min_{w, b, \xi, \xi^*} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*)$$
Its constraint conditions are:
$$y_i - w \cdot \varphi(x_i) - b \le \varepsilon + \xi_i, \qquad w \cdot \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \qquad \xi_i, \xi_i^* \ge 0, \quad i = 1, \dots, n$$
Thus, with Lagrange multipliers $\alpha_i$ and $\alpha_i^*$, the predicted output can be obtained as:
$$f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) \, \varphi(x_i) \cdot \varphi(x) + b$$
Using the kernel function method, the following is obtained:
$$K(x_i, x) = \varphi(x_i) \cdot \varphi(x)$$
Finally, the nonlinear prediction model is:
$$f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) \, K(x_i, x) + b$$

B. GA model
A genetic algorithm is a parallel random search optimization method that simulates the genetic mechanism and biological evolution found in nature. It applies the evolutionary principle of survival of the fittest to a population of encoded strings. Using a fitness function together with selection, crossover and mutation, it screens individuals and forms a new population. The new population inherits the information of the previous generation and is superior to it. By repeating this process, the fitness of the individuals in the population improves continuously until a termination condition is met. The genetic algorithm is simple and practical, can be processed in parallel, and searches for a global optimal solution. The penalty coefficient $C$ and the kernel parameter $\sigma^2$ are the main parameters that affect the performance of the SVM model. The main parameters are set in this paper as follows:

1) Genetic encoding rules.
Taking into account the influence of the parameters $C$ and $\sigma^2$ on the generalization performance of the SVM, we set their search ranges to 0.01 to 1000 and 0.01 to 100 respectively. To improve the sensitivity of the search, we apply a logarithmic transformation to the parameters and obtain $\lg C$ and $\lg \sigma^2$, so the search ranges become -2 to 3 and -2 to 2. We then use a 10-bit binary string to represent each of the two decision variables $\lg C$ and $\lg \sigma^2$, which gives 1024 discrete points in each range. Finally, we concatenate the two binary strings to form a single 20-bit binary string. This is the encoding method used in this paper.
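The encoding above can be illustrated with a short decoding sketch. It is a minimal example under stated assumptions, not the authors' implementation: the function names are hypothetical, the first 10 bits are assumed to hold $\lg C$ and the last 10 bits $\lg \sigma^2$, and the 1024 grid points are assumed to be evenly spaced with both ends of each range included.

```python
def decode_gene(bits, low, high):
    """Map a binary string to a value on an evenly spaced grid in [low, high]."""
    levels = 2 ** len(bits) - 1  # 1023 steps between the 1024 points of a 10-bit gene
    return low + int(bits, 2) * (high - low) / levels


def decode_chromosome(chromosome):
    """Split a 20-bit chromosome into (C, sigma^2).

    Assumption: bits 0-9 encode lg C, bits 10-19 encode lg sigma^2.
    """
    if len(chromosome) != 20 or set(chromosome) - {"0", "1"}:
        raise ValueError("expected a 20-character string of 0s and 1s")
    lg_c = decode_gene(chromosome[:10], -2.0, 3.0)       # lg C in [-2, 3]
    lg_sigma2 = decode_gene(chromosome[10:], -2.0, 2.0)  # lg sigma^2 in [-2, 2]
    # Undo the logarithmic transformation to recover the SVM parameters.
    return 10.0 ** lg_c, 10.0 ** lg_sigma2


if __name__ == "__main__":
    # All zeros is the lower corner of the search range, all ones the upper corner.
    print(decode_chromosome("0" * 20))
    print(decode_chromosome("1" * 20))
```

Because the grid is uniform in the logarithm, neighbouring chromosomes differ by a constant factor in $C$ and $\sigma^2$ rather than a constant amount, which is what makes the search more sensitive at the small end of each range.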

2) Determination of the fitness function.
The prediction accuracy of the support vector machine is the final standard by which the model is measured, so this paper takes the root mean square error on the prediction samples as the fitness function.
The process of the whole algorithm is as follows:
Step 1: Each subset is marked with a (key, value) pair, and $(k_1, v_1)$ is used as the input of the map node, where the key is the class of the sample subset and the value is the feature items of the sample data.
Step 2: After processing by the map function, the intermediate output is $(k_2, v_2)$, where the key is the class of the sample subset and the value is the support vectors (SVs) of the subset.
Step 3: In the reduce preprocessing phase, the SVs of the same class are gathered into $(k_2, \mathrm{list}(v_2))$ and passed to the reduce node.
Step 4: Finally, the reduce function processes this input and produces the final output.
Firstly, we establish the index system of teaching quality assessment. Teaching quality assessment is a multi-level, multi-index problem. In this paper, the teaching quality evaluation system consists of seven indexes: the quality of teachers, teaching attitude, teaching content, teaching methods, teaching design, teaching ability and teaching effect. The importance of each index is different. Their specific explanations are as follows:
(1) The quality of teachers: The standard of Mandarin.
(3) Teaching content: The scientific nature and practicability of knowledge; cultivation of learning capability.
(4) Teaching methods: The guidance of learning methods; instructiveness of the educational method; the application of multimedia teaching.
(5) Teaching design: The setting of teaching goal; design of teaching strategy; design of teaching process.
(6) Teaching ability: Thorough analysis of the problem; accurate experimental operation.
(7) Teaching effect: Student feedback; student testing; achievement of teaching objectives.
Whether the assessment system is reasonable and reliable depends on various factors, and the index system is the basis on which students are guided to grade these factors. Because each participant's scoring is to some degree subjective and random, this paper uses the mean value method, which makes the assessment result more authoritative and reliable. According to the comprehensive principle of teaching quality assessment, different assessment results are assigned different weights to calculate the comprehensive assessment score. Next, we obtain a list of the original data. Because of the large amount of data, we give only a fragment of the original data list, as shown in Table 1.
Since the dimensions of the various indicators are different, they cannot be compared directly. To make the indexes comparable and to speed up the convergence of the GA model, this paper normalizes each index.
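The preprocessing described here (averaging scores, weighting, and normalizing each index) might look roughly like the following sketch. The paper does not state which normalization it uses, so min-max scaling to [0, 1] is an assumption, and the function names, sample scores and weights are hypothetical values chosen only for illustration.

```python
from statistics import mean


def average_ratings(ratings_by_rater):
    """Mean value method: average several raters' scores index by index."""
    return [mean(column) for column in zip(*ratings_by_rater)]


def min_max_normalize(values):
    """Rescale one index column to [0, 1] (assumed scheme); a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weighted_score(scores, weights):
    """Combine several scores into one comprehensive score using the given weights."""
    if len(scores) != len(weights):
        raise ValueError("one weight per score is required")
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


if __name__ == "__main__":
    # Two raters scoring one teacher on the seven indexes (hypothetical data).
    ratings = [
        [90, 85, 88, 80, 92, 86, 84],
        [80, 95, 82, 90, 88, 84, 86],
    ]
    per_index = average_ratings(ratings)
    print(per_index)
    # Hypothetical weights; the paper does not list the actual ones.
    print(weighted_score(per_index, [1, 2, 2, 1, 1, 2, 3]))
    # One index column across three teachers, rescaled so indexes are comparable.
    print(min_max_normalize([60, 80, 100]))
```

Normalizing column by column keeps an index with a wide numeric range from dominating the kernel distance, which is one common reason it also helps the search converge faster.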
Then, we give the parameter table of the GA-SVM algorithm. After setting the parameters, we can use the parallel GA-SVM algorithm to evaluate and predict teaching quality.
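To make the role of these parameters concrete, the following is a minimal serial sketch of a genetic search over 20-bit strings with a root mean square error fitness. It is an illustration under assumptions, not the authors' code: the population size, generation count, crossover and mutation rates are hypothetical defaults, tournament selection with elitism is one common choice that the paper does not confirm, and the fitness is treated as a value to minimize. In the real algorithm the fitness would train an SVM with the decoded parameters and return its prediction error; here any function of the bit string can be passed in.

```python
import math
import random


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def tournament(pop, fitness, rng):
    """Binary tournament selection: return the fitter of two random individuals."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) <= fitness(b) else b


def run_ga(fitness, n_bits=20, pop_size=20, generations=30,
           p_cross=0.8, p_mut=0.02, seed=0):
    """Minimise `fitness` over bit strings; returns (best_string, best_value)."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice("01") for _ in range(n_bits)) for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for _ in range(generations):
        children = [best]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            if rng.random() < p_cross:  # single-point crossover
                cut = rng.randrange(1, n_bits)
                p1 = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied independently to each position.
            child = "".join(
                ("1" if bit == "0" else "0") if rng.random() < p_mut else bit
                for bit in p1
            )
            children.append(child)
        pop = children
        best = min(pop, key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    # Stand-in fitness (hypothetical): RMSE between the bits and an all-ones target.
    def toy_fitness(bits):
        return rmse([int(b) for b in bits], [1] * len(bits))

    print(run_ga(toy_fitness))
```

Keeping the best individual in every generation means the best fitness never gets worse from one generation to the next, which makes convergence easy to monitor.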
To verify the performance of the high-performance computing system in dealing with big data, this paper uses a Hadoop cluster made up of 10 IBM servers, each a quad-core machine. One of them serves as the NameNode, JobTracker and a DataNode; the rest serve as DataNodes and TaskTrackers. We then take different data sets as the sample data sets of the parallel GA-SVM algorithm, and compare the data processing capability of the serial GA-SVM and the parallel GA-SVM. The experimental results are shown in Figure 4.
As can be seen from Figure 4, for the same number of nodes the total task processing times of the two GA-SVM algorithms differ: the total task processing time of the parallel GA-SVM algorithm is less than that of the traditional GA-SVM algorithm. In addition, as the number of nodes gradually increases, the gap in total task processing time between the two algorithms gradually widens. Although both algorithms can handle more data as the number of nodes increases, the parallel GA-SVM algorithm performs better.
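For readers unfamiliar with the data flow behind the parallel algorithm, the map, shuffle and reduce steps described earlier can be followed in a small single-process simulation. This is only a sketch under assumptions: it does not use Hadoop, all function names are hypothetical, and `select_support_vectors` is a placeholder that is not a real SVM trainer. The reduce step re-applying the placeholder to the merged candidates is also an assumption, since the reduce output is not preserved in the text.

```python
from collections import defaultdict


def select_support_vectors(samples):
    """Placeholder for SVM training on one subset (NOT a real SVM).

    Keeps only the smallest and largest sample so the data flow is easy to trace.
    """
    ordered = sorted(samples)
    return (ordered[:1] + ordered[-1:]) if len(ordered) > 1 else ordered


def map_phase(k1, v1):
    """(k1, v1) -> (k2, v2): v2 holds the candidate SVs of one subset."""
    return k1, select_support_vectors(v1)


def shuffle(pairs):
    """Group intermediate values by key into (k2, list(v2))."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(k2, list_v2):
    """Merge the candidate SVs of one class and re-select from them (assumed)."""
    merged = [sv for svs in list_v2 for sv in svs]
    return k2, select_support_vectors(merged)


def run_job(subsets):
    """Run map, shuffle and reduce in one process over (key, samples) subsets."""
    intermediate = [map_phase(k, v) for k, v in subsets]
    return dict(reduce_phase(k, vs) for k, vs in shuffle(intermediate).items())


if __name__ == "__main__":
    # Hypothetical subsets: the key is the class, the value is a list of feature tuples.
    subsets = [
        ("good", [(0.9, 0.8), (0.7, 0.6), (0.8, 0.9)]),
        ("good", [(0.95, 0.7), (0.6, 0.9)]),
        ("poor", [(0.2, 0.3)]),
    ]
    print(run_job(subsets))
```

On a real cluster each `map_phase` call would run on a different node, so the expensive training step scales with the number of nodes while the reduce step only sees the much smaller set of candidate support vectors.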
Having verified the performance of this algorithm, we use four different methods to predict teaching quality on 1000 samples. The experimental results are shown in Figure 5.
As Figure 5 shows, the proposed algorithm predicts teaching quality well, and its accuracy is much higher than that of the other three methods.
In order to deal effectively with the challenges of big data processing in the field of education, this paper proposes a GA-SVM teaching quality assessment algorithm based on MapReduce. First, through the design of a map function and a reduce function, the paper parallelizes the GA-SVM algorithm and the selection of its main parameters. Second, it uses a genetic algorithm to optimize the penalty coefficient $C$ and the kernel parameter $\sigma^2$ of the SVM, which makes the SVM parameters easier to determine. In addition, we improve the sensitivity of the search through a logarithmic transformation and speed up the convergence of the GA model. Finally, we compare the parallel algorithm with the serial algorithm on the Hadoop platform. The experimental results show that the MapReduce-based GA-SVM is suitable for teaching quality assessment in a big data environment.