The application of Coarse-Grained Parallel Genetic Algorithm with Hadoop in University Intelligent Course-Timetabling System

The university course-timetabling problem is NP-complete (NP-C). Many traditional methods for solving it perform poorly in several respects, such as a high conflict rate for teacher or classroom resources and a low degree of satisfaction among students and teachers. These methods therefore do not meet the requirements of modern university educational administration. The parallel genetic algorithm (PGA), which combines the advantages of the traditional genetic algorithm (GA) with the computing power of parallel computing, can effectively improve both the quality and the speed of problem solving. Based on the Hadoop cloud computing platform, this paper presents an improved method that fuses the coarse-grained parallel genetic algorithm (CGPGA) with the Map/Reduce programming model to solve the university course-timetabling problem. The simulation results show that, compared with the traditional genetic algorithm, CGPGA not only improves the success rate but also reduces the conflict rate. At the same time, the method makes full use of the high parallelism of Map/Reduce to improve the efficiency of the algorithm. The corresponding computational results are reported.


I. INTRODUCTION
With rapid economic and cultural development, university enrollment continues to expand, and the range of majors and courses is growing in both depth and breadth. At the same time, universities face a scarcity of educational resources, especially classrooms. Course timetabling has therefore become one of the most difficult and time-consuming tasks in university academic management [1]. The course-timetabling problem is a dynamic combinatorial optimization problem involving many factors, such as teachers, classes, courses, classrooms and time periods, and it has been proved to be NP-complete (NP-C) [2]. Many traditional methods for solving it perform poorly in several respects, such as a high conflict rate for teacher or classroom resources and a low degree of satisfaction among students and teachers. These methods therefore do not meet the requirements of modern university educational administration. With the development of campus networks and office automation, university educational administrators have realized that building a computer-based intelligent course-timetabling system is imperative [3][4][5][6].
Scholars in China and abroad have carried out a large body of research on the university intelligent course-timetabling problem. Many intelligent optimization algorithms have been applied to it successfully because of their robustness and generality. Li Hongchan, Hu Gang and Zhu Haodong analyzed the university timetabling problem (UTP) in detail, established a mathematical optimization model, and designed a variety of improved schemes to solve the UTP more effectively [7]. Tang Yong, Tang Xuefei and Wang Ling used a genetic algorithm to optimize an initial course timetable and implemented it in MATLAB. Their tests verified that the algorithm played a significant role in timetable optimization [8]. Yu Chengmin, Jiang Hua, Li Huan and Jia Baoxian presented an improved discrete particle swarm optimization (PSO) algorithm for the university course-timetabling problem and verified the feasibility of the method [9]. According to the requirements and characteristics of the course-timetabling problem, Zhang Lin presented an algorithm based on bipartite graph theory and then combined three different algorithms into an improved ant colony algorithm, which markedly improves both the solution space explored and the quality of the solution [10]. David Abramson, Mohan Krishnamoorthy and Henry Dang described the use of simulated annealing (SA) for solving the school timetabling problem and compared the performance of six different SA cooling schedules [11].
Ding Zhenguo and Zhao Hongwei introduced a method for the course-timetabling problem that combines the classical network flow algorithm with the modern heuristic tabu search algorithm. The method fuses the strengths of the two algorithms and improves problem-solving ability [12]. Wang Lu and Qiu Yuhui proposed a framework for an intelligent class-scheduling system built on the theories and technologies of multi-agent systems. Their study takes the expectations of the majority of teachers into account and addresses teacher satisfaction by means of the negotiation techniques of multi-agent systems [13]. Nie Xiaodong used a greedy algorithm to support the operation of a course-timetabling system [14]. The study developed a system based on the greedy algorithm, a resource matching method and the best-fit method of dynamic memory allocation. The response time and the results of the system obtained in this way were satisfactory. In summary, all kinds of intelligent optimization algorithms have been applied to the university intelligent course-timetabling problem, and the most widely used is the genetic algorithm (GA).
As the problem scale grows and the search space becomes more complex, the traditional GA is prone to premature convergence, entrapment in local optima and low efficiency. In this paper, based on the Hadoop cloud computing platform, an improved method that fuses the coarse-grained parallel genetic algorithm (CGPGA) with the Map/Reduce programming model is presented to solve the university course-timetabling problem.
The rest of the paper is organized as follows. Section II designs the coarse-grained parallel genetic algorithm based on Hadoop. Section III builds the mathematical model of the intelligent course-timetabling system at universities. Section IV presents an experiment and analyzes its results by comparing the improved genetic algorithm with the traditional genetic algorithm. The last two sections are the conclusion and the references.

II. DESIGN OF COARSE-GRAINED PARALLEL GENETIC ALGORITHM WITH HADOOP
Hadoop is an open-source project developed by the Apache Software Foundation. With it, users can build a large-scale cluster on commodity hardware.
Hadoop is a distributed system infrastructure whose core components are HDFS (Hadoop Distributed File System) and Map/Reduce. The coarse-grained model, also known as the distributed or island-based model, is one of the most adaptable and widely used parallel models of the genetic algorithm [15]. It improves results by increasing population diversity rather than by speeding up the fitness calculation.
The idea behind CGPGA on Hadoop is to turn the serial genetic algorithm into a sequence of Map/Reduce jobs. The Map stage computes each individual's fitness value and performs the crossover and mutation operations. The Reduce stage checks whether the convergence condition is met: if it is, the results are output; otherwise the next Map/Reduce job begins. Unlike a general Map/Reduce job, CGPGA relies on a Partition stage after the Map stage. The optimal individual of each subpopulation is passed to another subpopulation in a ring, while all other individuals remain in their own subpopulation.
To ensure that every subpopulation evolves on its own, the number of nodes in both the Map stage and the Reduce stage is set to n, and the data from each Map node are processed by the corresponding Reduce node. Each individual is given a subpopulation key. In the Map stage, the key of the optimal individual is set to (key+1) mod n, and the Partition stage routes each individual by key mod n, which achieves the circular migration of the optimal individual. The process is shown in Figure 1.
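The key scheme described above can be simulated in plain Python without a Hadoop cluster. The following is a minimal sketch, assuming that individuals are (genome, fitness) pairs and that larger fitness is better. The function names `map_stage`, `partition` and `run_round` are hypothetical stand-ins for the Map, Partition and Reduce stages and are not Hadoop API calls; crossover, mutation and the convergence test are omitted.

```python
from collections import defaultdict

def map_stage(key, individuals):
    """Emit (key, individual) pairs for one subpopulation.

    The best individual is emitted with key + 1 so that it migrates to the
    next subpopulation; every other individual keeps its own key.
    """
    best_idx = max(range(len(individuals)), key=lambda i: individuals[i][1])
    return [(key + 1 if i == best_idx else key, ind)
            for i, ind in enumerate(individuals)]

def partition(key, n):
    """Route a key to one of n reducers; key n wraps around to reducer 0."""
    return key % n

def run_round(subpops):
    """Run one Map -> Partition -> Reduce round over all subpopulations."""
    n = len(subpops)
    reducers = defaultdict(list)
    for key, individuals in enumerate(subpops):
        for out_key, ind in map_stage(key, individuals):
            reducers[partition(out_key, n)].append(ind)
    # Each reducer would test convergence here; this sketch only regroups.
    return [reducers[k] for k in range(n)]

if __name__ == "__main__":
    subpops = [
        [("a", 1), ("b", 5)],
        [("c", 2), ("d", 3)],
        [("e", 9), ("f", 4)],
    ]
    for k, pop in enumerate(run_round(subpops)):
        print(k, pop)
```

In this sketch the best individual of subpopulation 2 receives key 3, which the partition step maps back to reducer 0, closing the ring.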

A. Model description
The course-timetabling problem at universities mainly involves five factors: teachers, classes, courses, classrooms and time periods. Let $N_p$, $N_c$, $N_l$, $N_r$ and $N_t$ denote the total numbers of teachers, classes, courses, classrooms and time periods, respectively. The model is encoded as follows:
• The set of all teachers is $P = \{P_1, P_2, \ldots, P_{N_p}\}$, and the number of courses corresponding to each teacher is $K = \{K_1, K_2, \ldots, K_{N_p}\}$.
• The set of all classes has $N_c$ elements, and the number of students corresponding to each class is $G = \{G_1, G_2, \ldots, G_{N_c}\}$.
• The set of all courses is $L = \{L_1, L_2, \ldots, L_{N_l}\}$, and the number of classes corresponding to each course is $X = \{X_1, X_2, \ldots, X_{N_l}\}$.
• The set of all classrooms is $R = \{R_1, R_2, \ldots, R_{N_r}\}$, and the number of seats corresponding to each classroom is $Y = \{Y_1, Y_2, \ldots, Y_{N_r}\}$.
• The set of all available time periods is $T = \{T_1, T_2, \ldots, T_{N_t}\}$.
The Cartesian product of time periods and classrooms is $M = T \times R = \{(T_1, R_1), (T_2, R_2), \ldots, (T_{N_t}, R_{N_r})\}$, so the course-timetabling problem is transformed into finding a suitable "time-classroom" pair for each class. There are in total 2 Nr mapping relationships, which are as follows:

B. Objective function
The course-timetabling problem can be regarded as a resource allocation problem whose main task is to ensure that classrooms, teachers, classes and courses do not conflict within a week [16]. The resource allocation effect of university intelligent course timetabling is reflected in the following aspects:
• Teaching effect $f_1$: important courses should be arranged in good teaching units.
• Meeting teachers' needs $f_2$: teachers' requirements regarding time periods, classrooms and so on should be met.

• Course scheduling optimization for multiple class hours $f_3$: for a course with more than one session per week, the sessions should be spaced more than one day apart wherever possible, to ensure the teaching effect.
• Classroom resource utilization $f_4$: the number of seats in the classroom should match the number of students in the class as closely as possible.
In this paper, the goal of the university intelligent course-timetabling problem is to maximize the weighted sum of the resource allocation effects under all constraints. The assumptions are as follows:
• The teaching units of the university are divided into 5.
• The teacher title coefficient, indexed by i = 1, 2, 3, 4, takes the values 1, 2, 3 and 4, corresponding to the titles assistant, lecturer, associate professor and professor. The teacher willingness coefficient, indexed by j = 1, 2, 3, describes the teacher's willingness regarding each teaching unit, classroom, etc.; its values 0, 1 and 2 correspond to unwilling, indifferent and willing. The optimization objective for meeting teachers' needs, $f_2$, is built from these two coefficients.
• The teaching effect coefficient of a course whose sessions are i days apart (i = 1, 2, 3, 4) takes the values 1, 3, 4 and 2, corresponding to the effects poor, good, very good and average. The optimization objective for course arrangement, $f_3$, is built from this coefficient.
• Given the number of students taking a course and the number of seats in its classroom, the optimization objective $f_4$ is the classroom resource utilization rate. For each course, the larger the utilization rate the better, and its maximum value is 1.
To sum up, the objective function is the weighted sum of the four objectives, $\max f = \sum_{i=1}^{4} \omega_i f_i$, where $\omega_i$ is the weight of each objective; the weights are set to 3, 1, 2 and 4.
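The weighted-sum objective with the weights 3, 1, 2 and 4 can be illustrated with a short sketch. The function names `objective` and `classroom_utilization` are hypothetical, and the utilization rate is assumed here to be the ratio of students to seats, which is consistent with a maximum value of 1 but is not confirmed by the text. The per-objective values are taken as given inputs.

```python
# Weights of f1..f4 as stated in the text.
WEIGHTS = (3, 1, 2, 4)

def classroom_utilization(students, seats):
    """Assumed form of the utilization rate: students / seats, at most 1.

    Returns 0.0 when the class does not fit, since that assignment
    violates the capacity constraint.
    """
    if seats <= 0 or students > seats:
        return 0.0
    return students / seats

def objective(f_values, weights=WEIGHTS):
    """Weighted sum of the objective values f1..f4."""
    if len(f_values) != len(weights):
        raise ValueError("expected one value per weight")
    return sum(w * f for w, f in zip(weights, f_values))

if __name__ == "__main__":
    f4 = classroom_utilization(students=30, seats=60)
    print(objective((2, 0, 1, f4)))
```

Because $f_4$ carries the largest weight, a timetable that fills classrooms well gains more than one that only satisfies teacher preferences, all else being equal.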

C. Constraint condition
The main constraints of the university intelligent course-timetabling problem are as follows:
• Each class is assigned only one course in one time period.
• Each teacher can teach only one course in one time period.
• Each classroom can be assigned to only one class in one time period.
• The number of seats in the allocated classroom should be greater than or equal to the number of students in the class.
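The four constraints above can be checked mechanically on a candidate timetable. The following is a minimal sketch, assuming a timetable is a list of lesson records. The `Lesson` type, its fields and the function `count_conflicts` are hypothetical names introduced for illustration only.

```python
from collections import Counter
from typing import NamedTuple

class Lesson(NamedTuple):
    class_id: str
    teacher_id: str
    room_id: str
    period: int
    students: int

def _extra(counter):
    """Number of assignments beyond the first for each (resource, period)."""
    return sum(c - 1 for c in counter.values())

def count_conflicts(lessons, seats):
    """Count violations of each hard constraint.

    lessons: iterable of Lesson records.
    seats:   dict mapping room_id to the number of seats.
    """
    lessons = list(lessons)
    return {
        # A class attends at most one course per period.
        "class": _extra(Counter((x.class_id, x.period) for x in lessons)),
        # A teacher teaches at most one course per period.
        "teacher": _extra(Counter((x.teacher_id, x.period) for x in lessons)),
        # A classroom hosts at most one class per period.
        "room": _extra(Counter((x.room_id, x.period) for x in lessons)),
        # The classroom must have at least as many seats as students.
        "capacity": sum(1 for x in lessons if seats[x.room_id] < x.students),
    }

if __name__ == "__main__":
    timetable = [
        Lesson("c1", "t1", "r1", 0, 30),
        Lesson("c2", "t1", "r2", 0, 50),
        Lesson("c3", "t2", "r1", 0, 20),
    ]
    print(count_conflicts(timetable, {"r1": 40, "r2": 45}))
```

A timetable is feasible when every count is zero; the sum of the counts could also serve as a penalty term, though the text does not state how conflicts enter the fitness.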

A. Algorithm scheme of university intelligent course-timetabling
Based on the theory described above, the coarse-grained genetic algorithm for university intelligent course timetabling proceeds as follows:
• Use binary encoding for the "time-classroom" mapping.
• Generate the initial population uniformly, so that the initial solutions are uniformly distributed over the solution space.
• The fitness function is the objective function.
• Selection operation: the roulette wheel method is used. The higher an individual's fitness value, the larger its share of the roulette wheel and its probability of survival, that is, the greater its chance of entering the next generation.
• Adaptive crossover and mutation operation: the crossover probability $p_c$ and the mutation probability $p_m$ are calculated by formula (12). In this formula, $k_1$ and $k_2$ are defined as the same random number in (0, 1), max
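The roulette wheel selection step in the list above can be sketched as follows. This is a generic illustration of fitness-proportionate selection, assuming non-negative fitness values, and not the authors' C# implementation; the function name `roulette_select` is hypothetical.

```python
import random

def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness.

    Assumes all fitness values are non-negative and at least one is positive.
    rng may be the random module or a random.Random instance.
    """
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("fitness values must sum to a positive number")
    pick = rng.random() * total          # a point on the wheel, in [0, total)
    cumulative = 0.0
    for individual, fit in zip(population, fitnesses):
        cumulative += fit                # right edge of this individual's slice
        if pick < cumulative:
            return individual
    return population[-1]                # guard against floating-point round-off

if __name__ == "__main__":
    rng = random.Random(0)
    draws = [roulette_select(["a", "b"], [1.0, 3.0], rng) for _ in range(4000)]
    print(draws.count("a"), draws.count("b"))
```

With fitness values 1 and 3, individual "b" occupies three quarters of the wheel and is therefore drawn about three times as often as "a".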

B. Simulation experiment
The experimental data come from a university course table, the experiment is implemented in C#, and the distribution of the various elements is shown in Table I.
The experimental parameters of the coarse-grained parallel genetic algorithm based on Hadoop are shown in Table II.
We run the improved algorithm and the traditional genetic algorithm side by side and compare the results.
The experiment is run 10 times. In each run, the best fitness value of the population is recorded once every 100 generations, and the reported fitness value at each recording point is the average of the best fitness values over the 10 runs.
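The averaging protocol described above can be written down in a few lines. This is a minimal sketch, assuming each run is stored as a list holding the best fitness of every generation; the function name `average_best_fitness` and this data layout are assumptions made for illustration.

```python
def average_best_fitness(runs, step=100):
    """Average the best fitness across runs at every `step`-th generation.

    runs: list of runs, each a list with the best fitness per generation
          (index 0 holds generation 1).
    Returns one averaged value per recording point (generation 100, 200, ...).
    """
    if not runs:
        raise ValueError("at least one run is required")
    n_gen = min(len(r) for r in runs)
    checkpoints = range(step - 1, n_gen, step)
    return [sum(r[g] for r in runs) / len(runs) for g in checkpoints]

if __name__ == "__main__":
    # Two synthetic runs of 300 generations each, for illustration only.
    runs = [list(range(300)), [2 * i for i in range(300)]]
    print(average_best_fitness(runs))
```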
As Figure 2 shows, the ten average values obtained by the method in this paper are all better than those of the traditional genetic algorithm. This indicates that the method achieves a better result on the university intelligent course-timetabling problem.
The results of the two algorithms are shown in Table III. As Table III shows, with the proposed method the overall conflict rate is reduced, the success rate is 100%, and the time needed to find the optimal scheduling scheme is greatly reduced. In short, the experimental results are satisfactory, and the method is a better choice for intelligent course-timetabling systems.
Based on an analysis of the university course-timetabling problem, a mathematical model is constructed and the framework of the problem is given. In this paper, the coarse-grained parallel model based on Hadoop is combined with the advantages of the genetic algorithm to solve the university intelligent course-timetabling problem. The method reduces the workload of educational administration staff and benefits teaching management in universities, so it merits wider adoption.
Although this method has advantages in computational results and operating efficiency, it introduces more parameters, which increases the uncertainty of the algorithm. For example, how to determine an appropriate migration mode is one of the problems to be addressed in future research.