Genetic Algorithm for Solving Multi-Objective Optimization in Examination Timetabling Problem

Examination timetabling is one of three critical timetabling jobs, alongside enrollment timetabling and teaching assignment. Scheduling examinations at the end of a semester is rarely an easy job in education management, especially when the data set is large. The timetabling problem is an NP-hard optimization problem. In this study, we build a multi-objective optimizer to create exam schedules for more than 2500 students. Our model aims to optimize material costs while ensuring the dignity of the exam and the students' convenience, taking into account the design of the rooms and the time requirement of each exam, which involve rules and policy constraints. We propose a compromise programming approach to the multi-objective optimization model and solve it using the Genetic Algorithm. The results show the effectiveness of the introduced algorithm.


Introduction
Timetabling problems arise in various forms, including educational timetabling, sports timetabling, and transportation timetabling. In training institutions, timetabling is a difficult process faced every semester. It essentially consists of arranging timeslots for resources such as students, classes, and lectures. Timetabling problems are classified into the following groups:
• University Course Timetabling: Schedule courses into timeslots and assign students, times, and rooms to each course. Babaei et al. conducted a survey on the problem in 2015 [1]. Ngo et al. proposed a multi-objective optimization for solving the teaching task assignment [2]. Ibrahim et al. introduced an approach for the course timetabling problem based on instructors' preferences [3].
• Examination Timetabling: Examination timetabling is one of the most important administrative activities in academic institutions. It aims to allocate students to exams with limited resources. Qu et al. surveyed the trends and development of techniques for the problem [4].
This research focuses on examination timetabling. Non-trivial timetabling problems are normally NP-hard [5]. For this kind of problem, it is not always possible to reach a global optimal solution. As the size of the resources increases, the complexity of the problem also increases, which makes a systematic approach unfavorable. Across the literature, some constraints recur in different studies. The most common constraints for university exam schedules are:
• Hard constraints: The examination timetabling problem assigns exams to examination periods and rooms so that the following constraints are respected.
1. Only one exam can be placed in a room at any period.
2. A room cannot be used during periods in which it is not available.
3. An exam must be placed in a room (or a set of rooms) whose overall seating capacity is at least the number of students attending the exam, with respect to the requested seating type (e.g., each room has normal and examination seating capacities defined, and an exam can request either normal or examination seating). The maximal number of rooms into which an exam can be split must not be exceeded either.
4. An exam cannot be placed in a period that is marked as prohibited for the exam, nor in a period that is not required when some other period is required by the exam. Similarly, an exam cannot be placed in a room that is prohibited for the exam (by the room requirements set on the exam).
5. Required distribution constraints must be satisfied.
• Soft constraints / Objectives: Besides searching for a solution that satisfies all the hard constraints mentioned above, the following criteria are optimized.
1. Student preferences: for example, exams that are scheduled back to back or too far apart may affect students.
2. Resource penalty: uneven exploitation of resources leaves some resources overloaded while others are wasted.
The above constraints are not necessarily present in all situations. Different problem variations may require different settings. In this study, we built an optimizer capable of optimizing three goals while maintaining the hard constraints: minimizing the difference in the number of students across the exam rooms of a subject, minimizing the number of exam rooms opened, and minimizing the students' break time between exams.
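As an illustration, hard constraints 1 and 3 listed above can be checked with a few lines of Python. This is a minimal sketch, not the implementation used in this study: the tuple layout `(exam_id, period, room, n_students)`, the function name, and the restriction to one room per exam are assumptions made for the example.

```python
from collections import Counter


def violates_hard_constraints(assignments, room_capacity):
    """Return True if a schedule breaks hard constraint 1 or 3.

    assignments: list of (exam_id, period, room, n_students) tuples
                 (hypothetical layout, one room per exam).
    room_capacity: dict mapping room name to seating capacity.
    """
    # Hard constraint 1: at most one exam per (period, room) pair.
    usage = Counter((period, room) for _, period, room, _ in assignments)
    if any(count > 1 for count in usage.values()):
        return True
    # Hard constraint 3 (single-room case): capacity must cover the students.
    return any(n > room_capacity[room] for _, _, room, n in assignments)


schedule = [("E1", 0, "R1", 30), ("E2", 0, "R2", 40)]
capacity = {"R1": 40, "R2": 40}
print(violates_hard_constraints(schedule, capacity))
```

A feasible schedule makes the function return `False`; a greedy initializer or a repair step could call such a check before accepting an individual.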

Related research
Many researchers have studied examination timetabling. Our literature review focuses on three common aspects: the form of the optimization model for examination timetabling, the constraints and objective functions used, and the method for solving the optimization problem. Researchers commonly use an Integer Programming model to describe the timetabling problem. For instance, Liyana and Aizam designed an integer linear program based on the preferences and demands obtained through a survey of three universities on Malaysia's coast: UMT, UMP, and UMK [6]. Acostamado et al. proposed a mathematical model for the academic timetabling problem and studied the influence of the system's parameters on the model's size [7]. A non-linear Integer Programming model can also be found in [8].
Different test organizations have different requirements and constraints, which influence the model of the problem that researchers have to consider. Dener defined specific hard constraints related to the maximum number of exams a student can take per day and the time between exams, and soft constraints related to the number of students in the same session and the cost of resources [9]. Özcan and Ersoy summarized several common constraints of the exam timetabling problem, including exclusions, presets, edge constraints, ordering constraints, event-spread constraints, and attribute constraints [10]. McCollum et al. applied the penalty method, in which each violated soft constraint reduces the fitness [8]. There are other exam scheduling studies, but most of their constraints are shared [11], [12]. Other factors can also influence the problem's constraints, such as the maximum number of exams a supervisor can take and the number of parts of the day in which a student attends exams [13].
Because of the disparity in constraints, there is no all-inclusive formulation or solution for exam scheduling. This diversity has led to different approaches to the problem. For instance, Yan and Yu [14] hybridized Simulated Annealing with multiple neighborhoods to improve the speed of the Simulated Annealing algorithm. Özcan and Ersoy [10] used a Memetic Algorithm to solve the examination timetabling problem. The Memetic Algorithm is a combination of Genetic Algorithms and hill climbing. The algorithm uses Genetic Algorithms to create the individuals and then calculates each individual's fitness using hill climbing. Although hill climbing may produce more violations of hard constraints, the overall fitness of an individual improves in return. Ayob and Jaradat [15] used Ant Colony Optimization combined with Simulated Annealing and Tabu Search as local search routines. Mandal and Kahar [16] proposed a model using a great deluge algorithm. Dener solved exam scheduling problems in central exams using two phases of Genetic Algorithms [9]: one for assigning courses to sessions and one for assigning students to courses. Shatnawi et al. designed a hybrid of greedy and genetic algorithms [11], in which the greedy algorithm generates the initial population for the genetic algorithm. Yang et al. also introduced an improved Genetic Algorithm for designing their timetable [17]. Bashar et al. proposed a population-based metaheuristic, the Intelligent Water Drops Algorithm [18]. However, the solutions of this algorithm were not on par with those of other algorithms. Aside from evolutionary algorithms and swarm algorithms, Rakesh implemented a graph edge coloring algorithm to search for the timetable [19]. The algorithm is split into two parts: the first uses a bipartite graph to obtain a daily schedule as a matrix, and the second phase assigns lectures to the timeslots.
Wang designed a genetic algorithm to solve the enrollment timetabling problem [20]. The author used the combination of time, teacher, and course number as the gene coding; the weekly course schedule of each class was a chromosome, and the course schedule of the entire school was the initial population. The fitness was designed according to each class's priority, curriculum dispersion, and teacher satisfaction.
Researchers favor metaheuristic algorithms because of the problem's complexity. In this study, we chose the Genetic Algorithm, a metaheuristic algorithm. Since we want to focus on optimizing the soft constraints of the problem, we combine it with a greedy algorithm that creates a feasible first population. As for our study's objectives, we found that objectives related to the idle time between exams are common in other research. However, requirements differ from one university to another. Therefore, we chose to implement the idle-time objective as a flexible function whose content can easily be changed to satisfy different requirements. Because our study also considers the fairness of the exams, we decided to minimize the difference in the number of students between exams of the same subject. This objective balances the burden on each supervisor.

Contributions of this research
In this study, we present an approach to constructing a timetable for the multi-objective scheduling problem. Our model aims to maximize the benefit of the organization and the dignity of the exam. We use a combination of linear scalarizing and compromise programming for the proposed multi-objective problem. The model is described in the second section, together with the approach to the multi-objective optimization problem. Section 3 of the paper describes a scheme of the Genetic Algorithm for solving the proposed model. We test the proposed model and algorithm by scheduling FPT University students in the spring semester of 2020. The test results are shown in section 4 of the paper. The remaining sections present the discussion and conclusions.

MOP for examination timetabling
The following notation is used:
• S is the number of subjects.
• M is the number of students.
• R is the number of classrooms.
• ∂ denotes the minimum number of students per class.
• T denotes the number of time slots.
• is the number of conditions of the subjects.
A further variable denotes the slot in which the examination of each subject begins.
There are three objectives:
• The numbers of students in the examination rooms of the same subject should not differ too much.
• The break time between the exams of each student is minimized. The function used to calculate the break time between a student's exams is designed depending on how each training institution defines its time units.
• The number of examination rooms opened is minimized.
The constraints are:
• The number of students in the same examination must be restricted.
• Students are not assigned to the examination room of a subject they did not register for.
• A course is only assigned to a room that satisfies the required conditions.

Compromise programming for proposed MOP
The proposed model optimizes multiple objectives, which makes it a multi-objective programming (MOP) problem. According to Hwang [21], there are two frequently used approaches to solving a MOP: preference and non-preference. The preference method takes the solution that best fits the preferences of the decision-maker. The second approach uses a neutral compromise solution across all objectives instead of relying on the decision-maker. Ngo et al. [2] [22] applied compromise programming to their timetabling and team selection problems. In this study, we want to achieve a solution that satisfies all the objectives equally. Therefore, we chose the second method, compromise programming, to transform the MOP into a single-objective problem. The ideal point and the solution are each described by three components, one per objective, and a fuzzy function calculates the score of the parameters on a given scale. The objective function becomes a distance between the solution and the ideal point, in which a normalization function rearranges the values of each dimension to a particular range and a weight parameter is attached to each dimension in the fitness function.
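The transformation into a single objective can be illustrated with a short Python sketch. It is written under stated assumptions rather than taken from this study: the distance is assumed to be a weighted sum of absolute deviations, the normalization is assumed to be a min-max rescaling to [0, 1], and the function and parameter names are hypothetical.

```python
def compromise_fitness(objectives, ideal, weights, lower, upper):
    """Weighted, normalized distance between a solution and the ideal point.

    objectives: objective values of the solution, one per dimension.
    ideal:      ideal value for each dimension.
    weights:    weight parameter for each dimension.
    lower/upper: assumed bounds used to rescale each dimension to [0, 1].
    """
    total = 0.0
    for f, z, w, lo, hi in zip(objectives, ideal, weights, lower, upper):
        span = hi - lo
        # Min-max normalization of the deviation (assumption of this sketch).
        deviation = abs(f - z) / span if span > 0 else 0.0
        total += w * deviation
    return total


# Three objectives on different scales, equal weights.
print(compromise_fitness((4, 30, 12), (0, 10, 10), (1, 1, 1),
                         (0, 0, 10), (10, 100, 20)))
```

A smaller value means the solution is closer to the ideal point, so a genetic algorithm using this function as fitness would minimize it. Changing the weights shifts the compromise toward one objective.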

Genetic Algorithm
The genetic algorithm is a population-based metaheuristic and a branch of evolutionary algorithms. This algorithm and its hybrids are commonly used to find solutions to timetabling problems. The idea of the algorithm is to find the individuals that best fit the problem's goal by recombining the best genes of the parents. We were inspired by the hybrid genetic algorithm with greedy initialization proposed in [23]. To generate the initial population, we use a greedy algorithm to create a feasible population, in which no individual violates the model's constraints. We then follow the steps of the genetic algorithm until the solution converges.
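The idea of a greedy, feasibility-preserving initialization can be sketched as follows. This is a simplified illustration, not the procedure of this study or of [23]: it assumes each subject fits into a single room, places the largest subjects first, and uses hypothetical names throughout.

```python
def greedy_initial_schedule(subject_sizes, room_capacity, n_slots):
    """Assign each subject to the first free (slot, room) that can seat it.

    subject_sizes: dict subject -> number of students (one room per subject
                   is an assumption of this sketch).
    room_capacity: dict room -> seating capacity.
    n_slots:       number of available time slots.
    """
    rooms_by_size = sorted(room_capacity.items(), key=lambda kv: kv[1])
    used = set()
    schedule = {}
    # Place the largest subjects first so that big rooms are not wasted.
    for subject, size in sorted(subject_sizes.items(), key=lambda kv: -kv[1]):
        choice = next(
            ((slot, room)
             for slot in range(n_slots)
             for room, cap in rooms_by_size
             if cap >= size and (slot, room) not in used),
            None,
        )
        if choice is None:
            raise ValueError(f"no feasible room/slot for {subject}")
        used.add(choice)
        schedule[subject] = choice
    return schedule


print(greedy_initial_schedule({"A": 30, "B": 40, "C": 20},
                              {"R1": 40, "R2": 30}, n_slots=2))
```

Because every placement respects room capacity and never reuses a (slot, room) pair, each generated schedule is feasible with respect to those two constraints. Varying the subject order would yield different individuals for the initial population.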

Genetic representation and fitness function design
The chromosome is represented as follows:
• The chromosome is a vector of S dimensions, in which each dimension represents one subject.
• For each subject, the dimension contains the time at which the exam of the subject is held and a list of the examination rooms that host the exam of the subject.
• For each examination room, the entry contains the room in which the examination is held and the list of students in the room.
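The nested structure described above maps naturally onto a small set of data classes. The sketch below is one possible encoding; the class and field names are hypothetical and are not taken from the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RoomAssignment:
    """One examination room of a subject and the students seated in it."""
    room_id: str
    students: List[str] = field(default_factory=list)


@dataclass
class SubjectGene:
    """One dimension of the chromosome: the exam of a single subject."""
    subject_id: str
    start_slot: int
    rooms: List[RoomAssignment] = field(default_factory=list)

    def student_count(self) -> int:
        # Total number of students over all rooms of this subject.
        return sum(len(room.students) for room in self.rooms)


# A chromosome is a list with one gene per subject (S entries).
Chromosome = List[SubjectGene]

gene = SubjectGene("MLN101", start_slot=3,
                   rooms=[RoomAssignment("R1", ["s1", "s2"]),
                          RoomAssignment("R2", ["s3"])])
print(gene.student_count())
```

Keeping the start slot, the room list, and the student lists together in one gene makes it straightforward to swap a whole subject between two parents during crossover.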

Genetic operations
Denote:
• U, the size of the population.
• The population at a given generation, a set of U individuals indexed by i = 1..U.
• The i-th individual of the population at that generation.
• The number of students who need to take each subject.
• The rate of elites in the population.
• φ, the selection rate.
• The exchange rate of genes between two individuals of the same generation.
• Ω, the mutation rate.
The algorithm contains 6 steps. After the initial population has been generated, each generation is processed as follows. For each selected pair of parents, P_father and P_mother, one of three cases applies, depending on a value w:
3.3.1. Exchange genes between P_father and P_mother to create two new individuals for the next generation. Exchanging a gene means swapping, for one subject, the begin slot, the course list, and the schedule that maps each student to a course of this subject.
3.3.2. If w falls in the mutation range bounded above by Ω, perform mutation on P_father or P_mother (described in step 5) to create two new individuals for the next generation.
3.3.3. Otherwise, consider P_father and P_mother as two new individuals for the next generation.
4. Repair:
4.1. For each student who has a subject to be examined but has not yet been assigned to a course, find a course that examines this subject and is not full, and place the student in it.
4.2. For each room that is used by two or more exams simultaneously, move the redundant exams to another room of the same type that is not occupied by other exams and whose capacity fits the number of students better than the current room (if such a room is found).
4.3. For each exam that has fewer students than the minimum number per class, fill the course with students from other courses of the same subject.
5. Mutate: takes an individual as input.
5.1. Randomly modify the position, the courses, and the student assignment of a selected subject in the individual.
5.2. Repair the individual.
6. Repeat steps 2, 3, 4, and 5, incrementing the generation counter, until the best solution remains unchanged over several consecutive generations.
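The per-generation logic of steps 3 to 5 can be sketched in Python. The sketch rests on several assumptions that the text does not confirm: w is drawn uniformly from [0, 1), crossover is applied when w is at most the exchange rate, mutation when w lies between the exchange rate and Ω, and a lower fitness is better. The selection rate φ and the repair step are left out, and all names and default values are hypothetical.

```python
import random


def next_generation(population, fitness, crossover, mutate,
                    elite_rate=0.1, exchange_rate=0.6, omega=0.8, rng=None):
    """Build the next population from the current one (illustrative only)."""
    rng = rng or random.Random()
    ranked = sorted(population, key=fitness)        # lower fitness is better
    n_elite = max(1, int(elite_rate * len(ranked)))
    new_population = list(ranked[:n_elite])         # elites survive unchanged
    while len(new_population) < len(population):
        father, mother = rng.sample(ranked, 2)
        w = rng.random()
        if w <= exchange_rate:                      # exchange genes
            children = crossover(father, mother)
        elif w <= omega:                            # mutate one parent
            children = (mutate(father), mother)
        else:                                       # copy parents as they are
            children = (father, mother)
        new_population.extend(children)
    return new_population[:len(population)]


# Toy usage: individuals are lists of numbers, fitness is their sum.
toy = [[5, 5], [1, 9], [3, 2], [8, 8]]
swap_tail = lambda a, b: (a[:1] + b[1:], b[:1] + a[1:])
decrement = lambda a: [max(0, a[0] - 1)] + a[1:]
print(next_generation(toy, sum, swap_tail, decrement, rng=random.Random(0)))
```

In a full implementation, the repair procedure of step 4 would be applied to every child before it enters the new population, and the loop over generations would stop once the best fitness stops changing.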

Experimental design
To evaluate the proposed model and algorithm, we collected data on more than 2486 students in the IT department at FPT University, covering 74 subjects. We collected more than 9000 exams from these students and sorted the exams into 100 rooms over 5 days. Each subject has a different maximum number of students. For example, most exams allow a maximum of 40 students per class, but HCM201 and MLN101 allow 20 more students in their exams. Each day provides 8 hours of examination time, with 4 hours per shift, and 100 rooms of different types and capacities.
We use a function to calculate the break time between the exams of each student. The function returns the total number of slots between the student's exams. When two exams take place on different days, the returned value is increased by the number of slots in a shift for each day that separates them.
A sorting function returns an array E, which is array Q in descending order, and an array F, which holds the indices corresponding to the elements of E.
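The break-time calculation described above can be sketched as follows. The encoding of an exam as a `(day, slot_in_day)` pair, the use of a global slot index for the base gap, and the function name are assumptions of this example, since the exact definition depends on how each institution defines its time units.

```python
def break_time(exams, slots_per_day, slots_per_shift):
    """Total break time of one student, in slots.

    exams: list of (day, slot_in_day) pairs for the student's exams
           (hypothetical encoding).
    """
    ordered = sorted(exams)
    total = 0
    for (d1, s1), (d2, s2) in zip(ordered, ordered[1:]):
        # Base gap measured on a global slot index (assumption).
        gap = (d2 * slots_per_day + s2) - (d1 * slots_per_day + s1)
        # Extra slots for each day separating the two exams.
        gap += (d2 - d1) * slots_per_shift
        total += gap
    return total


# Two exams on day 0 and one on day 2, with 8 slots per day and 4 per shift.
print(break_time([(0, 2), (0, 5), (2, 1)], slots_per_day=8, slots_per_shift=4))  # 23
```

In the example, the first gap is 3 slots (same day) and the second is 12 slots plus 2 × 4 extra slots for the two days apart, which gives 23.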
We also use a function to check whether the slot in which a course is held is invalid (that is, falls between two shifts). It depends on the number of slots per day, the number of slots per shift, and the slot in which the exam begins. A slot is on an invalid shift when a shift boundary falls strictly inside the exam, and on an invalid day when a day boundary does.
During the experiments, we noticed that the return values of the objectives are on different scales. The objectives of balancing the number of students across the exams of a subject and of minimizing the cost of resources are easy to achieve. However, the objective of minimizing the break time between exams is harder to minimize because of the diversity in the exams taken by each student. This phenomenon may make some objectives insignificant in the total value. Therefore, we decided to use a fuzzy method [29] to balance the scales of the objectives. In the experiments, we fuzzify the value of each objective using a function defined for the respective objective.
The experiments were implemented in Java 1.8.0 and executed on a computer with an Intel Core i5-8300H CPU @ 2.30 GHz and 8 GB of memory. We evaluated the proposed algorithms on the tested dataset.
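The check for exams that fall between two shifts can be sketched in Python. This is an interpretation rather than the authors' formula: it assumes slots are numbered from 0, that an exam occupies `duration` consecutive slots, and that it is invalid when it runs past the end of the block (shift or day) in which it starts. The function and parameter names are hypothetical.

```python
def crosses_boundary(start_slot, duration, block_size):
    """True if the exam runs past the end of the block it starts in.

    block_size is the number of slots per shift (shift check) or the
    number of slots per day (day check).
    """
    # First slot of the next block after the one containing start_slot.
    block_end = (start_slot // block_size) * block_size + block_size
    return start_slot + duration > block_end


slots_per_shift, slots_per_day = 4, 8
print(crosses_boundary(2, 2, slots_per_shift))  # False: ends exactly at the shift end
print(crosses_boundary(2, 3, slots_per_shift))  # True: spills into the next shift
print(crosses_boundary(6, 3, slots_per_day))    # True: spills into the next day
```

The same function covers both cases because a day boundary and a shift boundary differ only in the block size.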

Results
The GA is governed by a series of parameters. We executed the algorithm several times on the collected dataset to identify the best set of parameters and weights, which is shown in Table 1. Meta-heuristic algorithms do not guarantee finding the optimal solution, and their results vary with the initialization values. We therefore ran the algorithm 10 times with different initializations to find the schedule for 2500 students taking exams in 74 subjects. Figure 2 shows the results of the 10 executions on the data of 2500 students. Figure 2.A shows that the fitness values of the executions do not differ significantly, which indicates the stability of the algorithm. The fitness values and the F2 values follow similar trends. Since the return value of objective 2 is double the F1 and F3 values, it also influences the trend of the fitness value. The reason for this phenomenon is that the experimental data is an enrollment-based dataset, so the students in the same exam each have to take many different other exams. Minimizing the break time between exams for all students is therefore hard to achieve. As a result, a minority of students have to sacrifice their time for the majority, which prevents F2 from decreasing beyond a certain point.
Figure 3 illustrates the ability of the algorithm to converge over generations. We stop the algorithm at generation 350. The algorithm achieves its best solution at generation 300. The convergence rate of the algorithm shows that population diversity can be preserved. Schemes that make the algorithm converge early often lose diversity, which makes it difficult for later generations to produce a good mutation.
In Figure 4, we categorize students based on the number of days on which they have exams. Most students need only 1 to 3 days to finish all of their exams. Some students' schedules still require 4 days, and an insignificant number of students require 1 to 2 extra days, that is, 5 to 6 days in total.
The values of Objective 1 are illustrated in Figure 5. The chart groups exams according to the difference between the number of students in each exam and the average number of students per exam in the respective subject. Most exams match the average or differ by only one student. Some exams have larger differences, up to a maximum of 6, but their number is insignificant. Differences of 5 or 6 students between classes usually arise because the solution decides to use rooms with various capacities to reduce the cost of resources. Therefore, the return value of objective 1 is smaller compared to the other objectives. Figure 6 illustrates the number of exams created compared to the minimum requirement in each subject. According to the figure, the solution does not differ significantly from the ideal number. Most of the subjects that require more exams are those with a larger number of students than other subjects. Those subjects need more exams to avoid using rooms with different capacities. In the solution, each objective has some aspects that differ from the ideal result. However, those issues exist because of the balance between the objectives, the scales, and the data design.
We can see that the second objective is more challenging to optimize than the other objectives. The data in Figure 4 also show a few cases in which the break time is not optimized: a small number of students still need 4 to 5 days to finish their exams. The cause of this phenomenon is that the experiments are enrollment-based: students do not follow a common curriculum but take courses and exams independently of each other. The diversity of the other subjects taken by the students of a subject is illustrated in Figure 7. According to the figure, the students of some subjects take around 50 other subjects, which makes it hard to satisfy the requirement of minimizing the break time for all students in a subject.
By changing the weight parameters of the dimensions, we can customize the result of the algorithm. Figure 8 shows the different results obtained with varying sets of parameters. In those executions, since we use smaller data, the students are divided into groups, and students in the same group take the same exams. Therefore, the solution can achieve the ideal result when the weight parameters are changed to focus on a single objective. This minimizes the objectives' influence on each other and yields the best solution for the focused objective while still satisfying the other objectives. The first and third executions show the interaction between the number of exams per subject and the balance of students across exams of the same subject: when one objective achieves its ideal result in its respective execution, the value of the other increases significantly. On the other hand, although the relationship of the break-time objective with the others is not as strong as that between the resource-minimization and student-balancing objectives, achieving its ideal solution still requires the other objectives to satisfy their conditions. We visualize the feasible optimal solutions of the algorithm for each pair of objectives. The Pareto frontier, shown in Figure 9, was obtained by executing the algorithm 250 times with different weight parameters.
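Extracting a Pareto frontier from the outcomes of many runs can be sketched as follows. This is a generic non-dominated filter, not the authors' code: it assumes every objective is minimized and that each run is summarized as a tuple of objective values, and the sample values are made up for the example.

```python
def pareto_front(points):
    """Return the non-dominated points, assuming all objectives are minimized."""
    def dominates(a, b):
        # a dominates b if it is no worse everywhere and better somewhere.
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (objective A, objective B) outcomes of four runs.
runs = [(1, 5), (2, 2), (3, 3), (5, 1)]
print(pareto_front(runs))  # [(1, 5), (2, 2), (5, 1)]
```

The point (3, 3) is removed because (2, 2) is better on both objectives, while the remaining points each represent a different trade-off. Applying the filter to each pair of objectives gives the pairwise frontiers.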

Conclusion
In this study, we have designed a new approach to examination timetabling. In contrast to previous studies, our proposed model is a MOP. It addresses more business aspects, and pays more attention to students' interests, than previous models, which typically seek to save costs. Besides proposing the optimization model, we also use a hybrid approach for the MOP, which combines an ideal point with linear scalarizing. With this method, even if decision-makers cannot assign preferences to each specific factor of the problem, the model still directs the algorithm towards an ideal point. We also designed a scheme of the Genetic Algorithm to solve the proposed optimization model. The results showed that the algorithm maintained population diversity. However, the way we transform the objective values to the same scale is not good enough, and it is not practical for balancing the importance of the objective functions. Some trial and error is required to determine the values of the weight parameters of the objective functions. Once these parameters are defined, the program is an efficient tool that allows decision-makers to manipulate agents. In the future, we will continue to improve the algorithm to increase population diversity [24] and to improve the scaling of the objective functions to balance their importance. Improving computing performance with parallel computing is also one of our priorities [25], [26].