Educational Engineering for Models of Academic Success in Thai Universities During the COVID-19 Pandemic Learning Strategies for Lifelong Learning

The COVID-19 pandemic has had a devastating impact on the Thai education system, as many universities had to transfer their learning management to e-learning. This research aims (1) to study and compare the academic performance of higher education students affected by the COVID-19 health crisis, and (2) to build a model of academic performance with instructional engineering technology to support the learning management process of higher education institutions. The research approaches were carried out according to the theory of data mining techniques using the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology. The data collection was divided into two parts according to the education situation. The first part was a normal situation with data collected from 173 students during the second semester of the 2019 academic year. The second part, on the other hand, was an abnormal situation, with data collected from 167 students during the first semester of the academic year 2020. It was found that traditional and online teaching and learning enabled students to achieve similar academic results. The data collected and analyzed from both situations showed that the students' learning outcomes were not different. In addition, the appropriate grouping of learners allows for three sub-groups to be classified according to the research findings. Therefore, consideration should be given to grouping learners with similar characteristics in a small group and designing more effective learning activities. This issue will be the subject of future study.
Keywords: Academic Achievement Model, COVID-19, Educational Data Mining, Educational Engineering, Learning Analytics
http://www.i-jep.org


Introduction
The coronavirus epidemic that began in 2019 has spread around the world. Many countries are affected in various dimensions including educational [1][2][3][4], socioeconomic, health and many others. In addition, the effects of COVID-19 affect all age groups, including young children who cannot go to school, layoffs in the workforce, and health problems that particularly affect the elderly [5].
In Thailand, the COVID-19 pandemic began in early 2020 [6,7]. The epidemic has seriously affected the country's economy, particularly in the tourism, social and educational sectors [8][9][10]. The national government announced a series of assistance measures, including stimulus loans valued at 1.9 billion. One trillion Baht had been used to fight the virus, but the operation benefited only a very small number of people, leading to protests against the Thai government's inaction towards the end of 2020. The initial impact was that more than seven million Thai people were laid off. As a result, various activities in the country's economic, social and educational sectors had to be cancelled in order to prevent further spread of the disease [6][7][8][9][10][11].
One part of Thailand that has been severely affected is the education system. The COVID-19 education crisis in Thailand has affected people at all levels of education and in all professions [11][12][13]. The spread of the epidemic has resulted in a huge and growing number of workers losing their jobs and becoming chronically unemployed, while educational institutions have found themselves unable to properly use digital technology for teaching and learning because of the costs involved. In the field of institutional management, it is essential to transform the model of teaching and learning by moving from physical learning styles or classroom teaching to distance education [14,15] or virtual technology [16][17][18], known as e-learning. Many studies have examined the benefits of e-learning [14][15][16][17][19][20][21]. However, e-learning also has many negative aspects. For example, e-learning lacks a learning atmosphere [19] and harms young children [19]. In addition, e-learning presents problems related to the teaching and learning management environment, such as personal, technical, logistical and financial barriers [14].
Based on a review of the literature and related work [1][2][3][4], the researchers felt that it was essential to conduct a study on "Educational Engineering for Models of Academic Success in Thai Universities during the COVID-19 Pandemic". The main reason is that most research articles have studied the attitudes and satisfaction of the general public with respect to the COVID-19 pandemic situation. In contrast, the researchers have conducted a comparative study of normal and abnormal situations affected by the COVID-19 pandemic. In addition, the researchers have combined educational theory and engineering theory, an approach known as "educational engineering", to conduct the work.
The conceptual idea and research framework are presented in Figure 1 and have been divided into two stages. (1) The first stage was a normal situation with a physical instruction model where communication between instructor and learner was face-to-face. The research data were obtained from the second semester of the 2019 academic year. (2) The second stage was an abnormal situation with an e-learning model where communication between instructor and learner took place at a distance. The research data were obtained from the first semester of the academic year 2020. The questions asked are consistent with the objectives and methodology of the research. Because the data were collected for situations that had already occurred, they are analyzed using statistical techniques and basic data mining, in two sections. The first section covers basic descriptive analysis, including frequencies, percentages and averages. The other section presents a comparison of patterns of educational achievement classified by different study groups using data mining techniques.
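The basic descriptive analysis mentioned above (frequency, percentage and average) can be illustrated with a minimal sketch. The function name `describe_grades` and all values are hypothetical placeholders for illustration and are not data from the study.

```python
from collections import Counter
from statistics import mean


def describe_grades(grades):
    """Return {grade: (frequency, percentage)} for a list of letter grades."""
    total = len(grades)
    if total == 0:
        return {}
    counts = Counter(grades)
    return {g: (n, round(100 * n / total, 2)) for g, n in sorted(counts.items())}


# Hypothetical illustrative values, NOT data from the study.
example_grades = ["A", "B", "B", "C", "A", "B"]
example_scores = [82.0, 71.5, 68.0, 55.0, 90.0, 74.5]

summary = describe_grades(example_grades)   # frequency and percentage per grade
average_score = mean(example_scores)        # average of the total scores

print(summary)
print(average_score)
```

The same three quantities can be computed separately for each course and each semester before any clustering is attempted.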

Research objectives
The research has two main objectives. The first objective is to study and compare the academic performance of higher education students affected by the coronavirus 2019 (COVID-19) situation. The second objective is to develop a model of academic success using instructional engineering technology to support the learning management process of higher education institutions.
Upon completion of the research, the researchers hope that the research will suggest a model for managing teaching and learning that is responsive to changing future situations.

Research approaches
The research approaches are based on the theory of data mining development using the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology [22][23][24][25][26][27], which is widely accepted in data mining development. The research approach is divided into two sections. The first section is the research materials section, which describes the research population, the research sample and the method of research sample selection. The second section is dedicated to the research methods. It structures the research according to the CRISP-DM methodology [22][23][24], which includes six phases that are detailed in the rest of this article: business understanding, data understanding, data preparation, modeling, evaluation and deployment.

Research materials
The research focuses on three main areas: the study population, the research sample, and the method of selecting the research sample.
Study population: The population of the study was selected from two universities, Rajabhat Mahasarakham University and the University of Phayao. It included students who had enrolled in three courses: 7000103 Mathematics and Statistics for Information Technology, 7011303 Data Warehouse and Data Mining, and 221203 Technology for Business Application, from the academic year 2019 to the academic year 2020. The data collected consisted of activity scores, midterm exam scores, and final exam scores. The instructor informed the learners of the purpose of using the information in the research.
Research sample: Research sampling was used to classify the comparative research according to the characteristics of teaching and learning management. The sample group was divided into two time periods: the academic year 2019, which is the regular teaching and learning management period, and the academic year 2020, which is in an abnormal teaching and learning management period (the COVID-19 situation). The purpose of selecting two time periods was to compare the student academic achievement of the same course.
Method of selecting the research sample: In the sample selection process, the researchers selected all students enrolled in the three courses in the two periods studied, namely the second semester of the academic year 2019 and the first semester of the academic year 2020. The details of the data collection are presented in Tables 1 and 2. After the data were collected and investigated, the researchers developed a model for comparing academic achievement using the CRISP-DM methodology, which is detailed in the research methods section.

Research methods
The Cross-Industry Standard Process for Data Mining (CRISP-DM) provides a framework for translating business problems (research problems) into data mining tasks and for conducting data mining projects independently of both the application area and the technology used. It is part of the general knowledge discovery (KD) process in the industry [22][23][24]. This research methods section presents the process of the research. It is composed of six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
Business understanding: Business Understanding is the initial step in searching for a research problem. It is of the highest importance because it is the starting point for doing the research. It has four essential components: determine business objectives, assess the situation, determine data mining goals, and produce a project plan [22][23][24].
The COVID-19 pandemic has severely affected the teaching and learning management of all educational institutions. Learners and instructors had to adapt to situations where they could not meet normally. Therefore, education management shifted from the physical platform to the virtual platform. The research question is: How do physical learning platforms and virtual learning platforms affect academic achievement? Based on this question, the researchers hypothesized that (1) physical learning platforms and virtual learning platforms had different impacts on academic achievement, and (2) learners studying in each category had different academic achievement.
At this stage, the researchers established a four-step workflow. The first step was to define the research problem. The second step was to assess the situation: the researchers considered that the Coronavirus 2019 (COVID-19) situation would be sustained, which makes it imperative to develop appropriate academic achievement models for future institutional management. The third step was to determine the data mining goals. The final step was to produce a project plan, in which the researchers defined the project plans, tools and techniques described in the next section.
Data understanding: Data understanding means understanding the environment and context of a research problem, and how to acquire data and information [22][23][24]. From another point of view, it is the link between a business problem and the information that needs to be prepared for data analysis. It has four key elements: collect initial data, describe data, explore data, and verify data quality [22][23][24].
In this research, data understanding is based on these four key elements. The first step is to collect initial data, after obtaining research ethics approval to conduct and control the research. The second step is a data description report, which aims to understand the available data. The third step is a data exploration report, which aims to understand the data types and the suitability of the data collected. The final step is a data quality report, which aims to understand the data quality.
Data preparation: The data preparation phase is the part that takes the longest time in the process. It serves to collect data, consider relevant data, and organize data in an appropriate format for use in model development [22][23][24].
In this paper, the data preparation was performed in six stages according to the CRISP-DM methodology's data preparation components. The first stage was to scope a data set that the researchers divided into two perspectives from three courses at two universities: 7000103 Mathematics and Statistics for Information Technology, 7011303 Data Warehouse and Data Mining, and 221203 Technology for Business Application. The second stage was to select data from students who enrolled in the three courses at the two universities. The third stage was to remove the data of students who withdrew from the three courses. The fourth stage was to structure the data into four attributes: activity score, midterm exam score, final exam score, and academic results. The fifth stage was to integrate the data, where the researchers combined the scores from every activity and calculated them as a proportion to determine academic achievement, so that the result is consistent with the data structures required to develop the model. Finally, the sixth stage was to format the data for model construction. The data sets that have undergone the data preparation process are presented in Tables 1 and 2.
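Stages three to five above (removing withdrawn students, structuring the attributes, and combining activity scores into a proportion) can be sketched as follows. This is only an illustration under stated assumptions: the field names (`withdrawn`, `activity_scores`, `activity_max`) and the records are hypothetical, and the exact weighting used by the researchers is not given in the text.

```python
def prepare_records(raw_records):
    """Drop withdrawn students and reduce each record to the modeling attributes."""
    prepared = []
    for rec in raw_records:
        if rec["withdrawn"]:
            continue  # stage 3: remove students who withdrew from the course
        earned = sum(rec["activity_scores"])
        possible = sum(rec["activity_max"])
        prepared.append({
            "student_id": rec["student_id"],
            # stage 5: all activities combined into one proportion (assumed 0..1 scale)
            "activity": earned / possible if possible else 0.0,
            "midterm": rec["midterm"],
            "final": rec["final"],
        })
    return prepared


# Hypothetical records for illustration only, NOT data from the study.
raw = [
    {"student_id": "s1", "withdrawn": False, "activity_scores": [8, 7],
     "activity_max": [10, 10], "midterm": 20, "final": 25},
    {"student_id": "s2", "withdrawn": True, "activity_scores": [0, 0],
     "activity_max": [10, 10], "midterm": 0, "final": 0},
    {"student_id": "s3", "withdrawn": False, "activity_scores": [10, 10],
     "activity_max": [10, 10], "midterm": 28, "final": 35},
]

prepared = prepare_records(raw)
print(prepared)
```

Keeping the activity score as a proportion puts courses with different numbers of activities on a common scale before modeling.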
Modelling: The second objective of this research was to construct an academic achievement model with educational engineering to support the learning management process of higher education institutions. This section presents two components of the academic achievement model development: academic achievement model definition, and modeling tools.
Academic achievement Model Definition: In this research, the Academic Achievement Model (AAM) is defined as a prototype model to be used as a tool in the management style of learning and teaching for unexpected situations. The steps for prototyping this model are shown in Figure 2.  Figure 2 shows the prototype model which is composed of two modelling phases. The first phase is modelling with collected data for training, analysing data with data mining techniques, and selecting the model. After developing a prototype model, the second phase concerns preparing data for prototype testing to find the accuracy and precision of the prototype.
Modelling Tools: Modelling tools are the methods for selecting the data mining tools that correspond to the model construction. The tools used in this section comprise two parts: a tool for building an appropriate cluster of learners with k-Means, and a tool for finding the optimal k-value for a reasonable cluster with k-Determination. An important method in this research is the clustering of data.
Clustering of data is an Unsupervised Learning method that groups data according to their similarity. Clustering techniques used for data analysis include k-means, DBSCAN, Fuzzy C-medoids, and K-medoids [28]. k-means is one of the most popular and widely applied methods. It is applied in education to place similar data in the same group in order to improve teaching and learning performance and academic achievement [29][30][31], for example with respect to learning courses, learning styles, students' behaviour, and other related aspects.
The process of k-means clustering is as follows. The algorithm fixes a number of clusters, k, and assigns each data point to exactly one cluster [32]. Each cluster contains similar data, where similarity is based on measuring the distances between data points. A cluster in the k-means algorithm is determined by the position of its center in the n-dimensional space of the n attributes of the dataset. This position is called the centroid. The k-means algorithm starts the computation with k points, which are treated as the initial centroids of the k clusters. These starting points are the positions of k randomly drawn data points from the input dataset, as selected by the k-means procedure.
After that, the distance between each data point and each centroid is evaluated, and every data point is assigned to its nearest cluster, where "nearest" is defined by the measurement type. Next, each cluster centroid is recalculated by averaging all the data points in that cluster. These steps are repeated with the new centroids until the centroids no longer move or the maximum number of optimization steps is reached. Note that the k-means algorithm is not guaranteed to converge if the measurement type is not based on Euclidean distance calculations. The whole procedure is repeated for a maximum number of runs, each with a different set of starting points. The set of clusters returned is the one with the smallest sum of squared distances from all data points to their corresponding centroids. The process of finding the optimal k-value for a reasonable cluster with k-Determination is part of the evaluation. The procedure for finding an appropriate k-value is described in the next section.
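The k-means procedure described above can be sketched in plain Python. This is a minimal illustration, not the tool used by the researchers: the function names, the defaults `max_runs=10` and `max_steps=100`, the fixed random seed, and the toy coordinates are all assumptions made for the example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: euclidean(p, centroids[c]))
        for p in points
    ]


def k_means_once(points, k, max_steps, rng):
    """One run: k random start points, then alternate assignment and averaging."""
    centroids = rng.sample(points, k)
    labels = assign(points, centroids)
    for _ in range(max_steps):
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # an empty cluster keeps its centroid
        if updated == centroids:
            break  # centroids no longer move
        centroids = updated
        labels = assign(points, centroids)
    sse = sum(euclidean(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return centroids, labels, sse


def k_means(points, k, max_runs=10, max_steps=100, seed=0):
    """Repeat the run with different start points; keep the smallest squared error."""
    points = [tuple(p) for p in points]
    rng = random.Random(seed)
    runs = [k_means_once(points, k, max_steps, rng) for _ in range(max_runs)]
    return min(runs, key=lambda run: run[2])


# Toy coordinates for illustration only, NOT data from the study.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels, sse = k_means(points, k=2)
print(labels, round(sse, 3))
```

In practice each point would hold the prepared attributes of one student, and rescaling the attributes to comparable ranges before clustering is a common precaution with Euclidean distance.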
Evaluation: To find the optimal k-value, the research used the k-Determination method [33], which applies the elbow principle of k-value selection. It selects the k-value at which the plotted value changes suddenly, for example a change from vertical to horizontal or from horizontal to vertical. An example of the optimal k-value is shown in Figure 3.
From Figure 3, it can be seen that the curve changes suddenly at k = 4. Therefore, it can be concluded that this k-value (k = 4) should be used in the development of the model.
Deployment: As shown in Figure 3, the prototype of the academic achievement model will be implemented in the academic year 2021. The researchers plan to prepare and apply the test in a small group at two universities. The researchers expect that the group of learners will have a high level of academic achievement.
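The elbow principle used in the Evaluation step can be sketched as picking the k-value where the curve bends most sharply. The rule below (largest drop in improvement between consecutive k-values) is one common heuristic and is an assumption here; the text does not specify how the k-Determination method locates the bend. The curve values are hypothetical.

```python
def elbow_k(avg_distance_by_k):
    """Return the k whose point on the curve has the sharpest bend.

    avg_distance_by_k maps consecutive k-values to the average
    within-centroid distance obtained for that k.
    """
    ks = sorted(avg_distance_by_k)
    if len(ks) < 3:
        raise ValueError("need at least three k-values to locate an elbow")
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        gain_before = avg_distance_by_k[prev_k] - avg_distance_by_k[k]
        gain_after = avg_distance_by_k[k] - avg_distance_by_k[next_k]
        bend = gain_before - gain_after  # large when improvement suddenly stalls
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Hypothetical curve for illustration only, NOT values from the study.
curve = {2: 100.0, 3: 60.0, 4: 25.0, 5: 22.0, 6: 20.0}
chosen_k = elbow_k(curve)
print(chosen_k)
```

A visual check of the plotted curve remains advisable, since an automatic rule can be misled by a noisy or nearly straight curve.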

Main Results
In this section, the research findings present the report of the data collection, as well as the synthesis and analysis of the data in three dimensions. The first section is a summary of the data collection. The second section is an executive summary, and the third section summarizes the patterns of educational success obtained from the analysis using data mining techniques.

Summary of data collection
The summary of the data collection was divided into two main parts. The first part was a normal situation with data collected for three courses during the second semester of the 2019 academic year, as summarized in Table 1. The second part was an abnormal situation with data collected for the same three courses during the first semester of the academic year 2020, as summarized in Table 2. Table 1 provides a breakdown of the data from three courses at two universities. It consists of 90 students who enrolled in 7000103 Mathematics and Statistics for Information Technology, 56 students who enrolled in 7011303 Data Warehouse and Data Mining, and 27 students who enrolled in 221203 Technology for Business Application, for a total of 173 students. Table 2 provides a breakdown of the data from three courses at two universities. It consists of 73 students who enrolled in 7000103 Mathematics and Statistics for Information Technology, 66 students who enrolled in 7011303 Data Warehouse and Data Mining, and 28 students who enrolled in 221203 Technology for Business Application during the first semester of the academic year 2020, for a total of 167 students.
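The enrolment figures reported above can be checked with a few lines of arithmetic. The sketch below only restates the counts given in the text; the variable names are chosen for illustration.

```python
# (students in 2nd semester of 2019, students in 1st semester of 2020), per course
enrolment = {
    "7000103 Mathematics and Statistics for Information Technology": (90, 73),
    "7011303 Data Warehouse and Data Mining": (56, 66),
    "221203 Technology for Business Application": (27, 28),
}

total_normal = sum(before for before, _ in enrolment.values())
total_abnormal = sum(after for _, after in enrolment.values())
change_by_course = {course: after - before
                    for course, (before, after) in enrolment.items()}

print(total_normal, total_abnormal)
for course, change in change_by_course.items():
    print(f"{course}: {change:+d}")
```

The totals agree with the 173 and 167 students stated in the text; the overall decline comes from the first course, while the other two courses grew slightly.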

Executive summary
This part presents the data in graphic format to compare academic achievement in various dimensions, classified by course and year. Figure 4 shows the comparison of academic achievement in the three courses. It can be seen that the learning patterns of the three sets of data do not differ: although the number of students declined, the change from the traditional teaching and learning model to online teaching and learning did not affect the results. An overview of the three courses analyzed found that they were suitable for online management.

Summary of academic achievement models
In order to find the best and most suitable academic achievement model, it is first necessary to find the right number of clusters. The tool used to effectively quantify the number of clusters is the k-Determination method [33]. An explanation of the principle of operation is presented in Figure 4.
Annotation: In clustering, this model focused solely on activity scores, sub-activity scores, and midterm exam scores. The objectives were to further study the model and its application in other educational periods. The results of finding the appropriate clusters were classified into three sections according to the data set gathered from three courses in the most recent semester (first semester of the academic year 2020), as shown in Figures 5 to 7 and Tables 3 to 8.
Figure 5 shows the appropriate number of clusters for course 7000103 Mathematics and Statistics for Information Technology during the first semester of the academic year 2020. It was found that the appropriate cluster number was k = 3, with descriptions of suitable clusters shown in Tables 3 and 4. Table 3 shows the average within-centroid distance plotted in Figure 5. It shows the change in the direction of the calculation results according to the k-Determination principle. Table 4 shows the details of the cluster results for course 7000103 Mathematics and Statistics for Information Technology during the first semester of the academic year 2020 according to the k-Determination analysis.
Figure 6 shows the appropriate number of clusters for course 7011303 Data Warehouse and Data Mining during the first semester of the academic year 2020. It was found that the appropriate cluster number was k = 3, with descriptions of suitable clusters shown in Tables 5 and 6. Table 5 shows the average within-centroid distance plotted in Figure 6. It shows the change in the direction of the calculation results according to the k-Determination principle. Table 6 shows the details of the cluster results for course 7011303 Data Warehouse and Data Mining during the first semester of the academic year 2020 according to the k-Determination analysis.
Figure 7 shows the appropriate number of clusters for course 221203 Technology for Business Application during the first semester of the academic year 2020. It was found that the appropriate cluster number was k = 3, with descriptions of suitable clusters shown in Tables 7 and 8. Table 7 shows the average within-centroid distance plotted in Figure 7. It shows the change in the direction of the calculation results according to the k-Determination principle. Table 8 shows the details of the cluster results for course 221203 Technology for Business Application during the first semester of the academic year 2020 according to the k-Determination analysis.
The results are reported in Figures 3 to 7 and in Tables 3 to 8. The k-Determination analysis is presented as an appropriate method for selecting the number of clusters, demonstrating the values of the average within-centroid distance and the details of the cluster results in each course. It was found that all three courses should be grouped into three clusters. The results of this research are consistent with the second research question, which is to find a model to support teaching and learning management consistent with the situation of the 2019 coronavirus pandemic.
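Once each learner has been assigned to a cluster, the cluster details can be summarized by size and by the average of each attribute. The sketch below is an assumption about what such a summary might contain; the actual layout of the cluster tables is not reproduced in the text, and the rows, labels and attribute names are hypothetical.

```python
from collections import defaultdict


def summarize_clusters(rows, labels, feature_names):
    """Return size, share of learners and attribute means for every cluster."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    summary = {}
    for label, members in sorted(grouped.items()):
        means = [sum(col) / len(members) for col in zip(*members)]
        summary[label] = {
            "size": len(members),
            "share": len(members) / len(rows),
            "centroid": dict(zip(feature_names, means)),
        }
    return summary


# Hypothetical rows and cluster labels for illustration only, NOT study data.
rows = [(10, 20), (20, 30), (30, 10)]
labels = [0, 0, 1]
cluster_summary = summarize_clusters(rows, labels, ("activity", "midterm"))
print(cluster_summary)
```

A summary of this kind makes it easier to describe each sub-group of learners in terms an instructor can act on, such as low activity scores combined with average midterm results.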

Discussion
The discussion of this research is divided into two key areas, corresponding to the two questions addressed in this research. The first question is: How can the use of e-learning tools and technologies support and replace traditional learning? The second question is: Which different groups of learners are affected by the coronavirus epidemic situation? The discussion therefore focuses on aspects of the data collected and the situations, as well as on appropriate aspects of group management.

Aspects of data collected and situations
From a data collection perspective, the researchers collected data from three courses at the two universities over two periods of time with different situations.
The first situation is normal learning management. It consists of 90 students who enrolled in 7000103 Mathematics and Statistics for Information Technology, 56 students who enrolled in 7011303 Data Warehouse and Data Mining at the Faculty of Information Technology, Rajabhat Mahasarakham University, and 27 students who enrolled in 221203 Technology for Business Application at the School of Information and Communication Technology, the University of Phayao during the 2nd semester of the academic year 2019.
The second situation is abnormal learning management. It consists of 73 students who enrolled in 7000103 Mathematics and Statistics for Information Technology, 66 students who enrolled in 7011303 Data Warehouse and Data Mining at the Faculty of Information Technology, Rajabhat Mahasarakham University, and 28 students who enrolled in 221203 Technology for Business Application at the School of Information and Communication Technology, the University of Phayao during the 1st semester of the academic year 2020. A comparison of both situations is shown in Table 9.
From Tables 1, 2 and 9, and Figure 4, it can be concluded that the shift from traditional learning management to online learning management produced little change in students' academic achievement.
The above findings support the first research hypothesis. The aim was to identify differences in academic achievement between learning arrangements in normal and abnormal situations, where online learning tools and technologies are used to support and replace traditional learning. It is concluded that the difference is small.

Appropriate cluster management aspects
The objective of finding appropriate cluster management is to find the learning model that will improve the academic performance of students. Researchers believe that good clustering of learners promotes student learning. The fact that groups of learners have similar characteristics and attitudes has an impact on educational outcomes.
From Figure 5 to Figure 7, and Table 3 to Table 8, it can be concluded that the management of teaching and learning can identify appropriate groupings in both dimensions of traditional learning management and e-learning management. For the majority of the research, the number of appropriate clusters for traditional and online teaching was found to be three clusters. This reflects the importance of clusters that reflect learning outcomes. It also underscores the importance of encouraging learners to have a learning cluster based on the attitudes and interests of the learner.
The results are consistent with a second research objective, which is to find a model to support learning management with the COVID-19 situation.

Conclusion
The unusual situation that occurred with the COVID-19 pandemic severely affected the world in all dimensions of our society. This research has been carried out with sincere intentions so that it will be of great benefit to humanity. The research has two main objectives: (1) to study and compare the academic performance of higher education students affected by the COVID-19 health crisis, and (2) to build a model of academic performance with educational engineering technology to support the learning management process of higher education institutions. These two objectives are fully consistent with two research questions: how can the use of e-learning tools and technologies support and replace traditional learning? and what are the different learning groups affected by the epidemic situation of learners due to the coronavirus?
The approaches implemented in this article are divided into two parts. The first section is the research materials section, which describes the research population, the research sample, and the method for selecting the research sample. The second section is the research methods section. It controls the research using the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology. Data collection is done in two parts. The first part is a normal situation with data collected from 173 students during the second semester of the 2019 academic year. On the other hand, the second part is an abnormal situation with data collected from 167 students during the first semester of the academic year 2020.
The researchers find that the learning patterns that changed with the COVID-19 epidemic are only slightly different. Since the academic outcomes of the learners are not different, it can be concluded that the implementation of the e-learning model in teaching and learning is possible. A further outcome of the research is a model of academic success: the researchers found that managing three sub-groups for each course is most appropriate. Therefore, teachers should apply this principle in managing online learning in order to effectively manage small groups.

Conflict of Interest and Research Ethics
The author declares no conflict of interest. With regards to ethics, the researchers were allowed to conduct the study according to the announcement of the University of Phayao: No. 2/020/63 on April 22, 2020.