Big Data Application and its Impact on Education

Big data is employed in widely different fields; here we study how education uses big data. We review the literature on big data in education published from 2010 to 2020, and then review the process of big educational data mining, the tools, and the applications of big data in education. With the help of these applications, this paper explores how big data can improve the education process. Two methods are applied, an interpretive method and a systematic literature review, and the stakeholders, the mining process, and the tools are discussed to complete the research.
Keywords: Big Data in Education, Educational Data Mining, Data Mining Tools, Big Data Applications.


Introduction
Big data refers to large amounts of data, but it is about more than size alone. It is characterized by four Vs: volume, variety, velocity, and veracity. Volume indicates the size of the data. In 2014, there were 2.4 billion Internet users. That number became 300 million Internet users in 2017. By 2019 there were over 4.4 billion Internet users, which represents an 83% increase in the number of people using the Internet over the preceding five years [1]. Variety refers to all types of data, structured or unstructured, generated by humans or by machines. Velocity refers to the speed at which data is generated and processed [2]. Veracity refers to the noise and abnormality in generated data, and to how far the data can be trusted when decisions need to be made on it [3].
Big data can be applied to many fields, such as healthcare, administration, or security. It can revolutionize intelligence in transport, energy, finance, and education [4], which is what we intend to study here. In general, there is a need for big data because it offers many advantages, such as helping to predict performance; for example, coronavirus modelling and the behavior of COVID-19 are predicted with big data [57]. In this research, we focus on the education field because big data is already widely used in education [5]. According to [9], the number of articles studying big data in education increased by up to 50% until 2016, and since then the number has fallen by an average of 6%, so more research is needed in this field. This research aims to review big data applications in education. We intend to explore how big data improves the education process. Furthermore, we review the stakeholders, the process of educational data mining, and the tools. Finally, we summarize the applications of big data in the education field.

Related Work
To find the literature related to this research, we follow the method of [10], which has three phases: planning, conducting the review, and documenting the result of the review. First step, planning: to identify articles that include the term "Big Data for Education," this research relies on a search in the Saudi Digital Library (SDL), with the publication years restricted to 2015 to 2020. The search returned about 521,716 items. Second step, conducting the review: according to [10], this step aims to identify the relevant research, select the main studies, assess their quality, and synthesize the data. Last step: documenting the result of the review. This research orders the results by date. [11] introduces applications of educational big data mining to show how big data can solve issues in education. [12] addresses the challenges of handling big data and the most popular techniques used for educational data mining, which are regression, nearest neighbor, clustering, and classification, as well as the top tools used, such as Hadoop and MapReduce; the authors then discuss how those techniques can be used in a variety of ways in learning analytics. The study also revealed that few articles focus on evaluating learning results by analyzing natural language text. The study reviewed only a small number of the available articles from 2010 to 2019, so there is a need for a systematic study of the literature on big data in education and learning. [13] examines the state of learning analytics stakeholders such as students, instructors, and researchers, and addresses the benefits and challenges these stakeholders face. [14] explores how cloud-based big data mining can be applied to Indian education and research, and then addresses the challenges and the benefits that can be obtained.
They use "educational intelligence" as a term that includes terms like learning analytics, academic analytics, and big data analytics. [15] clarifies the positioning of big data in education, and then clarifies that theoretical research on and practical application of big data for education remain in the initial stage. [16] believes that big data holds great transformative potential for education. He points to the opportunities of using big data in education in two domains: learning analytics and educational policy. He also points out that big data for education remains in the initial stage. [17] highlights a map-based management and visual analysis method that the authors believe will help users and researchers take advantage of big data in education. [18] claims that there are very few education policy studies on big data. He discusses how to approach big data and how big data can be used in education policy research. In his book, [19] describes how education is becoming increasingly 'digitized' and 'datafied', being conducted through software programs that have been written in code. He details how digitization and datafication support each other, and how they have started to animate the visions of social members. He introduces the concept of "sociotechnical imaginaries," that is, how "desirable future visions for education based on digital data are now being projected and enacted" [19]. [19] also provides a historical and conceptual map for understanding big data. This demonstrates how a concern with data has become central to the activities of businesses, governments, and social scientific practice. [20] frames the discussion around policy questions about using big data in education: 'Who owns educational data?' 'Who must consent to data collection?' 'Who has the right to access stored data, and for what purpose?' 'What, if any, rights do students, instructors, or even educational institutions have to privacy or anonymity in their educational records?'
These issues must be addressed. [21] presents the idea of how big data is used in personalized learning and the opportunities offered by big data. [22] points to the data mining methods and applications used for education and explains how learning analytics is used in colleges and universities. The authors review the literature to define the benefits of educational data mining (EDM) and learning analytics (LA) and the value of big data in education, and then point out the challenges. [23] discusses big data in nurse education, explaining its benefits, challenges, and potential dangers. [14] proposes PABED (Project Analyzing Big Education Data), an educational intelligence tool used for analyzing big education data. N. Xu et al. [23] study the use of big data in education, applying their study to a MOOC, which is a training platform. [42] proposes a depth learning analysis algorithm based on MapReduce to deal with college data. [44] analyzes the factors that affect students' learning behavior in online education in the big data environment, examines how these factors affect students' knowledge level in online education, and proposes corresponding control strategies according to those factors.

Methodology
This paper uses a research methodology that comprises two methods. The first method is interpretive: it defines what big data means in education, the stakeholders, and the tools. The second method is a literature review of journal articles that follows a well-known systematic literature review approach, which consists of systematically searching, selecting, reviewing, and synthesizing the relevant academic publications, as described by Kitchenham and Charters [26]. To summarize the big data applications in education, we first search for the most cited related articles over time, and then search again, with the dates restricted to 2010 to 2020, for related articles that applied big data mining in the education field. We filter the articles based on their titles and abstracts.
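The search-and-filter step described above can be illustrated with a minimal Python sketch. It assumes each search result is available as a record with a title, an abstract, a publication year, and a citation count. The `Article` class, the `screen` function, the keyword list, and the sample records are all hypothetical; they are not taken from any library catalogue or from [26].

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical record for one search result."""
    title: str
    abstract: str
    year: int
    citations: int


def screen(articles, keywords, first_year=2010, last_year=2020):
    """Keep articles inside the date window whose title or abstract
    mentions every keyword, ordered from most to least cited."""
    selected = []
    for article in articles:
        text = f"{article.title} {article.abstract}".lower()
        in_window = first_year <= article.year <= last_year
        on_topic = all(keyword.lower() in text for keyword in keywords)
        if in_window and on_topic:
            selected.append(article)
    return sorted(selected, key=lambda a: a.citations, reverse=True)


# Made-up sample records, for illustration only.
articles = [
    Article("Big data mining for student dropout",
            "We apply big data mining to education records.", 2017, 120),
    Article("Big data in retail",
            "A study of big data for sales forecasting.", 2018, 300),
    Article("Early e-learning logs",
            "Big data and education before MOOCs.", 2008, 90),
    Article("Learning analytics at scale",
            "Big data methods for higher education.", 2019, 210),
]

shortlist = screen(articles, keywords=["big data", "education"])
for article in shortlist:
    print(article.year, article.citations, article.title)
```

Simple substring matching is used here only to keep the sketch short; an actual screening would still require reading the titles and abstracts, as the methodology states.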

Big Educational Data
Big data technologies are used to extract valuable and meaningful information from vast volumes of fast-growing data of wide variety and varying veracity [27]. Using big data to develop different types of applications for educational data mining, and extracting knowledge from those data, helps education institutions such as schools and universities to become smarter. Education differs from other sectors that use big data, such as business, in the participants who are involved in the educational data mining process and who view it from different perspectives according to their specific mission, vision, and objectives [8]. Educational data is considered big data because of its volume and variety: a large amount of data is produced daily about students, their activities, and their interactions with learning systems or learning platforms, as well as about learning activities, course information that differs from one course to another, and other information that helps to improve the quality of education processes [28].

Big educational data mining process:
Educational data are collected from many different sources in different education environments, such as traditional classrooms or different learning management systems. The data could be student records, student behaviors, exam performance, social forum activity, demographic data, IoT data [55], administrative data, and so on. The education sector can produce a huge amount of data. Mining big educational data, like any big data mining, follows the steps of data mining: a) collect the raw education data; b) select the useful data; c) clean and prepare the data; d) transform the data: normalize, smooth, and apply any other processing needed to make the data ready for mining; e) mine the data: extract a pattern from the data; f) evaluate and present the data; and g) discover knowledge by interpreting the result [29]. Figure 1 shows the process above.
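Steps a) to g) can be sketched as a small pipeline. The following Python example is only an illustration under stated assumptions: the student records, the field names, the min-max normalization, and the toy "pattern" (comparing mean exam scores of high-activity and low-activity students) are all hypothetical choices and do not come from [29].

```python
from statistics import mean

# Step a: collect raw data (made-up records; None marks a missing value).
raw_records = [
    {"student": "s1", "logins": 42, "exam": 88, "forum_posts": 5},
    {"student": "s2", "logins": 7, "exam": 51, "forum_posts": 0},
    {"student": "s3", "logins": None, "exam": 73, "forum_posts": 2},
    {"student": "s4", "logins": 30, "exam": 79, "forum_posts": 9},
    {"student": "s5", "logins": 12, "exam": 45, "forum_posts": 1},
]


def select(records, fields):
    """Step b: keep only the fields relevant to the question."""
    return [{field: record[field] for field in fields} for record in records]


def clean(records):
    """Step c: drop records that contain a missing value."""
    return [r for r in records if all(v is not None for v in r.values())]


def normalize(records, field):
    """Step d: min-max scale one numeric field into the range [0, 1]."""
    values = [r[field] for r in records]
    low, high = min(values), max(values)
    span = (high - low) or 1  # avoid division by zero for constant fields
    return [{**r, field: (r[field] - low) / span} for r in records]


def mine(records, threshold=0.5):
    """Step e: toy pattern, mean exam score by activity level."""
    high = [r["exam"] for r in records if r["logins"] >= threshold]
    low = [r["exam"] for r in records if r["logins"] < threshold]
    return {"high_activity_mean": mean(high), "low_activity_mean": mean(low)}


selected = select(raw_records, ["student", "logins", "exam"])
cleaned = clean(selected)
prepared = normalize(cleaned, "logins")
pattern = mine(prepared)

# Steps f and g: present the result so that it can be interpreted.
print(pattern)
```

Each step is a separate function so that a different cleaning rule or mining technique can be swapped in without touching the rest of the pipeline.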

Educational data mining stakeholders:
The educational data mining stakeholders are the people who produce the raw data and benefit from the knowledge extracted by the mining process. The data may be the same, but every stakeholder uses it according to their own objectives. The stakeholders are:
1. Learners/students: They are considered the producers of most education data. They can also benefit from the results of data mining, for example through personalized e-learning and recommended courses, activities, and resources, which help them to improve their learning process.
2. Educators/instructors/teachers: They can use the data to improve their teaching methods [30]. They can get feedback, analyze student behaviors, predict students' performance, identify different methods to organize their courses, detect the most frequent mistakes they have to avoid, find irregular patterns, monitor learning progress, etc.
3. Course developers: The data helps them to evaluate the course structures, the students' satisfaction, and the outcomes of the courses.
4. Researchers: They can evaluate the mining tools, develop specific data mining tools, and compare data mining tools to identify the most useful one.
5. Universities/administrators/learning providers: They are responsible for allocating the resources, either human or material, for implementation [30]. They also benefit from educational data mining, which can enhance the decision process, help in the admissions process, support the evaluation of educators, etc.

Data mining tools
The most popular data mining tools in the educational field are [8,49,50

Applications of Big Educational Data
Here we summarize, from the related articles, the applications that can result from analyzing the data generated in education:
1. Predicting student performance: It is considered the oldest and most popular application of data mining in education [31]. Prediction is used to estimate a value that describes the state of the student, such as dropout [32,33] or their tasks [34].
2. Data visualization: It can help in many different ways, such as analyzing the activity of students and statistically visualizing their activities [35]. Using big data mining techniques, we can produce details about the education environment and extract knowledge such as statistical indicators on students' interaction in forums, the order in which students study topics, the number of materials students use, and so on [36]. Visual representations of the data also help people to understand the analyzed data [8].
3. Providing smart feedback: For example, feedback about how to make student learning more efficient and how to organize course resources, which then helps them to take appropriate action [37]. Providing feedback also helps to extract new, hidden, and exciting insights from data [8]. Different data mining techniques, such as clustering, classification, and association rule mining, are used to allow teachers to gather feedback on learning progress automatically.
4. Providing insights and recommendations: Big data can provide insights to support the education process [38]. Furthermore, it can make recommendations to students depending on their activities, the links they visit, the next task that needs to be done, and so on. It also makes it possible to create content, interfaces, or personalized activities for every single student [39].
5. Big data modeling: The objective is to develop cognitive models of students, including a model of their skills and knowledge, and to automatically take into account student characteristics such as motivation, satisfaction, learning behavior, and so on, in order to automate the student models. For example, association-rule algorithms have been applied for personality mining in online education to model students' personality characteristics [40].
6. Detecting undesirable student behaviors: The objective is to detect students who have some type of problem or unusual behavior, such as erroneous actions, low motivation, and so on, in order to provide appropriate help. Several DM techniques, such as classification and clustering, are used to detect these types of behaviors [8].
7. Grouping students for personalized activities: This means creating groups of students according to, for example, features or personal characteristics, which helps instructors to group students, build a personalized learning system [41,42], and build effective group learning. The data mining techniques used for this are classification and clustering [8].
8. Constructing course content: The objective is to help instructors and developers automatically develop the course and learning content [43]. It also promotes the exchange of existing learning resources among different systems. The clustering DM technique is used to develop personalized courseware by building a personalized web tutor tree [8].
9. Social relationship mining: It is used for studying relationships between individuals, rather than individual attributes or characteristics. A social network includes a group of people who are connected by relationships like friendship. The most common DM technique used to mine social networks in education is collaborative or social filtering, which is a method of making predictions about the interests of a student by collecting information about what they prefer [8,53,54]. Text data from forums, chats, and social networks can be used [44].
10. Scheduling and planning: The objective is to enhance traditional education by planning future courses, such as course scheduling, resource allocation, help in admission and counseling, curriculum development, and so on [8]; for example, in academic planning, to explore the effects of changes in admissions. Furthermore, it is used to generate strategies [45], such as inducing effective pedagogical plans for designing intelligent systems that adapt and react to student behavior in the short term, which enhances the learning process in the long term [46]. Association rules are the data mining technique used for planning tasks through categorization, estimation, and visualization in higher education for different objectives [8].
11. Skill estimation: Estimating the student's skill in the educational environment [9,47].
12. Foreign language learning [48].
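The collaborative filtering idea mentioned under social relationship mining, predicting a student's interests from the preferences of similar students, can be sketched as follows. This is a minimal user-based variant with cosine similarity, chosen here only for illustration; the cited works may use different formulations. The student names, resource names, ratings, and the `cosine` and `recommend` functions are all hypothetical.

```python
from math import sqrt

# Made-up ratings: student -> {learning resource: rating on a 1-5 scale}.
ratings = {
    "ana": {"algebra_video": 5, "geometry_quiz": 4, "stats_notes": 1},
    "ben": {"algebra_video": 4, "geometry_quiz": 5, "calculus_lab": 5},
    "cara": {"stats_notes": 5, "probability_game": 4, "algebra_video": 1},
}


def cosine(a, b):
    """Cosine similarity between two rating dictionaries.

    The dot product uses only resources rated by both students;
    the norms use each student's full rating vector.
    """
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b)


def recommend(student, ratings, top_n=1):
    """Rank resources the student has not rated by the
    similarity-weighted average rating of the other students."""
    own = ratings[student]
    weighted_sum = {}
    weight_total = {}
    for other, theirs in ratings.items():
        if other == student:
            continue
        similarity = cosine(own, theirs)
        if similarity <= 0:
            continue
        for item, rating in theirs.items():
            if item in own:
                continue  # already known to the student
            weighted_sum[item] = weighted_sum.get(item, 0.0) + similarity * rating
            weight_total[item] = weight_total.get(item, 0.0) + similarity
    predicted = {item: weighted_sum[item] / weight_total[item] for item in weighted_sum}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]


print(recommend("ana", ratings))
```

In this toy data, "ben" rates the shared resources much as "ana" does, so the resource that only "ben" has rated ranks first for "ana".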

Conclusion
In this research, we discussed what big data means for the education field, the stakeholders, the process of mining big educational data, and the tools. We also reviewed the related work and summarized the applications that apply big data mining techniques to educational data. We provided a brief study of what big data means for education to help readers understand the main topics.