A Conceptual Framework to Aid Attribute Selection in Machine Learning Student Performance Prediction Models

A key application of Learning Analytics is enabling institutions to track students' academic activities and provide them with real-time adaptive consultations on their academic progression. However, numerous barriers exist when developing and implementing such learning analytics applications. Machine learning algorithms are useful tools for supporting learning analytics because they build models capable of forecasting the final outcome of students based on their available attributes. A machine learning algorithm's performance can degrade when the entire set of attributes is used, so a careful selection of predicting attributes improves the performance of the resulting model. Although several techniques help to identify the subset of significant attributes, the challenging task is to evaluate whether the prediction attributes are meaningful, explicit, and controllable by the students. This paper reviews the existing literature to compile an exhaustive list of attributes used in developing student performance prediction models. We propose a conceptual framework that identifies the nature of attributes and classifies them as either latent or dynamic. Latent attributes may appear significant, but the student is not able to control them; dynamic attributes, on the other hand, are within the student's control. The framework gives researchers an opportunity to pick constructive attributes for model development. We apply an artificial neural network, a supervised learner, to a dataset to compare the performance of prediction models built with distinct classes of attributes. The results confirm the significance of dynamic attributes for student performance prediction models.
Keywords: learning analytics, student performance prediction, academic analytics, machine learning


Introduction
Learning analytics (LA) examines students' academic activities using data collected from different sources and searches for correlations linking those activities to learning outcomes. LA offers a promising approach to understanding the learning environment and supporting learners during the instructional process [1]. The raw data is collected from learners, normally through learning management systems and records of browsing and interaction behavior, and the stakeholders of the process then act upon the data to arrange constructive procedures. The stakeholders may also be expanded or substituted by other groups such as researchers, service providers, or governmental agencies [2]. LA helps to improve teaching methods and learning activities, and to extract concealed knowledge about learners [3].
Monitoring student performance is one of the core applications of Learning Analytics. It tracks students' academic performance while the course is still in progress and intervenes when students are heading towards a disappointing academic outcome [4]. LA makes use of technologies such as educational data mining, machine learning, classical statistical analysis techniques, and social network analysis to accomplish its intended purposes [5]. Machine learning classifiers are among the tools that have proved productive in monitoring students' academic performance.
Some machine learning algorithms, particularly supervised ones, have been broadly used to build prediction models. A supervised algorithm constructs a classification model from the training dataset, and the model encodes the classification rules discovered in that data. The testing phase then runs the model on a subset of the labelled data held back for this purpose to evaluate its performance. The training dataset consists of instances (records), and each instance has a number of attributes. For example, gender, age, and grades are some of the students' attributes. Several authors preferred mining the attributes to arrive at a set of significant attributes, whereas others preferred to use the entire set of attributes [6,7]. The performance of the prediction model relies on the student attributes available in the dataset. Al-Sudani et al. [8] categorized the prediction attributes as academic, psychological, demographic, and others. Papadogiannis et al. [9] concluded that grades, demographics, and academic data other than grades are widely used prediction attributes. The review by Shahiri et al. [10] shows Cumulative Grade Point Average (CGPA) and internal assessments to be the most widely used prediction attributes.
One of the essential phases of prediction model development is the identification of suitable prediction attributes [11]. The performance of machine learning classifiers might degrade if the whole set of attributes is used [12], which suggests that a careful selection of predicting attributes tends to improve the performance of the prediction model [13,14]. Several feature selection algorithms assist the researcher in this phase. However, a chosen attribute may appear significant for the classification model yet be rigid in nature, so that the student cannot act on it to improve. The predicting attributes should therefore be flexible, so that the student has an opportunity to make improvements in each of them. For instance, gender may emerge as an appropriate predicting indicator, but its fixed nature leaves the student nothing to change. The selected attributes must act as proxies for learning [15]. Once a student is forecast to be likely to end up with an unsatisfactory outcome, the student must be able to rework and improve their performance on the prediction indicators.
The major contribution of this paper is to review the existing student performance prediction models and identify the prediction attributes preferred by the authors under distinct educational settings. The paper proposes a conceptual framework to categorize and subcategorize the prediction attributes based on whether they are meaningful, precise, and controllable by the student. The paper is organized as follows: Section 2 provides a literature review of learning analytics, machine learning, and student performance prediction models. Section 3 describes the various classes and levels of prediction attributes and presents the proposed framework. Section 4 provides an experimental evaluation to validate the conceptual framework, and Section 5 concludes the paper.

Literature review
Institutions are eager to collect more data than in the past to strengthen their strategic planning [16]. Despite possessing a huge amount of data, institutes require standard means of organizing the data and using it for appropriate decision making. Various computer technologies offer opportunities to transform complex educational material into a form that is easy to understand and remember [17]. Educational institutions adopt technologies such as Interactive Learning Environments (ILE), learning management systems (LMS), intelligent tutoring systems (ITS), and online learning platforms, which give access to a huge quantity of data about the students and the underlying learning environment [18,19]. The aim of transforming this large amount of data into constructive information underlines the significance of learning analytics [20]. Learning Analytics acquires data relevant to students and instructors at either the individual learner or the course level and applies analytic techniques to improve students' learning outcomes through better instructional, curricular, and supporting resources, interventions, and learning culture empowerment [21].
Monitoring individual student performance in a course is one of the key areas of LA application [22], and it appeared as one of the eight categories of instructional applications in a white paper entitled "Analytics for Achievement" published by IBM [22]. Monitoring is the observation and inspection of the progress or quality of something over time [23]. Student monitoring is an important factor in higher educational institutions, since the institutions themselves are systematically monitored by governmental bureaus and accreditation agencies [24,25]. Therefore, to maintain a strong reputation in the educational community, institutes ought to implement novel procedures to track students' academic progress. Such monitoring is a supportive tool that helps an instructor identify students with unsatisfactory academic progress. The instructor can then provide immediate interventions to help students understand that they are at risk and rework to improve. To achieve such goals, institutions need to put into action methodologies that predict the outcome of students based on their ongoing academic activities. Course Signals, implemented at Purdue University [26,27], is a well-known application that extracts data from several sources and applies statistical techniques to identify students at risk of failing their courses. The instructors then intervene and organize counseling sessions with the weak students.

Machine learning
Various tools are used to develop models capable of predicting students' outcomes. These models are an effective means of identifying students with a high probability of ending up with disappointing final results. Machine Learning [28] is considered a useful tool for developing prediction models. Machine Learning, a branch of Artificial Intelligence, refers to learning from previous data to improve future performance automatically without explicit programming [29]. The two major categories of machine learning algorithms are supervised and unsupervised. Supervised learning works with input-output pairs, and the aim is to estimate the mapping function well enough that it can predict the output for unseen data. Supervised learning can be either classification or regression: in classification the output is a category, while in regression it is a real value. Naive Bayes, Decision Trees, Support Vector Machines (SVM), and Linear Regression are some of the popular supervised learning algorithms. Unsupervised learning has only input data and must model the underlying structure of the data to learn more about it. Unsupervised learning tasks include clustering and association. The aim of clustering is to discover the inherent groupings in the data, while association rule learning discovers rules that describe large portions of the data. K-means and Apriori are popular clustering and association algorithms, respectively.
Supervised algorithms in particular have been used to develop models that accurately predict the characteristics of students that drive their behavior and performance [30]. Figure 1 illustrates how supervised learning works. It is mainly a two-step process of model development and validation. In the first step, the supervised classifier constructs a classification model from the input dataset (called the training dataset). The training dataset consists of instances (records), and each instance has a number of attributes, where an attribute denotes a single characteristic of the student that may influence their academic behavior. These attributes constitute the set of independent variables that forecast students' outcomes by labeling each student with the most probable category. The resulting classification model consists of the classification rules discovered from the training dataset. The subsequent step, called testing, runs the model on a subset of the labelled data held back for this purpose to determine the effectiveness of the model. Having learned from the training dataset, the model is then ready to classify instances from an unseen dataset (the validation dataset) [31]. The validation data consists of instances with unknown classes and is provided to the classification model as input. The model evaluates each instance and tags it with an appropriate class label.
Numerous studies use students' academic, demographic, and social attributes to develop prediction models that forecast students' outcomes from several prediction features at a specific stage of the semester [32]. Asif et al. [33] applied supervised classifiers to forecast the final result of university students at an early phase. Al-Sudani et al. [8] implemented an Artificial Neural Network to discover low-performing students at an early stage of the semester so that the university can propose suitable intervention procedures to decrease the attainment gap. Ghosh et al. [34] used lazy algorithms to identify students at risk of failing mathematics courses and forward the information to the corresponding instructors. Kausar et al. [35] used ensemble techniques to examine the relationship between students' semester courses and final results. Hussain et al. [36] conclude that the decision tree is a robust solution for recognizing students who exhibit low engagement during assessment activities. Similarly, Jishan et al. [37] used Naïve Bayes, Decision Tree, and Artificial Neural Networks to forecast students' final results before the final exam. A number of other models are based on decision trees [38,39], lazy classifiers [40], Artificial Neural Networks [6,41,42], Naïve Bayes [37,43], and Support Vector Machines [44,45].
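The two-step process described in this section (build a model from training data, then test it on held-out data) can be illustrated with a minimal sketch. It is not the method of any cited study: it assumes a 1-nearest-neighbour rule (a simple lazy classifier) and a hypothetical toy dataset with two attributes, attendance and midterm score, both invented here for illustration.

```python
import math
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the labelled rows and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def predict_1nn(train, features):
    """Return the label of the training instance closest to `features`."""
    nearest = min(train, key=lambda row: math.dist(row[0], features))
    return nearest[1]

def accuracy(train, test):
    """Fraction of held-out instances whose predicted label is correct."""
    hits = sum(predict_1nn(train, x) == y for x, y in test)
    return hits / len(test)

# Hypothetical toy data: (attendance %, midterm %) -> final outcome.
DATA = [
    ((95, 88), "pass"), ((90, 75), "pass"), ((85, 80), "pass"), ((92, 69), "pass"),
    ((40, 35), "fail"), ((55, 30), "fail"), ((35, 45), "fail"), ((50, 42), "fail"),
]

train, test = train_test_split(DATA)      # step 1: model development data
print(accuracy(train, test))              # step 2: testing on held-out rows
print(predict_1nn(DATA, (88, 70)))        # labelling an unseen instance
```

A lazy classifier keeps the training instances themselves as its "model", which makes the separation between training data and test data easy to see; a neural network or decision tree would replace only `predict_1nn`.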

Prediction attributes conceptual framework
In educational institutions, success is measured by students' academic performance, that is, by how well the students meet the standards set by the instructor and the institution [46,47]. In the dataset, each instance possesses several attributes. An attribute, or feature, represents a single characteristic of a student. Not all attributes are vital in designing prediction models, and a machine learning algorithm's performance can degrade if all of them are used. A selection of predicting attributes is therefore required to enhance the performance of the prediction model. The selected attributes must be able to measure student learning rather than merely boost student retention [48]. It is therefore essential to handle the raw data appropriately and recognize the attributes that are reliable for decision making [49]. This points to the need for appropriate and accurate indicators that are meaningful for predicting student performance. We provide a review of the attributes used for predicting students' final outcomes through machine learning classifiers.
In the light of this review, we believe that attributes can be classified, according to their nature, as either latent or dynamic. Latent attributes stay with the student, but the student has no control to modify them and thereby improve academic status. Dynamic attributes, on the other hand, not only measure the student's current status but can also be modified by the student. For example, age is an attribute that a student cannot alter and is thus a latent attribute. Grades in an assessment tool (such as an assignment), on the other hand, are dynamic attributes, as the student can rework to improve them and enhance academic standing. Further, there exist several levels of both latent and dynamic attributes. Below we explain the levels of attributes and justify whether each is latent or dynamic.

Latent attributes
3.1.1. Presage. This set of attributes is associated with the student's past academic record. Several authors used attributes that capture the type of previous institution or school, the medium of instruction in the school, and the type of program of study in the school. Another set of attributes stores the student's academic results in the previous school or institution. Students' grades at various milestones in school life, such as grade 10 and grade 12, are considered essential by several authors [50,51,52]. These attributes may be useful in models developed for forecasting student dropout in the early stages of higher education and for admission eligibility. Several authors [38,53,54,55] used the type and location of the school. Hamsa et al. [56,57] considered the student's admission score, and gap years [41,58], to be essential attributes in prediction models. However, because of their rigid nature, the student is never able to control these attributes.
3.1.2. Demographic. Statistically, demographic attributes are the quantifiable attributes of a population [59]. Student gender, age, disability, ethnicity, and place of birth are some of the widely addressed demographic attributes. Student age (year of birth) and gender are the most broadly used demographic attributes in prediction models in educational settings. Several studies [8,60,61] used nationality or place of birth, ethnicity, and disability among the prime prediction attributes.

3.1.3. Academic non-reactive. The academic non-reactive attributes may inspire a student to improve, but they are hard for the student to modify. For instance, the student's major of study, year of study, type of degree, and scholarship accompany the student throughout the academic period, but a change in such attributes is very rare. Several authors [6,62] used student ID, course ID, and section ID in their models. Several models [36,54,63] consider scholarship an essential attribute in student performance prediction.

3.1.4. Social behavior. These attributes relate to the student's academic and social life. They can be subdivided into family background and social behavior. The social behavior of the student may include attributes such as the student's employment status, the time spent with friends and family, social relationships, and interest in extracurricular activities. Several other attributes, such as student health issues, commuting, and stress management capability, may also affect a student's academic performance.

Dynamic attributes
3.2.1. Academic reactive. These attributes are related to the student's academic activities and are therefore useful for measuring the student's current academic status. The academic reactive attributes stay with the student throughout their studies and reflect the student's academic standing. The most broadly applied attributes are Cumulative Grade Point Average (CGPA), student attendance in the course, grades in assessment tools (such as quizzes, assignments, and lab work), and grades in the midterm exam [6,62,66]. Students are able to control these attributes to recover and improve their academic standing. Other attributes we suggest include the student's GPA in the previous semester, the number of subjects registered in the current semester, and grades in the prerequisite subject.

3.2.2. Psychometric attributes. The psychometric attributes capture the student's behavior, interests, and barriers with respect to their studies. Several authors [58,67] used psychometric attributes in designing performance prediction models. These attributes may be reactive, such as private tuition, internet access, and motivation for higher education, or non-reactive, such as homesickness, self-motivation, and extra abilities of the student. Managing these attributes carefully might improve the reactive attributes and thus enhance the student's academic performance.
Table 1 provides a summary of the various prediction models and the number of attributes they used from each of the categories. Table 2 provides a brief summary of each class of attribute, how frequently it appears in models, and the most broadly used attributes in each category.
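The classes and levels described in this section can be written down as a small lookup structure. The sketch below is illustrative only: the level names and example attributes are taken from the text, but the dictionary layout, the attribute spellings, and the function names `classify` and `split_by_class` are assumptions made here, not part of the proposed framework.

```python
# Class -> level -> example attributes named in the text.
# The exact attribute spellings are hypothetical column names.
FRAMEWORK = {
    "latent": {
        "presage": ["previous school type", "grade-12 result",
                    "admission score", "gap years"],
        "demographic": ["gender", "age", "ethnicity",
                        "place of birth", "disability"],
        "academic non-reactive": ["major of study", "year of study",
                                  "type of degree", "scholarship"],
        "social behavior": ["employment status", "commuting",
                            "time spent with friends and family"],
    },
    "dynamic": {
        "academic reactive": ["cgpa", "attendance", "quiz grade",
                              "assignment grade", "midterm grade"],
        "psychometric": ["private tuition", "internet access",
                         "motivation for higher education"],
    },
}

def classify(attribute):
    """Return (class, level) for a known attribute, or None if unlisted."""
    name = attribute.strip().lower()
    for attr_class, levels in FRAMEWORK.items():
        for level, members in levels.items():
            if name in members:
                return attr_class, level
    return None

def split_by_class(attributes):
    """Group dataset column names into latent, dynamic, and unknown."""
    groups = {"latent": [], "dynamic": [], "unknown": []}
    for attribute in attributes:
        found = classify(attribute)
        groups[found[0] if found else "unknown"].append(attribute)
    return groups

print(classify("CGPA"))
print(split_by_class(["age", "attendance", "shoe size"]))
```

Attributes that the lookup does not recognize are reported as unknown rather than guessed, so a researcher has to decide explicitly whether a new attribute is controllable by the student.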

Conceptual framework
We propose a conceptual framework to visualize the sets of prediction attributes and their impact on prediction models. The framework, shown in Figure 2, illustrates the two major classes of prediction attributes, which are further subdivided into levels. Some of the most widely used attributes from each category are listed. The proposed conceptual framework does not attempt to explain or describe every possible attribute and relationship among attributes in a machine learning prediction model. Rather, it provides a set of concepts that can be used to think about developing prediction models. It can be used as a heuristic tool to examine relationships among concepts and prediction models, develop additional lines of research, and raise new questions.
Fig. 2. The proposed conceptual framework to classify the available student attributes
iJIM - Vol. 15, No. 15, 2021
Attribute selection is among the key stages of developing machine learning prediction models. Although a number of techniques exist to rank the attributes in a dataset, our conceptual framework also emphasizes the nature of the attributes. Different levels of attributes tend to be beneficial in distinct situations. Dynamic attributes require additional weight when designing prediction models to forecast the final outcome of students. Such prediction is more beneficial if the prediction model is accompanied by an adapted recommendation module. The recommendation module informs students of their current academic standing, so that they have a chance to work hard and improve their academic position. This is more meaningful if the model is designed around dynamic attributes.
However, latent attributes are constructive in models that aim only to classify students, with no recommendation envisioned. For instance, forecasting the dropout rate [79,80] at an institute may draw on different levels of latent attributes. Similarly, several models, for instance [81], find latent attributes meaningful for judging the admission eligibility of students to a higher education institute. Latent attributes can also support the administration in decision making after useful patterns have been explored [53].

Experimental evaluation
In this section we apply a supervised learner, an Artificial Neural Network [82], to examine the distinction between latent and dynamic attributes. The dataset consists of the student academic records for a course taught at Al-Buraimi University College (BUC), Sultanate of Oman. There are a total of 151 instances in the training dataset, each with 12 attributes and one prediction class. The experiments are performed in the Waikato Environment for Knowledge Analysis (WEKA) with 10-fold cross-validation.
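For readers unfamiliar with 10-fold cross-validation, the following minimal sketch shows how 151 instances can be partitioned so that every instance is used for testing exactly once. It is an illustration under stated assumptions, not WEKA's implementation: the function name `k_fold_indices` and the fixed seed are hypothetical, and the sketch does not stratify folds by class, which WEKA's own procedure may do.

```python
import random

def k_fold_indices(n_instances, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each instance is tested once."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds, round-robin.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

splits = list(k_fold_indices(151, k=10))
fold_sizes = [len(test_idx) for _, test_idx in splits]
print(fold_sizes)  # one fold of 16 and nine folds of 15
```

In each round the model is trained on nine folds and evaluated on the remaining one, and the reported accuracy is the average over the ten rounds.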
Initially, we produced a model with only the latent attributes in the training dataset. As Figure 3 shows, the model achieves an accuracy of nearly 52%. The next experiment removed the latent attributes and built the model with only the entire set of dynamic attributes. The resulting increase in accuracy is evidence of the significance of dynamic attributes for better prediction performance.
To validate this further, attribute selection is performed with the Correlation Attribute Evaluator filter and Ranker search to reduce the number of attributes. The Correlation-based Feature Selector (CFS) measures Pearson's correlation between each attribute and the prediction class, so unrelated features receive low correlation values. Table 3 lists the attributes with their description, class, and correlation values. The final model, built after attribute selection, achieves slightly better accuracy. These experiments confirm the significance of dynamic attributes over latent attributes for student performance prediction modeling. Most of the latent attributes have correlation values below 0.15, whereas the dynamic attributes have higher values. Non-reactive attributes have extremely low correlation compared to other latent attributes. Dynamic attributes have the highest correlation values, which confirms their significance in prediction models.
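The correlation-based ranking described above can be sketched in a few lines. This is a hedged illustration rather than WEKA's code: the toy columns and class labels are invented, the function names are hypothetical, and using the absolute correlation with a cut-off of 0.15 is an assumption that borrows the value mentioned in the text as a convenient threshold.

```python
import math

def pearson(xs, ys):
    """Pearson's correlation coefficient between two numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)

def rank_attributes(columns, target, threshold=0.15):
    """Rank attributes by |correlation| with the class; drop weak ones."""
    scores = {name: abs(pearson(values, target))
              for name, values in columns.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, score) for name, score in ranked if score >= threshold]

# Hypothetical toy data: class 1 = pass, 0 = fail.
TARGET = [1, 1, 1, 1, 0, 0, 0, 0]
COLUMNS = {
    "gender":     [0, 1, 0, 1, 0, 1, 0, 1],          # latent
    "attendance": [95, 85, 90, 60, 65, 70, 50, 80],  # dynamic
    "midterm":    [80, 75, 90, 70, 40, 35, 50, 45],  # dynamic
}

for name, score in rank_attributes(COLUMNS, TARGET):
    print(f"{name}: {score:.2f}")
```

In this toy data the latent attribute is uncorrelated with the class and is filtered out, while the two dynamic attributes are retained in ranked order.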

Conclusion
Machine learning algorithms are constructive tools for supporting Learning Analytics: they build prediction models capable of forecasting the final outcome of students based on their key attributes. The dataset usually suffers from high dimensionality, and not all attributes play a vital role in the prediction process. A careful selection of predicting attributes can improve the performance of the resulting model. However, it is necessary to consider the nature of the selected attributes, especially if the prediction model is accompanied by recommendation practices. This paper lists the attributes used in student performance prediction models and proposes a conceptual framework that classifies the attributes as either latent or dynamic. The conceptual framework provides a set of concepts for researchers developing prediction models. Latent attributes may emerge as essential prediction indicators, but students are unable to modify and improve them. Dynamic attributes, on the other hand, are well within the students' control. An experimental evaluation confirms the importance of dynamic attributes in student performance prediction modeling. We applied an artificial neural network to a dataset to produce models with all latent attributes and with all dynamic attributes, and in a third experiment used the Correlation-based Feature Selector algorithm to select the attributes with the highest Pearson correlation values. The experiments illustrate the significance of the dynamic attributes.