A Rule Induction Framework on the Effect of ‘Negative’ Attributes to Academic Performance

Attaining high retention rates is a predominant issue among engineering institutions, as a significant portion of engineering students face retention challenges. Academic advising was implemented to resolve the issue, and decision support systems were developed to support the endeavor. Machine learning has been integrated into such systems to predict student performance accurately. Most works, however, rely on a black box model approach. Rule induction generates simpler if-then rules that are easier to understand. As most research works considered attributes for positive academic performance, there is a need to consider ‘negative’ attributes, which are critical indicators of possible failure. This work applied rule induction techniques for course grade prediction using ‘negative’ attributes. The dataset is the academic performance of 48 mechanical engineering students taking a machine design course. Students’ attributes on workload, course repetition, and incurred absences are the predictors. This work implemented two rule induction techniques, rough set theory (RST) and adaptive neuro-fuzzy inference system (ANFIS). Both models attained a classification accuracy of 70.83%, with better performance for course grades of ‘Pass’ and ‘High’. RST generated 16 crisp rules while ANFIS generated 27 fuzzy rules, yielding significant insights. Results of this study can be used for comparative analysis of student traits between institutions. The illustrated framework can be used in formulating linguistic rules for other institutions.
Keywords: machine learning, academic advising, mechanical engineering


Introduction
Student retention is an indicator of an educational institution's performance. However, many engineering educational institutions experience low student retention. Ref. [1] reports that one in three engineering students graduates on time, while one in two takes longer to graduate. Academic support systems were implemented to address this problem. Among them, academic advising is widely implemented. The model's approximation is reasonably sound. Rule induction algorithms overcome this barrier through formulation of simple if-then rules [20]. Algorithms such as adaptive neuro-fuzzy inference system (ANFIS) and rough set theory (RST) were identified in the literature as promising for rule induction. Ref. [21] argued that ANFIS has potential for rule induction as fuzzy logic systems are notably interpretable. Meanwhile, [22] argued that RST can generate simple if-then rules with high interpretability and can also handle asymmetric and incomplete datasets.
ANFIS was used in [23] to develop a student classification tool for predicting student performance, highlighting the student's interest, talent, and motivation as key attributes. RST has shown relatively high accuracy in predicting student performance. Ref. [24] reported an accuracy of 90.50% in predicting the performance of students in programming. Ref. [25] achieved an accuracy of 98.30% in predicting final course grades based on class performance.
Attributes used in previous works have served as benchmarks in the development of decision support systems (DSS). Notably, attributes affecting positive or higher academic performance were considered. Improving the retention rate, however, requires understanding the attributes that result in failure. Ref. [26] noted that resolving the dropout rate requires analyzing the underlying factors affecting poor performance. As evident in the literature, existing works assessed the causal factors of academic failure. Ref. [27] found that stress overload impacts the failure rate of freshman students. Meanwhile, [28] identified 'negative' academic attributes as having a significant effect on poor performance. They determined that such attributes can originate from socio-demographic, academic, psychological, and health factors.
As 'negative' attributes are required in identifying 'at-risk' students, understanding their relationship to academic performance is necessary. ML techniques can recognize patterns in the relationship and provide a highly accurate model approximation. However, black box model approaches inhibit verification of the model's internal reasoning. The lack of verification is a barrier to use by academic advisers. Rule induction techniques circumvent the problem through formulation of linguistic rules, which eases insight generation. Therefore, this work focuses on rule induction techniques for engineering student performance using 'negative' attributes as predictors. The 'negative' attributes considered in this work are workload, course repetition, and absences. The selection process of the attributes is further elaborated in Section 3. This work is an extension of our previous work, [29]. Whereas our previous work focused on rule induction using RST, this work also considers ANFIS as a rule induction technique. The advantage of ANFIS is its ability to generate fuzzy rules that account for uncertainty.
We also present here the DSS framework for implementing the rule induction techniques. The framework aims to predict academic performance of at-risk students. The framework can either be used as an offline tool or as a software-based tool, such as [30].
The remainder of this paper is organized as follows. Section 2 details the DSS framework and the theoretical background of the rule induction techniques. Section 3 discusses the characteristics of the data used and the attributes' categorization. Section 4 provides the details of the two techniques' performance, the formulated rules, and the tuned membership functions of ANFIS. Section 5 discusses the insights obtained from the formulated rules and their implications. Lastly, Section 6 summarizes this work and enumerates suggested future works.

Decision Support System Framework
The proposed DSS framework is updated on a per academic term basis. Figure 1 depicts this framework. The framework is designed to integrate the institution's academic records, illustrated as the historical database. These records contain students' metrics on academic performance, such as attendance, course grades, year level, and units enrolled. Relevant metrics are retrieved from the records and used by the framework. Patterns in the database are then recognized through rule induction techniques, which yield rules relating academic attributes to academic performance. The rules are provided to the academic adviser to generate insights on student characteristics. Combined with the adviser's personal perspective, the additional insights help the adviser formulate robust advice. After each term, the students' performance is tallied in the academic records.

Adaptive Neuro Fuzzy Inference System
ANFIS is a rule induction technique integrating core concepts of ANN and fuzzy logic [31]. Developed by [32], the technique utilizes five nodal layers in its architecture. The architecture is depicted in Figure 2, where data flows from input to output. The five layers perform fuzzification, rule firing, normalization, defuzzification, and summation.

Fig. 2. Visual representation of ANFIS architecture
For simplicity, the discussion focuses on a scenario with two inputs, x and y, and one output, z. Input x has possible attributes $A_i$ while input y has possible attributes $B_i$. The work of [33] provides a general overview of the architecture's algorithmic process. The first layer fuzzifies the inputs. This function is represented by eq. (1), where $O_i^1$ indicates the output of the ith node in the first layer. Calculation of the layer's output is based on the membership function, μ(x), depicted in eq. (2). The variables a, b, and c are termed the premise parameters and determine the membership function's form.
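For reference, a common form of the first layer in the ANFIS literature is sketched below. The generalized bell shape of μ is an assumption, chosen because it uses exactly three premise parameters a, b, and c; the membership function of eq. (2) may take a different three-parameter shape, such as a triangular function.

$$
O_i^1 = \mu_{A_i}(x), \qquad \mu_{A_i}(x) = \frac{1}{1 + \left| \dfrac{x - c_i}{a_i} \right|^{2 b_i}}
$$

Under this assumed form, $c_i$ sets the center of the membership function, $a_i$ its width, and $b_i$ the steepness of its shoulders.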
The output of the second layer, $O_i^2$, is depicted in eq. (3). The layer's function is to fire the proper rules of the ANFIS architecture. The rules are generated during the tuning procedure. The outputs $O_i^2$ are then normalized in the third layer using eq. (4).
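A hedged sketch of the second and third layers, following the usual ANFIS formulation, is given below. The symbols $w_i$ (firing strength) and $\bar{w}_i$ (normalized firing strength) are introduced here for illustration, and the product is assumed as the AND operator; other T-norms, such as the minimum, are also used in practice.

$$
O_i^2 = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \qquad O_i^3 = \bar{w}_i = \frac{w_i}{\sum_j w_j}
$$

Normalization makes the rule weights sum to one, so each $\bar{w}_i$ can be read as the relative contribution of rule i to the final output.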
Normalized outputs from the third layer, $O_i^3$, are defuzzified in the fourth layer. The computed output, $O_i^4$, is obtained from eq. (5), where $f_i$ are the yields of ANFIS' if-then rules such that if x is $A_i$ and y is $B_i$ then $f_i$. The variables p, q, and r are termed the consequent parameters. The final output, z, is the fifth layer's output, $O_i^5$, and is calculated through summation of $O_i^4$, as represented by eq. (6).
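The five layers can be traced end to end in a few lines of code. The following is a minimal sketch, not the MATLAB implementation used in this work. It assumes a generalized bell membership function, a product AND operator, and first-order consequents $f_i = p_i x + q_i y + r_i$, so that $O_i^4 = \bar{w}_i f_i$ and z is the sum of the $O_i^4$. The function names and all parameter values are hypothetical.

```python
import numpy as np

def bell(x, a, b, c):
    """Generalized bell membership function (assumed shape, premise parameters a, b, c)."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def anfis_forward(x, y, premise_x, premise_y, consequent):
    """Forward pass for two inputs and one output; rule i pairs A_i with B_i.

    premise_x, premise_y: shape (n_rules, 3), rows of (a, b, c).
    consequent: shape (n_rules, 3), rows of (p, q, r).
    """
    premise_x = np.asarray(premise_x, dtype=float)
    premise_y = np.asarray(premise_y, dtype=float)
    consequent = np.asarray(consequent, dtype=float)

    # Layer 1: fuzzification of each input
    mu_a = bell(x, premise_x[:, 0], premise_x[:, 1], premise_x[:, 2])
    mu_b = bell(y, premise_y[:, 0], premise_y[:, 1], premise_y[:, 2])
    # Layer 2: rule firing strength (product AND, an assumption)
    w = mu_a * mu_b
    # Layer 3: normalization of firing strengths
    w_bar = w / w.sum()
    # Layer 4: weighted first-order consequents f_i = p*x + q*y + r
    f = consequent[:, 0] * x + consequent[:, 1] * y + consequent[:, 2]
    o4 = w_bar * f
    # Layer 5: summation into the single output z
    return float(o4.sum())

# Hypothetical two-rule system with constant consequents 1 and 3
premise = [[1.0, 2.0, 0.0], [1.0, 2.0, 2.0]]
consequent = [[0.0, 0.0, 1.0], [0.0, 0.0, 3.0]]
z_mid = anfis_forward(1.0, 1.0, premise, premise, consequent)
```

At the midpoint between the two membership centers both rules fire equally, so the output is the average of the two consequents. Moving the inputs toward one center shifts the output toward that rule's consequent.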

Rough Set Theory
RST is a rule induction technique integrating approximations on the boundaries of ordinary sets [34]. The approximations provide better classification accuracy by relaxing the boundaries, as illustrated in Figure 3. Ref. [35] provides a detailed discussion of the algorithmic process of RST. The technique follows an information system space IS, which contains the inputs and the attributes. The input set is denoted as U while the attribute set is denoted as A. Set U consists of the input variables $x_i$ while set A contains the attributes $a_i$. The output set, on the other hand, is denoted as D and contains the output variables $d_i$.
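The boundary approximations can be illustrated on a toy information system. The sketch below follows the standard rough set definitions of lower and upper approximation rather than the ROSETTA implementation used in this work. The objects, attribute values, and function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def indiscernibility_classes(table, attributes):
    """Group objects of U that share the same values on the chosen attributes of A."""
    classes = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj)
    return list(classes.values())

def approximations(table, attributes, target):
    """Return the lower and upper approximations of a target set of objects."""
    lower, upper = set(), set()
    for block in indiscernibility_classes(table, attributes):
        if block <= target:   # block lies entirely inside the target: certain members
            lower |= block
        if block & target:    # block overlaps the target: possible members
            upper |= block
    return lower, upper

# HYPOTHETICAL toy information system (not the case study data)
table = {
    "x1": {"workload": 1, "absences": 3, "grade": "Fail"},
    "x2": {"workload": 1, "absences": 3, "grade": "Pass"},
    "x3": {"workload": 2, "absences": 1, "grade": "Pass"},
    "x4": {"workload": 3, "absences": 1, "grade": "High"},
    "x5": {"workload": 1, "absences": 2, "grade": "Fail"},
}
fail = {obj for obj, row in table.items() if row["grade"] == "Fail"}
lower, upper = approximations(table, ["workload", "absences"], fail)
boundary = upper - lower
```

In this toy table, x1 and x2 have identical attribute values but different grades, so they fall in the boundary region. A rule induced for that attribute combination would therefore carry more than one possible outcome.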

Case Study
The historical dataset was extracted from the academic database of the institution, FEU-Institute of Technology. Characteristics of 48 mechanical engineering students were used as attributes for the rule induction techniques. Course grades from their machine design course were assigned as the decision variable. The machine design course is an engineering course engaging students in the analysis of mechanical properties and mechanical stress. The course is ideally taken by third-year mechanical engineering students. Tasks and exams of the course require extensive computational and analytical skills.
The 'negative' academic attributes identified in the case study are workload, course repetition, and incurred absences. Selection of the three attributes was based on insights from previous works and their availability in the academic records. Table 1 enumerates the attributes and the decision variable of this work. The decision variable, or output, was categorized according to the institution's grading scheme, with 1.0 as the passing grade and 4.0 as the perfect grade.
The first attribute considered is the student's workload. Workload reflects the weight of tasks assigned to the student. This attribute affects the mental stress experienced by engineering students [36]. Stress caused by significant workload results in poor academic performance. As [37] highlighted, workload is positively correlated with test anxiety, potentially resulting in poor test results. Ref. [38] also noted that increased workload can lower student motivation. There were cases, however, where the contrary is true. Ref. [39] found that the attribute acts as the initial step towards efficient learning, arguing for its relevance in longitudinal research. Meanwhile, [40] argued that workload positively affects academic performance when supported by student interest and teaching quality. As workload has significant implications for academic performance, the case study considered it for rule induction. In the case study, workload is reflected as the number of units enrolled. The number of units enrolled is a composite indicator of time spent in class, weight of assignments, and course difficulty. The metric provides a general approximation of student workload. The ranges of the category levels were based on the existing workload categories of the institution.
The second attribute is the student's course repetition. Course repetition is indicated as the number of times the student enrolled in the subject. Ref. [41] argued that the attribute can result in demotivation, leading to poor academic performance. Ref. [42] noted that students who retook classes are likely to drop out. Meanwhile, [43] found that students who repeated a course in economics attained lower course grades relative to their peers. However, some works found a positive effect of course repetition. Findings of [42] revealed higher scores for retakers relative to first-timers. The findings suggest that with proper motivation, retakers have a good chance of attaining a significantly better grade. Ref. [44] supported this finding, as students who took a finance class for the second time attained higher course grades. Overall, course repetition may reduce student motivation or improve learning ability, depending on the student's inherent traits. As course repetition can result in reduced motivation and an increased chance of dropping out, this work considers the attribute. In the case study, the range for this attribute's category levels was based on the perceived common course repetition among students.
The third attribute is the student's incurred absences. Incurred absences are instances when the student did not attend class. In the literature, the attribute has a notable influence on poor course grades. Ref. [45] found a high correlation between low attendance rate and low course grade. They suggested compulsory attendance for higher academic rating. Ref. [46] had a similar finding for civil engineering students in Ireland. Ref. [47] discovered that even among graduate students, high incurred absences correlate with lower course grades. Lastly, Ref. [48] found that absences led to lower grades in a calculus class. Predominantly, incurred absences have a negative impact on course grades. This work, therefore, considered the attribute in the case study. The range for the category levels of this attribute was based on the maximum allowable absences of the institution.
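The categorization of the three attributes into ordinal levels can be sketched as a simple threshold lookup. The cut points below are hypothetical placeholders and do not reproduce the ranges of Table 1. The assumption of three levels per attribute, and the names `CUT_POINTS`, `categorize`, and `encode_student`, are likewise introduced only for illustration.

```python
from bisect import bisect_right

# HYPOTHETICAL cut points, for illustration only; the institution's actual
# ranges are those listed in Table 1.
CUT_POINTS = {
    "units_enrolled": [15, 21],  # <15 -> level 1, 15-20 -> level 2, >=21 -> level 3
    "times_enrolled": [2, 3],    # 1 -> level 1, 2 -> level 2, >=3 -> level 3
    "absences": [3, 7],          # 0-2 -> level 1, 3-6 -> level 2, >=7 -> level 3
}

def categorize(attribute, value):
    """Map a raw value to an ordinal category level (1, 2, or 3)."""
    return bisect_right(CUT_POINTS[attribute], value) + 1

def encode_student(units_enrolled, times_enrolled, absences):
    """Encode one student as (workload, course repetition, incurred absences) levels."""
    return (
        categorize("units_enrolled", units_enrolled),
        categorize("times_enrolled", times_enrolled),
        categorize("absences", absences),
    )

example = encode_student(units_enrolled=12, times_enrolled=1, absences=4)
```

Encoding each student as a tuple of levels gives both rule induction techniques the same discrete view of the records, which makes their rule tables directly comparable.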
The software ROSETTA generated the crisp rules using RST, while MATLAB generated the fuzzy rules using ANFIS. A 70:30 training-to-testing split was used for the models' training and testing phases. Figure 4 shows the membership functions of the category levels in the first layer of ANFIS. The y-axis depicts the degree of truth and the x-axis depicts the attribute's value. The parameters a, b, and c of the membership functions were tuned during the training phase. Tuning of the three parameters resulted in ranges similar to the predefined category levels enumerated in Table 1. The ranges, therefore, coincide with the guidelines enumerated in Section 3.
Meanwhile, Figures 5 and 6 show the confusion matrices obtained from both rule induction techniques. As shown in the figures, both models yielded the same classification accuracy of 70.83%. Predictions of 'Pass' and 'High' have higher accuracies than predictions of 'Fail'. The models, therefore, are more reliable in identifying passing students than failing ones.
The RST model formulated 16 crisp rules, as tabulated in Table 2. The table shows the possible course grades according to each attribute's category levels. Owing to the methodological structure of RST, each rule can yield one or more outputs. Isolating a single output, however, is necessary, so the output with the highest accuracy rating during the training phase is assigned as the prediction output. A limitation of this model's rules lies in the scope of the training data: rules for combinations not evident in the training data were not formulated. The limitation was observed in the case of a data point with workload of '3 -Overload', course repetition of '1 -None', and incurred absences of '2 -Normal'.
The ANFIS model formulated 27 fuzzy rules, as tabulated in Table 3. Unlike the crisp rules, the fuzzy rules are not directly translatable and require defuzzification. The fuzzy rules, however, still provide insights on course grade prediction.
Numerical outputs are tabulated in Table 3 and rounded off, similar to the illustration of Figure 5. The rounded-off values, when compared with the predictions of Table 2, show similar results with the exception of ANFIS' rule no. 1. That rule yields a course grade of '2 -Pass' while RST's rule no. 1 yields '1 -Fail'. The difference between the rules results from the RST rule having three possible course grades. An advantage of the ANFIS model is its ability to produce an output even for combinations not evident in the training data. This is evident in the case with workload of '3 -Overload', course repetition of '1 -None', and incurred absences of '2 -Normal', for which ANFIS' rule no. 20 provides a fuzzy rule.
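The contrast between the two rule tables can be sketched in code. The example assumes three category levels per attribute, which gives 3 × 3 × 3 = 27 combinations and matches the number of fuzzy rules. It also assumes the rules are numbered with workload varying slowest and incurred absences fastest; under that ordering the '3 -Overload', '1 -None', '2 -Normal' combination is rule no. 20, consistent with the text. The crisp rule fragment, its accuracy values, and all function names are hypothetical.

```python
import math
from itertools import product

LEVELS = (1, 2, 3)

# One fuzzy rule per combination of (workload, course repetition, absences) levels
grid = list(product(LEVELS, repeat=3))
rule_number = {combo: i for i, combo in enumerate(grid, start=1)}

# HYPOTHETICAL fragment of a crisp rule table: each rule maps a combination to
# its possible course grades with their training accuracies.
crisp_rules = {
    (1, 1, 2): {1: 1.0},
    (2, 1, 1): {2: 0.6, 3: 0.4},
}

def crisp_predict(combo):
    """Return the grade with the highest training accuracy, or None if no rule exists."""
    outcomes = crisp_rules.get(combo)
    if outcomes is None:
        return None  # combination not evident in the training data
    return max(outcomes, key=outcomes.get)

def fuzzy_to_grade(z):
    """Round a numeric ANFIS output half-up to the nearest grade level in 1..3."""
    return min(3, max(1, math.floor(z + 0.5)))

unseen = (3, 1, 2)  # '3 -Overload', '1 -None', '2 -Normal'
```

A crisp lookup has nothing to return for a combination absent from training, whereas a grid of fuzzy rules always yields a numeric output that can be rounded to a grade level. That output is an extrapolation, however, and should be read with care when no training data supports the combination.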

Discussion
From the formulated rules, workload of '1 -Underload' predominantly predicts a course grade of '1 -Fail'. This insight runs counter to the intuitive effect of workload on academic performance. Ideally, less workload should improve student concentration and result in higher course grades. The counterintuitive finding may result from an underlying attribute unobservable with the current selection. Underloaded students may have enrolled in fewer units due to external duties and responsibilities. Meanwhile, RST's rule no. 2 is of particular concern as the rule predicts failure at 100% accuracy. The rule is further supported by the 'in-between' generalization of ANFIS' rule no. 2. As pointed out by ANFIS' rules no. 1 to 3, higher incurred absences with workload of '1 -Underload' and course repetition of '1 -None' yield a higher chance of failure. Coupled with the failure rate among underloaded students, a high rate of incurred absences may indicate that the student's focus is affected by external attributes.
Under RST's rules for workload of '2 -Normal Load', course grades are predominantly '2 -Pass'. Students under this category may experience less demand from external attributes. In the case of ANFIS' rules for this workload level, some rules indicate a course grade of '1 -Fail'. The low output may be attributed either to hidden attribute interactions observed elsewhere or to nonexistent combinations. Additional data will be needed to distinguish the two. Meanwhile, RST's rules no. 5 to 7 show that incurred absences have an inverse effect on course grade. The trend is similar to the earlier discussion on ANFIS' rules no. 1 to 3.
For rules with workload of '3 -Overload', course grade predictions are largely '2 -Pass' and '3 -High'. The trend indicates that workload is a positive attribute for course grade. Students taking higher workloads may possess the confidence and willingness to perform exceptionally. Advisers can utilize this insight in gauging students' commitment to accomplishing their enrolled units. Meanwhile, RST's rules no. 15 and 16 are among the notable ones in this workload category. The two rules depict attributes at their extremes, such that course repetition and incurred absences are at their highest. Even though the attributes are at their extremes, course grades are either '2 -Pass' or '3 -High' at 100% accuracy. The two rules further highlight students' confidence in their enrolled course.
The two techniques attained satisfactory prediction performance. Some advisers, however, may find the number of rules difficult to utilize. Reducing the number of rules may be considered in future case studies to ease comprehension [49].

Conclusion
This study presented a framework on rule induction techniques for academic performance prediction using 'negative' attributes. Two rule induction techniques, ANFIS and RST, were utilized. Overall, both models attained an accuracy of 70.83%, with RST yielding 16 crisp rules and ANFIS yielding 27 fuzzy rules. The generated rules yielded insights on student characteristics and their course grades. Workload showed an effect opposite to that expected of a 'negative' attribute, with higher workload associated with higher course grades, possibly highlighting the underlying effect of students' confidence. Incurred absences showed the expected effect, with more absences associated with lower course grades, possibly indicating the attribute's correlation with students' enthusiasm.
Limitations of the current work are the lack of data and the complexity of the rule tables. The data considered the academic performance of students taking a machine design course. Inclusion of other courses, such as those requiring different skillsets, may yield dissimilar rule sets. Future works may consider other courses as the case study. The data is also limited to one term. As time progresses, additional data can be included. The addition will yield attribute combinations that are currently not evident. The rule tables are complex, with 16 crisp rules and 27 fuzzy rules generated. Reducing the number of rules can provide simpler insight generalization and ease of use for academic advisers.