An Intelligent Adaptive cMOOC “IACM” for Improving Learner’s Engagement

Morocco
Abstract: Despite the massive number of enrollments in MOOC (Massive Open Online Course) platforms, dropout rates are very high. This problem can be due to several factors: social, pedagogical, prior knowledge, and demotivation. To deal with this type of problem, we have designed a cMOOC (Connectivist MOOC) platform that adapts to each registered learner's profile. From the first human-machine interaction, the process adapts to the learner's needs according to a pre-established model. It is based on the processing of collected statistical data by correspondence analysis and regression algorithms. Each generated learner profile will provide adaptive navigation and pedagogical activities. The intelligent system presented in this work will be able to classify learners according to their preferences and learning


Introduction
Learning is getting easier with digital resources deployed on the web [1]. Among these accessible resources we find MOOCs (Massive Open Online Courses) [2]. By developing MOOCs, universities have facilitated and accelerated access to high-level learning, free or at a very low cost. However, the dropout rates of these courses are strikingly high. According to J. Daniel, the completion rate does not exceed 7% [3]. This rate is corroborated by the Software Engineering course offered by the University of California Berkeley on the Coursera platform: of more than 50 000 subscribers, only 7% were able to complete the course [4][5].
This dropout rate [6] can have several origins: demographic (sex, age, level of education, location, etc.), pedagogical [7], social [8], prior knowledge [9] of the subject of the MOOC and the motivation [10][11] to pursue a MOOC [12]. The last factor is of great importance, as designers try to develop more advanced tools to attract learners [13]. All these works try to understand the learners' willingness and thus retain them in the course [14][15]. Historically, since its creation in 1989 by Tim Berners-Lee [...] The results of this paper are based on a survey administered to students from the Abdelmalek Essaâdi University of Tetuan (AEU) in Morocco [34].

Research Method

To reach the main objective, we shall respond to the following research questions:
RQ1: What are the minimum inputs to request from AEU students, to build an adaptive cMOOC to their needs?
RQ2: How do the selected factors influence the interest in collaborative learning among AEU students?
RQ3: How will the machine learn to provide an adaptive navigation, taking into account each learner's needs?
The answers to these questions will allow us to build a model adapted to each profile, so that we can offer learners attractive MOOCs according to their profiles. RQ1 aims to identify the minimal set of inputs that affect the learning process via an adaptive cMOOC platform. RQ2 aims to determine the factors that most influence the interest in collaborative learning among AEU students. These parameters will constitute the initial inputs that the learner must enter when registering for the first time. RQ3 aims to explain the architecture and the interactions between the different layers and factors of the adaptive system during the learning process via the platform.

Empirical study
To identify our students' preferences regarding socio-constructivist learning, we analysed the answers to a survey addressing the problem raised. It was administered to a representative sample of 383 students. We used stratified random sampling, which gives a better representation in our case. The 13 establishments of the AEU were treated as strata and the draw was made within each of them; the diversity of the fields of study was treated as a second level of strata.
To ensure that the heterogeneous student population (Arabic-speaking and French-speaking) understood the survey, we prepared Arabic and French versions validated by a pilot test group. In this work, we are interested in the social factors inducing the acceptance of MOOCs as being essential factors for learning or a complement to other soft skills. These factors are: sex, age category, professional situation, establishment of the AEU, diploma obtained, field of study, comfort with technology and prior knowledge about MOOCs. To achieve the objective of our study, we used Correspondence Analysis (CA) algorithms to reduce the number of factors influencing learning through an adaptive cMOOC.
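The first level of the stratified draw described above can be sketched as follows. This is a minimal illustration, assuming proportional allocation across strata; the source does not state the allocation rule, and the three establishments and their sizes below are hypothetical placeholders for the 13 AEU establishments. The second level of strata (fields of study) is not modelled.

```python
import random

def proportional_allocation(strata_sizes, sample_size):
    """Split sample_size across strata in proportion to their sizes.

    Uses largest-remainder rounding so the allocations sum exactly
    to sample_size. Proportional allocation is an assumption here.
    """
    total = sum(strata_sizes.values())
    exact = {name: sample_size * size / total for name, size in strata_sizes.items()}
    alloc = {name: int(share) for name, share in exact.items()}
    leftover = sample_size - sum(alloc.values())
    # Give the remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda name: exact[name] - alloc[name], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc

def stratified_sample(population, sample_size, seed=0):
    """Draw a simple random sample inside each stratum.

    population maps a stratum name to the list of its student ids.
    """
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in population.items()}
    alloc = proportional_allocation(sizes, sample_size)
    return {name: rng.sample(population[name], alloc[name]) for name in population}

# Hypothetical strata: three establishments with made-up enrolment figures.
population = {
    "establishment_A": [f"A{i}" for i in range(5000)],
    "establishment_B": [f"B{i}" for i in range(3000)],
    "establishment_C": [f"C{i}" for i in range(2000)],
}
sample = stratified_sample(population, 383)
for name, drawn in sample.items():
    print(name, len(drawn))
```

Fixing the seed makes the draw reproducible, which is convenient when the sampling frame has to be audited later.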

Factor analysis and linear regression
The CA allowed us to analyse the links between the qualitative variables and to reduce the factors to those which maximize the explanation of all the variables. We opted for this data analysis method in order to optimize the number of input parameters that will affect the learning process via an adaptive cMOOC platform based on the socio-constructivist approach. A linear regression equation will be presented linking the input variables (the socio-constructivist factors Xn: sex, age category, professional situation, establishment of the AEU, diploma obtained, field of study, comfort with technology and prior knowledge about MOOCs) and the output variable (interest in MOOC learning based on the socio-constructivist approach, Yi).
Yi is the interest in MOOC learning, captured by three output variables:
Y1: Interest in collaborative learning
Y2: Interest in contributing to the design or the construction of the course
Y3: Having social media accounts for learning reasons
The linear regression equation takes the form:
Yi = β1 X1 + β2 X2 + ... + βn Xn + εi
This paper considers only the results concerning Y1. The selection of the main factors is based on Cronbach's alpha as a reliability index, which must exceed the minimum acceptance threshold of 0.65 [35], and on the percentage of explained variance for eigenvalues greater than 1 [36][37].

Reduction of factors
Using SPSS software, we retained two dimensions, which provide a good deal of information. These dimensions satisfy the conditions given: Cronbach's alpha is good for dimension 1 (0.767) and acceptable for dimension 2 (0.656). The eigenvalues are 3.04 and 2.35 respectively, and the percentage of explained variance is around 38% for dimension 1 and 29% for dimension 2.
Phase 1: In order to reduce the factors, we relied on the discrimination measure crossing the two dimensions with the explanatory variables. This measure identifies the factors that least influence the explained result and can therefore be removed without losing overall information (see Table 1 and Figure 1). The component matrix (see Table 1) indicates that the items Sex, Profession, Comfort with Technology and Prior Knowledge about MOOCs have little influence on the two dimensions retained.
Phase 2: We eliminated the factors whose discrimination measure was weak and repeated the analysis, which gave the following results. Cronbach's alpha improved considerably: it became very good for dimension 1 (0.820) and good for dimension 2 (0.720). In addition, the eigenvalues became 2.599 and 2.175 respectively, and the percentage of explained variance went from 38% to 65% for dimension 1 and from 29% to 54% for dimension 2. The discrimination analysis of the retained factors shows a clear improvement in the correlations between these factors and the variables. In this second phase (see Table 2), the four selected factors have an improved explanatory influence: Age Category, Establishment of the AEU, Diploma Obtained and Field of Study. This is reflected in the average explained variance, which went from 33% to 59%.
Once the reduction of the factors is established, we proceed to an explanatory regression analysis based on the ANOVA method [38][39], in order to establish the conceptual model of the learners' needs based on the socio-constructivist approach.
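The figures reported for the two phases can be cross-checked from the eigenvalues alone. The sketch below assumes that the explained variance of a dimension is its eigenvalue divided by the number of analysed variables, and that the reliability reported by SPSS is the eigenvalue-based Cronbach's alpha, alpha = M(λ - 1) / ((M - 1)λ), with M the number of variables. The source does not state these formulas; they are assumptions that happen to reproduce the reported values up to rounding.

```python
def explained_variance(eigenvalue, n_variables):
    """Share of total variance accounted for by one dimension."""
    return eigenvalue / n_variables

def cronbach_alpha_from_eigenvalue(eigenvalue, n_variables):
    """Eigenvalue-based Cronbach's alpha (assumed to be what SPSS reports)."""
    m = n_variables
    return m * (eigenvalue - 1) / ((m - 1) * eigenvalue)

# Eigenvalues reported in the text for each phase.
phases = {
    "phase 1 (8 factors)": (8, [3.04, 2.35]),
    "phase 2 (4 factors)": (4, [2.599, 2.175]),
}

for label, (m, eigenvalues) in phases.items():
    for dim, lam in enumerate(eigenvalues, start=1):
        var = explained_variance(lam, m)
        alpha = cronbach_alpha_from_eigenvalue(lam, m)
        print(f"{label}, dimension {dim}: variance {var:.1%}, alpha {alpha:.3f}")
```

Under these assumptions, dropping four weakly discriminating factors halves the denominator while the eigenvalues shrink only slightly, which is why both the explained variance and the reliability rise in phase 2.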

Analysis of preferences based on collaborative learning
The ANOVA method allows us to know if the dependent variable (variable to be explained) will be influenced by the independent ones (the factors retained).
Linear regression minimizes the sum of squared differences between the dependent variable and a weighted combination of the independent variables (the predictors). The estimated coefficients indicate how the response changes with the predictors [40]. Table 3 includes R2 and the adjusted R2, which takes into account the optimal coding. Although the coefficient R2 equals 0.101, which indicates considerable dispersion around the regression line (about 10% of the variation of Y is explained by the variation of X), its square root is 0.318, which allows us to consider that the model exhibits some linear correlation.
Table 4 presents the results obtained from the first test of linear regression based on ANOVA. It includes the regression and residual sums of squares, the mean squares and the degrees of freedom. This analysis was carried out in order to test the existence and the degree of influence of the retained factors on the first explained variable, Y1 = Interest in collaborative learning.
Consider the null and alternative hypotheses:
H0: All the coefficients of the factors Xi are equal to zero.
H1: At least one of the coefficients is different from 0.
From the results in Table 5, it can be seen that the value of D obtained is greater than or equal to the theoretical F value at the 1% and 5% thresholds; therefore the null hypothesis is rejected. This means that the risk of being wrong is less than 1% when affirming that the model obtained contributes to predicting interest in collaborative learning. The regression equation predicting Y1 from the independent variables Xn (n = 1, ..., 4) at the 99% confidence level is:
Y1 = -0.102 X1 + 0.238 X2 + 0.140 X3 + 0.158 X4 + εi
Table 6 shows the simple and partial correlations with Pratt's measures of relative importance for the transformed predictors, as well as the tolerance before and after transformation. These correlations guide the stepwise choice of predictor variables: at each step, the choice is based on the strongest correlation among the variables still available and on the share of variance that remains to be explained once the variance accounted for by the predictors already selected has been removed.
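As a small worked illustration, the fitted equation for Y1 and the relation between R2 and its square root can be evaluated directly. This sketch uses only the coefficients and the R2 value quoted in the text; it assumes the inputs X1 to X4 are the transformed (optimally coded) predictor values, since the raw categories cannot be plugged in as they are. The function name `predict_y1` and the example input are hypothetical.

```python
import math

# Coefficients of X1..X4 as reported in the regression equation for Y1.
COEFFICIENTS = (-0.102, 0.238, 0.140, 0.158)

def predict_y1(x):
    """Predicted interest in collaborative learning, ignoring the error term.

    x holds the four transformed predictor values (an assumption:
    the source does not list the quantifications of each category).
    """
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected four predictor values")
    return sum(b * xi for b, xi in zip(COEFFICIENTS, x))

# Multiple correlation R recovered from the reported R2.
R_SQUARED = 0.101
multiple_r = math.sqrt(R_SQUARED)

print(f"R = {multiple_r:.3f}")
print(f"Y1 for unit inputs = {predict_y1((1.0, 1.0, 1.0, 1.0)):.3f}")
```

Because only X1 carries a negative coefficient, increasing X1 alone lowers the predicted interest, while the other three predictors raise it.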

Discussion
This section discusses the results found in this study, answering the various research questions already cited in the methodology section.

RQ1: What are the minimum inputs to request from AEU students, to build an adaptive cMOOC to their needs?
To answer RQ1, we started by targeting the population studied through a survey, in which eight parameters were considered at this initial stage: sex, age category, professional situation, establishment of the AEU, diploma obtained, field of study, comfort with technology and prior knowledge about MOOCs. After running a pilot test, we distributed the questionnaires and collected the data. We then processed the data with factor analysis algorithms in two phases. Finally, the results showed that, among the eight input parameters, four were retained as giving a significant explanation of the students' interest in collaborative learning [41]: age category, establishment of the AEU, diploma obtained and field of study.

RQ2: How do the selected factors influence the interest in collaborative learning among AEU students?
To answer this question, a linear regression analysis based on the ANOVA method was applied in order to determine the degree of influence of the retained factors on the interest of AEU students in collaborative learning. The results showed a significant model, with the following regression equation: Y1 = -0.102 X1 + 0.238 X2 + 0.140 X3 + 0.158 X4 + εi.
The correlation and tolerance tests that followed showed that three of the four retained input variables have a positive influence on the interest in collaborative learning among AEU students: the field of study, the diploma obtained and the establishment. So, with regard to the first output Y1, these factors will constitute the initial set of static variables that the learner must enter at first registration.

RQ3: How will the machine learn to provide an adaptive navigation, taking into account each learner's needs?
In the literature, several studies discuss the possibilities of integrating learning styles in MOOC platforms [42][43][44], in order to personalize learning contexts and provide adaptive navigation. However, considering learning styles alone is not enough to solve the problems that MOOC platforms face, especially the high dropout rate [45]. Pedagogically speaking, the model of the adaptive system proposed in our work is based on the socio-constructivist model, the learning styles model of Dunn & Dunn and the experiential model of Kolb [43][44][45][46]. Technically speaking, it is based on correspondence analysis, linear regression and the classification of students by a neural network [44][45][46][47].
The self-updating system will provide adaptive navigation based, at the beginning, on static data concerning the learner's profile (so far, from the results concerning Y1: the field of study, the diploma obtained and the establishment), and then evolving with dynamic data including pedagogical activities, learning styles and preferences, learning objectives, use and interaction in social environments, etc. [48]. Dynamic data will be updated automatically by detecting the learner's behaviour [49], activities and performance in his or her adaptive learning [50][51] environment. The architecture of our IACM approach with its various elementary components is described in Figure 2.
Fig. 2. Overview of the intelligent adaptive system
It includes six layers: a presentation layer (user interface), a data collection layer, a classification layer based on a neural network algorithm, a dynamic data recovery layer, a learner model building layer and an adaptation layer. After the learner has registered on the platform and filled in all the necessary input data, an initial learner profile M0 is built.
Concerning the classification layer, the system starts by preprocessing the collected data, in order to keep the information relevant to the neural network algorithm. The system then extracts the characteristics that best reflect the learning styles and preferences, learning objectives, and degree of social interaction. On the basis of this data, the system creates a vector for each learner's model. These vectors are the input of our model and are automatically updated to match the current learner's model. This is very important for the adaptation process, in which each learner profile is provided with navigation, resources and pedagogical activities adapted to it. From this phase onwards, the intelligent system also identifies the classes of learners who share the same preferences, styles and learning objectives, in order to create groups of learners with the same characteristics (see Figure 2).
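The split between static and dynamic data in the learner model can be sketched with a small data structure. This is an illustrative sketch only: the class name `LearnerProfile`, its fields, the `update` method and the sample values are hypothetical and are not taken from the platform's implementation. The static part holds the three inputs retained for Y1; the dynamic part is a free-form record that grows as the system observes the learner.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class LearnerProfile:
    # Static data entered once at registration (inputs retained for Y1).
    field_of_study: str
    diploma: str
    establishment: str
    # Dynamic data observed later: styles, objectives, social interaction, ...
    dynamic: dict = field(default_factory=dict)
    # Index k of the model Mk; the initial profile is M0.
    version: int = 0

    def update(self, **observations):
        """Return the next model M(k+1) with the new observations merged in."""
        merged = {**self.dynamic, **observations}
        return LearnerProfile(
            self.field_of_study,
            self.diploma,
            self.establishment,
            merged,
            self.version + 1,
        )

# Initial profile M0 built from the registration form (made-up values).
m0 = LearnerProfile("Computer Science", "Bachelor", "Faculty of Sciences")
# First automatic update after some activity has been observed.
m1 = m0.update(learning_style="visual", forum_posts=3)
print(m0.version, m0.dynamic)
print(m1.version, m1.dynamic)
```

Returning a new object at each update keeps every intermediate model available, which may help when the adaptation layer needs to compare a learner's current state with an earlier one.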
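The vector construction and grouping steps can be illustrated as follows. This sketch is not the neural network classifier described in the text: a nearest-centroid rule is used as a simple stand-in so that the example stays short and deterministic. The category lists, the class labels, the centroids and the learner records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical category lists used for one-hot encoding.
STYLES = ("visual", "auditory", "kinesthetic")
OBJECTIVES = ("certification", "skills", "curiosity")

def to_vector(learner):
    """Encode style and objective as one-hot values, plus a social score in [0, 1]."""
    style = [1.0 if learner["style"] == s else 0.0 for s in STYLES]
    objective = [1.0 if learner["objective"] == o else 0.0 for o in OBJECTIVES]
    return style + objective + [learner["social_interaction"]]

def nearest_centroid(vector, centroids):
    """Stand-in classifier: label of the closest centroid (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(vector, centroids[label]))

def group_learners(learners, centroids):
    """Group learner ids by the class assigned to their vector."""
    groups = defaultdict(list)
    for learner in learners:
        label = nearest_centroid(to_vector(learner), centroids)
        groups[label].append(learner["id"])
    return dict(groups)

# Hypothetical class prototypes in the same 7-dimensional space.
centroids = {
    "collaborative_visual": [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.9],
    "independent_auditory": [0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.1],
}
learners = [
    {"id": "s1", "style": "visual", "objective": "skills", "social_interaction": 0.8},
    {"id": "s2", "style": "auditory", "objective": "certification", "social_interaction": 0.2},
    {"id": "s3", "style": "visual", "objective": "skills", "social_interaction": 0.7},
]
groups = group_learners(learners, centroids)
print(groups)
```

In the system described here, the `nearest_centroid` function would be replaced by the trained neural network, while the encoding and grouping steps around it could remain the same.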

Conclusion
The data analysed in this research work were collected from an empirical study, in which a survey was carried out within the different establishments of the AEU (Tetuan, Morocco). After considering eight input parameters in a first step, the CA method showed that the eight parameters could be reduced to four. Among these, three factors had a positive impact on our first considered output, which is the interest in collaborative learning. These parameters will constitute the static variables provided by the learner during the first interaction with the intelligent system. The results obtained made it possible to propose a first intelligent model of this cMOOC platform, adaptive to the needs of AEU students and based on classification algorithms using a neural network.