Dynamic Identification of Learning Styles in MOOC Environment Using Ontology Based Browser Extension

With the advent of the era of big data and Web 3.0 on the horizon, different types of online deliverable resources in the pedagogical field have also become rife. Massive Open Online Courses (MOOCs) are the most important of such learning resources, providing many courses at different levels for learners on the go. The data generated by these MOOCs, however, is often unorganized and difficult to track, or is not used to the extent that allows identification of learner types to facilitate better learning. The approach proposed in this paper aims to detect the learning style of a learner interacting with the MOOC portal dynamically and automatically through a novel, indigenous and in-built browser extension. This extension is used to capture the usage parameters of the learner and analyze learning behavior in real time. The usage parameters are captured and stored as a learner ontology to ease sharing and operating across different platforms. The learning style so deduced is based on the Felder Silverman Learning Style Model (FSLSM), where the learner's behavior is measured under multiple criteria, namely perception, input, understanding, and processing. Based on the generated ontological semantics of the learner's behavior, multiple models can be made to facilitate precise and efficient learning. The result shows that this state-of-the-art approach identifies and detects the learning styles of the learners automatically and dynamically, i.e., as they change over time. Our approach is validated against the original ILS questionnaire through a precision-recall analysis, and it is found to be substantially accurate.


Introduction
Nowadays, MOOCs have become an indispensable part of any pedagogical system owing to their "massiveness" and "openness." Such platforms have served to enrich existing courses and introduce new ones through open and continuous remote learning alongside flexible schedules and a modular structure. MOOCs have been incorporated in multiple institutions alongside regular physical classrooms and, as such, have provided a great many students with a broad and varied array of courses and skills to learn. Most of the courses offered are free of cost, are independent of geographic location, and have flexible deadlines and an overall accommodating structure. This has led to their success with, and subsequent adoption by, academia since their introduction in 2008 [1] [2].
Even with these advantages, most MOOC platforms have limitations that hinder a fluid learning experience. Most courses have a fixed sequence of interactable elements that needs to be adhered to for the completion of the course. While this rigid structure might serve to teach the learner the details of the topic in the most organized manner, it limits exploration and remains the same regardless of the different learning requirements of a learner. Another limitation is the lack of personalization: learners are not only made to go through the course in one fixed sequence, but they also hardly receive adequate personalized/customized recommendations even after interacting with the course to some degree. This has led to the observation of high drop-out rates, i.e., learners taking up courses but being unable to complete them [2]. Thus, to engage learners better, MOOC platforms need to become more adaptive, and for this to happen, accurate learner-type identification and classification by capturing important learner characteristics such as learning style becomes the most important step.
It has been observed that tracking learning behavior to deduce a learner's learning style, under different learning style models, through data-based or literature-based (questionnaire) approaches has helped to orient teaching style better (according to suggestions made in the same literature). This has helped teachers to cater to the individual needs of a learner both online and in physical classrooms, thus improving the overall learning experience. Learning style (how a learner interacts with course elements or what types of elements he or she prefers over others) thus becomes an important aspect in classifying a learner to make recommendations about course elements, compared to other learner characteristics like skill level and previous knowledge.
Numerous learning style models have been suggested to understand a learner's behavior in classrooms and in E-learning/M-learning environments so as to better orient the learning content and teaching style accordingly. Some notable learning style models are the Kolb Model, HBDI Model, VAK Model, Felder-Silverman Learning Style Model (FSLSM), and the 4MAT Model [7]. Most of these models are literature-based and have been adapted, in different capacities, to different data-based and hybrid models. Out of these, the Felder-Silverman Learning Style Model (FSLSM), as shown in Fig. 1, has been successful in dividing similar learners based on four separate criteria for better delivery of subject matter [8], and its implementation has shown promise in both physical and online settings. Still, since most MOOCs are not designed in accordance with any explicit learning style model and thus lack explicit learner classification parameters, capturing adequate and accurate usage data to predict learning style becomes a challenge [7].
Multiple usage capturing systems such as [11] [12] have been integrated with MOOC platforms to alert learners of deadlines or recommend similar courses upon specific triggers. The data generated by most such systems, however, is often unstructured and unorganized, or at times has inadequate learner modelling and, thus, is not appropriate or sufficient to identify learning behaviors [4] [5] [6]. In this scope, a browser extension can be used to scrape websites accurately in order to capture and store usage data. The main benefits that browser extension-based usage capturing systems have over their contemporaries come from their range of scalability, ease of design, flexible development methodologies and widespread support over a range of applications. A browser extension developed to operate over a particular MOOC platform with minimal GUI elements is also extremely light-weight and will cause no performance issues during normal browsing.

Fig. 1. FSLSM Categories of Learners
To accurately predict learning style, a large amount of data across a large set of learners needs to be recorded, stored and operated upon thereafter. Thus, the next challenge that arises is how to model and store this data. Since there is a strong industrial and academic shift towards linked data models, the tremendous information being generated from each learner's interaction with the MOOC platform can be related and semantically connected while learners are going through it using certain standardized approaches [6] [8]. A learner ontology, used to semantically model a learner accurately, can be drawn and operated upon to get results while ensuring scalability and flexibility across platforms.
Ontologies as a relational conceptualization have been shown to accurately capture the knowledge of related fields and interacting entities, bringing the two together to give a clearer picture of the "who-what-how." The structure or the semantics of an ontology makes it applicable in several fields, and as a result, multiple tracking and recommendation systems have been built entirely on top of relational ontologies. Ontologies provide a way of organizing data, labeling connections with different entities, and classifying how the two are actually related. Ontologies have also shown promise in the field of artificial intelligence, owing to the ease with which such data can be used by machines of multiple types. The greatest applicability comes from how ontologies can be built, scaled, and operated upon in different programming constructs and platforms [9] [10] [11].
This paper focuses on finding a learner's learning style by analyzing the interaction with the MOOC platform based on the FSLSM model through an in-built browser extension. Many techniques are available to capture learners' behaviour and requirements at the e-learning server side using Web Usage Mining; however, those techniques are not suitable for the MOOC environment, where the platform is distributed in nature [6]. The proposed novel browser extension captures the usage data at the learner side, where it can be analyzed to identify the learning styles of the learners dynamically. This extension anonymously and securely tracks and maintains learner data generated on interaction with the platform for the construction of a semantic learner model or a learner ontology. This ontology, so created, is then operated upon through the implementation of multiple algorithms that determine the learning style of a learner. The structure and design of the extension along with the underlying algorithms are discussed in detail in the later sections.
Lastly, the extension generates a mapping of the learners and their calculated scores that defines the relationship between the learner and a particular tracked criterion under the FSLSM model. This whole process can then be used as a base to provide customized and personalized recommendations in MOOCs through systems that utilize such ontologies [7] [8]. All in all, the major contributions of this paper are as follows:
• Design of a light-weight browser extension-based usage capturing system that records learner data extensively, accurately and completely anonymously.
• Design of a learner ontology that utilizes the captured data to accurately model a learner interacting with the course on the MOOC platform.
• Use of this semantically linked dataset to predict the learning style of the learner dynamically and automatically, which has been found to be the most important link in making personalized content delivery.
The remainder of this paper is structured as follows: In section 2, we discuss and review the contemporary literature in this field, particularly in the sub-domain of learning behavior identification. In section 3, we discuss the design and development of our novel browser extension-based usage capturing system, including how the data is recorded and stored, and then discuss how the learner ontology is designed and how it is imported and operated upon to predict a learner's learning style. Section 4 discusses the extensive tests we performed on our model and how accurately it performs compared to standardized learning style identification methods. The conclusion of this paper, along with the scope of future development, is discussed in section 6, which is followed by a list of references.

Related Work
In this section we discuss past works and recent advancements made in the domain of E-learning, especially those that utilize ontologies to capture and record learning behavior in MOOC environments. We also discuss works that revolve around learning style recognition in E-Learning settings. Our goal is to present contemporary literature in the field, discuss its implementation and design and finally point out drawbacks that have necessitated this work. Before proceeding with the review, some key concepts to keep in mind are as follows:
MOOCs (massive open online courses) are large-scale online courses that have a high enrollment strength and can be both open and closed access. There has been a constant rise in the popularity of MOOCs owing to their learn-on-the-go design and strong community support. In recent years there has been a shift towards x-MOOCs that often employ a semi-paid structure for course deliverables. Here, the MOOCs are offered by reputed institutions and while some are free to access, most require a fee to get a certificate of completion.
Learning style, as the name suggests, is the way a learner interacts with course material in order to understand and learn it. Learning style recognition is the process of recognizing the implicit learning behavior through statistical, algorithmic, psychological, psychoanalytical or a mix of these methods with the help of interaction metrics, past interaction data, behavioral analysis etc. Learning style models are models used to categorize learning styles under different criteria and, at times, corresponding teaching styles that complement these learning styles. Some of the most widely used models are Fleming's VAK model, Honey Mumford Learning Style model, Kolb Learning Style model and Felder-Silverman Learning Style model.
Ontologies are formal representations of data that outline individual data, objects/entities, categories and how they are related with each other. In simpler terms, ontologies are a way of modelling and visualizing complex interconnected data. Ontologies have components like individuals, classes, attributes, relations, rules, restrictions and axioms. Following the linked-data concepts, ontologies find great usage owing to their scalability and ability to be used easily across platforms.
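To make the listed components concrete, the following is a minimal sketch of classes, individuals and relations stored as subject-predicate-object triples. All class, individual and relation names (`Learner`, `learner_01`, `visited`, and so on) are hypothetical and chosen only for illustration; they are not taken from the learner ontology described in this paper, and rules, restrictions and axioms are omitted.

```python
from collections import namedtuple

# A triple links a subject to an object through a named relation.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Hypothetical mini-ontology: two classes, two individuals, two relations.
triples = [
    Triple("Learner", "is_a", "Class"),
    Triple("CourseElement", "is_a", "Class"),
    Triple("learner_01", "instance_of", "Learner"),
    Triple("video_03", "instance_of", "CourseElement"),
    Triple("learner_01", "visited", "video_03"),
    Triple("video_03", "has_content_type", "video"),
]


def objects_of(subject, predicate):
    """Return every object linked to `subject` through `predicate`."""
    return [t.obj for t in triples if t.subject == subject and t.predicate == predicate]


print(objects_of("learner_01", "visited"))
```

A production system would more likely rely on a dedicated RDF/OWL toolkit; the plain list is used here only to keep the sketch self-contained.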
Ali Mezouary et al. [12] have presented an approach for the automatic detection of learners' learning styles when going through a course in a MOOC environment. They have also used the Felder Silverman Learning Style model to categorize their learners. While this approach has been validated against the processing criterion with active/reflective clusters, the other three criteria are not discussed. This problem is not as simple as scaling the same algorithm to the other criteria, as they are based on very different metrics. Identifying and operating upon these metrics in a MOOC environment is a challenging task.
Kahina Rabahallah et al. [9] have proposed a recommender system to suggest relevant MOOCs to online learners by combining memory-based Collaborative Filtering (CF) and ontologies. A pertinent issue with collaborative filtering is how data-hungry the process initially is, which can lead to the cold-start problem. In this approach, ontologies are used as semantic descriptors to alleviate this by providing information about the learner initially. This approach can be improved by better semantic modelling of the learner ontology through the incorporation of learning style metrics, which have been identified as important criteria to gauge how learners interact with MOOCs.
John Tarus et al. [13] have proposed a similar recommendation technique combining collaborative filtering and ontological knowledge to recommend personalized learning materials to online learners on E-Learning platforms. They have also used ontologies to alleviate the cold-start problem that collaborative filtering-based recommendation systems suffer from by providing some information about the learner initially. The approach suffers from limited use of ontological knowledge, the data from which can be exploited even at later stages of the recommendation process to provide better recommendations.
Hashim et al. [14] have investigated the learning style dimensions of students in a MOOC environment and have proposed a MOOC development model that is adaptive to learning styles. The Felder-Silverman Learning Style model was followed, and a mixed-method approach was adopted to investigate the learning style. Interviews were conducted for this, followed by a literature study, to propose a MOOC development model that is adaptive to students' learning styles. While this study establishes the suitability of the learning style dimension for adaptive MOOCs, the data collection and learning style recognition process can be scaled rapidly through automated data collection and prediction systems.
El Mhouti et al. [15] have designed an approach with ontology-based multi-agent systems to describe a learner's requirements and make sure the required resources are incorporated in the MOOC. The main aim of this approach was to manage drop-out students and improve learners' engagement and interest in MOOC courses. The ontology-based Multi-Agent System (MAS) has been used to capture a learner's learning preference for adapting the learning resources proposed to a learner on a MOOC platform. The learning preferences have been analyzed using the Myers Briggs Type Indicator (MBTI). The major drawback of this approach is that it relies on the MBTI rather than a standard learning style model suitable for MOOC environments.
George Sammour et al. [16] have used linked data concepts for searching and recommending user-relevant MOOC courses that fit the students' level of education, based on a semantic representation of the user's knowledge of the subject area. Ontologies are used to model a learner's knowledge level and the level of a course. Students can search for relevant courses during their active learning based on their knowledge. The major drawback of this approach is that knowledge level is not enough to accurately model a learner's behavior; other factors, especially the learning style, need to be taken into account, as how two different learners interact depends more on their innate learning styles than on their acquired knowledge level.
Sucheta V. Kolekar et al. [17] have proposed a two-phase clustering of learners to provide adaptive interfaces and content. The first phase of clustering follows the learning sequence, while the second takes the time spent on each course element into consideration. The design of the learner model is based on the Felder Silverman learning style model. The authors have modeled the approach to capture the usage information of learners across sessions to identify required information, which is mapped to FSLSM learning styles. Our work improves upon this approach by incorporating a light-weight, robust browser extension for round-the-clock data collection and storage.
Sebastian Kagemann and Srividya Bansal [18] have used linked-data concepts and semantic-web technologies to create a semantic data model for Massive Open Online Courses (MOOCs) and published this data as linked data on the Web. The authors have also developed a web portal called MOOCLink that utilizes this data to discover and compare open courseware. While this paper is a step forward in the availability of MOOC-related linked data, it glosses over the learner aspects. Ontological models of learners correspond to the MOOC platforms they are using, as each has different course deliverables; thus, more accurate MOOC models would be required for advanced processing of the corresponding learner data. Our approach improves upon this by using learning style metrics instead, which remain similar across different MOOC platforms.
Xiong et al. [11] have proposed an ontology-based education resource retrieval system by semantic annotation of these resources through ontologies. This transforms the low-level human-readable resource format to high-level machine-readable format, capable of being linked, shared and retrieved. This approach provides a useful implementation of ontologies and linked data concepts by pre-processing disparate academic resource formats to a single high-level format. Our work improves upon this by providing round-the-clock data through a robust browser extension for accurate ontological modelling of academic resources.
Ming Zhang et al. [3] have suggested a multi-source data analysis method to provide personalized learning guidance in MOOC environments to combat low course satisfaction and high drop-out rates. They proposed a three-phased approach to this end. The first phase is a content analysis of the course for identification of core concepts, followed in the second phase by a two structured model for finding the knowledge state of the learner through quiz submission data. Lastly, in the third phase, they have designed a drop-out prediction system using usage metrics. The major drawback of this approach is the lack of consistency across MOOC platforms for accurate content analysis. Furthermore, different MOOC platforms employ different testing metrics, making a knowledge state detection system difficult to scale.
Brahim Hmedna et al. [19] have talked about the scope of research in the field of learning style identification in MOOC environments. They have proposed a neural network-based approach that tracks a learner's interaction with the course contents and then predicts the learning style based on these metrics. Finally, these predictions are used to suggest appropriate resources through an adaptive recommendation system. Our approach uses semantic web technology in conjunction with robust data collection techniques to a similar end.
Nataliia V. Morze and Olena G. Glazunova [20] have discussed how the efficiency of course structures depends, in large part, on the learning style of the students taking these courses. They have then discussed the design of a course structure for IT students based on their learning style and gauged its efficiency by measuring their performance and course satisfaction. In this way, they have established a clear link between learning style and both course satisfaction level and academic performance.
Rene F. Kizilec et al. [21] have investigated the effect of self-regulated learning (SRL) strategies on performance in MOOCs. For this, they asked a group of 17 highly prolific and successful learners about their learning strategies and coded these into seven recommendations based on the SRL framework. When these recommendations were provided to new learners at the start of the same course, there was no significant improvement in performance. They concluded that a single SRL prompt at the start of the course is not enough and that embedded technological aids that adaptively support SRL can have a better effect on performance. This can be done by dynamically predicting learning styles and suggesting SRL strategies based on the predictions.
Manal Abdulaziz Abdullah [22] has suggested an approach to classify students dynamically on the basis of their learning style, based on the Felder Silverman Learning Style Model, by extracting students' behavior from the MOODLE log for a data structure course. The predicted results are then validated with quiz results. Our approach improves upon the scalability aspect of this approach by incorporating ontological models that ensure better modelling across different platforms.
Sara Assami et al. [6] have focused on the problem of high drop-out rates in MOOC courses. They have proposed an approach to enhance the personalization aspect of MOOCs through a better recommendation process. They have chosen certain learner characteristics for making better recommendations and have noted the importance of cognitive learning style in making MOOCs more adaptive.
Dagmar El-Hmoudova [23] has discussed how MOOCs have acted as a "tech-as-game-changer" in contemporary pedagogy. The author further discusses the poor rate of course completion and explores how learning styles affect learners' motivation. The work also talks about the inclusion of individual learning styles in the MOOC environment to aid learning and notes the importance of learning style in improving the overall learner experience.
Lidia Băjenaru and Ion Smeureanu [24] have proposed a modality that uses learning style metrics in determining individual differences in an ontology-based E-learning system. They have built a system, aimed at health workers in Romanian hospitals, that recommends required courses to them based on their learning style, skill, previous knowledge, etc. The system has, as its basis, a learner ontology, which is then operated upon to get recommendations for learners. Our work improves upon this with dynamic learning style prediction through our proprietary browser extension for accurate learning style predictions.
Birol Ciloglugil and Mustafa Murat Inceoglu [25] have discussed the use of ontology-based learner models for personalization in an e-learning context. They have also created a learner ontology that uses three different learning style models (Kolb, Honey-Mumford, and the Felder Silverman Learning Style model) for personalized e-learning. This work notes the importance of learning styles and their ontological models in personalized E-learning systems.
Radhika M. Pai et al. [26] have proposed a method that analyzes captured web usage data to identify the learning profile of the learners at the server side. The learning profiles are identified by an algorithmic approach that takes into account the frequency of accessing the materials and the time spent on the various learning components on the portal. The authors do not capture data at the client side, which is a major requirement in the MOOC environment; the method could therefore be extended with a mechanism that captures web usage data at the client side.
From the above discussion, we can conclude that, despite the popularity of MOOCs and their status as a "tech-as-game-changer" in the field of E-Learning, they are plagued with high drop-out rates owing to poor scope for personalization. From the discussion of multiple works that try to combat this issue through different means, we can also conclude that learning style has been noted as an important parameter in making MOOCs more adaptive. We have also seen the prevalence of the Felder-Silverman Learning Style Model as a standard in approaches that employ operations on or prediction of learning styles.
The work in this paper tries to improve upon previous similar implementations through a robust, "round-the-clock" proprietary data collection browser extension in conjunction with learning style prediction algorithms that take major cognitive learning style factors into consideration. We have also attempted to make the learning style prediction more scalable through server-side data preprocessing and the use of ontological learner models.

Methodology
The proposed approach is divided into three phases: the first phase focuses on how the data is collected at the learner side once the learner has interacted with course elements, second is concerned with how this data is semantically organized and the ontology is created, and the third phase is related to the identification of learning styles based on FSLSM using an algorithmic approach. The phase-wise process flow of the proposed system is described in Fig. 2.

Data collection
The process of the data collection phase is shown in Fig. 3. First, we have tried to explain how the browser extension functions. The extension primarily works by scraping usage data from the course page on the MOOC platform. An example can be tracking the percentage completion of a video watched by scraping the value of video player timer and total duration. Similarly, we can track parameters like quiz scores, participation in assignments, and reading completion by scraping. We use a counter to keep track of the number of visits to each item by the learner. We also track the time spent on each item by the learner. This is achieved by running a custom timer for each item that the learner interacts with. The time spent is tracked across multiple visits as well and stored. Apart from this, we keep track of the next item that a learner visits after leaving an item and whether the learner makes a sequential or a global movement. One important thing to note is that all the above data collections are anonymized. We only store SHA-256 hashes of learner ids, making all the data completely anonymous. The extension is distributed through Mozilla add-on store and Chrome Web-store for ease of access and installation. The extension is entirely unobtrusive and does not affect a learner's browsing experience in any way. For ensuring the safety and integrity of the data, we only use services from a leading cloud provider with strong security standards. All data is end-to-end encrypted during transmission. A combination of the mentioned features ensures that data collection happens round the clock automatically, comprehensively and reliably, leading to a better representation which is updated dynamically based on learner behavior.
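The three measurements described above (anonymized learner ids, video completion from the player timer, and the kind of move to the next item) can be sketched as follows. This is an illustration in Python, not the extension's code. The function names are hypothetical, and the rule that a move to the immediately following item counts as sequential is an assumption, since the text does not define the distinction precisely.

```python
import hashlib


def anonymize_learner_id(learner_id: str) -> str:
    """Return the SHA-256 hex digest of a learner id; only this hash is stored."""
    return hashlib.sha256(learner_id.encode("utf-8")).hexdigest()


def video_completion_percent(current_seconds: float, total_seconds: float) -> float:
    """Percentage watched, from the scraped player timer and total duration."""
    if total_seconds <= 0:
        return 0.0
    # Clamp to [0, 100] in case the scraped timer overshoots the duration.
    return max(0.0, min(100.0, 100.0 * current_seconds / total_seconds))


def classify_move(current_index: int, next_index: int) -> str:
    """Assumed rule: moving to the immediately following item is sequential."""
    return "sequential" if next_index == current_index + 1 else "global"


print(video_completion_percent(90, 360))  # 25.0
print(classify_move(3, 7))                # global
```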

Approach of browser extension
To make any prediction on usage behavior, usage metrics need to be recorded and stored. A learner, while going through any MOOC, generates substantial data in each session. This data can range from how much time he or she spends learning, or the grade he or she scores on a quiz, to the fairly complex sequence of course elements he or she follows to navigate to a particular course element. Even the tiniest details, such as the reduction in the time a learner spends on a course video each time he or she revisits it, can provide us with important learning style prediction metrics [27].
All this data cannot be effectively collected manually or requested voluntarily from learners; thus, a need arises for a fast and automated approach, which is found in web scraping. Web scraping is the technique used to extract certain important information from web pages and store it to operate upon later. When web scraping is performed across a large set of web pages in conjunction with collaborative filtering and content-based recommendation systems, we are essentially performing web usage mining, which uses mined user behavior on the web to recommend interesting products or other sites for the user to explore. In our approach, we have essentially designed a web scraper, packaged as a browser extension, that collects data continuously and anonymously across web pages stemming from a single MOOC platform website.
Once the learner downloads the extension and provides it with the required privileges, the extension starts running on the MOOC website. The system can be largely divided into two distinct parts: a client-side unobtrusive browser extension, and a server-side data storage and management unit. The client side is written in JavaScript and works according to Algorithm 1.
The extension remains active only if the learner is actively interacting with the content. This is achieved using features available in modern browsers (Chrome and Firefox, the two browsers where we deployed the extension). If the content window is inactive, the browser extension goes to sleep. Thus, we get reliable time-spent data. To analyze the captured data, the server side is hosted on the Google Cloud Platform and has been written in NodeJS and Python. The server side is essentially responsible for storing data semantically, fetching it, operating upon it to find the learning style through learner behaviour represented by usage parameters, and generating the ontology for the learner model. The steps implemented at the server side to achieve this are shown in Algorithm 2.
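The idea of counting time only while the content window is active can be sketched with a small timer that banks elapsed time whenever the window becomes inactive. The class below is a hypothetical Python illustration: the extension itself is written in JavaScript and relies on browser features that the text does not name, so the `resume` and `pause` calls stand in for whatever activity signals the browser provides.

```python
class ActiveTimer:
    """Accumulates seconds only while the content window is active."""

    def __init__(self) -> None:
        self.total_seconds = 0.0
        self._active_since = None  # timestamp of the last activation, if active

    def resume(self, now: float) -> None:
        # Start counting when the window becomes active (no-op if already active).
        if self._active_since is None:
            self._active_since = now

    def pause(self, now: float) -> None:
        # Stop counting when the window becomes inactive and bank the elapsed time.
        if self._active_since is not None:
            self.total_seconds += now - self._active_since
            self._active_since = None

    def elapsed(self, now: float) -> float:
        # Banked time plus the currently running active interval, if any.
        running = now - self._active_since if self._active_since is not None else 0.0
        return self.total_seconds + running


timer = ActiveTimer()
timer.resume(0.0)    # learner starts interacting
timer.pause(30.0)    # window loses focus after 30 s
timer.resume(100.0)  # learner returns 70 s later
print(timer.elapsed(110.0))  # 40.0: the 70 s of inactivity is not counted
```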
Google App Engine scales the back-end infrastructure automatically when the load on the server increases, which lets us efficiently process a theoretically unlimited number of requests from multiple installations of the browser extension. A sample of the captured parameters is shown in Fig. 4.
INPUT: URL of the current web page
OUTPUT: title; course id; complTime; quizMarks; completion status; number of visits; content type
initialize
Step 1: The URL of the currently running window is analysed to identify the content type of the course element being interacted with, namely video, quiz, lecture reading, etc.
Step 2: The title, course-id and unique-id of the course program are scraped from the web page.
Step 2.1: In the case of a video lecture being watched, the percentage completion is scraped.
Step 2.2: In the case of a quiz, peer-graded assignment or notebook, the score received and the pass status are scraped. In the case of a lecture reading, the completion status is scraped. In the case of discussion forum usage, the participation status is scraped.
Step 3: Each visit to an element generates a unique id, which is tracked and recorded to store the number of visits.
Step 4: The time spent interacting with a course element is tracked by calculating the difference between the current time and the time at which the page was loaded.
Step 5: The above 4 steps are repeated in the same order (1 through 4) every 10 seconds for accurate periodic updates of the data.
Step 6: The data is sent to the server side (and thus accepted) if any of the following three conditions hold:
Step 6.1: 60 seconds have elapsed since data was last sent to the server.
Step 6.2: The learner has moved to a different content page. In this case, details about the new content are also sent to the server to track sequential and global moves.
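The client-side steps above can be sketched as follows. This is a minimal illustration in Python rather than the extension's own code; the URL patterns and the names `classify_content`, `time_spent` and `should_send` are assumptions made for illustration, and only the 60-second send interval and the page-change condition come from the steps above.

```python
import re

# Hypothetical URL fragments mapped to content types; the URL scheme of a
# real MOOC portal may differ.
URL_PATTERNS = [
    (re.compile(r"/lecture/"), "video"),
    (re.compile(r"/quiz/"), "quiz"),
    (re.compile(r"/supplement/"), "reading"),
    (re.compile(r"/peer/"), "peer"),
    (re.compile(r"/discussions/"), "discussion"),
]

SEND_INTERVAL_S = 60  # Step 6.1: seconds between periodic uploads


def classify_content(url: str) -> str:
    """Step 1: infer the content type of the element from the page URL."""
    for pattern, content_type in URL_PATTERNS:
        if pattern.search(url):
            return content_type
    return "unknown"


def time_spent(now_s: float, page_loaded_s: float) -> float:
    """Step 4: time on the element is 'now' minus the page-load time."""
    return now_s - page_loaded_s


def should_send(now_s: float, last_sent_s: float,
                current_url: str, previous_url: str) -> bool:
    """Step 6: upload if 60 s have elapsed or the learner changed page."""
    interval_elapsed = (now_s - last_sent_s) >= SEND_INTERVAL_S
    page_changed = current_url != previous_url
    return interval_elapsed or page_changed
```

In the extension itself, a check like `should_send` would run on every 10-second polling tick described in Step 5.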

Algorithm 2: Browser extension: Server-side usage capturing
INPUT: title; course id; complTime; quizMarks; completion status; number of visits; content type
OUTPUT: Relevant data for analysis
initialize
Step 1: The server side is always awake and ready to receive data from the browser extension. This is achieved by hosting it on GCP.
Step 2: The data is parsed to identify the learner-id, content type, content id and course id.
Step 3: If data for the learner-id has not been previously recorded, a new entry is initialized.
Step 4: Queries are built to check whether the learner has visited that particular element before. One of the following two courses is taken:
Step 4.1: If visited, increment the visit count and add the time spent, as received from the extension, to the existing value.
Step 4.2: If not visited, initialize a new entry with 1 visit and the time-spent value obtained from the extension.
Step 5: Based on the type of content, one of the following flows is taken:
Step 5.1: If the content type is video, the completion percentage is updated. Only the maximum of the stored value and the value provided in the request is kept.
Step 5.2: If the content type is quiz, peer or notebook, the scores obtained and the passing status are updated.
Step 5.3: If the content type is reading or discussion, the completion or participation status is updated, respectively.
Step 6: Each new content type is also used to augment the course-structure part of the ontology, which is mirrored by the usage parameters of the learner.
Step 7: Steps 1-6 are repeated in the same order (1 through 6) each time a request is sent.
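The update logic of Steps 3 to 5 can be illustrated with an in-memory store. The sketch below is not the deployed NodeJS/Python service; the `ElementRecord` fields, the payload keys and the `handle_request` name are hypothetical, and the sketch follows Step 4 literally by counting one visit per request.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ElementRecord:
    """Usage data for one learner on one course element (hypothetical)."""
    content_type: str
    visits: int = 0
    time_spent: float = 0.0
    completion_pct: float = 0.0     # videos only
    score: Optional[float] = None   # quiz / peer / notebook
    passed: Optional[bool] = None   # quiz / peer / notebook
    status: Optional[bool] = None   # reading completed / forum participation


# learner_id -> {(course_id, content_id): ElementRecord}
Store = Dict[str, Dict[Tuple[str, str], ElementRecord]]


def handle_request(store: Store, payload: dict) -> ElementRecord:
    # Step 3: create the learner entry on first contact.
    learner = store.setdefault(payload["learner_id"], {})
    key = (payload["course_id"], payload["content_id"])
    # Step 4: a new record starts at 0 visits and is incremented below,
    # so a first visit ends with 1 visit (Step 4.2) and later ones add up
    # (Step 4.1).
    record = learner.setdefault(key, ElementRecord(payload["content_type"]))
    record.visits += 1
    record.time_spent += payload.get("time_spent", 0.0)
    # Step 5: content-type specific updates.
    if record.content_type == "video":
        # Keep only the maximum completion percentage seen so far.
        record.completion_pct = max(record.completion_pct,
                                    payload.get("completion_pct", 0.0))
    elif record.content_type in ("quiz", "peer", "notebook"):
        record.score = payload.get("score")
        record.passed = payload.get("passed")
    elif record.content_type in ("reading", "discussion"):
        record.status = payload.get("status")
    return record
```

Keeping the maximum completion percentage means that rewinding a video does not lower the stored value.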

The learner ontology
Ontologies have been used in most recent research in the field as the standard method of knowledge representation, owing to their ability to be adapted to different web platforms and to be scaled up. Research on learner models, and work that employs learner models in different capacities, has also used ontologies to describe student behavior and interaction with the course material. Learner models may involve different sorts of parameters, from the plainly academic to the thoroughly psychoanalytical, such as age, gender, level of study, grades, skill-set, subjects, co-curricular activities, course materials, learning style, cognitive traits, etc. Ontologies have emerged as an efficient construct for modeling such complex and multifaceted systems [15] [28].
However, no learner ontology so designed can be "one-size-fits-all." Ontology-based learner models revolve around the course structure, whether or not it is part of a curriculum. For this paper, an ontology-based learner model (or the learners' ontology) has been designed according to learners' interaction with a MOOC platform. Courses on MOOC platforms are distinct from classroom lectures in multiple ways, such as lower group interaction, flexible deadlines, and a fixed sequence, to name a few. The learner ontology for this paper has been made taking the above into consideration.
The ontology, as shown in Fig. 5, is continuously updated with fresh data as and when a learner interacts with new course elements or revisits previously explored ones, and can thus be considered a dynamic learner ontology in the contemporary sense. Once the learner downloads the proprietary extension discussed in the previous section, details of the enrolled courses along with their course structure are recorded; no personal information is collected. As the student goes through course elements, the interaction and the resulting interaction parameters for newer elements are stored. A sudden change in learning behavior can thus be recorded and studied.
As shown in Fig. 5, the three primary classes are 'course', 'learner', and 'learning style'. The learner class is used to record learner behavior by storing learner-generated parameters. These learner parameters are recorded in the 'notebook data', 'peer data', 'discussion data', 'exam data', 'moves', 'video data', 'reading data', and 'quiz data' subclasses. Each of these subclasses records a specific type of usage data: interaction with readings is stored in reading data, interaction with quizzes is stored in quiz data, and so on. The tracked parameters are described in the first part of the next subsection.
We discussed how ontology-based models are contextual, and this is clearly reflected in how the learner class and its subclasses have been modeled according to the course structure. The learner enrolls in a course that is represented by the course class. This class has discussion, quiz, notebook, exam, reading, video, and peer as its subclasses. These classes store the structure of the course with which the learner interacts: video represents the set of videos in the course, peer represents the set of peer-graded assignments, and so on. It should be noted that the data storage classes defined above individually store the learner-generated data of each course element. The recorded parameters are further elaborated in the next section.
Lastly, the learning style class is used to record the learning style of the learner under the four categories of the Felder Silverman Learning Style Model (Input, Understanding, Processing, and Perception), which are also used as its subclasses. Course elements used to find learning styles under each category are related to these subclasses. Each of the four subclasses has two individuals that are used to assign a weighted score to the learner.
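The class hierarchy described above can be pictured as a small tree. The sketch below is a plain-Python stand-in, not the ontology file itself; the identifier spellings and the helper names `subclasses_of` and `new_style_record` are assumptions, and the two individuals per learning-style subclass are taken to be the two FSLSM poles of that category.

```python
# Three primary classes and their subclasses, as listed in the text.
ONTOLOGY = {
    "course": ["discussion", "quiz", "notebook", "exam",
               "reading", "video", "peer"],
    "learner": ["notebook_data", "peer_data", "discussion_data", "exam_data",
                "moves", "video_data", "reading_data", "quiz_data"],
    "learning_style": ["input", "understanding", "processing", "perception"],
}

# Assumed individuals: the two poles of each FSLSM category.
INDIVIDUALS = {
    "input": ("visual", "verbal"),
    "understanding": ("sequential", "global"),
    "processing": ("active", "reflective"),
    "perception": ("sensing", "intuitive"),
}


def subclasses_of(cls: str) -> list:
    """Return the direct subclasses of a primary class."""
    return list(ONTOLOGY.get(cls, []))


def new_style_record() -> dict:
    """One weighted score per individual, initialised to zero."""
    return {category: {pole: 0.0 for pole in poles}
            for category, poles in INDIVIDUALS.items()}
```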
The learner ontology facilitates data flow to the Python script that uses different algorithms to find the learning style of the learner. These algorithms are discussed in the following subsection.

Algorithmic approach to identify learning styles based on FSLSM
In this section, we explain the underlying algorithms that are used to find the learning behavior of the test subjects involved. The computations are all done in a Python back-end, and output is drawn along the four criteria of perception, input, understanding, and processing.
Parameters tracked: This section describes the tracked parameters and how they are used for each of the four criteria. The parameters described here are the main learner parameters that are tracked; certain other redundant parameters that are not used have been omitted. At different points in the code, the parameter names have been modified slightly to prevent namespace collisions. All the parameters are defined in Table 1. Interaction parameters refer to the different parameters that are tracked to better understand learner behavior. These parameters have been listed in the above subsection. Some examples are revisits, totalTime, actualTime, etc.
Perception: Perception is the idea of how a learner perceives what is being taught and reacts to it. Felder and Silverman classify learners as sensing or intuitive based on perception. A sensing learner is one who learns through observation and the gathering of concrete facts and data, while an intuitive learner is one who learns through unconscious, indirect perception, namely speculation, imagination, and hunches. Sensing learners are better at grasping hard facts, find it easier to solve problems through standard methods, dislike complications, and have been shown to be slower but more careful and steadier. Intuitive learners are better at understanding theories and principles, prefer solving problems through novel and exploratory techniques, and have been shown to be faster but less careful and more prone to errors.
Our approach takes these factors into account and adapts them to the elements of a course offered by a MOOC platform. Most MOOC platforms offer quizzes and assignments to track learner performance and course completion. Furthermore, most MOOC platforms have a predefined suggested sequence of course elements that need to be completed before a learner takes the quiz (however, most quizzes/assignments are not locked behind completion percentages). To find the behavior of a learner based on perception, we have tracked how he/she takes up a quiz and the amount of course material he/she goes through before attempting one. Based on the above factors, we have determined that an intuitive learner would attempt a quiz after interacting with a comparatively lower fraction of the course material required for the quiz/assignment than a sensing learner. An intuitive learner would prefer to extrapolate from what he/she has already learned and test it, whereas a sensing learner would like to ensure that his/her level of understanding of the various concepts is commensurate with what will be asked on the quiz. For this, we have set the threshold for the fraction of each individual course element that a learner needs to go through before taking a quiz at 60%. We have tracked completion of individual course elements owing to the modularity of these programs, where each element serves to teach something new and germane to some section of the quiz. Elements that fall below this threshold contribute to intuitive behavior, while those above it contribute to sensing behavior. The detailed steps are shown in Algorithm 3.
Step 2: For the current element, find the fraction of completion, i.e., complTime.
Step 3: If it is less than 0.6 (or 60%), increase localIntuitiveScore by 1; else increase localSensingScore by 1. With the next element's parameters, go to Step 2.
Step 4: Once a quiz/assignment is encountered, wait. If localSensingScore is greater than localIntuitiveScore, increase globalSensingScore by 1; else increase globalIntuitiveScore by 1.
Step 5: Set localSensingScore=0; localIntuitiveScore=0; go to Step 2 with the next set of elements. Go to Step 6 if all sets of elements for attempted quizzes/assignments have been accounted for.
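The scoring loop in the steps above can be sketched in Python as follows. The input format (an ordered list of `(kind, complTime)` pairs, with the hypothetical kind `"quiz"` marking an attempted quiz or assignment) and the function name `perception_scores` are assumptions made for illustration.

```python
from typing import Iterable, Tuple

THRESHOLD = 0.6  # 60% completion threshold from the text


def perception_scores(elements: Iterable[Tuple[str, float]]) -> Tuple[int, int]:
    """Return (globalSensingScore, globalIntuitiveScore).

    `elements` is the ordered sequence of course elements; for non-quiz
    elements the second field is complTime (fraction completed), and for
    quiz entries it is ignored.
    """
    global_sensing = global_intuitive = 0
    local_sensing = local_intuitive = 0
    for kind, compl_time in elements:
        if kind == "quiz":
            # Step 4: compare the local scores accumulated before this quiz.
            if local_sensing > local_intuitive:
                global_sensing += 1
            else:
                global_intuitive += 1
            # Step 5: reset the local scores for the next set of elements.
            local_sensing = local_intuitive = 0
        elif compl_time < THRESHOLD:
            local_intuitive += 1  # Step 3: below threshold
        else:
            local_sensing += 1    # Step 3: at or above threshold
    return global_sensing, global_intuitive
```

Elements that follow the last attempted quiz do not affect the result, and a tie between the two local scores counts as intuitive, as in Step 4.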
Input: Input is the idea of the kind of teaching a learner best responds to, namely visual or verbal. Initially, Felder and Silverman classified learners based on input into two categories, visual and auditory; however, in a 2002 revision to the 1988 paper, Dr. Felder changed these to visual and verbal, which is still followed today. A visual learner is one who learns better with the aid of visual media such as charts, graphs, film demonstrations, etc. Verbal learners, on the other hand, prefer text or more "academic prose" over visual media for understanding concepts. A learner can also consume both forms of pedagogical input together and in moderation.
In our approach, we classify MOOC course elements as visual or verbal. Most MOOCs have video lectures, pictures, graphs, and hands-on explanations of models that fall under the visual category. Transcripts, slides, and text-based instruction manuals, on the other hand, make up the verbal part. We track a learner's interaction with the two and measure his/her preference between them. Our calculations follow the idea that learners revisit elements they deem important or easier to follow. However, very short revisits, such as those made when skipping through different elements, might give erroneous results; thus, we try to find the number of complete revisits, which is a more accurate measure of a learner's preference for an element. Since most courses have disproportionate numbers of visual and verbal elements, we track average interaction, i.e., we scale the scores so that the two can be compared more accurately. The detailed steps are shown in Algorithm 4. The total time spent on each element is calculated using equations 1, 2 and 3.
iJET - Vol. 16, No. 12, 2021
r -> number of revisits.
V_i -> time spent on the i-th visit.
R -> total time spent on the i-th element (active/reflective) across r revisits.
R_o -> the actual time allotted to the element by the course.
S -> total number of complete revisits.
k -> total number of elements of a type.
Step 2: Fetch the interaction parameters of elements and divide them into two sets: visualSet and verbalSet.
Step 3: For the elements in visualSet and verbalSet, calculate the score (visualScore/verbalScore) using the formula shown in equation 3.
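Equations 1 to 3 are not reproduced in this section, so the sketch below assumes one plausible reading of the symbol definitions: the total time on an element is the sum of its visit durations, the number of complete revisits is that total divided by the allotted time and rounded down, and the score of a set is the number of complete revisits divided by the number of elements k. These formulas, and the names `complete_revisits` and `set_score`, are assumptions and not the paper's own equations.

```python
from math import floor
from typing import List, Sequence, Tuple

# One element: (durations of each visit V_i, time allotted by the course R_o)
Element = Tuple[Sequence[float], float]


def total_time(visits: Sequence[float]) -> float:
    """Assumed form of R: the sum of V_i over all visits to the element."""
    return sum(visits)


def complete_revisits(visits: Sequence[float], allotted: float) -> int:
    """Assumed complete revisits: how many times R covers R_o in full."""
    if allotted <= 0:
        return 0
    return floor(total_time(visits) / allotted)


def set_score(elements: List[Element]) -> float:
    """Assumed scaled score: total complete revisits S divided by k."""
    if not elements:
        return 0.0
    s = sum(complete_revisits(visits, allotted) for visits, allotted in elements)
    return s / len(elements)
```

Dividing by k is what keeps a course with many videos and few readings from looking visual simply because it contains more visual elements.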
Understanding: Understanding is the idea of how a learner learns and follows the course structure. Based on understanding, Felder and Silverman classify learners as global or sequential. Most MOOC courses follow a rigid structure, complete with monthly/weekly portioned modules and deadlines. The course is laid out in a modular manner where modules logically follow and precede other modules. Most courses are structured so as to become gradually more involved over multiple modules. Sequential learners are those who follow this pattern and interact with course elements in the order in which they are laid out by the platform. They have also been shown to be better at analysis, to be able to solve problems through linear reasoning, and to possess analytical, convergent thinking. Global learners, on the other hand, do not stick to the structure laid out and may learn modules out of order. They may also forgo the structure laid out within modules. They have been shown to approach problems more holistically and to possess divergent, usually multi-dimensional thinking.
Our approach to classifying learners as sequential or global is based on the idea of moves made by a learner. A move in this context is defined as a switch from one interactable element to another in which the learner stays on the new element for a certain duration. This is done in order to avoid counting skips, where a learner goes through multiple intermediate elements, skipping each, only to reach another one. Each move from an element to the one directly after it or directly before it is considered a sequentialMove. A globalMove, on the other hand, is any move that is not a sequential move. These two types of moves are tracked for each learner, along with the total moves made.
Processing: Processing deals with how perceived data is processed and understood. Felder and Silverman classify learners as either being active or reflective in the scope of processing information. An active learner is a learner who chooses to actively experiment, explore, and implement novel ideas. They find practical and skill-based teaching better and have been shown to work better in groups. Reflective learners, on the other hand, are inclined towards theory and concepts. They choose to reflect on the course matter, try to understand it thoroughly before implementing, and prefer working alone or with at most one person.
In our approach, we have tried to identify MOOC elements relating to the aforementioned factors. While most MOOCs do not have many group activities, we have tried to track discussion posts and forum submissions as parameters of activeScore. We have also tracked quiz/assignment interaction that is preferred by active learners along with specific course elements that teach application (listed under the hands-on category in many platforms). We have tracked interaction with most other course elements that aim to teach theory or concepts as measures of reflectiveScore. Again, revisits play an important role in measuring the two criteria, and complete revisits have been calculated to prevent erroneous results due to skips. Lastly, scores have been scaled to accommodate the disproportionate distribution of elements under each criterion. The detailed steps are shown in Algorithm 6.
Step 2: Find the number of moves made to the immediate next and previous elements by finding the element-id of the next element navigated to.
Step 3: If this corresponds to the ID of the immediate next or previous element, increase sequentialMove by 1.
Step 4: Find the number of global moves by subtracting the above from the total moves, and store it in globalMove.
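The move-counting steps above can be sketched as follows. The sketch assumes that the list of visited element ids has already been filtered so that it contains only stays long enough to count as moves; the function name `count_moves` and the input format are hypothetical.

```python
from typing import Sequence, Tuple


def count_moves(course_order: Sequence[str],
                visited: Sequence[str]) -> Tuple[int, int]:
    """Return (sequentialMove, globalMove) for one learner.

    `course_order` lists element ids in the order laid out by the platform;
    `visited` lists the element ids in the order the learner opened them.
    """
    position = {element_id: i for i, element_id in enumerate(course_order)}
    sequential = 0
    total = 0
    for previous, current in zip(visited, visited[1:]):
        if previous == current:
            continue  # staying on the same element is not a move
        total += 1
        # A move to the immediate next or previous element is sequential.
        if abs(position[current] - position[previous]) == 1:
            sequential += 1
    # Global moves are all remaining moves.
    return sequential, total - sequential
```

For example, with the order a, b, c, d, e and the visit trail a, b, c, a, b, e, d, the moves c to a and b to e are global and the other four are sequential.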
Step 3: Fetch interaction parameters of elements and divide into two sets: activeSet and reflectiveSet.

Experimentation and Result Analysis
A sample set of 50 learners was considered for this study, most of them 2nd and 3rd year undergraduate students. Each learner was enrolled in at least one course. Most learners had completed at least one course to different degrees. Data across ten courses were tracked, although the total number of courses enrolled in was greater than 10, because students had enrolled in courses that they had not started at the time of writing this paper. Personal information such as name, age, batch, department, etc. was not recorded in our system owing to privacy concerns; however, a record of the students participating in the study was physically maintained. The study lasted for a period of roughly 36 days, during which students interacted with the MOOC platform at least four times a week on average, a figure that grew significantly as course deadlines came near.

Results of predicted learning styles
The FSLSM learning styles predicted by our system for all 50 learners are shown in Fig. 7. For each of the four categories, Strong A represents a tendency towards Active, Sensing, Sequential, and Verbal behavior, respectively, whereas Strong B represents a tendency towards Reflective, Intuitive, Global, and Visual behavior, respectively. Moderate behavior, in either case, suggests that the learner has mild tendencies towards a specific behavior, whereas balanced behavior indicates that a learner might change their learning preferences frequently. We scale the scores found by our model to a range of -11 to +11 so that our results are comparable to the original ILS questionnaire. From a quick glance at Fig. 7, we see that in each category most students have a balanced preference, where their behavior can change depending on the context. This is consistent with most research findings in the field. Furthermore, we can see that for processing (active-reflective) and perception (sensing-intuitive), moderate behaviour dominated over strong behaviour on each side of the spectrum, while for input (visual-verbal) and understanding (sequential-global), strong dominated over moderate on one side of the spectrum and vice versa on the other side.
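The scaling to the -11 to +11 range and the balanced/moderate/strong labels can be illustrated with a short sketch. The text does not give the scaling formula, so the linear form used here, the cut points at 4 and 8 (chosen to separate the ILS bands 1-3, 5-7 and 9-11), and the function names are all assumptions.

```python
def scale_score(score_a: float, score_b: float) -> float:
    """Map two non-negative pole scores to [-11, +11].

    Assumed form: positive values lean towards pole A, negative values
    towards pole B, and the magnitude reflects the relative difference.
    """
    total = score_a + score_b
    if total == 0:
        return 0.0
    return 11.0 * (score_a - score_b) / total


def categorize(scaled: float) -> str:
    """Label a scaled score using the assumed cut points 4 and 8."""
    magnitude = abs(scaled)
    if magnitude < 4:
        return "Balanced"
    side = "A" if scaled > 0 else "B"
    strength = "Moderate" if magnitude < 8 else "Strong"
    return f"{strength} {side}"
```

Under these assumptions, pole scores of 3 and 1 give a scaled score of 5.5 and the label "Moderate A".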

Internal consistency reliability and correlation analysis
The internal reliability of a scale describes how closely related a set of items in a group are, and thus how consistent the scale used for scoring is. We use Cronbach's alpha to measure the reliability of the scale used to score the learning behaviour generated by our system. Higher positive values suggest more reliability, with a value below 0.5 being considered unacceptable. In our results, Cronbach's alpha ranges from 0.58 to 0.68, as shown in Table 3, which is considered acceptable.
The weakest reliability was found in the Visual/Verbal scale, whereas the highest reliability was obtained in the Active/Reflective scale. Cronbach's alpha (the coefficient alpha) for the visual-verbal scale lies in the range of 0.5 to 0.6, which can be considered poor. The other three scales are questionable and thus consistent with previous research. The poor internal reliability of the visual-verbal scale can be attributed to learners not having a strong preference for either visual or verbal content but an overall balanced preference, which is consistent with the way MOOC courses are designed.
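For reference, Cronbach's alpha for a scale with k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below computes it with the Python standard library; the data layout (one list of scores per item) is an assumption, since the text does not specify how items are defined for the behaviour-derived scales.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; items[i][j] is respondent j's score on item i."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))
```

Items that move together across respondents push alpha towards 1, while unrelated items push it towards 0.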
Further, we also evaluated inter-scale correlation and found no significant correlation among all the 4 scales used. This result is consistent with the diverse construct validity of ILS. All the correlation values obtained as shown in Table 4 are statistically significant.

Results of literature survey
We asked the participants who downloaded our extension to take the ILS literature survey. They were contacted through the physical list maintained. The participants were asked to complete the questionnaire made available on the internal portal and submit the result, as shown in Fig. 8. They were given a period of three days to take the test and submit the result. Fig. 9 depicts the results we obtained. Comparing the graphs in Fig. 7 and Fig. 9, the trend in our findings of learners having balanced behavior in each category is consistent with the findings of the actual ILS questionnaire taken by the students themselves. Furthermore, the moderate preference shown by learners under the perception and processing criteria in our results is also consistent with that of the literature survey.

Comparison of prediction results with questionnaire results
As all of our learners are anonymous, learner ids are used to obtain the results for each participant. We measure and compare the learning styles in each category as given by both our prediction system and the ILS literature survey. To assess the accuracy of the results, we used the chi-square test to identify whether the obtained results (results of the prediction system) differ significantly from the actual results (results of the ILS literature survey), as shown in equation 4. The chi-square (χ2) statistic measures how expected values compare with the values actually observed.
The following hypothesis statements were set for the test.
Alternate Hypothesis: There is a significant difference between the obtained results and the actual results.
Null Hypothesis: There is no significant difference between the obtained results and the actual results.
The threshold for the p-value was set at 0.05; hence the null hypothesis is rejected at the 0.05 significance level. The result of the chi-square test is:
• Computed chi-square value: 27.55
• Degree of freedom: 1
• Level of significance: 5%
• Tabulated chi-square value: 3.841
The computed value is greater than the tabulated value, which indicates that the obtained results are different from the actual results. That is, there is a difference between the learning styles identified by the prediction system and the learning styles obtained by the ILS literature survey approach. This result shows that the ILS questionnaire approach is not sufficient to establish permanent learning styles for the learners, as learning styles change dynamically over time while learners go through different courses in the MOOC environment.
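The decision rule used above, comparing a computed chi-square statistic with the tabulated value 3.841 for one degree of freedom at the 5% level, can be sketched as follows. The text does not state how the observed counts were tabulated, so the 2x2 contingency layout and the function names are assumptions for illustration only.

```python
from typing import Sequence

CRITICAL_DF1_5PCT = 3.841  # tabulated value quoted in the text


def chi_square(table: Sequence[Sequence[float]]) -> float:
    """Pearson chi-square statistic for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


def rejects_null(statistic: float,
                 critical: float = CRITICAL_DF1_5PCT) -> bool:
    """Reject the null hypothesis when the statistic exceeds the critical value."""
    return statistic > critical
```

With the reported statistic of 27.55, `rejects_null(27.55)` is true, which matches the rejection of the null hypothesis stated above.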
A comparative result of the learning styles identified using both approaches is shown in
This paper focuses on finding a learner's learning style by analyzing the interaction with the MOOC platform based on the FSLSM model through an in-built browser extension. The proposed novel browser extension is lightweight and is useful for capturing usage data at the learner side, which can then be analyzed to identify the learning styles of the learners dynamically. This extension anonymously and securely tracks and maintains the learner data generated on interaction with the platform and records it in a semantic learner model, or learner ontology. The ontology so created is then operated upon by learning style prediction algorithms that determine the learning style of a learner going through the MOOC. From our findings, we see that the predicted learning style closely follows that from the Index of Learning Styles questionnaire, and thus our system produces accurate predictions.
In the future, we aim to track learning style across the different MOOC platforms that are accessed by the learner, thus putting the concepts behind linked data to use. Since most users use a wide array of MOOC platforms, we believe that predictions made from data captured from multiple such platforms would be more accurate. Furthermore, the effect of MOOC platforms on a learner's learning style, or the change in learning style across platforms, is also an important area of research. Secondly, we plan on developing better metrics to predict learning style for the processing criterion, incorporating elements beyond group projects, forum discussions and peer-graded assignments for better accuracy. Thirdly, we plan on updating the browser extension with GUI elements that can help learners better understand how the extension operates, what type of data is captured and how it is processed. We plan on adding pop-up notifications and an FAQ section to enhance the user experience. Lastly, we plan on using the predicted learning styles to make course-element level predictions, thus trying to address the lack of personalization and the resulting high drop-out rates. We believe that this will help the modern learner to learn and explore more courses in a more optimized and personalized fashion.