Discovering Trends of Mobile Learning Research Using Topic Modelling Approach

This paper reports a map of topics identified in mobile learning research. Mobile learning is an emerging paradigm in education, and its adoption by educational institutions is growing rapidly. Students already use it and are familiar with it. Publications from 2007 to 2018 were examined. Two approaches were employed to identify themes: word clouds and Latent Dirichlet Allocation. The results show that mobile learning research is shifting from a development paradigm toward an optimization paradigm. The findings inform researchers and practitioners about research topic trends in mobile learning and can therefore serve as a basis for designing future mobile learning systems.
Keywords: Mobile learning, topic modeling, mobile devices, latent Dirichlet allocation.


Introduction
Mobile learning is an emerging research area that continues to grow, and many studies have been conducted in it. Mobile learning can be defined as the use of mobile devices in an educational setting. The devices serve as one of the primary means of delivering learning content, which is one of the main factors in mobile learning adoption [1]. With this kind of delivery, students or learners can access the content easily, anywhere and anytime, using their own devices [2] [6]. In general, students are satisfied with this kind of content delivery [7]. It allows them to stay engaged in their courses [8]. It helps them master complex learning activities more easily, even in their early educational phase [9]. Furthermore, the use of technology in learning activities motivates students to be more creative, think more critically, and develop better problem-solving ability [10]. Another advantage is that the learning content can be distributed over a larger area [11]. The intuitive user interface of most mobile applications lets students and other users start interacting with them immediately [12].
There are three dimensions of mobility to consider when an institution intends to adopt mobile learning: mobility of learners, mobility of learning, and mobility of technology [3]. According to [3], the mobility of learners means that students have control over the content or course material, because they use the learning application away from their class and teacher. In addition, they may use it while doing other activities. The mobility of learning means that the learning material should be adjusted to fit the characteristics of mobile technology. This is important for increasing student engagement in mobile learning activities [9]. Last and most important is the mobility of technology, which means that each mobile feature of the devices can be utilized and adapted to the learning design. These features include the camera, location awareness [13], [14], wireless connectivity [15], and the touchscreen [16]. The three dimensions can be used to develop an understanding of the mobile learning environment and the learning behavior of students. Moreover, they can help designers develop learning applications. The objective is to fit the learning paradigm as closely as possible to the mobile environment.
However, while various issues in the mobile learning context have been described in many articles, few articles discuss its topic trends. Such a discussion is important because it gives researchers and developers of mobile learning applications insight for planning future research and applications. Mobile learning is an emerging field of research, and some areas within it have not yet been explored. One example is an evaluation method for assessing the impact of using mobile devices for learning on the academic performance of students or users [12]. Another is the question of how effective mobile learning applications are at increasing students' performance [17]. Therefore, topic modeling can help identify more research opportunities in this era of pervasive mobile devices. The current article proposes the use of a topic modeling approach to reveal and identify the topic trends of mobile learning research.
Topic modeling is an approach that seeks to find hidden topics in a set of text data [18]. With topic modeling, we can find patterns and relationships among the words used in documents and use them to link information and topics across documents [19]. One topic modeling method is Latent Dirichlet Allocation (LDA). It is a popular approach that researchers in various areas have widely used to find latent topics in large amounts of data; in this study, the data are scientific papers and publications. Broadly, LDA treats a topic as a probability distribution over the words in a particular set of documents [20]. This paper is structured in five sections. Section one describes the motivation. Section two describes related work. Section three explains the methods used in this study. Section four describes the results of the research. Section five closes the paper with concluding remarks.

Topic Modeling
In this section, we discuss previous studies that have used topic modeling as a tool to discover topics in scientific papers. These works indicate that Latent Dirichlet Allocation (LDA) is the most popular method for finding hidden topics in scientific articles [10]. In addition, the LDA method shows promising performance and the ability to discover latent topics when dealing with large document collections [5].
With regard to popularity, the LDA method has proven useful for document analysis in several research fields. Paul and Girju [21] employed LDA topic modeling and a novel classifier to study scientific research within and across three research fields: Linguistics, Computational Linguistics, and Education. In particular, their research investigated how topics changed over time within each field, the relationships between topics, temporal correlations and topic influences across fields, and topic trends. Nie and Sun [22] performed clustering using the Latent Dirichlet Allocation method to identify research trends in design research.
They applied a two-dimensional text mining approach, using bibliometric and network analysis, to obtain research trends from different perspectives. Amado, et al [23] presented a research literature analysis on big data in marketing by applying a text mining approach. Sharma, et al [24] discovered and analyzed the research trends in machine learning by investigating the machine learning literature from 1968 to 2017. They employed several methods to identify the topics, namely LSA (Latent Semantic Analysis), LDA (Latent Dirichlet Allocation), and LDA_CM (Latent Dirichlet Allocation with Coherence Model). Westerlund, et al [25] applied topic modeling using LDA to the Technology Innovation Management Review. The study identified seven topics derived from their associated keywords.
Moreover, LDA can be used to analyze a large number of documents. Amado, et al [23] collected a total of 1,560 articles published during 2010-2015 from ScienceDirect. They performed topic modeling using Latent Dirichlet Allocation to provide a summarized overview of the articles. Furthermore, they grouped the articles into logical topics based on the relevant keywords. Sun & Yin [26] sought research themes and trends in the field of transportation. Their empirical analysis covered 17,163 articles published in the top 22 transportation journals from 1990 to 2015. In that study, the LDA algorithm was applied to the abstracts of the journal articles to infer 50 topics.
Having considered these two factors, namely the popularity of LDA and its ability to handle large document collections, the current study proposes the use of LDA to reveal the prominent topics in the field of mobile learning.

Data collection
The data source for this research was publications in the mobile learning area published between 2007 and 2018. The search keyword was "mobile learning". In total, 146 papers were selected: 51 from ScienceDirect and 95 from Scopus. The inclusion criteria for these articles are listed in Table 1.
Table 1. Inclusion criteria
1 The subject of the articles is mobile learning.
2 The articles are published in a peer-reviewed journal.
3 The articles are published between 2007 and 2018.

Data preprocessing
This stage aimed to clean the data for the subsequent processing steps. All the papers were in PDF format, so they had to be converted into *.txt format before their contents could be extracted. In this stage, we performed several tasks to prepare the data by removing irrelevant elements from the texts. For example, we removed stop words (and, or, it, etc.), non-ASCII symbols, URLs, punctuation, and numbers. In addition, we applied case folding to convert all letters to lowercase. The complete steps of data preprocessing are illustrated in Figure 1.
Fig. 1. Steps of text preparation
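The cleaning tasks listed in this stage can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the stop-word list, the URL pattern, the order of the steps, and the function name `preprocess` are assumptions made here, not details reported by the study.

```python
import re
import string

# Tiny illustrative stop-word list (an assumption; the study's list is not given).
STOP_WORDS = {"and", "or", "it", "the", "a", "an", "of", "to", "in", "is", "by"}

# Hypothetical pattern for URLs starting with http(s):// or www.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")

# Translation table that turns every ASCII punctuation mark into a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text):
    """Return a list of cleaned, lowercase tokens from raw text."""
    text = text.lower()                                    # case folding
    text = URL_PATTERN.sub(" ", text)                      # remove URLs
    text = text.encode("ascii", "ignore").decode("ascii")  # drop non-ASCII symbols
    text = re.sub(r"\d+", " ", text)                       # remove numbers
    text = text.translate(PUNCT_TABLE)                     # remove punctuation
    return [tok for tok in text.split() if tok not in STOP_WORDS]


sample = "Mobile Learning™ grew by 20%, see https://example.org/report, and it is popular!"
tokens = preprocess(sample)
print(tokens)
```

URLs are removed before punctuation so that a link is dropped as a whole instead of being split into stray word fragments.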

Data analysis
In this research, we utilized LDA (Latent Dirichlet Allocation) to build our topic model. First, the data source was read and cleaned. Clean data here means data that contain no punctuation or stop words. These elements were removed because they have no significant impact on the topic modeling process. Second, the list of words was built, followed by document splitting and bigram and trigram modeling. Finally, we built a dictionary and a corpus based on the data source. This corpus served as the basis for running the topic modeling process. The LDA process representation is depicted in Figure 2.
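As a rough illustration of the bigram, dictionary, and corpus steps described above, the following standard-library sketch merges frequent adjacent word pairs and then builds a bag-of-words corpus. The frequency threshold, the underscore joining convention, and all function names are assumptions for illustration; the study does not state how phrases were detected or which toolkit was used.

```python
from collections import Counter


def merge_frequent_bigrams(docs, min_count=2):
    """Join adjacent word pairs seen at least min_count times into one token."""
    pair_counts = Counter(pair for doc in docs for pair in zip(doc, doc[1:]))
    frequent = {pair for pair, n in pair_counts.items() if n >= min_count}
    merged_docs = []
    for doc in docs:
        merged, i = [], 0
        while i < len(doc):
            if i + 1 < len(doc) and (doc[i], doc[i + 1]) in frequent:
                merged.append(doc[i] + "_" + doc[i + 1])  # e.g. mobile_learning
                i += 2
            else:
                merged.append(doc[i])
                i += 1
        merged_docs.append(merged)
    return merged_docs


def build_dictionary(docs):
    """Map each distinct token to an integer id (sorted for reproducibility)."""
    vocab = sorted({tok for doc in docs for tok in doc})
    return {tok: idx for idx, tok in enumerate(vocab)}


def build_corpus(docs, dictionary):
    """Represent each document as a sorted list of (token_id, count) pairs."""
    return [sorted(Counter(dictionary[tok] for tok in doc).items()) for doc in docs]


# Toy documents, already cleaned and tokenized (made up for this example).
docs = [
    ["mobile", "learning", "supports", "students"],
    ["students", "adopt", "mobile", "learning"],
    ["teachers", "design", "learning", "content"],
]
docs = merge_frequent_bigrams(docs)
dictionary = build_dictionary(docs)
corpus = build_corpus(docs, dictionary)
print(docs[0], corpus[0])
```

Running `merge_frequent_bigrams` a second time on its own output would approximate trigram modeling, since an already merged pair can then join with a neighboring word.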
According to [27], the LDA model is represented as a probabilistic graphical model with three levels. Parameters α and β are corpus-level parameters, assumed to be sampled once in the process of generating a corpus. Parameter α governs the distribution of topics in a document: a greater value of α means that each document tends to discuss more topics. Parameter β governs the distribution of words in each topic: a lower value of β means that the words in a topic are more specific. Variable θ is a document-level variable; the plate labeled M indicates that it is sampled once for each of the M documents. θ represents the distribution of topics in a document, and the more it concentrates on a few topics, the more specific the document is. Variables z and w are word-level variables; the plate labeled N indicates that they are sampled once for each of the N words in a document. Variable z is the topic assigned to a particular word in a document, while variable w is the observed word itself, drawn from the topic that z selects.
Fig. 2. LDA representation [27]
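The three levels can be written out explicitly. The block below is a sketch that assumes [27] follows the standard formulation of LDA; it shows the generative steps for a single document with N words, followed by the resulting joint distribution, using only the symbols α, β, θ, z, and w introduced above.

```latex
% Generative process for one document (repeated for each of the M documents)
\theta \sim \mathrm{Dirichlet}(\alpha), \qquad
z_n \mid \theta \sim \mathrm{Multinomial}(\theta), \qquad
w_n \mid z_n, \beta \sim p(w_n \mid z_n, \beta), \qquad n = 1, \dots, N

% Joint distribution of the topic mixture, topic assignments, and words
p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta)
  = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta)
```

The product over n corresponds to the word-level plate N, and repeating the whole expression for every document corresponds to the document-level plate M.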

Result and Discussion
To perform topic modeling, this research ran the LDA method with the number of topics ranging from 5 to 50. The purpose of testing this range was to find the best number of topics based on the coherence value produced by each model. Figure 3 illustrates the relationship between the number of topics and the coherence value produced.
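The selection procedure can be sketched as follows. The study does not say which coherence measure was used, so this example assumes a UMass-style coherence based on document co-occurrence counts; the function names, the toy documents, and the coherence scores passed to `best_topic_count` are all made up for illustration and are not results from this research.

```python
import math
from itertools import combinations


def umass_coherence(top_words, docs):
    """UMass-style coherence of one topic's top words over tokenized docs.

    top_words must be ordered from most to least probable, and every word
    must occur in at least one document.
    """
    doc_sets = [set(doc) for doc in docs]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for s in doc_sets if all(w in s for w in words))

    score = 0.0
    # Each pair is (earlier, later) in rank order; the count of the
    # earlier, more probable word is the denominator.
    for earlier, later in combinations(top_words, 2):
        score += math.log((doc_freq(earlier, later) + 1) / doc_freq(earlier))
    return score


def best_topic_count(coherence_by_k):
    """Return the topic count with the highest coherence (ties: smallest k)."""
    return max(sorted(coherence_by_k), key=lambda k: coherence_by_k[k])


# Toy tokenized documents (made up for this example).
docs = [
    ["mobile", "learning", "students"],
    ["mobile", "learning", "game"],
    ["game", "students", "teacher"],
]
coherent = umass_coherence(["mobile", "learning"], docs)    # words co-occur
incoherent = umass_coherence(["mobile", "teacher"], docs)   # words never co-occur

# Hypothetical average coherence per candidate topic count (not real results).
scores = {5: -2.0, 10: -1.2, 15: -1.5}
print(coherent > incoherent, best_topic_count(scores))
```

In practice, one model would be trained for each candidate number of topics, its topics scored with the chosen coherence measure, and the averaged scores passed to a selection step like `best_topic_count`.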

Fig. 3. Correlation between the number of topics and coherence value
Figure 3 shows that the best coherence value was obtained with 50 topics. Therefore, the topic modeling process was based on these 50 topics. The visualization of the 50 topics generated from the topic modeling is shown in Figure 4. It depicts the level of similarity among topics. One circle represents one topic. If two circles are located close to each other, the topics they represent are close in meaning. The left panel shows 30 of 50 words that appeared most often in the corpus. Each topic is characterized by its most frequent words. Table 2 provides the details of these words for the selected topics. Of the 50 topics, we selected the 25 topics that were most relevant to the research goal. These topics were then grouped into three categories representing the mobile learning dimensions [3], namely technology, learning, and learners.
First, the most striking words in the technology topic group relate to the technological aspects of mobile learning applications, such as adaptation and recommendation. This can be interpreted to mean that these two technologies can be used to build mobile learning applications that are adaptive and suited to the needs of each student. In addition, gesture and size are also among the most striking words in this group, which means that mobile learning applications must take into account, and take advantage of, the specific capabilities of mobile technology that differ from those of a personal computer (PC).
Secondly, the learning topic group relates to the features or content of mobile learning applications, namely games, stories, discussions, and sharing. This means that these features are the main ones that should characterize mobile learning applications. This differs somewhat from the conventional learning method, in which teachers usually provide lectures, material, and assignments while students answer the questions in the assignments.
Lastly, the remaining topic group relates to learners. Its topics concern membership in the application, which means that all the parties involved in mobile learning, i.e. both students and teachers, can be incorporated into a membership mechanism. This will increase their engagement with the application.
Based on the topic modeling results, several points merit further discussion. First, most of the publications discuss the technological aspect. As Figure 5 shows, more than 50% of the publications discuss this aspect. This is not surprising, as the use of mobile devices in educational activities has only recently emerged and there is still wide-open opportunity to improve the technological side of this field. For example, the industry has only just begun to develop educational versions of its apps [12]. Moreover, application quality still needs to be improved [28], [29]. Secondly, although the use of mobile technologies for learning and teaching seems potentially advantageous, there is a need to model methods for assessing its benefits and to identify the factors that affect them [30]. In addition, it would be more beneficial if mobile learning technology were aligned with the learning process in order to achieve more effective learning results [31]. Lastly, as mobile learning becomes more pervasive and gains users from various backgrounds, it is also worth considering the form of mobile devices used for learning by special groups of users, for example, people with special needs [32].

Conclusion
Mobile learning is an emerging area of research in which many publications on various topics have appeared [33]. To understand the emergence of research in this area more deeply, a review study is needed. Topic modeling can be employed to gain insight from previous publications.
Based on the topic modeling analysis, several topics were identified as the main themes discussed in the literature published between 2007 and 2018. Most of the topics concern the technological aspect of mobile learning. As this field grows, there will be more opportunities to study the aspects that relate to the implementation of mobile learning and its context.
The trend revealed by the topic modeling analysis of publications from 2007 to 2018 indicates that the research paradigm has begun to shift toward optimizing the use of mobile devices in the learning process. Therefore, future research can be oriented more toward the development of theories or learning frameworks that use mobile devices. Such research could range from the early design of a mobile learning application to the evaluation of its implementation.