A Bibliometric Analysis of Computer-assisted English Learning from 2001 to 2020

The aim of this study was to reveal the hotspots and frontiers of computer-assisted English learning (CAEL) studies indexed in the EI Compendex database from 2001 to 2020 via bibliometric analysis. The publication output has grown exponentially over the past two decades and is likely to continue growing in the next several years. China occupied the leading position, Lecture Notes in Computer Science was the most prolific journal, and Deyi Xiong was the most productive author. Keyword analysis was carried out with VOSviewer. Our results show that “computer aided instruction”, “computer aided language translation” and “learning systems” were the most frequently used keywords. CAEL studies mainly addressed five dimensions (technology, learners, teaching, English acquisition and testing). The findings have implications for English language instructors: teaching methods and modes should be adjusted in step with technological development.
Keywords: Computer-assisted English learning, bibliometric analysis, VOSviewer


Introduction
Computer-assisted language learning (CALL) can be traced back to the 1960s in America [1], and since then it has changed constantly with the rapid development of computer network technology. The popularization of computer software has promoted the reform of teaching models, especially in English listening, speaking, reading and writing. A considerable number of scholars have focused on computer-assisted English learning (CAEL) [2], and some of them have reviewed its development from different angles. Patino and Romero [3] concluded from a literature review that using a video game as a classroom learning resource was of great significance for students learning English as a second language. Sharifi et al. [4] summarized the retrospect and prospect of CAEL through a meta-analysis, pointing out that learners using computer-assisted tools generally performed better than peers who received only traditional face-to-face instruction in their English language courses. Saeed et al. [5] reviewed previous studies on learners' interactional feedback exchanges in face-to-face and computer-assisted peer review of English writing.
Although the studies described above have contributed to the understanding of the development of CAEL, few attempts have been made to gather data and summarize the scientific production of CAEL research with bibliometric methods. Based on the application of mathematical and statistical methods, bibliometric analysis can objectively measure research productivity and the contribution of publications to the advancement of knowledge in an academic discipline [6]. In this paper, we performed a bibliometric analysis of CAEL studies in terms of countries/regions, funding sponsors, journals, authors and keywords. It will help relevant scholars track the development trends of research topics at different stages. In particular, the present study addresses the following research questions:
1. What is the publication trend in the area of CAEL studies?
2. Which countries/regions, funding sponsors, journals, and authors contributed most to CAEL studies?
3. What are the differences in productive countries/regions and authors between different periods?
4. What is the distribution of the most-used keywords in different periods? What are the most frequently explored themes and topics related to CAEL studies?

Methods
The documents used for analysis were extracted from the EI Compendex database. We used "computer-assisted/aided/based English learning" as the subject/title/abstract search term. The time span was set to 2001-2020. After excluding 14 unsuitable records (4 retracted articles and 10 articles in press), a total of 2,157 documents were retained for bibliometric analysis. The results were downloaded in RIS format.
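As an illustration of how an RIS export of this kind could be read and restricted to the 2001-2020 time span, a minimal Python sketch follows. It is not the procedure used in this study. It assumes the common RIS layout (a two-character tag, two spaces, a hyphen, then the value, with `ER` closing each record) and assumes that the publication year is stored under the `PY` tag; the function names and the sample records are hypothetical.

```python
import re
from collections import defaultdict

# Assumed RIS layout: "AU  - Smith, J." (tag, two spaces, hyphen, space, value).
TAG_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")

def parse_ris(text):
    """Split RIS text into records; each record maps a tag to a list of values."""
    records, current = [], defaultdict(list)
    for line in text.splitlines():
        match = TAG_LINE.match(line.rstrip())
        if not match:
            continue  # this sketch skips blank and continuation lines
        tag, value = match.groups()
        if tag == "ER":  # end-of-record marker
            records.append(dict(current))
            current = defaultdict(list)
        else:
            current[tag].append(value.strip())
    return records

def in_time_span(record, first=2001, last=2020):
    """Keep records whose PY value (assumed to start with the year) is in range."""
    years = record.get("PY", [])
    return bool(years) and years[0][:4].isdigit() and first <= int(years[0][:4]) <= last

# Hypothetical two-record export, for illustration only.
sample = "\n".join([
    "TY  - JOUR", "AU  - Author, A.", "AU  - Author, B.",
    "KW  - e-learning", "PY  - 2019", "ER  - ",
    "TY  - CONF", "AU  - Author, C.", "PY  - 1999", "ER  - ",
])
records = [r for r in parse_ris(sample) if in_time_span(r)]
print(len(records), records[0]["AU"])
```

How retracted or in-press items are flagged in a real export is not specified here, so that exclusion step is left out of the sketch.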
In this study, the analysis of publication output, countries/regions, funding sponsors, journals, authors and keywords was carried out with Excel 2016 and VOSviewer 1.6.12. Excel was used to organize the data and draw diagrams. VOSviewer, a free visualization tool developed by van Eck and Waltman at Leiden University [7], was used to generate the co-authorship network of authors and the co-occurrence network of keywords. A co-authorship relationship means that two or more authors wrote a document together. A co-occurrence relationship means that two keywords both occurred in the same document [8].
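The two relationships defined above amount to counting unordered pairs within each document. The Python sketch below shows that idea with plain (full) counting on hypothetical keyword lists; it does not reproduce VOSviewer's own counting or normalization options, and `pair_counts` is an illustrative name rather than part of any tool.

```python
from collections import Counter
from itertools import combinations

def pair_counts(item_lists):
    """Count how many documents each unordered pair of items shares."""
    counts = Counter()
    for items in item_lists:
        # sorted(set(...)) drops duplicates within a document and fixes pair order
        for a, b in combinations(sorted(set(items)), 2):
            counts[(a, b)] += 1
    return counts

# Hypothetical keyword lists, one list per document.
docs_keywords = [
    ["e-learning", "students", "computer aided instruction"],
    ["e-learning", "computer aided instruction"],
    ["deep learning"],
]
links = pair_counts(docs_keywords)
print(links[("computer aided instruction", "e-learning")])
```

Passing author lists instead of keyword lists gives co-authorship counts in the same way; a document with a single author or keyword contributes no pair.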
this domain. Notably, global output increased 16.21-fold from 2001 (14 publications) to 2020 (227), which offers further evidence of the growing interest in CAEL. Furthermore, 8 different languages were found in the retrieved documents. English (2,135, 98.98%) was the most frequently used language, followed by Chinese (12, 0.56%) and Japanese (5, 0.23%). French, Korean, Portuguese, Russian and Spanish each appeared only once.
iJET -Vol. 16, No. 14, 2021

Countries / regions and funding sponsors
The results show that 85 countries/regions contributed to CAEL studies. China (589 publications, 27.31% of total output), United States (296, 13.72%), Taiwan (230, 10.66%), Japan (180, 8.34%), India (92, 4.27%), United Kingdom (83, 3.85%), Germany (62, 2.87%), Republic of Korea (54, 2.50%), Hong Kong (45, 2.09%) and Australia (44, 2.04%) were the top 10 countries/regions by number of documents. Table 1 displays the most productive countries/regions in CAEL across the four 5-year periods of the past two decades. Only 6 countries/regions (China, United States, Japan, Taiwan, United Kingdom and Germany) remained among the leaders in every period. The United States dominated the first five years with 25 publications, but this advantage faded in the subsequent 3 periods: its share declined sharply from 21.55% in 2001-2005 to 13.51% in 2016-2020. In contrast, China rose substantially, from only 7 publications (6.03%) in 2001-2005 to 324 publications (33.40%) in 2016-2020, a more than fivefold increase in its share of output. Moreover, India can be regarded as a rising star: it grew strongly over the past decade and ranked fifth with a total of 92 publications by the end of 2020.
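The percentages above follow directly from the document counts. As a quick arithmetic check, the Python sketch below recomputes the shares of the four leading countries/regions from the total of 2,157 documents and the change in China's share between the first and last periods. All figures are taken from the text; the variable and function names are illustrative.

```python
TOTAL_DOCUMENTS = 2157  # documents retained for analysis (see Methods)

# Publication counts of the four leading countries/regions, as reported above.
top_countries = {"China": 589, "United States": 296, "Taiwan": 230, "Japan": 180}

def share(count, total=TOTAL_DOCUMENTS):
    """Percentage of the total output, rounded to two decimals."""
    return round(100 * count / total, 2)

shares = {name: share(n) for name, n in top_countries.items()}

# Ratio of China's share in 2016-2020 (33.40%) to its share in 2001-2005 (6.03%).
china_share_ratio = 33.40 / 6.03

print(shares)
print(round(china_share_ratio, 2))
```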
There were 151 funding sponsors in total. Figure 3 illustrates the top 10 funding sponsors. The National Natural Science Foundation of China (n = 86) contributed most, followed by the National Science Foundation (n = 33) and the Japan Society for the Promotion of Science (n = 32). It can be deduced that the majority of publications by Chinese scholars received research funding from the government [9], with the National Natural Science Foundation of China and the National Basic Research Program of China (973 Program) being the most prominent sources. This indicates that government funding and support are indispensable for promoting research productivity [9,10].

Journals and subject categories
The retrieved documents were published in 83 journals [6]. Table 2 lists the top 10 journals by number of publications. Lecture Notes in Computer Science ranked first (209 documents), followed by Computers and Education (70) and Communications in Computer and Information Science (51). The top 10 journals published 563 articles, accounting for 26.10% of the total. Additionally, these studies involved 160 subject categories based on the classification codes of EI Compendex. Figure 4 presents the top 10 subject categories. Computer applications (1628, 75.48%) was the most common category in this domain, followed by computer theory (599, 27.77%), education (364, 16.88%), computer software, data handling and applications (349, 16.18%) and data processing and image processing (317, 14.70%).

Authors
A total of 4,829 authors took part in CAEL studies. Deyi Xiong (21 documents), Jinsong Su (17), Eiichiro Sumita (17), Min Zhang (17), Yang Liu (15), Andy Way (15), Ming Zhou (14), Tie-Yan Liu (13), Tao Qin (13) and Masao Utiyama (12) were the top 10 authors by number of publications in the past twenty years. Table 3 lists, in rank order, the most prolific authors in CAEL studies across the four 5-year periods. Dantsuji et al. [11] provided insight into the recognition and detection of English pronunciation based on acoustic models in 2002. Kurimo et al. [12,13] focused on designing statistical machine learning algorithms to provide morpheme analyses for words in 2009. Ming et al. [14] analyzed the opportunities and challenges of developing an effective language learning search engine such as Engkoo in 2011. Su et al. [15] explored asynchronous bi-directional decoding for neural machine translation by introducing a backward decoder in 2019. Moreover, the most productive authors varied across periods: apart from Deyi Xiong and Eiichiro Sumita, no author appeared on the list more than once. This finding shows that the top researchers in this field changed rapidly, probably in part because of continuous technological innovation [4].
As shown in Figure 5, the co-authorship network of authors was generated with VOSviewer. The minimum number of documents per author was set to 5, and 102 authors met this threshold. However, only 40 of them were connected to each other; in other words, the remaining 62 were independent authors. There are 5 clusters and 92 links in Figure 5. Each node represents an author. The thickness of a link corresponds to the level of cooperation between the two authors. Clusters of different colors represent different research groups. For instance, the green cluster centers on Eiichiro Sumita, who had 17 publications with 8 co-authors. Masao Utiyama was his closest partner in CAEL studies (link strength = 12). Furthermore, the links between clusters show that some authors in different research groups also collaborated with each other. In addition, all of the top 10 productive authors of the past two decades except Andy Way were included in the network map, indicating that cooperation boosts publication output to some extent.
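The thresholding and connectivity steps described above can be sketched in Python as follows: authors below the document threshold are dropped, and a breadth-first search then finds the largest connected set among those who remain. This is an illustrative reconstruction on hypothetical data (authors "A" to "E"), not VOSviewer's implementation, and `largest_connected_set` is an assumed name.

```python
from collections import defaultdict, deque

def largest_connected_set(doc_counts, links, min_docs=5):
    """Return (authors meeting the threshold, largest connected group among them)."""
    kept = {a for a, n in doc_counts.items() if n >= min_docs}
    neighbours = defaultdict(set)
    for a, b in links:
        if a in kept and b in kept:  # ignore links to authors below the threshold
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, best = set(), set()
    for start in sorted(kept):
        if start in seen:
            continue
        # Breadth-first search collects everyone reachable from `start`.
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for other in neighbours[node] - component:
                component.add(other)
                queue.append(other)
        seen |= component
        if len(component) > len(best):
            best = component
    return kept, best

# Hypothetical document counts and co-authorship links.
doc_counts = {"A": 17, "B": 12, "C": 6, "D": 5, "E": 3}
links = [("A", "B"), ("B", "C"), ("D", "E")]
kept, best = largest_connected_set(doc_counts, links)
print(sorted(kept), sorted(best), sorted(kept - best))
```

In this toy case "D" meets the threshold but falls outside the connected set because its only co-author does not, which mirrors how an author can qualify for the map yet remain unconnected.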

Keywords
Keywords can effectively reflect the hot topics in scientific disciplines [16]. Of the 957 keywords obtained in total, 404 (42.22%) appeared only once. Table 4 lists the most used keywords in the four 5-year periods. The frequent keywords were almost the same across periods, with a few exceptions displayed in blue ("speech recognition", "websites", "deep learning" and "semantics"), although the ranking of some keywords changed slightly [17].
The keyword co-occurrence network of CAEL was generated with VOSviewer (Figure 6). The minimum number of occurrences of a keyword was set to 28, and the 50 most frequent keywords were then selected. The links of a node define its relationships with other keywords [18]. The thicker the link, the more often the two keywords co-occur [16]. For instance, "computer aided instruction" had the thickest links with "e-learning" (link strength = 583), "students" (550) and "learning systems" (412), indicating that "computer aided instruction" frequently appeared together with these keywords.
Nodes with the same color form a cluster. As shown in Figure 6, the keywords were classified into 3 clusters. The size of a node represents the weight of the keyword. The most-used keywords were "computer aided instruction" (1180 occurrences), "computer aided language translation" (746), "learning systems" (689), "e-learning" (679), "students" (638), "computational linguistics" (553), "teaching" (315), "natural language processing systems" (300), "linguistics" (252) and "curricula" (131). These observations show that CAEL studies mostly focused on the English learning environment, technology applications, learners, teacher education and learning contents. Figure 7 illustrates the distribution of keywords based on their average time of appearance. The deeper the purple of a node, the earlier the keyword appeared; the brighter the yellow, the later it appeared [19]. It is apparent that recent articles mainly focused on "transfer learning" (avg. pub. year = 2018.40, occurrences = 30), "deep learning" (2018.25, 96) and "signal encoding" (2018.17, 35).
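The two quantities reported for each keyword above, its number of occurrences and its average publication year, can be derived as in the following Python sketch. The documents are hypothetical, a keyword is counted once per document, and the occurrence threshold is a parameter (28 in this study, 2 in the toy example); `keyword_stats` is an illustrative name and the sketch is not claimed to match VOSviewer's internal computation.

```python
from collections import defaultdict

def keyword_stats(documents, min_occurrences=1):
    """Map each keyword to (occurrences, average publication year)."""
    years = defaultdict(list)
    for year, keywords in documents:
        for kw in set(keywords):  # count a keyword once per document
            years[kw].append(year)
    return {
        kw: (len(ys), round(sum(ys) / len(ys), 2))
        for kw, ys in years.items()
        if len(ys) >= min_occurrences
    }

# Hypothetical (publication year, keywords) pairs.
documents = [
    (2017, ["deep learning", "e-learning"]),
    (2019, ["deep learning", "transfer learning"]),
    (2004, ["e-learning", "speech recognition"]),
    (2019, ["deep learning"]),
]
stats = keyword_stats(documents, min_occurrences=2)
print(stats)
```

A keyword whose average publication year is late relative to the others would be drawn toward the yellow end of an overlay map such as Figure 7.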

Discussion
In this study, some keywords recurred across the four 5-year periods of the past two decades, reflecting the common thematic areas in CAEL studies [17]. The keyword co-occurrence analysis reveals 5 active research topics:
• Emerging technology: The information technology referred to in CAEL studies can be classified into 3 main types: individual study tools (e.g. computer games and electronic dictionaries), classroom-based multimedia devices (e.g. digital videos and interactive whiteboards) and network-based social computing (e.g. Twitter and blogs), all of which facilitated the establishment of an English learning environment.
• Learners: The computer-assisted learning mode shifted the traditional teacher-centered pattern to a student-centered pattern [20]. Increasing importance was attached to autonomous learning and individual differences, as students were expected to be active participants in the learning process [21,22]. Additionally, a considerable number of documents took students' learning motivation, attitudes, anxiety and behaviors into account [23,24]. Improving learners' enthusiasm for English learning was an important issue in this field.
• Teaching: The role of the teacher changed with advances in computer technology [25]. Teacher education and personnel training were urgently required to improve teaching methods. The blended learning approach, which was considered more effective than the traditional learning approach, gradually came to play an important role in modern education [24].
• Language acquisition: On the one hand, most CAEL studies were dedicated to improving learners' basic language skills (listening, speaking, reading and writing) [1,23,26,27], with an emphasis on speech communication and human-computer interaction. On the other hand, studies focused on mastering knowledge of pronunciation, grammar and vocabulary, with semantics and syntax as the main keywords.
• Testing: Many authors evaluated the effectiveness of English teaching and learning through contrastive experiments or surveys. Based on a series of empirical studies, some of them further sought to ascertain which instructional tools or teaching methods best foster learning or help learners achieve better academic performance.
In different periods, however, a few new keywords emerged with the development of information technology, reflecting the shifting trends in the field:
• "Speech recognition" was hotly discussed in 2001-2005: Computer processing speed, storage capacity and the management of sound and video devices improved greatly in this period [28], and speech applications were widely used. Kirschning [28] performed basic research on different speech processing techniques, trying to improve the performance of speech recognition and synthesis. Tamburini and Caini [29] presented a study on the automatic detection of prosodic prominence in American English continuous speech.
• "Websites" appeared frequently in 2006-2015: With the rapid progress of multimedia computing, the mode of computer applications in language instruction was reshaped during this decade [30]. Great attention was paid to the use of web-based technologies for second language acquisition owing to the prevalence of the Internet. As suggested by Borau et al. [31], Twitter could help students train communicative and cultural competence, so microblogging contributed to English learning to some degree. Chen [30] explored the critical determinants of college students' proactive stickiness with a web-based English learning environment, including computer self-efficacy, system characteristics, digital material features, interaction, learning outcome expectations and learning climate.
• "Deep learning" and "semantics" were significant research topics in 2016-2020: In recent years, with the development of artificial intelligence, big data, graphics calculators and cloud computing technology, deep learning models have been adopted in English learning [32]. Yang and Yue [32] established an English pronunciation improvement system based on deep learning, which combined speech emotion recognition with speech quality evaluation. Meanwhile, many authors used neural networks to improve neural machine translation models, achieving significant improvements over a variety of baselines [33,34].
It can be deduced that technological development drives the reform of teaching modes and methods. Therefore, as computer technology matures and is applied more widely, instructors are expected to alter their teaching strategies or adjust their teaching activities to make the most of available resources.
This study has limitations. On the one hand, EI Compendex was the only retrieval source, so articles indexed only in other databases such as Scopus and Web of Science were not covered. On the other hand, the latest articles that had been accepted but not yet published were not included because of the delay in indexing. Further study with a wider range of data sources is needed.

Conclusion
To our knowledge, this is the first article to explore the landscape of CAEL studies with bibliometric methods. The number of documents related to CAEL studies has increased year by year. China was the most productive country, the National Natural Science Foundation of China was the funding sponsor that contributed most, and Lecture Notes in Computer Science was the most prolific journal. Deyi Xiong ranked first in the number of publications in the past two decades, while the productive authors varied between periods. Independent authors were in the majority. Furthermore, the most frequent keywords were "computer aided instruction", "computer aided language translation" and "learning systems". Technology, learners, teaching, language acquisition and testing were the common thematic areas in CAEL studies. The findings of this study can help relevant researchers seek potential collaborators and gain general knowledge about the hotspots and frontiers of CAEL studies, providing directions for further research.

Authors
Suya Liu is a third-year undergraduate student at the School of Foreign Studies, Hefei University of Technology. Her research area is data mining and computer-aided language education.
Sihong Zhang is currently a full professor in linguistics and dean of the School of Foreign Studies, Hefei University of Technology. He graduated from James Cook University, Australia, with a doctoral degree in anthropological linguistics. His research interests include language documentation and description, language acquisition, language policy and planning, and corpus linguistics. His contact information is: zhangsihong@hfut.edu.cn.