Designing and Evaluating the Use of Smartphones to Facilitate Online Testing in Second-Language Teacher Education (SLTE): An Auto-Ethnographic Study

This paper reports on an auto-ethnographic study of the use of smartphones to facilitate online testing in the context of second-language teacher education (SLTE). A total of 54 pre-service teachers participated in the study. Preliminary data were collected through observation and written reflection, and additional data were gathered from interviews and students’ web activity logs to enable triangulation. Thematic analysis was carried out on the qualitative data. The findings show that smartphones are a viable electronic tool to facilitate online testing in an SLTE context. More importantly, using Moodle as an online test platform meets both teachers’ and students’ needs with respect to aspects such as design, test structure and online testing activity. The study also highlights some benefits and challenges of employing sequential and multiple-attempt test modes and providing delayed feedback on online tests. The implications of these findings are discussed, with suggestions for further research in the field.

Keywords—smartphone, teacher education, second-language teacher education (SLTE), auto-ethnography


Introduction
Recent developments in mobile technology have shifted the use of computer-facilitated online learning and adaptive testing from desktop or personal computers (PCs) to mobile devices such as tablets, personal digital assistants (PDAs), pocket PCs and smartphones. Compared with desktop computers, mobile devices offer many benefits, including "flexibility, portability, low cost, ease of use, and timely application" [1, citing 33]. Mobile devices should therefore be seen as potential tools to facilitate mobile learning, and specifically mobile adaptive testing. With regard to the latter, mobile devices may offer particular opportunities because they do not require dedicated computer classrooms and, more importantly, they can be used anywhere [1]. While researchers have been attracted to the use of mobile devices such as smartphones to facilitate computer adaptive testing (CAT), little attention has been paid to exploring the development of CAT for mobile devices and its implementation in classroom settings.
This study was conducted to gain insights into the use of smartphones to facilitate online testing in the context of second-language teacher education (SLTE). The term "smartphone" refers in this paper to a cellular phone with "advanced capabilities, which executes an identifiable operating system allowing users to extend its functionality with third party applications that are available from an application repository" [2, pp. 444-445]. In particular, this study addresses three research questions: 1. What is the process for developing a Moodle-based website to facilitate online testing in SLTE? 2. To what extent do the design and process of online testing facilitate students' learning? 3. How does the practice of developing and applying online testing using smartphones in the classroom contribute to teachers' professional development?

Auto-ethnographic research approach
Auto-ethnography is a qualitative research approach that allows researchers to describe and systematically examine their personal experiences in order to better understand cultural experiences [3]. In auto-ethnographic studies, the researcher "retroactively and selectively writes about past experiences" [3, p. 4]. However, writing about the past does not mean that auto-ethnography is an act of simply writing "a story"; it is expected to facilitate researchers' criticality, allowing them to address gaps between theory and practice [4]. While writing their stories, teachers are required to reflect on their experiences and "make meaning of them; that is, they gain an understanding of their teaching knowledge and practice" [5, p. 374].
In this study, an auto-ethnographic research approach was employed to address the question of how we could improve our practice in designing, managing and distributing online tests to students using smartphones during courses in a second-language teacher education (SLTE) context. With reference to the research questions, our focus was on exploring and examining: 1) the development of a Moodle-based website to facilitate online testing; 2) the extent to which the design and process of online testing facilitate students' learning; and 3) a personal account of professional development through using smartphones in classroom testing.

Description of the site
We used smartphones to facilitate online testing in a second-language teacher education (SLTE) setting in a private university in Indonesia. In this university, online learning (also known as electronic learning or e-learning) has been practised widely by the lecturers through PCs and laptops, but we observed that few lecturers were using smartphones to support their classroom teaching. The use of smartphones for classroom instruction was limited primarily to providing students with learning resources. Despite the university's available technological facilities, little attention had been paid to using smartphones as an online tool to facilitate classroom assessment.

Participants
In conducting this study, we worked with 54 students attending two courses during one semester of the academic year 2016-2017. Our observation prior to the classroom activity showed that all of the students possessed smartphones with internet access. The students were also observed to have sufficient knowledge and ability to operate computers (laptops) and smartphones. With regard to students' smartphone skills, they were able to browse the internet, and communicate and interact with other people online through their smartphones. They were also able to search for learning resources on websites, and download and store those resources onto laptops and smartphones. These skills would facilitate their online learning activity [6,7].

Data collection and analysis
Data for this study were collected through classroom observation and reflection. Classroom observation was conducted to help us gain a comprehensive understanding of our actions in developing, managing and distributing tests online [8]. We also wrote self-reflection notes that enabled us to describe and evaluate the effects of our actions. Specifically, we reflected on our activities while developing the website for the online tests and the classroom procedure, and on the effect of the online test design on students' test activities and subject learning. In this regard, a reflections-on-practice framework was adopted [9,10,11].
In addition to the two methods above, we used two data sources to facilitate our reflection. First, we conducted interviews with ten randomly selected students to explore their experiences and perceptions of online testing. Second, data relating to students' test activities and performance were also reviewed, including students' activity logs, the number of attempts they took to complete the test, the time they spent on each attempt, and their test scores for each attempt [12]. Figure 1 shows an example of a student activity log obtained from the web database.
Thematic analysis was conducted to analyse the qualitative data collected. This helped us to identify, analyse and recognise patterns in the data [13]. The qualitative data from the interviews were first transcribed verbatim [13], and we then read and reread the written data from our observations and reflections. The data were then coded, and the collection of codes was analysed to help us identify themes.

Auto-Ethnographic Narratives

Development of website for the online test
During the planning stage, we reviewed relevant literature on the use of Moodle to facilitate classroom instruction [e.g. 14, 15,16]. Informed by this literature review, we worked with our colleagues in the university's Centre for Information and Communication Technology to develop a website to facilitate online classroom testing. Together with these colleagues, we installed Moodle 3.0 and used Adaptable, a Moodle template that works with mobile devices such as smartphones.

Design and structure of the online tests
The online tests in this research comprised both formative and summative assessments. The development of online formative tests aimed to foster students' understanding and learning of specific learning units [17,18]. These tests were distributed to the students weekly after the completion of a learning unit. Their distribution was unsupervised as the students took the tests outside classroom sessions. As suggested by Kibble [17], we did not provide students with explicit rules on how to take the formative tests, nor did we explain what was acceptable or otherwise [17, p. 259]. Instead, we explained how the tests would help improve the students' learning and contribute to the final scores for their courses. We also informed them that we would monitor their test performance in order to identify their strengths and weaknesses on specific course topics. Accordingly, we did not regard any patterns of quiz activity in which the students engaged as unethical or cheating.
In addition to the formative tests, online summative tests were developed to measure students' learning progress at particular points in time [19,20]. These were distributed twice to the students, during the middle of their course (mid-term test) and at the end of term (final term test). To distinguish them from the formative tests, the summative tests were carried out under formal conditions and were supervised.
Each test for both formative and summative assessments was developed in line with the test development principles suggested by Bachman and Palmer [21] and Brown [20]. All online tests were developed in a sequential mode, allowing students to take them only if they already met the test requirements. For example, in order to be able to take "Grammar Test 3", students must already have passed "Grammar Test 2". In line with this sequential mode, a multiple-attempt format was applied to all tests, the aims of which were twofold: to enable students who failed at the first attempt to retake the tests until they obtained a passing grade, and to promote retention of the learning materials.
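The sequential, multiple-attempt rule described above can be sketched as a simple gating check. The function and variable names below are hypothetical illustrations; in our setup the gating was configured through Moodle's grade-based access restrictions rather than written as user code.

```python
PASSING_GRADE = 80  # per cent, as set for all tests in this study

def can_attempt(test_index: int, best_scores: dict) -> bool:
    """Return True if a student may open the given test.

    A student may attempt test N only after scoring at or above the
    passing grade on test N-1 (e.g. Grammar Test 3 requires a pass on
    Grammar Test 2). Attempts on the current test are unlimited.
    """
    if test_index == 1:
        return True  # the first test in a sequence has no prerequisite
    return best_scores.get(test_index - 1, 0) >= PASSING_GRADE

# A student who scored 85 on Test 2 may open Test 3;
# one whose best score on Test 2 is 70 must retake it first.
assert can_attempt(3, {1: 90, 2: 85})
assert not can_attempt(3, {1: 90, 2: 70})
```

Tracking only the best score per test reflects the multiple-attempt aim: a failed attempt never locks a student out, it simply leaves the next test closed until a passing attempt is recorded.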
All the tests were multiple-choice with four options. The multiple-choice type was chosen because it suits many online learning platforms [see, e.g. 22, 23, 24], is easy to distribute to large groups of students, and is quick to score and provide feedback [20,23]. The tests contained varying numbers of questions. For the formative tests, the total number ranged from 15 to 25, and the total time allowed for test completion was between 20 and 60 minutes. For the summative tests, the total number of items ranged from 30 to 45, and the tests lasted between 45 and 90 minutes. If students exceeded the time limit, the web system automatically closed the test and prevented them from continuing. Time limits were intentionally imposed on the tests to motivate the students to prepare for the test in advance and thus enhance their test performance [18].
In designing the tests, the items were individualised to discourage the students from copying others' answers [25]. To this end, we randomised the arrangement of questions and options so that each student received a different version of the test questions. As suggested by Riffell and Sibley [25], we applied a random number algorithm to allow the web system to select a certain number of questions from question banks that we had prepared earlier.
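The individualisation step above amounts to drawing a fixed number of distinct items from a larger bank and shuffling their order per student. The sketch below is a hypothetical illustration of that idea (in Moodle this is handled server-side by random question slots in the quiz), using a per-student seed so that a given attempt is reproducible:

```python
import random

def draw_quiz(question_bank: list, n_items: int, seed: int) -> list:
    """Draw n_items distinct questions from the bank in random order.

    `seed` stands in for a per-student, per-attempt value so each
    student receives a different version of the test (hypothetical
    sketch; not Moodle's actual selection code).
    """
    rng = random.Random(seed)
    # sample() picks without replacement, so no question repeats
    return rng.sample(question_bank, n_items)

bank = ["Q%d" % i for i in range(1, 61)]  # 60 items per category, as in the study
quiz = draw_quiz(bank, 15, seed=42)       # 15 items selected per formative test
assert len(quiz) == 15
assert len(set(quiz)) == 15               # all items distinct
```

Shuffling the four answer options per item works the same way, with a second seeded shuffle applied inside each question.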
A scoring method was developed on the web system to allow us to measure students' attainment [26]. Each question was weighted according to the formula w = 100/N, where N is the total number of test questions. The maximum grade for each test was set at 100 (100%), with a passing grade of 80 (80%). This meant that students could only progress to the next test once they had achieved 80% or higher on the previous test; otherwise they had to retake the test. Table 1 summarises the structure of the online tests and Figure 2 presents the quiz (test) layout.

Test procedure
Once the website was ready, we provided students with usernames and passwords to access the online tests. The login procedure is presented in Figure 3.
Students who logged onto the website were presented with a list of topics and relevant tests. They were also given opportunities to attempt the tests according to their own preference, although they were reminded that the tests had to be taken sequentially. They could either begin a new test or re-attempt previous ones.

Pilot test
Two online tests, Grammar Test 1 and Grammar Test 2, were piloted on two sessions of a course. The tests ran well during the pilot, although two critical issues were identified. First, some questions were lost and only the options appeared, which we learned was due to incomplete uploading of the questions. We had developed Grammar Tests 1 and 2 manually, creating two text files using an offline text editor application (e.g. Notepad). The two tests had then been stored in the quiz database through a quiz import tool available in the Moodle web system, using the Aiken-formatted question method to store the test items in the question bank. The questions were successfully imported, but when the students took the test, some questions were missing. Accordingly, we had to redevelop the test using the test item development tool available in the web system.
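For reference, an Aiken-formatted item is a plain-text block: the question stem on its own line(s), options prefixed "A." through "D.", and a final "ANSWER:" line. The sketch below shows such a block together with a minimal, hypothetical parser (not Moodle's importer) that rejects a block whose stem is missing, which was exactly the failure we hit during piloting. The sample question is invented for illustration.

```python
SAMPLE = """\
Which sentence is grammatically correct?
A. She go to campus every day.
B. She goes to campus every day.
C. She going to campus every day.
D. She gone to campus every day.
ANSWER: B
"""

def parse_aiken(text: str) -> list:
    """Parse Aiken-formatted multiple-choice items (blank-line separated).

    Returns a list of dicts with 'stem', 'options' and 'answer' keys.
    Raises ValueError if a block has no question stem, rather than
    silently importing an item with options only.
    """
    items = []
    for block in text.strip().split("\n\n"):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        stem_lines, options, answer = [], {}, None
        for ln in lines:
            if ln.upper().startswith("ANSWER:"):
                answer = ln.split(":", 1)[1].strip()
            elif len(ln) > 2 and ln[0].isupper() and ln[1] in ".)":
                options[ln[0]] = ln[2:].strip()  # e.g. 'B' -> option text
            else:
                stem_lines.append(ln)
        if not stem_lines:
            # the pilot-test failure mode: options present, question lost
            raise ValueError("question stem missing from Aiken block")
        items.append({"stem": " ".join(stem_lines),
                      "options": options, "answer": answer})
    return items

item = parse_aiken(SAMPLE)[0]
assert item["answer"] == "B"
assert len(item["options"]) == 4
```

Validating imports this way, before students sit the test, would have caught the missing-question problem at upload time rather than mid-test.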
The second issue we encountered related to randomisation of the test items. We noted that the random selection system did not work. In our first design, we had developed question banks from which a specified number of question items could be retrieved for each test. We had developed 60 question items per category, and only 15 questions were selected for each test. Unfortunately, during the pilot tests, the students reported that a total of 60 questions had appeared, not 15 as expected. This meant that the random selection design was not functioning properly. Having sought an answer to this problem from Moodle discussion forums, we understood that we had wrongly applied the arrangement procedure on the web system, which in turn had caused errors in the randomisation of the test items. We made several revisions to the test design and successfully re-tested it.

Discussion
Analysis of the qualitative data from observations and reflection, as well as additional data from the interviews, revealed three key themes, as discussed in the next subsections.

Students' engagement with the online tests
Analysis of our observation and web log data showed that students' overall participation in the online tests was 96.5 per cent: 95 per cent for the daily (formative) tests and 98 per cent for the mid-term tests. This high rate of participation was unsurprising, as the online tests were used for both formative and summative purposes. As previously explained, the formative tests attracted credit and were prerequisites for the summative tests; in retrospect, these two factors contributed to the high participation rate for the formative tests. The formative tests each contributed up to 10 per cent of students' final course scores, while the summative tests each contributed 30 per cent.

Students' perceptions of online testing through smartphones
The findings of this study show that most students had positive perceptions of the use of smartphones to facilitate their online test activities. Using smartphones was regarded as easy, and the online test application developed using Moodle was viewed as user-friendly. Students felt that the size of the online test application was small enough to enable it to work fast on their smartphones. One student stated: "I did not find difficulty when using a smartphone to complete tests ... [the online test application] ran well on my Android phone, and, I could say, the internet was quite fast. Actually, I tried once using my laptop ... I just didn't like it when the loading process took a long time. I didn't know the reason. Maybe because of the internet access, or maybe something wrong with my laptop." (Interview with MA)
In addition, using smartphones provided flexibility in completing the tests. Students regarded smartphones as small enough to carry everywhere, allowing them to complete the tests without restrictions on time or place. One student said that she did the test in a café. She said: "I like that the test was set up in just, say, twenty minutes. So when I ordered a drink, I could complete the test while waiting." Although we did not personally receive any negative comments about the use of smartphones for online testing, two students were observed working on their laptops and another two on tablets. Our discussions with the students revealed that they did not find it convenient to take the tests through their smartphones due to issues of screen size and small fonts. One student said: "I could not read the questions very well and the small fonts on the phone hurt my eyes." These issues have been highlighted as primary concerns in previous research on the use of smartphone applications [e.g. 27, 34, 35].

Effect of test design on students' motivation, learning outcomes and test anxiety
The application of a multiple-attempt online test format and the method of feedback delivery in the online test design were shown to affect students' motivation and learning. The design was also found to be a prominent factor contributing to students' anxiety before and during the tests.

Students' motivation:
The findings of this study show that a multiple-attempt format for online test design was a key driver of students' participation in the test sessions. This format, which allowed students to reattempt the tests, was seen not only as a way to familiarise themselves with the online testing technology, but also as an opportunity to gain higher scores. The students were observed to have strong motivation to take and retake the tests when their achievement was below the passing grade. The web log of students' attempts revealed that the total number of test attempts ranged from two to 56. Five of the ten students whom we interviewed after a test expressed curiosity about why their answers to the quiz items were incorrect, motivating them to retake the test. This finding corresponds with earlier studies [28] suggesting that students who are given opportunities to obtain higher scores and benefit from feedback before reattempting a test have reduced anxiety, which enhances their participation.
Students' learning: Besides promoting motivation, the multiple-attempt format affected students' learning, and particularly their retention of the learning materials. One student mentioned that repeating test items belonging to the same topic had helped him to remember the materials from the textbook. Another student reported that she had retained a great deal of information from the textbook, particularly after she had taken the test several times. From the retrospective analysis, we identified that students who had made more attempts at a particular test seemed to have better knowledge of the materials being tested than those who made fewer attempts. For example, during a question-and-answer session in a classroom discussion, these students were able to answer the questions we asked them and provide better explanations and examples. This improvement in learning material retention was a result of multiple testing, as evidenced in the previous literature [23,29,30].
With regard to the method of feedback delivery, feedback was initially given to the students immediately after they had completed each test question. This feedback included their score and a review of their test attempt. Unfortunately, many students misused this immediate feedback, copying the correct answers and sharing them with others. This behaviour damaged the validity and reliability of the test, and consequently did not present a true picture of the students' ability [see 20,31]. To address this issue, the delivery of feedback was modified to a delayed method, whereby the test score did not appear until the student had completed all items in the test. More importantly, the review of the student's attempt on the test was delayed until the classroom session.
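The revised delayed-feedback policy can be expressed as a simple visibility rule: the score appears only after the whole test is submitted, and the full answer review is withheld until the classroom session. The sketch below is a hypothetical illustration of that policy (in practice we configured it through Moodle's quiz review options, not user code):

```python
from datetime import datetime

def feedback_visible(submitted_all_items: bool,
                     now: datetime,
                     review_session_start: datetime) -> dict:
    """Return which feedback components a student may currently see.

    Delayed-feedback policy adopted after students copied and shared
    immediate per-question feedback: the score is released only on
    submission of the complete test, and the item-by-item review is
    released only once the classroom review session begins.
    """
    return {
        "score": submitted_all_items,
        "answer_review": submitted_all_items and now >= review_session_start,
    }

review_start = datetime(2017, 3, 1, 8, 0)  # illustrative classroom session

# Mid-test: nothing is visible, so answers cannot be copied item by item.
assert feedback_visible(False, datetime(2017, 2, 27, 20, 0), review_start) == \
    {"score": False, "answer_review": False}

# After submission but before class: score only, review still withheld.
assert feedback_visible(True, datetime(2017, 2, 27, 21, 0), review_start) == \
    {"score": True, "answer_review": False}
```

Separating the two release points is what closed the loophole: a score alone tells a student how well they did without revealing which specific answers were correct.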
Students responded in various ways to the implementation of delayed feedback on the online tests. Some perceived that information about whether their answers were right or wrong had been disclosed, and suggested that such disclosure had hindered them from learning the topic being tested. In contrast, many felt that disclosure of the review promoted both individual and peer learning. In an interview, one student explained how the disclosure of information about the right and wrong answers had promoted her individual learning; this student also presented us with two of her notes, as shown in Figure 4.
In addition to promoting individual learning, the application of delayed feedback seemed to encourage peer learning among the students. Students were observed engaging in group discussions to examine the answers to the test items. They also got together to evaluate the textbook chapters from which the materials for the tests were taken. For example, RR said, "I sometimes asked a friend for help when I felt uncertain about the answer. Then some of my friends soon joined our discussion." When we joined their discussion, we established an interesting fact about the learning scaffolding among the students. We observed that high-achieving students assisted low-achieving ones, explaining not only the correct answers to the test items, but also the rationale. This learning scaffolding was recounted by one student: "I frequently discussed the test items with friends, especially with those who got higher scores on the test. I discussed with them why my answers in previous attempts were wrong and my friends explained it to me" (Interview with MZ).
Test anxiety: Although the sequential test mode and the multiple-attempt format promoted students' learning motivation and affected their learning outcomes, they also contributed to students' test anxiety. Students expressed anxiety both before and during the tests.
For example, students' fingers were visibly trembling, many had pale faces, and others, including one student, EZ, were observed to express a lack of confidence. Although the effect of test anxiety on students' overall test performance was not apparent in this study, three students who experienced test anxiety were discouraged from continuing their courses. The most likely explanation for this was that they had lost confidence in completing further tests, as they had scored consistently badly on previous test attempts; AA, one of the three, shared these feelings with another student. However, these three students did not regularly attempt and reattempt the tests; in other words, their exposure to the online tests was not systematic. Cassady [32] argues that unsystematic exposure to computer-based instructional activities may result in a rise in emotional and anxiety levels.

Conclusions and Implications
This paper has discussed a process for the development of a Moodle-based website to facilitate online testing in the context of second-language teacher education (SLTE). More importantly, involvement in the process of online testing and setting up the tests as web activities benefited our professional development. We gained knowledge of the workflow of Moodle-based websites and how to set up tests to facilitate online assessment. In addition, our knowledge of Moodle and online test design allowed us to identify potential issues that might emerge during the online test practices and to prepare alternatives in response. It should be noted that the second author of this paper had intermediate computer skills, with appropriate knowledge of web design and programming languages. This level of computer skill allowed us to make modifications to the web design and contributed to our online testing practices. It is therefore recommended that SLTE administration should provide sufficient technical support for teachers with limited computer skills.
This paper also highlights the positive effects on students' motivation and learning of the multiple-attempt test format and delayed delivery of feedback. Multiple attempts improved students' retention of the learning materials and increased their participation in the online test activity. This finding has implications for assessment methods adopted by SLTE administrators, as a single-attempt format is usually used for tests. In this regard, SLTE administrators should view the multiple-attempt format as an alternative method to facilitate students' learning. However, the observed improvements in students' learning retention relied primarily on qualitative data, and their actual levels of improvement cannot be quantified from this study. Further research is required to address this limitation.
While the students were shown to have benefited from the multiple-attempt test format and how feedback was delivered, the negative effect of test anxiety arising from the practice of these two aspects of the online test require further attention. More research is needed to examine how the design of online tests contributes to students' anxiety before and after the test and to devise alternative methods of addressing this test anxiety issue.