Usability Metrics for Gamified E-learning Course: A Multilevel Approach

This paper discusses the effect of a gamified learning system on students of the master's course on Web Design and Programming taught at the Faculty of Organization and Informatics. A new set of usability metrics was derived from the literature on web-based learning usability, user experience and instructional design, and incorporated into a questionnaire that consists of three main categories: Usability, Educational Usability and User Experience. The main contribution of this paper is the development and validation of a questionnaire for measuring the usability of a gamified e-learning course from the students' perspective. Usability practitioners can use the developed metrics with confidence when evaluating the design of a gamified e-learning course in order to improve students' engagement and motivation.


Introduction
Gamification as an emerging technology was first mentioned in an edition of the Gartner Hype Cycle, which predicts how a technology or its application will evolve over time. In 2011, Gartner predicted that fifty percent of Fortune Global 1000 organizations would implement gamification in their learning and/or hiring processes by the end of 2017 [34]. In Gartner's 2015 Hype Cycle, gamification was no longer listed among digital technologies; it was moved to the category of digital marketing as one of its tools [17]. The term gamification was coined in 2002 by Nick Pelling, whose idea was to create commercial electronic devices (in-flight video, ATMs, mobile phones, etc.) that would be enjoyable to use [37]. The benefits of gaming features were soon recognized and reached other areas (e.g. web applications). Gamification has also appeared in educational contexts, where game elements are used to engage students and to improve their learning experience [6,14,15].
Online methods of teaching and learning are not as effective as expected, and often result in higher dropout rates and lower student engagement. However, since teachers constantly search for new instructional approaches, the gamification of e-learning content is expected to bring the needed leap in knowledge. Although e-learning has brought immediate feedback, flexibility in space, time and pace of study, and easy access to materials [4], the learning experience nowadays needs to be customized. On the one hand, research on five popular learning management systems (LMS) (Moodle, Edmodo, Blackboard Learn, Schoology and Canvas) has shown that the gaming experience depends mostly on the instructor and their use of a variety of LMS features, and much less on the LMS gamification features [8]. On the other hand, if a gamified design fails to meet learners' expectations and does not fulfil their needs, there is a high possibility that they will avoid using it again.
Based on the findings presented above, this paper aims to develop and validate a questionnaire for measuring the usability of a gamified e-learning course from the students' perspective. Furthermore, it is expected that the obtained results will reveal possible shortcomings of the gamified e-learning course that could be resolved in the next iteration of course development. This study also helps to fill the gap in empirical research on the evaluation of usability and user experience in gamified learning environments.
The next section reviews the literature on usability and gamification in educational contexts. The research methodology is presented in Section 3. The results of the questionnaire conducted in the context of higher education are presented in Section 4. Concluding remarks are given in Section 5.

Literature review
A learner-centred design (LCD) should engage students and improve their motivation for further interaction with e-learning courses [8,47]. During the learner-centred course design process [46], different types of learning strategies, learning experiences and triggers of motivation should not be put aside [3,48]. Furthermore, students will refuse to use an e-learning environment if it takes them too long to learn its functionality, or if the environment is slow and visually unpleasant [12]. Gamification design has therefore appeared as a response to learners' growing needs: it offers a more satisfying learning experience [51] by stimulating the kind of motivation and engagement towards learning that gamers have towards playing games [9].
The literature offers many definitions of the term gamification. The creator of the term [37] suggested the following: "applying game-like accelerated user interface design to make electronic transactions both enjoyable and fast". The simplest definition is "the use of game design elements in a non-game context" [13]. The definition in [16], however, highlights the role of the user: "the use of game elements in design in nongame environments to influence user engagement". Usability, in turn, is described by the International Organization for Standardization [28] as "the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use". Effectiveness is measured "by the extent to which the intended goals of use are achieved", efficiency can be expressed by "the resources such as time, money or mental effort that have to be expended to achieve the intended goals", and satisfaction means "the extent to which the user finds the use of the product acceptable" [7].
The concepts of gamification and usability introduced above combine into game usability, which is described as "the degree to which a player is able to learn, control, and understand a game" [39]. According to [18], usability should be assessed through the game interface, game mechanics and gameplay during game development. Some studies reported that gameplay, as the core activity of a game, seriously affects user experience if it is not challenging and enjoyable [33,36]. Although the focus is on gameplay, the interface is the interaction layer between the user and the gaming experience. Therefore, the ease of use and the understandability of a game interface contribute to overall game usability, which is confirmed by [42]. The same authors reported that companies declared they would give less attention to game mechanics during usability activities, because they consider that the focus should be on the user interface and playability in order to achieve important game usability goals. In contrast, game mechanics contribute to the overall dynamics of a game by creating a challenging and satisfying experience for the user [27].
Overall, a good gamification design means that users have a special connection with the interface, which inspires them to learn more, to feel autonomy (i.e. free will) and accomplishment, to feel free to explore and to have social interactions, while at the same time providing them with visually appealing graphic elements [35]. Users will relate to a software product if they feel they are able to accomplish a satisfying goal (see [20]). The research in [6] confirmed that gamified courses are more motivating for students than traditional e-learning courses. Also, [35] found a significant relation between gamified content and students' perceived understanding of and engagement in the course. On the one hand, [44] reported that most respondents strongly agreed that the preferable activities were those that provided instant feedback and correction, which is also confirmed by the study in [38] on fostering learners' engagement with proper feedback. On the other hand, there are also studies that report unsuccessful effects of gamification on students' performance and attitudes in the learning environment [16,31,49].
Our literature review identified various studies on the implementation of gamification (or some of its elements) in different courses and on its effects on students' attitudes and achievement [6,8,16,35]. However, there have been only a few attempts to evaluate the usability of gamified systems [44,47,50]. This research addresses that gap by following the suggestion from [24] to evaluate the students' experience as well as the usability, because the system should, ideally, be usable intuitively without any manuals. Furthermore, [44] proposes three criteria (Usability, Educational Usability and User Experience), based on [24], for evaluating a gamified learning environment and suggests researching their interrelationships. The research concept was developed by following these recommendations and scientific principles, and is presented in the next section.

Research methodology
This study used a questionnaire to measure the usability of a gamified e-learning course from the students' perspective. According to [41], the initial step in questionnaire development is to determine the purpose, objectives, research questions and hypotheses. The content validity of the original 38-item questionnaire was analysed during the pilot study. Items were modified, added or removed according to their relevance for the domain of a gamified e-learning system. The final iteration of the questionnaire included the analysis of reliability and construct validity.

Development of hypotheses
As mentioned above, this study utilized a questionnaire to investigate the students' perspective in three defined categories (Usability, Educational Usability and User Experience) in order to encompass the rather complex nature of measuring the usability of a gamified course. In cases where end-users (here, students) are involved in the evaluation and optimization of software, user-based methods have higher validity than expert methods because they are based on a methodological approach [30]. The following hypotheses were set (see Fig. 1):
• H1: Usability (US) significantly influences Educational Usability (EU). Students spend more time working for a gamified course than for a non-gamified one, because they found it more motivating, more interesting and easier to learn from [6,35]. According to [47], the usability of a gamified e-learning system can significantly impact students' learning in two ways: positively, when students achieve their learning goals easily through the interface, or negatively, when students spend a substantial amount of time on understanding how the system works, which distracts them from learning.
• H2: Usability (US) significantly influences User Experience (UX). It is necessary to meet students' needs regarding the functionality, aesthetics and intuitiveness of learning systems [12], because students' experience and behaviour are directly influenced by the technology they use [24].
• H3: User Experience (UX) significantly influences Educational Usability (EU). Gamified learning content should enhance students' learning experience, resulting in increased lecture attendance, online participation, proactive behaviour and better scores in different assignments [6,16,44]. The same authors also noticed some side effects of gamified online courses, such as less class activity, poorly done written assignments and caring only about acquiring points rather than knowledge. Further research can focus on the observed issues in order to identify more possible limitations of a gamified e-learning environment with respect to students' learning experience.

Development of the questionnaire
Three main criteria were identified as appropriate for the evaluation of the chosen gamified e-learning system from the students' perspective. The first criterion is Usability, which according to [28] has three sub-criteria: effectiveness, efficiency and satisfaction. Efficiency as a metric is not considered further here, because the authors of this paper do not have direct access to the e-learning course. Nevertheless, quantitative efficiency measures such as task success and time-on-task will be studied in future research, especially in relation to the qualitative measures used here. The learning environment is considered a digital product through which students perform their daily tasks; therefore, it is advisable to evaluate it through standard usability measures [44]. Besides the usability point of view, the e-learning system as a teaching and learning support must also be observed through an educational lens [47] and in terms of emotional experiences [24]. Therefore, the following two criteria were introduced: Educational Usability (EU) and User Experience (UX).
Both terms, Educational Usability and User Experience, belong to the wider context of conventional usability. The first term "stresses learning-specific use and the relationship of the content to objectives, learning processes and outcomes" [24], and the second is associated with users' perceptions, emotions and preferences, that is, the complete experience before, during and after the interaction with the system [24,44].
The previously cited literature was selected according to two criteria: (1) evaluation of the digital learning environment from the perspective of users or experts; and (2) use of a questionnaire as a method for data collection. The questionnaire was adapted to the specific context of a gamified e-learning course, which means that not all categories and criteria from the literature were taken into consideration.
In summary, this questionnaire integrates the evaluation of the gamified e-learning system as a product of students' interaction with their educational environment and as a process of acquiring knowledge through experiences.
Preliminary Scale. After the items are selected from the relevant literature, a small sample of users should review and test the questionnaire before it is administered on a large scale [45]. Therefore, a pilot study was conducted in which three HCI experts tested the questionnaire and delivered their observations. Their suggestions concerned the syntax and the appropriateness of the individual items under each criterion with respect to the research domain. The experts concluded that it was not necessary to add further items. As a result of the experts' feedback, the final 38-item questionnaire with a five-point Likert scale was compiled.

The analysis and results
The questionnaire, based on a five-point Likert scale, investigated students' views on the Usability, Educational Usability and User Experience of a gamified course in the learning management system they use. It was administered to students of Informatics in Croatia in June 2017.

Data collection
The questionnaire included 38 statements about students' perception of a gamified e-learning course implemented in the Moodle LMS platform. Only students who gained the right to be evaluated and graded at the end of the semester were invited to fill in the questionnaire. A total of 58 students participated in the research. Most students who filled in the questionnaire were third-year students (98.3%), and only one student (1.7%) was in the second year of study. The youngest respondent was 21 years old and the oldest was 48. The median age of the respondents was 22 years. The demographic characteristics, summarized using descriptive statistics, are presented in Table 1. The calculations were made in R version 3.4.0 [40].

The evaluation of the measurement model
The measurement model had a predetermined factor structure based on theoretical grounds. Partial Least Squares Structural Equation Modelling (PLS-SEM) in the SmartPLS software has been shown to be the most appropriate approach for Confirmatory Factor Analysis (CFA) [1]. The CFA is used to examine the validity and suitability of the items for each construct [29].
The analysis in SmartPLS 3.2.7 (Student edition) [43] revealed several low outer loadings, which were handled by following the indicator retention recommendations in [25]. Indicators with loadings between .40 and .70 should be reconsidered before elimination: if eliminating an indicator increases the composite reliability (CR), the indicator can be discarded or reconsidered from a theoretical perspective [22]. To eliminate low outer loadings, a total of three iterations were made based on the defined criteria, which resulted in omitting nine indicators. The cutoff value for this study is .60, because the CR values are higher than the suggested .70 [23]. In the first iteration, the indicators ER2, FE4, EM1 and NE1 were eliminated because their outer loadings were below .40. In the second iteration, the loading of FE3 decreased and it had to be eliminated together with the indicators SO4, SA7 and SO1. Although the loading of the indicator ER1 was close to the threshold of .60, it was also eliminated, because it belongs to the group of questions that target system error recognition, diagnosis and recovery, which are not fully implemented in the evaluated Moodle course. The results of the final iteration are shown in Table 2.
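The elimination rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the SmartPLS procedure itself: the item names and loadings are hypothetical, the CR formula is the usual one for standardized loadings, and the sketch makes a single pass, whereas in the actual analysis the loadings are re-estimated after every iteration.

```python
def composite_reliability(loadings):
    """CR = (sum l)^2 / ((sum l)^2 + sum(1 - l^2)) for standardized loadings."""
    total = sum(loadings)
    error = sum(1.0 - l * l for l in loadings)
    return total * total / (total * total + error)


def triage(loadings, drop_below=0.40, review_below=0.70):
    """Decide for each indicator whether to keep or drop it (one pass only).

    Below `drop_below`: drop. Between the two thresholds: drop only if
    removing the indicator raises the CR of the construct. Otherwise keep.
    """
    cr_all = composite_reliability(list(loadings.values()))
    decisions = {}
    for item, value in loadings.items():
        if value < drop_below:
            decisions[item] = "drop"
        elif value < review_below:
            rest = [v for k, v in loadings.items() if k != item]
            raises_cr = composite_reliability(rest) > cr_all
            decisions[item] = "drop" if raises_cr else "keep"
        else:
            decisions[item] = "keep"
    return decisions


# Hypothetical loadings for one construct (not taken from Table 2).
hypothetical_loadings = {"X1": 0.82, "X2": 0.78, "X3": 0.55, "X4": 0.31}
print(triage(hypothetical_loadings))
```

In this made-up example X4 falls below .40 and is dropped, while X3 lies in the review band but is kept because removing it would lower the CR.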
The measurement model was also evaluated in terms of internal consistency (Cronbach's alpha, CA, and composite reliability, CR), convergent validity (average variance extracted, AVE) and discriminant validity (heterotrait-monotrait ratio of correlations, HTMT). As mentioned before, the CR should be above .70. Furthermore, a cut-off point of .70 is acceptable for Cronbach's alpha [19]. As can be seen in Table 2, the CA and CR values exceeded the recommended thresholds, which points to established internal consistency reliability of the measures used. Also, each construct retained more than three items through the deletion process [23]. The final questionnaire statements are presented in the Appendix.
The validity assessment of the reflective measurement model in this study is focused on convergent and discriminant validity. For convergent validity, the AVE values should be greater than .50 [23]. For the assessment of discriminant validity, it is proposed to use the heterotrait-monotrait ratio of correlations (HTMT) instead of the Fornell-Larcker criterion and cross-loadings, because the latter two lack sensitivity, i.e. they require heterogeneous loading patterns and large sample sizes [25]. The suggested cut-off value for the inter-construct correlations is less than .90 [21]. According to Table 3, discriminant validity is established.
Overall model fit of CFA. After the theoretical model is verified by the Structural Equation Model (SEM), it is necessary to evaluate the overall model fit. [23] recommend reporting at least one fit index from each category of model fit (absolute fit, incremental fit and parsimonious fit). Not all the GFI and AGFI values exceed the threshold of .90, but they still meet the requirement of .80 [26]. Only the AGFI value of the Usability construct does not reach the required level, but it is close to it. Besides, [32] explained that sample size affects whether the GFI and AGFI reach the threshold. In general, the other fit indices show a relatively good model fit. The various model fit indices, calculated in IBM SPSS Amos for Windows, version 20 [2], and their recommended levels of acceptance [5] are reported in Table 4.
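For readers unfamiliar with the internal consistency measure, the following Python sketch computes Cronbach's alpha from raw item responses using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score). The response matrix is invented for illustration and is not the study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of every item (column) across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the respondents' summed scores.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical five-point Likert answers of five respondents to three items.
hypothetical_responses = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]
alpha = cronbach_alpha(hypothetical_responses)
print(round(alpha, 3), alpha > 0.70)
```

For this invented matrix the value is about .913, which would pass the .70 cut-off mentioned above.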
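The two validity criteria can be illustrated with a short Python sketch. It assumes the common definitions: AVE as the mean of the squared standardized loadings of a construct, and HTMT as the mean correlation between items of different constructs divided by the geometric mean of the average within-construct item correlations. All loadings and item scores below are hypothetical, and raw (not absolute) correlations are used.

```python
from itertools import combinations, product
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ave(loadings):
    """Average variance extracted: mean of squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def htmt(items_a, items_b):
    """Heterotrait-monotrait ratio for two constructs (each needs >= 2 items)."""
    mean = lambda values: sum(values) / len(values)
    hetero = mean([pearson(a, b) for a, b in product(items_a, items_b)])
    mono_a = mean([pearson(x, y) for x, y in combinations(items_a, 2)])
    mono_b = mean([pearson(x, y) for x, y in combinations(items_b, 2)])
    return hetero / sqrt(mono_a * mono_b)


# Hypothetical item scores for two constructs, five respondents each.
construct_a = [[1, 2, 3, 4, 5], [2, 1, 3, 4, 5]]
construct_b = [[3, 1, 4, 5, 2], [3, 1, 5, 4, 2]]
print(ave([0.8, 0.7, 0.9]) > 0.50)           # convergent validity check
print(htmt(construct_a, construct_b) < 0.90)  # discriminant validity check
```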

The evaluation of the structural model
The procedure from [23] was followed to empirically confirm the path model. First, the examination of the predictor constructs showed that the variance inflation factor (VIF) values are below the suggested value of 5, which means that there is no critical collinearity among the variables. Educational Usability and User Experience are the constructs of interest, considering that the perception of a gamified course is examined. The coefficient of determination (R²) is evaluated in this structural model. [23] claim that R² values mostly depend on the research context, but [10] proposed the following rules: values above .67 are high, values between .33 and .67 are moderate, values between .19 and .33 are weak, and all values below .19 are unacceptable. The predictive relevance (Q²) values of EU (.342) and UX (.352), calculated with the blindfolding technique, showed medium to large predictive relevance for these constructs according to [23] (see Table 5). The path coefficients in Table 6 showed that Usability (US) has a strong bearing on User Experience (UX) and a weak one on Educational Usability (EU). However, the total effects showed that Usability has a slightly stronger effect on Educational Usability (0.814) than on User Experience (0.800). The analysis of the outer weights of specific items in Educational Usability showed that students gain a sense of accomplishment and autonomy by attending a gamified course. They consider that tasks were more easily achievable in this kind of course. Also, students were regularly provided with feedback on the assessment results and their progress. Considering the items from User Experience, students agree that a gamified course is an interesting and acceptable form of learning. All relationships in the structural model are significant (see Table 6), confirming the defined hypotheses about the construct relationships.
Furthermore, an effect size (f²) below .15 indicates a small effect on the endogenous construct, from .15 to .35 a moderate effect, and above .35 a large effect [11]. The f² values reported in Table 6 indicate a moderate to large impact of the predictors on the endogenous constructs.
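The difference between a path coefficient and a total effect can be made concrete with a small Python sketch. In the hypothesized model, Usability reaches Educational Usability both directly (H1) and indirectly through User Experience (H2 and H3), so its total effect is the direct path plus the product of the two indirect paths. The coefficients used below are hypothetical placeholders and are not the values from Table 6; the R² bands follow the rule of thumb from [10] quoted above.

```python
def classify_r2(r2):
    """Label an R^2 value with the bands attributed to [10] in the text."""
    if r2 > 0.67:
        return "high"
    if r2 >= 0.33:
        return "moderate"
    if r2 >= 0.19:
        return "weak"
    return "unacceptable"


def total_effect(direct, indirect_paths):
    """Direct effect plus the sum of products along each two-step indirect path."""
    return direct + sum(first * second for first, second in indirect_paths)


# Hypothetical standardized path coefficients (assumptions for illustration).
us_to_ux = 0.80   # US -> UX
us_to_eu = 0.30   # US -> EU (direct)
ux_to_eu = 0.60   # UX -> EU

print(total_effect(us_to_eu, [(us_to_ux, ux_to_eu)]))
print(classify_r2(0.50))
```

With these placeholder values a weak direct path (0.30) still yields a large total effect, which mirrors the pattern reported for Usability and Educational Usability.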
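As a brief illustration of how the effect size is obtained, the Python sketch below uses the standard definition f² = (R² included - R² excluded) / (1 - R² included), where the model is estimated with and without the predictor in question. The R² inputs are hypothetical, and the bands are the ones stated in the text.

```python
def f_squared(r2_included, r2_excluded):
    """Effect size of one predictor on an endogenous construct."""
    return (r2_included - r2_excluded) / (1.0 - r2_included)


def classify_f2(f2):
    """Bands as stated in the text: < .15 small, .15 to .35 moderate, > .35 large."""
    if f2 < 0.15:
        return "small"
    if f2 <= 0.35:
        return "moderate"
    return "large"


# Hypothetical R^2 of a construct with and without one of its predictors.
effect = f_squared(r2_included=0.60, r2_excluded=0.50)
print(round(effect, 2), classify_f2(effect))
```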

Conclusion
Although gamification is mostly recognized in corporate training environments, the reviewed literature suggests that its impact on blended learning will, at some point, become unavoidable. There are various definitions of gamification and usability, which depend strongly on the context of use. However, game elements are mostly used in different contexts as factors that influence users' motivation and engagement.
The multilevel framework developed for the purpose of this research consists of three constructs (Usability, Educational Usability and User Experience) that have recently been used in the literature for the evaluation of gamified software. The framework includes only the elements considered relevant for this field of research.
The research results confirmed that Usability is indeed a significant predictor of Educational Usability and User Experience. In addition, it was shown that User Experience also predicts Educational Usability.
However, there are a few limitations to be noted. First, the option "not applicable" was not included in the questionnaire, which could influence the interpretation and analysis of the results. Second, various new terms and meanings could be interpreted differently by different students.
Finally, the data obtained from the evaluation can be used to adapt the e-learning course to students' needs. Also, the questionnaire findings can be used as a good practice example for other gamified courses in the e-learning context.
In conclusion, it should be mentioned that traditional learning cannot be fully replaced by games in all contexts, because gamification is usually not a cheap process and it requires thorough development and great human effort.

Appendix
Item: Statement [Ref.]

Usability
SA1: The utilized game design elements were useful to motivate me to use the gamified e-learning course. [50]
SA2: Using the gamified e-learning course was a worthwhile experience. [50]
SA3: The full story presented in the gamified e-learning course is meaningful. [50]
SA4: The gamified e-learning course provides me a meaningful feedback. [8]
SA5: I enjoyed the experience of using the gamified e-learning course. [50]
SA6: I found earning game achievements (badges, points, rewards, etc.) increased my enjoyment of using the gamified e-learning course. [50]
SA8: Trying to earn game achievements had a positive effect on my behaviour. [50]
SA9: The game achievements (badges, points, rewards, etc.) motivated me to participate more than I would have done otherwise. [6,50]
SA10: Time passed quickly for me during the task performance. [50]
EF1: The gamified e-learning course improved my understanding of the course material. [50]
EF2: I performed my tasks better because the e-learning course is gamified. [6]
EF3: This gamified e-learning course required more work than other courses, but it was not difficult to learn from. [6]
EF4: I used this gamified e-learning course more often than other courses because it is gamified. [6,51]

Educational usability
CL1: Goals were clearly set out. Objectives and expected outcomes for learning were clear. [24]
CL2: The gamified e-learning course provided me clear goals about what I should do next. [8]
CL3: The gamified e-learning course divided the tasks in such a way that they were achievable to me. [8]
CL4: The gamified e-learning course provided me a sense of accomplishment. [8]
CL5: The gamified e-learning course gave me a feeling of autonomy. [8]
FE1: Prompt feedback on assessment and progress was provided. [24,44]
FE2: Guidance was provided about the tasks and construction of knowledge going on. [24,44]

User experience
EM2: Gamified e-learning course triggered positive emotions in me. [8]
EM3: The gamified e-learning course provided me a feeling of purpose. [8]
EM4: The tasks within the gamified e-learning course are motivating to learn more about web design and programming. [44]
EM5: Gamified e-learning course is interesting and an acceptable form of learning. [24]
EM6: This way of learning web design and programming is exciting. [24]
SO2: The gamified e-learning course provided me a feeling of relatedness to other students. [8]
SO3: This gamified course provided me the opportunity to cooperate with other users. [8]
NE2: This gamified learning environment is stimulating to me. [24]
NE3: A sense of security is achieved in this gamified course. [24]