The Effects of Static and Dynamic Visual Representations as Aids for Primary School Children in Tasks of Auditory Discrimination of Sound Patterns. An Intervention-based Study

It has been proposed that non-conventional presentations of visual information could be very useful as a scaffolding strategy in the learning of Western music notation. Accordingly, this study attempted to determine whether static and dynamic presentation modes of visual information have any effect on the recognition of sound patterns. An intervention-based quasi-experimental design was adopted with two groups of fifth-grade students in a Spanish city. Students did tasks involving discrimination, auditory recognition and symbolic association of the sound patterns with non-musical representations, either static images (S group) or dynamic images (D group). The results showed neither statistically significant differences between the scores of D and S nor any influence of the covariates on the dependent variable, although statistically significant intra-group differences were found for both groups. This suggests that both types of graphic formats could be effective as digital learning mediators in the learning of Western musical notation.

Keywords: bimodality, static-dynamic presentation, discrimination of music patterns, music education


Introduction
It has been affirmed that the frequent use of multimedia materials in classrooms makes it necessary to carry out highly specialized research in order to determine which features of media, and which of their combinations, best facilitate learning [2], [11]. Much research has been carried out across certain disciplines on the effects of unimodal and multimodal presentations of information in learning processes, using sound, dynamic images, static images, text, analog representations, and/or notations. However, related studies regarding music education, such as those on bimodal modes of displaying information, are scarce and have given inconclusive results, as shown in the literature review below. The present study attempted to address the question of whether static images or dynamic images, used as aids, have any effect on the discrimination of musical sound patterns by primary school students. The main goal of this study was to contribute to the knowledge on the relative effectiveness of different modalities of presenting musical information, one of the critical factors in the design of digital objects as mediators in learning processes.

Literature Review

Presentation modes of musical information
Research on the presentation of information in learning situations in the music classroom has reached heterogeneous conclusions, most notably regarding the relationship between modes of presentation and other specific aspects or elements.
Some studies have suggested that using sound exclusively as a unimodal presentation mode is most effective for learning activities [3], [23], [26], while others show no differences with respect to the display mode used [22]. Furthermore, some researchers have also pointed out that video information could possibly consume so many cognitive resources as to prevent the subject from focusing attention on stimuli of the audio mode [22], [23].
Other authors have reached different conclusions, arguing that discrimination of all auditory stimulation is facilitated when presented together with a visual image [14], [18]. According to these studies, the emphasis placed on visual information when learning through ICT naturally reflects the preference for stimulating perception through sight above the other senses, but it can also cause confusion in the perception of tasks due to the division of attention [11], [30].
Cognitive styles have been researched as potentially influential variables. One study examined the effects of visual or auditory presentation of information on students with either auditory or visual learning styles [20], [16]. The results showed that both the learning and the subsequent recognition of melodic phrases were enhanced when the display mode corresponded to each participant's individual learning style.
It can be assumed that the discrepancies in the results of the aforementioned studies are likely the result of differences in stimuli used and the way in which they were presented [18].

Multimodal stimuli
It has been suggested that multimodal information presentation could help to avoid the limitations of working memory [5]. In fact, some authors defend the combination of audiovisual stimuli for educational purposes [1], [7], [17], [30]. Nevertheless, the great diversity of visual elements makes it difficult to reach a consensus on their effects in combination with auditory stimuli [6]. Some studies, for example, are based on moving images formed by video recordings of concerts [23], [22], while others employ animation [4], sequences of movies [21], or stills, drawings, musical scores, texts, objects, landscapes or buildings in which a concert took place [16].

Static versus dynamic visual information
The subject of this paper, the question of whether musical sounds would be best accompanied by static or dynamic images, has not been clarified by previous research in this area. The most important theories to explain the effects of the different modalities of learning are Multimedia Learning Theory (MLT) and Cognitive Load Theory (CLT) [2], [11], [12]. MLT is based on three assumptions: 1) visual and auditory information is processed through separate channels; 2) the processing capacity of each channel is limited; and 3) learning is considered to be an active process [24], [35].
CLT focuses on the limitations of cognitive processing. According to CLT, static images can facilitate learning processes provided that only relevant information is presented and that the learner can control both the pace and the order of the images [37]. CLT also suggests that dynamic media can reduce the extraneous cognitive load, attract attention, increase motivation, and minimize the effort needed in the construction of mental representations [4], [15], [26].
The results of some studies not related to sound or music have shown significantly greater performance in learning with the use of dynamic media (animation) than with static media (text only) [5], [11], [12], [31]. In this regard, it has been suggested that the more complex the learning content, the greater the benefit of using animations [2]. It has also been asserted that dynamic media are only beneficial when they are designed to reduce the cognitive load and generate mental models of the concept being taught, thereby offering visualizations that correspond to a meaningful mental model. The media must also be consistent with the experience and prior knowledge of the students in order to avoid information, animations, and other elements that are not needed to understand the concept. Moreover, it has been suggested that interactivity, from the point of view of the demand for memory, should result in a lower cognitive load and should improve understanding [2], [33].
Other studies do not support this hypothesis, showing either no significant intergroup differences [9], or differences in favor of the use of static visual media [24], [36]. It can therefore be assumed that the processing of dynamic media by students could generate some potential disadvantages [19], [25]. As an example, students could perceive the animations in a superficial way without really processing them due to lack of cognitive challenge or, on the other hand, be unable to process the material in a satisfactory manner due to excessive cognitive challenge. In addition, if the speed at which an animation presents information is greater than the speed of students' understanding, it could lead to excessive demand for cognitive resources, meaning that resources for other high-level tasks would be unavailable, resulting in the impossibility of understanding [10], [28]. In this sense, certain investigations related to reading comprehension have shown the importance of regulating the information according to the cognitive needs of the students [13], [27].
Apart from these recurrent themes in the literature, it has been suggested that learning could be facilitated if the modes of presentation correspond to students' individual differences [12]. As an example, it was suggested that dynamic images may be more effective than static images for people with low spatial ability [2]. Nevertheless, the results of another investigation show no effect of this variable [9].
Taking all the current literature into account, it can be seen that the relationship between modes of presentation and learning is not clear.

Design
A pretest-intervention-posttest design was adopted for the collection of data. The impossibility of performing a random assignment of subjects to the experimental groups forced the adoption of a quasi-experimental strategy (intact groups).

Subjects
The research subjects were 5th-grade pupils (N = 48; 23 boys and 25 girls) from 2 intact classes of a primary school in the city of Valencia, Spain. Two groups were formed: D (dynamic) and S (static). The D group was given dynamic images as aids in intervention sessions for the discrimination of melodic patterns, while the S group used static images. Prior to conducting the study, the school was presented with a letter containing a brief description of the study and a request for permission, in which the ethical codes to be respected were declared. The information included the aim of the study, a description of the role of the students, a confirmation of the absence of any kind of personal risk, and the assurance of the anonymous and confidential use of the information collected in the research, i.e. ensuring that both the results and the personal data of the students would be used exclusively for this study.

Variables
The independent variable was the information presentation mode, the levels being the static and dynamic images. The dependent variable was the score in tasks of discrimination of tonal melodic patterns. Age, gender and previous musical experience were treated and analyzed as intervening variables (covariates).

Instruments
A questionnaire was prepared for the measurement of intervening variables (age, gender and previous musical experience).
A pre-posttest was created in order to measure the subjects' ability to perceive and correctly identify melodic patterns. This test, taken by the students before and after treatment, included ten items designed exclusively for this study, in which students were required to listen to and identify tonal melodic patterns. In both the pretest and the posttest, these patterns were presented both as audio and as static images.
More specifically, the test consisted of a printed sheet with 10 consecutively numbered items. Each question showed static images representing a certain tonal melodic pattern (fig. 1). Each question had 4 response options, only one of which was correct. The static representation of each response option consisted of five circles corresponding to the five sounds in the pattern. The heights of the circles in the images were proportional to their sound frequency, corresponding to the following spatial metaphor: the higher the pitch, the higher the circle on the diagram. The circles were linked by a line, thus forming a melodic pattern profile. Western notation was not used, in order to avoid the potential influence of previous musical experience, which could otherwise have acted as an intervening variable, systematically altering the effects on the dependent variable.
A number of precautions were taken in order to reduce the number of potential variables that could influence the results. Firstly, in the test, the maximum number of events per pattern was limited to 5 to avoid a potential saturation of sensory memory [29]. Secondly, a clarinet was used to record the sounds, as its tonal quality has a very limited harmonic range and therefore the pitch frequencies produced were sufficiently accurate. Thirdly, the diatonic scale of C major was used for all patterns. Finally, isochronous sounds (sounds with the same duration) were used to avoid any effect of rhythm on the discrimination and recognition of patterns, because otherwise the results of this study could have been invalidated [8].
The pre-posttest had construct validity (its content was similar in scope and difficulty to that of the intervention tasks), and it was evaluated by three professors of music, who rated each item of the test on a 5-point scale against two criteria: suitability and applicability (inter-judge agreement k = .785).
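The spatial metaphor used for the response options (five circles whose heights are proportional to sound frequency, linked into a melodic profile) can be illustrated with a short sketch. This is not the procedure used to draw the test sheets; the octave (C4 to B4), the equal-tempered tuning with A4 = 440 Hz, the normalization to the lowest note, and the function names `frequency_hz`, `circle_positions` and `contour` are all assumptions made for illustration.

```python
# Illustrative sketch only: octave, tuning and function names are assumptions.
# MIDI note numbers for the C major diatonic scale, assumed octave C4..B4.
MIDI_NUMBER = {"C": 60, "D": 62, "E": 64, "F": 65, "G": 67, "A": 69, "B": 71}


def frequency_hz(note):
    """Equal-tempered frequency of a note, assuming A4 = 440 Hz."""
    return 440.0 * 2 ** ((MIDI_NUMBER[note] - 69) / 12)


def circle_positions(pattern, spacing=1.0):
    """Return (x, y) circle centres for a five-sound pattern.

    x: equal spacing, because the sounds are isochronous.
    y: frequency relative to C, so a higher pitch gives a higher circle.
    """
    if len(pattern) != 5:
        raise ValueError("each pattern consists of five sounds")
    reference = frequency_hz("C")
    return [(i * spacing, frequency_hz(note) / reference)
            for i, note in enumerate(pattern)]


def contour(pattern):
    """Describe the melodic profile as a list of up/down/same steps."""
    heights = [y for _, y in circle_positions(pattern)]
    steps = []
    for previous, current in zip(heights, heights[1:]):
        if current > previous:
            steps.append("up")
        elif current < previous:
            steps.append("down")
        else:
            steps.append("same")
    return steps


print(contour(["C", "E", "G", "E", "C"]))
```

Under these assumptions, two response options differ whenever their lists of circle heights differ, which is the property that the students had to match against the pattern they heard.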
In addition, 8 intermediate tests were designed to be administered in each of the intervention sessions. Each intermediate test consisted of 1 printed sheet with 4 items, testing perception and identification of tonal melodic patterns. The items were numbered consecutively and each contained four response options, similar to the configuration of the pretest-posttest (fig. 1). The students in the D group were presented with dynamic images along with the sound patterns, while those in the S group were presented with static images.
A climate of normality for the pupils was created in order to promote the internal validity of the experimental design. More specifically, the content was taken from the syllabus in the school where the experiment took place and moreover, the activities in the intervention sessions were run by the regular music professor who normally teaches the two groups of subjects (D and S).

Materials
Audiovisual stimuli for each experimental condition were designed and rendered with a video editing software application. The video signal was played by a video player application and routed to a video projector, while the audio signal was routed to a high-quality PA system.

Procedure
The study began with the administration of a questionnaire that collected data from the subjects for the measurement of the intervening variables: age, gender and previous musical experience. As mentioned above, these data were taken into account as covariates in the analysis of the results [16], [17], [32].
A week later, the pretest was administered. Each subject heard a tonal melodic pattern, which was repeated four times, and then had to choose the image which, in his or her view, correctly represented the pattern. There was a 10-second interval between the completion of each test item and the beginning of the following item.
Immediately after the pretest, it was explained to the subjects how the intervention would proceed. Practical demonstrations were used to explain the activities of the 8 intervention sessions, which began the following week.
The sessions took place once a week, always on the same day and in the same time slot during class (mornings). The duration of each session was approximately ten minutes. Each session followed the same protocol: 1) review of the learning content from the previous session; 2) presentation of the new learning content, in which subjects listened to and visualized tonal melodic patterns, performing eight discrimination tasks consisting of patterns of 5 sounds, each of which was repeated five times and presented with sound and either static images (S group) or dynamic images (D group); and 3) completion of a test with 4 discrimination-recognition tasks of tonal melodic patterns under each group's experimental condition. The resulting scores of the discrimination-recognition tasks in each session were used in the statistical analysis. A week after the last intervention session, the posttest was administered.
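As an illustration of how the session tests could be turned into scores, the following sketch marks a response sheet against an answer key and averages the eight session results into a single intervention score. The paper does not state how the session scores were aggregated, so the use of the proportion of correct answers and of a simple mean across sessions is an assumption, as are the function names `proportion_correct` and `intervention_score` and the option labels in the example.

```python
# Illustrative sketch only: the aggregation rule and all names are assumptions.
def proportion_correct(responses, answer_key):
    """Proportion of items on one sheet answered correctly.

    Each item has four response options, only one of which is correct.
    """
    if len(responses) != len(answer_key):
        raise ValueError("one response is required per item")
    correct = sum(1 for given, expected in zip(responses, answer_key)
                  if given == expected)
    return correct / len(answer_key)


def intervention_score(session_sheets, session_keys):
    """Mean proportion correct across the intervention sessions (assumed rule)."""
    if len(session_sheets) != len(session_keys):
        raise ValueError("one answer key is required per session")
    scores = [proportion_correct(sheet, key)
              for sheet, key in zip(session_sheets, session_keys)]
    return sum(scores) / len(scores)


# Made-up example for one hypothetical pupil and two 4-item session tests.
sheets = [["a", "b", "c", "d"], ["a", "a", "c", "c"]]
keys = [["a", "b", "d", "d"], ["a", "b", "c", "d"]]
print(intervention_score(sheets, keys))
```

Expressing every test as a proportion keeps the 10-item pre-posttest and the 4-item session tests on the same scale, which is one possible way of comparing the three moments of measurement.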

Results and analysis
The effects of the two levels of the independent variable, static image presentation (S) and dynamic image presentation (D), were measured at three different moments by the pretest, intervention tests and posttest described earlier (table 1).
The effects are presented through the means and corresponding standard deviations at the three moments of measurement (fig. 2). The intra-group differences were measured in relation to the results of the intervention activities at the different moments of measurement through a one-factor repeated-measures ANOVA (with the Greenhouse-Geisser correction for non-sphericity), taking a confidence level of 95% (common in the Social Sciences and Education). The ANOVA showed no significant differences between groups S and D (F = .56, p = .529). However, there were significant differences between the three moments of measurement (pretest-intervention-posttest) (F = 10.45, p < .001). Therefore, it can be suggested that both types of visual aids facilitated the discrimination of tonal melodic patterns.
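The logic of the repeated-measures F test for the moment of measurement can be sketched as follows. This is a simplified illustration, not the analysis reported here: it covers only the within-subject factor (moment), omits the between-group comparison and the Greenhouse-Geisser correction, and uses made-up scores rather than data from the study. The function name `repeated_measures_anova` is hypothetical.

```python
# Illustrative sketch only: one within-subject factor, no sphericity correction.
def repeated_measures_anova(scores):
    """One-way repeated-measures ANOVA.

    scores: one list per subject, holding one score per moment of measurement.
    Returns the F statistic for the moment effect and its degrees of freedom.
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of moments
    if any(len(row) != k for row in scores):
        raise ValueError("every subject needs a score at every moment")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    moment_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subject_means = [sum(row) / k for row in scores]

    # Partition the total variability into moment, subject and error parts.
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_moment = n * sum((m - grand_mean) ** 2 for m in moment_means)
    ss_subject = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_error = ss_total - ss_moment - ss_subject

    df_moment = k - 1
    df_error = (n - 1) * (k - 1)
    f_value = (ss_moment / df_moment) / (ss_error / df_error)
    return {"F": f_value, "df_moment": df_moment, "df_error": df_error}


# Made-up scores for three hypothetical pupils (pretest, intervention, posttest).
example = [[2, 4, 6], [3, 4, 8], [4, 7, 7]]
result = repeated_measures_anova(example)
print(result)
```

The p-value would then be read from the F distribution with the returned degrees of freedom, typically with a statistical package; when sphericity is violated, the Greenhouse-Geisser correction multiplies both degrees of freedom by an estimated epsilon before that lookup.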
An ANCOVA was carried out in order to estimate the possible effect of age, gender and previous musical studies. The possible interaction of these covariates with the moment was also considered (i.e., the possibility that the effect of the variable on the graphic recognition of sound patterns could vary over time according to the covariate). No significant effect was initially observed. However, after eliminating the covariates with higher p-values (those with less significance) one by one, the resulting model included the interaction of the covariate "previous musical experience" with the "moment". The effect of previous musical experience was significant (F = 7.61, p = .016), with higher scores coming from pupils with more musical experience. The model also confirmed the effect of the moment (F = 6.57, p = .005). Meanwhile, no significant interaction between the moment and previous musical experience was observed (F = 2.57, p = .096). In other words, it can be assumed that the better performance of students with more musical experience with respect to those with less experience remained proportional over time, as measured at the different moments.

Conclusions
The results of this study do not support the superiority of either of the two audiovisual modalities (audiovisual dynamic or audiovisual static) as aids in the process of improving students' ability to identify, discriminate or associate tonal melodic patterns with a non-musical representation. This is consistent with other research [5], [12], [24], [36]. A possible cause of the absence of significant inter-group differences could be found in the nature of the dependent variable [18], which, in this study, was the discrimination of musical sound patterns and eventually the association of patterns with a non-conventional representation. It should be noted that neither the concept nor the types of tasks involved in the aforementioned investigations were similar to the variable studied in this work.
As a minor finding, statistically significant intra-group differences occurred, which suggests that both types of visuals helped the subjects in the discrimination and recognition of sound patterns. This supports research findings [14], [18], [38], [39] which suggest that bimodal presentation modes can facilitate the auditory discrimination of melodic lines through two mechanisms: 1) not increasing, or even decreasing, the extraneous cognitive load in the tasks of perception, discrimination and association; and 2) reinforcing the aural stimulus without distracting the attention of the students [11], [29].
A possible alternative explanation for the improvement in the groups' performance is that, regardless of the mode of presentation of audiovisual information, the intervention sessions required the students to work more regularly and more frequently than in ordinary classes. In this study, therefore, the eight regular sessions of the intervention could have improved the capacity of subjects in relation to the tasks regardless of the mode of presentation of information. If true, it would not be possible to suggest that bimodal presentation is superior to other sensory modes. Another consideration is that students may tend to focus more attention and cognitive resources on activities which take place over relatively short periods of time and which they know will be evaluated by a researcher (in order to please the researcher and obtain desirable results); that is, the results could be partly due to expectation bias or external motivation [37].
The individual differences between the students, represented in this study by the covariates age and gender, did not influence the results in either of the experimental conditions. However, previous musical experience did affect the results obtained in the analysis of the three measures of the dependent variable. This could be explained by the fact that responses from musicians and non-musicians are very different when listening to the same musical stimuli, for reasons that include not only training, experience, ability and musical memory, but also other features such as personality and maturity [21]. Nevertheless, there is also research on the influence of different presentation modes on the perception of music which suggests that both musicians and non-musicians have very similar responses [21].
A recommendation for future research would be to replicate this study with a larger sample in order to obtain a higher level of external validity. Another recommendation is to replicate this study with students from different socio-economic backgrounds in order to explore the potential influence of this variable. In future research, it would also be advisable to control the complexity of the learning content and to increase the number of sessions. Finally, it would be appropriate to approach cognitive styles as a variable in order to determine whether this variable correlates with the results obtained.