Big Data Support for Problem Solving Method in Mass Spectrometry Topic in Modern Analytical Chemistry Course

Scientists, especially in modern analytical chemistry, generate extremely large and unpredictable volumes of data, all of which are digitized and stored in large data repositories. This study aims to build a new approach in chemistry education by utilizing Big Data sources to support the IDEAL (I-Identify problem, D-Define goal, E-Explore possible strategies, A-Anticipate outcomes and act, L-Look back and learn) Problem Solving learning model. Modern analytical chemistry studies and uses instruments to analyze chemical compounds, up to and including structural analysis. Modern instruments, such as the mass spectrometer, generate information on compounds that is stored in big data banks. These data should be accessible and usable in chemistry education. This report discusses the benefits of using Big Data in the learning process in this digital era through IDEAL Problem Solving learning, and presents some preliminary progress. The growing amount of data and resources will also change teaching and learning methodology in higher education. Some highlights of disruptive learning innovation are described.


Introduction
Chemistry is a natural science that studies processes based on chemical reactions and relies mainly on scientific research. Learning chemistry means being exposed to three worlds or levels, namely the real world (macroscopic), the world of theoretical models (submicroscopic), and the world of representation (symbolic) [1]. One of the fields of chemistry is analytical chemistry. Analytical chemistry in general studies and uses instruments and methods to separate, identify, and measure materials both qualitatively and quantitatively. It is often described as the field of chemistry responsible for characterizing the qualitative and quantitative composition of matter [2]. The scope of analytical chemistry can be divided into four parts, one of which is structural analysis. Structural analysis determines the structure of a compound in a sample using modern instrumental methods such as mass spectrometry. Mass spectrometry is used to determine the mass of atoms or molecules. The working principle of the instrument is the deflection of charged particles in a magnetic field: molecules that are initially uncharged (neutral) are converted into charged particles. The method can also be used to determine the content and composition of compounds in a sample when it is combined with a separation method. The instrument can also be used to analyze reaction mechanisms, because the fragmentation pattern of the molecule shows how the original compound breaks apart. The resulting mass spectrum records the fragments of the sample compound, which, when analyzed, provide information about the initial compound [3].
There is more and more research on structural analysis, especially using mass spectrometry, and the resulting data are collected and placed in data warehouses. A survey conducted by IDC reports that the digital world will grow by a factor of 10 from 2013 to 2020, from 4.4 trillion gigabytes to 44 trillion gigabytes. This indicates that the amount of data is roughly doubling every two years [4]. This massive growth of data has driven technological innovation and given rise to the term Big Data. Big Data is described by the following characteristics: Volume (the size of the data in ever larger units: GB, TB), Velocity (the speed of streaming data), Variety (the sources of data: text, images, video, audio), Veracity (the accuracy of the data), and Value (the power to support decisions) [5].
Big data affects the education sector as it does every other sector, and big data analytics plays a major role in education. Big data might transform educational data studies by helping analysts work more efficiently and reach more informative conclusions [6]. With big data available from various sources, educators are aided in conducting classroom learning. Appropriate organization and direction will provide more meaningful learning and a more effective and efficient learning process. More detailed data can be processed and used not only for the development of education itself but also for understanding students more deeply, from their level of absorption to the learning models that match their character [7].
The need for information is increasing as internet information technology advances. Advancing technology automatically affects the education sector, in terms of both administration and the learning process. Internet technology has shifted the learning process from the "chalk and talk" system to an internet-based one, and this shift is among the factors that lead to the generation of big data in educational institutions [8]. Big data is a high-volume, high-velocity, and high-variety information asset that demands cost-effective, innovative forms of information processing for enhanced insight and decision making. This large amount of data makes it possible to analyze more and to provide better decision support [9].
One Big Data source in analytical chemistry is NIST (the National Institute of Standards and Technology), an institution that, among other things, stores large collections of mass spectrometry data. The NIST Mass Spectrometry Data Center is a group in the Biomolecular Measurement Division (BMD) that develops evaluated mass spectral libraries and provides related software. Its site offers information on and access to NIST mass spectral data products. These data products aid compound identification by supplying reference mass spectra of many molecules, including EI and tandem MS libraries (for small molecules as well as larger ones such as peptides), GC retention index collections, and numerous freely available spectral libraries. Data analysis tools are also available for free, including AMDIS (Automated Mass Spectral Deconvolution and Identification System, for GC/MS), Mass Spectrum Interpreter (to elucidate chemical structures from mass spectra), and the Mass Spectrum Digitizer Program. A full version of the NIST MS Search Program is available with a small demonstration library. This Big Data source can be accessed through chemdata.nist.gov or webbook.nist.gov/chemistry [10].
The learning process in the classroom will be better if technological advances in the form of Big Data sources are combined with the right learning model. According to Greeno in Sulasamono, learning models that start from a problem can trigger a thought process in learning [11]. Problems are gaps that occur in (cognitive) thinking. Based on information processing theory, a problem is a situation in which the knowledge stored in memory is not ready to be used to reach a solution [12]. A learning model that can be applied is IDEAL Problem Solving, a problem-solving learning model that is carried out through the IDEAL stages.
IDEAL was introduced as a problem-solving model in education by Bransford and Stein. The model has five indicators: (1) Identify the problem and treat it as a creative opportunity, (2) Define the goals to be set, (3) Explore potential solutions and approaches, (4) Act on the strategy found and anticipate the results, and (5) Look back and learn: review the actual process as well as the consequences of the experience gained [13]. This model enhances the ability to think and improves skills in the problem-solving process. Sometimes the answers do not meet the objectives that were set. In the fifth step of IDEAL problem solving, looking back, if the answers do not match the desired goals or the goals have not been achieved, the student can return to the stage where the error occurred. Thinking skills, or the operational abilities needed to solve problems or tasks, are therefore used extensively [14].
Problem solving learning has several advantages. It is a good learning model for understanding lesson content because it challenges students' abilities and gives them the satisfaction of finding new knowledge. According to Elias and colleagues in [15], the advantages are: (1) it increases awareness of the problem given and of ideas for solving it, (2) it encourages positive expectations for problem solving and diverts attention from undesirable or preoccupying thoughts, (3) it encourages perseverance against emotional stress and tough situations, and (4) it facilitates a positive emotional state, especially within a group. The problem solving model shows that every subject has its own way of thinking, which must be understood. Problem Solving is an alternative, innovative learning model developed on the basis of a constructivist paradigm. Students play an active role in constructing their knowledge so that they can develop their thinking skills. Problem Solving belongs to the group of problem-based learning models, in which the teacher helps students learn to solve problems through learning experiences [16].
Learning with the IDEAL Problem Solving model has been shown to improve mastery of the competencies needed to solve a problem or task, which include identifying problems, formulating problems, finding alternative solutions, and choosing the best solution [17,18]. For every problem, students need to understand carefully how the problem-solving process must be carried out. The problem-solving process plays an important role in developing thinking skills. Problem solving is a core activity in classrooms at all levels of education around the world, since problems are the material for teaching and learning and the basis for intellectual activity in the classroom. Thus, problems shape how students learn. In the end, anticipating, checking, and evaluating student work on problems is a large part of the teacher's or lecturer's task [19]. This is in accordance with research by Jitendra et al., which found that when students with difficulties in problem solving are given proper instruction, their problem-solving performance improves accordingly [20].
The Problem Solving learning model has been widely studied and has been found to increase student activity and learning achievement. Tambunan [21] concluded that problem solving strategies were more effective than scientific approaches for students' abilities in communication, creativity, and mathematical reasoning. Another study, by Pinta, found that the average percentage of misconceptions among students taught with the Problem Solving learning model was lower than among students taught using conventional learning [22].

Method
This research was experimental and used a pre-experimental one-group pretest-posttest design. A pre-experimental design was used because external variables still influence the dependent variable, so the experimental results, which were the dependent variable, were not solely influenced by the independent variable. This is because there was no control variable and the sample was not randomly selected [23]. The one-group pretest-posttest design was chosen because access to classes was limited by the pandemic and because only limited lecture and class materials were available for conducting the research.
The population in this study was all students of the Biotechnology class, State University of Malang, Indonesia, taking the Introduction to Spectroscopy and Microscopy course, in which Mass Spectrometry was one of the topics. The sampling technique was purposive sampling, in which the sample is determined by the research design so that the sample criteria match the research. The sample in this study was one class, chosen with consideration for the lecture material, the available teaching hours, and class limitations. A total of 19 students took part in this research.
The variables in this study were an independent and a dependent variable. The independent variable was the IDEAL Problem Solving learning model assisted by Big Data. The dependent variable was the biotechnology students' learning outcomes on the mass spectrometry topic.
The research procedure was divided into three parts, namely the preparation, implementation, and completion stages. In the preparation stage, the syllabus, lesson plans, grids of learning outcome test questions, research instruments such as problem handouts, mass spectrometry material, and the plan for the learning process were prepared. The implementation stage was carried out in lectures using the IDEAL Problem Solving model with Big Data support over 5 online meetings (1 meeting for the pre-test, 3 meetings for the learning process, and 1 meeting for the post-test). The IDEAL Problem Solving learning model divides the learning process into five stages, which for the Big Data-assisted model are given in Table 1. The lecturer activities listed there include guiding students in reviewing and correcting their ways of solving problems, and guiding them in assessing the effect of the strategies used in problem solving.
The completion stage consisted of an assessment, namely a test of learning outcomes, and data analysis. Assessment in this type of research is the process of collecting data and then processing the information in order to evaluate the achievement of student learning outcomes [24].
Data were collected through learning outcome tests. The test consisted of 5 essay questions that had been validated by experts. The normality test was used to determine whether the data were normally distributed, and therefore which statistical techniques could be used for further data analysis. In this case, normality was tested with the Shapiro-Wilk test in SPSS version 16, at a confidence level of 95% (significance level 0.05). To decide whether the data were normal, the significance value in the SPSS output was evaluated: the data are considered normally distributed if sig. > 0.05. The homogeneity test was used to determine whether the data come from populations with the same variance. Homogeneity was also tested in SPSS version 16 by looking at the significance value in the output: if sig. > 0.05, the data have the same variance, and if sig. < 0.05, the data do not have the same variance.
The t-test used was the paired-samples t-test, two-tailed, in SPSS 16.0. Hypothesis testing was carried out as a comparative test between two sets of scores. In this study, all analyses were performed in SPSS version 16. The results of the normality test, obtained using SPSS version 16, are listed in the table above. The significance value was 0.803 for the pre-test data and 0.139 for the post-test data. Both values are greater than 0.05, so the data are normally distributed and further analysis could be carried out.

Homogeneity of Pre-Test / Post-Test Question Data
The homogeneity test was carried out on the pre-test and post-test data to check whether their variances were homogeneous. The test used was Levene's test in SPSS version 16. The results of the homogeneity test are shown in Table 3. In the output of the test of homogeneity of variances, the significance value was 0.065 > 0.05, which means that the pre-test and post-test scores have the same variance (homogeneous).
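The statistic behind this homogeneity check can be sketched in a few lines. The following is a minimal illustration only, assuming the mean-centered form of Levene's statistic; the function name `levene_w` and the toy scores are hypothetical and are not the study data, and the sketch returns the statistic and its degrees of freedom rather than the significance value that SPSS reports.

```python
import statistics


def levene_w(*groups):
    """Mean-centered Levene statistic W with degrees of freedom (k - 1, N - k)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [statistics.mean(g) for g in groups]
    # Absolute deviation of every score from its own group mean.
    z = [[abs(x - m) for x in g] for g, m in zip(groups, group_means)]
    z_means = [statistics.mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    w = (n_total - k) / (k - 1) * between / within
    return w, (k - 1, n_total - k)


# Toy scores (hypothetical, NOT the study data).
pre = [1, 2, 3, 4, 5]
post = [2, 4, 6, 8, 10]
w, dof = levene_w(pre, post)

# Critical F value for (1, 8) degrees of freedom at alpha = 0.05, from a standard F table.
F_CRIT_1_8 = 5.32
print(f"W = {w:.4f}, df = {dof}, homogeneous: {w < F_CRIT_1_8}")
```

W is compared with the critical F value for its degrees of freedom; a W below the critical value corresponds to sig. > 0.05, that is, homogeneous variances. For two groups of 19 scores the degrees of freedom would be (1, 36).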
Hypothesis testing. Based on the prerequisite tests, the learning outcome data were normally distributed and homogeneous, so a parametric t-test could be used. Because the same students were measured before and after the treatment, the test used was the paired-samples t-test, two-tailed, in SPSS 16.0. The test compares two sets of scores, namely student learning outcomes before using the IDEAL Problem Solving learning model assisted by Big Data and student learning outcomes after using it. The hypothesis test results are listed in Table 4. In the SPSS output, the Sig. value is 0.000, which is less than 0.05, so it can be concluded that using the IDEAL Problem Solving model assisted by Big Data has an effect on student learning outcomes.
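The comparison of pre-test and post-test scores can be illustrated with a short sketch of a paired-samples t-test. This is not the SPSS procedure itself: the function name `paired_t` and the five score pairs are hypothetical, and instead of a significance value the sketch compares the t statistic with a two-tailed critical value taken from a standard t table.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for a paired-samples t-test."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Each student contributes one difference: post-test minus pre-test.
    diffs = [b - a for a, b in zip(pre, post)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / std_err, len(diffs) - 1


# Hypothetical scores for five students (NOT the study data).
pre = [50, 60, 55, 45, 65]
post = [70, 85, 75, 70, 90]
t_stat, dof = paired_t(pre, post)

# Two-tailed critical value at alpha = 0.05 for df = 4, from a standard t table.
T_CRIT_DF4 = 2.776
print(f"t = {t_stat:.2f}, df = {dof}, significant: {abs(t_stat) > T_CRIT_DF4}")
```

With the 19 students of this study the test has 18 degrees of freedom, for which the two-tailed critical value at the 0.05 level is 2.101.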
In this case, Big Data resources brought more information into the class, which was a good sign for the teaching and learning process. As more knowledge was added to the topic, better comprehension could be achieved. Besides the textbook provided for the class, students could access any further materials they needed under the lecturer's supervision. The textbook is in Bahasa Indonesia and was designed for chemistry students. Biotechnology students, however, need to learn about their objects of study from a chemistry point of view to gain a basic understanding of chemical and biochemical analysis. Chemistry topics usually deal with small molecules, while in biotechnology the objects are much bigger in size. There must therefore be some "bridging knowledge" between chemistry, biology, and biotechnology.
The need for better information can be met from open sources on the internet, including the databases recommended by lecturers. The intensive use of mobile technology is helpful for improving processes in higher education. This trend is currently being investigated, along with the growing number of platforms and applications created during the Covid-19 pandemic. The databases provided so far are accelerating further progress in the digital era.
N-Gain Data Analysis. N-gain analysis was used to test the hypothesis that student learning outcomes improve when the IDEAL Problem Solving learning model assisted by Big Data is used. Based on the criteria table for the N-Gain value on learning outcomes, the N-Gain value of 0.5322 lies between 0.3 and 0.7, so the improvement in learning outcomes is categorized as moderate. The hypothesis that "the improvement of student learning outcomes using the IDEAL Problem Solving learning model assisted by Big Data is categorized as moderate" can therefore be accepted. Further study of both factors is underway, since the problem solving approach itself contributes a great deal to the understanding of concepts, as do big data and the applications in the databases. Big data can be a good tool for modern teaching and learning in higher education institutions.
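The reported N-Gain value can be reproduced from the class means given in this paper (pre-test 54.21, post-test 78.58). The sketch below assumes the usual normalized-gain formula, (post - pre) / (max - pre), and a maximum score of 100; neither is stated explicitly in the text, although together they give the reported 0.5322. The function names and the "low" and "high" labels are likewise assumptions; only the moderate band from 0.3 to 0.7 comes from the text.

```python
def n_gain(pre_mean: float, post_mean: float, max_score: float = 100.0) -> float:
    """Normalized gain: the fraction of the possible improvement actually achieved."""
    if pre_mean >= max_score:
        raise ValueError("pre-test score must be below the maximum score")
    return (post_mean - pre_mean) / (max_score - pre_mean)


def categorize(gain: float) -> str:
    """Moderate band 0.3 <= g <= 0.7 as in the text; other labels are assumed."""
    if gain < 0.3:
        return "low"
    if gain <= 0.7:
        return "moderate"
    return "high"


# Class means reported in this paper; max_score = 100 is an assumption.
gain = n_gain(54.21, 78.58)
print(round(gain, 4), categorize(gain))  # prints: 0.5322 moderate
```

Computing the gain from class means is one common convention; averaging the individual gains of each student is another and would generally give a slightly different value.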
The N-gain value is open to interpretation, and it must be interpreted very carefully. Gaining knowledge requires a meaningful teaching and learning process, which should result in better understanding and higher scores. However, whether the gain was due to the treatment given must be followed up with a confirmation test. Interviews with some of the students can provide further explanation for additional analysis.
Results of Learning Outcomes Test Score. The results of the learning outcomes test using the IDEAL Problem Solving learning model assisted by Big Data are presented in Table 5. Learning outcomes in this case are the results of the pre-test and post-test on the mass spectrometry topic. The instrument consisted of several essay questions to be answered by the students. The pre-test and post-test used the same test, which had already been validated. The scores are presented in Table 5, together with a quick analysis of them. Based on the data in Table 5, the pre-test mean score was 54.21, while the post-test mean score increased to 78.58. From these data it can be concluded that using the IDEAL Problem Solving learning model assisted by Big Data had a positive effect on student learning outcomes.
Big data resources contain several types of information, all of which require special skills to extract. However, with a problem-based learning approach, students working in groups can access the database and learn from it. The students were busy and focused on the assignments given, while they also had to learn additional data mining skills. Nevertheless, modern students manage to overcome technological difficulties when working in groups. For example, using the NIST database, the students could extract information about the molecular structures given in a more practical way. The structures can be accessed and filtered using the software provided, and the students learned and understood faster than when using textbooks.
In the future, big data will be accessed by more researchers in many areas of interest. The available data enable researchers to do more analysis rather than collecting data in the laboratory. The experimental part of scientific research can sometimes be skipped, while data analysis becomes more prominent and can be approached from many different points of view. In this way the cost of scientific work can be reduced, all the more so if the data are also used in teaching and learning, since practical work can be minimized too.

Conclusion
Based on the results of data analysis and discussion, it can be concluded that: 1. There was a significant difference between student learning outcomes before and after using the IDEAL Problem Solving learning model with Big Data support, which indicates that the model influenced learning outcomes. 2. There was an improvement in average student learning outcomes on mass spectrometry material with the IDEAL Problem Solving learning model assisted by Big Data, as seen from the average pre-test score of 54.21 and the average post-test score of 78.58. The N-Gain value was 0.5322, so the increase in student learning outcomes was categorized as moderate.

Authors
Irmayanti Muis is currently a master's student in the Chemistry Education Study Program, Faculty of Mathematics and Science, State University of Malang, Indonesia. She is doing research in which chemistry classes are exposed to big data resources, including the NIST MS database. Some current real laboratory data from GC-MS equipment are analyzed using big data resources for the Analytical Instrumentation Method course in the chemistry department. Her email address is muisirma39@gmail.com
Surjani Wonorahardjo is a lecturer and researcher in the Chemistry Department, Faculty of Mathematics and Science, State University of Malang, Indonesia. Her current research is on the development of analytical chemistry methods for characterization and their application in chemistry. In this area, support from big data resources, especially in spectroscopy, is much needed. She is also a member of the Center of Excellence (PUI-PT) Disruptive Learning Innovation, State University of Malang, which emphasizes the development of modern information technology for teaching and learning processes. Her email address is surjani.wonorahardjo@um.ac.id
Endang Budiasih was a senior lecturer in the Chemistry Department, Faculty of Mathematics and Science, State University of Malang, Indonesia. She had expertise in conventional and modern analytical chemistry, besides her main projects in chemistry education. She developed instruments for assessing teaching and learning processes for the chemistry education research group at the university. She passed away in December 2020.