Setting Up and Implementation of the Parallel Computing Cluster in Higher Education

Abstract. In this article, we describe in detail how we set up and implemented a parallel computing cluster for education in the Matlab environment and how we solved the problems that arose along the way. We also present a comparative analysis of the cluster, using multiplication of a large matrix by a vector as the example. The calculations were performed first on one computer and then on the parallel computing cluster. The experiment demonstrated the effectiveness of parallel computing and the need to set up a parallel computing cluster. We hope that the creation of a parallel computing cluster for education will help in teaching parallel computing at institutions of higher education that lack sufficient hardware resources. This paper presents a unique setup and implementation of a parallel computing cluster for teaching and learning in a parallel computing course, together with a wide variety of information sources from which instructors can choose.


Introduction
Today, a large number of important practical problems involve computational experiments whose solution requires very large computing power.
A key part of setting up and implementing a high-performance parallel computing cluster is the training of qualified specialists in education and information technology. To develop education in this area, institutions of higher education must therefore be provided with suitable equipment. For this purpose, the education cluster EUBA was set up at the Department of Applied Informatics of the University of Economics in Bratislava for use in the educational process.
The cluster, which we call the education cluster EUBA, is similar to a Beowulf cluster but differs from it in some respects.
The name Beowulf originally referred to a specific computer built in 1994 by Thomas Sterling and Donald Becker at NASA [1].
A Beowulf is a computer cluster of normally identical, commodity-grade computers networked into a small local area network, with libraries and programs installed that allow processing to be shared among them. The result is a high-performance parallel computing cluster built from inexpensive personal computer hardware. No particular piece of software defines a cluster as a Beowulf. Beowulf clusters normally run a Unix-like operating system, such as BSD, Linux, or Solaris, and are normally built from free and open source software. Commonly used parallel processing libraries include Message Passing Interface (MPI) and Parallel Virtual Machine (PVM). Both permit the programmer to divide a task among a group of networked computers and to collect the results of processing. Examples of MPI software include Open MPI and MPICH; additional MPI implementations are also available (Beowulf cluster. In Wikipedia). The difference is that the education cluster EUBA was set up on the Windows operating system in the Matlab environment.
The purpose of the research is to determine the practical and methodological foundations for improving the training of students in courses on parallel computing, and to implement them in the learning process.
The subject of our research is the introduction of the setup and implementation of a parallel computing cluster into the content of the educational process.
In the course of our research, we analyzed sources on the topic and studied the experience of researchers from different countries in setting up and implementing a parallel computing cluster for education. The works "Barrier to parallel processing courses in computer education and solutions" [3], "Teaching parallel programming using Java" [4], and "Improvement of students' training in parallel and cloud computing" [5] consider the use of parallel computing in the learning process.
In the work "Cluster computing in the classroom and integration with computing curricula 2001", the authors share their experience in teaching cluster computing: the topics chosen depending on course objectives, promising training themes, and proposed course components for teaching on a parallel computing cluster [6].
In the article "Teaching High-Performance Computing on a High-Performance Cluster", the authors describe how a state-of-the-art midsize Linux cluster, bought and operated at department level primarily for education and algorithm development, can be used to teach a large variety of HPC topics such as the basics of parallel algorithms, classical tuning, and hardware-aware programming. Special focus is put on the effects of such an approach on the intensity and sustainability of learning [7].
Virtual machines (VMs) installed on available computer lab resources can be used to simulate high-performance cluster computing environments. In the article "Virtual clusters for parallel and distributed education", the authors describe two such virtual clusters in use at small colleges, demonstrate their effectiveness for parallel computing education, and provide information about how to obtain the VMs for use in an educational lab setting. They have used these clusters to introduce parallelism into several courses in their undergraduate curriculum [8].
The authors of the work "The realization of small cluster parallel computing environment for college education" use the message passing interface (MPI) to build a small Linux-based cluster from a number of ordinary PCs and to establish a parallel development environment at low cost. The system was verified and proved to be reliable. It takes advantage of low-cost hardware to provide a practical parallel programming environment on clusters for general research institutes and research schools [9].
The author of the paper "Research on cloud computing and its application in big data processing of distance higher education" studies the parallel k-means clustering algorithm on the cloud computing platform Hadoop and gives the design and strategy of the algorithm [10].
The article "The application of coarse-grained parallel genetic algorithm with Hadoop in university intelligent course-timetabling system", which is based on the Hadoop cloud computing platform, presents an improved method that fuses a coarse-grained parallel genetic algorithm (CGPGA) with the Map/Reduce programming model to solve the university course-timetabling problem. The simulation results show that, compared with the traditional genetic algorithm, CGPGA not only improves the success rate but also reduces the conflict rate in solving the university course-timetabling problem [11].
In the work "Developing a hands-on course around building and testing high performance computing clusters", the authors describe a successful approach to designing and implementing a high performance computing (HPC) class focused on creating competency in building, configuring, programming, troubleshooting, and benchmarking HPC clusters [12].
In the work "Application of teaching quality assessment based on parallel genetic support vector algorithm in the cloud computing teaching system", the authors compare the parallel and serial algorithms on the Hadoop platform. The experimental results show that GA-SVM based on MapReduce is suitable for teaching quality assessment in a big data environment. As a new technology, cloud computing provides broad scope for developing new models of hardware environment construction, software resource development, network teaching implementation, and personal knowledge management. To deal effectively with the challenges of big data processing in education, that paper proposes a GA-SVM teaching quality assessment algorithm based on MapReduce [13].
In the scientific and pedagogical literature of the University of Economics in Bratislava, problems of parallel computing are studied in the papers "Parallelization of instance methods of a remote object of a distributed .net application mechatronic" [14] and "Parallel programming" [15].
Our analysis of the scientific literature and internet resources shows that courses on parallel computing have been introduced in higher education at universities abroad. At the same time, this paper presents a unique setup and implementation of a parallel computing cluster for teaching and learning in a parallel computing course, together with a wide variety of information sources from which instructors can choose.

Method
The parallel computing cluster was configured on the Windows 10 operating system in the Matlab interactive environment for teaching and learning in the parallel computing course.
Large-scale simulations and data processing tasks that support engineering and scientific activities, such as mathematical modeling, algorithm development, and testing, can take an unreasonably long time to complete or require a lot of computer memory. You can speed up these tasks by taking advantage of high-performance computing resources, such as multicore computers, GPUs, computer clusters, and grid and cloud computing services. MathWorks parallel computing products let you use these resources from Matlab and Simulink without making major changes to your computing environment and workflows [16].
To organize the parallel computing cluster for education, we use the single instruction, multiple data model. The educational cluster consists of 3 computers (Figure 1):
• Master (manages the cluster and interacts with users)
• Nodes (a group of computers)
• Network Switch
When configuring the cluster, computers with the same characteristics should be selected, because identical characteristics simplify the setting of parameters. Initially, we tried to configure and use the parallel computing cluster on low-powered computers, but the parallel calculations ran more slowly. Therefore, we chose more powerful computers.
Technical parameters of the educational cluster:
The mdce service ensures that all other processes are running and that it is possible to communicate with them. Once the mdce service is running, you can use the nodestatus command to obtain information about the mdce service and all the processes it maintains. The commands are as follows:
• mdce install installs the mdce service in the Microsoft® Windows Service Control Manager. This causes the service to start automatically when the Windows operating system boots up. The service must be installed before it is started.
• mdce uninstall uninstalls the mdce service from the Windows Service Control Manager. Note that if you wish to install the mdce service as a different user, you must first uninstall the service and then reinstall it as the new user.
• mdce start starts the mdce service. This creates the required logging and checkpointing directories and then starts the service as specified in the mdce defaults file.
• startjobmanager starts a job manager process and the associated job manager lookup process under the mdce service, which maintains them after that. The job manager handles the storage of jobs and the distribution of the tasks contained in jobs to MATLAB® workers that are registered with it. The mdce service must already be running on the specified computer [17].
• Next, on the Server, run the command: startjobmanager -name EUBA1 -v
• Next, on Node1, run the command: startjobmanager -name EUBA2 -v
• Next, on Node2, run the command: startjobmanager -name EUBA3 -v
• Next, select Parallel -> Discover Cluster on the Server computer (Figure 3) so that the master computer finds the node computers.
• The EUBA educational cluster is now configured.
The educational cluster has 6 processor cores and 12 workers (this can be increased to 18 workers).
When the cluster was connected through the existing university network, parallel computing brought no gain in efficiency. Therefore, we set up a separate modem (Mercury 8-port 10/100 Mbps) to connect the computers of the parallel computing cluster, and with this connection the cluster was effective.
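The role of the job manager described above (storing a job and handing its tasks to registered workers) can be illustrated with a small sketch. This is a Python illustration under our own assumptions, not the Matlab or mdce implementation: threads on one machine stand in for the workers, and the names run_job, worker1, worker2, and worker3 are hypothetical.

```python
import queue
import threading


def run_job(tasks, worker_names):
    """Hand the tasks of one job to workers; return results in task order.

    tasks is a list of (function, argument tuple) pairs.
    """
    pending = queue.Queue()
    for index, task in enumerate(tasks):
        pending.put((index, task))

    results = [None] * len(tasks)
    handled_by = [None] * len(tasks)

    def worker(name):
        # Each worker keeps taking tasks until none are left.
        while True:
            try:
                index, (func, args) = pending.get_nowait()
            except queue.Empty:
                return
            results[index] = func(*args)
            handled_by[index] = name

    threads = [threading.Thread(target=worker, args=(name,)) for name in worker_names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, handled_by


if __name__ == "__main__":
    # Hypothetical job: six independent tasks, each squaring one number.
    job = [(pow, (base, 2)) for base in range(6)]
    results, handled_by = run_job(job, ["worker1", "worker2", "worker3"])
    print(results)
    print(handled_by)
```

Which worker handles which task varies from run to run, but the results are always returned in task order, which is the property a user of the cluster relies on.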

Results of Research
To analyze the execution of parallel programs, we ran a matrix-vector multiplication program. The matrix dimensions ranged from 100x100 and 1000x1000 up to 20000x20000, with corresponding vector lengths from 100 and 1000 up to 20000 [18]. The calculations were performed first on one computer without parallel computing, then with parallel computing on a local computer, and finally with parallel computing on the educational cluster EUBA.
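One common way to parallelize matrix-vector multiplication is to split the matrix into blocks of rows and multiply each block independently. The following is a minimal Python sketch of that decomposition, not the Matlab program used in the experiment; the function names and the small matrix size are assumptions chosen for illustration. The executor is passed in as a parameter. The demonstration uses threads to stay portable, which checks correctness only; in CPython an actual speedup requires processes (for example ProcessPoolExecutor) or separate machines.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def matvec(rows, vector):
    """Serial product of a list of matrix rows with a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in rows]


def split_rows(matrix, n_blocks):
    """Split the matrix into at most n_blocks contiguous row blocks."""
    size = max(1, -(-len(matrix) // n_blocks))  # ceiling division
    return [matrix[i:i + size] for i in range(0, len(matrix), size)]


def parallel_matvec(matrix, vector, executor, n_blocks):
    """Multiply each row block on the executor and join the parts in order."""
    blocks = split_rows(matrix, n_blocks)
    futures = [executor.submit(matvec, block, vector) for block in blocks]
    result = []
    for future in futures:
        result.extend(future.result())
    return result


if __name__ == "__main__":
    n = 200  # illustrative size, far below the 20000 used in the experiment
    rng = random.Random(0)
    matrix = [[rng.random() for _ in range(n)] for _ in range(n)]
    vector = [rng.random() for _ in range(n)]

    serial = matvec(matrix, vector)
    with ThreadPoolExecutor(max_workers=4) as pool:
        parallel = parallel_matvec(matrix, vector, pool, n_blocks=4)

    # Each row is computed with the same operations in both versions.
    assert parallel == serial
    print("row-block result matches the serial result")
```

Each block needs the whole vector but only its own rows, so on a cluster the rows have to be sent to the workers before any computation starts.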

Discussion
Table 1 shows that, after the cluster was set up, the speed of parallel computing on the cluster increases slowly with the matrix dimension, and from a dimension of 18000 onward the parallel computing speed is higher. The slow initial increase is associated with the transfer of data between the processors. Nevertheless, parallel computing on the cluster is much more successful for large matrices (Figure 9).
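The crossover described above can be expressed with a standard timing model. The following is a sketch under our own assumptions, not a formula fitted to the measurements: T_1(n) denotes the run time on one computer for an n x n matrix, p the number of workers, and T_comm(n) the time spent transferring data between the computers. All of these symbols are introduced here for illustration, and the model assumes that the computation divides evenly among the workers.

```latex
T_p(n) = \frac{T_1(n)}{p} + T_{\mathrm{comm}}(n), \qquad
S_p(n) = \frac{T_1(n)}{T_p(n)}, \qquad
S_p(n) > 1 \iff T_{\mathrm{comm}}(n) < \left(1 - \frac{1}{p}\right) T_1(n)
```

Under this model, the cluster is faster than a single computer only when the transfer time is smaller than the computation time saved by dividing the work. This is consistent with the cluster becoming faster only from a dimension of about 18000 and with the effect of a faster network connection.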

Conclusion and Recommendation
In the experiment, we showed that parallel computing is effective for large computations. Parallel computing on a local computer also requires a large amount of RAM. Computers with similar characteristics simplify the settings, and powerful computers should be selected when configuring a cluster. Setting up a parallel computing cluster also requires a fast network, because network speed is closely linked to calculation speed. The programmer must therefore know all the subtleties of parallel computing. We hope that the creation of a parallel computing cluster for education will help in teaching the subject of parallel computing.