Cloud-based Geophysical Inversion Modeling Using GNU Octave and MatlabMPI on Amazon EC2

Abstract: Computer modeling and simulation can be very demanding in terms of computational resources. Cloud computing has opened up new avenues for scientific researchers with limited resources to run complicated simulations. To investigate whether cloud computing is suitable for modeling and simulation, we first describe how to build a cloud-based modeling and simulation platform using GNU Octave and MatlabMPI on the Amazon Elastic Compute Cloud (EC2). Then, we evaluate its performance taking geophysical inversion modeling as an example. The results show that the cloud-based modeling and simulation platform can handle basic modeling for free and can provide much higher performance at an acceptable price. Furthermore, the cost can be cut by employing Spot instances without losing computation performance.


1 Introduction
Computer simulation is defined as a hybrid technology that uses computer science and technology to build simulation models and then performs experiments on the models under various conditions. It offers advantages such as high efficiency, high security, scalability, and flexibility, and has played important roles in many domains with great success [1]. Simulation can be very demanding in terms of computational resources, for example when dealing with large and detailed models. Parallel and distributed simulation (PADS) studies methodologies and techniques for defining and executing simulation models on parallel and distributed computing architectures [2]. In cloud computing, computing resources are paid for per usage and may expand or shrink based on demand [3]. Cloud computing has opened up new avenues for scientists with limited resources to run complicated simulations that hitherto required expensive and computationally intensive resources [4].
The field of modeling and simulation tools is diverse and emergent. General-purpose tools (e.g. MATLAB [5] or Octave [6]) sit beside highly focused and domain-specific applications [7]. MATLAB is a powerful general-purpose tool that is widely employed to solve scientific and engineering problems [8], including modeling and simulation [9][10][11][12][13]. However, using MATLAB on the cloud requires a MATLAB Distributed Computing Server license, which is very expensive and whose price depends on the number of nodes used (see the white paper [14] for details of installing, configuring, and setting up clustered environments using licensed MathWorks products on Amazon EC2).
GNU Octave is a high-level interactive language, primarily intended for numerical computations, that is mostly compatible with MATLAB [6]. Because Octave is developed under the GNU license, the commercial license problem is avoided. MatlabMPI [15] is a set of MATLAB scripts that implement a subset of the Message Passing Interface (MPI) and allow any MATLAB program, or Octave program (as reported by [16]), to be run on a parallel computer.
Amazon EC2 can be used to build a virtual cluster, powerful enough for scientific computing, for free [17]. This paper studies how to build a cloud-based modeling and simulation platform using GNU Octave and MatlabMPI on Amazon EC2, and validates its performance taking geophysical inversion modeling as an example.

2 Building simulation platform on Amazon EC2

2.1 Building virtual cluster on Amazon EC2
There are two steps to build a virtual cluster on Amazon EC2: install StarCluster [18] on the client PC, and then build the virtual cluster on Amazon EC2 (see [17] for details of installation and configuration). We currently use the StarCluster-based Scientific Linux AMI (starcluster-base-scientific-linux-6.5-x86_64-ebs-hvm-07, "ami-9ddadef4"), provided for free by the Amazon community, to build the virtual cluster. The architecture of the virtual cluster built on Amazon EC2 with StarCluster is illustrated in Figure 1 (according to [17]).
http://www.i-joe.org

2.2 Deploying GNU Octave and MatlabMPI on the cloud
(3) Install MatlabMPI
There are three steps to install MatlabMPI:
Step 1: Copy MatlabMPI into a location that is visible to all computers.
Step 2: Add the MatlabMPI/src directory to the Octave path.
octave:1> addpath ~/MatlabMPI/src
Step 3: Go to the 'examples' directory and run a test program to validate the installation. MPI_Run has three arguments: the name of the MatlabMPI program (without the ".m" suffix), the number of machines to run the program on, and the "machines" argument, which contains a list of the names of the machines on which to run the program.
The "machines" argument can take the following forms:
machines = {}; % Run on a local processor.
machines = {'machine1', 'machine2'}; % Run on multiple processors.
The result of running the 'xbasic.m' program is shown as follows:

3 Cloud-based geophysical inversion modeling

3.1 General description
The magnetic method is one of the most popular geophysical techniques for the exploration of minerals, oil and gas resources [19]. Inversion is one of the important tasks in the quantitative interpretation of magnetic data. Magnetic inversion, like other geophysical inversions, is an ill-posed problem because the inversion result is non-unique. Tikhonov regularization is used to solve ill-posed geophysical inversion problems [20], which raises two major issues that must be studied: (1) how to choose the optimal regularization parameter; (2) how to find the global optimal solution independently of the initial guess.
J. Xiong and T. Zhang proposed a multiobjective optimization (MO) inversion algorithm that deals with these two issues at the same time, but it has the disadvantage of a very long computation time [21]. We implement the MO inversion algorithm on the cloud-based modeling and simulation platform described in Section 2 to overcome this computation time problem.

3.2 Problem formulation
Forward Modeling. We consider 2D magnetic forward modeling. We divide the subsurface into several regularly arranged 2D prisms. The magnetic anomaly of a prism, whose geometry is shown in Fig. 3, is given by equation (1).
Fig. 3. Geometry of a vertical prism of finite depth extent (according to [21]).
In equation (1), $F$ is the magnetic anomaly, $F_e$ is the Earth's magnetic field (EMF) intensity, $k$ is the susceptibility contrast between the magnetic target and the host medium, $I$ is the inclination of the EMF, and the remaining symbols denote the strike angle of the prism relative to magnetic north and the distances and angles shown in Fig. 3.
The magnetic anomaly vector is

Inversion Modeling. The goal of the inversion is to derive a model m that best fits the observed data.
The objective function of the data misfit minimizes the difference between the observed data and the data predicted from the model [21]. Regularization adds a model constraint to this objective; the most common model constraint is the minimum norm of the model vector. The regularization factor, which determines the relative weight of the data misfit and the model constraint, strongly affects the inversion result.
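To make the role of the regularization factor concrete, the following minimal Python sketch evaluates an objective of the usual Tikhonov form, assumed here to be $\phi(m) = \phi_d(m) + \lambda\,\phi_m(m)$ with a squared-residual data misfit and a squared-norm model constraint. The toy linear operator `G`, the data `d_obs`, the two candidate models, and all function names are hypothetical and chosen only for illustration; the forward operator used in the paper is the prism formula, not this matrix.

```python
def forward(G, m):
    """Toy linear forward model d = G m (an assumption made for illustration)."""
    return [sum(g * x for g, x in zip(row, m)) for row in G]

def data_misfit(G, m, d_obs):
    """Squared L2 norm of the residual between observed and predicted data."""
    return sum((o - p) ** 2 for o, p in zip(d_obs, forward(G, m)))

def model_norm(m):
    """Minimum-norm model constraint: squared L2 norm of the model vector."""
    return sum(x * x for x in m)

def tikhonov_objective(G, m, d_obs, lam):
    """Data misfit plus lam times the model constraint; lam is the regularization factor."""
    return data_misfit(G, m, d_obs) + lam * model_norm(m)

G = [[1.0, 1.0]]        # a single measurement that only sees the sum of two cells
d_obs = [2.0]
m_small = [1.0, 1.0]    # fits the data exactly, small norm
m_large = [3.0, -1.0]   # fits the data exactly as well, larger norm

for lam in (0.0, 0.1):
    print(lam,
          tikhonov_objective(G, m_small, d_obs, lam),
          tikhonov_objective(G, m_large, d_obs, lam))
```

Both candidate models reproduce the observation exactly, which illustrates the non-uniqueness of the inversion. With a zero regularization factor the objective cannot tell them apart, while any positive factor prefers the smaller-norm model.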
Because it is difficult to choose the optimal regularization factor, J. Xiong and T. Zhang employ the MO inversion method to minimize the data misfit and the model constraint simultaneously [21]. The MO objective function is given by equation (6).
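Minimizing two objectives simultaneously is usually expressed through Pareto dominance. The sketch below is a generic illustration of that idea, not the implementation of [21]: each candidate model is summarized by a hypothetical pair (data misfit, model norm), and only the candidates that no other candidate dominates are kept. The function names and the sample values are assumptions.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated(points):
    """Return the points that no other point dominates, i.e. the Pareto front."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Each tuple is (data misfit, model norm) for one hypothetical candidate model.
candidates = [(0.1, 9.0), (0.5, 4.0), (2.0, 1.0), (2.5, 4.5), (0.5, 6.0)]
front = nondominated(candidates)
print(front)
```

No regularization factor appears in this formulation: the trade-off between fitting the data and keeping the model small is represented by the whole front, from which a leader can be picked afterwards.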

3.3 Modeling method
J. Xiong and T. Zhang employed a multiobjective particle swarm optimization inversion (MPSO-I) algorithm to solve equation (6) successfully, but at the cost of a very long computation time [21].
In the MPSO-I algorithm, the swarm consists of N particles. At every iteration, the position and speed of the i-th particle are updated from its position, speed, and personal best at the k-th iteration, from LEADER, the global best of the swarm at the k-th iteration, and from the inertial weight $\omega$.
Here, we propose a parallel MPSO-I algorithm and implement it on the cloud-based modeling and simulation platform using GNU Octave and MatlabMPI on Amazon EC2 to overcome the computation time problem. The cloud-based geophysical inversion modeling algorithm is illustrated in Fig. 4.
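As an illustration of such an update, the sketch below implements the standard particle swarm step, $v \leftarrow \omega v + c_1 r_1 (p_{best} - x) + c_2 r_2 (\mathrm{LEADER} - x)$ followed by $x \leftarrow x + v$. This textbook form is an assumption; the exact update of [21] may differ. The acceleration coefficients `c1` and `c2`, the default value of `omega`, and the clamping of positions to the susceptibility bounds are likewise hypothetical choices.

```python
import random

def pso_step(x, v, pbest, leader, omega=0.7, c1=1.5, c2=1.5,
             lower=0.0, upper=1.0, rng=random):
    """One standard PSO update for a single particle (assumed form, not taken from [21])."""
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, pbest, leader):
        r1, r2 = rng.random(), rng.random()
        # Inertia term plus attraction toward the personal best and the swarm leader.
        vi = omega * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        # Move the particle and keep it inside the allowed bounds.
        xi = min(max(xi + vi, lower), upper)
        new_v.append(vi)
        new_x.append(xi)
    return new_x, new_v

rng = random.Random(0)
x, v = [0.2, 0.8], [0.0, 0.0]
x_next, v_next = pso_step(x, v, pbest=[0.3, 0.7], leader=[0.5, 0.5], rng=rng)
print(x_next, v_next)
```

Starting from rest, the first coordinate can only move up toward its personal best and the leader, and the second can only move down, which is the behaviour the update is meant to produce.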
In Fig. 4, we can see that there are three main steps in the algorithm: (1) the master node broadcasts the necessary parameters to the slave nodes, including the sub-swarm size, the local iteration number N, the upper and lower boundaries of the susceptibility, and the magnetic field parameters such as $I$ and $F_e$, so that each slave node can initialize a sub-swarm locally; (2) each sub-swarm iterates locally on its slave node for N iterations; (3) the master node gathers the information and chooses the global LEADER.
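The three steps can be sketched in a few lines of Python. This is a sequential simulation under several stated simplifications, not the Octave and MatlabMPI implementation: the slave nodes are run one after another in a single process, a scalar `toy_objective` stands in for the multiobjective leader selection, and the particle move is a simplified step toward the local best. All names and parameter values are hypothetical.

```python
import random

def toy_objective(m):
    """Hypothetical stand-in for the inversion objective: squared distance to a known model."""
    target = [0.3, 0.6]
    return sum((a - b) ** 2 for a, b in zip(m, target))

def run_subswarm(params, seed):
    """Work of one slave node: initialize a sub-swarm and iterate it locally."""
    rng = random.Random(seed)
    lo, hi = params["lower"], params["upper"]
    swarm = [[rng.uniform(lo, hi) for _ in range(params["dim"])]
             for _ in range(params["subswarm_size"])]
    best = min(swarm, key=toy_objective)
    for _ in range(params["local_iterations"]):
        for i, particle in enumerate(swarm):
            # Simplified move: step toward the local best with a small random perturbation.
            swarm[i] = [min(max(p + 0.5 * (b - p) + rng.uniform(-0.05, 0.05), lo), hi)
                        for p, b in zip(particle, best)]
        candidate = min(swarm, key=toy_objective)
        if toy_objective(candidate) < toy_objective(best):
            best = candidate
    return best

# Step 1: the master prepares the parameters it would broadcast to the slaves.
params = {"subswarm_size": 10, "local_iterations": 20, "dim": 2,
          "lower": 0.0, "upper": 1.0}
# Step 2: each slave iterates its own sub-swarm (run sequentially here).
local_bests = [run_subswarm(params, seed) for seed in range(4)]
# Step 3: the master gathers the local results and selects the global leader.
leader = min(local_bests, key=toy_objective)
print(leader, toy_objective(leader))
```

Because the sub-swarms only exchange information in steps 1 and 3, the local iterations are independent of one another, which is what makes the scheme suitable for a cluster connected by message passing.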

4 Numerical Results
The cloud-based geophysical inversion modeling algorithm is implemented using GNU Octave and MatlabMPI, and run on the cloud-based modeling and simulation platform on Amazon EC2. It yields the same inversion modeling results as [21] but in much less computation time. This paper focuses on the computation time and cost of cloud-based geophysical inversion modeling, since the details of the modeling results can be found in [21]. For comparison, we also run the same geophysical inversion modeling algorithm on a PC with an Intel i7 6500U CPU and 8 GB of memory, running Windows 10 and Octave 4.2. The computation times of the PC and of the different instance types, together with the instance prices, are listed in Table 1.
iJOE - Vol. 13, No. 9, 2017
The comparison of geophysical inversion modeling computation time between the PC and the different instance types is illustrated in Fig. 5.
From Table 1 and Fig. 5 we can see that: (1) Although a single t2.micro instance is less powerful than the comparison PC, a cluster of t2.micro instances is more powerful than the PC, and it is free. (2) A one-node c3.large cluster is as powerful as the PC, and a c4.4xlarge instance is more powerful than the PC. Furthermore, clusters of c3.large and c4.4xlarge instances are much more powerful than the PC. (3) The prices of the c3.large and c4.4xlarge instances are 0.105 and 0.796 $/hour, respectively, for On-Demand instances, and 0.0156 and 0.1887 $/hour, respectively, for Spot instances, all of which are affordable for scientific researchers.
To investigate the price issue in detail, we calculate the total prices for 100 modeling runs on Amazon EC2 for the different instance types and list the results in Table 2. The total prices vary from free to $63.26, depending on the computation ability of the instance type and the size of the cluster.
We combine the computation time data in Table 1 and the total price data in Table 2 and plot Fig. 6 and Fig. 7 to show the relationship between them. From Fig. 6, we can see that, for On-Demand instances, the computation time of geophysical inversion modeling on a cluster of c4.4xlarge instances is about half of that on c3.large instances, whereas the total price of the c4.4xlarge instances is about 3 times that of the c3.large instances. These results indicate that, if computation time is not critical, the price of the lower-performance instance is more acceptable.
The price of a Spot instance is much lower than that of an On-Demand instance. From Fig. 7 we can see that, for Spot instances, the computation time of geophysical inversion modeling on a cluster of c4.4xlarge instances is about half of that on c3.large instances, whereas the total price of the c4.4xlarge instances is about 4 times that of the c3.large instances. These results indicate that, if computation time is not critical, the price of the lower-performance Spot instance is even more acceptable.
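The arithmetic behind such a price comparison can be sketched as follows. The hourly prices are the ones quoted in the text and the count of 100 runs follows the experiment; the proportional cost model (which ignores any billing granularity), the cluster size of 4 nodes, and the run time of 0.1 hours are assumptions made only for illustration and are not taken from Table 1 or Table 2.

```python
# Hourly prices in $/hour, as quoted in the text for each instance type and market.
PRICES = {
    ("c3.large", "on-demand"): 0.105,
    ("c3.large", "spot"): 0.0156,
    ("c4.4xlarge", "on-demand"): 0.796,
    ("c4.4xlarge", "spot"): 0.1887,
}

def total_price(instance, market, nodes, hours_per_run, runs=100):
    """Proportional cost model (an assumption): price x nodes x hours per run x runs."""
    return PRICES[(instance, market)] * nodes * hours_per_run * runs

def spot_saving(instance):
    """Fraction of the On-Demand hourly price saved by using a Spot instance."""
    return 1.0 - PRICES[(instance, "spot")] / PRICES[(instance, "on-demand")]

# Hypothetical workload: 4 nodes, 0.1 hours per run, 100 runs.
for instance in ("c3.large", "c4.4xlarge"):
    on_demand = total_price(instance, "on-demand", nodes=4, hours_per_run=0.1)
    spot = total_price(instance, "spot", nodes=4, hours_per_run=0.1)
    print(f"{instance}: on-demand ${on_demand:.2f}, spot ${spot:.2f}, "
          f"saving {spot_saving(instance):.0%}")
```

With the quoted prices, the Spot market saves roughly 85% of the hourly price for c3.large and roughly 76% for c4.4xlarge, independently of the assumed workload.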

5 Conclusion
Computer modeling and simulation can be very demanding in terms of computational resources. However, not all scientific researchers have access to High Performance Computers (HPC). Cloud computing has opened up new avenues for scientists with limited resources to run complicated simulations that hitherto required expensive and computationally intensive resources. Thus, in this paper, we study how to build a cloud-based modeling and simulation platform using Octave and MatlabMPI on Amazon EC2, and validate its performance taking geophysical inversion modeling as an example. Our main findings are: (1) The basic cloud-based modeling and simulation platform (t2.micro instances) is free and powerful enough for basic geophysical inversion modeling. (2) The higher-performance cloud-based modeling and simulation platforms (c3.large or c4.4xlarge instances) are much more powerful, and their price is acceptable for scientific researchers. (3) Although a Spot instance is as powerful as an On-Demand instance, it is much cheaper. We could therefore build a cloud-based modeling and simulation platform on Spot instances to run complicated modeling without spending too much money.