Flood Disaster Identification and Decision Support System using Crowdsource Data Based on Convolutional Neural Network and 3S Technology

Flooding causes significant damage to lives and property, and it affects the economy and people's way of life in both the short and long term. This research therefore demonstrates techniques for flood detection from digital images and the development of a web application for receiving and inspecting reports on flood situations in every area. The process relies on crowdsource data and 3S technology so that accurate, real-time data are available for flood management decisions and for aiding people in disaster areas. A convolutional neural network was applied to detect and classify floods using digital images and data from the public or the victims. The convolutional neural network achieved high flood classification accuracy: 95.50% overall accuracy, 93.00% producer accuracy, 97.89% user accuracy, and a kappa statistic of 0.91. The technique also saves cost, time, and labour. Furthermore, the method could be applied to other disasters such as landslides, earthquakes, and fires, both to monitor incidents for each type of disaster and to examine damaged sites afterwards.


Introduction
Flooding is a natural disaster that can have a significant impact on people's lives and property. Thailand frequently suffers from flooding during the monsoon seasons [1], and each flood typically spreads over several provinces [2]. Relying only on human labour or government officials to investigate the resulting damage therefore requires a large amount of funding and human resources. It also takes time to obtain the flood area information needed to make informed decisions about relief for flood victims. This information can be used to help victims, to alleviate damage, and to simulate flooding that could occur in the future. With today's information technology, data from volunteers, known as crowdsource data, can supplement traditional operations that use only human labour, making the process easier, more economical, and faster [3].
The method currently used takes information from social networks, where victims post flood situation information as text messages, images, and videos so that the relevant persons can access it quickly. However, using online social networks to monitor flooding still has several limitations [3] [4]. First, posts often cannot be used to determine the exact coordinates of a flooded area because the current location is not recorded. Second, even when a location is recorded, it may not correspond to the actual flooded area, since social networks use the location of the latest check-in or GPS update by default, and users can check in at other locations as they wish. Without a check-in at the current position, the data may be inaccurate or incorrect. Third, checking digital images of flooded areas requires human labour. With so many postings on these networks, they are hard to examine without an automated process that screens for images of actual flood situations [3].
In disaster management, the technologies that currently play an important role in spatial management are geographic information systems (GIS), remote sensing (RS), and global positioning systems (GPS), collectively known as 3S technology. 3S technology has been used in handling wildfires [5], landslides [6], and floods [7] [8] [9]. It can determine the coordinates and location of any point on the Earth and record changes on the Earth over time, making it possible to track them. In addition, artificial intelligence techniques have been used extensively in flood forecasting and management. The convolutional neural network (CNN) is one of the most popular artificial intelligence techniques available today and is widely used in image classification because of its high accuracy [10]. CNN has been developed continuously, as evidenced by LeNet, AlexNet, ZFNet, and GoogLeNet [11]. The literature on image classification using CNN reports highly accurate results. Therefore, this research proposes a model system for recording and verifying data in flooded areas. The system uses GIS and GPS to monitor and record the coordinates of flooded areas and uses CNN to screen the images submitted by victims or volunteers for images of flooding. The reports are then checked against information from village leaders or staff and against satellite image-processing data.

Literature Review

Geographic information system
Geographic information systems (GIS) are widely used in many fields, such as flood forecasting [12], the management of transportation routes, and disaster management, particularly for flooding. GIS has been combined with sensors that measure the amount of water, for example in rainfall and streams. The data can be used to monitor or track flood occurrences [13][14] [15] through web applications or applications on mobile devices or smartphones [15] [16]. Most of the results are shown as water levels, which allow users to monitor the situation in each area. In addition, many mobile geographic information systems have been developed, including a prototype system that uses hydrological models to forecast floods and issue warnings via mobile devices so that people can prepare for flood incidents [17]. A mobile application for disaster management has also been developed using semantic web technology. Its primary function is to report flood situations using images and to notify demand for various kinds of resources [18].

Global positioning system
According to the literature, positioning systems are of great importance and are widely used in both military and civilian applications, including navigation [19][20], transportation tracking [21] [22], location and tracking [23] [24], and survey and mapping work [25] [26]. The global positioning system (GPS) is necessary in disaster management because the coordinates of each disaster must be known before it can be handled. Its uses in disaster management include landslide surveillance [27] and tsunami warning [28]. In flood disaster management, GPS is used to collect rainfall data at each location in order to predict the amount of rainfall and runoff [29], and it is also useful in flood surveillance [30].

Convolutional neural network
Clarifai is a CNN image classifier that uses the ZFNet architecture. ZFNet was designed by tweaking the network parameters of AlexNet. The model is similar to AlexNet, with the same blocks and the same depth, but its first convolution uses a stride of 2 instead of 4 and a filter size of 7 instead of 11, at the cost of slightly more weights and operations. It uses deconvolution to visualize and understand convolutional networks, as shown in Figure 1. This model dramatically reduces the number of network parameters and improves overall classification accuracy. In remote sensing research, CNN image classification is used to classify different types of land cover such as vegetation, roads, constructions, ground, and water [31], and to examine buildings or constructions in satellite images [32] [33]. The CNN technique is one of the most popular deep learning approaches at present. CNN was first introduced in 1998 [34], but it was virtually impossible to use in real work at the time because computing resources could not process the results quickly. A concept presented after 2006 [35] is considered a turning point in neural networking because it allowed neural networks to be constructed with deep structures. The CNN technique has since developed into deep learning, and several deep learning models are now available. Clarifai developed its approach using Python and provides APIs for development in languages such as Python and PHP [36][37] [38]. This research uses Clarifai because it can be used to develop web applications and applications for screening flood-related images. Preliminary tests showed that it can classify images of flood occurrences.
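The difference between the first convolutional layers of AlexNet and ZFNet described above can be made concrete with the standard output-size formula for a convolution, floor((n + 2p - k) / s) + 1, where n is the input size, k the filter size, s the stride, and p the padding. The following is a minimal sketch; the 224-pixel input and the absence of padding are illustrative assumptions, not details taken from the cited architectures.

```python
def conv_output_size(n: int, k: int, s: int, p: int = 0) -> int:
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1


INPUT_SIZE = 224  # assumed input width and height in pixels

# First-layer settings taken from the text: 11x11 filter with stride 4
# (AlexNet) versus 7x7 filter with stride 2 (ZFNet).
alexnet_first = conv_output_size(INPUT_SIZE, k=11, s=4)
zfnet_first = conv_output_size(INPUT_SIZE, k=7, s=2)

print("AlexNet-style first feature map side:", alexnet_first)
print("ZFNet-style first feature map side:", zfnet_first)
```

Under these assumptions the side of the first feature map grows from 54 to 109, which illustrates why the smaller filter and stride retain finer spatial detail in the first layer.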
According to the literature on applying deep learning to application development, deep learning has been used to develop an application for screening road accidents [37] and to screen fire incident images [39]. However, it has not yet been applied to distinguishing flooded from non-flooded areas using digital images.

Crowdsource data
Crowdsource data has been used in disaster management in several ways. It has been used to manage floods, wildfires, and earthquakes in the United States and Haiti, where the data provided by victims include information on electricity, food, water, communication, and damage in each area, which can be used to assist the victims [40]. Other disaster management research collects data from crowds using various devices, including smartphones, tablets, and computers [41]. The City-Share software was developed for flood disaster management in 7 European countries. It focuses on crowdsource data so that victims have more channels for providing data about resource demands, flood occurrence in each area, and communication, and it emphasizes public displays to inform relevant persons and other people [42]. Crowdsource data obtained from the online social network Twitter has also been used for handling disasters in the Philippines, where the extracted messages are classified for disaster occurrence using the Naïve Bayes technique [43]. This makes it possible to screen the messages more accurately and match them to needs. However, that research classified only text; no images were derived from the crowdsource data. Crowdsource data has also been used to manage floods by allowing people or victims to upload images or videos of flood situations, which are then analysed to identify the flow direction and speed of the water to support management and decision-making [44]. In addition, crowdsource data has been used to assess road damage after a flood, with volunteers or other people providing the information. The reports must include images of the damaged roads as well as location coordinates, so the system can display road damage data and the locations of the damaged areas [45].
According to the literature review, using crowdsource data in disaster management makes it possible to track damage or flood situations in real time and to obtain flood information in each area more effectively than traditional surveys carried out by people, because the number of social networking users is enormous. However, the literature also shows a lack of work on classifying digital image data so that only data for real floods are retained.

Methodology
This research has two main objectives. The first is to use CNN to distinguish between images that show flood events and images that do not, so that only flood-related images are retained in the developed web application. The second is to develop a web application to receive and monitor flood events from crowdsource data. The research methods are shown in Figure 2. The system can be divided into 3 parts: data input, data processing, and data visualization. Data processing consists of (1) the import of flood data from the crowd, which is characterized by automatic user positioning and flood image screening, and (2) a flood decision support system built on the database. The developed web application is rendered in a web browser and can be used on any device.

Study area
The study area consists of 3 provinces in southern Thailand: Chumphon, Surat Thani, and Nakhon Si Thammarat. The study area is shown in Figure 3, where the image on the left is Thailand, followed by Chumphon, Surat Thani, and Nakhon Si Thammarat. Each blue polygon is an area affected by the flood in 2017. This area was chosen because it lies in the zones of both the southwest monsoon and the northeast monsoon, so it suffers flooding almost every year. However, the methods and applications presented in this research can be applied in other areas without limitation.

Data preparation and pre-processing
This research uses 3 main data sets. The first is the crowdsource data on flooding reported in each area and imported through the developed system. It contains descriptive information about the flood situation, flood area coordinates, and images of flooding in each area, and covers the flooding in early and late 2017 in Chumphon, Surat Thani, and Nakhon Si Thammarat. The second is a survey of the actual flood areas, which is used to validate the process presented in this research. The third is the flood data generated by satellite image processing at the Geo-Informatics and Space Technology Development Agency (Public Organization) (GISTDA), which is used to monitor flood areas in the developed system.

Flood detection using CNN classification
This research uses CNN to distinguish between flooded and non-flooded images, which helps system administrators screen out irrelevant information. This makes screening or investigating flood reports from a large crowd of informants faster and reduces the burden on those involved in flood monitoring, as only the relevant information is retained. The tool used is Clarifai, which is integrated into the web application for screening. The test set consists of 200 images: 100 flood images and 100 non-flood images. If a picture uploaded by the crowd does not show a flood, a pop-up window warns the user that the picture is not relevant and that a new picture should be uploaded. If the picture contains water or flooding, it is stored in the database.
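The screening decision can be sketched in a few lines. The rule used here is the one reported in the Results section: an image is accepted when the Water or Flood class is among the top 5 predicted classes with a probability of at least 0.8. The input format (a list of label and probability pairs) and the function name are hypothetical; the sketch does not call or reproduce the Clarifai API.

```python
from typing import List, Tuple

FLOOD_LABELS = {"water", "flood"}  # classes that indicate a flood image
MIN_PROBABILITY = 0.8              # threshold reported in the Results section
TOP_K = 5                          # only the top 5 classes are considered


def is_flood_image(predictions: List[Tuple[str, float]]) -> bool:
    """Return True if Water or Flood is in the top-k classes with p >= threshold.

    `predictions` is a hypothetical classifier output: (label, probability) pairs.
    """
    top = sorted(predictions, key=lambda item: item[1], reverse=True)[:TOP_K]
    return any(
        label.lower() in FLOOD_LABELS and probability >= MIN_PROBABILITY
        for label, probability in top
    )


# Illustrative (made-up) predictions for two uploads.
flooded = [("Water", 0.97), ("Flood", 0.93), ("Road", 0.81), ("Tree", 0.60), ("Car", 0.42)]
dry = [("Grass", 0.95), ("Road", 0.90), ("Car", 0.88), ("Sky", 0.70), ("Tree", 0.65)]

for name, preds in (("flooded", flooded), ("dry", dry)):
    action = "store in database" if is_flood_image(preds) else "ask for a new image"
    print(name, "->", action)
```

Keeping the threshold and label set as named constants makes it straightforward to tune the rule if the classifier or its class vocabulary changes.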

System analysis and development
In this research, 3S technology is used in several parts of the system, such as determining the coordinates of the flood area using GPS and displaying flood area information using GIS and RS. The database was analysed and designed around the main functions of the system so that data can be collected and fully utilized. The users are divided into 3 main groups: general users, village leaders or officials, and system administrators, each with its own rights. General users can browse the flood areas in Thailand and report the flood situation in each area, but they need to log in to submit a report. Village leaders or officials have more privileges than general users: they can examine the flood data for the villages they lead and can report on flood areas. Administrators have the most rights. They can manage user information, approve user data additions, check flood information, and check flood areas against satellite imagery processing or information provided by relevant agencies in order to report on floods in each area and period. Two kinds of tools were used to develop the flood notification and monitoring system. The first comprises the development languages and libraries: PHP, JavaScript, CSS, Bootstrap, and the Google Maps API. The second is the server software: Apache serves the web application, and PostgreSQL stores the database. Open-source software was emphasized in choosing the tools because such software entails no licence costs.
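The three user groups and their rights can be summarized as a small permission table. The sketch below is one possible reading of the description above, not the system's actual implementation (which is written in PHP); all role and action names are hypothetical, and the restriction of village leaders to their own villages is not modelled.

```python
from enum import Enum, auto


class Role(Enum):
    GENERAL_USER = auto()
    VILLAGE_LEADER = auto()
    ADMIN = auto()


class Action(Enum):
    BROWSE_FLOOD_MAP = auto()
    REPORT_FLOOD = auto()
    VERIFY_VILLAGE_REPORTS = auto()
    MANAGE_USERS = auto()
    UPLOAD_SATELLITE_DATA = auto()
    GENERATE_REPORTS = auto()


# Hypothetical mapping: each group inherits the rights of the group below it.
GENERAL = {Action.BROWSE_FLOOD_MAP, Action.REPORT_FLOOD}
LEADER = GENERAL | {Action.VERIFY_VILLAGE_REPORTS}
ADMIN = LEADER | {Action.MANAGE_USERS, Action.UPLOAD_SATELLITE_DATA, Action.GENERATE_REPORTS}

PERMISSIONS = {
    Role.GENERAL_USER: GENERAL,
    Role.VILLAGE_LEADER: LEADER,
    Role.ADMIN: ADMIN,
}


def can(role: Role, action: Action, logged_in: bool) -> bool:
    """Browsing is open to everyone; every other action requires a login."""
    if action is Action.BROWSE_FLOOD_MAP:
        return True
    return logged_in and action in PERMISSIONS[role]


print(can(Role.GENERAL_USER, Action.REPORT_FLOOD, logged_in=False))
print(can(Role.VILLAGE_LEADER, Action.VERIFY_VILLAGE_REPORTS, logged_in=True))
```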
The system has 3 features that involve the public more closely in reporting the flood situation in each area and reduce the burden on the administration. 1) Automatic coordinate positioning. Users are not allowed to add or edit coordinates. Instead, the system retrieves the user's current position from the user's device via the geolocation function. The purpose is to reduce the possibility of fake locations that differ from the actual flood disaster areas, which results in higher positional accuracy. 2) Screening of flood-related images. CNN is used to screen the images that users upload or add to the system. This research uses the CNN provided by Clarifai in the web application to check, before an upload is accepted, whether water appears in the image. If not, the system prompts the user to upload a new image. This feature yields more accurate information while reducing the workload of manual data validation. 3) Reporting. Reports on the flooded areas can be generated for the entire area or only for selected provinces, districts, sub-districts, and villages, and are available for each year. Reports are exported in MS Excel format so that the data can be used for analysis or handling.
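The reporting feature amounts to filtering stored reports by year and administrative unit and exporting the result. The following is a minimal sketch under stated assumptions: the record fields, the centimetre unit, and the district names are hypothetical, and CSV is used here as a spreadsheet-readable stand-in for the MS Excel export so that the example needs only the standard library.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FloodReport:
    province: str
    district: str
    year: int
    water_level_cm: float  # the unit is an assumption


def select_reports(
    reports: List[FloodReport],
    year: int,
    province: Optional[str] = None,
    district: Optional[str] = None,
) -> List[FloodReport]:
    """Keep reports for one year, optionally narrowed to a province or district."""
    selected = []
    for report in reports:
        if report.year != year:
            continue
        if province is not None and report.province != province:
            continue
        if district is not None and report.district != district:
            continue
        selected.append(report)
    return selected


def to_csv(reports: List[FloodReport]) -> str:
    """Serialize reports as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["province", "district", "year", "water_level_cm"])
    for report in reports:
        writer.writerow([report.province, report.district, report.year, report.water_level_cm])
    return buffer.getvalue()


# Made-up records; "District A" and "District B" are placeholders.
reports = [
    FloodReport("Chumphon", "District A", 2017, 40.0),
    FloodReport("Surat Thani", "District B", 2017, 65.0),
    FloodReport("Chumphon", "District A", 2016, 30.0),
]
chumphon_2017 = select_reports(reports, year=2017, province="Chumphon")
print(to_csv(chumphon_2017))
```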

Accuracy assessment
The results are verified against the flood data obtained from the site survey and the data from GISTDA satellite image processing. For the flood images uploaded by the crowd and the non-flood images, which were not surveyed on site, CNN is used to classify the two types of images and identify whether or not each image shows a flooded area. The criteria used in the assessment are the Producer Accuracy (Recall) of each class, the User Accuracy (Precision) of each class, Overall Accuracy (OA), and Kappa. These values make it possible to evaluate the accuracy of the flood classification and to judge how well the method works.
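The four criteria can all be computed from a two-class confusion matrix. The sketch below follows the usual remote-sensing convention in which producer accuracy is computed per reference class and user accuracy per predicted class. The counts are an inference, not data reported by the authors: assuming 100 reference images per class, as stated for the test set, they are a set of counts that reproduces the figures reported in the Results section.

```python
from typing import Dict


def accuracy_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Accuracy measures for a flood / non-flood confusion matrix.

    tp: flood images classified as flood       fn: flood images classified as non-flood
    fp: non-flood images classified as flood   tn: non-flood images classified as non-flood
    """
    total = tp + fn + fp + tn
    overall = (tp + tn) / total
    # Chance agreement: sum over classes of (reference total * predicted total) / total^2.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / total**2
    return {
        "overall": overall,
        "producer_flood": tp / (tp + fn),     # correct / reference flood images
        "producer_nonflood": tn / (tn + fp),  # correct / reference non-flood images
        "user_flood": tp / (tp + fp),         # correct / images predicted as flood
        "user_nonflood": tn / (tn + fn),      # correct / images predicted as non-flood
        "kappa": (overall - expected) / (1 - expected),
    }


# Assumed counts (reconstructed, see the note above).
metrics = accuracy_metrics(tp=93, fn=7, fp=2, tn=98)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Rounded to two decimals, these assumed counts give 95.50% overall accuracy, 93.00% and 98.00% producer accuracy, 97.89% and 93.33% user accuracy, and a kappa of 0.91.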

Results and Discussion
The results of this study consist of two parts. The first is the classification of flooded areas using CNN on digital photographs obtained from the crowd, with accuracy assessed against the flood data from GISTDA and the data from the on-site survey. The second is the flood warning and detection system built with 3S technology. The classification results in Figures 4 and 5 show that digital photographs can be used to classify images of floods in each area. Figure 4 shows a sample of the classification results for flooded areas. Each image is shown with the class names of the objects that appear in it and the probability of each class. Whether an uploaded image is a flood or non-flood image is determined from these classes: the Water or Flood class must be among the top 5 classes with a probability of at least 0.8. Figure 5 shows an example of classification results for non-flooded images, which contain no flood event. The top 5 class names of each image in Figure 5 include no Water or Flood class, only other classes such as Grass, Road, and Car.
When such images are submitted to the developed system, the users are not allowed to upload them. This preliminary screening of irrelevant images reduces the administrative burden. The accuracy of the CNN classification of flooded-area images is assessed with 6 values (as shown in Figure 6): Producer Accuracy (Recall) of the Flooded class, Producer Accuracy (Recall) of the Non-Flood class, User Accuracy (Precision) of the Flooded class, User Accuracy (Precision) of the Non-Flood class, Overall Accuracy, and Kappa. Compared with the real flooded areas based on GISTDA data and area survey data, the Producer Accuracy (Recall) of the Flooded class is 93.00%, the Producer Accuracy (Recall) of the Non-Flood class is 98.00%, the User Accuracy (Precision) of the Flooded class is 97.89%, the User Accuracy (Precision) of the Non-Flood class is 93.33%, the Overall Accuracy is 95.50%, and the Kappa value is 0.91. These results show that the classification accuracy is high, so the approach can be used to develop web applications for screening images related to flood events. The results for the development of the flood notification and monitoring system using 3S Technology are as follows.
1. Authentication: Users access the system with a username and password, as shown in Figure 7, which shows the authentication for each group of users. Online registration is also supported, as shown in Figure 8, so that the system can identify which user reported each piece of flood information.
2. Automatic user coordinate detection: Users are prompted to confirm that they allow the system to retrieve their current coordinates for use in flood reports. After the user confirms by clicking the Update User Location button, the map appears, as shown in Figure 9, with a red marker indicating the current location.
3. Flood situation notification: Logged-in users can add flood data by filling in the details shown in Figure 8, including the location of the flood, the date of occurrence, the water level, the situation, and images of the flooded area. The coordinates, sub-district, district, and province are retrieved automatically from the user's current coordinates. Images of the flooded area must be added to confirm that the flood is really occurring. The system alerts the user to upload a new image if the uploaded image contains no water. This basic screening is done by CNN without needing a person to check.
4. Display of flooded areas: The crowdsource data from people who report flood situations are displayed with the name of the flood location, informer, latitude, longitude, water level, situation data, monitoring status from satellite imagery, inspection status by the leaders, and images of the flooded area, as shown in Figure 10. The inspection data indicate whether or not the flood data have been verified. Unverified information is shown as a yellow marker on the map. Information checked by the village leader or official is shown as an orange marker. Information verified against satellite image processing is shown as a red marker. Information verified by both village leaders and satellite image processing is shown as a red flag, as shown in Figure 11. For monitoring with satellite imagery processing or information from relevant agencies, this section allows flood area data to be uploaded to the system. Figure 11 shows the blue polygon, which is the flood area generated by satellite image processing. A red flag represents a real flood area, where the 3 types of information match: crowd or people, village leaders or officials, and satellite image processing of the flooded areas. The advantage of this system is that it shows the flooded areas more accurately. This information can be used to plan and decide on relief for the victims or to plan for the future.
5. Reports: A report can be prepared for the desired area and year. It contains the situation information, informers of situations, latitude, longitude, date, water level, sub-district, villages, a summary of flooded areas, average water level, and percentage of flooded areas compared to all reported data. The report can be issued in three formats: a pie chart, MS Excel, and PDF files, as shown in Figure 12.
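The marker rules for the display of flooded areas can be sketched as a small decision function. The text does not say how a crowd report is matched against a satellite-derived flood polygon; the sketch below assumes a simple ray-casting point-in-polygon test on planar coordinates, and all names and coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude), treated as planar coordinates


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count polygon edges crossed by a horizontal ray from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge spans the ray's y coordinate
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def marker_for(leader_verified: bool, satellite_verified: bool) -> str:
    """Marker style for a report, following the colour rules in the text."""
    if leader_verified and satellite_verified:
        return "red flag"
    if satellite_verified:
        return "red marker"
    if leader_verified:
        return "orange marker"
    return "yellow marker"


# Made-up flood polygon and report locations.
flood_polygon = [(99.0, 9.0), (99.4, 9.0), (99.4, 9.4), (99.0, 9.4)]
report_inside = (99.2, 9.2)
report_outside = (99.6, 9.2)

print(marker_for(True, point_in_polygon(report_inside, flood_polygon)))
print(marker_for(False, point_in_polygon(report_outside, flood_polygon)))
```

Separating the geometric check from the marker rule keeps the display logic independent of how satellite verification is actually performed.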

Conclusion
This study applied CNN together with 3S technology, which consists of the geographic information system, the global positioning system, and remote sensing, in conjunction with crowd participation, to detect floods through digital image analysis and to develop web applications for flood notification and monitoring in each area. The results show that CNN can detect flooded areas from images with high accuracy. This helps reduce the burden on administrators and others involved in screening the images reported by the crowd. In addition, the use of 3S technology and the involvement of the crowd benefit three groups. The first group is the victims and affected people, who gain a communication channel for getting flood information quickly and a tool for reporting floods in their areas. The second group comprises the officials or village leaders assigned to manage floods or rehabilitate flooded areas, who gain tools that present supplementary information for decision-making and data that can be used in budget proposals to government organizations to help the victims. The third group is the administrators or relevant persons, for whom the system reduces the burden of screening unrelated data.
As the methods and web applications presented in this research operate during and after a flood, future work should develop a forecasting system that warns people before a flood occurs so that they can prepare. A system that supports management before, during, and after a flood will be able to mitigate future damage and support the decisions of relevant persons.