Flashover Prevention System Using IoT and Machine Learning for Transmission and Distribution Lines

Abstract: Flashover on transmission and distribution line insulators occurs when the insulator's resistance drops to a critical level, and it causes frequent power outages. Thin layers of dust, salt, and airborne particles gradually deposited on the insulator surface combine with humidity to form an electrolyte that causes flashover. In this paper, a flashover prevention system using IoT technology and machine learning is proposed to reduce losses and increase power reliability. The system comprises an IoT module, a server, and clients. The IoT module prototype was installed on a distribution line pole in Pracha-utit, Bangkok, Thailand, and collected data for thirty-four months. The data were pre-processed and split into training and evaluation sets. We built and compared four models: linear regression, polynomial regression, Auto-regressive Integrated Moving Average (ARIMA), and Long Short-Term Memory (LSTM). The LSTM model outperformed the others (R² = 0.931, RMSE = 530.74).


Introduction
Electrical power is invisible but plays an extremely important role in daily life in almost all sectors, such as industry, education, transport, and communication. It is generated at a power plant by generators usually driven by primary energy sources, both renewable and non-renewable, such as wind, solar power, tidal power, and fossil fuels like coal and natural gas [1]. An electrical power grid is an interconnected network designed to produce electricity and supply it to consumers. A power grid comprises power plants (electricity generators), a transmission system, a distribution system, substations, transformers, and other components. Power plants are generally located near the primary energy sources but far away from the consumer areas [2]. The generated electricity is delivered to consumers through the cables of the transmission and distribution systems. When electricity flows through a long cable, part of its energy is lost to the resistance of the cable. To lower this loss, the generated electricity is stepped up to a much higher voltage (known as high voltage) by a step-up transformer: for a given transmitted power, a higher voltage means a lower current, and the resistive loss is proportional to the square of the current (P = I²R) [3]. The high-voltage electricity is transmitted to consumers through transmission and distribution lines, which are either overhead or underground power cables. Transmission-line voltages are typically 110 kV or higher, while distribution lines carry 22-50 kV from the substation where the transmission lines end [1]. The transmission and distribution lines use insulators made from porcelain, glass, or composite polymer materials to support the suspended cables and to prevent current from flowing to earth or to the other lines [2].
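The effect of stepping up the voltage can be illustrated with a short calculation. The following is a minimal sketch assuming a single-phase line at unity power factor; the transmitted power (10 MW) and line resistance (5 Ω) are hypothetical figures chosen only for illustration and are not taken from the studied line.

```python
def line_loss_watts(power_w: float, voltage_v: float, resistance_ohm: float) -> float:
    """Resistive loss P_loss = I^2 * R, where I = P / V.

    Simplifying assumptions: single-phase line, unity power factor.
    """
    current_a = power_w / voltage_v
    return current_a ** 2 * resistance_ohm


# Hypothetical figures: 10 MW sent over a line with 5 ohm total resistance.
POWER_W = 10e6
RESISTANCE_OHM = 5.0

loss_22kv = line_loss_watts(POWER_W, 22e3, RESISTANCE_OHM)
loss_110kv = line_loss_watts(POWER_W, 110e3, RESISTANCE_OHM)

print(f"loss at 22 kV:  {loss_22kv / 1e3:.1f} kW")
print(f"loss at 110 kV: {loss_110kv / 1e3:.1f} kW")
print(f"ratio: {loss_22kv / loss_110kv:.0f}x")
```

Raising the voltage by a factor of five lowers the current by the same factor, so the loss falls by a factor of twenty-five under these assumptions.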
Contamination from dust deposited on the insulator surface is an important problem that degrades the insulation performance of the insulators. It can increase the surface conductivity between the insulator strings [4]. Several previous studies have described methods to evaluate the surface conductivity by measuring the leakage current under various conditions [2] [5] [6] [7]. However, studying the effect of dust deposited on the insulators remains challenging.
A flashover is a dielectric insulation failure, also known as a short circuit, and can be caused by a lightning strike or by contaminated insulators. The flashover from contaminated insulators, as shown in Figure 1, causes frequent power outages over wide areas, wasted time and money, and equipment damage [5]. In this paper, a flashover prevention system using IoT and machine learning is proposed. The system collects the dust concentration and relative humidity around the insulators as input data, then predicts the insulators' contamination level and schedules a maintenance plan. Moreover, the system can send an alert to the substation operators to start cleaning the insulators.

Literature Review

Flashover on contaminated insulators
Thin layers of dust, salt, and airborne particles gradually deposited on the surface of an insulator, combined with wet conditions such as fog, drizzle, and humidity, form an electrolyte; an insulator in this state is referred to as a contaminated insulator. This electrolyte has a significant effect on insulator performance [5]. The contaminated layers increase the conductivity and thereby reduce the resistance of the insulator, and flashover occurs when the resistance drops to a critical level. The methods used to prevent flashover are [3]: 1) increasing the insulation distance by adding more insulators, 2) periodic maintenance at unspecified intervals by cleaning or washing the insulators with high-pressure water, 3) periodic cleaning of the insulators with high-pressure-driven abrasive material, 4) replacing the insulators, 5) monitoring the insulators with a remote sensor network, 6) inspecting with remote sensors such as a corona camera, infrared sensor, or ultraviolet (UV) camera [6], and 7) coating the insulator surface with a film of silicone rubber or silicone grease. However, silicone grease is no longer used because of its cost and short life.
Monitoring with current measurement sensors uses a sensor attached to the insulator to measure the leakage current. The limitation of this method is that each sensor covers only one insulator string [5]. In remote sensing, a corona camera attached to a helicopter is used for detecting and inspecting impairments in insulators and other elements. However, this approach is inefficient for detecting the contamination level on the insulators. A UV imaging detection system is also used as a long-distance monitoring system for detecting the performance status of transmission and distribution lines [6]. The UV system consists of a UV imaging objective lens, a UV light filter lens, a UV intensifier, and a display. When the leakage reaches a critical point, a corona occurs on the insulators and emits UV radiation, which the UV detector senses and uses to show the status of the insulators.
Narayanan et al. [7] proposed a contaminated insulator prediction approach based on wavelet transform and fuzzy c-means algorithms. The experimental setup in the electrical lab comprised a single-disc porcelain insulator suspended in a fog chamber, 100 kVA and 10 kVA transformers, and artificial pollution applied to the insulator surface. The different pollution conditions and the corresponding leakage currents were measured and analysed. The system would warn the substation operator to start cleaning the insulators. A prediction of flashover voltage on clean and polluted insulators has also been proposed [4]. The objective of that research was to study the relationship between 1) the flashover voltage and the electric field distribution on six different varieties of composite insulators, with experiments and prediction using the electric field, and 2) the conductivity of the contaminated insulator and the environmental relative humidity. The findings revealed that the conductivity of the contaminated surface on insulators was closely related to the surrounding relative humidity [4].

IoT remote sensors for the transmission system
Research has shown that thin layers of dust, salt, and airborne particles gradually deposited on the surface of insulators, together with wet conditions from humidity, are one of the causes of flashover [4] [5] [7]. Internet of Things (IoT) technology has enabled the large-scale connection of physical devices or objects. It is used to monitor and control objects in various applications, including smart buildings, smart homes, and smart grids [8]. The IoT architecture comprises three layers: the perception and control layer, the network and connectivity layer, and the application and service layer [9]. The perception layer consists of sensors responsible for collecting environmental data [10]. In a flashover prevention system, a dust sensor is used to measure the dust and airborne particles that deposit on the insulators, while a humidity sensor is used to measure the relative humidity surrounding the insulators.
There are many dust and humidity sensors available for the IoT system. Dust sensors are classified into optical and laser-based types [11]. An optical dust sensor employs an optical sensing method to detect particle density or concentration in the surrounding air. An infrared light-emitting diode (IR LED) and a photo-sensor are diagonally arranged in a device's chamber. While the IR LED beams infrared light into the chamber, the photo-sensor detects the light reflected by dust particles in the air [12].
A Sharp GP2Y10 is an optical dust sensor designed to sense the density of fine particles with a diameter larger than 0.5 µm using an IR LED and a phototransistor arranged diagonally in a small chamber. It can distinguish between smoke and dust by synchronizing the IR LED pulse with the pattern of the output voltage [12]. Table 1 shows the Sharp GP2Y10 series.
Relative humidity (RH) indicates the amount of water vapor in the air at a given temperature compared with the maximum amount that the air can contain at the same temperature. It is expressed as a percentage [11]. If the amount of water vapor in the air remains unchanged while the temperature rises, the maximum amount of moisture that the air can hold increases and the relative humidity falls.
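The temperature dependence of relative humidity can be made concrete with a small numerical sketch. The code below uses the Magnus approximation for the saturation vapor pressure over water, which is a standard empirical formula and not something taken from the cited sources; the starting condition (air saturated at 20 °C) is a hypothetical example.

```python
import math


def saturation_vapor_pressure_hpa(temp_c: float) -> float:
    """Magnus approximation of saturation vapor pressure over water, in hPa."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def relative_humidity_percent(vapor_pressure_hpa: float, temp_c: float) -> float:
    """RH = actual vapor pressure / saturation vapor pressure at that temperature."""
    return 100.0 * vapor_pressure_hpa / saturation_vapor_pressure_hpa(temp_c)


# Hypothetical air parcel: saturated at 20 C, then warmed with no moisture added.
vapor_pressure = saturation_vapor_pressure_hpa(20.0)
for temp_c in (20.0, 25.0, 30.0, 35.0):
    rh = relative_humidity_percent(vapor_pressure, temp_c)
    print(f"{temp_c:4.1f} C -> RH {rh:5.1f} %")
```

The moisture content is held constant in the loop, so the falling percentage comes only from the rising saturation pressure.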
The capacitive humidity sensor relies on electrical capacitance. It consists of two conductive electrodes separated by a thin film of non-conductive polymer. The capacitance of the sensor depends on the moisture that penetrates this film, and the change in capacitance is nearly directly proportional to the relative humidity of the surrounding environment [13]. The resistive humidity sensor also employs a hygroscopic material, but it measures the change in electrical impedance or resistance rather than capacitance. The output voltage of the resistive sensor has an inverse exponential relationship to the relative humidity of the surrounding air. The resistive sensor is cheaper, while the capacitive sensor is considered more accurate and stable, which matters in systems that require accurate measurements [14]. Table 2 shows popular humidity sensors for research and prototyping.

Wide area network for long-range IoT
For a smart grid, a Wide Area Network (WAN) is used for data communication between sensors/nodes and the server/cloud since it covers a long distance (10-30 km). Commonly used WANs for IoT applications are Narrowband Internet of Things (NB-IoT), Low Power Long Range Wide Area Network (LoRaWAN), and Sigfox [15].
NB-IoT is a Low Power Wide Area Network (LPWAN) technology developed to enable IoT devices on a cellular network. Its new physical layer is designed to meet the requirements of extended coverage (10-15 km), low cost, and low power consumption (up to 10 years of battery life) [16]. However, it employs a subset of the Long-Term Evolution (LTE) standard and restricts the bandwidth to a narrow band of 180-200 kHz. As a result, NB-IoT limits the peak data rate to a maximum of 200 and 250 kbps for the downlink and uplink directions respectively [17].
Sigfox is categorized as a Low Power Wide Area Network (LPWAN) built to connect low-power devices such as electricity meters and smartwatches. These devices continuously send small amounts of data to the network. Sigfox uses differential binary phase-shift keying (DBPSK) and Gaussian frequency-shift keying (GFSK) over ultra-narrowband channels in the Industrial, Scientific and Medical (ISM) radio bands, at 868 MHz in Europe, 902 MHz in the US, and 433 MHz in Asia [15]. It requires little energy while covering a long distance, from 10 km (urban) to 20 km (rural), at a maximum of 100-600 bps depending on the region. The topology is based on a star network architecture. Unlike in a cellular architecture, a device is not attached to a specific base station. As a result, a broadcast message is received by the nearest three base stations within range [17] [18].
LoRa (Long Range), with its network protocol LoRaWAN, is a low-power wide-area network (LPWAN) technology that uses unlicensed sub-gigahertz ISM bands, e.g., 863-870 MHz in Europe, 902-928 MHz in the US, and 923 MHz in Asia [19]. LoRa also requires little energy while covering a long distance, from 5 km (urban) to 15 km (rural), at a maximum of 50 kbps. It allows devices to send small packets every couple of minutes and is therefore not suitable for data streaming [17]. The LoRa network architecture is a star-of-stars topology in which end devices/nodes are connected to gateways with bidirectional communication [15]. The LoRaWAN standard defines three classes of end devices: Class A, Class B, and Class C. Class A (All) allows bidirectional communication in which the end device initiates every exchange by sending an uplink. Short downlink receive windows open shortly after the device has sent its data; any other downlink has to wait until the next uplink. As a result, this class consumes the least power and is suitable for remote sensors. Class B (Beacon) also allows bidirectional communication and extends Class A by adding scheduled receive slots for the downlink. To schedule these slots, the end device receives a time-synchronized beacon from the gateway, which allows the server to know when the device's receive windows are open. In Class C (Continuous), the end device listens to the server continuously and can receive data at any time except when it is sending data (half-duplex). As a result, this class consumes more power than Class A and Class B [20].

Time-series forecasting
Time-series data is a sequence of data points, typically consisting of values recorded over a time interval. It is also referred to as time-stamped data, collected by measuring the same source at different points in time. Time-series analysis aims to extract meaningful changes and other characteristics from the data and to model and predict future values based on previously observed data [21]. Several machine learning techniques can be used to analyze time-series data, such as regression, Auto-regressive Integrated Moving Average (ARIMA), and Long Short-Term Memory (LSTM) recurrent neural network models [22]. ARIMA is a parametric statistical technique that requires prior knowledge about the data distribution to build predictive models [23]. A Recurrent Neural Network (RNN) is an Artificial Neural Network (ANN) in which the output from the previous step is fed as an input to the current step. This enables the network to learn from time-series data, but only over short periods. The LSTM is a special type of RNN explicitly designed to avoid the long-term dependency problem of the RNN [24].

System Architecture
This section describes the architecture and design of the flashover prevention system. The major requirements [25] were gathered from the engineering team at the Electricity Generating Authority of Thailand (EGAT), which is responsible for transmission lines and insulator maintenance. Figure 2 depicts the proposed system architecture. It comprises an IoT module, a server, and clients.

IoT module
The IoT module illustrated in Figures 2 and 3 measures the dust level and relative humidity around the insulators and then sends this information to the server for analysis. The IoT module comprises four main components: the MCU, the dust sensor, the humidity sensor, and the NB-IoT module.
Dust sensor: A Sharp GP2Y1014AU0F on a Waveshare board was used as the dust sensor. It draws a very low current (20 mA max, 11 mA typical) and can be connected to a power supply of 3.3 VDC to 7 VDC. The dust density sensing covers up to 580 µg/m³ at +/-15% accuracy, which is sufficient for a dust collecting system. The output pin (AOUT) of the sensor was connected to the analog input of the MCU (A0). The sensor generates an analog voltage proportional to the measured dust density/concentration, with a sensitivity of 0.5 V per 0.1 mg/m³ at a 2.5 to 5.5 VDC power supply.
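The stated sensitivity can be turned into a conversion from an ADC reading to a dust density. The following is a minimal sketch, not the firmware used in the prototype: it assumes a 10-bit ADC with a 5 V reference (as on the Arduino UNO), and the zero-dust output voltage `V_NO_DUST` is a hypothetical value that would have to be calibrated for a real sensor. Any voltage divider on the breakout board would add a further scale factor that is not modeled here.

```python
ADC_MAX = 1023             # full-scale count of a 10-bit ADC
V_REF = 5.0                # assumed ADC reference voltage, in volts
V_NO_DUST = 0.6            # assumed output voltage in clean air; calibrate per sensor
SENSITIVITY_V_PER_UG = 0.5 / 100.0   # 0.5 V per 0.1 mg/m3 = 0.5 V per 100 ug/m3


def adc_to_voltage(adc_count: int) -> float:
    """Convert a raw ADC count to the voltage seen at the analog pin."""
    return adc_count * V_REF / ADC_MAX


def voltage_to_dust_density(voltage_v: float) -> float:
    """Convert the sensor output voltage to dust density in ug/m3.

    Readings below the zero-dust voltage are clamped to 0.
    """
    return max(0.0, (voltage_v - V_NO_DUST) / SENSITIVITY_V_PER_UG)


if __name__ == "__main__":
    for count in (100, 200, 400):
        v = adc_to_voltage(count)
        print(f"ADC {count:4d} -> {v:.3f} V -> {voltage_to_dust_density(v):6.1f} ug/m3")
```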
For convenience, the Waveshare GP2Y1014AU0F board comes with a circuit consisting of a 150 Ω resistor, a 220 µF capacitor, and a regulator. The circuit is designed to pulse the IR LED on and off to extend its lifetime. A 4-pin connector links the board to an MCU, as shown in Table 3.
Humidity sensor: A DHT22 was used as the relative humidity sensor. The DHT22 combines a capacitive humidity sensor with a thermistor that measures the temperature of the surrounding air. It uses a single-wire digital interface with its own single-wire protocol for transferring data to the MCU. The protocol requires precise timing to read the data from the sensor. The sampling rate is 0.5 Hz, i.e., one reading every two seconds. Two versions are available, with and without a pull-up resistor. In this research, the pull-up version was used; it comes with a 4.7 kΩ to 10 kΩ pull-up resistor that pulls the data pin to VCC. The operating voltage of both versions is 3 to 5 V, and the maximum current drawn while measuring is 2.5 mA. After power-up, the sensor needs one second to pass its unstable state before it can be read. A 100 nF capacitor should be connected between the power supply (VCC/VDD) and GND to filter ripple. To read the humidity, the MCU sends a start signal to the DHT22. Once the start signal has finished, the DHT22 switches from standby mode, in which it consumes little power, to running mode and sends a response signal followed by 40 bits of data containing the relative humidity and temperature. A 3-pin connector links the sensor to an MCU, as shown in Table 4.
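The 40-bit frame can be decoded as follows. This is a minimal sketch based on the publicly documented DHT22 (AM2302) frame layout: 16 bits of humidity, 16 bits of temperature with the top bit as the sign, and an 8-bit checksum equal to the low byte of the sum of the first four bytes. The function name is illustrative, and the sketch covers only the decoding step, not the timing-critical bit capture on the MCU.

```python
from typing import Tuple


def decode_dht22_frame(frame: bytes) -> Tuple[float, float]:
    """Decode a 5-byte DHT22 frame into (relative humidity in %, temperature in C)."""
    if len(frame) != 5:
        raise ValueError("a DHT22 frame is exactly 5 bytes (40 bits)")

    # Checksum: low 8 bits of the sum of the four data bytes.
    if (sum(frame[:4]) & 0xFF) != frame[4]:
        raise ValueError("checksum mismatch")

    raw_humidity = (frame[0] << 8) | frame[1]
    raw_temperature = ((frame[2] & 0x7F) << 8) | frame[3]

    humidity = raw_humidity / 10
    temperature = raw_temperature / 10
    if frame[2] & 0x80:          # top bit set means a negative temperature
        temperature = -temperature
    return humidity, temperature


if __name__ == "__main__":
    sample = bytes([0x02, 0x8C, 0x01, 0x5F, 0xEE])
    print(decode_dht22_frame(sample))
```

Rejecting frames with a bad checksum matters here because a single mistimed bit on the single-wire bus would otherwise produce a plausible but wrong reading.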
MCU: An Arduino UNO Rev3 was used as the microcontroller board in this research. It is a development board equipped with six analog inputs and fourteen digital general-purpose input/output (GPIO) pins, six of which can be used as PWM outputs. The board is based on the ATmega328 microcontroller running at 16 MHz and is designed for embedded systems and IoT applications. The system has two sensors, the dust sensor and the humidity sensor. To read the dust density, pin 7 was used as an output for pulsing the IR LED, while A0 was connected to the dust sensor output to read the analog value. This value needs to be processed and converted into dust density data in µg/m³. In addition, pin 2 was used as an input to read the relative humidity, as shown in Figure 3. The MCU reads the dust density and relative humidity every 15 minutes. The data are then transformed into a JSON payload and transmitted through the NB-IoT module to the NB-IoT cloud service, named Magellan.
NB-IoT: Two LPWANs are available in Thailand, NB-IoT and LoRa. In this work, an NB-IoT module in the form of an Arduino shield, which can be attached on top of the Arduino UNO, was used for the data connectivity layer. The module uses a Universal Asynchronous Receiver/Transmitter (UART) to exchange data with the MCU over the TX and RX pins. Generally, the NB-IoT architecture cannot communicate directly with application servers. Instead, it has to transmit the data to the NB-IoT cloud service. To get the data (dust density and relative humidity), the server sends a request to the NB-IoT cloud service, which then returns the requested data to the server.
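A JSON payload of the kind described above might look like the following. This is a sketch only: the paper does not give the payload schema, so the field names (`timestamp`, `dust_ug_m3`, `humidity_pct`) are hypothetical, and the code is written in Python for illustration rather than as the Arduino firmware.

```python
import json
from datetime import datetime, timezone


def build_payload(dust_ug_m3: float, humidity_pct: float, timestamp: datetime) -> str:
    """Serialize one reading as a compact JSON string (hypothetical field names)."""
    reading = {
        "timestamp": timestamp.isoformat(),
        "dust_ug_m3": round(dust_ug_m3, 1),
        "humidity_pct": round(humidity_pct, 1),
    }
    # Compact separators keep the message small, which suits a low-bandwidth link.
    return json.dumps(reading, separators=(",", ":"))


if __name__ == "__main__":
    ts = datetime(2018, 4, 1, 12, 0, tzinfo=timezone.utc)
    print(build_payload(42.37, 71.52, ts))
```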

Server and services
The application and database server gets information from the NB-IoT cloud service and handles the queries requested by the client's device. The information includes dust density and relative humidity. The system employs IFTTT for sending alerts to the maintenance team.

Implementation
The implementation comprised two steps: the experimental setup and data collection. In the experimental setup, the prototype shown in Figure 4 was installed in a service control cabinet located in Pracha-utit, Bangkok, Thailand. The control cabinet was attached to the pole 6-7 meters below the cables to avoid interference from radio frequencies emitted by the cables. The installation and the power feed to the control cabinet were done by EGAT staff, as shown in Figure 5. The GP2Y1014AU0F dust sensor has no built-in fan to supply airflow to the sensor's chamber. In this research, a hole was made at the top of the control cabinet to let the surrounding air reach the sensor by natural airflow or wind, which allows the sensor to react to its surroundings quickly and accurately. In the data collection step, the prototype has collected data since April 2018. Four data points are recorded every hour in a time-series format (date-time, relative humidity in %, and dust concentration or density in µg/m³). All data points were transmitted to the server for processing.

Distinguishing smoke and dust particles
The output pattern of the dust sensor differs between smoke and dust. Smoke consists of tiny unburnt particles that diffuse while moving slowly, so the sensor detects it continuously and gives a continuous output, as shown in Figure 6 (left). In contrast, dust consists of bigger particles, and the output pattern it produces is intermittent, as shown in Figure 6 (right). Only dust has a significant effect on flashover. Therefore, a program running on the MCU reads the sensor output in synchronization with the LED pulse over a certain time period so that the IoT module distinguishes the two and collects only dust.
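One possible way to separate the two patterns is to look at how often the pulsed readings in a time window exceed a detection threshold. The sketch below is a hypothetical heuristic, not the program that ran on the MCU: the threshold voltage and the fraction that counts as continuous are assumed values chosen only for illustration.

```python
from typing import Sequence


def classify_window(
    voltages: Sequence[float],
    threshold_v: float = 0.8,          # assumed detection threshold
    continuous_fraction: float = 0.9,  # assumed share of hits that counts as continuous
) -> str:
    """Label a window of pulse-synchronized readings as 'smoke', 'dust', or 'clean'."""
    if not voltages:
        raise ValueError("window must contain at least one reading")

    hits = sum(1 for v in voltages if v > threshold_v)
    fraction = hits / len(voltages)

    if hits == 0:
        return "clean"
    # Smoke gives an almost continuous output; dust gives intermittent spikes.
    return "smoke" if fraction >= continuous_fraction else "dust"


if __name__ == "__main__":
    print(classify_window([1.2] * 10))                     # continuous output
    print(classify_window([0.6, 1.5, 0.6, 0.6, 1.4, 0.6]))  # intermittent output
    print(classify_window([0.6] * 10))                     # nothing detected
```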

Dust concentration and relative humidity
The time-series graph in Figure 7 shows the dust density/concentration gathered from April 2018 to January 2021, the period since the prototype was installed and launched. The dust density is drawn as a thin line, and the monthly mean values are plotted as red dots connected by a continuous red line. Both are drawn on the same set of axes. Overall, the trend of the dust concentration distribution in 2020 was similar to that in 2019 and 2018. The dust concentrations ranged from 10.2 to 115.7 µg/m³, with the highest values found between December and January and the lowest in August. Generally, the average dust concentration was low between June and October and high between November and May. All values were stored as time-series data for the model to learn from and to predict future values.
The mean monthly relative humidity data gathered by the system during 2019 and 2020 are shown in Figure 8. The relative humidity usually starts to increase at the beginning of March. The relative humidity ranged from 47% to 97%. The highest values were found in June and October, while the lowest occurred in February and December. Generally, the average relative humidity was low during November-December and January-April, and high between June and October. As the relative humidity increases, the dust concentration usually decreases, and vice versa. Since the dust concentration starts to rise in November, the insulators should be cleaned by October.

Machine learning process
The EGAT engineering team has defined one dust unit (du) as the accumulated dust concentration level (µg/m³) in a day, and the insulators should be cleaned when the dust unit reaches 6000-6200 µg/m³ to prevent a flashover. To create a machine learning model, the gathered data were pre-processed by resampling to a daily frequency and padding to eliminate missing values. The pre-processed data were then split into a training set and a test set in a 75:25 ratio according to the time order of the series. This gave 776 time-series data points for training and 259 for evaluation. Each data point represents the daily dust unit value. The models were then trained and evaluated using R-squared (R²) and Root Mean Squared Error (RMSE) on the training and test sets. In this study, we built and compared four models: linear regression, polynomial regression, Auto-regressive Integrated Moving Average (ARIMA), and Long Short-Term Memory (LSTM).
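The pre-processing steps can be sketched in a few lines. The code below is an illustration under stated assumptions, not the pipeline used in the study: it assumes that daily resampling means summing the readings of each day, that padding means carrying the last known daily value forward, and that the split is chronological. All function names are hypothetical.

```python
from datetime import date, datetime, timedelta
from typing import Dict, Iterable, List, Tuple


def resample_daily_sum(readings: Iterable[Tuple[datetime, float]]) -> Dict[date, float]:
    """Accumulate raw readings into one value per calendar day (assumed: sum)."""
    daily: Dict[date, float] = {}
    for timestamp, value in readings:
        day = timestamp.date()
        daily[day] = daily.get(day, 0.0) + value
    return daily


def pad_missing_days(daily: Dict[date, float]) -> List[Tuple[date, float]]:
    """Fill days without data by carrying the previous day's value forward."""
    days = sorted(daily)
    series: List[Tuple[date, float]] = []
    current, last_value = days[0], daily[days[0]]
    while current <= days[-1]:
        last_value = daily.get(current, last_value)
        series.append((current, last_value))
        current += timedelta(days=1)
    return series


def chronological_split(series: list, train_fraction: float = 0.75) -> Tuple[list, list]:
    """Split without shuffling so the test set lies strictly after the training set."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


if __name__ == "__main__":
    train, test = chronological_split(list(range(1035)))
    print(len(train), len(test))
```

With 1035 daily points, a 75:25 chronological split of this kind gives 776 training and 259 test points, which matches the counts reported above.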
The performance results are summarized in Table 5. The LSTM achieved the best performance (R² = 0.931, RMSE = 530.74), followed by the ARIMA (R² = 0.876, RMSE = 867) and linear regression (R² = 0.819, RMSE = 1083.15). The best performance was obtained when the prediction horizon was within the next five months. It can be concluded that the LSTM model outperformed the others.
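For reference, the two evaluation metrics can be computed as shown below. This is a minimal sketch using the standard definitions of RMSE and the coefficient of determination; the sample values are made up for illustration and are not data from the study.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root Mean Squared Error: square root of the mean squared residual."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up values, for illustration only.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [2.0, 2.0, 3.0, 3.0]
    print(f"RMSE = {rmse(observed, predicted):.4f}")
    print(f"R^2  = {r_squared(observed, predicted):.3f}")
```

RMSE is expressed in the same unit as the predicted quantity, so the values reported above are in dust units, whereas R² is dimensionless.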

Conclusion
Flashover on contaminated insulators occurs when the insulator's resistance drops because thin layers of dust, salt, and airborne particles are gradually deposited on its surface. One of the methods used by EGAT, Thailand, to prevent flashover is periodic washing of the insulators with high-pressure water before the resistance of the insulator drops to the critical level. For this method to be effective, the maintenance plan should be scheduled efficiently.
In this paper, a flashover prevention system using IoT technology and machine learning is proposed. The IoT module comprised a Sharp GP2Y1014AU0F as the dust sensor, a DHT22 as the humidity sensor, an Arduino UNO Rev3 as the MCU, and an NB-IoT Arduino shield as the network device. For the experiment, the prototype was installed in a service control cabinet located in Pracha-utit, Bangkok, Thailand, and collected data for thirty-four months. The gathered data were pre-processed and split into a training set of 776 time-series data points and a test set of 259 data points. In the modeling process, we built and compared four models: linear regression, polynomial regression, Auto-regressive Integrated Moving Average (ARIMA), and Long Short-Term Memory (LSTM). The results show that the LSTM model outperformed the others.