Extreme Learning Machine Weights Optimization Using Genetic Algorithm In Electrical Load Forecasting

The number of electricity consumers in Indonesia continues to increase every year, but this growth is not matched by adequate infrastructure. As a result, the available electrical capacity cannot meet the demand for electricity. In this study, a smart computing system is built to address the problem. Hourly electrical load data are used as input for electrical load forecasting with the Extreme Learning Machine (ELM) method. ELM uses random input weights in the range -1 to 1. Before the forecasting process is run, a Genetic Algorithm (GA) is used to optimize the input weights. According to the test results, the average MAPE with weight optimization is 0.799%, while without weight optimization it rises to 1.1807%. This study therefore indicates that ELM with input weights optimized by a GA can be used for electrical load forecasting and gives better forecasting results.
Keywords: extreme learning machine, genetic algorithm, forecasting, electricity


Introduction
The number of electricity consumers in Indonesia continues to increase every year. Based on statistical data from 2016, the total number of subscribers reached 61,167,980, an increase of 6.39% compared to 2014. This high growth is not matched by the provision of adequate infrastructure, so the available electrical capacity cannot meet the demand for electricity. An overloaded power plant can lead to power outages. Building new power plants would overcome this, but at a high cost. As an anticipatory measure, the Indonesia Electricity Corporation (PT. PLN) therefore applies an operations management system. Operations management is the planning of operations, including distribution and generation planning, to achieve an economical, reliable, and high-quality system operation.
One part of the operations management system is electrical load forecasting. Load forecasting is used as the basis for generation and distribution planning to meet demand. Extreme Learning Machine (ELM) is a method introduced by Huang in 2004 [1]. This method is a development of Artificial Neural Networks, algorithms that imitate the way neurons in the human brain work through a learning system [2]. ELM, also known as Single-Hidden Layer Feedforward Neural Networks (SLFNs), has only one hidden layer and has a higher learning accuracy than other algorithms such as Backpropagation, because ELM determines the input weights and biases using random numbers. However, according to Alencar, et al. in 2016 [3], random determination can cause poor generalization because it leads to a large number of hidden neurons. Moreover, it can take more time, and the resulting values are not optimal. These problems are addressed here by combining Genetic Algorithms with ELM to obtain optimal input weights.
A genetic algorithm is a method for searching for optimal solutions with an approach adapted from Darwin's theory of evolution. This method has been widely used recently. By generating random numbers to search for a solution, it is able to provide an optimal solution to complex problems.
Similar studies using the same method on different objects have been done before, for example the research by Wang, et al. in 2016 [5], who studied wind power using the Genetic Algorithm and ELM. In that experiment, the optimized values were the hidden nodes, the bias, and the regularization coefficient. The result was that the combined method delivers higher accuracy and a better generalization ability [4].
The objective of this research is to evaluate the parameters and performance of the combination of these two methods in forecasting the electricity load.

Electricity Load Data
PT. PLN is the biggest electric power provider in Indonesia. The growth of technology that requires electrical energy keeps increasing, which makes PLN much needed by the whole society. As an electricity provider, PLN always tries to supply electrical energy according to demand, which is referred to as the electrical load. Any equipment that is connected to the power system and consumes electric energy can be called an electrical load. The size of the load depends on how intensively the electrical equipment is used. Appropriate efforts to meet the load are necessary to avoid a deficit of electrical energy and to avoid blackouts.

Extreme Learning Machine
Extreme Learning Machine (ELM) is an improved artificial neural network (ANN) method invented by Huang, Zhu, and Siew in 2004 [1]. ELM is commonly referred to as Single Hidden Layer Feedforward Neural Networks (SLFNs) and was proposed as a solution to the slow learning speed of feedforward ANNs. By using just one hidden layer and determining its parameters randomly, this method is able to provide more stable predictions in less time than the usual feedforward network.
According to Fig. 1, ELM has a three-layer structure: input layer, hidden layer, and output layer. The nodes in each layer are linked by weights that have different values and are connected into one output [5].
The steps of Extreme Learning Machine are explained below [1]:

Initial Weight and Bias Initialization
The initial weights ($W_{jk}$) are generated randomly in the range -1 to 1, and the bias values ($b$) in the range 0 to 1. The index k refers to the nodes of the input layer, while j refers to the nodes of the hidden layer.
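The initialization step can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions: the function name `init_elm_params`, the uniform distribution, the one-bias-per-hidden-node layout, and the fixed seed are illustrative choices and are not taken from the paper. Only the ranges (-1 to 1 for weights, 0 to 1 for biases) come from the text, and the sizes (5 inputs, 3 hidden nodes) are the ones the paper reports later.

```python
import numpy as np

def init_elm_params(n_hidden, n_input, rng):
    """Draw input weights W (n_hidden x n_input) in [-1, 1) and biases b in [0, 1)."""
    # W[j, k] connects input node k to hidden node j.
    W = rng.uniform(-1.0, 1.0, size=(n_hidden, n_input))
    # One bias per hidden node (assumed layout).
    b = rng.uniform(0.0, 1.0, size=n_hidden)
    return W, b

rng = np.random.default_rng(0)  # fixed seed only for reproducibility
W, b = init_elm_params(n_hidden=3, n_input=5, rng=rng)
print(W.shape, b.shape)  # (3, 5) (3,)
```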

Calculating the Hidden Layer Output
The formula below shows how to calculate the input matrix of the hidden layer ($H_{init}$).

Sigmoid Activation
The binary sigmoid activation function is used to map the hidden layer input to the hidden neuron output values. The binary sigmoid is a nonlinear function whose output ranges from 0 to 1.
$$H = \frac{1}{1 + e^{-H_{init}}}$$
where $H$ = hidden layer output matrix

Calculating the Moore-Penrose Pseudo Inverse Matrix
The Moore-Penrose pseudo-inverse matrix is used to obtain a unique solution of the linear system. This step connects the hidden layer and the output layer and yields the output weight values.

Calculating the Forecasting Results
The weights ($W$ and $\beta$) can now be used to calculate the forecast. The forecasting result is calculated using equation (5).
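The ELM steps above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes the commonly used forms $H_{init} = XW^T + b$, output weights obtained as the pseudo-inverse of $H$ times the target vector, and a forecast equal to $H$ times the output weights. The function names and the toy data are hypothetical.

```python
import numpy as np

def sigmoid(x):
    """Binary sigmoid: maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def elm_train(X, y, W, b):
    """Return output weights beta for fixed input weights W and biases b."""
    H = sigmoid(X @ W.T + b)      # hidden layer output, shape (n_samples, n_hidden)
    H_pinv = np.linalg.pinv(H)    # Moore-Penrose pseudo-inverse of H
    return H_pinv @ y             # least-squares output weights

def elm_predict(X, W, b, beta):
    """Forecast targets for inputs X with trained output weights beta."""
    return sigmoid(X @ W.T + b) @ beta

# Toy data (made up for illustration): 20 samples, 5 inputs, 3 hidden nodes.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(20, 5))
y = X.mean(axis=1)
W = rng.uniform(-1.0, 1.0, size=(3, 5))
b = rng.uniform(0.0, 1.0, size=3)

beta = elm_train(X, y, W, b)
y_hat = elm_predict(X, W, b, beta)
print(beta.shape, y_hat.shape)  # (3,) (20,)
```

Because only the output weights are solved for, training reduces to a single least-squares step, which is where the speed advantage over iterative training comes from.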

Genetic Algorithm
The genetic algorithm is an optimization technique that adapts the process of biological evolution and is able to give solutions to complex problems. It was first introduced by John Holland in 1975. In addition to solving everyday problems such as choosing the best composition of fodder [6], this algorithm has been applied in many fields such as economics, physics, and sociology.
The genetic algorithm is stochastic, meaning it can produce a different solution each time it is run [7]. It uses many genetic terms such as gene, individual, and chromosome because it was adopted from biological genetics.

Chromosome Representation and Initialization
Chromosome initialization is done by randomly generating a set of solutions represented as real numbers. The chromosome length depends on the number of genes in one chromosome: if there are 5 genes in one chromosome, the chromosome length is 5. In our case, we use the 5 previous consecutive data points to forecast the next electricity load, which means we need 5 input neurons in the input layer. Furthermore, we use 3 neurons in one hidden layer. Since the network is fully connected, there are 15 weight values connecting all input neurons to all hidden neurons. These weights are generated by the genetic algorithm, so each chromosome needs 15 genes.
After the chromosomes are generated, the next step is reproduction to obtain offspring. Two reproduction processes are needed: crossover and mutation. The values of cr (crossover rate) and mr (mutation rate) have to be decided first. These values are parameters that determine how many offspring should be generated. The number of offspring of each operator is obtained by multiplying cr or mr by the population size.
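The representation described above can be illustrated with a short Python sketch. The sizes (5 inputs, 3 hidden neurons, 15 genes) and the rate-times-population rule come from the text. The function names, the gene range of -1 to 1 (borrowed from the ELM weight range), the row-major gene ordering, the rounding of offspring counts, and the population size of 10 are assumptions made for illustration.

```python
import numpy as np

N_INPUT, N_HIDDEN = 5, 3         # from the text: 5 input neurons, 3 hidden neurons
N_GENES = N_INPUT * N_HIDDEN     # 15 input weights per chromosome

def init_population(pop_size, rng, low=-1.0, high=1.0):
    """Each row is one real-coded chromosome of 15 genes in [low, high)."""
    return rng.uniform(low, high, size=(pop_size, N_GENES))

def decode(chromosome):
    """Reshape a chromosome into the ELM input weight matrix (hidden x input)."""
    return chromosome.reshape(N_HIDDEN, N_INPUT)

def offspring_counts(pop_size, cr, mr):
    """Offspring from crossover and mutation: rate times population size (rounded)."""
    return round(cr * pop_size), round(mr * pop_size)

rng = np.random.default_rng(0)
population = init_population(pop_size=10, rng=rng)
print(population.shape, decode(population[0]).shape)  # (10, 15) (3, 5)
print(offspring_counts(pop_size=10, cr=0.8, mr=0.2))  # (8, 2)
```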

Crossover (Extended Intermediate Crossover)
The crossover process involves two randomly chosen parents that generate new chromosomes. An alpha (α) value is then generated randomly in the range -0.25 to 1.25. After that, the alpha value is inserted into equation (6).
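A minimal Python sketch of this operator is given here for illustration. It assumes the commonly used form of extended intermediate crossover, $C_1 = P_1 + \alpha(P_2 - P_1)$ and $C_2 = P_2 + \alpha(P_1 - P_2)$, and draws one α per gene; whether the paper uses one α per gene or one per chromosome is not stated, so that choice, the function name, and the parent values are assumptions.

```python
import numpy as np

def extended_intermediate_crossover(p1, p2, rng, low=-0.25, high=1.25):
    """Produce two children from parents p1 and p2."""
    # One alpha per gene (assumption), drawn from [-0.25, 1.25).
    alpha = rng.uniform(low, high, size=p1.shape)
    c1 = p1 + alpha * (p2 - p1)
    c2 = p2 + alpha * (p1 - p2)
    return c1, c2

rng = np.random.default_rng(0)
p1 = np.array([0.2, -0.5, 0.9])
p2 = np.array([0.6, 0.1, -0.3])
c1, c2 = extended_intermediate_crossover(p1, p2, rng)
print(np.allclose(c1 + c2, p1 + p2))  # True
```

Because α may fall outside the interval 0 to 1, children can land slightly beyond the segment between the two parents, which is why the offspring may need a range check afterwards.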

Random Mutation
Unlike crossover, which needs two parents, mutation needs only one parent. This parent is also chosen randomly. The goal of the mutation process is to maintain the diversity of the population [8]. One gene of the chosen chromosome is then selected randomly and changed using equation (7).
$$x'_i = x_i + r(\max_i - \min_i) \qquad (7)$$
where $x'_i$ = new gene value, $x_i$ = old gene value, $r$ = random value between -0.1 and 0.1, $\max_i$ = maximum possible value for the selected gene, and $\min_i$ = minimum possible value for the selected gene
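The mutation step can be sketched in Python as follows. This sketch assumes the update x' = x + r(max - min), which matches the symbol definitions above, and a gene range of -1 to 1 taken from the ELM weight range. The function name and the example parent are hypothetical.

```python
import numpy as np

def random_mutation(parent, rng, gene_min=-1.0, gene_max=1.0, r_low=-0.1, r_high=0.1):
    """Return a copy of parent with one randomly chosen gene perturbed."""
    child = parent.copy()
    i = rng.integers(len(parent))      # index of the gene to mutate
    r = rng.uniform(r_low, r_high)     # random value between -0.1 and 0.1
    child[i] = parent[i] + r * (gene_max - gene_min)
    return child

rng = np.random.default_rng(0)
parent = np.array([0.2, -0.5, 0.9, 0.0])
child = random_mutation(parent, rng)
print(int(np.sum(child != parent)))  # number of changed genes, at most 1
```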

Evaluation
Evaluation is used to assess the feasibility of the individuals (chromosomes) generated by the previous processes. The value of each individual is calculated as the fitness value in equation (8). Before the fitness is calculated, a hard constraint check is performed. A hard constraint is violated when a value in a solution lies outside the limits of the initial range that has been decided beforehand. In optimization, constraints are necessary to make sure a solution does not go beyond the limits [9]. In this research, the hard constraint applies when a gene value is below or above the range limits. When that happens, the genes that do not meet the constraint are replaced by new genes generated randomly within the range limits. In this case, a higher fitness value means a better individual.
MAPE is the Mean Absolute Percentage Error, which measures the absolute error over all training data when they are applied to ELM using the weights in the chromosome. MAPE is calculated using equation (9).
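The two evaluation ingredients described above, the hard constraint repair and MAPE, can be sketched in Python. The MAPE function uses the standard definition (mean of absolute relative errors, in percent), which is assumed to correspond to equation (9). The fitness function of equation (8) is not reproduced here, so it is left out of the sketch. The function names, the gene range of -1 to 1, and the sample numbers are illustrative assumptions.

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent (standard definition)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)

def repair(chromosome, rng, low=-1.0, high=1.0):
    """Replace genes outside [low, high] with new random values in that range."""
    fixed = chromosome.copy()
    bad = (fixed < low) | (fixed > high)
    fixed[bad] = rng.uniform(low, high, size=int(bad.sum()))
    return fixed

print(mape([100.0, 200.0], [75.0, 250.0]))  # 25.0
```

A chromosome produced by crossover or mutation would first be passed through `repair`, then decoded into ELM input weights and scored with `mape` on the training data.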

Selection
Selection is the last step of the genetic algorithm and is used to choose the best individuals that survive to the next generation. Using the elitism selection method, individuals are sorted by fitness from the largest. Then the 30% of the population with the lowest fitness value is processed in the next generation [8].
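The sort-and-keep part of elitism selection can be sketched in Python. This sketch assumes that the combined pool of parents and offspring is cut down to the population size by keeping the individuals with the highest fitness, as in Step 5 of the GA-ELM procedure; the 30% rule mentioned above is not modeled. The function name and the fitness values are made up for illustration.

```python
import numpy as np

def elitism_selection(pool, fitness, pop_size):
    """Keep the pop_size chromosomes with the highest fitness."""
    order = np.argsort(fitness)[::-1]   # indices sorted from largest fitness
    keep = order[:pop_size]
    return pool[keep], fitness[keep]

pool = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]])
fitness = np.array([0.91, 0.99, 0.95, 0.90])
survivors, surv_fit = elitism_selection(pool, fitness, pop_size=2)
print(surv_fit)  # [0.99 0.95]
```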

GA-ELM
The initial weights and biases in ELM are determined randomly, so the results obtained are not necessarily optimal. The genetic algorithm is therefore used to optimize the initial weights so that the forecast consistently gives optimum results. The GA-ELM algorithm follows these steps:
Step 1 Initialize the population size, the number of generations, the value of cr, and the value of mr.
Step 2 Create as many random chromosomes as the population size.
Step 3 Select random chromosomes for the reproduction processes, crossover and mutation. Then add the offspring to the current population.
Step 4 Evaluate each chromosome by using it as the initial input weights of ELM to obtain its accuracy (calculate the MAPE and the fitness value).
Step 5 Select as many chromosomes as the population size, namely those with higher fitness than the others, to pass to the next generation.
Step 6 Repeat steps 3 to 5 until enough individuals are obtained and the maximum generation is reached.
Step 7 In the last generation, choose the best chromosome, then run the ELM process to see the forecasting outcome:
a. Calculate the input of the hidden layer with the training data.
b. Activate the matrix using the binary sigmoid.
c. Calculate the Moore-Penrose pseudo-inverse matrix.
d. Calculate the output weight.
Step 8 Still in the ELM process, do the testing process:
a. Calculate the input of the hidden layer with the testing data.
b. Activate the matrix using the binary sigmoid.
c. Calculate the forecasting result using the output weight from step 7d.
d. Calculate the accuracy.
p-ISSN: 2540-9433; e-ISSN: 2540-9824

Testing And Result

Crossover Rate (Cr) & Mutation Rate (Mr) Combination Testing
Combination testing between the crossover rate and the mutation rate aims to find the cr and mr combination that gives the most optimal result, that is, the highest fitness value. The result is shown in Fig. 3. The fitness value is unstable, which is caused by the random values generated every time the process is run. From Fig. 3, the highest fitness value is produced by the combination of cr = 0.8 and mr = 0.2, with an average fitness value of $9.998 \times 10^{-1}$.
From the processes above we can conclude that crossover produces more offspring than mutation. However, if the cr and mr values are the same, the result will not be optimal. Some previous research also uses a higher cr value. According to [11], this is because crossover focuses on finding new individuals and expanding the search area.
There are conditions in which the search becomes stuck in a local optimum because of premature convergence. Premature convergence may happen when only individuals with the best fitness value are chosen, so the search stays in one area and does not explore other options. To address this problem, mutation with a low probability can be used, because it modifies only one gene to produce an offspring. The outcome may be either an individual with good fitness or a worse one [10].
Fig. 3 Average fitness result of cr and mr combination testing (axes: fitness, crossover rate / mutation rate)
In addition, the speed of the solution search differs between tests, with 100 generations as the default. Even with cr = 0.8 and mr = 0.2, the most optimal combination, over 5 tests the best solution was found anywhere between the 23rd and the 80th generation, which is essentially random. This means there is little relationship between the number of generations and the cr and mr values. Therefore, the number of generations to be used in the next step, population size testing, can be determined by taking the highest generation value obtained across the whole cr and mr combination testing.
In this test, the highest generation value is 95, which means that out of 100 generations the most optimal solution was found in the 95th generation. This highest generation value can be used as the ideal number of generations for finding the optimal solution. The more generations are used, the longer the computation takes [8]. Thus, if the best solution can be found by the 95th generation, running more than 95 generations only wastes time.

Population Size Testing
Population size testing is done to find the population size that gives the most optimal solution with the best fitness value. The results are shown in Fig. 4. They show that a larger population size does not always provide the best fitness value. The instability can be caused by the random values generated every time the process starts. In this research, the lowest average fitness is obtained with a population size of 40, while the highest and most optimal average fitness is obtained with a population size of 100, with a fitness value of $9.997 \times 10^{-1}$. The fitness value increases between population sizes 40 and 100, but becomes unstable again once the population size goes beyond 100. A larger population widens the exploration range and thus slows the calculation process [8].
Fig. 4 Average fitness of population size testing

Electrical Load Data Testing
Electrical load data testing is done to find the input pattern that gives the best fitness value. This test uses the same data but with different input patterns. The input patterns tested are arranged by hours, days, and weeks.

1. Input pattern by hours
Input data used is the electrical load data from the previous 5 hours to forecast the load in the next hour.
2. Input pattern by days