CN111260500B

CN111260500B - Hadoop-based distributed differential evolution scheduling method for small hydropower station

Info

Publication number: CN111260500B
Application number: CN201911277513.6A
Authority: CN
Inventors: 王万良; 骆俊锦; 吴菲; 王宇乐; 屠杭垚; 李国庆; 赵燕伟
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2019-12-12
Filing date: 2019-12-12
Publication date: 2021-12-07
Anticipated expiration: 2039-12-12
Also published as: CN111260500A

Abstract

The Hadoop-based distributed differential evolution scheduling method for small hydropower stations comprises the following steps: step 1: establishing a mathematical model with the maximum generated energy as a target; the target function is set as the maximum value of the sum of the generated energy of n hydropower stations in the small hydropower station group; step 2: encoding and initializing: the flow quoted in the scheduling periods of the n hydropower stations needs to be comprehensively scheduled, the flow quoted needs to be taken as a whole, and each individual is composed of the flow quoted value in the T time period of the small hydropower station group, namely a T x n dimensional array; step 3, evolution operation, comprising: mutation operation, crossover operation, selection operation; and 4, realizing distributed calculation: by adopting a sub-population concept and relying on a programming framework MapReduce of a Hadoop platform, a father population formed by a certain number of individuals is initialized, then the initial population is divided into a certain number of relatively independent sub-populations, each sub-population independently evolves, and finally all the sub-populations are summarized to screen out optimal individuals.

Description

Hadoop-based distributed differential evolution scheduling method for small hydropower station

Technical Field

The invention relates to optimized scheduling of small hydropower station groups, in particular to a distributed optimized scheduling method based on Hadoop.

Background

Small hydropower is short for small hydropower stations. Small hydropower as a high-quality, clean and renewable energy source plays a very important role in power generation, flood control, irrigation, water supply and the like. However, in the past, small hydropower plants in China are lack of reasonable planning, and various problems of inadequate supervision, insufficient scheduling and the like exist, so that the small hydropower plants are scientifically and reasonably managed and utilized. The small hydropower station is optimally scheduled, so that the active effect of the small hydropower station can be exerted to a greater extent, the economic benefit is improved, the ecological benefit is improved, the sustainable development of the small hydropower station is promoted, and the intelligent strategy of a power grid in China is promoted.

The optimized scheduling of the small hydropower station refers to analogy of actual problems into abstract problems, establishment of a target function, and searching for an optimal solution in the scheduling of the small hydropower station by means of mathematical statistics and algorithm programming, so that the reservoir can be efficiently utilized, and the generating capacity and the generating benefit are improved. As early as the 40 th century, the optimized dispatching of foreign hydropower stations started, and through decades of development, relatively complete academic achievements in the field have been achieved. Reservoir dispatching research in China gradually began to develop only in the 80 s of the 20 th century.

In the research of hydropower optimization scheduling algorithm, the traditional mathematical programming method is dwarfed due to the occurrence of a heuristic method. The solving method comprises a Genetic Algorithm (GA), a simulated annealing method (SA), a tabu search algorithm (TS), an artificial neural network method (ANN), a particle swarm algorithm (PSO), an ant colony Algorithm (ACO), a Chaos Optimization Algorithm (COA) and the like. In the heuristic method, the genetic algorithm belongs to the most studied and mature algorithm and is also the earliest method for optimizing and scheduling the hydropower station, and then, a plurality of researches for improving the genetic algorithm appear.

However, with the explosive growth of data in production and life, the evolutionary algorithm has the defects of complex calculation and slow speed when processing massive high-dimensional data.

Disclosure of Invention

The invention provides a distributed differential evolution scheduling method for small hydropower stations based on Hadoop, aiming at overcoming the defects in the prior art and improving the operational efficiency of the optimized scheduling method for the small hydropower stations.

The distributed computation of a differential evolution algorithm is realized by utilizing a MapReduce framework of a Hadoop platform, a novel variation strategy is selected for the differential evolution, and the diversity of the population is kept by adopting a fixed global boundary, so that a feasible scheduling scheme is provided for the optimal scheduling of the small hydropower station group with the generated energy as the target.

The technical scheme adopted by the invention for solving the technical problems is as follows:

a distributed differential evolution scheduling method for small hydropower stations based on Hadoop is characterized in that a Hadoop distributed computing platform is used for completing distributed differential evolution scheduling with generated energy as a target; the concept of the sub-populations is introduced, each sub-population corresponding to each Map process independently completes a series of evolution operations such as variation, intersection, selection and the like in the differential evolution process, and finally, the Reduce process is used for summarizing and screening to obtain an optimal scheduling scheme suitable for a target hydropower station, so that the economic benefit and the ecological benefit of the small hydropower station are improved, and the sustainable development of the small hydropower station is promoted.

The Hadoop-based distributed differential evolution scheduling method for small hydropower stations comprises the following steps:

step 1: and establishing a mathematical model with the maximum generated energy as a target. The objective function is set as the maximum value of the sum of the generated energy of n hydropower stations in the small hydropower station group.

In the formula, the meaning of each mathematical symbol is:

f: the annual generating capacity of the small hydropower station is ten thousand kilowatt-hours;

t: scheduling the total period number of the cycle;

i: the number of hydropower stations;

K_i: the output coefficient of the ith hydropower station;

Q_i(t): the power generation flow of the hydropower station i in the t period;

H_i(t): generating water heads of the hydropower station i in a time period t;

Δt_i: and the generating time of the hydropower station i in the t period is s.

In addition, the constraint conditions of the small hydropower station optimization scheduling model are as follows:

a. and (4) library capacity constraint:

V_imin<V_i<V_imax (1-2)

wherein, V_imin、V_imaxThe minimum and maximum reservoir capacity values of the ith reservoir respectively;

b. reference to flow constraints:

Q_i,tmin<Q_i,t<Q_i,tmax (1-3)

wherein Q is_i,tmin、Q_i,tmaxRespectively the minimum and maximum quoted flow of the i hydropower station in the t period;

c. water quantity balance:

V_i(t)＝V_i(t-1)+(q_i(t)-Q_i(t)) (1-4)

wherein, V_i(t)、V_i(t-1) the values of the storage capacity in the periods t and t-1, respectively, q_i(t) represents the amount of water coming during the interval t, Q_i(t) represents the let-down flow rate during the period t;

d. non-negative condition constraint:

all the variables are more than or equal to 0

Because of the long-term scheduling, the water flow time between reservoirs is not considered.

Step 2: encoding and initializing: to comprehensively schedule the reference flow in the n hydropower station scheduling periods, the reference flow needs to be taken as a whole, each individual is composed of reference flow values in the T time period of the small hydropower station group, namely a T-n dimensional array, a real number coding mode is adopted, and a coding table is shown in a table 2-1.

TABLE 2-1 hydropower station coding table

The population size was NP. In the population initialization process, reading file contents, initializing the water inflow amount of each hydropower station every month and the storage capacity size of a time interval 1, randomly generating all individuals of a population by taking the upper limit and the lower limit of the reference flow as boundaries when initializing the power generation reference flow, calculating the fitness value of the individuals, and storing the fitness value.

Step 3, evolution operation, which comprises the following specific steps:

3.1 mutation operation;

in the differential evolution process, the population is firstly subjected to variation operation, a novel variation strategy is adopted, a fixed global boundary is adopted, the diversity of the population is favorably kept, and the variation strategy is shown as a formula 3-1.

X_i,j＝k*(a_j+b_j)+sin(w)*(c*x_e,j+(1-c)*x_r,j) (3-1)

Wherein k is [0,1 ]]Is a random number; a is_j、b_jRespectively the upper and lower boundaries of the current population boundary; x is the number of_e,jThe j gene of the current elite solution; x is the number of_r,jIs the jth gene of the jth individual, r is not equal to i, w belongs to [0,2 pi ]]And c is a fixed value of 0.3.

3.2, cross operation;

the objects of the cross operation are an initial population and a population consisting of variant individuals, and a binomial cross mode is adopted, namely:

wherein,

in order to test the individual(s),

is the original individual. Firstly, randomly selecting an individual in an initial population for standby, then sequentially generating a random value within the range of 0-1 for each dimension of each individual in the population, and if the random value is less than the cross probability or when the random value is less than the cross probabilityAnd if the former individual is the initially selected individual, performing crossover, namely selecting the current dimension value of the corresponding variant individual to replace the dimension value of the current individual, otherwise, keeping the dimension value of the current individual unchanged, thereby obtaining the test individual.

3.3 selecting operation;

firstly, according to the initial storage capacity of each reservoir hydropower station month and the water volume coming in each month, respectively adopting a flow reference scheme corresponding to a test individual and an individual in an initial population to update the storage capacity value at the end of each month, namely: the end-of-month storage capacity is the initial storage capacity of the month + the amount of the water coming from the month-the electricity generation reference flow of the month.

If the storage capacity value does not meet the storage capacity constraint at the end of the month: correcting the reference flow lower than the minimum reservoir capacity value constraint to ensure that the monthly reference flow is equal to the monthly initial reservoir capacity + the monthly water volume-minimum reservoir capacity, setting the monthly end reservoir capacity as the minimum reservoir capacity, and generating 0 waste water; marking the quoted flow higher than the maximum reservoir capacity value constraint, setting the reservoir capacity at the end of the month as the maximum reservoir capacity, generating the abandoned water amount which is the initial reservoir capacity of the month plus the water amount of the month-the quoted flow-the maximum reservoir capacity, and simultaneously recording the accumulated value of the individual abandoned water amount by using a variable; and for the reference flow meeting the storage capacity constraint, the updating water abandonment amount is 0. And after updating each dimension in the individual, respectively calculating the adaptive value of the test individual and the initial individual.

The following constraint processing rules are used to select individuals for entry into the next generation for: (1) the test individuals and the initial individuals are not marked, and the individuals with larger adaptive values are selected; (2) selecting the test individual if the test individual is not marked and the initial individual is marked; (3) both are marked and the individual with the smaller offset value is selected. And finally updating parameters such as the adaptive value, the water abandoning amount, the storage capacity value and the like of the individuals in the population.

And 4, realizing distributed computation: by adopting a sub-population concept and relying on a programming framework MapReduce of a Hadoop platform, a father population formed by a certain number of individuals is initialized, then the initial population is divided into a certain number of relatively independent sub-populations, each sub-population independently evolves, and finally all the sub-populations are summarized to screen out optimal individuals. And after each sub-population evolves for a certain number of generations, summarizing data submitted by the Map process through a Reduce process, screening out the optimal individuals in all the sub-populations, namely the corresponding optimal power generation water scheme, and generating a result file together with the corresponding optimal adaptive value as a result to output the result file to the HDFS.

4.1Map process;

the MapReduce processed data form is a form of key, value key value pair, and as the DDE-MR evolves based on the sub-population, the input and output key, value pair format of the map function is set as:

<key1_i,value1_i>→<key2_i,value2_i>,i＝1,2,...,n (4-1)

wherein n is the total number of the sub-populations; key1_iIs the starting address of the ith sub-population in the file, value1_iAll individuals representing the ith subgroup; key2_iThen it is an index of the ith sub-population, value2_iAll individuals of the ith subgroup are also represented.

The pseudo code for the Map function is as follows:

4.2Reduce process;

between the Map process and Reduce process, there is a Shuffle process, which performs preliminary sorting and classification on the output key value pairs of Map according to the key value, and the data introduced into Reduce becomes the form of [ key, list [ value ] ]. The main workings of the Reduce function are therefore: traversing all individuals in the sub-population corresponding to the value in the list, screening out the individual with the maximum adaptive value, and outputting the individual to the HDFS, wherein the number of reducers is set to be 1.

The pseudo code for Reduce function is as follows:

the technical conception of the invention is as follows: on the basis of the optimized scheduling problem of small hydropower stations, a differential evolution algorithm is adopted to carry out coding, fitness function design and a series of evolution operations on the small hydropower stations so as to process the running data of each hydropower station, optimize the power generation reference flow to obtain an optimized reference flow scheme, and provide data support for the actual scheduling scheme. Meanwhile, a programming frame MapReduce of a Hadoop platform is taken as a research object, the concept of sub-populations is adopted in an evolution mode, the evolution process is independently completed among the sub-populations, and the traditional evolution task based on a single population is decomposed.

The invention has the advantages that: with the explosive growth of data in production and life, the evolutionary algorithm has the defects of complex calculation and low speed when processing massive high-dimensional data, and the adoption of a big data distributed platform Hadoop can improve the original differential evolutionary algorithm, so that the evolutionary algorithm is suitable for data processing and optimized scheduling in a big data environment, and has very important practical significance. The optimal scheduling of small hydropower stations is taken as an application direction, and in order to improve the operational efficiency, a Hadoop-based distributed differential evolution scheduling method for small hydropower stations is provided, so that the economic benefit and the ecological benefit of small hydropower stations can be better improved, and the sustainable development of small hydropower stations is promoted.

Drawings

FIG. 1 is a flow chart of a differential evolution algorithm of the present invention;

FIG. 2 is a diagram of a distributed platform architecture of the present invention;

FIG. 3 is an overall flow chart of the present invention;

FIG. 4 is a differential evolution algorithm convergence diagram of the present invention;

FIG. 5 is a monthly scheduling statistical chart of the trial basin power generation amount of the present invention;

FIG. 6 is a running acceleration ratio of the algorithm under the Hadoop platform of the present invention.

Detailed Description

The invention is further described below with reference to the accompanying drawings.

In the embodiment, a reservoir power station with regulation capacity is selected by taking a Jiangxi Lushui river basin as a test point object: taking a social chief, a head and wave and an east valley power station as research objects, taking the maximum generated energy as a scheduling model of an optimization target, adopting a monthly scheduling mode to comprehensively schedule the quoted flow within 12 months (T is 12), and setting the initial reservoir capacity of the scheduling as a monthly initial reservoir capacity value; the population size NP of the algorithm is set to 1000 and the number of iterations is 1000.

As shown in fig. 1, after coding and fitness function design of an actual research object, firstly, setting control parameters of an algorithm, including a mutation operator, a crossover operator and other parameters required by operation; secondly, generating an initialization population and calculating population fitness; then, carrying out differential variation operation on the population to generate new variation individuals to form a variation population; performing cross operation on the original population and the variant population to generate a temporary population; then, selecting the temporary population according to certain selection conditions to generate a new population for next generation evolution, and simultaneously recording the optimal individuals until now; after a certain number of iterations, whether the evolution is finished or not is determined by judging whether the initial set iteration number is reached or not.

As shown in fig. 2, the distributed cluster is constructed by using server virtualization nodes as Hadoop deployment carriers, fig. 2 is a virtualization architecture diagram of two servers, and the server virtualization technology is used to convert physical resources of the servers into logical resources, so that a plurality of operating systems run on the same host at the same time, thereby improving the utilization rate of the resources to a greater extent and reducing the total cost of the system. Each server is virtualized into a plurality of virtual nodes equivalent to physical nodes, and the virtual nodes are interconnected through a switch, so that all hosts are communicated with each other, and a daemon process can normally run on the hosts.

As shown in FIG. 3, based on the design of the differential evolution algorithm, the method is combined with the MapReduce framework of the Hadoop platform. Two main tasks of designing the distributed computing method based on MapReduce are to construct a Map process and a Reduce process, namely to define a Map function and a Reduce function. The Map process is used for decomposing the total task, and the Reduce process is used for summarizing the plurality of decomposed subtasks.

Therefore, in the aspect of designing the evolution mode of the population, a father population composed of a certain number of individuals is initialized, then the initial population is divided into a certain number of relatively independent sub-populations, each sub-population independently evolves, and finally all the sub-populations are gathered to screen out the optimal individuals. And combining with a MapReduce framework, namely processing the evolution process of one sub-population by each Map process, after each sub-population evolves for a certain number of generations, summarizing the data submitted by the Map process through a Reduce process, screening out the optimal individuals in all the sub-populations, namely the corresponding optimal power generation water scheme, and generating a result file together with the corresponding optimal adaptive value as a result and outputting the result file to the HDFS.

step 1: and establishing a mathematical model with the maximum generated energy as a target. The objective function is set to be the maximum value of the sum of the generated energy of the 3 hydropower stations, the scheduling period is one year, and 12 months are taken as unit time intervals.

In the formula, the meaning of each mathematical symbol is:

t: scheduling the total period number of the cycle;

i: the number of hydropower stations;

K_i: the output coefficient of the ith hydropower station;

Q_i(t): the power generation flow of the hydropower station i in the t period;

H_i(t): generating water heads of the hydropower station i in a time period t;

a. and (4) library capacity constraint:

V_imin<V_i<V_imax (1-2)

b. reference to flow constraints:

Q_i,tmin<Q_i,t<Q_i,tmax (1-3)

c. water quantity balance:

V_i(t)＝V_i(t-1)+(q_i(t)-Q_i(t)) (1-4)

d. non-negative condition constraint:

all the variables are more than or equal to 0

Step 2: encoding and initializing: and comprehensively scheduling the reference flow of the three hydropower stations within 12 months, taking the reference flow as a whole, wherein each individual is composed of the reference flow value of the small hydropower station within 12 months, namely a 12 x 3-dimensional array, and a real number coding mode is adopted, and a coding table is shown in a table 2-2.

TABLE 2-2 hydropower station coding table

The population size NP is initially set to 1000. In the population initialization process, reading file contents, initializing the water inflow amount of each hydropower station every month and the size of the storage capacity at the beginning of 1 month, randomly generating all individuals of a population by taking the upper and lower limits of the reference flow as boundaries when initializing the power generation reference flow, calculating the fitness value of the individuals, and storing the fitness value.

Step 3, evolution operation, which comprises the following specific steps:

3.1 mutation operation;

X_i,j＝k*(a_j+b_j)+sin(w)*(c*x_e,j+(1-c)*x_r,j) (3-1)

3.2, cross operation;

wherein,

in order to test the individual(s),

is the original individual. Firstly, randomly selecting an individual in an initial population for standby, then sequentially generating a random value within the range of 0-1 for each dimension of each individual in the population, if the random value is less than the cross probability or the current individual is the initially selected individual, performing cross,selecting the current dimension value of the corresponding variant individual to replace the dimension value of the current individual, otherwise, keeping the dimension value of the current individual unchanged, thereby obtaining the test individual.

3.3 selecting operation;

4.1Map process;

<key1_i,value1_i>→<key2_i,value2_i>,i＝1,2,...,n (4-1)

The pseudo code for the Map function is as follows:

4.2Reduce process;

The pseudo code for Reduce function is as follows:

as shown in fig. 4, the small hydropower group of the luwater river basin is optimally scheduled with the maximum power generation amount as the scheduling target. And adopting a monthly scheduling mode, setting the 2014 as a scheduling target year, setting the reservoir capacity of the initial scheduling reservoir as an actual early monthly reservoir capacity value, and acquiring actual corresponding data by quoting flow constraint, reservoir capacity constraint and the like. And solving a scheduling model for the main reservoir type power station according to the actual warehousing runoff of the year to obtain an optimized scheduling simulation result, wherein the optimized scheduling simulation result comprises the power generation index amount, the storage capacity, the water abandonment amount, the power generation amount and the like of each time period. The actual scheduling power generation and optimized scheduling power generation results of the main hydropower stations of the small hydropower station are shown in a table 3-1.

TABLE 3-1 comparison of actual and optimized power generation for small hydroelectric power generation groups

The scheduling results are compared, and the total power generation amount of the small hydropower station group under optimized scheduling is basically improved. Meanwhile, the actual monthly power generation amount of the small hydropower group of the Lushui river in 2014 and the optimized scheduled monthly power generation amount are counted to obtain a statistical chart shown in the figure 4. It can be obtained that after optimized scheduling, the total power generation amount of the small hydropower station basically exceeds the actual power generation amount. From simulation results, it is observed that the power generation amount of individual hydropower stations in a small hydropower group is not always increased, and may not reach an actual scheduling value compared with monthly actual scheduling, because local optimization needs to be abandoned to achieve the overall improvement goal in order to pursue overall optimization during overall scheduling.

As shown in fig. 5, the optimal adaptive value convergence variation curve of the DE algorithm is a curve that fluctuates in the early stage but converges to global optimal when the population evolves to about 300 generations.

As shown in FIG. 6, the concept of an acceleration ratio is introduced herein to evaluate the operating efficiency of a parallel algorithm. The acceleration ratio is the ratio of the execution time of a single node to the parallel execution time under the same task and condition; the larger the acceleration ratio, the better the parallelism. The calculation formula is as follows:

wherein, T₁For the running time of a task in the state of a single node, T_nThe time taken for the task to run in parallel in the n nodes.

The acceleration ratio of the experimental run result of the population size 4000 is fed back in the figure. When the number of maps is small, that is, the number of parallel tasks is small, the cluster operation time is longer than that of a single machine due to the need of communication among the nodes in the cluster. When the number of maps is increased, the number of parallel tasks is increased, the acceleration ratio is gradually increased and exceeds 1, the running time of each job is shortened when the job runs in a cluster environment, and the acceleration ratio is continuously increased. Therefore, it is inferred that when the problem scale is large and the number of processing jobs is large, the cluster scale is increased, namely the number of computing nodes in the cluster scale, and the processing capacity of the system for the same task is remarkably improved, which also shows that the differential evolution algorithm can obtain a better speed-up ratio when processing the actual problem under the MapReduce framework.

The embodiments described in this specification are merely illustrative of implementations of the inventive concept and the scope of the present invention should not be considered limited to the specific forms set forth in the embodiments but rather by the equivalents thereof as may occur to those skilled in the art upon consideration of the present inventive concept.

Claims

1. The Hadoop-based distributed differential evolution scheduling method for small hydropower stations comprises the following steps:

step 1: establishing a mathematical model with the maximum generated energy as a target; the target function is set as the maximum value of the sum of the generated energy of I hydropower stations in the small hydropower station group;

in the formula, the meaning of each mathematical symbol is:

t: scheduling the total period number of the cycle;

i: the number of hydropower stations;

K_i: the output coefficient of the ith hydropower station;

Q_i(t): the power generation flow of the hydropower station i in the t period;

H_i(t): generating water heads of the hydropower station i in a time period t;

Δt_i: generating time of the hydropower station i in a time period t, wherein the unit is s;

a. and (4) library capacity constraint:

V_imin<V_i<V_imax (1-2)

b. reference to flow constraints:

Q_i,tmin<Q_i(t)<Q_i,tmax (1-3)

c. water quantity balance:

V_i(t)＝V_i(t-1)+(q_i(t)-Q_i(t)) (1-4)

wherein, V_i(t)、V_i(t-1) the values of the storage capacity in the periods t and t-1, respectively, q_i(t) represents the amount of water coming during the interval t;

d. non-negative condition constraint:

all the variables are more than or equal to 0

Because of long-term scheduling, the water flow time between reservoirs is not considered;

step 2: encoding and initializing: the flow quoted in the I hydropower station dispatching cycle needs to be comprehensively dispatched as a whole, each individual is composed of the flow quoted value in the T time period of the small hydropower station group, namely a T-I dimensional array, and a real number coding mode is adopted;

the population size is NP; in the population initialization process, reading file contents, initializing the water inflow amount of each hydropower station every month and the storage capacity size of a time interval 1, randomly generating all individuals of a population by taking the upper limit and the lower limit of a reference flow as boundaries when initializing the power generation reference flow, calculating the fitness value of the individuals and storing the fitness value;

step 3, evolution operation, which comprises the following specific steps:

3.1 mutation operation;

in the differential evolution process, the population is firstly subjected to variation operation, a fixed global boundary is adopted, the diversity of the population is favorably kept, and the variation strategy is shown as a formula 3-1;

X_i,j＝k*(a_j+b_j)+sin(w)*(c*x_e,j+(1-c)*x_r,j) (3-1)

wherein k is [0,1 ]]Is a random number; a is_j、b_jRespectively the upper and lower boundaries of the current population boundary; x is the number of_e,jThe j gene of the current elite solution; x is the number of_r,jIs the jth gene of the jth individual, r is not equal to i, w belongs to [0,2 pi ]]C is a fixed value of 0.3;

3.2, cross operation;

wherein,

in order to test the individual(s),

is the original individual; firstly, randomly selecting an individual in an initial population for standby, and then sequentially carrying out each dimension of each individual in the populationGenerating a random value within the range of 0-1, if the random value is less than the crossover probability or the current individual is the initially selected individual, performing crossover, namely selecting the current dimension value of the corresponding variant individual to replace the dimension value of the current individual, otherwise, keeping the dimension value of the current individual unchanged, thereby obtaining a test individual;

3.3 selecting operation;

firstly, according to the initial storage capacity of each reservoir hydropower station month and the water volume coming in each month, respectively adopting a flow reference scheme corresponding to a test individual and an individual in an initial population to update the storage capacity value at the end of each month, namely: the last month storage capacity is the initial month storage capacity plus the monthly water amount-the monthly power generation reference flow;

if the storage capacity value does not meet the storage capacity constraint at the end of the month: correcting the reference flow lower than the minimum reservoir capacity value constraint to ensure that the monthly reference flow is equal to the monthly initial reservoir capacity + the monthly water volume-minimum reservoir capacity, setting the monthly end reservoir capacity as the minimum reservoir capacity, and generating 0 waste water; marking the quoted flow higher than the maximum reservoir capacity value constraint, setting the reservoir capacity at the end of the month as the maximum reservoir capacity, generating the abandoned water amount which is the initial reservoir capacity of the month plus the water amount of the month-the quoted flow-the maximum reservoir capacity, and simultaneously recording the accumulated value of the individual abandoned water amount by using a variable; updating the water abandoning amount to be 0 for the reference flow meeting the storage capacity constraint; after updating each dimension in the individual, respectively calculating the adaptive value of the test individual and the initial individual;

the following constraint processing rules are used to select individuals for entry into the next generation for: (1) the test individuals and the initial individuals are not marked, and the individuals with larger adaptive values are selected; (2) selecting the test individual if the test individual is not marked and the initial individual is marked; (3) both are marked, and an individual with a smaller deviation value is selected; finally, updating adaptive values, water abandoning amounts and storage capacity value parameters of individuals in the population;

and 4, realizing distributed computation: by adopting a sub-population concept and taking a programming frame MapReduce of a Hadoop platform as a basis, firstly initializing a father population consisting of a certain number of individuals, then dividing the initial population into a certain number of relatively independent sub-populations, wherein each sub-population independently evolves, finally summarizing all the sub-populations, and screening out optimal individuals; each Map process processes the evolution process of one sub-population, after each sub-population evolves for a certain number of generations, data submitted by the Map process are summarized through the Reduce process, the optimal individuals in all the sub-populations, namely the corresponding optimal power generation water scheme, are screened out, and a result file is generated by taking the optimal adaptation value and the corresponding optimal adaptation value as results and is output to the HDFS;

4.1Map process;

＜key1_i,value1_i＞→＜key2_i,value2_i＞,i＝1,2,...,n (4-1)

wherein n is the total number of the sub-populations; key1_iIs the starting address of the ith sub-population in the file, value1_iAll individuals representing the ith subgroup; key2_iThen it is an index of the ith sub-population, value2_iAll individuals of the ith subgroup are also represented;

the Map process is as follows: firstly, setting initialization parameters, and performing differential variation, intersection and selection operations on a population within the range of iteration times to obtain a final sub-population; then, the index is used as a key, all individuals in the population are used as values, and a key value pair write context is formed;

4.2Reduce process;

between the Map process and the Reduce process, a Shuffle process exists, which carries out preliminary sorting and classification on the output key value pairs of the Map according to the key value, and the data transmitted into the Reduce is changed into a form of [ key, list [ value ] ]; the main workings of the Reduce function are therefore: traversing all individuals in the sub-population corresponding to the value in the list, screening out the individual with the maximum adaptive value, and outputting the individual to the HDFS, wherein the number of reducers is set to be 1.