WO2021077729A1 - Lightning prediction method - Google Patents

Lightning prediction method

Info

Publication number
WO2021077729A1
Authority
WO
WIPO (PCT)
Prior art keywords
lightning
order
forecast
meteorological
meteorological parameters
Prior art date
Application number
PCT/CN2020/090434
Other languages
English (en)
French (fr)
Inventor
方玉河
李健
王钊
陈玥
吴大伟
陶汉涛
许远根
陈扬
张磊
林卿
姜志博
高攀
李旺
Original Assignee
国网电力科学研究院武汉南瑞有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 国网电力科学研究院武汉南瑞有限责任公司
Priority to AU2020372283A (AU2020372283A1, en)
Publication of WO2021077729A1 (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning

Definitions

  • The invention relates to the technical field of disaster prevention and mitigation, and in particular to a lightning prediction method.
  • Lightning is usually accompanied by flashes and thunder. It is a spectacular and extremely destructive natural phenomenon. It mostly occurs inside cumulonimbus clouds with intense convection, or between a charged thundercloud and a protruding object on the ground. The occurrence and development of a lightning process result from the combined effect of many natural and physical conditions, such as atmospheric motion and the earth's magnetic field. As a strong discharge phenomenon, lightning can carry currents of tens of thousands of amperes, and its instantaneous voltage can reach several million volts. The power of a low-to-medium-intensity thunderstorm can therefore reach about 10 million watts, comparable to the output power of a small nuclear power plant.
  • Lightning warning is an indispensable part of national severe-weather forecasting. Improving its accuracy and the level of forecast service is closely tied to the development of society as a whole and to the safety of industry and daily life.
  • Common lightning forecasting and early-warning methods are radar data extrapolation, direct forecasting with numerical models, empirical forecasting based on meteorological elements, and nowcasting based on atmospheric electric field meters.
  • Direct forecasting with numerical models is accurate, but it requires very large computing power and is costly. Radar data extrapolation and empirical forecasting based on meteorological elements require far less computation than numerical models, but their accuracy is low. Nowcasting based on atmospheric electric field meters is fairly accurate, but its forecast lead time is very short.
  • Existing lightning early-warning methods thus suffer from low accuracy, excessive computing requirements, or short forecast lead times. How to reduce the use of computing power, save cost, extend the forecast lead time, and still achieve good accuracy is the problem to be solved.
  • The purpose of the present invention is to provide a lightning prediction method with a small amount of calculation, low cost, and high forecast accuracy.
  • A lightning prediction method includes the following steps:
  • S1 Obtain the basic meteorological parameters of the area to be predicted;
  • S2 Calculate the high-order meteorological parameters related to lightning based on the high-order meteorological parameters of the area to be predicted;
  • S3 Obtain the lightning location observation data of the area to be predicted, and grid the lightning location observation data;
  • S4 Based on the random forest algorithm, calculate the degree of correlation between each high-order meteorological parameter and lightning, and select the high-order meteorological parameters that are highly correlated with lightning;
  • S5 Use the XGBoost algorithm to build a forecast model based on the forecast lead time, the forecast time, and the high-order meteorological parameters that are highly correlated with lightning;
  • S6 Based on the high-order meteorological parameters of the area to be predicted, use the forecast model to predict the spatial distribution and occurrence probability of lightning.
  • The basic meteorological parameters include temperature, humidity, dew point, vorticity, air pressure, convective precipitation, non-convective precipitation, convective available potential energy, and radar reflectivity at different height levels of the area to be predicted.
  • The high-order meteorological parameters include the A index, the K index, the Showalter index, and the severe weather threat index.
  • In step S3, gridding the lightning location observation data means using a gridding method to convert the lightning location observation data into gridded data with the same longitude, latitude, and resolution as the basic meteorological parameters.
  • In step S4, the degree of correlation between each high-order meteorological parameter and lightning is calculated with the random forest algorithm as follows: a random forest model is built with the high-order meteorological parameters as feature vectors and the gridded lightning location observation data as the target vector; the out-of-bag (OOB) error is then used as the evaluation index to calculate the importance of each feature vector, and the degree of correlation between each high-order meteorological parameter and lightning is determined from the importance of its feature vector.
  • In step S5, the forecast model is built with the XGBoost algorithm as follows: for each high-order meteorological parameter, the historical data of the parameter are the feature vector, the gridded historical lightning location observation data are the target vector, and a linear regression function is the objective parameter; the hyperopt algorithm is used to perform Bayesian tuning of the XGBoost hyperparameters, and forecast models between the high-order meteorological parameters and the lightning data are built for the different forecast times, which yields a multi-time forecast model.
  • Using the forecast model to predict the spatial distribution and occurrence probability of lightning includes: (1) inputting the high-order meteorological parameters of each forecast time into the multi-time forecast model to obtain the lightning forecast data of each forecast time; (2) regrouping the lightning forecast data series of the same forecast time to generate gridded lightning forecast data.
  • Compared with the prior art, the present invention has the following beneficial effects:
  • Compared with numerical forecasting, the disclosed lightning prediction method significantly reduces the amount of calculation and greatly lowers the computing cost; it builds its forecast model with the random forest algorithm and the XGBoost algorithm.
  • Compared with numerical forecast models that solve complex fluid-dynamics equations and routinely require several Tflop/s, this method needs little computation and is inexpensive. Compared with traditional meteorological statistical models based on linear models, it introduces more nonlinearity and higher complexity, so its accuracy is higher. Its forecast lead time equals that of the input global model forecast, up to more than ten days.
  • The invention discloses a lightning prediction method, which includes the following steps:
  • S1 Obtain the basic meteorological parameters of the area to be predicted;
  • S2 Calculate the high-order meteorological parameters related to lightning based on the high-order meteorological parameters of the area to be predicted;
  • S3 Obtain the lightning location observation data of the area to be predicted, and grid the lightning location observation data;
  • S4 Based on the random forest algorithm, calculate the degree of correlation between each high-order meteorological parameter and lightning, and select the high-order meteorological parameters that are highly correlated with lightning. Random forest is used because judging the importance of the high-order meteorological parameters with it does not require considering whether the parameters are linearly separable, nor normalizing or standardizing the features;
  • S5 Use the XGBoost algorithm to build a forecast model based on the forecast lead time, the forecast time, and the high-order meteorological parameters that are highly correlated with lightning.
  • XGBoost is a boosting algorithm. The idea of boosting is to combine many weak classifiers into one strong classifier; since XGBoost is a boosted tree model, it combines many tree models into a strong classifier.
  • In the present invention, the objective function of the lightning forecast is a linear regression function. For each forecast time and each time period, Bayesian optimization is used to tune coefficients such as the maximum depth, the number of trees, the learning rate, the sampling number, and the minimum sum of sample proportions at a terminal node. At regular intervals, newly obtained observation data are added to the training samples and the model is retrained to obtain a new forecast model, so the forecasting skill of the model can keep improving.
  • S6 Based on the high-order meteorological parameters of the area to be predicted, use the forecast model to predict the spatial distribution and occurrence probability of lightning.
  • The basic meteorological parameters include temperature, humidity, dew point, vorticity, air pressure, convective precipitation, non-convective precipitation, convective available potential energy, and radar reflectivity at different height levels of the area to be predicted. Specifically, the temperature, humidity, dew point, vorticity, and other variables on each pressure level are obtained from the EC global forecast model as a 72-hour forecast at 3-hour intervals, together with surface variables such as convective precipitation, non-convective precipitation, and convective available potential energy; radar reflectivity and similar variables are obtained from a regional forecast model.
  • The high-order meteorological parameters include the A index, the K index, the Showalter index, and the severe weather threat index, where:
  • the A index is calculated as: A = T850 - T500 - (T850 - Td850) - (T700 - Td700) - (T500 - Td500);
  • the K index is defined as: K = T850 - T500 + Td850 - (T700 - Td700);
  • the Showalter index is defined as: SI = T500 - T', where T' is the temperature of a moist air parcel lifted from the 850 hPa isobaric surface along the dry adiabat to the condensation level and then along the moist adiabat to 500 hPa;
  • the severe weather threat index is defined as:
  • SWEA = 12*Td850 + 20*(TT-49) + 4*WF850 + 2*WF500 + 125*(sin(WD500-WD850)+0.2), where TT is the total totals index. Any term of the formula that is less than 0 is not counted, that is, it is set to 0. WF is in m/s. The rightmost term is calculated only when WD850 is within 130°~250°, WD500 is within 210°~310°, WD500 is greater than WD850, and both WF850 and WF500 are greater than 7.5 m/s; otherwise it is 0.
  • T: temperature
  • Td: dew-point temperature
  • WF: wind speed
  • WD: wind direction
  • The numeric suffix is the pressure level (in hPa) at which the variable is taken.
  • In step S3, gridding the lightning location observation data means using a gridding method to convert the lightning location observation data into gridded data with the same longitude, latitude, and resolution as the basic meteorological parameters. This is needed because the lightning location observation data are station data.
  • In step S4, the degree of correlation between each high-order meteorological parameter and lightning is calculated with the random forest algorithm as follows: a random forest model is built with the high-order meteorological parameters as feature vectors and the gridded lightning location observation data as the target vector; the out-of-bag (OOB) error is then used as the evaluation index to calculate the importance of each feature vector, and the degree of correlation between each high-order meteorological parameter and lightning is determined from the importance of its feature vector.
  • In step S5, the forecast model is built with the XGBoost algorithm as follows: for each high-order meteorological parameter, the historical data of the parameter are the feature vector, the gridded historical lightning location observation data are the target vector, and a linear regression function is the objective parameter; the hyperopt algorithm is used to perform Bayesian tuning of the XGBoost hyperparameters, and forecast models between the high-order meteorological parameters and the lightning data are built for the different forecast times, which yields the multi-time forecast model. Specifically: (1) for each high-order meteorological parameter, with the historical gridded lightning location observation data as the target vector and a linear regression function as the objective parameter, the hyperopt algorithm performs Bayesian tuning of XGBoost hyperparameters such as the number of iterations, the number of trees, and the tree depth; (2) at each forecast time, a forecast model between the high-order meteorological parameters and the lightning data is built, which gives the multi-time forecast model.
  • The XGBoost algorithm is used to build the forecast model because it has the following advantages: (1) XGBoost supports linear classifiers, in which case it is equivalent to logistic regression (for classification) or linear regression (for regression) with L1 and L2 regularization terms; (2) XGBoost performs a second-order Taylor expansion of the cost function and uses both the first and second derivatives, which makes the overall objective explicit and lets the tree-learning procedure be derived step by step; (3) when samples have missing values, XGBoost automatically learns the split direction; (4) XGBoost borrows column subsampling from random forests, which both prevents overfitting and reduces computation; (5) the cost function includes a regularization term that controls model complexity.
  • The regularization term contains the number of leaf nodes and the sum of the squared L2 norms of the scores output by the leaf nodes. From the bias-variance perspective, the regularization term reduces the variance of the model and prevents overfitting; (6) after each iteration, XGBoost applies a learning rate to the leaf nodes, which lowers the weight and influence of each tree and leaves more room for later trees to learn; (7) XGBoost supports parallelism, at feature granularity rather than tree granularity. The most time-consuming step in learning a decision tree is sorting the feature values, so XGBoost pre-sorts the data before iterating and stores the result in a block structure.
  • The block structure is reused in every iteration, which reduces computation, and it also makes parallelism possible.
  • When a node is split, the gain of each feature is calculated and the feature with the largest gain is chosen for the split, so the gains of the features can be computed in multiple threads;
  • (8) a parallelizable approximate histogram algorithm: when a tree node is split, the gain of each candidate split must be calculated. With a large amount of data, sorting the features and traversing all candidates to find the optimal split point is a greedy method that is extremely time-consuming.
  • The approximate histogram algorithm is therefore introduced to generate candidate split points efficiently. The gain is obtained by subtracting the value of a certain quantity before the split from its value after the split.
  • To limit tree growth, a threshold is introduced, and a split is made only when the gain exceeds the threshold.
  • Overall, XGBoost is one of the most commonly used and most effective models for machine learning on structured data.
  • Using the forecast model to predict the spatial distribution and occurrence probability of lightning includes: (1) inputting the high-order meteorological parameters of each forecast time into the multi-time forecast model to obtain the lightning forecast data of each forecast time; (2) regrouping the lightning forecast data series of the same forecast time to generate gridded lightning forecast data.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Computing Systems (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Medical Informatics (AREA)
  • General Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

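A lightning prediction method, comprising the following steps: obtaining the basic meteorological parameters of an area to be predicted; calculating the high-order meteorological parameters related to lightning based on the high-order meteorological parameters of the area to be predicted; obtaining the lightning location observation data of the area to be predicted and gridding the lightning location observation data; based on the random forest algorithm, calculating the degree of correlation between each high-order meteorological parameter and lightning, and selecting the high-order meteorological parameters most highly correlated with lightning; building a forecast model with the XGBoost algorithm based on the forecast lead time and the forecast time; and, based on the high-order meteorological parameters of the area to be predicted, using the forecast model to predict the spatial distribution and occurrence probability of lightning.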

Description

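Lightning prediction method
Technical field
The present invention relates to the technical field of disaster prevention and mitigation, and in particular to a lightning prediction method.
Background
Lightning is usually accompanied by flashes and thunder. It is a spectacular and extremely destructive natural phenomenon. It mostly occurs inside cumulonimbus clouds with intense convection, or between a charged thundercloud and a protruding object on the ground. The occurrence and development of a lightning process result from the combined effect of many natural and physical conditions, such as atmospheric motion and the earth's magnetic field. As a strong discharge phenomenon, lightning can carry currents of tens of thousands of amperes, and its instantaneous voltage can reach several million volts, so the power of a low-to-medium-intensity thunderstorm can reach about 10 million watts, comparable to the output power of a small nuclear power plant. Lightning therefore releases enormous energy and is extremely destructive in an instant. It has drawn wide attention and was listed among the ten most severe natural disasters under the United Nations International Decade for Natural Disaster Reduction. To effectively reduce the impact of lightning disasters on economic and social development and to avoid major casualties and economic losses, lightning early warning is very important.
Lightning warning is an indispensable part of national severe-weather forecasting. Improving its accuracy and the level of forecast service is closely tied to the development of society as a whole and to the safety of industry and daily life. Common lightning forecasting and early-warning methods are radar data extrapolation, direct forecasting with numerical models, empirical forecasting based on meteorological elements, and nowcasting based on atmospheric electric field meters. Direct forecasting with numerical models is accurate, but it requires very large computing power and is costly. Radar data extrapolation and empirical forecasting based on meteorological elements require far less computation than numerical models, but their accuracy is low. Nowcasting based on atmospheric electric field meters is fairly accurate, but its forecast lead time is very short.
Existing lightning early-warning methods thus suffer from low accuracy, excessive computing requirements, or short forecast lead times. How to reduce the use of computing power, save cost, extend the forecast lead time, and still achieve good accuracy is the problem to be solved.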
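Summary of the invention
The purpose of the present invention is to provide a lightning prediction method with a small amount of calculation, low cost, and high forecast accuracy.
To achieve this purpose, the technical solution adopted by the present invention is as follows:
A lightning prediction method, comprising the following steps:
S1: obtain the basic meteorological parameters of the area to be predicted;
S2: calculate the high-order meteorological parameters related to lightning based on the high-order meteorological parameters of the area to be predicted;
S3: obtain the lightning location observation data of the area to be predicted, and grid the lightning location observation data;
S4: based on the random forest algorithm, calculate the degree of correlation between each high-order meteorological parameter and lightning, and select the high-order meteorological parameters that are highly correlated with lightning;
S5: based on the forecast lead time, the forecast time, and the high-order meteorological parameters that are highly correlated with lightning, build a forecast model with the XGBoost algorithm;
S6: based on the high-order meteorological parameters of the area to be predicted, use the forecast model to predict the spatial distribution and occurrence probability of lightning.
Preferably, the basic meteorological parameters include temperature, humidity, dew point, vorticity, air pressure, convective precipitation, non-convective precipitation, convective available potential energy, and radar reflectivity at different height levels of the area to be predicted.
Preferably, the high-order meteorological parameters include the A index, the K index, the Showalter index, and the severe weather threat index.
Preferably, in step S3, gridding the lightning location observation data means using a gridding method to convert the lightning location observation data into gridded data with the same longitude, latitude, and resolution as the basic meteorological parameters.
Preferably, in step S4, the degree of correlation between each high-order meteorological parameter and lightning is calculated with the random forest algorithm as follows: a random forest model is built with the high-order meteorological parameters as feature vectors and the gridded lightning location observation data as the target vector; the out-of-bag (OOB) error is then used as the evaluation index to calculate the importance of each feature vector, and the degree of correlation between each high-order meteorological parameter and lightning is determined from the importance of its feature vector.
Preferably, in step S5, the forecast model is built with the XGBoost algorithm as follows: for each high-order meteorological parameter, the historical data of the parameter are the feature vector, the gridded historical lightning location observation data are the target vector, and a linear regression function is the objective parameter; the hyperopt algorithm is used to perform Bayesian tuning of the XGBoost hyperparameters, and forecast models between the high-order meteorological parameters and the lightning data are built for the different forecast times, which yields a multi-time forecast model.
Preferably, using the forecast model to predict the spatial distribution and occurrence probability of lightning based on the high-order meteorological parameters of the area to be predicted includes:
(1) inputting the high-order meteorological parameters of each forecast time into the multi-time forecast model to obtain the lightning forecast data of each forecast time;
(2) regrouping the lightning forecast data series of the same forecast time to generate gridded lightning forecast data.
Compared with the prior art, the beneficial effects of the present invention are:
Compared with numerical forecasting, the disclosed lightning prediction method significantly reduces the amount of calculation and greatly lowers the computing cost; it builds its forecast model with methods such as the random forest algorithm and the XGBoost algorithm. Compared with numerical forecast models that solve complex fluid-dynamics equations and routinely require several Tflop/s, this method needs little computation and is inexpensive. Compared with traditional meteorological statistical models based on linear models, it introduces more nonlinearity and higher complexity, so its accuracy is higher. Its forecast lead time equals that of the input global model forecast, up to more than ten days.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features, and advantages of the invention are easier to understand, preferred embodiments are described in detail below.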
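Detailed description
To further explain the technical means adopted by the present invention to achieve its intended purpose and their effects, the specific implementation, structure, features, and effects of the invention are described in detail below with reference to preferred embodiments:
The present invention discloses a lightning prediction method, comprising the following steps:
S1: obtain the basic meteorological parameters of the area to be predicted.
S2: calculate the high-order meteorological parameters related to lightning based on the high-order meteorological parameters of the area to be predicted;
S3: obtain the lightning location observation data of the area to be predicted, and grid the lightning location observation data;
S4: based on the random forest algorithm, calculate the degree of correlation between each high-order meteorological parameter and lightning, and select the high-order meteorological parameters that are highly correlated with lightning. Random forest is used because judging the importance of the high-order meteorological parameters with it does not require considering whether the parameters are linearly separable, nor normalizing or standardizing the features;
S5: based on the forecast lead time, the forecast time, and the high-order meteorological parameters that are highly correlated with lightning, build a forecast model with the XGBoost algorithm.
XGBoost is a boosting algorithm. The idea of boosting is to combine many weak classifiers into one strong classifier; since XGBoost is a boosted tree model, it combines many tree models into a strong classifier. In the present invention, the objective function of the lightning forecast is a linear regression function. For each forecast time and each time period, Bayesian optimization is used to tune coefficients such as the maximum depth, the number of trees, the learning rate, the sampling number, and the minimum sum of sample proportions at a terminal node. At regular intervals, newly obtained observation data are added to the training samples and the model is retrained to obtain a new forecast model, so the forecasting skill of the model can keep improving.
S6: based on the high-order meteorological parameters of the area to be predicted, use the forecast model to predict the spatial distribution and occurrence probability of lightning.
As a further preferred solution, the basic meteorological parameters include temperature, humidity, dew point, vorticity, air pressure, convective precipitation, non-convective precipitation, convective available potential energy, and radar reflectivity at different height levels of the area to be predicted. Specifically, the temperature, humidity, dew point, vorticity, and other variables on each pressure level are obtained from the EC global forecast model as a 72-hour forecast at 3-hour intervals, together with surface variables such as convective precipitation, non-convective precipitation, and convective available potential energy; radar reflectivity and similar variables are obtained from a regional forecast model.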
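The high-order meteorological parameters include the A index, the K index, the Showalter index, and the severe weather threat index, where:
(1) The A index is calculated as:
A = T850 - T500 - (T850 - Td850) - (T700 - Td700) - (T500 - Td500);
(2) The K index is defined as:
K = T850 - T500 + Td850 - (T700 - Td700);
(3) The Showalter index is defined as:
SI = T500 - T', where T' is the temperature of a moist air parcel lifted from the 850 hPa isobaric surface along the dry adiabat to the condensation level and then along the moist adiabat to 500 hPa.
(4) The severe weather threat index is defined as:
SWEA = 12*Td850 + 20*(TT-49) + 4*WF850 + 2*WF500 + 125*(sin(WD500-WD850)+0.2), where TT is the total totals index. Any term of the formula that is less than 0 is not counted, that is, it is set to 0. WF is in m/s. The rightmost term is calculated only when WD850 is within 130°~250°, WD500 is within 210°~310°, WD500 is greater than WD850, and both WF850 and WF500 are greater than 7.5 m/s; otherwise it is 0.
In the definitions above, T is temperature, Td is dew-point temperature, WF is wind speed, WD is wind direction, and the numeric suffix is the pressure level at which the variable is taken.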
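The A index, K index, and severe weather threat index formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: temperatures and dew points are taken to be in °C, wind directions in degrees, wind speeds in m/s, and TT is passed in as a precomputed value. The function and argument names are hypothetical and do not come from the patent.

```python
import math


def a_index(t850, t700, t500, td850, td700, td500):
    """A = T850 - T500 - (T850 - Td850) - (T700 - Td700) - (T500 - Td500)."""
    return (t850 - t500) - (t850 - td850) - (t700 - td700) - (t500 - td500)


def k_index(t850, t700, t500, td850, td700):
    """K = T850 - T500 + Td850 - (T700 - Td700)."""
    return (t850 - t500) + td850 - (t700 - td700)


def swea_index(td850, tt, wf850, wf500, wd850, wd500):
    """Severe weather threat index with wind speed in m/s, as in the text."""
    # The directional-shear term only counts when all four conditions hold.
    shear_ok = (
        130.0 <= wd850 <= 250.0
        and 210.0 <= wd500 <= 310.0
        and wd500 > wd850
        and wf850 > 7.5
        and wf500 > 7.5
    )
    shear = 125.0 * (math.sin(math.radians(wd500 - wd850)) + 0.2) if shear_ok else 0.0
    terms = [12.0 * td850, 20.0 * (tt - 49.0), 4.0 * wf850, 2.0 * wf500, shear]
    # Any term below zero is dropped, i.e. counted as 0.
    return sum(max(term, 0.0) for term in terms)
```

The Showalter index is left out because it needs a parcel-lifting calculation that the text does not spell out.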
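In step S3, gridding the lightning location observation data means using a gridding method to convert the lightning location observation data into gridded data with the same longitude, latitude, and resolution as the basic meteorological parameters. This is needed because the lightning location observation data are station data.
To illustrate the gridding method, take one grid point of the area to be predicted as an example. Within the circle centered on that grid point with radius R, let N be the number of lightning flashes. When R < 20 km and N/R^2 ≥ 1/(5*5), the grid point value is 1; otherwise it is 0. The result is a two-dimensional matrix, which is the gridded lightning data.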
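The gridding rule above can be sketched as follows. The sketch assumes flashes and grid points are given as latitude/longitude pairs in degrees, measures distance with the haversine formula, and fixes a single radius R for the whole grid (10 km by default). These choices, and all names below, are assumptions for illustration; the text only requires R < 20 km and N/R^2 ≥ 1/(5*5).

```python
import math

EARTH_RADIUS_KM = 6371.0


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def grid_lightning(flashes, lats, lons, radius_km=10.0):
    """Return a 2-D 0/1 matrix: 1 where flash density around the grid point is high enough.

    flashes: list of (lat, lon) flash locations.
    lats, lons: grid coordinates, matching the grid of the basic meteorological parameters.
    """
    if not 0.0 < radius_km < 20.0:
        raise ValueError("radius_km must satisfy 0 < R < 20 km")
    threshold = 1.0 / (5 * 5)  # at least one flash per 5 km x 5 km
    grid = []
    for lat in lats:
        row = []
        for lon in lons:
            n = sum(
                1
                for f_lat, f_lon in flashes
                if distance_km(lat, lon, f_lat, f_lon) <= radius_km
            )
            row.append(1 if n / radius_km ** 2 >= threshold else 0)
        grid.append(row)
    return grid
```

With R = 10 km the threshold works out to at least 4 flashes inside the circle.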
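Preferably, in step S4, the degree of correlation between each high-order meteorological parameter and lightning is calculated with the random forest algorithm as follows: a random forest model is built with the high-order meteorological parameters as feature vectors and the gridded lightning location observation data as the target vector; the out-of-bag (OOB) error is then used as the evaluation index to calculate the importance of each feature vector, and the degree of correlation between each high-order meteorological parameter and lightning is determined from the importance of its feature vector.
In addition, selecting the high-order meteorological parameters that are highly correlated with lightning means choosing one or more of the A index, the K index, the Showalter index, and the severe weather threat index to serve as important reference variables for the lightning forecast.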
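The text does not say how the out-of-bag evaluation is turned into an importance score. One common reading is OOB permutation importance: for each tree, measure accuracy on the samples left out of its bootstrap, then shuffle one feature among those samples and record how much the accuracy drops. The sketch below shows that mechanism with a deliberately tiny stand-in for a random forest (bagged one-split trees, no random feature subsets), using only the standard library. It is not the patent's implementation, and all names are hypothetical.

```python
import random


def fit_stump(X, y, rows):
    """Find the single-feature threshold rule with the fewest errors on the given rows."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({X[i][j] for i in rows}):
            # Errors of the rule "predict 1 when x[j] > t"; the flipped rule gets the rest.
            err = sum((X[i][j] > t) != (y[i] == 1) for i in rows)
            for flip, e in ((False, err), (True, len(rows) - err)):
                if best is None or e < best[0]:
                    best = (e, j, t, flip)
    return best[1:]


def predict(stump, x):
    j, t, flip = stump
    return int((x[j] > t) != flip)


def oob_importance(X, y, n_trees=50, seed=0):
    """Mean drop in out-of-bag accuracy when each feature is permuted."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    totals = [0.0] * n_features
    used = 0
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        oob = sorted(set(range(n)) - set(rows))      # samples this tree never saw
        if not oob:
            continue
        stump = fit_stump(X, y, rows)
        base = sum(predict(stump, X[i]) == y[i] for i in oob) / len(oob)
        for j in range(n_features):
            shuffled = [X[i][j] for i in oob]
            rng.shuffle(shuffled)
            hits = 0
            for i, value in zip(oob, shuffled):
                x = list(X[i])
                x[j] = value
                hits += predict(stump, x) == y[i]
            totals[j] += base - hits / len(oob)
        used += 1
    return [total / used for total in totals]
```

Features would then be ranked by the returned scores, and the top-ranked indices kept as inputs to the forecast model.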
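Preferably, in step S5, the forecast model is built with the XGBoost algorithm as follows: for each high-order meteorological parameter, the historical data of the parameter are the feature vector, the gridded historical lightning location observation data are the target vector, and a linear regression function is the objective parameter; the hyperopt algorithm is used to perform Bayesian tuning of the XGBoost hyperparameters, and forecast models between the high-order meteorological parameters and the lightning data are built for the different forecast times, which yields the multi-time forecast model. Specifically: (1) for each high-order meteorological parameter, with the historical gridded lightning location observation data as the target vector and a linear regression function as the objective parameter, the hyperopt algorithm performs Bayesian tuning of XGBoost hyperparameters such as the number of iterations, the number of trees, and the tree depth; (2) at each forecast time, a forecast model between the high-order meteorological parameters and the lightning data is built, which gives the multi-time forecast model.
It should be noted that the XGBoost algorithm is used to build the forecast model because it has the following advantages: (1) XGBoost supports linear classifiers, in which case it is equivalent to logistic regression (for classification) or linear regression (for regression) with L1 and L2 regularization terms; (2) XGBoost performs a second-order Taylor expansion of the cost function and uses both the first and second derivatives, which makes the overall objective explicit and lets the tree-learning procedure be derived step by step; (3) when samples have missing values, XGBoost automatically learns the split direction; (4) XGBoost borrows column subsampling from random forests, which both prevents overfitting and reduces computation; (5) the cost function includes a regularization term that controls model complexity. The term contains the number of leaf nodes and the sum of the squared L2 norms of the scores output by the leaf nodes. From the bias-variance perspective, the regularization term reduces the variance of the model and prevents overfitting; (6) after each iteration, XGBoost applies a learning rate to the leaf nodes, which lowers the weight and influence of each tree and leaves more room for later trees to learn; (7) XGBoost supports parallelism, at feature granularity rather than tree granularity. The most time-consuming step in learning a decision tree is sorting the feature values, so XGBoost pre-sorts the data before iterating and stores the result in a block structure that is reused in every iteration, which reduces computation. The block structure also makes parallelism possible: when a node is split, the gain of each feature is calculated and the feature with the largest gain is chosen for the split, so the gains of the features can be computed in multiple threads; (8) a parallelizable approximate histogram algorithm: when a tree node is split, the gain of each candidate split must be calculated. With a large amount of data, sorting the features and traversing all candidates to find the optimal split point is a greedy method that is extremely time-consuming, so the approximate histogram algorithm is introduced to generate candidate split points efficiently. The gain is obtained by subtracting the value of a certain quantity before the split from its value after the split. To limit tree growth, a threshold is introduced, and a split is made only when the gain exceeds the threshold. Overall, XGBoost is one of the most commonly used and most effective models for machine learning on structured data.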
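Items (2), (5), and (8) above describe the second-order expansion, the regularization term, and the gain threshold only in words. The sketch below writes out the standard XGBoost quantities they lead to: with G and H the sums of first and second derivatives in a node, the optimal leaf weight is -G/(H + lambda), and the gain of a split is 0.5 * [G_L^2/(H_L + lambda) + G_R^2/(H_R + lambda) - (G_L + G_R)^2/(H_L + H_R + lambda)] - gamma. Squared-error loss is used here as a stand-in for the "linear regression" objective; the exhaustive scan over one feature, the default values of lam and gamma, and all names are illustrative assumptions, not the library's code.

```python
def squared_error_grad_hess(y_true, y_pred):
    """First and second derivatives of 0.5 * (y_pred - y_true)^2 w.r.t. y_pred."""
    grads = [p - t for t, p in zip(y_true, y_pred)]
    hess = [1.0] * len(y_true)
    return grads, hess


def leaf_weight(g_sum, h_sum, lam=1.0):
    """Optimal leaf score under L2 regularization."""
    return -g_sum / (h_sum + lam)


def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Objective reduction from splitting a node; gamma is the per-leaf penalty."""
    def score(g, h):
        return g * g / (h + lam)

    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent) - gamma


def best_split(x, y, y_pred, lam=1.0, gamma=0.0):
    """Exact greedy search over one feature; returns (gain, threshold) or (0.0, None)."""
    g, h = squared_error_grad_hess(y, y_pred)
    order = sorted(range(len(x)), key=lambda i: x[i])
    g_total, h_total = sum(g), sum(h)
    best = (0.0, None)  # split only when the gain is positive, i.e. exceeds the threshold
    g_left = h_left = 0.0
    for rank in range(len(order) - 1):
        i, nxt = order[rank], order[rank + 1]
        g_left += g[i]
        h_left += h[i]
        if x[i] == x[nxt]:
            continue  # cannot split between equal feature values
        gain = split_gain(g_left, h_left, g_total - g_left, h_total - h_left, lam, gamma)
        if gain > best[0]:
            best = (gain, (x[i] + x[nxt]) / 2)
    return best
```

Raising gamma makes every candidate gain smaller, so fewer splits pass the threshold and the tree stays shallower.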
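As a further solution, using the forecast model to predict the spatial distribution and occurrence probability of lightning based on the high-order meteorological parameters of the area to be predicted includes:
(1) inputting the high-order meteorological parameters of each forecast time into the multi-time forecast model to obtain the lightning forecast data of each forecast time;
(2) regrouping the lightning forecast data series of the same forecast time to generate gridded lightning forecast data.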
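The two prediction steps above amount to keeping one model per forecast time, scoring every grid cell with the matching model, and arranging the scores back into a grid. The sketch below assumes the multi-time forecast model is a dictionary keyed by forecast hour whose values are plain callables; that layout, the feature format, and all names are hypothetical stand-ins for the trained models.

```python
def forecast_grids(models, features_by_time, n_rows, n_cols):
    """Apply one model per forecast time and regroup the outputs into 2-D grids.

    models: {forecast_hour: callable(features) -> lightning score}
    features_by_time: {forecast_hour: {(row, col): feature list}}
    Returns {forecast_hour: n_rows x n_cols grid}; cells without features stay None.
    """
    grids = {}
    for forecast_hour, cells in features_by_time.items():
        model = models[forecast_hour]  # step (1): the model trained for this forecast time
        grid = [[None] * n_cols for _ in range(n_rows)]
        for (row, col), features in cells.items():
            grid[row][col] = model(features)  # step (2): place the value on the grid
        grids[forecast_hour] = grid
    return grids


# Usage with stand-in models (real ones would be the trained forecast models).
example_models = {
    3: lambda f: min(1.0, f[0] / 40.0),
    6: lambda f: 0.0,
}
example_features = {
    3: {(0, 0): [40.0], (0, 1): [20.0]},
    6: {(0, 0): [40.0], (0, 1): [20.0]},
}
example_grids = forecast_grids(example_models, example_features, n_rows=1, n_cols=2)
```

Each grid in the result has the same layout as the gridded lightning observations, so forecasts and observations can be compared cell by cell.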
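The above embodiments are only preferred embodiments of the present invention and do not limit its scope of protection. Any insubstantial change or substitution made by those skilled in the art on the basis of the present invention falls within the scope of protection claimed by the present invention.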

Claims (7)

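  1. A lightning prediction method, characterized by comprising the following steps:
    S1: obtaining the basic meteorological parameters of the area to be predicted;
    S2: calculating the high-order meteorological parameters related to lightning based on the high-order meteorological parameters of the area to be predicted;
    S3: obtaining the lightning location observation data of the area to be predicted, and gridding the lightning location observation data;
    S4: based on the random forest algorithm, calculating the degree of correlation between each high-order meteorological parameter and lightning, and selecting the high-order meteorological parameters that are highly correlated with lightning;
    S5: based on the forecast lead time, the forecast time, and the high-order meteorological parameters that are highly correlated with lightning, building a forecast model with the XGBoost algorithm;
    S6: based on the high-order meteorological parameters of the area to be predicted, using the forecast model to predict the spatial distribution and occurrence probability of lightning.
  2. The lightning prediction method according to claim 1, characterized in that the basic meteorological parameters include temperature, humidity, dew point, vorticity, air pressure, convective precipitation, non-convective precipitation, convective available potential energy, and radar reflectivity at different height levels of the area to be predicted.
  3. The lightning prediction method according to claim 1, characterized in that the high-order meteorological parameters include the A index, the K index, the Showalter index, and the severe weather threat index.
  4. The lightning prediction method according to claim 1, characterized in that, in step S3, gridding the lightning location observation data means using a gridding method to convert the lightning location observation data into gridded data with the same longitude, latitude, and resolution as the basic meteorological parameters.
  5. The lightning prediction method according to claim 1, characterized in that, in step S4, the degree of correlation between each high-order meteorological parameter and lightning is calculated with the random forest algorithm as follows: a random forest model is built with the high-order meteorological parameters as feature vectors and the gridded lightning location observation data as the target vector; the out-of-bag (OOB) error is then used as the evaluation index to calculate the importance of each feature vector, and the degree of correlation between each high-order meteorological parameter and lightning is determined from the importance of its feature vector.
  6. The lightning prediction method according to claim 1, characterized in that, in step S5, the forecast model is built with the XGBoost algorithm as follows: for each high-order meteorological parameter, the historical data of the parameter are the feature vector, the gridded historical lightning location observation data are the target vector, and a linear regression function is the objective parameter; the hyperopt algorithm is used to perform Bayesian tuning of the XGBoost hyperparameters, and forecast models between the high-order meteorological parameters and the lightning data are built for the different forecast times, which yields a multi-time forecast model.
  7. The lightning prediction method according to claim 6, characterized in that using the forecast model to predict the spatial distribution and occurrence probability of lightning based on the high-order meteorological parameters of the area to be predicted includes:
    (1) inputting the high-order meteorological parameters of each forecast time into the multi-time forecast model to obtain the lightning forecast data of each forecast time;
    (2) regrouping the lightning forecast data series of the same forecast time to generate gridded lightning forecast data.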
PCT/CN2020/090434 2019-10-23 2020-05-15 Lightning prediction method WO2021077729A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2020372283A AU2020372283A1 (en) 2019-10-23 2020-05-15 Lightning prediction method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911011363.4A CN110796299A (zh) 2019-10-23 2019-10-23 Lightning prediction method
CN201911011363.4 2019-10-23

Publications (1)

Publication Number Publication Date
WO2021077729A1 (zh) 2021-04-29

Family

ID=69440985

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/090434 WO2021077729A1 (zh) 2019-10-23 2020-05-15 Lightning prediction method

Country Status (3)

Country Link
CN (1) CN110796299A (zh)
AU (1) AU2020372283A1 (zh)
WO (1) WO2021077729A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113191568A (zh) * 2021-05-21 2021-07-30 上海市气象灾害防御技术中心(上海市防雷中心) 基于气象的城市运行管理大数据分析预测方法和系统
CN114252706A (zh) * 2021-12-15 2022-03-29 华中科技大学 一种雷电预警方法和系统
CN114442198A (zh) * 2022-01-21 2022-05-06 广西壮族自治区气象科学研究所 一种基于加权算法的森林火险气象等级预报方法
CN114545098A (zh) * 2022-03-18 2022-05-27 华中科技大学 一种雷暴预报方法和闪电定位方法
CN115273440A (zh) * 2022-07-23 2022-11-01 河南泽阳实业有限公司 一种基于大数据智能分析算法的预警装置
CN116341391A (zh) * 2023-05-24 2023-06-27 华东交通大学 基于STPM-XGBoost模型的降水预测方法

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110796299A (zh) * 2019-10-23 2020-02-14 国网电力科学研究院武汉南瑞有限责任公司 一种雷电预测方法
WO2021217867A1 (zh) * 2020-04-29 2021-11-04 平安科技(深圳)有限公司 基于XGBoost的数据分类方法、装置、计算机设备及存储介质
CN111694047B (zh) * 2020-05-09 2021-03-23 吉林大学 基于多通道奇异谱的钻孔应变网络拓扑结构异常检测方法
CN111913236A (zh) * 2020-07-13 2020-11-10 上海眼控科技股份有限公司 气象数据处理方法、装置、计算机设备和存储介质
CN111897030A (zh) * 2020-07-17 2020-11-06 国网电力科学研究院有限公司 一种雷暴预警系统及方法
CN111915846B (zh) * 2020-08-11 2021-08-03 安徽亿纵电子科技有限公司 一种基于云计算的智能云防雷运维系统
CN114218994A (zh) * 2020-09-04 2022-03-22 京东科技控股股份有限公司 用于处理信息的方法和装置
CN112731564B (zh) * 2020-12-26 2023-04-07 安徽省公共气象服务中心 一种基于多普勒天气雷达数据的雷电智能预报方法
CN112764129B (zh) * 2021-01-22 2022-08-26 易天气(北京)科技有限公司 一种雷暴短临预报方法、系统及终端
CN113239946B (zh) * 2021-02-02 2023-10-27 广东工业大学 一种输电线路载流量的校核方法
CN113204903B (zh) * 2021-04-29 2022-04-29 国网电力科学研究院武汉南瑞有限责任公司 一种预测雷电的方法
CN113283653B (zh) * 2021-05-27 2024-03-26 大连海事大学 一种基于机器学习和ais数据的船舶轨迹预测方法
CN114518612A (zh) * 2022-02-14 2022-05-20 广东省气象公共安全技术支持中心 雷暴风险预警方法、系统及电子设备
CN114966233B (zh) * 2022-05-16 2024-08-13 国网电力科学研究院武汉南瑞有限责任公司 基于深度神经网络的雷电预报系统及方法
CN115456248B (zh) * 2022-08-15 2024-10-29 国网电力科学研究院武汉南瑞有限责任公司 基于卷积神经网络的落雷预测模型构建方法
CN118365118A (zh) * 2024-04-08 2024-07-19 北京玖天气象科技有限公司 基于动力降尺度技术的电力作业风险识别方法及装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150161150A1 (en) * 2013-12-10 2015-06-11 Weather Decision Technologies, Inc. Four dimensional weather data storage and access
CN104950186A (zh) * 2014-03-31 2015-09-30 国际商业机器公司 雷电预测的方法和装置
CN108052734A (zh) * 2017-12-12 2018-05-18 中国电力科学研究院有限公司 一种基于气象参数对雷电流幅值进行预测的方法及系统
CN108427041A (zh) * 2018-03-14 2018-08-21 南京中科九章信息技术有限公司 雷电预警方法、系统、电子设备和存储介质
CN110796299A (zh) * 2019-10-23 2020-02-14 国网电力科学研究院武汉南瑞有限责任公司 一种雷电预测方法

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105068149B (zh) * 2015-07-24 2017-04-12 国家电网公司 一种基于多信息综合的输变电设备雷电监测和预报方法
CN110334732A (zh) * 2019-05-20 2019-10-15 北京思路创新科技有限公司 一种基于机器学习的空气质量预报方法和装置

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113191568A (zh) * 2021-05-21 2021-07-30 上海市气象灾害防御技术中心(上海市防雷中心) 基于气象的城市运行管理大数据分析预测方法和系统
CN113191568B (zh) * 2021-05-21 2024-02-02 上海市气象灾害防御技术中心(上海市防雷中心) 基于气象的城市运行管理大数据分析预测方法和系统
CN114252706A (zh) * 2021-12-15 2022-03-29 华中科技大学 一种雷电预警方法和系统
CN114442198A (zh) * 2022-01-21 2022-05-06 广西壮族自治区气象科学研究所 一种基于加权算法的森林火险气象等级预报方法
CN114442198B (zh) * 2022-01-21 2024-03-15 广西壮族自治区气象科学研究所 一种基于加权算法的森林火险气象等级预报方法
CN114545098A (zh) * 2022-03-18 2022-05-27 华中科技大学 一种雷暴预报方法和闪电定位方法
CN115273440A (zh) * 2022-07-23 2022-11-01 河南泽阳实业有限公司 一种基于大数据智能分析算法的预警装置
CN116341391A (zh) * 2023-05-24 2023-06-27 华东交通大学 基于STPM-XGBoost模型的降水预测方法
CN116341391B (zh) * 2023-05-24 2023-08-04 华东交通大学 基于STPM-XGBoost模型的降水预测方法

Also Published As

Publication number Publication date
CN110796299A (zh) 2020-02-14
AU2020372283A1 (en) 2021-11-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20880070

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020372283

Country of ref document: AU

Date of ref document: 20200515

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20880070

Country of ref document: EP

Kind code of ref document: A1