CN118429564A - Machine learning-based three-dimensional modeling method for soil of south hills - Google Patents

Info

Publication number
CN118429564A
Authority
CN
China
Prior art keywords
data
soil
gradient
sampling
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410881760.1A
Other languages
Chinese (zh)
Inventor
张军
王萃
魏龙
李欣
李胜天
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi Space Geoinformation Engineering Group Co ltd
Geographic Information Engineering Team Of Jiangxi Provincial Bureau Of Geology
Original Assignee
Jiangxi Space Geoinformation Engineering Group Co ltd
Geographic Information Engineering Team Of Jiangxi Provincial Bureau Of Geology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi Space Geoinformation Engineering Group Co ltd, Geographic Information Engineering Team Of Jiangxi Provincial Bureau Of Geology filed Critical Jiangxi Space Geoinformation Engineering Group Co ltd
Priority to CN202410881760.1A priority Critical patent/CN118429564A/en
Publication of CN118429564A publication Critical patent/CN118429564A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/05 Geographic models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Geometry (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Remote Sensing (AREA)
  • Computer Graphics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a machine learning-based three-dimensional modeling method for southern hilly soil. Soil data are collected from southern hilly areas, including soil type, texture, water content, organic matter content and pH value, and the soil is modeled in three dimensions using a support vector machine (SVM). The resulting model classifies the soil, which helps in selecting suitable crops and improving crop yield and quality, and it clarifies indexes such as the nutrient content and the acidity or alkalinity of the soil, so that a suitable soil improvement plan can be formulated.

Description

Machine learning-based three-dimensional modeling method for soil of south hills
Technical Field
The invention relates to the field of three-dimensional modeling, in particular to a machine learning-based southern hilly soil three-dimensional modeling method.
Background
Current crop planting and soil management often rely on traditional experience and conventional soil testing methods, which are limited in their ability to classify and evaluate soil accurately over large areas.
Traditional soil classification and evaluation methods are generally based on manually collected data and expert experience, and they are limited when faced with the diverse soil types and complex soil characteristics of southern hilly areas. In addition, they often cannot account for the complex correlations among the various soil indexes, so it is difficult for them to provide accurate planting suggestions and soil improvement plans.
Performing three-dimensional modeling with a machine learning method, in particular the support vector machine (SVM), allows large amounts of complex soil data to be processed and the nonlinear relations among soil features to be mined, so that soil can be classified and modeled accurately. The SVM method generalizes well and handles high-dimensional data, so it can cope with the complexity and diversity of soil data in southern hilly areas and support more accurate planting suggestions and soil improvement schemes.
Therefore, a machine learning method, in particular SVM-based three-dimensional modeling, can compensate for the shortcomings of traditional methods in soil classification and evaluation and provide more accurate and scientific soil information and crop planting suggestions.
Disclosure of Invention
The invention aims to provide a machine learning-based three-dimensional modeling method for southern hilly soil.
The problem to be solved by the invention is as follows: soil data of southern hilly areas are collected, including soil type, texture, water content, organic matter content and pH value; the soil is modeled in three dimensions using the machine learning method SVM so that it can be classified; this helps farmers select suitable crops and improve crop yield and quality, and it clarifies indexes such as the nutrient content and the acidity or alkalinity of the soil, so that a suitable soil improvement plan can be formulated.
The machine learning-based three-dimensional modeling method for southern hilly soil comprises the following steps:
S1: soil is collected in the southern hills with a sampling tool; a sampling network is designed according to the slope and aspect of the terrain, and soil is sampled in different seasons; the sampling depth is 2/3 of the soil layer depth; 3-5 samples are taken at each sampling point and mixed into one homogeneous sample; the collected samples are labeled, and the sampling point number, sampling date and sampling depth are recorded; the collected data comprise soil type, texture, water content, organic matter content, pH value, terrain data, vegetation data and humidity data;
S2: the collected soil data are preprocessed: the data are checked for missing values, abnormal values and erroneous values; missing values are filled by interpolation; abnormal and erroneous values are identified and deleted; the water content, organic matter content and pH value data are standardized; the humidity data are subjected to moving averaging and seasonal decomposition; the terrain data and vegetation data are normalized; and the soil type and texture data are converted into numerical codes;
S3: the preprocessed data, comprising soil type, texture, water content, organic matter content, pH value, terrain data, vegetation data and humidity data, are reduced in dimension using principal component analysis (PCA); terrain slope and aspect features and vegetation seasonal variation features are added; and the features used for machine learning are selected as feature engineering;
S4: the soil type, texture, water content, organic matter content, pH value, terrain data, vegetation data and humidity data after dimension reduction are divided into a training set and a test set, with 85% of the data used as the training set and 15% as the test set;
S5: an SVM model is established using the machine learning SVM algorithm and trained on the training set, and the model is optimized by tuning the regularization parameter C and the bandwidth parameter γ of the Gaussian kernel function;
S6: the performance of the model is evaluated on the test set using the root mean square error (RMSE) and the coefficient of determination (R-squared);
S7: after the model passes evaluation, the model is used to model soil data of new southern hilly areas.
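The data split of step S4 and the two evaluation metrics of step S6 can be illustrated with a short Python sketch. It is only a sketch under assumptions: the function names, the fixed random seed and the rounding used to place the 85% cut are illustrative choices that do not come from the method; only the 85/15 ratio, the RMSE and the coefficient of determination are taken from the steps above.

```python
import math
import random


def split_train_test(samples, train_fraction=0.85, seed=0):
    """Shuffle a copy of the samples and split it into (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed: an assumption, for repeatability
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A lower RMSE and an R-squared closer to 1 indicate a better fit on the test set; the threshold at which the model "passes" is not specified in the method and would have to be chosen by the user.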
Further, designing the sampling network according to the slope and aspect of the terrain and sampling soil in different seasons in step S1 comprises:
performing terrain analysis on the area to be sampled using a digital elevation model and dividing it into slope intervals, where a slope of 0-15 degrees is gentle, 15-30 degrees is moderate, and more than 30 degrees is steep; customizing the sampling grid density and layout according to the terrain features, and increasing the sampling point density in steeper areas;
arranging sampling points on the north-south and east-west slopes according to the aspect of the hilly terrain, arranging sampling points in areas where the difference between shady and sunny slopes is pronounced, and arranging sampling points at typical micro-terrain features;
making different sampling plans for the rainy season and the dry season in the south, so as to capture the saturated state of soil moisture in the rainy season and the moisture-holding capacity in the dry season, as well as the decomposition of soil organic matter and the cycling of nutrient elements in different seasons; and arranging follow-up sampling after extreme weather events such as typhoons and rainstorms.
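The slope intervals used for the sampling design can be expressed as a small classification function. The Python below is a minimal sketch: the interval boundaries come from the text, while assigning the shared boundary values (15 and 30 degrees) to the lower class and the relative density factors are assumptions made only for illustration.

```python
# Hypothetical relative sampling densities. The method only states that
# steeper areas receive a higher sampling point density, not by how much.
RELATIVE_DENSITY = {"gentle": 1, "moderate": 2, "steep": 3}


def slope_class(slope_deg):
    """Classify a slope in degrees as gentle, moderate or steep."""
    if slope_deg < 0:
        raise ValueError("slope must be non-negative")
    if slope_deg <= 15:   # boundary assigned to the lower class (assumption)
        return "gentle"
    if slope_deg <= 30:   # boundary assigned to the lower class (assumption)
        return "moderate"
    return "steep"


def points_for_cell(slope_deg, base_points=1):
    """Number of sampling points for a grid cell, scaled by its slope class."""
    return base_points * RELATIVE_DENSITY[slope_class(slope_deg)]
```

For example, with `base_points=2` a cell with a 40 degree slope would receive 6 sampling points under these assumed factors.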
Further, the methods for acquiring the soil type, texture, water content, organic matter content, pH value, terrain data, vegetation data and humidity data in S1 comprise:
recording the type of each soil sample, such as red soil, yellow soil or brown soil, according to the soil classification system; describing the size and composition of the soil particles using a texture classification method and recording the result as texture data, such as sandy soil, loam or silt; measuring the soil water content using the resistance method; measuring the soil organic matter content using the loss-on-ignition method; measuring the soil pH value with an electronic pH meter; acquiring digital elevation model data of the area and recording them as terrain data; acquiring remote sensing data on vegetation cover type and density and recording them as vegetation data; and acquiring climate data of the area, namely precipitation and relative humidity, and recording them as soil humidity data.
Further, deleting the abnormal and erroneous values, standardizing the water content, organic matter content and pH value data, applying moving averaging and seasonal decomposition to the humidity data, normalizing the terrain data and vegetation data, and converting the soil type and texture data into numerical codes in step S2 comprises:
for the pH value data, plotting a scatter diagram to visualize the data, quantitatively identifying pH data points that deviate significantly from the population using the Grubbs test, and, drawing on soil science knowledge, recognizing as normal those pH values that fall outside the conventional range because of specific soil types of the southern hills, such as red soil and latosol rich in iron and aluminum oxides;
converting the topographic data and vegetation data into standard normal distribution data with a mean of 0 and a variance of 1;
for the humidity data, smoothing the data with a moving average whose window size is set according to the rainfall frequency of the southern hilly area: based on 10 years of statistical data, the mean and standard deviation of the annual number of rainfall days are calculated; years whose number of rainfall days is above the mean plus one standard deviation are defined as having a higher rainfall frequency, and years below the mean minus one standard deviation as having a lower rainfall frequency; the window is set to 7 days for the higher rainfall frequency, 21 days for the lower rainfall frequency, and 14 days otherwise; the moving average is applied to the humidity data series of each sampling point by calculating the mean of the humidity data within the window before and after each data point and replacing the original data point with that mean;
performing seasonal decomposition on the smoothed humidity data: the soil humidity data are divided by year, each year's data are decomposed into a trend term, a seasonal term and a random term using the seasonal decomposition method X-13ARIMA-SEATS, the seasonal term is processed by identifying and subtracting the humidity peaks caused by seasonal rainfall, and the processed seasonal term, the trend term and the random term are recombined to generate a seasonally adjusted humidity data series;
normalizing the terrain data, vegetation data and humidity data by Min-Max normalization, $x_{\mathrm{normalized}} = \dfrac{x - \min(x)}{\max(x) - \min(x)}$, where x is the raw data, x_normalized is the normalized data, max(x) is the maximum value in the raw data, and min(x) is the minimum value in the raw data;
converting the soil type and texture data into numerical codes: for red soil, yellow soil and brown soil the soil type data are [1,0,0], [0,1,0] and [0,0,1] respectively, with a second set of codes given as [2,0,0], [0,2,0] and [0,0,2], and for sandy soil, loam and silt the texture data are given as [0, 0], [0, 1].
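The window selection, the moving average and the Min-Max normalization described above can be sketched in plain Python. This is an illustrative sketch, not the method's own implementation. Several details are assumptions: the use of the population standard deviation, the reading of the window as centered on each data point (half of the window on each side, truncated at the ends of the series), and the mapping of a constant series to 0.0 in the normalization. All function names are hypothetical.

```python
import statistics


def window_days(rain_days_this_year, rain_days_history):
    """Pick the smoothing window (7, 14 or 21 days) from the rainfall frequency."""
    mean = statistics.mean(rain_days_history)
    std = statistics.pstdev(rain_days_history)  # population std (assumption)
    if rain_days_this_year > mean + std:
        return 7    # higher rainfall frequency
    if rain_days_this_year < mean - std:
        return 21   # lower rainfall frequency
    return 14       # all remaining years


def moving_average(series, window):
    """Replace each point by the mean of its neighbours within window // 2 days."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)                # truncate at the start of the series
        hi = min(len(series), i + half + 1)  # truncate at the end of the series
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def min_max(values):
    """Min-Max normalization: (x - min(x)) / (max(x) - min(x))."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series (assumption)
    return [(v - lo) / (hi - lo) for v in values]
```

With ten years of rainfall-day counts as `rain_days_history`, `window_days` returns the window for one year, which is then passed to `moving_average` for that year's humidity series.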
Further, the dimension reduction using principal component analysis (PCA) in step S3 comprises the following steps:
S31: calculating the covariance matrix of the preprocessed data;
S32: performing eigenvalue decomposition on the covariance matrix to obtain the eigenvalues and the corresponding eigenvectors;
S33: sorting by eigenvalue magnitude and selecting the eigenvectors corresponding to the k largest eigenvalues, where k is chosen based on the cumulative contribution rate of the eigenvalues;
S34: linearly transforming the original data with the selected eigenvectors, thereby mapping the data into a new low-dimensional space.
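Steps S31 to S34 can be sketched with NumPy as follows. The sketch assumes a numeric feature matrix with one row per soil sample; the cumulative contribution threshold of 0.95 is an assumed default, since the method does not state a value, and the function name is hypothetical.

```python
import numpy as np


def pca_reduce(X, min_cumulative_ratio=0.95):
    """Project X onto the fewest principal components reaching the given ratio."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)

    # S31: covariance matrix of the (centered) data, features in columns
    cov = np.cov(X_centered, rowvar=False)

    # S32: eigen-decomposition; eigh is appropriate for a symmetric matrix
    eigvals, eigvecs = np.linalg.eigh(cov)

    # S33: sort in descending order and pick k by cumulative contribution rate
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(ratio, min_cumulative_ratio)) + 1
    k = min(k, len(eigvals))

    # S34: linear transformation into the k-dimensional space
    return X_centered @ eigvecs[:, :k], ratio[:k]
```

The second return value gives the cumulative contribution rate after each retained component, which makes the choice of k easy to inspect.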
Further, adding the terrain slope and aspect features and the vegetation seasonal variation features and selecting the features for machine learning in step S3 comprises:
acquiring from the digital elevation model a raster elevation data set that covers the sampled area and contains geographic coordinates and corresponding elevation values; performing quality inspection on the data and correcting abnormal values; calculating the slope and the aspect of each grid cell using GIS software; spatially registering the calculated slope and aspect raster data with the existing soil attribute data; extracting the slope and aspect values corresponding to each soil sample point; and adding them to the feature matrix as additional features;
performing time series analysis on the normalized vegetation data, extracting the seasonal periodicity and trend by fast Fourier transform (FFT) analysis, and, based on the analysis results, constructing key features that reflect seasonal change, namely the seasonal mean, maximum, minimum, peak-valley difference and seasonal index.
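The FFT-based extraction of seasonal features from a vegetation time series can be sketched with NumPy. The sketch assumes an evenly sampled series (for example, monthly values of a vegetation index) and reports the dominant frequency in cycles per year together with the simple summary features named above. The seasonal index is omitted because the method does not define how it is computed. The function name and dictionary keys are hypothetical.

```python
import numpy as np


def seasonal_features(series, samples_per_year):
    """Dominant seasonal frequency and summary statistics of a time series."""
    x = np.asarray(series, dtype=float)

    # Remove the mean so the zero-frequency bin does not dominate the spectrum
    spectrum = np.abs(np.fft.rfft(x - x.mean()))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / samples_per_year)  # cycles per year

    # Skip bin 0 and take the frequency with the largest amplitude
    dominant = freqs[1:][np.argmax(spectrum[1:])]

    return {
        "dominant_cycles_per_year": float(dominant),
        "mean": float(x.mean()),
        "max": float(x.max()),
        "min": float(x.min()),
        "peak_valley_difference": float(x.max() - x.min()),
    }
```

A dominant frequency near one cycle per year indicates an annual vegetation cycle; values near two would suggest a double growing season.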
Further, optimizing the model using the regularization parameter C and the bandwidth parameter γ of the Gaussian kernel function in step S5 comprises:
the Gaussian kernel function is $K(x, x_i) = \exp\left(-\gamma \lVert x - x_i \rVert^2\right)$, where x is a sample data point for which the correlation is to be calculated, x_i is another sample data point in the sample data set, $\lVert x - x_i \rVert$ denotes the Euclidean distance, and γ is the bandwidth parameter of the Gaussian kernel function; different values of C and γ are tried by grid search, and the best-performing parameter combination is selected.
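The kernel and the grid search can be sketched in plain Python. The sketch assumes the usual form of the Gaussian kernel, K(x, x_i) = exp(-γ‖x - x_i‖²), and leaves the scoring of a parameter pair to a caller-supplied function `score(C, gamma)`, for example the validation performance of an SVM trained with those parameters. That function and all names below are hypothetical; no particular SVM library is implied.

```python
import itertools
import math


def rbf_kernel(x, xi, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-gamma * squared_distance)


def grid_search(score, c_values, gamma_values):
    """Return the (C, gamma) pair with the highest score(C, gamma)."""
    candidates = itertools.product(c_values, gamma_values)
    return max(candidates, key=lambda pair: score(*pair))
```

A larger γ makes the kernel decay faster with distance, so each training sample influences a smaller neighbourhood, while a larger C penalizes training errors more heavily; the grid search simply evaluates every combination and keeps the best one.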
The beneficial effects of the invention are as follows: soil data of southern hilly areas, including soil type, texture, water content, organic matter content and pH value, are collected, and the soil is modeled in three dimensions using the machine learning method SVM. The soil can thereby be classified, which helps farmers select suitable crops and improve crop yield and quality, and indexes such as the nutrient content and the acidity or alkalinity of the soil can be better understood, so that a suitable soil improvement plan can be formulated.
Drawings
Fig. 1 is a flowchart of a machine learning-based three-dimensional modeling method for southern hilly soil.
Detailed Description
The present invention will be further described more fully hereinafter, but the scope of the invention is not limited thereto.
The invention provides a machine learning-based three-dimensional modeling method for southern hilly soil. Soil data of southern hilly areas, including soil type, texture, water content, organic matter content and pH value, are collected, and the soil is modeled in three dimensions using the machine learning method SVM. The soil can thereby be classified, which helps in selecting suitable crops and improving crop yield and quality, and indexes such as the nutrient content and the acidity or alkalinity of the soil can be better understood, so that a suitable soil improvement plan can be formulated.

Claims (7)

1. A machine learning-based three-dimensional modeling method for southern hilly soil, characterized by comprising the following steps:
S1: collecting soil in the southern hills with a sampling tool, designing a sampling network according to the slope and aspect characteristics of the terrain, and sampling soil in different seasons, wherein the sampling depth is 2/3 of the soil layer depth, 3-5 samples are taken at each sampling point and mixed into one homogeneous sample, the collected samples are labeled, and the sampling point number, sampling date and sampling depth are recorded; the collected data comprise soil type, texture, water content, organic matter content, pH value, terrain data, vegetation data and humidity data;
S2: preprocessing the collected soil data, checking whether the data contain missing values, abnormal values or error values, filling the missing values by interpolation, selecting and deleting the abnormal values and error values, standardizing the water content, organic matter content and pH value data, applying a moving average and seasonal decomposition to the humidity data, normalizing the terrain data and vegetation data, and converting the terrain data and the vegetation data into numerical codes;
S3: performing dimension reduction on the preprocessed data, which comprise soil type, texture, water content, organic matter content, pH value, terrain data, vegetation data and humidity data, using principal component analysis (PCA), adding terrain slope and aspect features and vegetation seasonal variation features, and selecting the features used for machine learning (feature engineering);
S4: dividing the dimension-reduced soil type, texture, water content, organic matter content, pH value, terrain data, vegetation data and humidity data into a training set and a test set, wherein 85% of the data form the training set and 15% form the test set;
S5: establishing an SVM model using the machine learning SVM algorithm, training it on the training set, and optimizing the model through the regularization parameter C and the bandwidth parameter gamma of the Gaussian kernel function;
S6: evaluating the performance of the model on the test set using the root mean square error (RMSE) and the coefficient of determination (R-squared);
S7: after the model passes evaluation, using the model to predict and model soil data of a new southern hilly area.
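Steps S4 and S6 can be illustrated with a minimal sketch using only the Python standard library. The function names, the fixed random seed and the sample values are assumptions made for illustration; the claim itself does not say how the 85/15 split is drawn.

```python
import math
import random


def split_train_test(rows, train_fraction=0.85, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    samples = list(range(100))  # stand-in for 100 preprocessed soil records
    train, test = split_train_test(samples)
    print(len(train), len(test))

    observed = [5.1, 4.8, 6.0, 5.5]   # made-up pH values, illustration only
    predicted = [5.0, 5.0, 5.8, 5.6]  # made-up model outputs
    print(rmse(observed, predicted), r_squared(observed, predicted))
```

A lower RMSE and an R-squared closer to 1 indicate a better fit on the test set. Note that `r_squared` is undefined when all observed values are identical.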
2. The machine learning-based three-dimensional modeling method for southern hilly soil according to claim 1, wherein designing the sampling network according to the slope and aspect characteristics of the terrain and sampling soil in different seasons in S1 comprises:
performing terrain analysis on the area to be sampled using a digital elevation model and dividing it into slope intervals, wherein a slope of 0-15 degrees is gentle, 15-30 degrees is moderate, and more than 30 degrees is steep; customizing the sampling grid density and layout according to the terrain features, and increasing the sampling point density in steeper areas;
arranging sampling points on the different aspects of the hilly terrain, namely north-south slopes and east-west slopes, in areas with a marked difference between shady and sunny slopes, and at typical micro-terrain features;
making different sampling plans for the rainy season and the dry season in the south, capturing the saturated state of soil moisture in the rainy season and the moisture-holding capacity in the dry season, as well as the decomposition of soil organic matter and the cycling of nutrient elements in different seasons, and arranging follow-up sampling after extreme weather events such as typhoons and rainstorms.
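The slope intervals in this claim can be sketched as follows. This is a minimal illustration: the central-difference slope estimate, the function names, and the choice to assign exactly 15 degrees to "gentle" and exactly 30 degrees to "moderate" are assumptions, since the claim does not state how the interval boundaries are handled or how slope is derived from the digital elevation model.

```python
import math


def slope_degrees(dem, row, col, cell_size):
    """Slope of an interior DEM cell from central differences, in degrees.

    dem is a list of rows of elevations; cell_size uses the same unit.
    """
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2.0 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))


def slope_class(slope):
    """Map a slope in degrees to the three intervals named in the claim."""
    if slope < 0:
        raise ValueError("slope must be non-negative")
    if slope <= 15:
        return "gentle"
    if slope <= 30:
        return "moderate"
    return "steep"


if __name__ == "__main__":
    # Toy 3x3 elevation grid that rises 10 m per 10 m cell toward the bottom.
    dem = [[0, 0, 0], [10, 10, 10], [20, 20, 20]]
    angle = slope_degrees(dem, 1, 1, cell_size=10.0)
    print(angle, slope_class(angle))
```

The resulting class could then drive the sampling density, with more points placed in cells classified as steep.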
3. The machine learning-based three-dimensional modeling method for southern hilly soil according to claim 1, wherein the methods for collecting the soil type, texture, water content, organic matter content, pH value, terrain data, vegetation data and humidity data in S1 comprise:
recording the type of each soil sample according to the soil classification system as red soil, yellow soil or brown soil; describing the size and composition of the soil particles using a texture classification method and recording them as texture data, namely sandy soil, loam or silt; measuring the soil water content using the resistance method; measuring the soil organic matter content using the loss-on-ignition method; measuring the soil pH value using an electronic pH meter; acquiring digital elevation model data of the area and recording them as terrain data; acquiring remote sensing data on vegetation cover type and density and recording them as vegetation data; and acquiring climate data of the area, namely precipitation and relative humidity, and recording them as soil humidity data.
4. The machine learning-based three-dimensional modeling method for southern hilly soil according to claim 1, wherein, in S2, selecting and deleting abnormal values and error values, standardizing the water content, organic matter content and pH value data, applying a moving average and seasonal decomposition to the humidity data, normalizing the terrain data and vegetation data, and converting the terrain data and the vegetation data into numerical codes comprises:
for the pH value data, plotting a scatter diagram to visualize the data, quantitatively identifying pH data points that deviate significantly from the group using the Grubbs test, and, drawing on soil science knowledge, recognizing as normal those deviations of pH from the conventional range that are caused by soil types specific to the southern hills, namely red soil and laterite (brick-red soil) rich in iron and aluminum oxides;
converting the terrain data and vegetation data into standard normal distribution data with a mean of 0 and a variance of 1;
for the humidity data, smoothing the data using a moving average, with the window size set according to the rainfall frequency of the southern hilly area: based on 10 years of statistics, the mean and standard deviation of the annual number of rainfall days are calculated; years whose number of rainfall days is higher than the mean plus one standard deviation are defined as having a higher rainfall frequency, and years whose number of rainfall days is lower than the mean minus one standard deviation as having a lower rainfall frequency; the window is set to 7 days for a higher rainfall frequency, 21 days for a lower rainfall frequency, and 14 days otherwise; the moving average algorithm is applied to the humidity data sequence of each sampling point, the mean of the humidity data within the window-size number of days before and after each data point is calculated, and the original data point is replaced by that mean;
performing seasonal decomposition on the smoothed humidity data: the soil humidity data are divided by year, each year's data are decomposed into a trend term, a seasonal term and a random term using the seasonal decomposition method X-13ARIMA-SEATS, the seasonal term is processed by identifying and subtracting the humidity peaks caused by seasonal rainfall, and the processed seasonal term is recombined with the trend term and the random term to generate a seasonally adjusted humidity data sequence;
the terrain data, vegetation data and humidity data are normalized by Min-Max normalization, $x_{\text{normalized}} = \frac{x - \min(x)}{\max(x) - \min(x)}$, where x is the raw data, x_normalized is the normalized data, max(x) is the maximum value in the raw data, and min(x) is the minimum value in the raw data;
The soil type and texture data are converted into numerical codes, and the soil type data are respectively [1, 0], [0,1,0], [0, 1] and the soil type data are respectively [2,0,0], [0,2,0], [0, 2] in terms of red soil, yellow soil and brown soil, and the texture data are respectively [0, 0], [0, 1] in terms of sandy soil, loam and powder soil.
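Several of the preprocessing operations in this claim can be sketched with the Python standard library. The sketch rests on stated assumptions: population (not sample) standard deviation is used, the humidity series is daily, the smoothing window extends `window` days on each side and is truncated at the series ends, and categories are encoded as plain three-element one-hot vectors (the code vectors printed in the claim text appear truncated, so they are not reproduced here). All function names are illustrative. The Grubbs test and the X-13ARIMA-SEATS decomposition are not covered.

```python
import statistics


def z_score(values):
    """Standardize values to mean 0 and (population) variance 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max(values):
    """Scale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def window_days(rain_days_this_year, rain_days_by_year):
    """Pick the 7-, 14- or 21-day window from annual rainfall-day counts."""
    mean = statistics.fmean(rain_days_by_year)
    std = statistics.pstdev(rain_days_by_year)
    if rain_days_this_year > mean + std:
        return 7    # higher rainfall frequency
    if rain_days_this_year < mean - std:
        return 21   # lower rainfall frequency
    return 14       # all remaining years


def moving_average(series, window):
    """Replace each point by the mean of the points within `window` days
    before and after it; the window is truncated at the series ends."""
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - window)
        hi = min(len(series), i + window + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def one_hot(label, categories):
    """Encode a category (e.g. soil type or texture) as a one-hot list."""
    return [1 if label == c else 0 for c in categories]


if __name__ == "__main__":
    print(min_max([2, 4, 6]))
    print(moving_average([1, 2, 3, 4, 5], 1))
    print(one_hot("yellow soil", ["red soil", "yellow soil", "brown soil"]))
```

Both `z_score` and `min_max` divide by zero when every value is identical, so constant columns would need to be handled separately.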
5. The machine learning-based three-dimensional modeling method for southern hilly soil according to claim 1, wherein the dimension reduction using principal component analysis (PCA) in S3 comprises the following steps:
S31: calculating the covariance matrix of the preprocessed data;
S32: performing eigenvalue decomposition on the covariance matrix to obtain the eigenvalues and the corresponding eigenvectors;
S33: sorting the eigenvalues by magnitude and selecting the eigenvectors corresponding to the k largest eigenvalues, wherein k is chosen based on the cumulative contribution rate of the eigenvalues;
S34: linearly transforming the original data with the selected eigenvectors, mapping the data into a new low-dimensional space.
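Steps S31 to S34 can be sketched in plain Python as below. Two points are assumptions rather than part of the claim: the eigenpairs are obtained by power iteration with deflation (a simple stand-in for a full eigenvalue decomposition, which a numerical library would normally provide), and the cumulative contribution threshold of 0.85 is a hypothetical default, since the claim does not give a value. The data are mean-centered before projection.

```python
import math


def covariance_matrix(rows):
    """S31: sample covariance matrix; also returns the centered rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    return cov, centered


def top_eigenpairs(matrix, k, iterations=500):
    """S32: largest k eigenpairs of a symmetric positive semi-definite
    matrix, found in descending order by power iteration with deflation."""
    d = len(matrix)
    work = [row[:] for row in matrix]
    pairs = []
    for _ in range(k):
        start_norm = math.sqrt(sum((j + 1) ** 2 for j in range(d)))
        vec = [(j + 1) / start_norm for j in range(d)]
        for _ in range(iterations):
            nxt = [sum(work[a][b] * vec[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0.0:
                break
            vec = [x / norm for x in nxt]
        value = sum(vec[a] * sum(work[a][b] * vec[b] for b in range(d))
                    for a in range(d))
        pairs.append((value, vec))
        for a in range(d):          # deflate: remove the component just found
            for b in range(d):
                work[a][b] -= value * vec[a] * vec[b]
    return pairs


def pca(rows, threshold=0.85):
    """S33 and S34: keep the fewest components whose cumulative contribution
    rate reaches `threshold`, then project the centered data onto them."""
    cov, centered = covariance_matrix(rows)
    d = len(cov)
    pairs = top_eigenpairs(cov, d)
    total = sum(value for value, _ in pairs)
    k, cumulative = 0, 0.0
    while k < d and cumulative / total < threshold:
        cumulative += pairs[k][0]
        k += 1
    components = [vec for _, vec in pairs[:k]]
    projected = [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
                 for c in centered]
    return projected, components, k
```

For data lying on a straight line, such as `[[1, 1], [2, 2], [3, 3], [4, 4]]`, a single component should capture essentially all of the variance, so `pca` keeps one component.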
6. The machine learning-based three-dimensional modeling method for southern hilly soil according to claim 1, wherein adding the terrain slope and aspect features and the vegetation seasonal variation features and selecting the features for machine learning in S3 comprises:
acquiring from the digital elevation model a raster elevation data set that covers the sampled area and contains geographic coordinates and the corresponding elevation values, checking the data quality and correcting abnormal values, calculating the slope and the aspect of each grid cell using GIS software, spatially registering the calculated slope and aspect raster data with the existing soil attribute data, extracting the slope and aspect values corresponding to each soil sample point, and adding them to the feature matrix as additional features;
performing time series analysis on the normalized vegetation data, extracting the periodicity and seasonal trend through fast Fourier transform (FFT) analysis, and constructing, based on the analysis results, key features reflecting seasonal change, namely the seasonal mean, maximum, minimum, peak-valley difference and seasonal index.
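The seasonal feature construction for the vegetation series might look like the sketch below. It uses a direct discrete Fourier transform, which yields the same spectrum as an FFT but is slower, to find the dominant period. The feature definitions are assumptions, because the claim names the features without defining them: here the values are averaged by position within the cycle, and the seasonal index is taken as each position's mean divided by the overall mean. The series is assumed to cover at least one full period and to have a non-zero mean.

```python
import cmath
import math


def dominant_period(series):
    """Period (in samples) of the strongest non-constant DFT component."""
    n = len(series)
    mean = sum(series) / n
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum((x - mean) * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(series))
        if abs(coeff) > best_power:
            best_k, best_power = k, abs(coeff)
    return n / best_k


def seasonal_features(series, period):
    """Average the series by position within the cycle and summarize it."""
    period = int(round(period))
    buckets = [[] for _ in range(period)]
    for t, x in enumerate(series):
        buckets[t % period].append(x)
    profile = [sum(b) / len(b) for b in buckets]
    overall = sum(series) / len(series)
    return {
        "seasonal_mean": sum(profile) / period,
        "max": max(profile),
        "min": min(profile),
        "peak_valley_difference": max(profile) - min(profile),
        "seasonal_index": [p / overall for p in profile],
    }


if __name__ == "__main__":
    # Synthetic monthly vegetation index with a 12-month cycle over 3 years.
    series = [0.5 + 0.3 * math.sin(2 * math.pi * t / 12) for t in range(36)]
    period = dominant_period(series)
    print(period, seasonal_features(series, period)["peak_valley_difference"])
```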
7. The machine learning-based three-dimensional modeling method for southern hilly soil according to claim 1, wherein optimizing the model in S5 through the regularization parameter C and the bandwidth parameter γ of the Gaussian kernel function comprises:
the Gaussian kernel function is $K(x, x_i) = \exp\left(-\gamma \lVert x - x_i \rVert^2\right)$, where x is one sample data point for which the correlation is to be calculated, x_i is another sample data point in the sample data set, $\lVert x - x_i \rVert$ represents the Euclidean distance, and γ is the bandwidth parameter of the Gaussian kernel function; different values of C and γ are tried through a grid search, and the parameter combination with the best performance is selected.
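A minimal sketch of the kernel and the parameter search follows, assuming the Gaussian kernel takes the common form exp(-γ‖x − x_i‖²). The `score_fn` callable is hypothetical: it stands for whatever routine trains an SVM with a given (C, γ) pair and returns a validation score where larger is better (for example, negative RMSE). The toy score in the demo is made up so that the search has a known optimum and does not train a model.

```python
import itertools
import math


def gaussian_kernel(x, x_i, gamma):
    """K(x, x_i) = exp(-gamma * ||x - x_i||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, x_i))
    return math.exp(-gamma * squared_distance)


def grid_search(score_fn, c_values, gamma_values):
    """Try every (C, gamma) pair and return the best pair and its score.

    score_fn(C, gamma) is supplied by the caller (hypothetical here) and
    must return a score where larger means better.
    """
    best_params, best_score = None, -math.inf
    for c, gamma in itertools.product(c_values, gamma_values):
        score = score_fn(c, gamma)
        if score > best_score:
            best_params, best_score = (c, gamma), score
    return best_params, best_score


if __name__ == "__main__":
    # Toy score peaking at C = 10, gamma = 0.01; a real score_fn would
    # train and validate an SVM instead.
    def toy_score(c, gamma):
        return -((math.log10(c) - 1) ** 2 + (math.log10(gamma) + 2) ** 2)

    print(grid_search(toy_score, [0.1, 1, 10, 100], [0.001, 0.01, 0.1, 1]))
    print(gaussian_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5))
```

Searching C and γ on a logarithmic grid, as in the demo, is a common choice because both parameters typically matter over several orders of magnitude.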
CN202410881760.1A 2024-07-03 2024-07-03 Machine learning-based three-dimensional modeling method for soil of south hills Pending CN118429564A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410881760.1A CN118429564A (en) 2024-07-03 2024-07-03 Machine learning-based three-dimensional modeling method for soil of south hills

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410881760.1A CN118429564A (en) 2024-07-03 2024-07-03 Machine learning-based three-dimensional modeling method for soil of south hills

Publications (1)

Publication Number Publication Date
CN118429564A true CN118429564A (en) 2024-08-02

Family

ID=92310727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410881760.1A Pending CN118429564A (en) 2024-07-03 2024-07-03 Machine learning-based three-dimensional modeling method for soil of south hills

Country Status (1)

Country Link
CN (1) CN118429564A (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR3109635A1 (en) * 2020-04-27 2021-10-29 IFP Energies Nouvelles Method of detecting at least one geological component of a rock sample
CN115266612A (en) * 2022-07-27 2022-11-01 福建农林大学 A method for mapping soil available phosphorus in cultivated land in southern hilly areas based on high-resolution environmental variables
CN116227692A (en) * 2023-02-06 2023-06-06 中国科学院生态环境研究中心 Crop heavy metal enrichment risk quantification method, system and storable medium
US11704576B1 (en) * 2020-01-29 2023-07-18 Arva Intelligence Corp. Identifying ground types from interpolated covariates
CN116773961A (en) * 2023-06-16 2023-09-19 广西电网有限责任公司电力科学研究院 Transmission line corrosion detection method based on vibration signal high-frequency characteristic analysis
CN117036088A (en) * 2023-08-21 2023-11-10 安阳市游园管理站 Data acquisition and analysis method for identifying growth situation of greening plants by AI
CN117312968A (en) * 2023-09-12 2023-12-29 宁夏大学 Method for predicting organic matter content of saline-alkali farmland soil
CN117390555A (en) * 2023-10-27 2024-01-12 电子科技大学 A method to realize multi-dimensional classification and prediction of debris flow disaster risk
CN117393072A (en) * 2023-10-11 2024-01-12 电子科技大学长三角研究院(湖州) XRF soil heavy metal element quantitative analysis method based on CARS-PCA-BLS
CN117688511A (en) * 2023-12-22 2024-03-12 中国科学院、水利部成都山地灾害与环境研究所 Multi-source satellite soil-water machine learning fusion method under action of geographic climate factors
CN118094170A (en) * 2024-04-29 2024-05-28 中国林业科学研究院森林生态环境与自然保护研究所(国家林业和草原局世界自然遗产保护研究中心) Coupled forest soil attribute mapping method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
安小宇; 鲁奎豪; 崔光照: "Soil moisture prediction based on a BP neural network optimized by an improved salp swarm algorithm", 中国农机化学报, no. 11, 15 November 2019 (2019-11-15) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination