CN112216356A - High-entropy alloy hardness prediction method based on machine learning
- Publication number
- CN112216356A (CN202011140018.3A)
- Authority
- CN
- China
- Prior art keywords
- entropy alloy
- hardness
- machine learning
- predicting
- features
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C60/00—Computational materials science, i.e. ICT specially adapted for investigating the physical or chemical properties of materials or phenomena associated with their design, synthesis, processing, characterisation or utilisation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/30—Prediction of properties of chemical compounds, compositions or mixtures
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/70—Machine learning, data mining or chemometrics
Abstract
A high-entropy alloy hardness prediction method based on machine learning belongs to the technical field of metal material hardness prediction and addresses the problem that traditional methods are time-consuming, labor-intensive and inaccurate when predicting and searching for high-entropy alloys with excellent performance. The method comprises the steps of: obtaining a feature data training set for predicting the hardness of the high-entropy alloy; screening the feature data to obtain an optimal feature combination; selecting a machine learning model by a ten-fold cross validation method; training the selected machine learning model with the optimal feature combination as input; and predicting the hardness of unknown high-entropy alloys with the trained model and selecting the high-entropy alloys with high predicted hardness and good prediction reliability. The existing exhaustive method must enumerate every combination of features to find the optimal one. By comparison, when high-entropy alloy hardness is predicted with a machine learning algorithm, the feature screening method of the invention not only predicts the performance of the high-entropy alloy accurately but also saves computing resources and time.
Description
Technical Field
The invention relates to the technical field of metal material hardness prediction, in particular to a high-entropy alloy hardness prediction method based on machine learning.
Background Art
Conventional metal alloys generally consist of one or two main elements and some auxiliary elements; in conventional ternary alloys, the composition tends to lie near one vertex of the triangular phase diagram. High-entropy alloys, which have attracted considerable attention in recent years, instead tend to consist of five or more elements, each with a molar fraction between 5% and 35%, and are therefore sometimes referred to as multi-principal-element alloys. The high configuration entropy of a high-entropy alloy produces a high-entropy effect; the high-entropy-effect hypothesis states that when the configuration entropy of an alloy is high, the alloy is more likely to form a solid solution phase (SS) than an intermetallic compound phase (IM) or an amorphous phase (AM). High-entropy alloys therefore tend to have superior properties, such as higher strength and hardness, better wear resistance and superior ductility.
The traditional way of searching for high-performance materials such as high-entropy alloys is to characterize material performance through experiments, theory or calculation. These methods are time-consuming and labor-intensive and are difficult to apply to high-throughput material characterization. Because high-entropy alloys have a huge composition space and microstructure space, it is very difficult to find high-entropy alloys with excellent performance by traditional methods. With the advent of artificial intelligence and the big data era, machine learning methods for finding specific materials with excellent properties are being applied to a variety of high-performance material search problems.
Predicting material properties by machine learning generally involves the following steps: data collection, feature engineering, model selection and training, error analysis, and verification. The most important part of feature engineering is feature selection, which affects the performance prediction result to a large extent, but there is no general method for selecting features suited to predicting the performance of a specific material. Finding the most suitable feature screening method is therefore particularly important when applying machine learning to predict the hardness of high-entropy alloys.
Disclosure of Invention
In view of the above problems, the invention provides a high-entropy alloy hardness prediction method based on machine learning, which addresses the problem that traditional methods are time-consuming, labor-intensive and inaccurate when predicting and searching for high-entropy alloys with excellent performance.
A high-entropy alloy hardness prediction method based on machine learning comprises the following steps:
step one, acquiring a feature data training set for predicting the hardness of the high-entropy alloy;
step two, screening the feature data to obtain an optimal feature combination;
step three, selecting a machine learning model by a ten-fold cross validation method;
step four, training the selected machine learning model with the optimal feature combination as input;
and step five, predicting the hardness of unknown high-entropy alloys with the trained model, and selecting the high-entropy alloys with high predicted hardness and good prediction reliability.
Further, in the step one, the high-entropy alloy is an Al-Co-Cr-Cu-Fe-Ni system.
Further, the features in step one include atomic radius difference, electronegativity difference, valence electron concentration, enthalpy of mixing, configuration entropy, omega parameter, lambda parameter, gamma parameter, local electronegativity mismatch, number of mobile electrons, cohesive energy, modulus mismatch, local size mismatch, energy term, Nabarro coefficient, work function, shear modulus difference, local modulus mismatch, and lattice distortion energy.
Further, screening the feature data in step two comprises first determining the number of features in the feature combination, then screening out the highly correlated features using the Pearson correlation coefficient, and finally obtaining the optimal feature combination using a genetic algorithm.
Further, the number of features in the feature combination is determined to be 3.
Further, the principle of screening with the Pearson correlation coefficient is that features whose Pearson correlation coefficient with one another is larger than 0.95 are regarded as highly correlated features, of which only one is retained.
Further, the optimal feature combination is three features of gamma parameter, electron concentration and work function.
Further, the machine learning model in the fourth step is a support vector regression algorithm model.
Furthermore, a Gaussian kernel function is adopted in the support vector regression algorithm, and Bayesian optimization is used for adjusting and optimizing the hyperparameters.
The invention has the beneficial technical effects that:
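The invention first determines the number of features in the feature combination, then screens out the highly correlated features using the Pearson correlation coefficient, and then screens with a genetic algorithm to obtain the optimal feature combination. Compared with the existing exhaustive method, which must enumerate every combination of features, the invention predicts the performance of the high-entropy alloy accurately while saving computing resources and time.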
Drawings
The invention may be better understood by referring to the following description in conjunction with the accompanying drawings, in which like reference numerals are used throughout the figures to indicate like or similar parts. The accompanying drawings, which are incorporated in and form a part of this specification, illustrate preferred embodiments of the present invention and, together with the detailed description, serve to further explain the principles and advantages of the invention.
Fig. 1 shows a schematic flow chart of a high-entropy alloy hardness prediction method based on machine learning according to an embodiment of the invention.
FIG. 2 shows a precision verification diagram of the step of determining the number of features in a feature combination in the high-entropy alloy hardness prediction method based on machine learning according to the embodiment of the invention.
FIG. 3 shows a Pearson correlation coefficient diagram between different features of a machine learning-based high-entropy alloy hardness prediction method according to an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will be described hereinafter with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual implementation are described in the specification. It will of course be appreciated that in the development of any such actual embodiment, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which will vary from one implementation to another. Moreover, it will be appreciated that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking for those of ordinary skill in the art having the benefit of this disclosure. It should be noted that, in order to avoid obscuring the present invention with unnecessary details, only the device structures and/or processing steps closely related to the solution according to the present invention are shown in the drawings, and other details not so relevant to the present invention are omitted.
Fig. 1 shows a schematic flow chart of a high-entropy alloy hardness prediction method based on machine learning according to an embodiment of the invention. As shown in fig. 1, a high-entropy alloy hardness prediction method based on machine learning comprises the following steps:
Step one, acquiring a feature data training set for predicting the hardness of the high-entropy alloy;
According to the embodiment of the invention, the high-entropy alloy system is the Al-Co-Cr-Cu-Fe-Ni system. The data set used is the set of 155 compositions in document [1], including 1 ternary alloy, 22 quaternary alloys, 95 quinary alloys and 38 senary alloys, together with the formulas of 20 physical features: atomic radius difference (δr), electronegativity difference (Δχ), valence electron concentration (VEC), mixing enthalpy (ΔH), configuration entropy (ΔS), omega parameter (Ω), lambda parameter (Λ), gamma parameter (γ), local electronegativity mismatch (D.χ), number of mobile electrons (e/a), cohesive energy (Ec), modulus mismatch (η), local size mismatch (D.r), energy term (A), Nabarro coefficient (F), work function (W), shear modulus (G), shear modulus difference (δG), local modulus mismatch (D.G) and lattice distortion energy (μ). The values of the elemental physical properties used in the formulas of the 20 physical features were collected from ten published documents, giving 155 groups of data for the 20 physical features as the training data set.
Step two, screening the characteristic data to obtain an optimal characteristic combination;
According to the embodiment of the invention, the number of features in the feature combination is determined first, the highly correlated features are then screened out using the Pearson correlation coefficient, and the remaining features are then screened using a genetic algorithm.
The feature data set includes various features related to the predicted property, but not all of them contribute to predicting the target property. For predicting the physical properties of a material, features can be classified as useful, useless or redundant. The features therefore need to be further selected; the purpose of feature selection is to remove useless and redundant features, reduce the dimension of the original data set, improve accuracy and reduce the complexity of the model.
A verification procedure is used to choose the number of physical features in the support vector regression model such that the prediction accuracy is high enough while the number of features is as small as possible, so that the trained model has low complexity and strong generalization ability when facing the unknown high-entropy alloy composition space.
First, from the full feature data set, 200 feature combinations containing different numbers of features are randomly selected; the corresponding feature data are used for training, and the accuracy is verified by ten-fold cross validation. The results are shown in fig. 2. As can be seen from fig. 2, feature combinations containing three features give adequate accuracy while keeping the number of features small. A modest loss of accuracy is accepted, because the generalization ability of the model is taken as the final optimization target in order to prevent overfitting.
Then, highly correlated features are removed using the Pearson correlation coefficient. The Pearson correlation coefficient (PCC) measures the correlation between two quantities and takes values between -1 and +1; the closer its absolute value is to 1, the stronger the correlation. The feature data contain highly correlated features, one of which can stand in for the other; such features are referred to as redundant features. To reduce computation time and improve the robustness of the model by removing irrelevant and redundant features, the Pearson correlation coefficient is computed for every pair of features, and only one of each set of highly correlated features is kept, which reduces redundant information in the subsequent modeling. Fig. 3 shows the Pearson correlation coefficients between the different features. As shown in fig. 3, features whose correlation is greater than 0.95 are regarded as highly correlated. To decide which of them to retain, each feature is evaluated on its own and the features are ranked by their importance to the model: the original data set is divided into a training set (80%) and a test set (20%) according to ten-fold cross validation, an SVR model with a Gaussian kernel is trained on a single feature of the training set, and the test error is calculated as the root mean square error (RMSE). Finally, 5 features that are highly correlated and perform poorly are removed, namely the energy term (A), the lattice distortion energy (μ), the shear modulus (G), the shear modulus difference (δG) and the electronegativity difference (Δχ), leaving 15 features.
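The pairwise screening described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the helper names `pearson` and `drop_redundant`, the toy feature columns and the single-feature RMSE values are all hypothetical, and the rule "drop the member of a correlated pair with the larger single-feature RMSE" is one plausible reading of the text.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def drop_redundant(columns, single_feature_rmse, threshold=0.95):
    """For each pair with |PCC| > threshold, drop the feature whose
    single-feature RMSE is larger (assumed tie-breaking rule)."""
    kept = dict(columns)
    names = sorted(columns)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a in kept and b in kept:
                if abs(pearson(columns[a], columns[b])) > threshold:
                    worse = a if single_feature_rmse[a] > single_feature_rmse[b] else b
                    del kept[worse]
    return sorted(kept)

# Hypothetical toy columns: delta_G is almost proportional to G.
columns = {
    "G": [1.0, 2.0, 3.0, 4.0],
    "delta_G": [2.1, 4.0, 6.2, 8.1],
    "VEC": [7.0, 6.5, 8.2, 6.9],
}
single_feature_rmse = {"G": 80.0, "delta_G": 95.0, "VEC": 70.0}  # made-up errors
kept = drop_redundant(columns, single_feature_rmse)
print(kept)
```

In this toy case only the near-duplicate column is removed; with real data the threshold of 0.95 and the ranking by single-feature error would come from the procedure described above.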
In addition, the Pearson correlation coefficient between each single feature and the hardness can be calculated; ranking these values from large to small and removing the features with low correlation to hardness yields a group of features that are highly correlated with hardness. However, the Pearson correlation coefficient describes linear dependence between variables. It works well for features with a linear relationship, but gives poor results when the relationship between the variables is non-linear.
Finally, the features are screened using a genetic algorithm. A genetic algorithm operates directly on the structural objects being optimized and does not require differentiation or the other analytical tools commonly used in mathematics to solve optimization problems. During operation, the optimization direction and the solution range of each step need not be set manually in advance; the algorithm searches for its own optimization direction and range in each iteration. Because the invention fixes the number of features to be selected, the screening works better. The population size is set to 100 and the number of iterations to 200, although stable results are already obtained after a few dozen iterations. During crossover, 10 individuals are selected from the population as one group of parents and another 10 individuals as the other group, and they are crossed one by one to generate new samples. The mutation probability is set to 0.01. Running the genetic algorithm several times gives different results, but the combination selected in every run ranks within the top 6 of the feature-combination ranking, and the first-ranked combination is also found. Whether the selection is made from all 20 features or from fewer features, every selected result lies within the top six, and the optimal feature combination is found in each case in little time, which shows that the genetic algorithm saves feature screening time while maintaining accuracy.
Specifically, the main steps of the genetic algorithm include population initialization, selection, crossing, mutation and fitness value calculation.
Population initialization: the number of selected features in the invention is three, so each individual in the initial population, i.e. each feature combination, contains exactly three features. Each individual consists of 20 sites; 3 of the 20 sites are randomly set to 1 and the remaining 17 are set to 0, so that the individual represents a feature combination containing three features. This operation is repeated to generate the initial population of the genetic algorithm, consisting of a number of individuals.
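As a minimal sketch of this encoding, the Python below builds individuals as 20-bit lists with exactly three ones. The names `random_individual` and `init_population` and the fixed seed are illustrative assumptions; the population size of 100 is the value given in the description.

```python
import random

N_FEATURES = 20   # candidate features
N_SELECTED = 3    # features per combination
POP_SIZE = 100    # population size stated in the description

def random_individual(rng):
    """Return a 20-site bit list with exactly N_SELECTED sites set to 1."""
    ones = set(rng.sample(range(N_FEATURES), N_SELECTED))
    return [1 if i in ones else 0 for i in range(N_FEATURES)]

def init_population(rng, size=POP_SIZE):
    """Build the initial population of random three-feature individuals."""
    return [random_individual(rng) for _ in range(size)]

rng = random.Random(0)  # fixed seed only for reproducibility of the sketch
population = init_population(rng)
```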
Selection: good individuals are selected from the population probabilistically, according to their fitness values and a given rule, and are then passed to crossover and mutation to generate the next-generation population. There are many selection operators; the most commonly used one is the proportional selection operator. The fitness of each individual and the total fitness of the population are calculated, and the ratio of each individual's fitness to the total fitness is taken as its relative fitness, so that the relative fitness values of all individuals sum to 1. A disc is divided into sectors in proportion to the relative fitness values, a random number between 0 and 1 is generated, and the individual is chosen according to the sector in which the random number falls. Repeating this operation generates a new population.
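The proportional (roulette-wheel) selection just described might look like the following Python sketch. The function name `roulette_select` and the toy fitness values are hypothetical. Since a lower RMSE is better, a real fitness would have to be some decreasing function of the error; how the patent performs that conversion is not stated.

```python
import bisect
import random
from itertools import accumulate

def roulette_select(population, fitness, rng, k):
    """Pick k individuals with probability proportional to fitness."""
    total = sum(fitness)
    # Upper bound of each individual's sector on the unit "disc".
    bounds = list(accumulate(f / total for f in fitness))
    picks = []
    for _ in range(k):
        r = rng.random()  # random number in [0, 1)
        idx = min(bisect.bisect_right(bounds, r), len(population) - 1)
        picks.append(population[idx])
    return picks

rng = random.Random(42)
population = ["A", "B", "C"]   # stand-ins for bit-string individuals
fitness = [1.0, 1.0, 8.0]      # made-up fitness values
parents = roulette_select(population, fitness, rng, k=10)
```

The `min(...)` guard only protects against floating-point rounding in the last cumulative bound.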
Crossover: the common crossover operator is bitwise crossover. Two individuals are selected at random from the population, a site is selected at random, and the values of all bits between that site and the last site (or, equivalently, between the first site and that site) are exchanged, producing two offspring individuals. Repeating this operation produces as many offspring as the population size, which form the new population. In the invention, to ensure that the crossover still yields feature combinations containing three features, the parent individuals to be crossed are compared site by site: the positions and number of sites at which the first individual is 1 and the second is 0 are counted, and likewise the positions and number of sites at which the first individual is 0 and the second is 1. Half of the larger of the two numbers is taken as the number of sites to exchange, and that number of sites is then crossed, so that each new individual still corresponds to three features.
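A count-preserving crossover in this spirit can be sketched in Python as below. This is an interpretation, not the patent's code: the name `crossover` is hypothetical, and rounding "half of the larger count" up (so that differing parents always exchange at least one site) is an assumption.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Exchange differing sites in matched pairs so that each child
    keeps the same number of ones as its parent."""
    pairs = list(zip(parent_a, parent_b))
    only_a = [i for i, (x, y) in enumerate(pairs) if x == 1 and y == 0]
    only_b = [i for i, (x, y) in enumerate(pairs) if x == 0 and y == 1]
    # Half of the larger count, rounded up (assumed), capped by availability.
    half = (max(len(only_a), len(only_b)) + 1) // 2
    n_swap = min(len(only_a), len(only_b), half)
    child_a, child_b = parent_a[:], parent_b[:]
    for i in rng.sample(only_a, n_swap) + rng.sample(only_b, n_swap):
        child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b

rng = random.Random(0)
a = [1, 1, 1] + [0] * 17
b = [0, 0, 0, 1, 1, 1] + [0] * 14
child_a, child_b = crossover(a, b, rng)
```

Each exchanged site where one parent has 1 and the other 0 is balanced by an exchanged site of the opposite kind, which is why the number of selected features stays at three.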
Mutation: a small value in [0, 1] is set as the mutation probability. When each individual undergoes the mutation operation, a random value in [0, 1] is drawn, and if it is smaller than the preset probability, the value of a randomly chosen site in the individual is flipped. To ensure that the individual still corresponds to three features, if a 1 site is flipped, a randomly chosen 0 site is flipped as well, and if a 0 site is flipped, a randomly chosen 1 site is flipped as well.
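The paired flip can be illustrated with a short Python sketch. The function name `mutate` is hypothetical; the default probability of 0.01 is the value given in the description.

```python
import random

def mutate(individual, rng, p_mut=0.01):
    """With probability p_mut, turn one 1 site into 0 and one 0 site into 1,
    so the number of selected features is unchanged."""
    child = individual[:]
    if rng.random() < p_mut:
        ones = [i for i, bit in enumerate(child) if bit == 1]
        zeros = [i for i, bit in enumerate(child) if bit == 0]
        child[rng.choice(ones)] = 0
        child[rng.choice(zeros)] = 1
    return child

rng = random.Random(0)
parent = [1, 1, 1] + [0] * 17
forced = mutate(parent, rng, p_mut=1.0)     # always mutates
untouched = mutate(parent, rng, p_mut=0.0)  # never mutates
```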
Fitness value calculation: in general, the fitness of an individual expresses its ability to solve the problem. The invention substitutes the feature combination represented by the individual into a ten-fold cross validation based on support vector regression and uses the root mean square error (RMSE) to evaluate the error, from which the fitness value of the individual is derived.
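One way to turn a cross-validated error into a fitness value is sketched below in Python. The mapping `1 / (1 + RMSE)`, the placeholder feature names `f0` to `f19`, and the stand-in scorer `toy_cv_rmse` are all assumptions for illustration; in practice the scorer would run the ten-fold cross-validated support vector regression on the decoded features.

```python
FEATURES = [f"f{i}" for i in range(20)]  # placeholder feature names

def decode(individual, names=FEATURES):
    """Map a bit list to the names of the selected features."""
    return [name for name, bit in zip(names, individual) if bit]

def fitness(individual, cv_rmse):
    """Higher is better: a decreasing function of the cross-validated RMSE
    (this particular mapping is an assumption)."""
    return 1.0 / (1.0 + cv_rmse(decode(individual)))

def toy_cv_rmse(selected):
    """Stand-in scorer with made-up errors; replace with real SVR cross validation."""
    return 50.0 if "f7" in selected else 100.0

good = [0] * 20
good[2] = good[7] = good[11] = 1
bad = [1, 1, 1] + [0] * 17
```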
The optimal feature combination screened by the method is three features of gamma parameter, electron concentration and work function, and is used as a physical feature combination used in subsequent machine learning.
Step three, selecting a machine learning model by a ten-fold cross validation method;
According to the embodiment of the invention, in order to verify the generalization ability of the model statistically, avoid overfitting and improve the stability of the verification result, the invention splits the data set by ten-fold cross validation and models the hardness of the high-entropy alloy with several machine learning models. The methods are trained on the high-entropy alloy training data set and compared in terms of accuracy and stability, and the support vector regression algorithm (SVR) is finally selected to predict the hardness of the high-entropy alloy accurately.
The data set is split by ten-fold cross validation, and the average of the accuracies over multiple ten-fold cross validations is taken. The accuracy is calculated as the root mean square error, with the following formula:

$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\bigl(f(x_i) - y_i\bigr)^{2}}$$

where RMSE represents the root mean square error; $m$ represents the total number of samples; $x_i$ represents the feature vector of the $i$-th sample; $y_i$ represents the true value of the $i$-th sample; and $f(x_i)$ represents the model's predicted value for the $i$-th sample.
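The RMSE and the ten-fold split can be sketched in plain Python as follows. The helper names `rmse`, `k_fold_indices` and `cross_validated_rmse`, the shuffling seed and the mean-predictor stand-in `mean_model` are assumptions; any regressor exposing a fit-then-predict callable could be passed in place of the stand-in.

```python
import math
import random

def rmse(y_true, y_pred):
    """Root mean square error over m samples."""
    m = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / m)

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def cross_validated_rmse(fit_predict, X, y, k=10):
    """Average test RMSE over the k folds."""
    scores = []
    for train, test in k_fold_indices(len(X), k):
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in test])
        scores.append(rmse([y[i] for i in test], preds))
    return sum(scores) / len(scores)

def mean_model(X_train, y_train, X_test):
    """Stand-in regressor: always predicts the training mean."""
    mu = sum(y_train) / len(y_train)
    return [mu] * len(X_test)

splits = list(k_fold_indices(155))  # 155 samples, as in the training set
```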
Step four, adopting the selected machine learning model and inputting the optimal characteristic combination to carry out model training;
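Unlike general regression methods, the support vector regression algorithm formally fits a straight line:

$$f(x) = w^{T}x + b$$

However, training does not require a perfect fit to all the provided sample points; a maximum deviation of $\varepsilon$ is allowed between the predicted value and the true hardness value, i.e. a loss is counted only when the deviation between the predicted value and the true value is greater than $\varepsilon$. Support vector regression minimizes the following loss function:

$$\min_{w,b}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{m} l_{\varepsilon}\bigl(f(x_i) - y_i\bigr)$$

where $w$ represents the slope of the line; $C$ represents the regularization parameter; and $l_{\varepsilon}$ represents the $\varepsilon$-insensitive loss function,

$$l_{\varepsilon}(z) = \begin{cases} 0, & |z| \le \varepsilon \\ |z| - \varepsilon, & \text{otherwise} \end{cases}$$

For the case in which the samples cannot be fitted well by a linear equation in the sample space, the instance vectors $x$ in the samples can be mapped to a higher dimension by a mapping $\phi(x)$. As shown in FIG. 2, the feature vectors can be fitted linearly in the higher-dimensional space, and every $x$ in the support vector machine is replaced by $\phi(x)$. Slack variables are introduced and the optimization problem is solved using the Lagrange multiplier method:

$$\max_{\alpha,\hat{\alpha}}\ \sum_{i=1}^{m}\Bigl[y_i(\hat{\alpha}_i-\alpha_i)-\varepsilon(\hat{\alpha}_i+\alpha_i)\Bigr]-\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}(\hat{\alpha}_i-\alpha_i)(\hat{\alpha}_j-\alpha_j)\,\phi(x_i)^{T}\phi(x_j) \qquad (1)$$

$$\text{s.t.}\quad \sum_{i=1}^{m}(\hat{\alpha}_i-\alpha_i)=0,\qquad 0\le\alpha_i,\ \hat{\alpha}_i\le C$$

where $m$ represents the total number of training samples, and $\alpha_i$ and $\hat{\alpha}_i$ represent the Lagrange multipliers corresponding to each sample point. In the above optimization problem, $\phi$ appears only through the inner product $\phi(x_i)^{T}\phi(x_j)$. Finding a mapping that takes the sample points to a suitable high-dimensional feature space in which they can be fitted linearly is very difficult, and the appropriate mapping $\phi(\cdot)$ is not known. For ease of computation, the application therefore uses the "kernel trick", i.e. it sets a kernel function

$$\kappa(x_i, x_j) = \phi(x_i)^{T}\phi(x_j)$$

so that the corresponding support vector regression can be learned in the high-dimensional space simply by choosing a suitable kernel function, without finding a complex mapping equation. There are many common kernel functions; by comparison, the following Gaussian kernel function is selected, where $\sigma$ is the width of the kernel:

$$\kappa(x_i, x_j) = \exp\!\left(-\frac{\lVert x_i - x_j\rVert^{2}}{2\sigma^{2}}\right)$$

After introducing the above kernel function, equation (1) becomes:

$$\max_{\alpha,\hat{\alpha}}\ \sum_{i=1}^{m}\Bigl[y_i(\hat{\alpha}_i-\alpha_i)-\varepsilon(\hat{\alpha}_i+\alpha_i)\Bigr]-\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}(\hat{\alpha}_i-\alpha_i)(\hat{\alpha}_j-\alpha_j)\,\kappa(x_i, x_j)$$

$$\text{s.t.}\quad \sum_{i=1}^{m}(\hat{\alpha}_i-\alpha_i)=0,\qquad 0\le\alpha_i,\ \hat{\alpha}_i\le C$$

The optimization problem is solved using the SMO optimization algorithm, and the Lagrange multipliers are obtained, giving the support vector regression model.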
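To make the ingredients of the model concrete, the Python sketch below evaluates the ε-insensitive loss, a Gaussian kernel in the common parameterization exp(-||x - z||² / (2σ²)), and a kernel SVR prediction of the standard form f(x) = Σ c_i κ(x, x_i) + b. The function names, the coefficients `dual_coefs`, the support vectors and the bias are made-up illustrative values; solving for the multipliers (for example with SMO) is not shown.

```python
import math

def eps_insensitive(residual, eps):
    """Zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(residual) - eps)

def gaussian_kernel(x, z, sigma):
    """Gaussian (RBF) kernel with width sigma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """Kernel expansion: sum of coefficient-weighted kernel values plus bias."""
    return sum(c * gaussian_kernel(x, sv, sigma)
               for c, sv in zip(dual_coefs, support_vectors)) + b

# Hypothetical fitted quantities, for illustration only.
support_vectors = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
dual_coefs = [2.0, -1.0]   # differences of paired Lagrange multipliers
bias = 0.5
y_hat = svr_predict([0.0, 0.0, 0.0], support_vectors, dual_coefs, bias, sigma=1.0)
```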
And step five, predicting the hardness of the unknown high-entropy alloy according to the trained model, and selecting the high-entropy alloy with high predicted hardness and good predicted reliability.
According to the embodiment of the invention, in order to verify the necessity of the feature screening method and the validity of its result, all combinations of three features out of the 20 features, 1140 feature combinations in total, are each substituted into the support vector regression algorithm for training and testing. The test error of each feature combination is calculated using ten-fold cross validation and the root mean square error, and the feature combinations are ranked by test error from small to large. This ranking serves as the standard for judging the prediction accuracy of the feature screening method. Table 1 shows the 20 feature combinations with the highest prediction accuracy found by the exhaustive method.
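The size of the exhaustive search space can be checked with a few lines of Python. The ranking helper `rank_combinations` and the toy error function `toy_error` are hypothetical; a real run would score each triple with the cross-validated RMSE of the support vector regression model.

```python
from itertools import combinations
from math import comb

N_FEATURES = 20

# Every unordered choice of 3 features out of 20.
all_triples = list(combinations(range(N_FEATURES), 3))

def rank_combinations(cv_error):
    """Sort all triples by test error, smallest (most accurate) first."""
    return sorted(all_triples, key=cv_error)

def toy_error(triple):
    """Stand-in error with made-up values; replace with real cross validation."""
    return sum(triple)

top20 = rank_combinations(toy_error)[:20]
print(len(all_triples), comb(N_FEATURES, 3))
```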
TABLE 1
As can be seen from Table 1, compared with selecting the optimal feature combination by the exhaustive method, the method of the invention first determines the number of features in the feature combination, then screens features with high correlation using the Pearson correlation coefficient, and finally screens the features with a genetic algorithm. It is accurate and also saves computing resources and time.
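The Pearson screening step can be illustrated with a short sketch. It assumes the correlation is computed between pairs of feature columns and uses the 0.95 threshold given in the claims; how the flagged features are then used is left to the caller. The function names and the toy columns `f1`, `f2`, `f3` are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def highly_correlated_pairs(columns, threshold=0.95):
    """Return (name_a, name_b, r) for every feature pair with |r| > threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) > threshold:
                pairs.append((a, b, r))
    return pairs


# Toy feature columns; the values are illustrative only.
columns = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 4.0, 6.0, 8.0, 10.5],   # nearly proportional to f1
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],    # weakly related to f1
}
flagged = highly_correlated_pairs(columns)
print([(a, b) for a, b, _ in flagged])
```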
The Al-Co-Cr-Cu-Fe-Ni high-entropy alloy test set used by the invention comes from document [1], which provides a data set of 614143 Al-Co-Cr-Cu-Fe-Ni high-entropy alloy compositions with predicted hardness values above 750 HV, found by searching 1895147 specific compositions of the system. The prediction search is carried out within this data set so that promising specific compositions can be found more effectively. The higher-hardness compositions found during the prediction search are shown in Table 2.
TABLE 2
The proportion of each element in the high-entropy alloys shown in Table 2 is the corresponding molar ratio. The high-entropy alloy with the highest predicted hardness is Al0.41Co0.2Cr0.18Fe0.16Ni0.05, with a predicted hardness of 791.532194 HV.
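As a quick consistency check on the molar ratios, the composition string can be parsed and summed. The helper `parse_composition` below is a hypothetical illustration, not part of the patent; it assumes every element symbol is followed directly by its molar fraction.

```python
import re


def parse_composition(formula):
    """Map each element symbol to the molar fraction written after it."""
    pairs = re.findall(r"([A-Z][a-z]?)(\d*\.?\d+)", formula)
    return {element: float(amount) for element, amount in pairs}


best = parse_composition("Al0.41Co0.2Cr0.18Fe0.16Ni0.05")
total = sum(best.values())
print(best, round(total, 6))
```

The fractions sum to one, and the parsed composition contains no Cu even though the search covers the Al-Co-Cr-Cu-Fe-Ni system.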
Based on a comprehensive comparison of prediction accuracy and time cost, the invention concludes that feature screening by genetic algorithm is the optimal feature screening method for high-entropy alloy hardness prediction. Repeated feature screening with the genetic algorithm yields the feature combination with the best prediction performance: gamma parameter, electron concentration and work function. Compared with selecting the optimal feature combination from the 20 physical features by the exhaustive method, the invention first determines the number of features in the feature combination, then screens features with high correlation using the Pearson correlation coefficient, and finally screens the features with the genetic algorithm, which saves computing resources and time. The overall process, in which feature combinations are screened first and a machine learning model such as Gaussian-kernel support vector regression is then used to predict high-entropy alloy hardness and to search the virtual space of high-entropy alloys, is efficient and stable. The invention thus constructs a complete and general framework for predicting material properties by machine learning, provides a method for searching a large virtual space for specific compositions of high-performance materials, and successfully predicts and recommends several high-entropy alloys with high hardness together with their predicted hardness values.
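The genetic-algorithm screening summarized above can be sketched as follows. This is an illustrative sketch, not the patent's implementation: the operators (tournament selection, gene-pool crossover, single-gene mutation, elitism), the parameter values and the toy fitness function are all assumptions. In practice the fitness would be the cross-validated error of the regression model trained on the candidate feature combination.

```python
import random


def ga_select(n_features, fitness, size=3, pop_size=20, generations=30,
              mutation_rate=0.2, seed=0):
    """Search for a `size`-feature subset with low `fitness` (lower is better)."""
    rng = random.Random(seed)
    cache = {}

    def cached_fitness(individual):
        # Each distinct feature combination is evaluated only once.
        if individual not in cache:
            cache[individual] = fitness(individual)
        return cache[individual]

    def random_individual():
        return tuple(sorted(rng.sample(range(n_features), size)))

    def tournament(population):
        # Best of three randomly drawn individuals.
        return min(rng.sample(population, 3), key=cached_fitness)

    def crossover(a, b):
        # Child draws its genes from the union of both parents' genes.
        pool = sorted(set(a) | set(b))
        return tuple(sorted(rng.sample(pool, size)))

    def mutate(individual):
        if rng.random() >= mutation_rate:
            return individual
        genes = list(individual)
        unused = [f for f in range(n_features) if f not in genes]
        genes[rng.randrange(size)] = rng.choice(unused)
        return tuple(sorted(genes))

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        children = [min(population, key=cached_fitness)]  # elitism
        while len(children) < pop_size:
            parent_a, parent_b = tournament(population), tournament(population)
            children.append(mutate(crossover(parent_a, parent_b)))
        population = children
    best = min(population, key=cached_fitness)
    return best, cached_fitness(best), len(cache)


# Toy fitness: number of selected features outside a hypothetical target set.
TARGET = {2, 7, 15}


def toy_fitness(individual):
    return len(individual) - len(set(individual) & TARGET)


best, score, evaluated = ga_select(20, toy_fitness)
print(best, score, evaluated)
```

The third return value counts the distinct combinations actually evaluated, which can be compared with the 1140 evaluations that the exhaustive method needs for 20 features.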
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention has been disclosed in an illustrative rather than a restrictive sense, and the scope of the present invention is defined by the appended claims.
Claims (9)
1. A high-entropy alloy hardness prediction method based on machine learning, characterized by comprising the following steps:
step one, acquiring a feature data training set for predicting the hardness of the high-entropy alloy;
step two, screening the feature data to obtain an optimal feature combination;
step three, selecting a machine learning model by a ten-fold cross-validation method;
step four, training the selected machine learning model with the optimal feature combination as input;
step five, predicting the hardness of unknown high-entropy alloys with the trained model, and selecting the high-entropy alloys with high predicted hardness and good prediction reliability.
2. The method for predicting the hardness of the high-entropy alloy based on machine learning according to claim 1, wherein the high-entropy alloy in the first step is an Al-Co-Cr-Cu-Fe-Ni system.
3. The method for predicting the hardness of the high-entropy alloy based on machine learning according to claim 1, wherein the features in step one comprise atomic radius difference, electronegativity difference, valence electron concentration, mixing enthalpy, configurational entropy, omega parameter, lambda parameter, gamma parameter, local electronegativity mismatch, number of itinerant electrons, cohesive energy, modulus mismatch, local size mismatch, energy term, Nabarro coefficient, work function, shear modulus difference, local modulus mismatch and lattice distortion energy.
4. The method for predicting the hardness of the high-entropy alloy based on the machine learning of claim 1, wherein the step two of screening the feature data comprises the steps of firstly determining the number of features in a feature combination, then screening features with high correlation by using a Pearson correlation coefficient, and finally obtaining an optimal feature combination by using a genetic algorithm.
5. The method for predicting the hardness of the high-entropy alloy based on machine learning according to claim 4, wherein the number of the features in the feature combination is determined to be 3.
6. The method for predicting the hardness of the high-entropy alloy based on machine learning according to claim 4, wherein the criterion for screening features with high correlation using the Pearson correlation coefficient is to retain features whose Pearson correlation coefficient is greater than 0.95 as the high-correlation features.
7. The method for predicting the hardness of the high-entropy alloy based on machine learning according to any one of claims 1 to 4, wherein the optimal feature combination is three features of gamma parameter, electron concentration and work function.
8. The method for predicting the hardness of the high-entropy alloy based on machine learning according to claim 1, wherein the machine learning model in the fourth step is a support vector regression algorithm model.
9. The method for predicting the hardness of the high-entropy alloy based on the machine learning as claimed in claim 8, wherein a Gaussian kernel function is adopted in the support vector regression algorithm, and Bayesian optimization is used for the optimization of the hyperparameters.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011140018.3A CN112216356A (en) | 2020-10-22 | 2020-10-22 | High-entropy alloy hardness prediction method based on machine learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112216356A true CN112216356A (en) | 2021-01-12 |
Family
ID=74054872
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011140018.3A Pending CN112216356A (en) | 2020-10-22 | 2020-10-22 | High-entropy alloy hardness prediction method based on machine learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112216356A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104834975A (en) * | 2015-05-13 | 2015-08-12 | 国家电网公司 | Power network load factor prediction method based on intelligent algorithm optimization combination |
CN110442954A (en) * | 2019-07-31 | 2019-11-12 | 东北大学 | The super high strength stainless steel design method of lower machine learning is instructed based on physical metallurgy |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112802563A (en) * | 2021-01-19 | 2021-05-14 | 哈尔滨理工大学 | Machine learning-based two-dimensional transition metal sulfide band gap and energy band structure prediction method and device |
CN112802563B (en) * | 2021-01-19 | 2024-04-12 | 哈尔滨理工大学 | Two-dimensional transition metal sulfide band gap and energy band structure prediction method and device based on machine learning |
CN113870957A (en) * | 2021-10-22 | 2021-12-31 | 中国航发北京航空材料研究院 | Eutectic high-entropy alloy component design method and device based on machine learning |
CN114464274A (en) * | 2022-01-14 | 2022-05-10 | 哈尔滨理工大学 | High-entropy alloy hardness prediction method based on machine learning and improved genetic algorithm feature screening |
CN114580272A (en) * | 2022-02-16 | 2022-06-03 | 昆明贵金属研究所 | Design method for simultaneously optimizing conductivity and hardness of multi-element electric contact alloy |
CN114613449A (en) * | 2022-03-01 | 2022-06-10 | 哈尔滨工业大学(深圳) | Novel amorphous alloy design method based on machine learning |
CN114613449B (en) * | 2022-03-01 | 2024-08-20 | 哈尔滨工业大学(深圳) | Novel amorphous alloy design method based on machine learning |
CN114613456B (en) * | 2022-03-07 | 2023-04-28 | 哈尔滨理工大学 | High-entropy alloy hardness prediction method based on improved density peak clustering algorithm |
CN114613456A (en) * | 2022-03-07 | 2022-06-10 | 哈尔滨理工大学 | High-entropy alloy hardness prediction method based on improved density peak value clustering algorithm |
CN115394381B (en) * | 2022-08-24 | 2023-08-22 | 哈尔滨理工大学 | High-entropy alloy hardness prediction method and device based on machine learning and two-step data expansion |
CN115394381A (en) * | 2022-08-24 | 2022-11-25 | 哈尔滨理工大学 | High-entropy alloy hardness prediction method and device based on machine learning and two-step data expansion |
CN115579091A (en) * | 2022-11-09 | 2023-01-06 | 广东海洋大学 | High-entropy alloy component design method based on machine learning and multi-performance collaborative optimization |
CN116434880A (en) * | 2023-03-06 | 2023-07-14 | 哈尔滨理工大学 | High-entropy alloy hardness prediction method based on fuzzy self-consistent clustering integration |
CN116434880B (en) * | 2023-03-06 | 2023-09-08 | 哈尔滨理工大学 | High-entropy alloy hardness prediction method based on fuzzy self-consistent clustering integration |
CN116720058A (en) * | 2023-04-28 | 2023-09-08 | 贵研铂业股份有限公司 | Method for realizing key feature combination screening of machine learning candidate features |
CN116597923A (en) * | 2023-05-19 | 2023-08-15 | 小米汽车科技有限公司 | Model generation method, material information determination method, device, equipment and medium |
CN117789875A (en) * | 2023-12-14 | 2024-03-29 | 广东海洋大学 | Data driving method for designing high-strength high-entropy alloy and application |
CN117789875B (en) * | 2023-12-14 | 2024-05-10 | 广东海洋大学 | Data driving method for designing high-strength high-entropy alloy and application |
CN117874712A (en) * | 2024-03-11 | 2024-04-12 | 四川省光为通信有限公司 | Single-mode non-airtight optical module performance prediction method based on Gaussian process regression |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||