CN112216356A - High-entropy alloy hardness prediction method based on machine learning - Google Patents

Info

Publication number
CN112216356A
Authority
CN
China
Prior art keywords
entropy alloy
hardness
machine learning
predicting
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011140018.3A
Other languages
Chinese (zh)
Inventor
邹瑞
李述
王鹏
李帅
杨致远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin University of Science and Technology
Priority to CN202011140018.3A
Publication of CN112216356A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C60/00 Computational materials science, i.e. ICT specially adapted for investigating the physical or chemical properties of materials or phenomena associated with their design, synthesis, processing, characterisation or utilisation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics

Landscapes

  • Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A machine-learning-based method for predicting the hardness of high-entropy alloys belongs to the technical field of metal material hardness prediction and addresses the problem that traditional methods are time-consuming, labor-intensive, and inaccurate when predicting and searching for high-entropy alloys with excellent properties. The method comprises: obtaining a feature data training set for predicting the hardness of high-entropy alloys; screening the feature data to obtain an optimal feature combination; selecting a machine learning model by ten-fold cross-validation; training the selected machine learning model with the optimal feature combination as input; and predicting the hardness of unknown high-entropy alloys with the trained model and selecting the alloys with high predicted hardness and good prediction reliability. Compared with the existing exhaustive feature screening method, which must enumerate every combination of features to find the optimal one, the method not only predicts the properties of high-entropy alloys accurately but also saves computing resources and time when predicting high-entropy alloy hardness with a machine learning algorithm.

Description

High-entropy alloy hardness prediction method based on machine learning
Technical Field
The invention relates to the technical field of metal material hardness prediction, in particular to a high-entropy alloy hardness prediction method based on machine learning.
Background Art
Conventional metal alloys generally consist of one or two principal elements plus some minor elements; in a conventional ternary alloy, the composition tends to lie near one vertex of the triangular phase diagram. High-entropy alloys, which have attracted considerable attention in recent years, instead tend to consist of five or more elements, each with a molar fraction between 5% and 35%, and are therefore sometimes called multi-principal-element alloys. Because of their high configurational entropy, these alloys exhibit a high-entropy effect: the high-entropy hypothesis holds that when the configurational entropy of an alloy is high, the alloy is more likely to form a solid solution phase (SS) than an intermetallic compound phase (IM) or an amorphous phase (AM). High-entropy alloys therefore tend to have superior properties, such as higher strength and hardness, better wear resistance, and superior ductility.
Traditional approaches to finding high-performance materials such as high-entropy alloys usually characterize material properties through experiment, theory, or computation. These approaches are time-consuming and labor-intensive and are poorly suited to high-throughput material characterization. Because high-entropy alloys span enormous composition and microstructure spaces, searching for alloys with excellent properties by traditional means is very difficult. With the arrival of artificial intelligence and the big-data era, machine learning methods are increasingly being applied to the search for specific materials with excellent properties.
Predicting material properties with machine learning usually involves the following steps: data collection, feature engineering, model selection and training, error analysis, and validation. The most important part of feature engineering is feature selection, which strongly affects the prediction result, yet there is no general method for selecting the features suited to predicting a specific property of a specific material. Finding the most suitable feature screening method is therefore particularly important when machine learning is applied to predicting the hardness of high-entropy alloys.
Disclosure of Invention
In view of the above problems, the invention provides a high-entropy alloy hardness prediction method based on machine learning, which addresses the problem that traditional methods are time-consuming, labor-intensive, and inaccurate when predicting and searching for high-entropy alloys with excellent properties.
A high-entropy alloy hardness prediction method based on machine learning comprises the following steps:
step one, acquiring a feature data training set for predicting the hardness of high-entropy alloys;
step two, screening the feature data to obtain an optimal feature combination;
step three, selecting a machine learning model by ten-fold cross-validation;
step four, training the selected machine learning model with the optimal feature combination as input;
and step five, predicting the hardness of unknown high-entropy alloys with the trained model, and selecting the high-entropy alloys with high predicted hardness and good prediction reliability.
Further, the high-entropy alloy in step one belongs to the Al-Co-Cr-Cu-Fe-Ni system.
Further, the features in step one include atomic radius difference, electronegativity difference, valence electron concentration, mixing enthalpy, configurational entropy, omega parameter, lambda parameter, gamma parameter, local electronegativity mismatch, number of itinerant electrons, cohesive energy, modulus mismatch, local size mismatch, energy term, Nabarro coefficient, work function, shear modulus difference, local modulus mismatch, and lattice distortion energy.
Further, screening the feature data in step two comprises first determining the number of features in the feature combination, then screening out highly correlated features using the Pearson correlation coefficient, and finally obtaining the optimal feature combination with a genetic algorithm.
Further, the number of features in the feature combination is determined to be 3.
Further, when screening with the Pearson correlation coefficient, features whose pairwise Pearson correlation coefficient is greater than 0.95 are identified as highly correlated features.
Further, the optimal feature combination consists of three features: gamma parameter, electron concentration, and work function.
Further, the machine learning model in step four is a support vector regression model.
Furthermore, the support vector regression algorithm uses a Gaussian kernel function, and its hyperparameters are tuned by Bayesian optimization.
The invention has the beneficial technical effects that:
The invention first determines the number of features in the feature combination, then screens out highly correlated features using the Pearson correlation coefficient, and then applies a genetic algorithm to obtain the optimal feature combination. Compared with selecting the optimal feature combination by the exhaustive method, this approach is accurate and saves computing resources and time.
Drawings
The invention may be better understood by referring to the following description in conjunction with the accompanying drawings, in which like reference numerals are used throughout the figures to indicate like or similar parts. The accompanying drawings, which are incorporated in and form a part of this specification, illustrate preferred embodiments of the present invention and, together with the detailed description, serve to further explain the principles and advantages of the invention.
FIG. 1 shows a schematic flow chart of a high-entropy alloy hardness prediction method based on machine learning according to an embodiment of the invention.
FIG. 2 shows the accuracy verification plot used to determine the number of features in a feature combination in the high-entropy alloy hardness prediction method based on machine learning according to the embodiment of the invention.
FIG. 3 shows the Pearson correlation coefficients between different features in the high-entropy alloy hardness prediction method based on machine learning according to the embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention will be described hereinafter with reference to the accompanying drawings. In the interest of clarity and conciseness, not all features of an actual implementation are described in the specification. It will of course be appreciated that in the development of any such actual embodiment, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which will vary from one implementation to another. Moreover, it will be appreciated that such a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking for those of ordinary skill in the art having the benefit of this disclosure. It should be noted that, in order to avoid obscuring the present invention with unnecessary details, only the device structures and/or processing steps closely related to the solution according to the present invention are shown in the drawings, and other details not so relevant to the present invention are omitted.
FIG. 1 shows a schematic flow chart of a high-entropy alloy hardness prediction method based on machine learning according to an embodiment of the invention. As shown in FIG. 1, the method comprises the following steps:
Step one, acquiring a feature data training set for predicting the hardness of high-entropy alloys;
According to the embodiment of the invention, the high-entropy alloy system is the Al-Co-Cr-Cu-Fe-Ni system. The data set consists of the 155 compositions reported in reference [1], comprising 1 ternary alloy, 22 quaternary alloys, 95 quinary alloys, and 38 senary alloys, together with the formulas of 20 physical features: atomic radius difference (δr), electronegativity difference (Δχ), valence electron concentration (VEC), mixing enthalpy (ΔH), configurational entropy (ΔS), Ω parameter (Ω), Λ parameter (Λ), γ parameter (γ), local electronegativity mismatch (D.χ), number of itinerant electrons (e/a), cohesive energy (Ec), modulus mismatch (η), local size mismatch (D.r), energy term (A), Nabarro coefficient (F), work function (W), shear modulus (G), shear modulus difference (δG), local modulus mismatch (D.G), and lattice distortion energy (μ). The elemental property values used in the formulas of the 20 physical features were collected from ten published documents, yielding 155 groups of 20 physical features as the training data set.
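The patent text does not print the formulas of the 20 features. As an illustration of how one such feature can be computed from a composition, the sketch below assumes the standard ideal-mixing expression for configurational entropy, ΔS = -R Σ c_i ln c_i; whether reference [1] uses exactly this form is an assumption, and the function name and the equimolar example are hypothetical.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol*K)

def configurational_entropy(fractions):
    """Ideal configurational entropy -R * sum(c_i * ln c_i) for mole fractions c_i.

    Assumed formula for illustration; the patent does not state it.
    """
    total = sum(fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("mole fractions must sum to 1")
    return -R * sum(c * math.log(c) for c in fractions.values() if c > 0)

# Hypothetical equimolar five-element alloy from the Al-Co-Cr-Cu-Fe-Ni system.
equimolar = {element: 0.2 for element in ("Al", "Co", "Cr", "Fe", "Ni")}
delta_s = configurational_entropy(equimolar)
```

For an equimolar alloy of n elements this expression reduces to R ln n, which is why adding principal elements raises the configurational entropy.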
Step two, screening the characteristic data to obtain an optimal characteristic combination;
According to the embodiment of the invention, the number of features in the feature combination is determined first, then highly correlated features are screened out using the Pearson correlation coefficient, and finally the features are screened with a genetic algorithm.
The feature data set contains many features related to the property being predicted, but not all of them contribute to predicting the target property. For the prediction of a material's physical properties, features can be classified as useful, useless, or redundant. Features therefore need to be selected further; the purpose of feature selection is to remove useless and redundant features, reduce the dimension of the original data set, improve accuracy, and reduce model complexity.
A validation procedure is used to determine how many physical features the support vector regression model should include, so that the prediction accuracy is high enough while the number of features is as small as possible. The trained model then has low complexity and strong generalization ability when applied to an unknown high-entropy alloy composition space.
First, for each number of features, 200 feature combinations are randomly drawn from the full feature data set, the corresponding feature data are used for training, and the accuracy is verified by ten-fold cross-validation. The results are shown in FIG. 2. As FIG. 2 shows, combinations of three features achieve adequate accuracy while keeping the number of features small. A modest reduction in accuracy is acceptable, because the generalization ability of the model is the final optimization target and overfitting must be prevented.
Next, highly correlated features are removed using the Pearson correlation coefficient. The Pearson correlation coefficient (PCC) measures the correlation between two quantities and takes values between -1 and +1; the closer its absolute value is to 1, the stronger the correlation. When two features in the feature data are highly correlated, one can stand in for the other, and such features are called redundant features. To reduce computation time and improve model robustness by removing irrelevant and redundant features, pairwise Pearson correlation coefficients are computed and only one feature of each highly correlated group is kept, which reduces redundant information in subsequent modeling. FIG. 3 shows the Pearson correlation coefficients between the different features; features with a correlation greater than 0.95 are regarded as highly correlated. To decide which of the highly correlated features to keep, the importance of each feature to the model is ranked by evaluating the model on one feature at a time: the original data set is divided into a training set (80%) and a test set (20%), a Gaussian-kernel SVR is trained on a single feature of the training set with ten-fold cross-validation, and the test error is computed as the root mean square error (RMSE). Finally, 5 features that are highly correlated with others and perform poorly are removed, namely the energy term (A), lattice distortion energy (μ), shear modulus (G), shear modulus difference (δG), and electronegativity difference (Δχ), leaving 15 features.
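The redundancy screening described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the feature names, the toy values, and the single-feature RMSE scores are hypothetical, and the tie-breaking rule (drop the member of a correlated pair with the larger single-feature RMSE) is one reading of the text.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def drop_redundant(features, single_feature_rmse, threshold=0.95):
    """For each pair with |r| > threshold, drop the feature with the larger RMSE."""
    kept = set(features)
    names = sorted(features)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a in kept and b in kept:
                if abs(pearson(features[a], features[b])) > threshold:
                    worse = a if single_feature_rmse[a] > single_feature_rmse[b] else b
                    kept.discard(worse)
    return sorted(kept)

# Hypothetical toy data: f2 is almost proportional to f1, f3 is unrelated.
features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.1, 4.0, 6.2, 8.1, 9.9],
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
rmse_scores = {"f1": 40.0, "f2": 55.0, "f3": 60.0}  # hypothetical single-feature errors
kept = drop_redundant(features, rmse_scores)
```

Here `f1` and `f2` exceed the 0.95 threshold, so only the one with the lower single-feature error survives, while `f3` is kept because it is not strongly correlated with either.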
In addition, the Pearson correlation coefficient between each single feature and the hardness can be calculated. Ranking the features by this coefficient from largest to smallest and removing those weakly correlated with hardness yields a group of features strongly correlated with hardness. However, the Pearson correlation coefficient describes linear dependence between variables: it works well for linearly related features, but gives poor results when the relationship between the variables is nonlinear.
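A small worked example makes the limitation concrete. In the sketch below (toy data chosen for illustration, not taken from the patent), a target that depends exactly on the feature through a quadratic law has a Pearson coefficient of zero, while a linear target has a coefficient of one.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
linear = [2.0 * v + 1.0 for v in xs]   # perfectly linear in xs
quadratic = [v * v for v in xs]        # fully determined by xs, but not linearly

r_linear = pearson(xs, linear)
r_quadratic = pearson(xs, quadratic)
```

A feature like `quadratic` would be ranked as irrelevant by a purely Pearson-based filter even though it is fully determined by `xs`, which is one reason to follow the filter with a model-based search.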
Finally, the features are screened with a genetic algorithm. A genetic algorithm operates directly on the structured objects being optimized and does not require derivatives or the other mathematical machinery commonly used to solve optimization problems. The search direction and solution range of each step need not be set manually in advance; the algorithm finds them itself in each iteration. Because the invention fixes the number of features to be selected, the screening works better. The population size is set to 100 and the number of iterations to 200, although stable results are already obtained after a few dozen iterations. During crossover, 10 individuals are selected from the population as one group of parents and another 10 as the other group, and they are crossed pairwise to generate new individuals. The mutation probability is set to 0.01. Running the genetic algorithm several times gives different results, but every result ranks within the top 6 of the feature-combination ranking, and the top-ranked combination is also found. Whether the selection is made from all 20 features or from fewer features, every selected result ranks within the top six and the optimal feature combination is found in less time, which shows that the genetic algorithm saves feature screening time while preserving accuracy.
Specifically, the main steps of the genetic algorithm are population initialization, selection, crossover, mutation, and fitness calculation.
Population initialization: the number of selected features in the invention is three, so each individual in the initial population, that is, each feature combination, contains exactly three features. Each individual consists of 20 sites, one per feature; 3 of the 20 sites are randomly set to 1 and the remaining 17 are set to 0, so that the individual represents a combination of three features. Repeating this operation generates the initial population of the genetic algorithm.
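The encoding described above can be sketched in a few lines. The constant and function names are hypothetical; the population size of 100 follows the value stated earlier in the description, and the seed is arbitrary.

```python
import random

N_FEATURES = 20   # one site per physical feature
N_SELECTED = 3    # number of sites set to 1 in every individual

def random_individual(rng):
    """Return a 20-site 0/1 list with exactly three sites set to 1."""
    bits = [0] * N_FEATURES
    for index in rng.sample(range(N_FEATURES), N_SELECTED):
        bits[index] = 1
    return bits

def init_population(size, rng):
    """Build the initial population by repeating the random draw."""
    return [random_individual(rng) for _ in range(size)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
population = init_population(100, rng)
```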
Selection: superior individuals are chosen from the population probabilistically according to their fitness values, and are then passed to crossover and mutation to produce the next generation. Many selection operators exist; the most common is proportional (roulette-wheel) selection. The fitness of each individual and the total fitness of the population are calculated, and the ratio of the two is the individual's relative fitness, so that the relative fitness values of all individuals sum to 1. A disc is divided into sectors in proportion to the relative fitness values, a random number between 0 and 1 is generated, and the individual whose sector contains the random number is selected. Repeating this operation generates a new population.
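Proportional selection as described above can be sketched as follows. The function name and the toy population are hypothetical; the fitness values are assumed to be non-negative with a positive total.

```python
import random
from itertools import accumulate

def roulette_select(population, fitness, n, rng):
    """Pick n individuals with probability proportional to their fitness."""
    total = sum(fitness)
    # Upper edge of each individual's sector on a wheel of circumference 1.
    edges = list(accumulate(f / total for f in fitness))
    chosen = []
    for _ in range(n):
        r = rng.random()  # uniform in [0, 1)
        for individual, edge in zip(population, edges):
            if r < edge:
                chosen.append(individual)
                break
        else:
            # Floating-point rounding can leave the last edge slightly below 1.
            chosen.append(population[-1])
    return chosen

rng = random.Random(42)
picked = roulette_select(["a", "b", "c"], [0.0, 0.0, 1.0], 5, rng)
```

Because a lower RMSE is better, the fitness passed in would need to grow as the error shrinks; 1/RMSE is one possible choice, which the patent does not specify.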
Crossover: the common crossover operator is single-point crossover. Two individuals are selected at random from the population, a site is chosen at random, and the values of every site between that site and the last site are exchanged (or, equivalently, the part between the first site and that site), producing two offspring. Repeating this operation until as many offspring as the population size have been produced forms the new population. In the invention, to ensure that a combination of exactly three features is still produced after crossover, the parents are handled as follows: the positions and number of sites where the first parent is 1 and the second is 0 are counted, and likewise the positions and number of sites where the first parent is 0 and the second is 1; half of the larger of the two counts is taken as the number of sites to exchange, and that number of sites of each kind is then crossed, which ensures that each new individual still corresponds to three features.
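One way to read the count-preserving crossover is sketched below. This is an interpretation, not the patent's code: the function name is hypothetical, both parents are assumed to carry the same number of 1 sites, and exchanging at least one pair of sites when the parents differ is an added assumption so that small counts do not round down to no exchange at all.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Exchange sites between two parents while keeping the number of 1s fixed."""
    pairs = list(enumerate(zip(parent_a, parent_b)))
    only_a = [i for i, (a, b) in pairs if a == 1 and b == 0]
    only_b = [i for i, (a, b) in pairs if a == 0 and b == 1]
    child_a, child_b = parent_a[:], parent_b[:]
    if not only_a:  # identical parents: nothing to exchange
        return child_a, child_b
    # Half of the larger count, with at least one exchange (assumption).
    k = max(1, max(len(only_a), len(only_b)) // 2)
    for i, j in zip(rng.sample(only_a, k), rng.sample(only_b, k)):
        # child_a gives up feature i and gains feature j; child_b does the reverse.
        child_a[i], child_a[j] = 0, 1
        child_b[i], child_b[j] = 1, 0
    return child_a, child_b

rng = random.Random(7)
mother = [1, 1, 1] + [0] * 17
father = [0] * 17 + [1, 1, 1]
son, daughter = crossover(mother, father, rng)
```

Each exchange removes one 1 and adds one 1 in the same child, so both offspring still encode exactly three features.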
Mutation: a small value in [0, 1] is set as the mutation probability. When an individual undergoes the mutation operation, a random value in [0, 1] is drawn, and if it is smaller than the mutation probability, the value of a randomly chosen site in the individual is flipped. To ensure that the individual still corresponds to three features, if a 1 site is flipped, a 0 site is chosen at random and flipped as well; conversely, if a 0 site is flipped, a 1 site is chosen at random and flipped as well.
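The paired flip can be sketched as below. The function name is hypothetical; in the patent the mutation probability is 0.01, while the demo uses rates of 1.0 and 0.0 only to make the two branches visible.

```python
import random

def mutate(individual, rate, rng):
    """With probability `rate`, move one selected feature to an unselected site."""
    child = individual[:]
    if rng.random() < rate:
        ones = [i for i, bit in enumerate(child) if bit == 1]
        zeros = [i for i, bit in enumerate(child) if bit == 0]
        child[rng.choice(ones)] = 0   # flip one 1 site to 0 ...
        child[rng.choice(zeros)] = 1  # ... and one 0 site to 1, keeping the count
    return child

rng = random.Random(3)
parent = [1, 1, 1] + [0] * 17
mutated = mutate(parent, 1.0, rng)    # always mutates
untouched = mutate(parent, 0.0, rng)  # never mutates
```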
Fitness calculation: the fitness of an individual generally reflects its ability to solve the problem. In the invention, the feature combination represented by the individual is evaluated by ten-fold cross-validation with support vector regression, the error is measured by the root mean square error (RMSE), and the fitness value of the individual is derived from this error.
The optimal feature combination screened by this method consists of three features, the gamma parameter, electron concentration, and work function, and is used as the physical feature combination in the subsequent machine learning.
Step three, selecting a machine learning model by ten-fold cross-validation;
According to the embodiment of the invention, to verify the generalization ability of the model statistically, avoid overfitting, and improve the stability of the validation result, the data set is split by ten-fold cross-validation and several machine learning models are used to model the hardness of high-entropy alloys. The models are trained on the high-entropy alloy training data set and compared in accuracy and stability, and the support vector regression (SVR) algorithm is finally selected to predict the hardness of high-entropy alloys.
The data set is split by ten-fold cross-validation and the accuracies of multiple ten-fold cross-validation runs are averaged. The accuracy is measured by the root mean square error, calculated as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(f(x_i) - y_i\right)^{2}}$$
where RMSE denotes the root mean square error; $m$ is the total number of samples; $x_i$ is the feature vector of the $i$-th sample; $y_i$ is the true value of the $i$-th sample; and $f(x_i)$ is the model's predicted value for the $i$-th sample.
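The evaluation loop can be sketched with the standard library alone. Everything below is illustrative: the helper names are hypothetical, the fold assignment is a simple shuffled round-robin, and `fit_mean` is a deliberately trivial stand-in for the support vector regression model, used only so the sketch runs without external packages.

```python
import math
import random

def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    m = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / m)

def k_fold_indices(n_samples, k, rng):
    """Shuffle the sample indices and deal them into k folds."""
    order = list(range(n_samples))
    rng.shuffle(order)
    return [order[i::k] for i in range(k)]

def cross_validated_rmse(X, y, fit, k=10, seed=0):
    """Average the held-out RMSE over k folds; `fit` returns a predict function."""
    scores = []
    for test_idx in k_fold_indices(len(y), k, random.Random(seed)):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        predictions = [predict(X[i]) for i in test_idx]
        scores.append(rmse([y[i] for i in test_idx], predictions))
    return sum(scores) / len(scores)

def fit_mean(X_train, y_train):
    """Placeholder model (not SVR): always predicts the training mean."""
    mean = sum(y_train) / len(y_train)
    return lambda x: mean

folds = k_fold_indices(155, 10, random.Random(0))
constant_score = cross_validated_rmse([[0.0]] * 20, [5.0] * 20, fit_mean)
```

Swapping `fit_mean` for a function that trains an SVR on the selected features would give the quantity from which the genetic algorithm's fitness is derived.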
Step four, adopting the selected machine learning model and inputting the optimal characteristic combination to carry out model training;
Support vector regression formally fits a straight line, as general regression methods do:
$$f(x) = w^{T}x + b$$
Unlike general regression methods, however, training does not require a perfect fit to every sample point; a maximum deviation of $\varepsilon$ between the predicted and true hardness values is tolerated, that is, a loss is counted only when the deviation between the predicted and true values exceeds $\varepsilon$. Support vector regression minimizes the following penalized objective:
$$\min_{w,b}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{m}\ell_{\varepsilon}\left(f(x_i) - y_i\right)$$
where $w$ is the slope of the line; $b$ is the intercept; $C$ is the regularization parameter; and $\ell_{\varepsilon}$ is the $\varepsilon$-insensitive loss function:
$$\ell_{\varepsilon}(z) = \begin{cases} 0, & |z| \le \varepsilon \\ |z| - \varepsilon, & \text{otherwise} \end{cases}$$
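The ε-insensitive loss and the penalized objective can be evaluated directly for a linear model. The sketch below is for illustration only; the function names and the two-sample toy data are hypothetical.

```python
def epsilon_insensitive(z, epsilon):
    """Zero inside the epsilon tube, linear in |z| outside it."""
    return max(0.0, abs(z) - epsilon)

def svr_objective(w, b, X, y, C, epsilon):
    """0.5 * ||w||^2 + C * sum of epsilon-insensitive losses for f(x) = w.x + b."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    loss = 0.0
    for x, target in zip(X, y):
        prediction = sum(wi * xi for wi, xi in zip(w, x)) + b
        loss += epsilon_insensitive(prediction - target, epsilon)
    return regularizer + C * loss

# Toy data: the first sample is fitted exactly, the second is off by 1.
objective = svr_objective(w=[1.0], b=0.0, X=[[1.0], [2.0]], y=[1.0, 3.0],
                          C=10.0, epsilon=0.5)
```

With these numbers the regularizer contributes 0.5 and only the second sample is penalized (by 0.5, scaled by C = 10), so the objective is 5.5.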
When the samples cannot be fitted well by a linear equation in the original sample space, a mapping $\phi(\cdot)$ can be used to map each sample vector $x$ into a higher-dimensional space, as shown in FIG. 2, in which the feature vectors can be fitted well linearly. Every $x$ in the support vector regression formulation is then replaced by $\phi(x)$.
Introducing slack variables and applying the Lagrange multiplier method turns the optimization problem into:
$$\max_{\alpha,\hat{\alpha}}\ \sum_{i=1}^{m}\left[y_i\left(\hat{\alpha}_i - \alpha_i\right) - \varepsilon\left(\hat{\alpha}_i + \alpha_i\right)\right] - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\left(\hat{\alpha}_i - \alpha_i\right)\left(\hat{\alpha}_j - \alpha_j\right)\phi(x_i)^{T}\phi(x_j) \qquad (1)$$
$$\text{s.t.}\quad \sum_{i=1}^{m}\left(\hat{\alpha}_i - \alpha_i\right) = 0,\qquad 0 \le \alpha_i,\ \hat{\alpha}_i \le C$$
where $m$ is the total number of training samples, and $\alpha_i$ and $\hat{\alpha}_i$ are the Lagrange multipliers corresponding to each sample point. In the optimization problem above, $\phi(x)$ appears only through the inner product
$$\phi(x_i)^{T}\phi(x_j)$$
Mapping the sample points into a suitable high-dimensional feature space in which they can be fitted linearly is very difficult, and the appropriate mapping $\phi(\cdot)$ is not known. To simplify the computation, the kernel trick is therefore used, that is, a kernel function is defined:
$$\kappa(x_i, x_j) = \phi(x_i)^{T}\phi(x_j)$$
In this way, the corresponding support vector regression can be learned in the high-dimensional space simply by choosing a suitable kernel function, without finding a complicated mapping. Many kernel functions are in common use; by comparison, the following Gaussian kernel is selected:
$$\kappa(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j\rVert^{2}}{2\sigma^{2}}\right)$$
where $\sigma > 0$ is the bandwidth of the Gaussian kernel. After introducing the kernel function, equation (1) becomes:
$$\max_{\alpha,\hat{\alpha}}\ \sum_{i=1}^{m}\left[y_i\left(\hat{\alpha}_i - \alpha_i\right) - \varepsilon\left(\hat{\alpha}_i + \alpha_i\right)\right] - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\left(\hat{\alpha}_i - \alpha_i\right)\left(\hat{\alpha}_j - \alpha_j\right)\kappa(x_i, x_j)$$
The optimization problem is solved with the SMO (sequential minimal optimization) algorithm to obtain the Lagrange multipliers and hence the support vector regression model.
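Once the multipliers are known, a kernel support vector regression model predicts with a weighted sum of kernel evaluations against the support vectors. The sketch below assumes the standard form f(x) = Σ c_i κ(x, x_i) + b, where each coefficient c_i is the difference of the two Lagrange multipliers of sample i; the function names, the coefficient values, and the bandwidth are hypothetical and do not come from the patent, and no solver (such as SMO) is implemented here.

```python
import math

def gaussian_kernel(x1, x2, sigma):
    """exp(-||x1 - x2||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def svr_predict(x, support_vectors, dual_coefs, b, sigma):
    """Kernel SVR prediction: sum_i coef_i * kappa(x, x_i) + b."""
    return sum(coef * gaussian_kernel(x, sv, sigma)
               for coef, sv in zip(dual_coefs, support_vectors)) + b

# Hypothetical model with a single support vector at the origin.
k_same = gaussian_kernel([1.0, 2.0], [1.0, 2.0], sigma=1.0)
k_far = gaussian_kernel([0.0, 0.0], [3.0, 4.0], sigma=5.0)
prediction = svr_predict([0.0, 0.0], [[0.0, 0.0]], [2.0], b=1.0, sigma=1.0)
```

The kernel equals 1 for identical points and decays with distance at a rate set by the bandwidth, which is one of the hyperparameters the description says is tuned by Bayesian optimization.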
And step five, predicting the hardness of the unknown high-entropy alloy according to the trained model, and selecting the high-entropy alloy with high predicted hardness and good predicted reliability.
According to the embodiment of the invention, to verify the necessity of the feature screening method and the validity of its result, all 1140 three-feature combinations of the 20 features are each substituted into the support vector regression algorithm for training and testing. The test error of each combination is calculated by ten-fold cross-validation with the root mean square error, and the combinations are ranked by test error from smallest to largest; this ranking serves as the benchmark for the prediction accuracy of the feature screening method. Table 1 lists the 20 feature combinations with the highest prediction accuracy found by the exhaustive method.
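The figure of 1140 is the number of ways to choose 3 features out of 20, which can be checked by enumeration. The ranking helper and the three error values below are hypothetical and serve only to show how the exhaustive benchmark is ordered.

```python
from itertools import combinations
from math import comb

N_FEATURES = 20
COMBO_SIZE = 3

# Every three-feature combination, as tuples of feature indices.
all_combos = list(combinations(range(N_FEATURES), COMBO_SIZE))

def rank_by_error(errors):
    """Sort feature combinations by ascending test error (smallest RMSE first)."""
    return sorted(errors, key=errors.get)

# Hypothetical RMSE values for three combinations, for illustration only.
demo_errors = {(0, 1, 2): 61.0, (2, 7, 15): 48.5, (3, 4, 5): 55.2}
ranking = rank_by_error(demo_errors)
```

In the exhaustive benchmark each of the 1140 combinations requires its own ten-fold cross-validated model fit.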
TABLE 1
[Table 1 appears as an image in the original publication: Figure BDA0002737967650000077, Figure BDA0002737967650000081]
As Table 1 shows, compared with selecting the optimal feature combination by the exhaustive method, the method of the invention, which first determines the number of features in the combination, then screens out highly correlated features using the Pearson correlation coefficient, and then screens the features with a genetic algorithm, is not only accurate but also saves computing resources and time.
The Al-Co-Cr-Cu-Fe-Ni high-entropy alloy test set used by the invention comes from reference [1]: 614143 compositions of the Al-Co-Cr-Cu-Fe-Ni system with predicted hardness values above 750 HV, found by searching 1895147 specific compositions of the system. Running the prediction search within this data set makes it easier to find specific high-entropy alloy compositions with excellent properties. The higher-hardness compositions found during the prediction search are shown in Table 2.
TABLE 2
[Table 2 appears as an image in the original publication: Figure BDA0002737967650000082, Figure BDA0002737967650000091]
The proportion of each element in the high-entropy alloys shown in Table 2 is its molar ratio. The high-entropy alloy with the highest predicted hardness is Al0.41Co0.2Cr0.18Fe0.16Ni0.05, with a predicted hardness of 791.532194 HV.
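Because the subscripts are molar ratios, a composition string can be turned into a mapping of mole fractions and checked for consistency. The parser below is a hypothetical helper written for this illustration; it assumes every element symbol is followed by an explicit number.

```python
import re

def parse_composition(formula):
    """Split a formula such as 'Al0.41Co0.2' into {element: molar fraction}."""
    pairs = re.findall(r"([A-Z][a-z]?)(\d*\.?\d+)", formula)
    return {element: float(amount) for element, amount in pairs}

best = parse_composition("Al0.41Co0.2Cr0.18Fe0.16Ni0.05")
total = sum(best.values())
```

The five fractions sum to 1, and Cu is absent from this composition even though it belongs to the Al-Co-Cr-Cu-Fe-Ni system.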
Based on a combined comparison of prediction accuracy and time cost, the invention concludes that feature screening with a genetic algorithm is the best feature screening method for high-entropy alloy hardness prediction. Repeated screening with the genetic algorithm yields the feature combination with the best prediction performance: gamma parameter, electron concentration, and work function. Compared with selecting the optimal feature combination from the 20 physical features by the exhaustive method, the invention first determines the number of features in the combination, then screens out highly correlated features using the Pearson correlation coefficient, and then screens the features with the genetic algorithm, which saves computing resources and time. The overall process, in which feature combinations are screened first and machine learning models such as Gaussian-kernel support vector regression are then used to predict high-entropy alloy hardness and search the virtual space of high-entropy alloys, is efficient and stable. The invention builds a complete and general framework for predicting material properties by machine learning, provides a method for searching a large virtual space for specific compositions of high-performance materials, and successfully predicts and recommends several high-entropy alloys with high hardness together with their predicted hardness values.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention has been disclosed in an illustrative rather than a restrictive sense, and the scope of the present invention is defined by the appended claims.

Claims (9)

1. A high-entropy alloy hardness prediction method based on machine learning, characterized by comprising the following steps:
step one, acquiring a feature data training set for predicting the hardness of the high-entropy alloy;
step two, screening the feature data to obtain an optimal feature combination;
step three, selecting a machine learning model by ten-fold cross-validation;
step four, training the selected machine learning model with the optimal feature combination as input;
and step five, predicting the hardness of unknown high-entropy alloys with the trained model, and selecting the high-entropy alloys with high predicted hardness and good prediction reliability.
2. The method for predicting the hardness of the high-entropy alloy based on machine learning according to claim 1, wherein the high-entropy alloy in the first step is an Al-Co-Cr-Cu-Fe-Ni system.
3. The method for predicting the hardness of the high-entropy alloy based on machine learning according to claim 1, wherein the features in step one comprise atomic radius difference, electronegativity difference, valence electron concentration, mixing enthalpy, configurational entropy, omega parameter, lambda parameter, gamma parameter, local electronegativity mismatch, number of itinerant electrons, cohesive energy, modulus mismatch, local size mismatch, energy term, Nabarro coefficient, work function, shear modulus difference, local modulus mismatch and lattice distortion energy.
4. The method for predicting the hardness of the high-entropy alloy based on the machine learning of claim 1, wherein the step two of screening the feature data comprises the steps of firstly determining the number of features in a feature combination, then screening features with high correlation by using a Pearson correlation coefficient, and finally obtaining an optimal feature combination by using a genetic algorithm.
5. The method for predicting the hardness of the high-entropy alloy based on machine learning according to claim 4, wherein the number of the features in the feature combination is determined to be 3.
6. The method for predicting the hardness of the high-entropy alloy based on machine learning according to claim 4, wherein the principle for screening highly correlated features with the Pearson correlation coefficient is to retain, after screening, the features whose Pearson correlation coefficient is greater than 0.95 as the highly correlated features.
7. The method for predicting the hardness of the high-entropy alloy based on machine learning according to any one of claims 1 to 4, wherein the optimal feature combination is three features of gamma parameter, electron concentration and work function.
8. The method for predicting the hardness of the high-entropy alloy based on machine learning according to claim 1, wherein the machine learning model in the fourth step is a support vector regression algorithm model.
9. The method for predicting the hardness of the high-entropy alloy based on the machine learning as claimed in claim 8, wherein a Gaussian kernel function is adopted in the support vector regression algorithm, and Bayesian optimization is used for the optimization of the hyperparameters.
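
Step three of claim 1, model selection by ten-fold cross-validation, can be illustrated with the minimal sketch below. It uses only the Python standard library, so the two candidate models (a mean predictor and a nearest-neighbor predictor) are simple stand-ins and not the support vector regression named in claims 8 and 9. The data are synthetic, and every function name is an assumption made for this example.

```python
import math
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_val_rmse(fit, X, y, k=10, seed=0):
    """Root-mean-square error over all held-out predictions of a k-fold split."""
    squared_errors = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        predict = fit(train_X, train_y)  # train on the other k-1 folds
        squared_errors += [(predict(X[i]) - y[i]) ** 2 for i in test_idx]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def fit_mean(train_X, train_y):
    """Stand-in model 1: always predict the training mean."""
    mean = sum(train_y) / len(train_y)
    return lambda x: mean

def fit_nearest_neighbor(train_X, train_y):
    """Stand-in model 2: predict the target of the closest training sample."""
    def predict(x):
        j = min(range(len(train_X)), key=lambda i: math.dist(x, train_X[i]))
        return train_y[j]
    return predict

# Synthetic data (not from the patent): two features, linear target.
X = [[i / 10.0, (i % 7) / 7.0] for i in range(50)]
y = [3.0 * a + 2.0 * b for a, b in X]

candidates = {"mean": fit_mean, "nearest_neighbor": fit_nearest_neighbor}
scores = {name: cross_val_rmse(fit, X, y) for name, fit in candidates.items()}
best_model = min(scores, key=scores.get)  # lowest cross-validated error wins
print(scores, best_model)
```

Using the same seed for every candidate means all models are scored on identical folds, so the comparison is not affected by differences in the split.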

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011140018.3A CN112216356A (en) 2020-10-22 2020-10-22 High-entropy alloy hardness prediction method based on machine learning

Publications (1)

Publication Number Publication Date
CN112216356A (en) 2021-01-12

Family

ID=74054872

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011140018.3A Pending CN112216356A (en) 2020-10-22 2020-10-22 High-entropy alloy hardness prediction method based on machine learning

Country Status (1)

Country Link
CN (1) CN112216356A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834975A (en) * 2015-05-13 2015-08-12 国家电网公司 Power network load factor prediction method based on intelligent algorithm optimization combination
CN110442954A (en) * 2019-07-31 2019-11-12 东北大学 The super high strength stainless steel design method of lower machine learning is instructed based on physical metallurgy

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112802563A (en) * 2021-01-19 2021-05-14 哈尔滨理工大学 Machine learning-based two-dimensional transition metal sulfide band gap and energy band structure prediction method and device
CN112802563B (en) * 2021-01-19 2024-04-12 哈尔滨理工大学 Two-dimensional transition metal sulfide band gap and energy band structure prediction method and device based on machine learning
CN113870957A (en) * 2021-10-22 2021-12-31 中国航发北京航空材料研究院 Eutectic high-entropy alloy component design method and device based on machine learning
CN114464274A (en) * 2022-01-14 2022-05-10 哈尔滨理工大学 High-entropy alloy hardness prediction method based on machine learning and improved genetic algorithm feature screening
CN114580272A (en) * 2022-02-16 2022-06-03 昆明贵金属研究所 Design method for simultaneously optimizing conductivity and hardness of multi-element electric contact alloy
CN114613449A (en) * 2022-03-01 2022-06-10 哈尔滨工业大学(深圳) Novel amorphous alloy design method based on machine learning
CN114613449B (en) * 2022-03-01 2024-08-20 哈尔滨工业大学(深圳) Novel amorphous alloy design method based on machine learning
CN114613456B (en) * 2022-03-07 2023-04-28 哈尔滨理工大学 High-entropy alloy hardness prediction method based on improved density peak clustering algorithm
CN114613456A (en) * 2022-03-07 2022-06-10 哈尔滨理工大学 High-entropy alloy hardness prediction method based on improved density peak value clustering algorithm
CN115394381B (en) * 2022-08-24 2023-08-22 哈尔滨理工大学 High-entropy alloy hardness prediction method and device based on machine learning and two-step data expansion
CN115394381A (en) * 2022-08-24 2022-11-25 哈尔滨理工大学 High-entropy alloy hardness prediction method and device based on machine learning and two-step data expansion
CN115579091A (en) * 2022-11-09 2023-01-06 广东海洋大学 High-entropy alloy component design method based on machine learning and multi-performance collaborative optimization
CN116434880A (en) * 2023-03-06 2023-07-14 哈尔滨理工大学 High-entropy alloy hardness prediction method based on fuzzy self-consistent clustering integration
CN116434880B (en) * 2023-03-06 2023-09-08 哈尔滨理工大学 High-entropy alloy hardness prediction method based on fuzzy self-consistent clustering integration
CN116720058A (en) * 2023-04-28 2023-09-08 贵研铂业股份有限公司 Method for realizing key feature combination screening of machine learning candidate features
CN116597923A (en) * 2023-05-19 2023-08-15 小米汽车科技有限公司 Model generation method, material information determination method, device, equipment and medium
CN117789875A (en) * 2023-12-14 2024-03-29 广东海洋大学 Data driving method for designing high-strength high-entropy alloy and application
CN117789875B (en) * 2023-12-14 2024-05-10 广东海洋大学 Data driving method for designing high-strength high-entropy alloy and application
CN117874712A (en) * 2024-03-11 2024-04-12 四川省光为通信有限公司 Single-mode non-airtight optical module performance prediction method based on Gaussian process regression

Similar Documents

Publication Publication Date Title
CN112216356A (en) High-entropy alloy hardness prediction method based on machine learning
Duncan et al. Photometric redshifts for the next generation of deep radio continuum surveys–II. Gaussian processes and hybrid estimates
Dahi et al. A quantum-inspired genetic algorithm for solving the antenna positioning problem
Strelioff et al. Inferring Markov chains: Bayesian estimation, model comparison, entropy rate, and out-of-class modeling
Davenport et al. Tuning support vector machines for minimax and Neyman-Pearson classification
CN103810101A (en) Software defect prediction method and system
CN107341497A (en) The unbalanced weighting data streams Ensemble classifier Forecasting Methodology of sampling is risen with reference to selectivity
CN110659207A (en) Heterogeneous cross-project software defect prediction method based on nuclear spectrum mapping migration integration
CN111581116B (en) Cross-project software defect prediction method based on hierarchical data screening
CN110019421A (en) A kind of time series data classification method based on data characteristics segment
CN113012766A (en) Self-adaptive soft measurement modeling method based on online selective integration
CN114881343A (en) Short-term load prediction method and device of power system based on feature selection
Gao et al. A joint landscape metric and error image approach to unsupervised band selection for hyperspectral image classification
Zhang et al. Multivariate discrete grey model base on dummy drivers
CN115394381B (en) High-entropy alloy hardness prediction method and device based on machine learning and two-step data expansion
Senoglu et al. Goodness-of-fit tests based on Kullback-Leibler information
Ortelli et al. Faster estimation of discrete choice models via dataset reduction
CN113344031B (en) Text classification method
Marconato et al. Identification of Wiener-Hammerstein benchmark data by means of support vector machines
CN114550842A (en) Molecular prediction method and system for drug compound inhibiting biological activity of target protein
López et al. Chaos and Regularity in the Double Pendulum with Lagrangian Descriptors
Schuetzke et al. A universal synthetic dataset for machine learning on spectroscopic data
Quinlan et al. Bayesian design of experiments for logistic regression to evaluate multiple nuclear forensic algorithms
Nidhi et al. Predictive Model for Students' Academic Performance Using Classification and Feature Selection Techniques
CN106156856A (en) The method and apparatus selected for mixed model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination