CN113310928A - Method for rapidly identifying high-temperature sterilized milk within and beyond its shelf life
- Publication number
- CN113310928A (application number CN202110503705.5A)
- Authority
- CN
- China
- Prior art keywords
- milk
- model
- samples
- shelf life
- temperature sterilized
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17—Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/25—Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
- G01N21/31—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry
- G01N21/35—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light
- G01N21/3577—Investigating relative effect of material at wavelengths characteristic of specific elements or molecules, e.g. atomic absorption spectrometry using infrared light for analysing liquids, e.g. polluted water
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/24323—Tree-organised classifiers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Chemical & Material Sciences (AREA)
- Health & Medical Sciences (AREA)
- Analytical Chemistry (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Immunology (AREA)
- Pathology (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Geometry (AREA)
- Investigating Or Analysing Materials By Optical Means (AREA)
Abstract
The invention belongs to the technical field of milk product analysis and particularly relates to a method, based on mid-infrared spectroscopy, for rapidly distinguishing high-temperature sterilized milk that is within its shelf life from milk that has expired. The method mainly comprises the following steps: acquiring infrared spectra of high-temperature sterilized milk samples within the shelf life and after expiration; preprocessing the raw infrared spectra and removing abnormal values; dividing the preprocessed data set into a training set and a test set; selecting spectral bands on the training set; constructing a prediction model on the training set; and evaluating the model with the test set and a validation set. In application, the model is used to predict whether high-temperature sterilized milk has been stored for more than 6 months. The invention offers fast testing, no damage to samples, and simultaneous detection of large batches of samples.
Description
Technical Field
The invention belongs to the technical field of milk product analysis and particularly relates to a method for rapidly distinguishing high-temperature sterilized milk within its shelf life from expired milk. It relates to the field of mid-infrared spectroscopy.
Background
In China, the shelf life of high-temperature sterilized milk is usually 1-6 months: milk packed in Tetra Pak cartons typically has a shelf life of 6 months, and milk packed in transparent plastic bags typically 1 month. High-temperature sterilized milk is produced by ultra-high-temperature (UHT) instantaneous sterilization, in which the milk is treated at 130-150 °C for 0.5-4 s. The high sterilization temperature removes most microorganisms, and the short treatment time retains most nutrients [1]. However, milk contains large amounts of heat-resistant lipases and proteases produced by psychrophilic bacteria. Part of this enzyme activity survives UHT treatment, so fat and protein continue to decompose slowly during storage, which lowers product quality and shortens the shelf life [2]. The secondary structure of the proteins in milk also changes as storage time increases [3]. Some manufacturers alter the production date so that expired milk enters the market, and consumers have suffered diarrhea and vomiting after drinking it. Few methods exist for determining the storage time of high-temperature sterilized milk, and no research on identifying whether such milk has expired has been reported.
At present there is no rapid technique for distinguishing high-temperature sterilized milk within its shelf life from expired milk. Achieving such identification with mid-infrared spectroscopy would give market supervision departments a fast and convenient tool for monitoring and constraining market behavior. Mid-infrared spectral analysis is a modern technique that has developed rapidly in recent years; it is non-destructive, pollution-free, and able to analyze multiple components simultaneously. It is widely used in milk quality determination and has been successfully applied to determining the heat-sensitive protein content of high-temperature sterilized milk [4]. Because changes in these proteins alter the mid-infrared spectrum, the spectrum can reflect the storage time of the milk. A rapid method for identifying high-temperature sterilized milk within and beyond its shelf life can therefore be established using mid-infrared (MIR) spectroscopy.
Disclosure of Invention
The invention aims to overcome the shortcomings of the prior art by providing a method for rapidly distinguishing high-temperature sterilized milk within its shelf life from expired milk. To determine the best combination of preprocessing method and modeling algorithm, 5 preprocessing options (including no preprocessing) were applied to the spectral data and combined with 2 modeling methods, giving a total of 10 models for identifying high-temperature sterilized milk within and beyond its shelf life. The characteristic spectral bands used for modeling were screened by applying a Pearson correlation test to the spectral data and testing the significance of the correlation. The resulting model achieves an accuracy of 1.00 on the test set and 1.00 on the validation set.
The technical scheme of the invention is as follows:
A method for rapidly identifying high-temperature sterilized milk within and beyond its shelf life, comprising the following steps:
1) selecting milk samples: collecting high-temperature sterilized milk within the shelf life and beyond the shelf life as samples;
2) acquiring mid-infrared spectra: scanning each sample with a milk component detector, and outputting the light transmittance of the sample through a computer connected to the milk component detector to obtain the sample spectrum;
3) preprocessing the collected raw mid-infrared spectral data: converting the spectral data from transmittance (T) to absorbance (A) and removing abnormal values;
4) dividing the data set into a training set and a test set;
5) selecting the modeling bands: screening the bands that differ significantly between the two types of milk samples and removing the water absorption regions;
6) establishing and screening models: taking the mid-infrared spectra of the training-set milk samples as input and the class of the milk (within or beyond the shelf life) as output, constructing models on the training set with different spectral preprocessing methods and different modeling algorithms, evaluating and screening the models on the principle that higher accuracy, specificity, sensitivity, and AUC are better, and selecting the optimal combination of data preprocessing method and modeling algorithm to construct the model;
7) verifying and applying the optimal model: taking high-temperature sterilized milk samples within and beyond the shelf life, identifying them with the selected optimal model, and evaluating the application performance of the model.
Wherein:
when the mid-infrared spectrum is acquired in step 2), the high-temperature sterilized milk samples within the shelf life and beyond the shelf life are each poured into cylindrical sampling tubes 3.5 cm in diameter and 9 cm in height so that the liquid level is above 6 cm, the samples are then held in a 42 °C water bath for 15-20 min, and a solid optical fiber probe is immersed in the liquid for sample absorption detection;
in step 3), the transmittance (T) is converted into absorbance (A) according to $A = \log_{10}(1/T)$, and abnormal values are removed using the Mahalanobis distance of the spectrum and the percentage contents of milk fat and milk protein: only data with a spectral Mahalanobis distance less than or equal to 3 and with milk fat and milk protein percentages within ±3.5 standard deviations of the mean are retained;
the significantly different bands in step 5) are screened by a Pearson correlation test together with a significance test of the correlation, and the removed water absorption regions are 3587.94-2970.66 cm⁻¹ and 1716.81-1543.2 cm⁻¹;
the spectral preprocessing methods used in step 6) are first-order differentiation (Diff), standard normal variate transformation (SNV), multiplicative scatter correction (MSC), and Savitzky-Golay convolution smoothing (SG convolution smoothing for short), and the modeling algorithms used are random forest (RF) and support vector machine (SVM).
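The conversion and the standard-deviation rule of step 3) can be illustrated with a minimal Python sketch. It assumes transmittance is supplied as a fraction in (0, 1]; the function names are hypothetical, and the Mahalanobis-distance filter on the spectra is not shown here.

```python
import math
from statistics import mean, stdev


def transmittance_to_absorbance(transmittance):
    """Convert transmittance fractions (0 < T <= 1) to absorbance, A = log10(1/T)."""
    return [math.log10(1.0 / t) for t in transmittance]


def within_sd_range(values, n_sd=3.5):
    """Return one keep-flag per value: True if it lies within mean +/- n_sd sample SDs."""
    mu, sd = mean(values), stdev(values)
    return [abs(v - mu) <= n_sd * sd for v in values]


# Illustrative (made-up) values: three transmittance points and 21 fat percentages.
absorbance = transmittance_to_absorbance([1.0, 0.1, 0.01])
keep_fat = within_sd_range([3.5] * 20 + [9.0])
```

A sample would be retained only if its flags for both milk fat and milk protein are True and its spectrum also passes the Mahalanobis-distance criterion.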
Compared with the prior art, the invention has the following beneficial effects:
The invention is characterized in that six modeling bands are selected, namely 929.513-1021.993 cm⁻¹, 1099.059-1268.604 cm⁻¹, 1453.562-1515.215 cm⁻¹, 1576.868-1623.107 cm⁻¹, 1680.907-2632.672 cm⁻¹, and 2829.190-3048.828 cm⁻¹, and that a support vector machine is used for modeling. No method for distinguishing high-temperature sterilized milk within its shelf life from expired milk has previously been reported, and the invention provides a way to assess the storage time of high-temperature sterilized milk. Compared with traditional methods of judging milk quality, such as acidity titration and sensory evaluation, the method is more convenient and faster, delivers results in real time, and can test large batches of high-temperature sterilized milk samples simultaneously.
Drawings
FIG. 1: average spectra in the modeling bands of the invention, i.e., the average absorbance of the two classes of high-temperature sterilized milk in the modeling bands. Description: the abscissa is the wavenumber and the ordinate is the absorbance; solid line 0 denotes high-temperature sterilized milk within the shelf life, and dotted line 1 denotes high-temperature sterilized milk beyond the shelf life; FIG. 1(a) is the overall average spectrum of the modeling bands, and FIGS. 1(b), 1(c), and 1(d) are enlarged partial spectra, where FIG. 1(b) covers 929.513-1021.993 cm⁻¹, 1099.059-1268.604 cm⁻¹, 1453.562-1515.215 cm⁻¹, and 1576.868-1623.107 cm⁻¹, FIG. 1(c) covers 1680.907-2632.672 cm⁻¹, and FIG. 1(d) covers 2829.190-3048.828 cm⁻¹.
FIG. 2: performance of the model of the invention on the test set (ROC curve). Description: the ROC curve measures the performance of the model on the test set; the abscissa is the false positive rate and the ordinate is the true positive rate. The AUC is the area enclosed by the ROC curve and the coordinate axes, with values between 0.5 and 1; the closer the AUC is to 1.0, the more reliable the method of the invention.
FIG. 3: classification probabilities for the test set. Description: the abscissa is the predicted probability and the ordinate is the predicted class; for example, the circle at the lower left of the figure indicates that the sample is assigned to class 0 with a probability of 0.942, i.e., the sample is correctly classified.
Detailed Description
The invention is further illustrated by the following figures and examples.
Experiments for which specific conditions are not given in the examples were carried out according to conventional methods and conditions, or according to the conditions recommended by the equipment or instrument manufacturer (e.g., the product instructions).
Example 1: model building and screening
Instruments and equipment:
A MilkoScan™ 7 RM milk component detector produced by FOSS was used (operated according to the product instructions).
The method comprises the following specific steps:
(1) collecting milk sample and measuring mid-infrared spectrum
Several batches of high-temperature sterilized milk, each with a shelf life of 6 months, were purchased. All the milk was stored in a laboratory at 18-26 °C. Over 16 months, the mid-infrared spectra of 2-3 milk samples from each batch of every brand were measured each month, giving 585 samples of high-temperature sterilized milk in total.
Each milk sample was poured into a cylindrical sample tube 3.5 cm in diameter and 9 cm in height so that the liquid level was above 6 cm, and was then held in a 42 °C water bath for 15-20 min. The solid optical fiber probe was immersed in the liquid for sample absorption detection, and the light transmittance of the sample was obtained through the instrument software.
(2) Data pre-processing
The Mahalanobis distance was calculated for the mid-infrared spectra of the 585 high-temperature sterilized milk samples. Data with a spectral Mahalanobis distance less than or equal to 3 and with milk fat and milk protein percentages within ±3.5 standard deviations of the mean were retained. The change in sample size during this process is shown in Table 1: 2 abnormal samples of high-temperature sterilized milk within the shelf life were removed, leaving 414 samples within the shelf life and 169 samples beyond the shelf life. The samples were divided by stratified sampling into a training set (n = 434: 315 within the shelf life and 119 beyond the shelf life) and a test set (n = 109: 79 within the shelf life and 30 beyond the shelf life).
In the modeling process, 0 denotes high-temperature sterilized milk within the shelf life and 1 denotes high-temperature sterilized milk beyond the shelf life. Table 2 gives descriptive statistics of the routine milk components of the two types of high-temperature sterilized milk. As Table 2 shows, the lactose content and total solids content of milk within the shelf life are significantly higher than those of milk beyond the shelf life (P < 0.05), while the other routine milk components show no significant difference. The results are shown in Tables 1 and 2.
TABLE 1 sample size variation when rejecting outliers
TABLE 2 descriptive statistics of two types of conventional milk ingredients for high temperature sterilized milk
Note: each parameter value is expressed as mean ± standard deviation. a, b: for the same milk component, means with different superscripts differ significantly between the two types of high-temperature sterilized milk (P < 0.05).
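The stratified division described above can be sketched in Python as follows. The 80/20 ratio and the per-class counts (394 and 149, i.e., the 414 and 169 retained samples minus 20 validation samples per class) are inferences from the reported set sizes rather than values stated in the text, and `stratified_split` is a hypothetical helper.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so that each class is split in the same proportion."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# 0 = within shelf life, 1 = beyond shelf life (class sizes are assumed, see above).
labels = [0] * 394 + [1] * 149
train_idx, test_idx = stratified_split(labels)
```

Under these assumptions the split reproduces the reported sizes of 434 training samples and 109 test samples, with the class proportions preserved in both sets.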
(3) Screening of modeling bands
The spectral data were converted from light transmittance (T) to absorbance (A) and the water absorption regions were removed. A Pearson correlation test [5] was then applied to the spectral data, followed by a significance analysis of the correlation. The bands 929.513-1021.993 cm⁻¹, 1099.059-1268.604 cm⁻¹, 1453.562-1515.215 cm⁻¹, 1576.868-1623.107 cm⁻¹, 1680.907-2632.672 cm⁻¹, and 2829.190-3048.828 cm⁻¹ were finally selected for modeling. FIG. 1 shows the spectra in the modeling bands of the invention.
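One way such a screening could work is sketched below in Python. Several points are assumptions, not details given in the text: the correlation is computed between the absorbance at each wavenumber and the 0/1 class label, significance is approximated with the Fisher z-transform (a normal approximation swapped in for an exact t-test), the threshold is 0.05, and all function names are hypothetical. The water absorption limits are those quoted in the technical scheme.

```python
import math
from statistics import NormalDist, mean

# Water absorption regions in cm^-1, as quoted in the text.
WATER_REGIONS = [(2970.66, 3587.94), (1543.2, 1716.81)]


def in_water_region(wavenumber):
    return any(low <= wavenumber <= high for low, high in WATER_REGIONS)


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value from the Fisher z approximation (assumes n > 3)."""
    if abs(r) >= 1.0:
        return 0.0
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def select_wavenumbers(wavenumbers, spectra, labels, alpha=0.05):
    """Keep wavenumbers outside the water regions whose correlation with the label is significant."""
    selected = []
    for j, wn in enumerate(wavenumbers):
        if in_water_region(wn):
            continue
        column = [row[j] for row in spectra]
        r = pearson_r(column, labels)
        if approx_p_value(r, len(labels)) < alpha:
            selected.append(wn)
    return selected
```

In this sketch, `spectra` is a list of absorbance rows (one per sample) aligned with `wavenumbers`; contiguous runs of selected wavenumbers would then form the modeling bands.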
(4) Model building
The data set was divided into a training set (n = 434), a test set (n = 109), and a validation set (n = 40).
The spectral data were preprocessed with first-order differentiation (Diff) [6], standard normal variate transformation (SNV) [7], multiplicative scatter correction (MSC) [8], and SG convolution smoothing [9], respectively, and the results were compared with those for data without preprocessing.
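For orientation, the four preprocessing methods can be sketched in plain Python as below. These are textbook forms, not the implementation used in the patent: the Savitzky-Golay window and polynomial order are not given in the text, so a 5-point quadratic window is assumed and edge points are left unchanged; all function names are hypothetical.

```python
from statistics import mean, stdev


def first_difference(spectrum):
    """First-order difference between neighbouring points (one point shorter)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def snv(spectrum):
    """Standard normal variate: centre and scale each spectrum by its own mean and SD."""
    mu, sd = mean(spectrum), stdev(spectrum)
    return [(v - mu) / sd for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set."""
    n_points = len(spectra[0])
    reference = [mean(row[j] for row in spectra) for j in range(n_points)]
    ref_mean = mean(reference)
    ref_ss = sum((m - ref_mean) ** 2 for m in reference)
    corrected = []
    for row in spectra:
        row_mean = mean(row)
        # Least-squares fit of row = intercept + slope * reference.
        slope = sum((m - ref_mean) * (x - row_mean) for m, x in zip(reference, row)) / ref_ss
        intercept = row_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in row])
    return corrected


SG5_WEIGHTS = (-3, 12, 17, 12, -3)  # 5-point quadratic smoothing weights, normalised by 35


def sg_smooth_5(spectrum):
    """Savitzky-Golay smoothing with an assumed 5-point window; edges are kept as-is."""
    smoothed = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        window = spectrum[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(SG5_WEIGHTS, window)) / 35
    return smoothed
```

Each function takes absorbance values restricted to the modeling bands; the unprocessed spectra serve as the fifth option in the comparison.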
Classification models were built on the training set with the random forest (RF) [10] and support vector machine (SVM) [11] algorithms and used to predict the samples in the test set. The modeling results for the RF and SVM algorithms under the different preprocessing methods are shown in Table 3.
TABLE 3 Modeling results for RF and SVM under different preprocessing methods
(5) Screening and determination of optimal models
In the identification model, accuracy is the proportion of correct classifications among all classifications; the closer the value is to 1, the better. Sensitivity is the proportion of one of the two classes that is correctly classified, and specificity is the proportion of the other class that is correctly classified; for both, values closer to 1 are better. AUC is the area under the ROC curve and directly reflects the classification ability expressed by that curve: AUC = 1 indicates a perfect classifier, 0.5 < AUC < 1 indicates a classifier better than random, and 0 < AUC < 0.5 indicates a classifier worse than random. As Table 3 shows, the support vector machine (SVM) models classify better than the random forest (RF) models. Both random forest and support vector machine are classification algorithms with strong learning ability, fast learning, and high accuracy, but compared with random forest the support vector machine generalizes better, costs less to train, and effectively avoids the "curse of dimensionality" for high-dimensional data. Preprocessing the data increases the computational difficulty and the computation time to varying degrees. The invention therefore selects, from among these models, the one built with no preprocessing combined with the support vector machine as the optimal model.
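The four evaluation indexes defined above can be computed as in the following Python sketch. Treating class 1 (beyond the shelf life) as the positive class is an assumption, since the text does not say which class sensitivity refers to; the AUC is computed by the pairwise ranking definition, and the function names are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of positives correctly classified
        "specificity": tn / (tn + fp),  # share of negatives correctly classified
    }


def auc(y_true, scores, positive=1):
    """Area under the ROC curve: probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy data: true classes, predicted classes, and predicted probability of class 1.
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
scores = [0.1, 0.6, 0.7, 0.9]
metrics = classification_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

In the toy data every class-1 score exceeds every class-0 score, so the ranking is perfect even though one class-0 sample is misclassified at the default decision threshold.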
The selected optimal classification model was used to predict the 109 samples of the test set. The ROC curve, with the false positive rate as the abscissa and the true positive rate as the ordinate, measures the performance of the model on the test set and is shown in FIG. 2. The AUC is the area enclosed by the ROC curve and the coordinate axes, with values between 0.5 and 1; the closer the AUC is to 1.0, the more reliable the method. As FIG. 2 shows, the AUC on the test set in this example is 1, indicating that the model classifies the test set very well.
FIG. 3 shows the classification probabilities for the test set; for example, the green dot at the bottom left of the figure indicates that the sample is assigned to class 0 with a probability of 0.774, which is the correct classification. As can be seen, all samples in the test set were correctly classified, most of them with probabilities of 0.925-1.000.
Example 2: application of the model of the invention
Forty samples were measured and processed following the spectral measurement, data preprocessing, and other procedures of Example 1, and were identified with the selected optimal model. The results are shown in Table 4.
Table 4 results of the model application of the present invention
The identification results of the invention agree completely with the true condition of the samples, namely 20 samples of high-temperature sterilized milk within the shelf life and 20 samples beyond the shelf life, giving an accuracy of 1.00.
The invention requires a short detection time (only two minutes) and a small sample volume (20 mL), can test large batches of high-temperature sterilized milk simultaneously, and determines whether the storage time of the milk exceeds 6 months, i.e., whether the milk is within or beyond its shelf life.
Primary references
[1] Kouchun et al. A brief discussion of high-temperature sterilized milk and pasteurized milk [J]. China Dairy Industry, 2006(05): 61-63;
[2] Burton, H. Ultra-high-temperature processing of milk and milk products [M]. Springer US, 1994;
[3] Wangyuang et al. Analysis of changes in the secondary structure of proteins during storage of pasteurized milk based on infrared spectroscopy [J]. Journal of Shenyang Agricultural University, 2019, 50(01): 93-100;
[4] Montemurro M, Schwaighofer A, Schmidt A, Culzoni MJ, Mayer HK, Lendl B. High-throughput quantitation of bovine milk proteins and discrimination of commercial milk types by external cavity-quantum cascade laser spectroscopy and chemometrics. Analyst. 2019 Sep 9; 144(18): 5571-5579;
[5] Pearson K. Determination of the coefficient of correlation [J]. Science, 1909, 30(757): 23-25;
[6] Zhao et al. Tracing the variety of edible gelatin based on near-infrared spectroscopy combined with machine learning algorithms [J/OL]. Journal of Henan Agricultural University: 1-10 [2021-04-29];
[7] Lbumangong et al. Influence of spectral data preprocessing on the LSSVM model of intertidal sediment nitrogen [J]. Spectroscopy and Spectral Analysis, 2020, 40(08): 2409-;
[8] Zhao et al. Tracing the variety of edible gelatin based on near-infrared spectroscopy combined with machine learning algorithms [J/OL]. Journal of Henan Agricultural University: 1-10 [2021-04-29];
[9] Mayuqiang et al. Hyperspectral imaging-based characteristic analysis and monitoring models of Mikania micrantha at different flowering stages [J]. Journal of Yunnan University (Natural Science Edition), 2021, 43(02): 290-;
[10] Han Meng et al. Hyperspectral estimation of apple canopy LAI based on SVM and RF [J]. Spectroscopy and Spectral Analysis, 2016, 36(03): 800-;
[11] a quality of old books and quality. Determination of white tea storage year based on hyperspectral imaging technology [J/OL]. Food Industry Science and Technology: 1-13 [2021-04-29].
Claims (1)
1. A method for rapidly identifying high-temperature sterilized milk that is within its shelf life or past its shelf life, characterized by comprising the following steps:
1) selecting milk samples: collecting high-temperature sterilized milk within its shelf life and high-temperature sterilized milk past its shelf life as detection samples;
2) acquiring mid-infrared spectra: scanning the detection samples of step 1) with a milk component detector, and outputting the light transmittance of each sample through a computer connected to the milk component detector to obtain a sample spectrogram;
3) preprocessing the collected raw mid-infrared spectral data: converting the spectral data from transmittance (T) into absorbance (A), and removing abnormal values;
4) dividing the data set into a training set and a test set;
5) selecting modeling bands: screening the bands that differ significantly between the two kinds of milk samples, and removing the water absorption regions;
6) establishing and screening models: taking the mid-infrared spectra of the training-set milk samples as input values and the category of the high-temperature sterilized milk (within shelf life or expired) as output values, constructing models on the training set using different spectral preprocessing methods and different modeling algorithms, evaluating and screening the models by accuracy, specificity, sensitivity and AUC, and constructing the model from the optimal combination of data preprocessing method and modeling algorithm;
7) verifying and applying the optimal model: taking high-temperature sterilized milk samples within and past their shelf life, identifying the samples with the screened optimal model, and evaluating the application performance of the model;
wherein:
when the mid-infrared spectra are collected in step 2), the within-shelf-life and expired high-temperature sterilized milk samples are each poured into cylindrical sampling tubes 3.5 cm in diameter and 9 cm in height so that the liquid level is higher than 6 cm, the samples are then held in a water bath at 42 °C for 15-20 min, and a solid optical fiber probe is extended into the liquid to draw in the sample for detection;
in step 3), the transmittance (T) is converted into absorbance (A) according to A = log₁₀(1/T), and abnormal values are removed using the Mahalanobis distance and the percentage contents of milk fat and milk protein, wherein data whose spectral Mahalanobis distance is less than or equal to 3 and whose milk fat and milk protein percentages lie within ±3.5 standard deviations of the mean are retained;
the difference bands in step 5) are screened by a Pearson correlation test and a significance test of the correlation; the removed water absorption regions are 3587.94-2970.66 cm⁻¹ and 1716.81-1543.2 cm⁻¹; and the selected modeling bands are the six bands 929.513-1021.993 cm⁻¹, 1099.059-1268.604 cm⁻¹, 1453.562-1515.215 cm⁻¹, 1576.868-1623.107 cm⁻¹, 1680.907-2632.672 cm⁻¹ and 2829.190-3048.828 cm⁻¹;
the spectral preprocessing methods used in step 6) are first-order differentiation, standard normal variate transformation, multiplicative scatter correction and Savitzky-Golay convolution smoothing; the modeling algorithms used are random forest and support vector machine; and the combination of no preprocessing with the support vector machine is selected to construct the prediction model.
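The numerical parts of the claim can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the applicant's implementation. All function names are hypothetical. Transmittance is assumed to be a fraction in (0, 1] (a percentage would first be divided by 100). Label 1 is assumed to mean "expired". The Mahalanobis-distance filter, the Pearson band screening, the AUC and the SVM and random forest models are left out, since they would normally come from a numerical library.

```python
import math
from statistics import mean, stdev

# Modeling bands in cm^-1, copied from the claim.
MODEL_BANDS = [
    (929.513, 1021.993),
    (1099.059, 1268.604),
    (1453.562, 1515.215),
    (1576.868, 1623.107),
    (1680.907, 2632.672),
    (2829.190, 3048.828),
]


def transmittance_to_absorbance(t):
    """Step 3: A = log10(1/T). Assumes T is a fraction in (0, 1]."""
    if t <= 0:
        raise ValueError("transmittance must be positive")
    return math.log10(1.0 / t)


def within_sd(values, k=3.5):
    """Step 3: flag values within mean +/- k sample standard deviations.

    Intended for the milk fat or milk protein percentages of all samples.
    """
    m, s = mean(values), stdev(values)
    return [abs(v - m) <= k * s for v in values]


def in_model_band(wavenumber):
    """Step 5: True if the wavenumber lies in any claimed modeling band."""
    return any(lo <= wavenumber <= hi for lo, hi in MODEL_BANDS)


def binary_metrics(y_true, y_pred, positive=1):
    """Step 6: accuracy, sensitivity and specificity from paired labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }
```

In use, each spectrum would be converted point by point with `transmittance_to_absorbance`, samples failing `within_sd` for fat or protein would be dropped, and only the wavenumbers for which `in_model_band` returns True would be passed to the classifier.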
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110503705.5A CN113310928A (en) | 2021-05-10 | 2021-05-10 | Method for rapidly identifying high-temperature sterilized milk with shelf life within and out of date |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110503705.5A CN113310928A (en) | 2021-05-10 | 2021-05-10 | Method for rapidly identifying high-temperature sterilized milk with shelf life within and out of date |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113310928A (en) | 2021-08-27 |
Family
ID=77371836
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110503705.5A Withdrawn CN113310928A (en) | 2021-05-10 | 2021-05-10 | Method for rapidly identifying high-temperature sterilized milk with shelf life within and out of date |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113310928A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116227974A (en) * | 2022-12-26 | 2023-06-06 | 中国农业科学院蜜蜂研究所 | Identification method for honey sensory and quality ratings |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108446599A (en) * | 2018-02-27 | 2018-08-24 | 首都师范大学 | A kind of high spectrum image wave band fast selecting method of p value statistic modeling independence |
CN108844917A (en) * | 2018-09-29 | 2018-11-20 | 山东大学 | A kind of Near Infrared Spectroscopy Data Analysis based on significance tests and Partial Least Squares |
CN111579500A (en) * | 2020-05-20 | 2020-08-25 | 湖南城市学院 | Heavy metal content support vector machine regression method combining wave bands and ratios of indoor and outdoor spectrums |
CN112525850A (en) * | 2020-10-01 | 2021-03-19 | 华中农业大学 | Spectral fingerprint identification method for milk, mare, camel, goat and buffalo milk |
CN112666111A (en) * | 2020-10-01 | 2021-04-16 | 华中农业大学 | Method for quickly identifying milk and mare milk |
CN112666112A (en) * | 2020-10-01 | 2021-04-16 | 华中农业大学 | Batch discrimination model and method for camel milk and mare milk |
Non-Patent Citations (2)
Title |
---|
CHAO DU 等: ""Genetic Analysis of Milk Production Traits and Mid-Infrared Spectra in Chinese Holstein Population"", 《ANIMALS》 * |
张爱武 等: ""p值统计量建模独立性的高光谱波段选择方法"", 《红外与激光工程》 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116227974A (en) * | 2022-12-26 | 2023-06-06 | 中国农业科学院蜜蜂研究所 | Identification method for honey sensory and quality ratings |
CN116227974B (en) * | 2022-12-26 | 2024-01-30 | 中国农业科学院蜜蜂研究所 | Identification method for honey sensory and quality ratings |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Zhang et al. | Fast prediction of sugar content in Dangshan pear (Pyrus spp.) using hyperspectral imagery data | |
Giovenzana et al. | Rapid evaluation of craft beer quality during fermentation process by vis/NIR spectroscopy | |
CN107024450A (en) | A kind of method for differentiating different brands and hop count milk powder based on near-infrared spectrum technique | |
Yuan et al. | Non-invasive measurements of ‘Yunhe’pears by vis-NIRS technology coupled with deviation fusion modeling approach | |
Tian et al. | Measurement orientation compensation and comparison of transmission spectroscopy for online detection of moldy apple core | |
Jiang et al. | Rapid nondestructive detecting of wheat varieties and mixing ratio by combining hyperspectral imaging and ensemble learning | |
CN113310936A (en) | Rapid identification method for four high-temperature sterilized commercial milks | |
Sun et al. | Non-destructive detection of blackheart and soluble solids content of intact pear by online NIR spectroscopy | |
Nturambirwe et al. | Detecting bruise damage and level of severity in apples using a contactless nir spectrometer | |
CN113310930A (en) | Spectral identification method of high-temperature sterilized milk, pasteurized milk and pasteurized milk mixed with high-temperature sterilized milk | |
Qi et al. | Rapid and non-destructive determination of soluble solid content of crown pear by visible/near-infrared spectroscopy with deep learning regression | |
Xu et al. | Nondestructive detection of total soluble solids in grapes using VMD‐RC and hyperspectral imaging | |
Jiang et al. | Rapid determination of acidity index of peanuts by near-infrared spectroscopy technology: Comparing the performance of different near-infrared spectral models | |
Lin et al. | Prediction of protein content in rice using a near-infrared imaging system as diagnostic technique | |
CN113310928A (en) | Method for rapidly identifying high-temperature sterilized milk with shelf life within and out of date | |
Liu et al. | Rapid determination of acidity index of peanut during storage by a portable near-infrared spectroscopy system | |
CN113310937A (en) | Method for rapidly identifying high-temperature sterilized milk, pasteurized fresh milk of dairy cow and reconstituted milk of milk powder | |
CN110609011A (en) | Near-infrared hyperspectral detection method and system for starch content of single-kernel corn seeds | |
CN110231306A (en) | A kind of method of lossless, the quick odd sub- seed protein content of measurement | |
CN113310929A (en) | Soybean powder doped in high-temperature sterilized milk and spectral identification method of doping proportion thereof | |
Shen et al. | Discrimination of blended Chinese rice wine ages based on near-infrared spectroscopy | |
CN110231302A (en) | A kind of method of the odd sub- seed crude fat content of quick measurement | |
Sun et al. | Nondestructive identification of barley seeds varieties using hyperspectral data from two sides of barley seeds | |
CN113324943A (en) | Yak milk and rapid identification model of milk mixed with yak milk | |
CN113310933A (en) | Spectrum identification method for number of days for storing raw buffalo milk |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WW01 | Invention patent application withdrawn after publication | Application publication date: 20210827 |