CN118471348B - Human body fluid spectrum analysis method and system based on artificial intelligence - Google Patents
- Publication number
- CN118471348B (application CN202410921467.3A)
- Authority
- CN
- China
- Prior art keywords
- peak
- feature
- characteristic
- analysis
- signal intensity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/17—Systems in which incident light is modified in accordance with the properties of the material investigated
- G01N21/25—Colour; Spectral properties, i.e. comparison of effect of material on the light at two or more different wavelengths or wavelength bands
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/10—Signal processing, e.g. from mass spectrometry [MS] or from PCR
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/20—Supervised data analysis
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B40/00—ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16B40/30—Unsupervised data analysis
Abstract
The invention relates to the technical field of artificial intelligence, and provides a human body fluid spectrum analysis method and system based on artificial intelligence. The method comprises: obtaining a serum sample and collecting spectrum data to obtain a plurality of spectrum signals; processing the spectrum signals to obtain a peak characteristic frequency, a signal intensity and a peak shape width; obtaining and analyzing the main components of the peak characteristic frequency to obtain grouping features; processing and analyzing the signal intensity to obtain statistical features; performing curve fitting on the peak shape width to obtain a characteristic peak position and a characteristic peak width; combining these features and outputting a characteristic spectrum data set; comparing the characteristic spectrum data set with a preset biomarker database to obtain a comparison result; and obtaining the type of the target marker and the concentration of the target marker according to the comparison result. This scheme enables rapid detection in serum samples and improves detection efficiency and detection precision.
Description
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a human body fluid spectrum analysis method and system based on artificial intelligence.
Background
In recent years, with the rapid development of science and technology and growing public health awareness, the demand for analysis of the components of human body fluids has been increasing. Body fluid spectroscopic analysis is an emerging non-invasive detection method that allows doctors to detect and monitor targets by analyzing biomarkers in blood, urine, or other body fluids. Its sensitivity and non-invasive nature give it great potential in clinical applications.
In the related art, biochemical methods such as enzyme-linked immunosorbent assay (ELISA) and mass spectrometry (MS) are used for human body fluid spectral analysis, and these traditional techniques play an important role in biomarker identification.
Although ELISA, mass spectrometry and similar methods can identify specific biomarkers and provide the required clinical information, processing a large number of samples usually requires highly specialized experimental operation and complicated sample pretreatment, so detection efficiency is low. Moreover, these methods are suited to only a small set of known biomarkers, and their detection accuracy is often poor for potential markers that have not yet been identified and for low-abundance markers.
Disclosure of Invention
In order to solve the problems of low detection efficiency and poor detection precision in human body fluid analysis, the application provides a human body fluid spectrum analysis method and system based on artificial intelligence.
The invention provides a human body fluid spectrum analysis method based on artificial intelligence, which comprises the following steps: acquiring a serum sample of a tested person, and collecting spectrum data of the serum sample to obtain a plurality of spectrum signals; performing frequency conversion processing on each spectrum signal to obtain the peak characteristic frequency, signal intensity and peak shape width of each spectrum signal; acquiring the main components of the peak characteristic frequency, and performing cluster analysis on the main components to obtain grouping features; normalizing the signal intensity according to a signal processing algorithm to obtain a normalized signal intensity, and performing statistical analysis on the normalized signal intensity to obtain statistical features; performing curve fitting on the peak shape width according to the main components and the normalized signal intensity to obtain a characteristic peak position and a characteristic peak width; inputting the grouping features, the statistical features, the characteristic peak position and the characteristic peak width into a preset deep learning model, and outputting a characteristic spectrum data set of a target marker; and comparing the characteristic spectrum data set with a preset biomarker database to obtain a comparison result, and inputting the comparison result into a preset artificial intelligence model to obtain the type of the target marker and the concentration of the target marker.
Preferably, the step of obtaining a serum sample of the tested person, collecting spectral data of the serum sample, and obtaining a plurality of spectral signals includes: collecting a blood sample of a tested person based on a serum sample collecting tool to obtain a serum sample of the tested person; preprocessing the serum sample, and collecting spectral data of the preprocessed serum sample by utilizing a spectral analysis instrument to obtain an initial spectral signal; and denoising and baseline correction are carried out on the initial spectrum signals by using digital signal processing software, so as to obtain various spectrum signals.
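The denoising and baseline correction mentioned above can be illustrated with a minimal sketch. The text does not name the algorithms used by the digital signal processing software, so the moving-average filter and the straight-line baseline below are assumptions chosen for simplicity, and the function names and the synthetic signal are hypothetical.

```python
from statistics import mean


def moving_average(signal, window=5):
    """Smooth a signal with a centered moving average (window shrinks at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(mean(signal[lo:hi]))
    return smoothed


def subtract_linear_baseline(signal):
    """Subtract the straight line joining the first and last samples (assumed baseline model)."""
    n = len(signal)
    if n < 2:
        return list(signal)
    start, end = signal[0], signal[-1]
    slope = (end - start) / (n - 1)
    return [y - (start + slope * i) for i, y in enumerate(signal)]


# Synthetic example: a sloped baseline with a single spike at index 10.
raw = [0.1 * i + (5.0 if i == 10 else 0.0) for i in range(21)]
corrected = subtract_linear_baseline(moving_average(raw, window=3))
```

After correction the two end points sit at zero and the peak stands out from a nearly flat background. Real spectra would usually call for a more flexible baseline model than a single straight line.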
Preferably, the step of performing frequency conversion processing on each of the spectrum signals to obtain a peak characteristic frequency, a signal intensity and a peak shape width of each of the spectrum signals includes: converting each spectrum signal into the frequency domain by fast Fourier transform to obtain a frequency domain signal; performing peak detection on the frequency domain signal with spectrum analysis software to obtain local maximum points, and taking the local maximum points as peak characteristic frequencies; and calculating the signal intensity corresponding to the peak characteristic frequency by using an integral formula, wherein the integral formula is as follows:
$$I(f_p) = \int_{f_a}^{f_b} S(f)\,\mathrm{d}f$$
wherein $I(f_p)$ is the signal intensity at the peak characteristic frequency $f_p$, $S(f)$ is the spectral signal as a function of the frequency $f$, $f_a$ is the lower limit of the integration, and $f_b$ is the upper limit of the integration; calculating the full width at half maximum of the peak characteristic frequency by Lorentzian fitting to obtain the peak shape width, wherein the Lorentzian fitting calculates the full width at half maximum of the peak characteristic frequency with the following formula:
[formula not preserved in the extracted text]
wherein $\Delta f(f_p)$ is the full width at half maximum at the peak characteristic frequency $f_p$, $\gamma$ is the Lorentz model damping coefficient corresponding to the peak characteristic frequency, $\operatorname{Im}[\varepsilon(f_p)]$ is the imaginary part of the complex dielectric response function at the peak characteristic frequency, and $|\varepsilon(f_p)|$ is the amplitude of the complex dielectric response function.
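As an illustration of the three quantities described above, the following sketch finds local maxima, approximates the intensity integral with the trapezoidal rule, and estimates the full width at half maximum directly from the sampled curve by linear interpolation. This numerical estimate is a stand-in for the Lorentzian fitting formula, not a reproduction of it; the function names and the synthetic Lorentzian peak are hypothetical.

```python
import math


def find_local_maxima(y):
    """Indices whose value is strictly greater than both neighbours."""
    return [i for i in range(1, len(y) - 1) if y[i - 1] < y[i] > y[i + 1]]


def integrate_trapezoid(f, y, f_low, f_high):
    """Trapezoidal approximation of the integral of y over grid points in [f_low, f_high]."""
    pts = [(fi, yi) for fi, yi in zip(f, y) if f_low <= fi <= f_high]
    return sum((f2 - f1) * (y1 + y2) / 2 for (f1, y1), (f2, y2) in zip(pts, pts[1:]))


def _crossing(f, y, i, j, level):
    """Linearly interpolate the frequency where y crosses `level` between samples i and j."""
    if y[i] == y[j]:
        return f[i]
    return f[i] + (level - y[i]) * (f[j] - f[i]) / (y[j] - y[i])


def full_width_half_max(f, y, peak):
    """Width of the peak at half of its height, measured on the sampled curve."""
    half = y[peak] / 2.0
    left = peak
    while left > 0 and y[left] > half:
        left -= 1
    right = peak
    while right < len(y) - 1 and y[right] > half:
        right += 1
    return _crossing(f, y, right - 1, right, half) - _crossing(f, y, left, left + 1, half)


# Synthetic Lorentzian peak centered at 50 with half width 2 (so the true FWHM is 4).
freqs = [40 + k / 10 for k in range(201)]
spectrum = [1.0 / (1.0 + ((f - 50.0) / 2.0) ** 2) for f in freqs]

peaks = find_local_maxima(spectrum)
intensity = integrate_trapezoid(freqs, spectrum, 40.0, 60.0)
width = full_width_half_max(freqs, spectrum, peaks[0])
```

The integration limits (40 and 60 here) play the role of the lower and upper limits of the integral and would in practice be chosen around each detected peak.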
Preferably, the step of acquiring the main components of the peak characteristic frequency, performing cluster analysis on the main components to obtain grouping features, normalizing the signal intensity according to a signal processing algorithm to obtain a normalized signal intensity, and performing statistical analysis on the normalized signal intensity to obtain statistical features includes: decomposing the peak characteristic frequency by non-negative matrix factorization to obtain components related to biomarker activity, and taking the components related to biomarker activity as the main components; inputting the main components into a density-based spatial clustering algorithm to identify similar biomolecule patterns in the main components, and classifying the similar biomolecule patterns to obtain grouping features; standardizing the signal intensity by max-min normalization, adjusting each peak signal of the signal intensity data to the same scale to obtain the normalized signal intensity; and performing statistical analysis on the normalized signal intensity according to a statistical algorithm and a probability distribution model to obtain the statistical features.
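The normalization and clustering steps above can be sketched as follows. The density-based clustering is shown as a minimal one-dimensional DBSCAN, which is an assumption about the intended algorithm; the non-negative matrix factorization step is not covered, and the component values, `eps` and `min_samples` are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def dbscan_1d(points, eps, min_samples):
    """Minimal DBSCAN for 1-D points; returns one label per point, -1 marks noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, p in enumerate(points) if abs(p - points[i]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # core point: keep growing the cluster
    return labels


# Hypothetical component values forming two dense groups and one outlier.
components = [1.0, 1.1, 1.2, 5.0, 5.1, 5.2, 9.0]
groups = dbscan_1d(components, eps=0.15, min_samples=2)
scaled = min_max_normalize([2.0, 4.0, 6.0])
```

Unlike K-means, a density-based method does not need the number of groups in advance and can leave isolated points unassigned, which is presumably why it suits the search for similar biomolecule patterns.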
Preferably, the step of performing curve fitting on the peak shape width according to the main component and the normalized signal intensity to obtain a characteristic peak position and a characteristic peak width includes: identifying initial characteristic frequency points in the main constituent components according to the frequency domain signals, and determining the center frequency of each main constituent component; performing intensity calibration based on the normalized signal intensity to obtain the signal intensity corresponding to each peak value; performing cross analysis on all the center frequencies and the signal intensities corresponding to all the peaks, and extracting key characteristic frequency points; and inputting the key characteristic frequency points into a preset Voigt mixed model, carrying out iterative optimization by combining with signal intensity, and outputting characteristic peak positions and characteristic peak widths.
Preferably, the step of inputting the grouping features, the statistical features, the characteristic peak position and the characteristic peak width into a preset deep learning model and outputting a characteristic spectrum data set of the target marker includes: performing feature weight adjustment on the grouping features and the statistical features according to a preset network architecture to obtain feature vectors; ranking the characteristic peak positions and the characteristic peak widths by priority based on a preset feature importance to obtain a ranking result; combining the feature vectors and the ranking result to generate a feature mapping matrix; training on the feature mapping matrix with a deep learning training algorithm to obtain the weight distribution corresponding to each feature; and inputting the weight distributions corresponding to all the features into the preset deep learning model to obtain the final characteristic spectrum data set of the target marker.
Preferably, the step of comparing the characteristic spectrum data set with a preset biomarker database to obtain a comparison result, and inputting the comparison result into a preset artificial intelligence model to obtain the type of the target marker and the concentration of the target marker includes: comparing each feature in the characteristic spectrum data set with the features of the preset database according to a similarity analysis algorithm to obtain similarity scores; performing threshold analysis on the similarity scores to determine a matching threshold, and separating matched features from unmatched features according to the matching threshold; constructing a recognition result set from the matched features; assigning a weight to each feature in the recognition result set by correlation analysis; and inputting the weighted recognition result set into the preset artificial intelligence model for deep pattern recognition and classification, and outputting the type of the target marker and the concentration of the target marker.
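The similarity scoring and threshold matching described above can be sketched as follows. The text does not specify the similarity analysis algorithm, so cosine similarity is an assumption here, and the marker names, feature vectors and the threshold of 0.9 are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_features(sample, database, threshold):
    """Return (name, score) pairs whose score reaches the threshold, best match first."""
    scored = [(name, cosine_similarity(sample, ref)) for name, ref in database.items()]
    matched = [(name, score) for name, score in scored if score >= threshold]
    return sorted(matched, key=lambda item: item[1], reverse=True)


# Hypothetical reference profiles (normalized intensities in four spectral bands).
database = {
    "marker_A": [0.9, 0.1, 0.3, 0.0],
    "marker_B": [0.1, 0.8, 0.0, 0.4],
}
sample = [0.85, 0.15, 0.28, 0.02]
matches = match_features(sample, database, threshold=0.9)
```

The matched entries form the recognition result set; weighting by correlation analysis and the final classification are separate steps that this sketch does not cover.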
The application also provides a human body fluid spectrum analysis system based on artificial intelligence, which comprises: an acquisition unit, used for acquiring a serum sample of a tested person and collecting spectrum data of the serum sample to obtain a plurality of spectrum signals; an analysis unit, used for performing frequency conversion processing on each spectrum signal to obtain the peak characteristic frequency, signal intensity and peak shape width of each spectrum signal; a statistics unit, used for acquiring the main components of the peak characteristic frequency and performing cluster analysis on the main components to obtain grouping features, and for normalizing the signal intensity according to a signal processing algorithm to obtain a normalized signal intensity and performing statistical analysis on the normalized signal intensity to obtain statistical features; a fitting unit, used for performing curve fitting on the peak shape width according to the main components and the normalized signal intensity to obtain a characteristic peak position and a characteristic peak width; a calculation unit, used for inputting the grouping features, the statistical features, the characteristic peak position and the characteristic peak width into a preset deep learning model and outputting a characteristic spectrum data set of the target marker; and a result unit, used for comparing the characteristic spectrum data set with a preset biomarker database to obtain a comparison result, and inputting the comparison result into a preset artificial intelligence model to obtain the type of the target marker and the concentration of the target marker.
Compared with the prior art, the application has the following beneficial effects: high detection efficiency and high detection precision. Spectrum data of the serum sample are collected by spectral imaging technology to obtain a plurality of spectrum signals, and automatic collection and timely processing of the spectrum data significantly increase the sample analysis rate; the main components of the peak characteristic frequency are classified by cluster analysis, so that signals related to the target biomarker can be identified rapidly within a large amount of spectrum information, interference from irrelevant signals is reduced, and the accuracy of identifying target substances is enhanced; meanwhile, the signal intensity is normalized and statistically analyzed according to a signal processing algorithm, where normalization makes spectrum data from different serum samples comparable and statistical analysis provides additional parameters describing the signal characteristics, which eliminates the influence of variation in the data and improves the accuracy of the detection result; curve fitting is used to analyze fine peak characteristics and to extract a quantifiable characteristic peak position and characteristic peak width, which improves the identification precision of characteristic peaks; the deep learning model automatically learns and extracts key features related to the target substance from complex spectrum data, which improves analysis efficiency; and the extracted characteristic spectrum data set is intelligently matched against a preset biomarker database by an artificial intelligence model, which increases the detection rate and reduces the chance of misjudgment, thereby alleviating the problems of low detection efficiency and poor detection precision in human body fluid analysis.
Drawings
In order to illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.
The structures, proportions and sizes shown in the drawings are provided only to accompany the disclosure and are not intended to limit the scope of the invention; any structural modification, change of proportion or adjustment of size that does not affect the effect or objective of the invention falls within the spirit and scope of the invention.
FIG. 1 is a schematic flow chart of a human body fluid spectrum analysis method based on artificial intelligence provided by an embodiment of the invention;
FIG. 2 is a schematic block diagram of a human body fluid spectrum analysis system based on artificial intelligence according to an embodiment of the present invention.
Reference numerals illustrate:
10. Human body fluid spectrum analysis system based on artificial intelligence; 11. an acquisition unit; 12. an analysis unit; 13. a statistics unit; 14. fitting unit; 15. a calculation unit; 16. a result unit; 20. an electronic device; 21. a memory; 22. a processor.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The flow diagrams depicted in the figures are merely illustrative and not necessarily all of the elements and operations/steps are included or performed in the order described. For example, some operations/steps may be further divided, combined, or partially combined, so that the order of actual execution may be changed according to actual situations.
It is also to be understood that the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
The technical scheme of the invention is further described below by the specific embodiments with reference to the accompanying drawings.
Example 1:
As shown in fig. 1, the human body fluid spectrum analysis method based on artificial intelligence provided by the embodiment of the application comprises steps S100 to S600.
Step S100, acquiring a serum sample of the tested person, and collecting spectral data of the serum sample to obtain a plurality of spectral signals.
In this step, the serum sample is scanned non-invasively by spectroscopic imaging techniques, thereby capturing spectroscopic data that adequately reflect the composition and structural information of the biomolecules; specifically, an instrument such as an infrared spectrometer or a mass spectrometer is used with different wavelength ranges and resolutions to record how the serum sample absorbs or emits light, and spectral signals representing different chemical substances are finally obtained.
For example, a Fourier transform infrared (FTIR) spectrometer may be used to scan a serum sample over a particular wavenumber band, such as 4000-400 cm⁻¹, observe the characteristic absorption peaks of molecules such as proteins and lipids, and store the resulting signals as spectrograms for subsequent analysis.
Step S200, performing frequency conversion processing on each spectrum signal to obtain the peak characteristic frequency, the signal intensity and the peak shape width of each spectrum signal.
In this step, the spectral signal is converted from the time domain to the frequency domain by Fourier transformation, so that different characteristic peaks in the spectrum are easier to identify and distinguish; specifically, the spectral signal data are subjected to digital signal processing with calculation software to obtain a spectrogram, the vibration or rotation modes caused by different chemical structures are analyzed from the spectrogram, the peak characteristic frequencies are identified, and the signal intensity and peak width of the corresponding peaks are measured.
For example, a Gaussian function may be fitted to a characteristic absorption peak, with its center position taken as the peak characteristic frequency, the height of the peak measured as the signal intensity, and the full width at half maximum (FWHM) calculated as the peak shape width index.
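The conversion from the time domain to the frequency domain in step S200 can be illustrated with a small discrete Fourier transform. The sketch below uses a naive O(n²) transform so that it needs only the standard library; a fast Fourier transform routine yields the same magnitudes more efficiently. The sampling rate and the two-tone test signal are hypothetical.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of the discrete Fourier transform for bins 0..n//2 (naive O(n^2) version)."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        magnitudes.append(abs(acc))
    return magnitudes


# Hypothetical signal: 64 samples at 64 samples per second, tones at 5 Hz and 12 Hz.
sample_rate = 64
signal = [
    math.sin(2 * math.pi * 5 * t / sample_rate) + 0.5 * math.sin(2 * math.pi * 12 * t / sample_rate)
    for t in range(64)
]

mags = dft_magnitudes(signal)
dominant_bin = max(range(1, len(mags)), key=mags.__getitem__)  # skip the DC bin
dominant_frequency = dominant_bin * sample_rate / len(signal)
```

With 64 samples at 64 samples per second each bin is 1 Hz wide, so the strongest bin corresponds directly to the stronger of the two tones.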
Step S300, obtaining main components of peak characteristic frequency, and performing cluster analysis on the main components to obtain grouping characteristics; and carrying out normalization processing on the signal intensity according to a signal processing algorithm to obtain normalized signal intensity, and carrying out statistical analysis on the normalized signal intensity to obtain statistical characteristics.
In the step, main variables in the spectrum signals are extracted through a multivariate statistical analysis method such as Principal Component Analysis (PCA), the data structure is simplified, and redundant information is eliminated; specifically, the extracted principal components are classified by using a clustering algorithm such as K-means to find potential biomarker modes, the signal intensity of each spectrum signal is normalized, and the distribution characteristics are analyzed by descriptive statistics, variance analysis and other methods to summarize statistical characteristics.
For example, the first few principal components that explain the largest share of the variance are determined in the PCA, and the scores of these principal components are then used as inputs to K-means clustering, thereby dividing the spectral signals into several groups and summarizing characteristic spectral patterns; for the signal intensity, normalization may be achieved by dividing by the maximum signal intensity, after which statistics such as the mean, standard deviation and skewness are calculated for each normalized signal.
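The clustering part of this example can be sketched as follows. The sketch assumes the principal component scores have already been computed (PCA itself is not shown), uses made-up scores, and replaces the usual random initialization with a deterministic farthest-point choice so that the result is reproducible; none of these details come from the embodiment.

```python
import math

def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means on a list of equal-length tuples.

    Initial centroids: the first point, then repeatedly the point farthest
    from the centroids chosen so far (a deterministic simplification).
    """
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[j])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

# Hypothetical scores on the first two principal components for six spectra
scores = [(-2.1, 0.3), (-1.9, -0.2), (-2.0, 0.1), (1.8, 0.4), (2.2, -0.1), (2.0, 0.0)]
labels, centroids = kmeans(scores, k=2)
```

The returned labels play the role of the grouping features: spectra with the same label share a similar pattern in principal component space.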
And step S400, performing curve fitting on the peak shape width according to the main components and the normalized signal intensity to obtain the characteristic peak position and the characteristic peak width.
In this step, the width and position of each characteristic peak are quantified accurately by curve fitting, in combination with the main components and the signal intensity information; specifically, the shape of each peak is fitted with a curve model such as a Gaussian or Lorentzian function to obtain the exact position (characteristic peak position) and peak shape width (characteristic peak width) of each characteristic peak.
For example, for a characteristic absorption peak, nonlinear least square fitting can be performed, and an optimal gaussian function curve is obtained by adjusting fitting parameters, so that a specific numerical value of a characteristic peak position in a spectrogram is accurately determined, and a characteristic peak width value of the peak is obtained.
And S500, inputting the grouping characteristics, the statistical characteristics, the characteristic peak positions and the characteristic peak widths into a preset deep learning model, and outputting a characteristic spectrum data set of the target marker.
In this step, the deep learning model is trained on the grouping features, statistical features, characteristic peak positions and characteristic peak widths together, so that the model acquires efficient feature extraction and classification capability; specifically, the input spectral feature dataset is processed by a pre-trained deep learning model such as a Convolutional Neural Network (CNN), the discriminative features of the target marker are automatically extracted through a multi-layer network structure, and the characteristic spectrum dataset is finally output.
For example, a CNN including a plurality of convolution layers and pooling layers may be designed, normalized spectral data obtained by the foregoing steps may be input, so that the model automatically learns to distinguish spectral features of different target markers, and performs classification prediction according to the learned features, so as to output a feature spectral dataset of a desired target marker.
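To make the terms "convolution layer" and "pooling layer" concrete, the following sketch runs one forward pass of a single one-dimensional convolution, a ReLU activation and a max-pooling step. It is only an illustration: the kernel weights are hand-picked rather than trained, the input spectrum is made up, and a practical CNN would be built and trained with a deep learning framework.

```python
def conv1d(signal, kernel, bias=0.0):
    """Valid (no padding) 1-D cross-correlation, as computed by a convolution layer."""
    n = len(signal) - len(kernel) + 1
    return [sum(signal[i + j] * kernel[j] for j in range(len(kernel))) + bias
            for i in range(n)]

def relu(values):
    """Rectified linear activation: negative responses are set to zero."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

# Made-up normalized spectrum with one peak
spectrum = [0.0, 0.1, 0.9, 1.0, 0.8, 0.1, 0.0, 0.0]
edge_kernel = [-1.0, 0.0, 1.0]   # hypothetical, untrained weights (responds to rising edges)
feature_map = max_pool1d(relu(conv1d(spectrum, edge_kernel)), 2)
```

Here the pooled feature map is non-zero only where the spectrum rises toward the peak, which is the kind of local pattern that stacked convolution layers learn to detect.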
And S600, comparing the characteristic spectrum data set with a preset biomarker database to obtain a comparison result, and inputting the comparison result into a preset artificial intelligent model to obtain the type of the target marker and the concentration of the target marker.
In this step, the characteristic spectral dataset is matched against the spectral features in a database of known biomarkers in order to quickly identify the type and measure the concentration of a target marker, wherein the target marker is a marker of a disease; specifically, artificial intelligence algorithms such as curve matching and pattern recognition are used to compare the dataset with the verified biomarker spectral data stored in the database to obtain matching results with high similarity, and the concentration of the target marker is estimated through further processing by the artificial intelligence model.
For example, a classifier such as a Support Vector Machine (SVM) may be used to compare the extracted target marker spectral dataset with the existing biomarker spectral data in the database, quantify the degree of spectral match and identify the specific biomarker type; on this basis, advanced algorithms such as ensemble learning can be used to carry out qualitative and quantitative analysis of the biomarker concentration, so as to achieve accurate biomarker measurement.
In this embodiment, a serum sample of a tested person is obtained, and then a plurality of spectrum signals are obtained by collecting spectrum data of the serum sample, and then the spectrum signals are subjected to frequency conversion processing to determine a peak characteristic frequency, a signal intensity and a peak shape width; performing cluster analysis by using main components of the peak characteristic frequency to obtain grouping characteristics, normalizing the signal intensity by a signal processing algorithm, and performing statistical analysis to obtain statistical characteristics; performing curve fitting on the peak shape width according to the main composition components and the normalized signal intensity to obtain a characteristic peak position and a characteristic peak width; inputting the obtained grouping features, statistical features, feature peak positions and feature peak widths into a preset deep learning model, and outputting a feature spectrum data set of the target marker; and the characteristic spectrum data set is compared with a preset biomarker database to obtain comparison results, and the results are input into a preset artificial intelligence model to obtain the type and concentration of the target marker.
Spectral data of the serum sample are collected by an advanced spectral imaging technique to obtain a plurality of spectral signals, and the automatic acquisition and timely processing of the spectral data markedly increase the sample analysis rate. The main components of the peak characteristic frequency are classified by cluster analysis, so that signals related to the target biomarker are rapidly identified within a large amount of spectral information, interference from irrelevant signals is reduced, and the accuracy of identifying the target substance is enhanced. The signal intensity is normalized and statistically analyzed by a signal processing algorithm, so that spectral data from different serum samples become comparable; the statistical analysis also provides additional parameters describing the signal characteristics, which reduces the influence of variation in the data and improves the accuracy of the detection result. Fine analysis of the peak characteristics is carried out by curve fitting, and quantifiable characteristic peak positions and characteristic peak widths are extracted, which improves the precision of characteristic peak identification and provides more accurate input data for the subsequent deep learning model. Key features related to the target substance are automatically learned and extracted from complex spectral data by the deep learning model, which reduces the manual feature selection and analysis steps of the traditional method and improves analysis efficiency. The extracted characteristic spectrum data set is intelligently matched with a preset biomarker database by the artificial intelligence model, which increases the detection rate and reduces the probability of misjudgment, thereby solving the problems of low detection efficiency and poor detection precision in human body fluid analysis.
Example 2:
In step S100, a blood sample is collected for a subject based on a serum sample collection tool, so as to obtain a serum sample of the subject; preprocessing a serum sample, and acquiring spectral data of the preprocessed serum sample by utilizing a spectral analysis instrument to obtain an initial spectral signal; and denoising and baseline correction are carried out on the initial spectrum signals by using digital signal processing software, so as to obtain various spectrum signals.
The non-destructive serum sample collection is carried out by applying specific blood sampling equipment and technical rules, and blood cells and plasma in the sample are separated by physical methods such as centrifugation, so as to obtain a clear serum sample; specifically, different components in a blood sample are separated by using a High Performance Liquid Chromatography (HPLC) instrument or a centrifuge, then the separated serum sample is subjected to spectrum scanning by using an infrared spectrometer or a Raman spectrometer, interference noise is removed by a software algorithm, and a baseline is corrected, so that the reliability and the accuracy of signals are improved.
For example, after the blood sample is collected, it is centrifuged at 3000 g for 10 minutes to separate serum that does not contain blood cells; the serum sample is scanned with an FTIR spectrometer over the spectral range of 4000-400 cm⁻¹, random noise is removed by software smoothing filtering, and the baseline is corrected by polynomial fitting, so that the spectral signal is clear and has a high signal-to-noise ratio.
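The denoising and baseline correction described above can be sketched as follows. The sketch makes simplifying assumptions that are not stated in the embodiment: smoothing is a centered moving average, the baseline is a first-degree polynomial (a straight line) fitted by least squares, and the peak-free points used for the baseline fit are supplied by the caller. All names and data are hypothetical.

```python
def moving_average(values, window=5):
    """Centered moving average; the window shrinks near the ends of the list."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def fit_line(xs, ys):
    """Closed-form least-squares line y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def subtract_linear_baseline(xs, ys, baseline_idx):
    """Fit a line to peak-free points only, then subtract it everywhere."""
    slope, intercept = fit_line([xs[i] for i in baseline_idx],
                                [ys[i] for i in baseline_idx])
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Made-up spectrum: sloping baseline plus a small triangular peak at index 5
xs = [400.0 + 10.0 * k for k in range(11)]
peak = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
raw = [0.001 * x + 0.5 + p for x, p in zip(xs, peak)]
smoothed = moving_average(raw, window=3)
corrected = subtract_linear_baseline(xs, smoothed, baseline_idx=[0, 1, 2, 8, 9, 10])
```

A higher-degree polynomial or an iterative baseline estimate could be substituted for `fit_line` without changing the rest of the flow.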
In step S200, each spectrum signal is converted into a signal in a frequency domain by using a fast fourier transform, so as to obtain a signal in the frequency domain; carrying out peak detection on the frequency domain signal according to spectrum analysis software to obtain a local maximum point, and taking the local maximum point as peak characteristic frequency; and calculating the signal intensity corresponding to the peak characteristic frequency by using an integral formula, wherein the integral formula is as follows:
$$I(f_0) = \int_{f_a}^{f_b} S(f)\,\mathrm{d}f$$
wherein $I(f_0)$ is the signal intensity at the peak characteristic frequency $f_0$, $S(f)$ is the spectral signal as a function of the frequency $f$, $f_a$ is the lower limit of the integration, and $f_b$ is the upper limit of the integration; the full width at half maximum of the peak characteristic frequency is calculated by Lorentzian fitting to obtain the peak shape width, and the formula for calculating the full width at half maximum of the peak characteristic frequency by Lorentzian fitting is as follows:
[Equation not reproduced in the source text.]
wherein the quantities in the formula are: the full width at half maximum at the peak characteristic frequency; the Lorentz model damping coefficient corresponding to the peak characteristic frequency; the imaginary part of the complex dielectric response function at the peak characteristic frequency; and the amplitude of the complex dielectric response function.
By implementing an efficient numerical algorithm and spectrum analysis techniques, the spectrum signal in the time domain is accurately converted into the frequency domain, and peak positioning and feature extraction are performed; specifically, the frequency domain signal is rapidly obtained with the FFT algorithm, local extreme points are identified and marked by means of spectral processing software such as Origin or the signal processing toolbox in MATLAB, the intensity of each characteristic peak is quantitatively calculated by numerical integration, and the formula of the Lorentz model is then applied to estimate the energy dissipation of the system, so as to obtain the width parameter of each characteristic peak.
For example, the collected infrared spectrum data is subjected to FFT to obtain a frequency domain signal, and the peak detection function in Origin is then used to automatically search for and mark all significant peak points as characteristic frequencies. For each peak point, numerical integration is applied over the range under the peak, from the lower integration limit to the upper integration limit, to calculate the signal intensity of the peak. Next, Lorentzian fitting is performed on the accumulated peak data and the full width at half maximum is calculated, which provides a key parameter for the subsequent molecular dynamics analysis.
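The peak detection, integration and width measurement in this example can be sketched as follows. The sketch makes several assumptions that are not part of the embodiment: it starts from a signal that is already in the frequency domain (the FFT is not shown), it assumes a zero baseline, it uses the trapezoidal rule for the integral, and it measures the full width at half maximum directly from the half-height crossings instead of performing a Lorentzian fit. The test peak is a made-up Lorentzian.

```python
import math

def local_maxima(y, min_height):
    """Indices of samples that exceed their neighbours and a height threshold."""
    return [i for i in range(1, len(y) - 1)
            if y[i] > y[i - 1] and y[i] >= y[i + 1] and y[i] >= min_height]

def trapezoid(x, y, lo, hi):
    """Trapezoidal integral of y over the index range lo..hi (inclusive)."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(lo, hi))

def fwhm(x, y, peak):
    """Width at half the peak height, with linear interpolation at both crossings.

    Assumes a zero baseline and that the signal drops below half height
    on both sides of the peak inside the sampled range.
    """
    half = 0.5 * y[peak]
    left = peak
    while left > 0 and y[left] > half:
        left -= 1
    right = peak
    while right < len(y) - 1 and y[right] > half:
        right += 1

    def crossing(i, j):  # x at which the segment between samples i and j reaches half height
        return x[i] + (half - y[i]) * (x[j] - x[i]) / (y[j] - y[i])

    return crossing(right - 1, right) - crossing(left, left + 1)

# Made-up Lorentzian peak: center 1650, half width 8 (so the FWHM is 16)
x0, gamma = 1650.0, 8.0
x = [1600.0 + 0.5 * k for k in range(201)]
y = [gamma ** 2 / ((v - x0) ** 2 + gamma ** 2) for v in x]
peaks = local_maxima(y, min_height=0.5)
p = peaks[0]
intensity = trapezoid(x, y, p - 40, p + 40)   # integrate from x0 - 20 to x0 + 20
width = fwhm(x, y, p)
```

For a Lorentzian of unit height the exact integral over this range is `2 * gamma * atan(20 / gamma)`, which gives a simple check on the numerical integration.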
In step S300, non-negative matrix factorization is applied to the peak characteristic frequency to extract the components related to biomarker activity, and the components related to biomarker activity are taken as the main components; the main components are input into a density-based spatial clustering algorithm so as to identify similar biomolecule patterns in the main components, and the similar biomolecule patterns are classified to obtain grouping features; the signal intensity is standardized by min-max normalization, and each peak signal of the signal intensity data is adjusted to the same scale to obtain the normalized signal intensity; and the normalized signal intensity is statistically analyzed according to a statistical algorithm and a probability distribution model to obtain statistical features.
The spectral data are analyzed by a matrix analysis method and a clustering technique, and the results are refined using statistical principles; specifically, non-negative matrix factorization (NMF) is applied to decompose the main components related to the biomarker response from the frequency domain signals, a density-based clustering algorithm such as DBSCAN is used to identify biomolecule patterns, the spectral intensity data are normalized by min-max normalization so that different spectral signals are comparable, and statistical models such as the normal distribution are further applied to analyze the normalized data in depth, so as to obtain the statistical features of each characteristic peak.
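The density-based clustering mentioned here can be sketched with a minimal DBSCAN. The sketch assumes the NMF components have already been reduced to low-dimensional feature vectors (the factorization is not shown); the points, `eps` and `min_pts` values are made up for illustration.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN; returns one label per point, with -1 marking noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # All points within eps of point i (the point itself is included)
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE                 # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster           # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:            # j is a core point: keep expanding
                queue.extend(nb)
    return labels

# Hypothetical 2-D feature vectors: two dense groups and one isolated point
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.05),
          (5.0, 5.0), (5.1, 4.9), (4.95, 5.1),
          (9.0, 0.0)]
cluster_labels = dbscan(points, eps=0.5, min_pts=3)
```

Unlike K-means, the number of clusters is not fixed in advance, and isolated points are reported as noise instead of being forced into a group.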
For example, NMF is used to decompose the overall spectral signal matrix and determine the several spectral dimensions that interact most significantly with the biomarkers, and the peak feature vectors of these dimensions are then input into the DBSCAN clustering algorithm to identify different biomarker clusters. Min-max normalization is applied to the signal intensity data so that the lower and upper bounds of the signal intensity are mapped to the range between 0 and 1 and the data can be compared on the same scale. Statistical measures such as skewness and kurtosis are then used to characterize the signal distribution of the characteristic peaks, so as to better identify potential biomarker spectra.
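The normalization and distribution statistics in this example can be sketched as follows. The intensities are made-up values; population (not sample) moments are used and the kurtosis is reported as excess kurtosis, which are choices of this sketch and not requirements of the embodiment.

```python
import math

def min_max_normalize(values):
    """Map values linearly so that the minimum becomes 0 and the maximum becomes 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]     # constant input: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]

def moments(values):
    """Mean, population standard deviation, skewness and excess kurtosis.

    Assumes the values are not all identical (non-zero variance).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return mean, math.sqrt(m2), m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Made-up peak intensities from five spectra
intensities = [12.0, 15.0, 18.0, 21.0, 24.0]
normalized = min_max_normalize(intensities)
mean, std, skewness, kurtosis = moments(normalized)
```

For these evenly spaced values the skewness is zero and the excess kurtosis is negative, which is what a flat, symmetric distribution should produce.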
In step S400, initial characteristic frequency points in the main constituent components are identified according to the frequency domain signal, and the center frequency of each main constituent component is determined; performing intensity calibration based on the normalized signal intensity to obtain the signal intensity corresponding to each peak value; performing cross analysis on all center frequencies and signal intensities corresponding to all peaks, and extracting key characteristic frequency points; and inputting the key characteristic frequency points into a preset Voigt mixed model, carrying out iterative optimization by combining the signal intensity, and outputting characteristic peak positions and characteristic peak widths.
Molecular vibration characteristics in the body fluid are characterized accurately by combining peak shape analysis with intensity quantification, and the relationships between the spectral features are evaluated; specifically, the center frequency points of all components in the frequency domain signal are collected, and the intensity of each peak is evaluated in combination with the normalized signal intensity. The mutual influence of different characteristic peaks is cross-checked by multiple regression analysis or weight analysis, and the key frequency points that are decisive for biomarker identification are determined. A Voigt model (combining Gaussian and Lorentzian line shapes) is then used to perform high-precision curve fitting for each key frequency point and its corresponding intensity, and the peak shape model parameters are optimized through iterative calculation to finally obtain accurate characteristic peak position and peak width information.
For example, spectral analysis software is used to identify key characteristic peaks in the extracted NMF component frequency signals, and a multivariate data matrix is constructed from the normalized intensity and center frequency data. Principal component regression or partial least squares regression is carried out on this matrix to determine the characteristic frequency points that best represent the spectral structure of the biomarker. An appropriate Voigt curve fitting algorithm (e.g., the Levenberg-Marquardt algorithm) is selected, and the Voigt model parameters are adjusted iteratively until the best-fit characteristic peak shape and width parameters stably reproduce the spectral features of the marker.
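The line shape being fitted can be illustrated with a short sketch. It does not implement the iterative Levenberg-Marquardt fit; it only evaluates a pseudo-Voigt profile (a weighted sum of a Lorentzian and a Gaussian sharing one width, a common approximation to the true Voigt convolution) and the widely used Olivero-Longbothum approximation for the Voigt full width at half maximum. Whether the "Voigt mixed model" of the embodiment takes this form is an assumption.

```python
import math

def voigt_fwhm(fwhm_gauss, fwhm_lorentz):
    """Olivero-Longbothum approximation to the FWHM of a Voigt profile."""
    return 0.5346 * fwhm_lorentz + math.sqrt(0.2166 * fwhm_lorentz ** 2 + fwhm_gauss ** 2)

def pseudo_voigt(x, center, fwhm, eta, height=1.0):
    """Weighted sum of a Lorentzian (weight eta) and a Gaussian sharing one FWHM."""
    t = (x - center) / fwhm
    lorentz = 1.0 / (1.0 + 4.0 * t * t)               # equals 0.5 at t = +/-0.5
    gauss = math.exp(-4.0 * math.log(2.0) * t * t)    # equals 0.5 at t = +/-0.5
    return height * (eta * lorentz + (1.0 - eta) * gauss)

# Made-up peak: center 1650, FWHM 10, 30 % Lorentzian character
half_height_value = pseudo_voigt(1655.0, center=1650.0, fwhm=10.0, eta=0.3)
pure_gauss_width = voigt_fwhm(10.0, 0.0)
pure_lorentz_width = voigt_fwhm(0.0, 10.0)
```

In a fit, `center`, `fwhm`, `eta` and `height` would be the parameters adjusted by the optimizer; the characteristic peak position and characteristic peak width are then read from the optimized `center` and `fwhm`.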
In step S500, the feature weights of the grouping features and the statistical features are adjusted according to a preset network architecture to obtain feature vectors; the characteristic peak positions and characteristic peak widths are prioritized based on preset feature importance to obtain a ranking result; a comprehensive operation is performed on the feature vectors and the ranking result to generate a feature mapping matrix; the feature mapping matrix is learned and trained using a deep learning training algorithm to obtain the weight distribution corresponding to each feature; and the weight distributions corresponding to all the features are input into a preset deep learning model to obtain the final characteristic spectrum data set of the target marker.
The extracted high-dimensional features are weighted intelligently by a machine learning algorithm, and the internal regularities of the spectral data are further learned by a deep learning network, thereby improving the accuracy and efficiency of target marker feature recognition; specifically, different weights are assigned to the grouping and statistical features by a neural network or a machine learning model such as a decision tree to form feature vectors, and the importance of the characteristic peak positions and characteristic peak widths is determined and ranked by a feature selection algorithm such as recursive feature elimination. The feature weights and ranking information are combined by matrix operations to generate a feature mapping matrix, the matrix is used to train a deep learning network such as a Convolutional Neural Network (CNN) or a recurrent neural network (RNN) that learns the contribution of each feature to target marker identification, and the weight information is finally integrated to identify and output the characteristic data set of the target marker spectrum.
For example, the obtained grouping features and statistical features are weighted in a preset network by a random forest algorithm to form input feature vectors; the priorities of the spectral features are scored and ranked by a feature selection algorithm based on Gini impurity or information gain, and the results are combined to form the feature mapping matrix. The feature mapping matrix is trained repeatedly through the configured deep convolutional network architecture, and the network weights are optimized to obtain the importance of each feature in target biomarker detection, so as to ensure that the deep learning model reliably outputs the characteristic data of the target marker spectrum.
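Ranking features by Gini impurity can be sketched as follows. The sketch scores each feature by the largest impurity decrease obtainable from a single threshold split, which is the quantity a decision tree evaluates at one node; it is not a random forest, and the feature names, values and class labels are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_gini_decrease(feature, labels):
    """Largest impurity decrease achievable by one threshold split on a feature."""
    parent, n = gini(labels), len(labels)
    best = 0.0
    for threshold in sorted(set(feature))[:-1]:   # the largest value cannot split
        left = [lab for v, lab in zip(feature, labels) if v <= threshold]
        right = [lab for v, lab in zip(feature, labels) if v > threshold]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best

def rank_features(columns, labels):
    """Return feature names sorted by descending score, plus the scores."""
    scores = {name: best_gini_decrease(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True), scores

# Made-up data: six spectra from two marker classes, two candidate features
labels = ["A", "A", "A", "B", "B", "B"]
columns = {
    "peak_position": [1650.1, 1650.3, 1649.9, 1655.2, 1655.0, 1654.8],
    "peak_width": [8.0, 12.0, 9.5, 8.5, 11.0, 10.0],
}
order, scores = rank_features(columns, labels)
```

In this made-up data the peak position separates the two classes perfectly, so it receives the maximum possible decrease of 0.5 and is ranked first.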
In step S600, each feature in the feature spectrum dataset is compared with the features of the preset database according to the similarity analysis algorithm, so as to obtain a similarity score; carrying out threshold analysis on the similarity score, determining a matching threshold, and distinguishing the matched features and the unmatched features of the similarity score according to the matching threshold; constructing a recognition result set according to the characteristics of similarity score matching; assigning a weight to each feature in the recognition result set through correlation analysis; and inputting the weighted recognition result set into a preset artificial intelligent model to perform deep pattern recognition and category classification, and outputting the type of the target marker and the concentration of the target marker.
By combining similarity measurement with threshold judgment, the method ensures that only highly matched spectral features are used in the subsequent identification steps, thereby improving the accuracy of identifying specific biomarkers; specifically, a similarity score is calculated by comparing each spectral feature with the features of known biomarkers in the reference database using cosine similarity, Euclidean distance, the Jaccard index or a similar measure. Based on these scores, an appropriate discrimination threshold is set, features with scores above the threshold are treated as matching features and placed in the recognition result set, and these features are subsequently weighted by a correlation coefficient or another weighting technique. The weighted results are passed to a deep learning model such as a multi-layer perceptron (MLP) for further pattern recognition and classification, so as to accurately output the type and concentration of the target marker.
For example, the peak position and peak shape information in the extracted characteristic spectrum data set is compared with the existing biomarker database using a similarity analysis function library in MATLAB, a similarity score between each peak and the various known markers is obtained, and the set of matched spectral features is determined according to a selected score threshold. Pearson or Spearman correlation analysis is used to determine the weights of the matched features, the weighted features are input into an artificial intelligence model such as logistic regression or a neural network for pattern learning and classification, and the target marker type label and the estimated concentration are output, thereby achieving high-precision biomarker detection.
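The similarity scoring and threshold step can be sketched as follows. Cosine similarity is used as the measure; the database entries, the query vector and the threshold of 0.9 are made-up values, and the feature vectors are assumed to have been normalized to a comparable scale beforehand. The subsequent weighting and classification stages are not shown.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def match_against_database(feature, database, threshold):
    """Return (name, score) pairs whose similarity meets the threshold, best first."""
    scored = [(name, cosine_similarity(feature, ref)) for name, ref in database.items()]
    return sorted([s for s in scored if s[1] >= threshold],
                  key=lambda s: s[1], reverse=True)

# Hypothetical reference features for two known markers
database = {
    "marker_A": [1.0, 0.0, 0.5],
    "marker_B": [0.0, 1.0, 0.2],
}
query = [0.9, 0.1, 0.5]          # made-up feature vector from a measured spectrum
matches = match_against_database(query, database, threshold=0.9)
```

Only `marker_A` passes the threshold here, so it alone would enter the recognition result set that is later weighted and passed to the classification model.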
In the embodiment, complex serum sample spectrum information is scientifically converted into accurately identified biomarker data by combining various spectrum processing technologies and algorithms, so that the whole flow optimization from sample processing to data analysis is realized. The method not only improves the accuracy and efficiency of biomarker detection, but also enhances the reliability and wide applicability of the technology in practical application.
Example 3:
As shown in Fig. 2, the application further provides a human body fluid spectrum analysis system 10 based on artificial intelligence, which comprises an acquisition unit 11, an analysis unit 12, a statistics unit 13, a fitting unit 14, a calculation unit 15 and a result unit 16.
The acquiring unit 11 is mainly used for acquiring a serum sample of a tested person, and acquiring spectrum data of the serum sample to obtain various spectrum signals.
The analysis unit 12 is mainly configured to perform frequency conversion processing on each spectrum signal, so as to obtain a peak characteristic frequency, a signal intensity and a peak shape width of each spectrum signal.
The statistics unit 13 is mainly used for acquiring main components of peak characteristic frequency, and performing cluster analysis on the main components to obtain grouping characteristics; and carrying out normalization processing on the signal intensity according to a signal processing algorithm to obtain normalized signal intensity, and carrying out statistical analysis on the normalized signal intensity to obtain statistical characteristics.
The fitting unit 14 is mainly used for performing curve fitting on the peak shape width according to the main components and the normalized signal intensity to obtain the characteristic peak position and the characteristic peak width.
The computing unit 15 is mainly used for inputting the grouping feature, the statistical feature, the feature peak position and the feature peak width into a preset deep learning model and outputting a feature spectrum data set of the target marker.
The result unit 16 is mainly used for comparing the characteristic spectrum data set with a preset biomarker database to obtain a comparison result, and inputting the comparison result into a preset artificial intelligent model to obtain the type of the target marker and the concentration of the target marker.
In the present embodiment, collection of high quality spectral data is ensured by the acquisition unit 11, providing a reliable starting point for subsequent analysis. The analysis unit 12 obtains key spectral features through efficient frequency transformation processing, which lays a foundation for identifying specific biomarkers. The statistical unit 13 performs a structured simplification and a numerical treatment on the complex spectral data, providing a clear group of biomolecular data. The fitting unit 14 further refines the spectral features and extracts accurate peak information by fitting techniques. The computing unit 15 processes and interprets the spectral data using a deep learning algorithm, enhancing the level of intelligence and automation of the analysis. Finally, the result unit 16 matches the extracted features with the biomarker database using the artificial intelligence model, and efficiently outputs the category of the target marker and its concentration information. The whole system improves the accuracy and efficiency of detection, and solves the problems of low detection efficiency and poor detection precision in human body fluid analysis.
It should be noted that, for convenience and brevity of description, the specific working process of the above-described system and each unit may refer to the corresponding process in the foregoing embodiment of the human body fluid spectrum analysis method based on artificial intelligence, which is not described herein again.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (6)
1. A human body fluid spectral analysis method based on artificial intelligence, the method for non-diagnostic purposes, comprising:
Acquiring a serum sample of a tested person, and collecting spectrum data of the serum sample to obtain various spectrum signals;
Converting each spectrum signal into a signal in a frequency domain by utilizing fast Fourier transform to obtain a signal in the frequency domain; carrying out peak detection on the frequency domain signal according to spectrum analysis software to obtain a local maximum point, and taking the local maximum point as peak characteristic frequency; and calculating the signal intensity corresponding to the peak characteristic frequency by using an integral formula, wherein the integral formula is as follows:
$$I(f_0) = \int_{f_a}^{f_b} S(f)\,\mathrm{d}f$$
wherein $I(f_0)$ is the signal intensity at the peak characteristic frequency $f_0$, $S(f)$ is the spectral signal as a function of the frequency $f$, $f_a$ is the lower limit of the integration, and $f_b$ is the upper limit of the integration; calculating the full width at half maximum of the peak characteristic frequency according to the Lorentzian fitting calculation to obtain a peak shape width, wherein the formula for calculating the full width at half maximum of the peak characteristic frequency by the Lorentzian fitting calculation is as follows:
[Equation not reproduced in the source text.]
wherein the quantities in the formula are: the full width at half maximum at the peak characteristic frequency; the Lorentz model damping coefficient corresponding to the peak characteristic frequency; the imaginary part of the complex dielectric response function at the peak characteristic frequency; and the amplitude of the complex dielectric response function;
Acquiring main components of the peak characteristic frequency, and performing cluster analysis on the main components to obtain grouping characteristics; normalizing the signal intensity according to a signal processing algorithm to obtain normalized signal intensity, and carrying out statistical analysis on the normalized signal intensity to obtain statistical characteristics;
performing curve fitting on the peak shape width according to the main composition components and the normalized signal intensity to obtain a characteristic peak position and a characteristic peak width;
Inputting the grouping features, the statistical features, the feature peak positions and the feature peak widths into a preset deep learning model, and outputting a feature spectrum data set of a target marker;
Comparing each feature in the feature spectrum data set with the features of a preset database according to a similarity analysis algorithm to obtain a similarity score; performing threshold analysis on the similarity score, determining a matching threshold, and distinguishing the matched features and the unmatched features of the similarity score according to the matching threshold; constructing a recognition result set according to the characteristics matched with the similarity scores; assigning a weight to each feature in the set of recognition results by correlation analysis; and inputting the weighted recognition result set into a preset artificial intelligent model to perform deep pattern recognition and category classification, and outputting the type of the target marker and the concentration of the target marker.
2. The artificial intelligence based human body fluid spectrum analysis method according to claim 1, wherein the step of obtaining a serum sample of a subject, collecting spectrum data of the serum sample, and obtaining a plurality of spectrum signals comprises the steps of:
Collecting a blood sample of a tested person based on a serum sample collecting tool to obtain a serum sample of the tested person;
preprocessing the serum sample, and collecting spectral data of the preprocessed serum sample by utilizing a spectral analysis instrument to obtain an initial spectral signal;
And denoising and baseline correction are carried out on the initial spectrum signals by using digital signal processing software, so as to obtain various spectrum signals.
3. The artificial intelligence based human body fluid spectrum analysis method according to claim 1, wherein the step of obtaining principal components of the peak characteristic frequency, performing cluster analysis on the principal components to obtain grouping features, normalizing the signal intensity according to a signal processing algorithm to obtain a normalized signal intensity, and performing statistical analysis on the normalized signal intensity to obtain statistical features comprises:
decomposing the peak characteristic frequency by non-negative matrix factorization to obtain components related to biomarker activity, and taking the components related to biomarker activity as the principal components;
inputting the principal components into density-based spatial clustering to identify similar biomolecule patterns in the principal components, and classifying the similar biomolecule patterns to obtain the grouping features;
standardizing the signal intensity by min-max normalization, adjusting each peak signal of the signal intensity data to the same scale, to obtain the normalized signal intensity;
and performing statistical analysis on the normalized signal intensity according to a statistical algorithm and a probability distribution model to obtain the statistical features.
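The normalization and statistics steps of this claim can be sketched as follows. Min-max normalization is named in the claim; the particular statistics (mean, standard deviation, range) are an assumption, since the claim only refers to "a statistical algorithm and a probability distribution model". The factorization and clustering steps are not shown, and the function names and intensity values are hypothetical.

```python
from statistics import fmean, pstdev


def min_max_normalize(intensities):
    """Rescale intensities linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        # All peaks are equal; avoid dividing by zero.
        return [0.0 for _ in intensities]
    return [(value - lo) / (hi - lo) for value in intensities]


def summarize(values):
    """Return simple descriptive statistics of the normalized intensities."""
    return {
        "mean": fmean(values),
        "std": pstdev(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical signal intensities for four detected peaks.
peak_intensities = [2.0, 4.0, 6.0, 10.0]
normalized = min_max_normalize(peak_intensities)
stats = summarize(normalized)
```

Putting every peak on the [0, 1] scale makes intensities comparable across spectra, which is the stated purpose of adjusting each peak signal to the same scale.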
4. The artificial intelligence based human body fluid spectrum analysis method according to claim 1, wherein the step of curve fitting the peak shape width according to the principal component and the normalized signal intensity to obtain a characteristic peak position and a characteristic peak width comprises:
identifying initial characteristic frequency points in the principal components according to the frequency domain signal, and determining the center frequency of each principal component;
performing intensity calibration based on the normalized signal intensity to obtain the signal intensity corresponding to each peak;
performing cross analysis on all the center frequencies and the signal intensities corresponding to all the peaks, and extracting key characteristic frequency points;
and inputting the key characteristic frequency points into a preset Voigt mixed model, performing iterative optimization in combination with the signal intensity, and outputting the characteristic peak position and the characteristic peak width.
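The final fitting step can be illustrated with a small sketch. Two substitutions are made and should be read as assumptions: a pseudo-Voigt profile (a weighted sum of a Lorentzian and a Gaussian sharing one width) stands in for the claim's "Voigt mixed model", and a coarse grid search stands in for the iterative optimization. All names and values are hypothetical.

```python
import math


def pseudo_voigt(x, center, fwhm, eta=0.5, amplitude=1.0):
    """Weighted sum of a Lorentzian and a Gaussian sharing one full width at half maximum."""
    half = fwhm / 2.0
    lorentz = half**2 / ((x - center) ** 2 + half**2)
    gauss = math.exp(-4.0 * math.log(2.0) * (x - center) ** 2 / fwhm**2)
    return amplitude * (eta * lorentz + (1.0 - eta) * gauss)


def fit_peak(xs, ys, centers, widths):
    """Grid search for the (center, fwhm) pair with the smallest squared error."""
    best = None
    for center in centers:
        for fwhm in widths:
            error = sum(
                (pseudo_voigt(x, center, fwhm) - y) ** 2 for x, y in zip(xs, ys)
            )
            if best is None or error < best[0]:
                best = (error, center, fwhm)
    return best[1], best[2]


# Synthetic peak centered at 5.0 with a width of 2.0, sampled every 0.25 units.
xs = [i * 0.25 for i in range(41)]
ys = [pseudo_voigt(x, 5.0, 2.0) for x in xs]

position, width = fit_peak(
    xs, ys,
    centers=[4.0, 4.5, 5.0, 5.5, 6.0],
    widths=[1.0, 1.5, 2.0, 2.5, 3.0],
)
```

The candidate centers play the role of the key characteristic frequency points. A practical implementation would refine them with a nonlinear least squares solver rather than a fixed grid.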
5. The artificial intelligence based human body fluid spectrum analysis method according to claim 1, wherein the step of inputting the grouping features, the statistical features, the characteristic peak position and the characteristic peak width into a preset deep learning model to output a characteristic spectrum data set of a target marker comprises:
performing feature weight adjustment on the grouping features and the statistical features according to a preset network architecture to obtain a feature vector;
ranking the characteristic peak positions and the characteristic peak widths by priority based on a preset feature importance to obtain a ranking result;
performing a combined operation on the feature vector and the ranking result to generate a feature mapping matrix;
training on the feature mapping matrix with a deep learning training algorithm to obtain a weight distribution corresponding to each feature;
and inputting the weight distributions corresponding to all the features into the preset deep learning model to obtain the final characteristic spectrum data set of the target marker.
6. An artificial intelligence based human body fluid spectral analysis system for non-diagnostic purposes, comprising:
an acquisition unit, configured to obtain a serum sample of a tested person and collect spectrum data of the serum sample to obtain a plurality of spectrum signals;
an analysis unit, configured to convert each spectrum signal into a frequency domain signal by using a fast Fourier transform; perform peak detection on the frequency domain signal according to spectrum analysis software to obtain local maximum points, and take the local maximum points as peak characteristic frequencies; and calculate the signal intensity corresponding to each peak characteristic frequency by using an integral formula, wherein the integral formula is as follows:
$$I(f_p) = \int_{f_a}^{f_b} S(f)\,\mathrm{d}f$$
wherein $I(f_p)$ is the signal intensity at the peak characteristic frequency $f_p$, $S(f)$ is the spectrum signal as a function of the frequency $f$, $f_a$ is the lower limit of the integration, and $f_b$ is the upper limit of the integration; and calculate the full width at half maximum of the peak characteristic frequency according to a Lorentzian fitting calculation to obtain a peak shape width, wherein the formula by which the Lorentzian fitting calculation computes the full width at half maximum of the peak characteristic frequency is as follows:
(formula omitted in the source text)
wherein the quantities in the formula are the full width at half maximum at the peak characteristic frequency, the Lorentz model damping coefficient corresponding to the peak characteristic frequency, the imaginary part of the complex dielectric response function at the peak characteristic frequency, and the amplitude of the complex dielectric response function;
a statistics unit, configured to obtain principal components of the peak characteristic frequency, and perform cluster analysis on the principal components to obtain grouping features; and normalize the signal intensity according to a signal processing algorithm to obtain a normalized signal intensity, and perform statistical analysis on the normalized signal intensity to obtain statistical features;
a fitting unit, configured to perform curve fitting on the peak shape width according to the principal components and the normalized signal intensity to obtain a characteristic peak position and a characteristic peak width;
a computing unit, configured to input the grouping features, the statistical features, the characteristic peak position and the characteristic peak width into a preset deep learning model and output a characteristic spectrum data set of the target marker;
a result unit, configured to compare each feature in the characteristic spectrum data set with the features of a preset database according to a similarity analysis algorithm to obtain similarity scores; perform threshold analysis on the similarity scores to determine a matching threshold, and distinguish matched features from unmatched features according to the matching threshold; construct a recognition result set from the matched features; assign a weight to each feature in the recognition result set by correlation analysis; and input the weighted recognition result set into a preset artificial intelligence model to perform deep pattern recognition and category classification, and output the type of the target marker and the concentration of the target marker.
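The analysis unit's peak detection, intensity integration and width measurement can be sketched as follows, starting from a frequency domain signal that has already been computed. The trapezoidal rule is an assumed numerical stand-in for the integral formula, and the half-maximum width is measured directly from the samples instead of through the claim's Lorentzian fitting formula. Function names and test data are hypothetical.

```python
def find_local_maxima(values):
    """Return indices of samples strictly greater than both neighbours."""
    return [
        i for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    ]


def integrate_trapezoid(freqs, values, lower, upper):
    """Trapezoidal integral over sample intervals lying fully inside [lower, upper]."""
    total = 0.0
    for i in range(len(freqs) - 1):
        f0, f1 = freqs[i], freqs[i + 1]
        if f0 >= lower and f1 <= upper:
            total += 0.5 * (values[i] + values[i + 1]) * (f1 - f0)
    return total


def full_width_half_max(freqs, values, peak_index):
    """Measure the peak width at half height using linear interpolation."""
    half = values[peak_index] / 2.0

    # Walk left while the next sample is still above half height.
    i = peak_index
    while i > 0 and values[i - 1] > half:
        i -= 1
    if i == 0:
        left = freqs[0]  # The peak never drops to half height on this side.
    else:
        step = (half - values[i - 1]) / (values[i] - values[i - 1])
        left = freqs[i - 1] + step * (freqs[i] - freqs[i - 1])

    # Walk right in the same way.
    j = peak_index
    while j < len(values) - 1 and values[j + 1] > half:
        j += 1
    if j == len(values) - 1:
        right = freqs[-1]
    else:
        step = (values[j] - half) / (values[j] - values[j + 1])
        right = freqs[j] + step * (freqs[j + 1] - freqs[j])

    return right - left


# Synthetic Lorentzian line centered at 5.0 with a true width of 2.0.
freqs = [i * 0.5 for i in range(21)]
signal = [1.0 / (1.0 + (f - 5.0) ** 2) for f in freqs]

peaks = find_local_maxima(signal)
intensity = integrate_trapezoid(freqs, signal, 3.0, 7.0)
width = full_width_half_max(freqs, signal, peaks[0])
```

The integration limits 3.0 and 7.0 are arbitrary choices around the peak; the claim does not state how the limits are selected.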
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410921467.3A CN118471348B (en) | 2024-07-10 | 2024-07-10 | Human body fluid spectrum analysis method and system based on artificial intelligence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118471348A CN118471348A (en) | 2024-08-09 |
CN118471348B true CN118471348B (en) | 2024-09-27 |
Family
ID=92167009
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410921467.3A Active CN118471348B (en) | 2024-07-10 | 2024-07-10 | Human body fluid spectrum analysis method and system based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118471348B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117150387A (en) * | 2023-11-01 | 2023-12-01 | 奥谱天成(厦门)光电有限公司 | Raman spectrum peak fitting method, medium, equipment and device |
CN117405648A (en) * | 2023-10-24 | 2024-01-16 | 北京邮电大学 | Cervical cancer serum biomarker screening method based on Raman spectrum |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018060967A1 (en) * | 2016-09-29 | 2018-04-05 | Inesc Tec - Instituto De Engenharia De Sistemas E Computadores, Tecnologia E Ciência | Big data self-learning methodology for the accurate quantification and classification of spectral information under complex varlability and multi-scale interference |
CN117470804B (en) * | 2023-11-03 | 2024-09-13 | 北京汉林汇融科技服务有限公司 | Carbohydrate product near-infrared detection method and system based on AI algorithm |
Also Published As
Publication number | Publication date |
---|---|
CN118471348A (en) | 2024-08-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109142317B (en) | Raman spectrum substance identification method based on random forest model | |
CN101532954B (en) | Method for identifying traditional Chinese medicinal materials by combining infra-red spectra with cluster analysis | |
CN104374738B (en) | A kind of method for qualitative analysis improving identification result based on near-infrared | |
WO2018121122A1 (en) | Raman spectroscopy detection method for checking goods, and electronic device | |
CN117132778B (en) | Spectrum measurement correction calculation method and system | |
CN101713731A (en) | Method for distinguishing coating quality of medicine preparation | |
CN110650058A (en) | Network traffic analysis method, device, storage medium and equipment | |
CN113310936A (en) | Rapid identification method for four high-temperature sterilized commercial milks | |
CN108827909B (en) | Rapid soil classification method based on visible near infrared spectrum and multi-target fusion | |
CN114113471A (en) | Method and system for detecting food freshness of artificial nose refrigerator based on machine learning | |
CN118471348B (en) | Human body fluid spectrum analysis method and system based on artificial intelligence | |
Liu et al. | Stability analysis of hyperspectral band selection algorithms based on neighborhood rough set theory for classification | |
CN113310934A (en) | Method for quickly identifying milk cow milk mixed in camel milk and mixing proportion thereof | |
CN114781484A (en) | Cancer serum SERS spectrum classification method based on convolutional neural network | |
CN117556245B (en) | Method for detecting filtered impurities in tetramethylammonium hydroxide production | |
CN110084227A (en) | Mode identification method based on near-infrared spectrum technique | |
CN114611582A (en) | Method and system for analyzing substance concentration based on near infrared spectrum technology | |
CN112801172A (en) | Chinese cabbage pesticide residue qualitative analysis method based on fuzzy pattern recognition | |
CN113324943A (en) | Yak milk and rapid identification model of milk mixed with yak milk | |
CN113310937A (en) | Method for rapidly identifying high-temperature sterilized milk, pasteurized fresh milk of dairy cow and reconstituted milk of milk powder | |
CN116858822A (en) | Quantitative analysis method for sulfadiazine in water based on machine learning and Raman spectrum | |
CN112716447A (en) | Oral cancer classification system based on deep learning of Raman detection spectral data | |
CN117538287A (en) | Method and device for nondestructive testing of phosphorus content of Huangguan pear | |
CN116519661A (en) | Rice identification detection method based on convolutional neural network | |
CN117169440A (en) | Duck down peculiar smell detection method and detection equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||