CN111398539A - Water quality microorganism indication method based on big data and molecular biotechnology - Google Patents
- Publication number
- CN111398539A, CN202010157647.0A
- Authority
- CN
- China
- Prior art keywords
- water quality
- sample
- water
- indicating
- microorganisms
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/18—Water
- G01N33/186—Water using one or more living organisms, e.g. a fish
- G01N33/1866—Water using one or more living organisms, e.g. a fish using microorganisms
Abstract
The invention provides a water quality microorganism indication method based on big data and molecular biotechnology, comprising: (1) collecting samples in a research area, testing water quality, and analyzing the main pollution factors and water quality characteristics of the research area; (2) constructing a comprehensive water quality score that reflects the water quality condition, deriving discrimination standards for that score, and determining the water quality grade of every sample; (3) obtaining the microbial community structure of the research area through high-throughput sequencing, and screening out indicator microorganisms that track water quality change based on bioinformatics analysis; (4) designing specific primers, constructing a fitting equation between the relative abundance of the indicator microorganisms and the comprehensive water quality score, and predicting the water quality grade of a sample from that relative abundance. The invention thus provides a technique for indicating the water quality of a water environment using planktonic bacteria in the water body; the screened indicator microorganisms can be applied to water quality indication and evaluation in various water environments, supporting pollution prevention and control.
Description
Technical Field
The invention relates to the technical field of water environment water quality detection and evaluation, in particular to a water quality microorganism indicating method based on big data and molecular biotechnology.
Background
Many urban drinking water sources are surface waters, including rivers, lakes and reservoirs. In recent years, with the continuous development of industrial and agricultural production, various exogenous pollutants have entered the water environment through point source discharge, non-point source inflow, or discharge from mobile risk sources such as ships. These pollutants pollute the water environment and seriously threaten the safety of domestic drinking water and centralized urban water supply. Strengthening the research and development of rapid, efficient detection and early warning technology for water environment water quality is therefore important for the normal development of the economy and society and for protecting people's lives.
At present, water environment water quality evaluation technologies at home and abroad fall into two categories: chemical analysis and evaluation based on monitoring of water quality indexes, and biological indication based on toxicity monitoring. Conventional monitoring based on physical and chemical analysis uses various instruments to measure the concentration of toxic and harmful substances in the water environment directly, by quantitative or qualitative methods. The standard chemical detection methods for many pollutants can be applied directly in a water quality monitoring and early warning system, but they involve many monitoring indexes, are time-consuming and labor-intensive, are slow, and depend on many large instruments. The principle of biological indication based on toxicity monitoring is as follows: when water is suddenly polluted, organisms are stressed by the environment and their typical movement behavior changes, which changes the detection signal; the pollution level is indicated by setting different change thresholds. The key to this technology is the choice of indicator organisms, commonly fish, daphnia and algae, whose responses to different pollution conditions are used to determine the degree of water pollution. However, with toxicity-based biological monitoring it is difficult to infer the cause from the observed effect, the greatest difficulty being the uncertainty of an individual organism's reaction to a chemical substance. Under controlled laboratory conditions, individual organisms may show a good linear response, but in a real environment the indicator organisms, although they respond clearly to contaminants, are poorly selective, and the monitoring data may be extremely scattered, which interferes with the analysis of the results.
A review of existing work shows that Xiongjin wave, Xuelixia et al. (CN 201910380096) established a quantitative method for predicting the pollution degree of offshore water bodies based on pollution-indicating flora. The method first obtains physical and chemical index data of the water bodies in a research area to calculate a comprehensive water pollution index (OPI), obtains the composition of the microbial communities in water samples from different sea areas by high-throughput sequencing, screens out pollution-indicating flora with a random forest algorithm, and then quantitatively predicts the water pollution degree using the relative abundance and weight of each microorganism in the pollution-indicating flora as independent variables. Although the method can predict the pollution degree of offshore water bodies, in practical application the information on each pollution indicator bacterium in a sample must be obtained by high-throughput sequencing, so the economic and time costs are high, and for workers without a bioinformatics background it is challenging to extract the information on the pollution indicator bacteria from the complex sequencing data.
Compared with CN 201910380096, the invention reduces the cost and operating difficulty of applying water quality indicating microorganisms to water quality prediction by designing specific primers for those microorganisms. CN 201910380096 uses small differences within a site and large differences between sites as its criterion, and screens out multiple microorganisms as pollution indicator bacteria with a random forest algorithm. Because most microorganisms in a water body cannot be cultured, in actual water quality prediction CN 201910380096 must first obtain the sequencing information of the water microbial community by high-throughput sequencing, and then obtain the relative abundance of the different pollution indicator bacteria by screening the sequencing information through bioinformatics analysis. This increases operating time and cost on the one hand, and places higher demands on the professional skills of the workers on the other. The present invention first determines the water quality score and water quality grade of each sample by combining principal component analysis with the evaluation criteria; it then obtains most of the microbial information in the samples by high-throughput sequencing and, based on fitting analysis and correlation analysis, screens out the microorganisms most closely related to water quality change as water quality indicating microorganisms. Specific primers are then designed for these indicator bacteria. The relative abundance of the water quality indicating microorganisms in a sample is obtained by a molecular biology technique (fluorescent quantitative PCR), and the water quality condition of the sample is predicted from it. Compared with CN 201910380096, the invention reduces the number of types of water quality indicating microorganisms while preserving the accuracy of the prediction, and designs specific primers that allow those microorganisms to be monitored rapidly. In practical application no high-throughput sequencing is needed; molecular biology techniques (fluorescent quantitative PCR, digital chip, fluorescent in-situ hybridization and the like) are used instead, which lowers the cost of water quality prediction and improves its efficiency.
Disclosure of Invention
In view of the shortcomings of existing water environment water quality indication methods, the invention aims to provide a water quality microorganism indication method based on big data and molecular biotechnology; specifically, a water quality indication method that screens indicator microorganisms based on the water quality characteristics of the environment and on how the bacterial flora changes under different water quality conditions. The method determines the environmental water quality characteristics from water quality index measurements and the mathematical treatment of the measured values, analyzes how different water quality characteristics affect the bacterial community composition of a sample, obtains by data analysis the microbial populations that are sensitive to different water quality characteristics, and designs specific primers for the water quality indicating microorganisms so as to indicate the water quality of the water environment.
Because water quality detection and indication must use technical means that satisfy the national standard detection specifications, sample collection, detection and related operations must meet the requirements of the relevant specifications.
Specifically, the purpose of the invention is realized by the following technical scheme:
the invention provides a water quality microorganism indication method based on big data and molecular biotechnology, which comprises the following steps:
(1) collecting a research area sample, carrying out water quality detection according to the quality standard of the national surface water environment, and analyzing an index detection result to obtain main pollution factors and water quality characteristics of the research area;
(2) performing mathematical treatment on the main pollution factors, and obtaining a sample water quality comprehensive score based on principal component analysis;
(3) obtaining a microbial flora structure in a research area through high-throughput sequencing, and screening out microorganisms with an indicating effect on water quality through linear discriminant analysis (LEfSe), correlation analysis and fitting analysis;
(4) designing a specific primer of the water quality indicating microorganism, taking the relative abundance of the water quality indicating microorganism in the sample as an independent variable, taking the comprehensive score of the water quality of the sample as a dependent variable, constructing a fitting regression equation, and predicting the grade of the water quality of the sample.
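Step (4) can be sketched as follows. This is a minimal illustration that assumes a straight-line fit; the source does not state the functional form of the fitting regression equation. The data values, the function name `fit_score_model` and its `degree` parameter are hypothetical.

```python
import numpy as np

def fit_score_model(rel_abundance, wq_score, degree=1):
    """Least-squares polynomial fit of the comprehensive water quality score
    (dependent variable) on indicator relative abundance (independent variable).
    degree=1 (a straight line) is an assumption, not the patent's stated form."""
    x = np.asarray(rel_abundance, dtype=float)
    y = np.asarray(wq_score, dtype=float)
    return np.poly1d(np.polyfit(x, y, degree))

# Hypothetical calibration data: relative abundance and comprehensive score.
abundance = [0.01, 0.02, 0.04, 0.05, 0.08]
score = [-0.8, -0.5, 0.1, 0.4, 1.3]

model = fit_score_model(abundance, score)
predicted = float(model(0.03))  # predicted score for a new sample
print(f"predicted comprehensive score: {predicted:.3f}")
```

The predicted score would then be compared with the grade discrimination standards to assign a water quality grade.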
Preferably, the step (1) is specifically: setting n water quality sampling points i (i = 1, 2, …, n) according to the different surrounding environments or water quality characteristics of the research area, and collecting surface water samples at each sampling point continuously over a certain period; analyzing all samples according to the national surface water environment quality standard to obtain the measured values of the different water quality indexes for each sample; and counting the water quality factors that exceed the standard across all samples, analyzing how each water quality index varies between sampling points, and determining the main pollution factors and water quality characteristics of the research area.
Preferably, the step (2) is specifically as follows:
A. standardizing the main pollution factors to remove differences of dimension and magnitude:
$$x_{ij}^{*} = \frac{x_{ij} - \bar{x}_i}{\sigma_i}, \quad i = 1, 2, \ldots, p;\ j = 1, 2, \ldots, n$$
wherein $p$ is the number of main pollution factors; $x_{ij}$ is the original datum of the $i$-th index at the $j$-th monitoring point; $\bar{x}_i$ and $\sigma_i$ are respectively the mean value and standard deviation of the $i$-th index;
B. calculating the correlation coefficient matrix: from the standardized data matrix $(x_{ij}^{*})_{p \times n}$, calculating the correlation coefficient matrix $R = (r_{ij})_{p \times p}$;
C. calculating the eigenvalues and eigenvectors: from the characteristic equation $|\lambda I - R| = 0$, determining the eigenvalues $\lambda_i$ ($i = 1, 2, \ldots, p$) and sorting them by size; then obtaining the eigenvector $a_i$ ($i = 1, 2, \ldots, p$) corresponding to each eigenvalue $\lambda_i$, requiring $\lVert a_i \rVert = 1$; wherein $\lambda_i$ is the variance of the principal component $F_i$, and the greater the variance, the greater its contribution to the total variance;
D. calculating the contribution rates and determining the number of principal components: calculating the contribution rate of the principal component $F_i$ according to the formula $\lambda_i / \sum_{k=1}^{p} \lambda_k$; determining the number $m$ of principal components $F_i$ on the principle that the cumulative contribution rate is greater than 85% or that the eigenvalue is greater than 1;
E. calculating the comprehensive water quality score of the samples, and determining the discrimination standard of each water quality grade to obtain the water quality grade of each sample: according to $F_i = \sum_{j=1}^{p} a_{ij} Z_j$ (in the formula, $a_{ij}$ is the principal component eigenvector, $Z_j$ is the standardized value of the original variable, $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, p$) and the formula $F = \sum_{i=1}^{m} w_i F_i$ ($i = 1, 2, \ldots, m$, where $w_i$ is the weight of $F_i$ given by its contribution rate), calculating the comprehensive water quality score of the sample.
The standard values of the main pollution factors for each water quality grade in the quality standard are processed in the same way to obtain the comprehensive-score discrimination standard for each water quality grade, and the water quality grade of each sample is judged against that standard.
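Steps A to E can be sketched with NumPy as below. This is an illustration under stated assumptions, not the patent's implementation: samples are stored as rows (the transpose of the p × n layout in the text), the sample standard deviation is used, each eigenvector is flipped so that its loadings sum to a non-negative value (eigenvector signs are otherwise arbitrary), components are retained by the eigenvalue-greater-than-1 rule, and the composite weights are the contribution rates renormalized over the retained components. The function name `composite_scores` and the demo data are hypothetical.

```python
import numpy as np

def composite_scores(X):
    """Composite water quality score per sample.

    X: array of shape (n_samples, p_indices) with raw index values.
    Returns (scores, m, contribution).
    """
    X = np.asarray(X, dtype=float)
    # A. Standardise each index (column) to zero mean and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # B. Correlation coefficient matrix of the indices (p x p).
    R = np.corrcoef(Z, rowvar=False)
    # C. Eigenvalues and unit-norm eigenvectors, sorted from largest to smallest.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Assumption: orient each eigenvector so that its loadings sum to >= 0.
    eigvecs = eigvecs * np.where(eigvecs.sum(axis=0) < 0, -1.0, 1.0)
    # D. Contribution rates; keep components whose eigenvalue exceeds 1.
    contribution = eigvals / eigvals.sum()
    m = max(1, int(np.sum(eigvals > 1.0)))
    # E. Component scores F_i and their weighted sum (weights are an assumption).
    F = Z @ eigvecs[:, :m]
    weights = contribution[:m] / contribution[:m].sum()
    return F @ weights, m, contribution

# Hypothetical data: four samples, two perfectly correlated indices.
demo = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
scores, m, contribution = composite_scores(demo)
print(m, np.round(scores, 3))
```

To score the grade limit values of the quality standard, they would need to be standardized with the same means and standard deviations as the samples rather than re-standardized on their own.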
Preferably, the step (3) specifically comprises: obtaining the composition of the microbial communities in samples from different surrounding environments or with different water quality characteristics by high-throughput sequencing; screening out the bacterial populations sensitive to different environmental conditions by linear discriminant analysis (LEfSe); and, taking those sensitive bacteria as the objects of study, identifying by correlation analysis and fitting analysis the bacteria most closely related to water quality change as the water quality indicating microorganisms.
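The correlation part of this screening can be sketched as follows, assuming the LEfSe step has already produced a short list of candidate taxa. The Pearson coefficient is an assumption (the source does not name the correlation measure), and the taxon names, data and function name `rank_candidates` are hypothetical.

```python
import numpy as np

def rank_candidates(abundance_by_taxon, scores):
    """Rank candidate taxa by |Pearson r| between their relative abundance
    and the comprehensive water quality score across samples."""
    scores = np.asarray(scores, dtype=float)
    ranked = []
    for taxon, values in abundance_by_taxon.items():
        r = np.corrcoef(np.asarray(values, dtype=float), scores)[0, 1]
        ranked.append((taxon, float(r)))
    # Strongest association first, regardless of sign.
    ranked.sort(key=lambda item: abs(item[1]), reverse=True)
    return ranked

# Hypothetical relative abundances (%) of two candidate taxa in five samples.
candidates = {
    "taxon_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "taxon_B": [2.0, 1.0, 4.0, 3.0, 5.0],
}
sample_scores = [0.1, 0.2, 0.3, 0.4, 0.5]

ranking = rank_candidates(candidates, sample_scores)
print(ranking)
```

The top-ranked taxon would then go on to the fitting analysis before being accepted as an indicator.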
The step (4) specifically comprises the following steps: retrieving the gene sequence information of the water quality indicating microorganism, and designing specific primers for it with the primer design software Primer 6.0;
using the specific primers and bacterial universal primers, obtaining the relative abundance of the water quality indicating microorganism in the sample by fluorescent quantitative PCR:
relative abundance of the water quality indicating microorganism = target gene copy number / bacterial gene copy number
wherein CT1 is the cycle number obtained with the target gene primers and CT2 is the cycle number obtained with the bacterial universal primers; the detection limit is set at 31;
and taking the relative abundance of the water quality indicating microorganism in the sample as the independent variable and the comprehensive water quality score of the sample as the dependent variable, constructing a fitting regression equation, and predicting the water quality grade of the sample.
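The copy-number ratio can be approximated from the two cycle numbers as in the sketch below. The conversion used here is an assumption, since the source's own CT formula is not reproduced in the text: it supposes that both primer pairs amplify with 100% efficiency (one doubling per cycle) and share the same threshold, so the ratio reduces to 2^(CT2 − CT1). Treating a target CT at or above the detection limit of 31 as "not detected" is likewise an assumption, and the function name `relative_abundance` is hypothetical.

```python
DETECTION_LIMIT_CT = 31  # detection limit stated in the text

def relative_abundance(ct_target, ct_universal, limit=DETECTION_LIMIT_CT):
    """Approximate target / total bacterial gene copy ratio from qPCR cycle numbers.

    Assumes perfect doubling per cycle for both primer pairs (an assumption,
    not the patent's stated formula).
    """
    if ct_universal >= limit:
        raise ValueError("universal-primer signal is at or beyond the detection limit")
    if ct_target >= limit:
        return 0.0  # target not detected
    return 2.0 ** (ct_universal - ct_target)

# Hypothetical cycle numbers: target crosses threshold 5 cycles after total bacteria.
example = relative_abundance(ct_target=25, ct_universal=20)
print(f"relative abundance: {example:.4%}")
```

With measured amplification efficiencies or standard curves, the copy numbers would be computed from those instead of the doubling assumption.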
Compared with the prior water environment and water quality indicating technology, the invention has the following beneficial effects:
(1) The invention avoids the large detection scope and complicated procedure of chemical indication and evaluation techniques based on water quality index monitoring. Conventional monitoring based on physical and chemical analysis uses various instruments to measure the concentration of toxic and harmful substances in the water environment directly, by quantitative or qualitative methods. Such monitoring is targeted and accurate, and can identify both the types and the contents of pollutants. However, when many indexes must be measured the work becomes laborious, and it is difficult to evaluate the water quality of a water environment rapidly and comprehensively. The invention screens out indicator microorganisms sensitive to water quality change according to the water quality characteristics and the microbial community structure, which simplifies the workflow while keeping the indication scientifically sound.
(2) The invention avoids the problem that water quality evaluation and indication based on biotoxicity monitoring can hardly reflect actual environmental change accurately and stably. That technology determines the pollution level of the water environment from the reaction of indicator organisms to different pollution conditions; commonly used indicator organisms include fish, water fleas and algae. However, it is difficult to infer the cause from the observed effect, the greatest difficulty being the uncertainty of an individual organism's reaction to a chemical substance. Under controlled laboratory conditions individual organisms show good linear responses, but in a real environment the indicator organisms, although they respond clearly to pollutants, are poorly selective, and the monitoring data may be extremely scattered, which interferes with the analysis of the results. In the invention the screening of indicator microorganisms is based entirely on samples from the actual environment, so the difference between laboratory and actual environments does not interfere with the indication result.
(3) The invention improves the efficiency of applying water quality indicating microorganisms to actual water quality prediction and reduces the application cost. The prior patent CN 201910380096 first obtains physical and chemical index data of the water in a research area to calculate a comprehensive water pollution index (OPI), obtains the composition of the microbial communities in water samples from different sea areas by high-throughput sequencing, screens out pollution-indicating flora with a random forest algorithm, and quantitatively predicts the water pollution degree using the relative abundance and weight of each microorganism in the pollution-indicating flora as independent variables. Although that method can predict the pollution degree of offshore water bodies, in practical application the information on each pollution indicator bacterium in a sample must be obtained by high-throughput sequencing, so the economic and time costs are high, and for workers without a bioinformatics background it is challenging to extract the relevant information from the complex sequencing data. The invention reduces the number of types of water quality indicating microorganisms while preserving the accuracy of the prediction, and designs specific primers that allow those microorganisms to be monitored rapidly. In practical application the information on the water quality indicating microorganisms is obtained not by high-throughput sequencing but mainly by molecular biology techniques (fluorescent quantitative PCR, digital chip, fluorescent in-situ hybridization and the like). This reduces the cost and the experimental period of water quality detection and improves the efficiency of prediction.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a schematic flow diagram of a water quality microorganism indicator method based on big data and molecular biotechnology;
FIG. 2 is a distribution diagram of sampling points in Taipu river and Jinze reservoir;
FIG. 3 shows the marker bacterial flora of different sections in the Jinze Reservoir area, wherein the upstream section is the Taipu gate, the river sections are the flat bridge, the ruin bridge and the reservoir inlet, and the in-reservoir sections are the ring bridge center, the ecological island and the reservoir outlet; the upper diagram shows the microorganisms whose abundance differs with statistical significance between samples from different sections, and the lower diagram shows the linear discriminant analysis score (LDA Score) of those microorganisms;
FIG. 4 is a schematic diagram showing the correlation between the marker bacterial flora and water quality factors of different sections in the reservoir area of the Jinze reservoir; note: p <0.01, P < 0.05;
FIG. 5 is a graph showing the relative abundance of Limnohabitans in samples from different sections;
FIG. 6 shows the fitting of Limnohabitans relative abundance against the comprehensive water quality score of the samples, wherein the left graph shows the distribution of Limnohabitans relative abundance across different sections, and the right graph shows the fitting analysis of Limnohabitans relative abundance against the comprehensive water quality score;
FIG. 7 shows the results of experiments verifying the specificity of the Limnohabitans genus primers;
FIG. 8 shows the distribution of Limnohabitans relative abundance and of the pollution index along the course, together with the fitting results, wherein the left graph shows Limnohabitans relative abundance at different sections, and the right graph shows the comprehensive water quality score at different sections;
FIG. 9 is a plot of Limnohabitans relative abundance, obtained by fluorescent quantitative PCR, against the comprehensive water quality score;
FIG. 10 is a residual analysis between the actual value and the fitting value of the water quality comprehensive score; wherein, the left graph is a residual error schematic diagram between the water quality comprehensive score actual value and the fitting value, and the right graph is a residual error value frequency distribution statistical diagram.
Detailed Description
The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It will be apparent to those skilled in the art that various changes and modifications can be made without departing from the spirit and scope of the invention. The present invention will be described in detail with reference to the following specific examples:
Compared with existing water environment water quality indication techniques, the water quality microorganism indication method based on big data and molecular biotechnology adopts specific solutions to the following technical problems:
1. How can microbial community information be linked to the water quality of a water body so that water quality indicating microorganisms can be screened?
To screen water quality indicating microorganisms, the characteristic water quality information of the water body is determined first; a comprehensive water quality score that quantifies the water quality condition is established through principal component analysis and the evaluation criteria; and the bacterial populations sensitive to different environmental conditions are obtained by linear discriminant analysis (LEfSe). Taking these sensitive bacteria as the objects of study, the bacteria most closely related to water quality change are identified by big data correlation analysis and fitting analysis and serve as the water quality indicating microorganisms.
2. How can the screened water quality indicating microorganisms be applied effectively to actual water quality prediction?
This is the greatest improvement of the invention over CN 201910380096. The number of microorganisms in water is large, and it is difficult to pick out the abundance of a specific water quality indicating microorganism from a complex environmental water sample. After the water quality indicating microorganisms have been screened out, specific primers are designed for them. In actual prediction work, information on the indicator microorganisms in a complex sample can then be obtained directly by molecular biology techniques (fluorescent quantitative PCR, digital chip, fluorescent in-situ hybridization and the like), which reduces the experimental cost and shortens the experimental period.
In the following, the Taipu River upstream of the Jinze Reservoir and the Jinze Reservoir itself are taken as the research area, and the 'surface water environmental quality standard' (GB3838-2002) is taken as the quality standard. The overall implementation process of the invention is shown in FIG. 1, and the specific steps are as follows:
step one, performing water quality detection on a research area according to a surface water environment quality standard, and analyzing the water quality characteristics of the research area. The method comprises the following specific steps:
Seven sampling points (shown in FIG. 2) were set along the Taipu River down to the Jinze Reservoir outlet and sampled once a month from October 2017 to September 2018. They are distributed over an upstream section (Taipu gate), river sections (flat bridge, ruin bridge and reservoir inlet) and in-reservoir sections (bridge center, ecological island and reservoir outlet). The basic items and supplementary items of the quality standard, together with TOC, were measured. Water temperature, dissolved oxygen and pH were measured on site with a multi-parameter analyzer (Hach), TOC was measured with a TOC analyzer (Shimadzu, Japan), and the other indexes were measured according to a standard test method (GB/T5750-). The results for the measured items are shown in Table 1.
The main pollutants are nitrogen, phosphorus, organic matter and petroleum pollutants. Except at the Taipu gate, TN at all sampling points exceeds the Class IV surface water standard, the petroleum index at most sampling points exceeds Class III, and the mean values of the other indexes meet the Class III surface water requirement. NH4-N, petroleum, permanganate index, TOC, TN, BOD5 and fecal coliforms are all low upstream and high downstream; pollution rises rapidly along the course from the upstream section to the river sections, and the Jinze Reservoir improves the water quality to some extent, though not markedly.
Table 1: Water quality variation along the Taipu River course and in the reservoir area (mg·L⁻¹)
And step two, performing mathematical treatment on the main pollution factors, and obtaining a sample water quality comprehensive score based on principal component analysis.
The method comprises the following specific steps:
1) A single index often cannot reflect water quality comprehensively. To capture how the water quality of each section changes from month to month, the main pollution indexes of the region (TN and petroleum) and the water quality indexes that change markedly along the course (NH4-N, petroleum, permanganate index, TOC, TN, BOD5 and fecal coliforms) are taken as evaluation indexes, and principal component analysis is performed on them;
2) The main pollution factors are standardized to remove differences of dimension and magnitude. The correlation coefficient matrix is then calculated; the results are shown in Table 2.
3) The eigenvalues $\lambda_i$ ($i = 1, 2, \ldots, 7$) and the eigenvectors are calculated, and the contribution rate of each principal component $F_i$ is calculated according to the formula $\lambda_i / \sum_{k=1}^{7} \lambda_k$. The number of principal components $F_i$ is determined on the principle that the eigenvalue is greater than 1. The eigenvalue and eigenvector calculations are shown in Tables 3 and 4.
4) The comprehensive water quality score of each sample is calculated according to $F_i = \sum_{j=1}^{7} a_{ij} Z_j$ (in the formula, $a_{ij}$ is the principal component eigenvector, $Z_j$ is the standardized value of the original variable, $i = 1, 2$; $j = 1, 2, \ldots, 7$) and the formula $F = \sum_{i=1}^{2} w_i F_i$ ($i = 1, 2$, where $w_i$ is the weight of $F_i$ given by its contribution rate). The results are shown in Table 6.
5) The standard values of the main pollution factors for each water quality grade in the quality standard (GB3838-2002) are processed according to steps 1) to 4) to obtain the discrimination standard for each water quality grade (Table 5), and the water quality grades of all samples are determined against it. The water quality condition of a sample is judged from its comprehensive water quality score: if the score is lower than the score corresponding to the Class II standard limit (-0.099), the water quality grade is good; if the score lies between the scores corresponding to the Class II and Class III standard limits (-0.099 to 1.822), the water quality grade is qualified; if the score is greater than the score corresponding to the Class III standard limit (1.822), the water quality grade is to be improved.
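The grading rule in step 5) amounts to two thresholds on the comprehensive score. A minimal sketch, using the two limit scores given in the text; how a score exactly equal to a limit is graded is an assumption, as the text does not say, and the function name `water_quality_grade` is hypothetical.

```python
CLASS_II_SCORE = -0.099   # score corresponding to the Class II standard limit
CLASS_III_SCORE = 1.822   # score corresponding to the Class III standard limit

def water_quality_grade(score):
    """Map a comprehensive water quality score to the grade used in the text."""
    if score < CLASS_II_SCORE:
        return "good"
    if score <= CLASS_III_SCORE:  # boundary handling is an assumption
        return "qualified"
    return "to be improved"

# Hypothetical scores for three samples.
for s in (-0.5, 0.5, 2.0):
    print(s, water_quality_grade(s))
```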
Table 2: matrix of correlation coefficients
Table 3: eigenvalue and cumulative variance of correlation coefficient matrix
Principal component factor | Eigenvalue | Percentage of variance | Cumulative percentage |
1 | 4.266 | 60.95% | 60.95% |
2 | 1.115 | 15.93% | 76.88% |
3 | 0.764 | 10.91% | 87.79% |
4 | 0.337 | 4.81% | 92.61% |
5 | 0.280 | 4.00% | 96.61% |
6 | 0.190 | 2.71% | 99.32% |
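As a quick check of the variance percentages in Table 3, assume the eigenvalues come from a 7 × 7 correlation matrix, so that they sum to 7 (the number of indices). The contribution rates of the first two components are then approximately:

$$\frac{\lambda_1}{\sum_{k=1}^{7}\lambda_k} = \frac{4.266}{7} \approx 0.609, \qquad \frac{\lambda_2}{\sum_{k=1}^{7}\lambda_k} = \frac{1.115}{7} \approx 0.159$$

These agree with the tabulated 60.95% and 15.93% to within rounding of the eigenvalues.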
Table 4: first two principal component factor eigenvectors
Index | Principal component factor 1 | Principal component factor 2 |
NH4-N | 0.446 | 0.066 |
Petroleum products | 0.455 | -0.053 |
Permanganate index | 0.439 | -0.235 |
TOC | 0.262 | 0.143 |
TN | 0.058 | 0.909 |
BOD5 | 0.415 | 0.217 |
Fecal coliform group | 0.397 | -0.208 |
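The loadings in Table 4 can be turned into principal component scores for one standardized sample, as in the sketch below. The loadings and eigenvalues are taken from Tables 3 and 4; the function names are hypothetical, and the composite weighting (each component weighted by its eigenvalue divided by the sum of the retained eigenvalues) is an assumption, since the composite-score formula is not legible in this text.

```python
import math

# First two eigenvalues from Table 3.
EIGENVALUES = (4.266, 1.115)

# Eigenvector entries (factor 1, factor 2) from Table 4.
LOADINGS = {
    "NH4-N": (0.446, 0.066),
    "Petroleum": (0.455, -0.053),
    "Permanganate index": (0.439, -0.235),
    "TOC": (0.262, 0.143),
    "TN": (0.058, 0.909),
    "BOD5": (0.415, 0.217),
    "Fecal coliforms": (0.397, -0.208),
}


def eigenvector_norm(component):
    """Euclidean norm of one eigenvector; should be close to 1."""
    return math.sqrt(sum(v[component] ** 2 for v in LOADINGS.values()))


def component_scores(z):
    """F_i = sum_j a_ij * Z_j for the two retained components.

    `z` maps each index name to its standardized value.
    """
    f1 = sum(LOADINGS[name][0] * z[name] for name in LOADINGS)
    f2 = sum(LOADINGS[name][1] * z[name] for name in LOADINGS)
    return f1, f2


def composite_score(z):
    """Eigenvalue-weighted combination of F1 and F2 (assumed weighting)."""
    f1, f2 = component_scores(z)
    total = sum(EIGENVALUES)
    return (EIGENVALUES[0] * f1 + EIGENVALUES[1] * f2) / total


if __name__ == "__main__":
    sample = {name: 0.0 for name in LOADINGS}
    sample["TN"] = 1.0  # hypothetical sample: TN one standard deviation above the mean
    print(component_scores(sample), composite_score(sample))
```

The norm check is a cheap way to confirm that the tabulated eigenvectors are unit vectors, as the method requires.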
Table 5: Principal component calculation results for the water quality grade limit values of the pollution indices
Table 6: water quality comprehensive score and water quality grade of different section sampling points in each month
According to Table 6, the spatial pattern of water quality is consistent with Table 1: the water discharged from the Taipu gate has the best quality, the quality deteriorates along the Taipu River channel, and it fluctuates after the water enters the Jinze reservoir. Overall, however, the comprehensive score of the water leaving the reservoir is slightly lower than that of the water entering it, which shows that the ecological storage and regulation reservoir improves water quality to some extent. Across months, the water quality in 2018-02 was markedly worse than in the other months.
Step three: obtain the microbial community structure of the study area by high-throughput sequencing, and screen out microorganisms that indicate water quality through linear discriminant analysis (LEfSe), correlation analysis and fitting analysis, as follows:
Fig. 3 shows the bacterial populations sensitive to different environmental conditions, obtained by linear discriminant analysis; their distribution is given in Table 7. The upstream section receives the outflow of East Taihu Lake, which has lake-reservoir type water quality; the microorganisms with significant abundance differences there are mainly Cyanobacteria, SubsectionI and Synechococcus. The river section is affected by tributaries, shipping and sewage, so its water quality deteriorates and the relative abundance of Proteobacteria in the water rises markedly; the microorganisms with significant abundance differences there are mainly Betaproteobacteria, Burkholderiales, Comamonadaceae, Malikia and Limnohabitans.
Based on the correlation analysis (Fig. 4) and the statistical results (Fig. 5), Limnohabitans shows a clear change in relative abundance across sections. Its relative abundance is significantly positively correlated (P < 0.01) with the comprehensive water quality score and with the main pollution factors in the study area: NH4-N, petroleum, permanganate index and BOD5. The relative abundance rises from the upstream section to the river section and falls to some extent after the water enters the reservoir section (Fig. 6, left). The fitting result shows that the relative abundance of Limnohabitans follows essentially the same distribution trend as the comprehensive water quality score (Fig. 6, right). Limnohabitans is therefore used as the water quality indicating microorganism to predict water quality changes in the water coming from upstream of the Jinze reservoir and in the reservoir area.
Table 7: relative abundance of bacterial populations sensitive to different environmental conditions in the reservoir area of the Jinze reservoir
Note: displaying only the marker microorganisms with the average relative abundance of more than 1%
Step four: design a specific primer for the water quality indicating microorganism, take the relative abundance of the water quality indicating microorganism in the sample as the independent variable and the comprehensive water quality score of the sample as the dependent variable, construct a fitted regression equation, and predict the water quality grade of the sample. The specific steps are as follows:
The ITS gene sequence of the Limnohabitans strains was obtained from the literature (Kasalický V, Jezbera J, Hahn M W, et al. The Diversity of the Limnohabitans Genus, an Important Group of Freshwater Bacterioplankton, by Characterization of 35 Isolated Strains [J]. PLoS ONE, 2013, 8.). Based on this ITS gene sequence, the specific primers Limn F and Limn R were designed with the primer design software Primer 6.0; details of the primers are given in Table 8. A verification experiment on primer specificity gave a single product band of about 100 bp (Fig. 7), indicating good specificity, so the primers can be used for quantitative PCR.
Table 8: Design results for the Limnohabitans-specific primers
Sequence | Position | Length bp | Tm(℃) | GC% |
Limn F:TACGCTAATACCGCATACG | 104 | 19 | 50.3 | 47.4 |
Limn R:TACGCTAATACCGCATACG | 197 | 21 | 50.3 | 42.9 |
Product | 94 | 75 |
Using the specific primers (Limn F and Limn R) and the bacterial universal primers (27F: 5'-AGAGTTTGATCCTGGCTCAG-3'; 1492R: 5'-TACCTTGTTACGACTT-3'), the relative abundance of the water quality indicating microorganism in the sample is obtained by fluorescent quantitative PCR:
Relative abundance of the water quality indicating microorganism = target gene copy number / bacterial gene copy number
where CT1 is the cycle number obtained with the target gene primers and CT2 is the cycle number obtained with the bacterial universal primers; the detection limit was set at 31.
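The text defines CT1 and CT2 but the expression that links them to the copy-number ratio is not legible here. The sketch below shows one common way to turn two cycle numbers into a copy-number ratio, under explicit assumptions that are not stated in the patent: both primer pairs amplify with the same efficiency (a doubling per cycle by default), both reactions use the same fluorescence threshold, and a target reading above the detection limit of 31 is treated as not detected. The function name and parameters are hypothetical.

```python
DETECTION_LIMIT_CT = 31.0  # detection limit quoted in the text


def relative_abundance(ct_target, ct_universal, efficiency=2.0):
    """Estimate target copies / total bacterial copies from two cycle numbers.

    Assumes copy number is proportional to efficiency ** (-Ct) for both
    primer pairs, so the ratio is efficiency ** (ct_universal - ct_target).
    """
    if ct_universal > DETECTION_LIMIT_CT:
        raise ValueError("universal-primer signal is beyond the detection limit")
    if ct_target > DETECTION_LIMIT_CT:
        return 0.0  # assumption: target treated as not detected
    return efficiency ** (ct_universal - ct_target)


if __name__ == "__main__":
    # Hypothetical readings: the target crosses the threshold two cycles later.
    print(relative_abundance(22.0, 20.0))
```

In practice a standard curve per primer pair would replace the fixed efficiency; the fixed value only keeps the example short.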
Fig. 8 (left) shows the relative abundance of Limnohabitans obtained by the fluorescent quantitative PCR experiment. The relative abundance rises markedly from the upstream section to the river section and falls to some extent after the water enters the reservoir section, which is essentially consistent with the trend of the comprehensive water quality score (Fig. 8, right).
A regression equation is fitted between the comprehensive water quality score and the relative abundance of Limnohabitans, so that the water quality grade of a sample can be predicted from the relative abundance of Limnohabitans in it (Fig. 9). Fig. 9 shows a highly significant linear correlation (P < 0.001) between the relative abundance of the water quality indicating microorganism in the sample and the comprehensive water quality score: the higher the relative abundance, the higher the score and the poorer the water quality. Fig. 10 (left) plots the residuals between the score predicted from the relative abundance of Limnohabitans and the actual score, and Fig. 10 (right) gives the frequency distribution of these residuals. The residuals are distributed randomly, and the difference between predicted and actual scores does not change noticeably over the observed range of relative abundance. A K-S (Kolmogorov-Smirnov) test on the residuals shows that they follow a normal distribution (P > 0.05).
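Fitting equation between the water quality indicating microorganism and the comprehensive water quality score, where x is the relative abundance of the water quality indicating microorganism and y is the comprehensive water quality score:
$y = 2.923x - 0.654$; $R^2 = 0.492$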
According to the critical-value test for the correlation coefficient, the fitted regression line is significant at the 99.9% confidence level only if the correlation coefficient exceeds 0.356. Here the correlation coefficient is $\sqrt{0.492} \approx 0.70$, well above that value, so the equation fits the comprehensive water quality score well.
The relation between the relative abundance of the water quality indicating microorganism in the sample and the water quality grade is determined from Table 6 and the regression equation (Table 9), and the prediction results for the samples are tallied in Table 10. According to Table 10, when the relative abundance of the water quality indicating microorganism is used as the independent variable, the water quality grade is predicted with an accuracy of 75%. In addition, because few actual samples fall in the to-be-improved grade, practical prediction work can use as the critical value the relative abundance at which the upper bound of the 95% prediction interval reaches the to-be-improved grade (Fig. 9), so that water bodies needing improvement are not missed.
Table 9: correlation between relative abundance of water quality microorganisms and water quality grade
Range of relative abundance of microorganisms in water | Water quality grade |
Less than 0.166 | Good |
Between 0.166 and 0.847 | In general |
Greater than 0.847 | To be improved |
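The fitted equation and the ranges in Table 9 can be applied to a measured relative abundance as sketched below. The coefficients and cut-offs are the ones given in the text; the function names are hypothetical, and the handling of values exactly on a boundary is an assumption because Table 9 does not specify it.

```python
# Coefficients of the fitted equation y = 2.923x - 0.654 from the text.
SLOPE = 2.923
INTERCEPT = -0.654

# Relative-abundance cut-offs from Table 9.
GOOD_MAX = 0.166
GENERAL_MAX = 0.847


def predict_score(relative_abundance):
    """Predicted comprehensive water quality score for a given relative abundance."""
    return SLOPE * relative_abundance + INTERCEPT


def grade_from_abundance(relative_abundance):
    """Water quality grade according to the ranges in Table 9."""
    if relative_abundance < GOOD_MAX:
        return "good"
    if relative_abundance <= GENERAL_MAX:  # assumption: boundaries fall in the middle grade
        return "in general"
    return "to be improved"


if __name__ == "__main__":
    for x in (0.10, 0.50, 0.90):
        print(x, round(predict_score(x), 3), grade_from_abundance(x))
```

Substituting the upper cut-off 0.847 into the equation gives a score of about 1.822, which matches the class III threshold score quoted earlier.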
Table 10: accuracy rate for predicting water quality grade by using water quality indicative microorganism relative abundance as independent variable
Predicted grade | Good (actual) | In general (actual) | To be improved (actual) | Accuracy |
Good (predicted) | 46 | 14 | 0 | 76.7% |
In general (predicted) | 6 | 17 | 1 | 70.8% |
To be improved (predicted) | 0 | 0 | 0 | 0 |
Overall accuracy | 75.0% |
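The accuracy figures in Table 10 can be reproduced from the counts themselves. The sketch below is illustrative only; the counts are copied from the table, while the layout as a nested list and the function names are hypothetical.

```python
# Rows are predicted grades, columns are actual grades (counts from Table 10).
GRADES = ["good", "in general", "to be improved"]
CONFUSION = [
    [46, 14, 0],
    [6, 17, 1],
    [0, 0, 0],
]


def row_accuracy(row_index):
    """Share of samples predicted as a grade that actually had that grade."""
    row = CONFUSION[row_index]
    total = sum(row)
    if total == 0:
        return None  # no sample was predicted in this grade
    return row[row_index] / total


def overall_accuracy():
    """Correct predictions (the diagonal) divided by all samples."""
    correct = sum(CONFUSION[i][i] for i in range(len(GRADES)))
    total = sum(sum(row) for row in CONFUSION)
    return correct / total


if __name__ == "__main__":
    for i, grade in enumerate(GRADES):
        acc = row_accuracy(i)
        label = "n/a" if acc is None else f"{acc:.1%}"
        print(f"{grade} (predicted): {label}")
    print(f"overall: {overall_accuracy():.1%}")
```

With 63 of 84 samples on the diagonal, the overall figure is 75.0%. No sample was predicted as to be improved, so that row has no meaningful accuracy; the table reports it as 0.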
In conclusion, the water quality microorganism indication method based on big data and molecular biotechnology provided by the invention gives indication results consistent with those obtained from a large number of conventional water quality indices, while avoiding the work of measuring all those indices in conventional water quality testing, which reduces testing cost and improves efficiency. In addition, the invention designs specific primers for the screened water quality indicating microorganism, so in practical application the indicator does not need to be measured by high-throughput sequencing; molecular biology techniques (fluorescent quantitative PCR, digital chips, fluorescence in situ hybridization and the like) can be used instead. This lowers the cost and shortens the experimental cycle of water quality prediction while improving its efficiency. The invention provides a technology for indicating the water quality of a water environment using biological factors, in particular planktonic bacteria in the water body. The screened indicative microorganisms can be widely applied to water quality indication and early warning in water environments such as rivers, lakes and reservoirs, and provide strong support for pollution prevention and control.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.
Sequence listing
<110> Shanghai Jiao Tong University
<120> water quality microorganism indication method based on big data and molecular biotechnology
<130>KAG43519
<160>4
<170>SIPOSequenceListing 1.0
<210>1
<211>19
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>1
tacgctaata ccgcatacg 19
<210>2
<211>19
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>2
tacgctaata ccgcatacg 19
<210>3
<211>20
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>3
agagtttgat cctggctcag 20
<210>4
<211>16
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>4
taccttgtta cgactt 16
Claims (5)
1. A water quality microorganism indication method based on big data and molecular biotechnology is characterized by comprising the following steps:
(1) collecting a research area sample, carrying out water quality detection according to the quality standard of the national surface water environment, and analyzing an index detection result to obtain main pollution factors and water quality characteristics of the research area;
(2) performing mathematical treatment on the main pollution factors, and obtaining a sample water quality comprehensive score based on principal component analysis;
(3) obtaining a microbial flora structure in a research area through high-throughput sequencing, and screening out microorganisms with an indicating effect on water quality through linear discriminant analysis, correlation analysis and fitting analysis;
(4) designing a specific primer of the water quality indicating microorganism, taking the relative abundance of the water quality indicating microorganism in the sample as an independent variable, taking the comprehensive score of the water quality of the sample as a dependent variable, constructing a fitting regression equation, and predicting the grade of the water quality of the sample.
2. The method for indicating water quality microorganisms based on big data and molecular biotechnology according to claim 1, wherein the step (1) is specifically: setting n water quality sampling points i (i = 1, 2, …, n) according to the different surrounding environments or water quality characteristics of the research area, and continuously collecting surface water samples at each sampling point over a certain period; performing water quality analysis on all samples according to the national surface water environment quality standard to obtain the detection values of the different water quality indexes for each sample; and counting the water quality factors that exceed the standard in all samples, analyzing how each water quality index changes between sampling points, and determining the main pollution factors and the water quality characteristics of the research area.
3. The method for indicating water quality microorganisms based on big data and molecular biotechnology according to claim 1, wherein the step (2) is specifically:
A. carrying out standardized treatment on the main pollution factors, and excluding the difference of dimensions and magnitude;
$x_{ij}^{*} = (x_{ij} - \bar{x}_i) / \sigma_i$, $i = 1, 2, \ldots, p$; $j = 1, 2, \ldots, n$; wherein p is the number of main pollution factors;
$x_{ij}$ is the original datum of the jth monitoring point for the ith index; $\bar{x}_i$ and $\sigma_i$ are respectively the mean and the standard deviation of the ith index;
B. calculating a correlation coefficient matrix: from the standardized data matrix $(x_{ij}^{*})_{p \times n}$, calculating the correlation coefficient matrix $R = (r_{ij})_{p \times p}$;
C. calculating the eigenvalues and eigenvectors: solving the characteristic equation $|\lambda I - R| = 0$ for the eigenvalues $\lambda_i$ ($i = 1, 2, \ldots, p$) and arranging them in order of magnitude; then obtaining the eigenvector $a_i$ ($i = 1, 2, \ldots, p$) corresponding to each eigenvalue $\lambda_i$, requiring $\|a_i\| = 1$; wherein $\lambda_i$ is the variance of the principal component $F_i$, and the greater the variance, the greater its contribution to the total variance;
D. calculating the contribution rate and determining the number of principal components: calculating the contribution rate of the principal component $F_i$ as $\lambda_i / \sum_{k=1}^{p} \lambda_k$; determining the number m of principal components $F_i$ on the principle that the cumulative contribution rate is greater than 85% or the eigenvalue is greater than 1;
E. calculating the comprehensive water quality score of the samples, and determining the discrimination standard of each water quality grade to obtain the water quality grade of each sample: according to $F_i = \sum_{j=1}^{p} a_{ij} Z_j$ (where $a_{ij}$ is the principal component eigenvector and $Z_j$ is the standardized value of the original variable; $i = 1, 2, \ldots, m$; $j = 1, 2, \ldots, p$) and the composite-score formula [equation missing in source], calculating the comprehensive water quality score of the sample.
4. The method for indicating water quality microorganisms based on big data and molecular biotechnology according to claim 1, wherein the step (3) comprises the following steps: obtaining the composition information of microbial communities in samples with different surrounding environments or water quality characteristics by adopting a high-throughput sequencing technology method, and screening out the information of the bacterial communities sensitive to different environmental conditions through linear discrimination analysis; bacteria sensitive to different environments are taken as research objects, and the bacteria which are most closely related to water quality change are found as water quality indicating microorganisms based on correlation analysis and fitting analysis.
5. The method for indicating water quality microorganisms based on big data and molecular biotechnology according to claim 1, wherein the step (4) comprises the following steps: searching gene sequence information of the water quality indicating microorganism, and designing a specific Primer of the water quality indicating microorganism through Primer design software Primer 6.0;
obtaining the relative abundance of the water quality indicating microorganism in the sample by fluorescent quantitative PCR, using the specific primer and the bacterial universal primer as tools:
relative abundance of the water quality indicating microorganism = target gene copy number / bacterial gene copy number
where CT1 is the cycle number obtained with the target gene primers and CT2 is the cycle number obtained with the bacterial universal primers; the detection limit was set at 31;
and taking the relative abundance of the water quality indicating microorganism in the sample as the independent variable and the comprehensive water quality score of the sample as the dependent variable, constructing a fitted regression equation, and predicting the water quality grade of the sample.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010157647.0A CN111398539A (en) | 2020-03-09 | 2020-03-09 | Water quality microorganism indication method based on big data and molecular biotechnology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111398539A (en) | 2020-07-10 |
Family
ID=71432404
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010157647.0A Pending CN111398539A (en) | 2020-03-09 | 2020-03-09 | Water quality microorganism indication method based on big data and molecular biotechnology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111398539A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016159154A1 (en) * | 2015-04-03 | 2016-10-06 | 住友化学株式会社 | Prediction-rule generating system, prediction system, prediction-rule generating method, and prediction method |
CN107045053A (en) * | 2017-06-19 | 2017-08-15 | 河海大学 | A kind of surface water quality overall evaluation system based on controllable standard |
CN107122927A (en) * | 2017-06-27 | 2017-09-01 | 河海大学 | A kind of water transfer drainage water environment improvement integrated evaluating method |
CN107273711A (en) * | 2017-06-22 | 2017-10-20 | 宁波大学 | A kind of shrimp disease quantitative forecasting technique based on enteron aisle bacterial indicator |
CN107679676A (en) * | 2017-10-27 | 2018-02-09 | 河海大学 | A kind of city based on numerical simulation is low to influence exploitation Optimal Configuration Method |
CN110055343A (en) * | 2019-04-30 | 2019-07-26 | 清华大学 | A kind of specific primer polluted for monitoring the quick-acting phosphorus loads of deposit |
CN110070144A (en) * | 2019-04-30 | 2019-07-30 | 云南师范大学 | A kind of lake water quality prediction technique and system |
CN110308255A (en) * | 2019-05-08 | 2019-10-08 | 宁波大学 | One kind is based on Pollution indicating bacteria group to coastal waters degree of water pollution quantitative forecasting technique |
Non-Patent Citations (1)
Title |
---|
肖亦农等: "《环境微生物学实验基础》", 31 May 2018 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113189291A (en) * | 2021-04-30 | 2021-07-30 | 广州绿曦生物科技有限公司 | Natural water system water quality condition assessment method and application thereof |
CN113189291B (en) * | 2021-04-30 | 2023-10-27 | 广州绿曦生物科技有限公司 | Natural water system water quality condition assessment method and application thereof |
CN113718019A (en) * | 2021-09-03 | 2021-11-30 | 上海城市水资源开发利用国家工程中心有限公司 | Method for evaluating biological safety of water supply system by using sodB gene |
CN116660486A (en) * | 2023-05-24 | 2023-08-29 | 重庆交通大学 | Water quality evaluation standard determining method based on large benthonic animal BI index |
CN117391613A (en) * | 2023-10-08 | 2024-01-12 | 菏泽单州数字产业发展有限公司 | Agricultural industry garden management system based on Internet of things |
CN117391613B (en) * | 2023-10-08 | 2024-03-15 | 菏泽单州数字产业发展有限公司 | Agricultural industry garden management system based on Internet of things |
CN117171597A (en) * | 2023-11-02 | 2023-12-05 | 北京建工环境修复股份有限公司 | Method, system and medium for analyzing polluted site based on microorganisms |
CN117171597B (en) * | 2023-11-02 | 2024-01-02 | 北京建工环境修复股份有限公司 | Method, system and medium for analyzing polluted site based on microorganisms |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20200710 |