Genes with stable DNA methylation levels show higher evolutionary conservation than genes with fluctuant DNA methylation levels
--Supplementary information: source code, supplementary file and links to data


Abstract

Different human genes often exhibit different degrees of robustness or stability in their DNA methylation levels between tissues, samples or cell types. This may be related to the evolution of the human genome. We therefore compared evolutionary conservation between two types of genes: genes with stable DNA methylation levels (SM genes) and genes with fluctuant DNA methylation levels (FM genes). For long-term evolutionary characteristics between species, we compared the percentage of orthologous genes, the evolutionary rate dn/ds and the protein sequence identity. We found that the SM genes had a greater percentage of orthologous genes, a lower dn/ds and a higher protein sequence identity in all 21 species. These results indicated that the SM genes were more evolutionarily conserved than the FM genes. For short-term evolutionary characteristics among human populations, we compared the single nucleotide polymorphism (SNP) density and the degree of linkage disequilibrium (LD) in HapMap populations and 1000 Genomes Project populations. We observed that the SM genes had lower SNP densities and a higher degree of LD in all 11 HapMap populations and all 13 1000 Genomes Project populations. These results suggest that the SM genes had more stable chromosome genetic structures and were more conserved than the FM genes.

Adjustment for batch effect among different series

The batch effect among the different series is adjusted for using the Empirical Bayes (EB) batch correction method.
>>adjustment_for_batch_effect.r     >>
Download

Calculate the percentage of orthologous genes for each of the 21 species

This script calculates the percentage of orthologous genes for each of the 21 species.
>>calculation_of_ortholog_rate.pl     >>
Download
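
As a rough illustration of this step (not the Perl script itself), the percentage can be computed as the share of genes in a set that have at least one ortholog in a given species. The function name, the input layout and the toy gene and species names below are assumptions made for this sketch.

```python
def ortholog_percentage(gene_set, orthologs_by_species):
    """Return {species: percentage of genes in gene_set with an ortholog}.

    orthologs_by_species maps a species name to the collection of human
    genes that have an ortholog in that species (hypothetical layout).
    """
    genes = set(gene_set)
    if not genes:
        raise ValueError("gene_set is empty")
    return {
        species: 100.0 * len(genes & set(with_ortholog)) / len(genes)
        for species, with_ortholog in orthologs_by_species.items()
    }


# Toy data: gene and species names are placeholders, not real results.
sm_genes = ["G1", "G2", "G3", "G4"]
orthologs = {"species_A": {"G1", "G2", "G3"}, "species_B": {"G1"}}
result = ortholog_percentage(sm_genes, orthologs)
```

Running the same function on the SM and FM gene sets gives one pair of percentages per species, which is the quantity compared in the abstract.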

Calculate the variance (fluctuation coefficient, FC) of methylation levels of genes

For each gene, the variance of its DNA methylation levels is calculated to measure how much its methylation level varies.
>>calculation_of_variance_of_methylation_status.pl     >>
Download
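
A minimal sketch of the variance calculation, assuming each gene has one methylation (beta) value per sample. Whether the original script uses the sample or the population variance is not stated here; the sample variance is used below as an assumption, and the function name is hypothetical.

```python
from statistics import variance


def fluctuation_coefficient(beta_by_gene):
    """Return {gene: variance of its methylation levels across samples}.

    Uses the sample variance (n - 1 denominator); this choice is an
    assumption, not something stated in the source.
    """
    return {gene: variance(values) for gene, values in beta_by_gene.items()}


# Toy beta values for two genes across three samples.
betas = {
    "G1": [0.10, 0.12, 0.11],  # nearly constant: low FC
    "G2": [0.10, 0.90, 0.50],  # varies widely: high FC
}
fc = fluctuation_coefficient(betas)
```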

Determine FM and SM gene sets

Genes are sorted in increasing order of their FC values. The top 20% of the sorted genes (lowest FC) are used as SM genes, and the bottom 20% (highest FC) are used as FM genes.
>>filter_FMG_and_SMG.pl     >>
Download
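
The selection rule can be sketched as follows. The function name and the `fraction` parameter are assumptions introduced for illustration; rounding the group size down is also an assumption.

```python
def split_sm_fm(fc_by_gene, fraction=0.2):
    """Sort genes by increasing FC and return (sm_genes, fm_genes).

    sm_genes: the first `fraction` of the sorted list (lowest FC).
    fm_genes: the last `fraction` of the sorted list (highest FC).
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    ranked = sorted(fc_by_gene, key=fc_by_gene.get)
    k = int(len(ranked) * fraction)  # group size, rounded down
    return ranked[:k], ranked[len(ranked) - k:]


# Toy FC values for ten genes: G0 is the most stable, G9 the most fluctuant.
toy_fc = {f"G{i}": i / 100 for i in range(10)}
sm_genes, fm_genes = split_sm_fm(toy_fc)
```

Calling `split_sm_fm(toy_fc, fraction=0.5)` splits the whole sorted list into two halves instead.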

The source code of box plots, bar plots and statistical tests

This code file was used to draw the box plots and bar plots for each of the evolutionary features and to perform the statistical tests.
>>boxplot_barplot_wilcox_test.r     >>
Download
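
The script name suggests a Wilcoxon rank-sum test. As a hedged illustration of what such a test measures, the sketch below computes only the rank-sum (Mann-Whitney) U statistic for two groups, with tied values given their average rank. It does not compute a p-value, and it is not the R code from the download.

```python
def rank_sum_u(x, y):
    """Mann-Whitney U statistic of x versus y, using average ranks for ties.

    U equals the number of (x_i, y_j) pairs with x_i > y_j, counting
    ties as one half.
    """
    pooled = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(pooled)
    start = 0
    while start < len(pooled):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(pooled) and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # 1-based average rank of the tie group
        for _, idx in pooled[start:end + 1]:
            ranks[idx] = avg_rank
        start = end + 1
    rank_sum_x = sum(ranks[:len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2


# Toy values: every FM value exceeds every SM value.
u_low = rank_sum_u([1, 2, 3], [4, 5, 6])
u_high = rank_sum_u([4, 5, 6], [1, 2, 3])
```

A U value near 0 or near `len(x) * len(y)` indicates that one group tends to have systematically smaller or larger values than the other.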

The source code of principal component analysis using methylation levels and methylation stability

To further investigate the effect of both methylation levels and methylation fluctuation on the results of evolutionary conservation, we performed a principal component analysis (PCA) which converted the methylation level and fluctuation coefficient (FC) into two linearly uncorrelated principal components represented by PC1 and PC2.
>>pca.r     >>
Download
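
For two variables, PCA has a simple closed form once both are standardized: the principal axes are the sum and difference directions, and the explained variances are 1 + |r| and 1 - |r|, where r is the correlation. The sketch below uses that shortcut. It assumes both variables are standardized before PCA, which the description above does not state, and all names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(values):
    """Center to mean 0 and scale to population standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def pca_two_variables(levels, fcs):
    """Return (pc1_scores, pc2_scores, (var_pc1, var_pc2)).

    Closed-form PCA of two standardized variables: the eigenvectors of
    the 2x2 correlation matrix are (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
    """
    z1, z2 = standardize(levels), standardize(fcs)
    r = mean(a * b for a, b in zip(z1, z2))  # Pearson correlation
    sign = 1.0 if r >= 0 else -1.0           # PC1 follows the larger eigenvalue
    pc1 = [(a + sign * b) / sqrt(2) for a, b in zip(z1, z2)]
    pc2 = [(a - sign * b) / sqrt(2) for a, b in zip(z1, z2)]
    return pc1, pc2, (1 + abs(r), 1 - abs(r))


# Toy methylation levels and FC values for four genes.
levels = [0.1, 0.4, 0.5, 0.9]
fcs = [0.02, 0.05, 0.01, 0.08]
pc1, pc2, explained = pca_two_variables(levels, fcs)
```

The two score vectors are uncorrelated by construction, which is the property the description refers to.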

Summary statistics for the comparisons of evolutionary features in 21 species

This script calculates the upper and lower quantiles and the median of each feature, including the evolutionary rate (dn/ds), the sequence identity (SI) and the LD coefficient r2 in the HapMap and 1000 Genomes Project populations.
>>calculation_of_upper_and_lower_quantile.r     >>
Download
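
A small sketch of such a summary, assuming "upper and lower quantile" means the upper and lower quartiles. The `inclusive` method of Python's `statistics.quantiles` is chosen here because it matches R's default `quantile()` interpolation; both choices are assumptions about the original script.

```python
from statistics import quantiles


def summarize(values):
    """Return the lower quartile, median and upper quartile of values."""
    lower, med, upper = quantiles(values, n=4, method="inclusive")
    return {"lower": lower, "median": med, "upper": upper}


# Toy feature values (for example, dn/ds for five genes).
summary = summarize([1, 2, 3, 4, 5])
```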

Comparison of evolutionary conservation between SM (top 50%) and FM genes (bottom 50%)

We also took the top 50% and the bottom 50% of the whole sorted gene list as SM and FM genes, respectively, and compared the evolutionary features between the two groups. The results still supported the conclusion that the SM genes had higher evolutionary conservation than the FM genes.
>>Supplementary_beta_percent50gene_result.pdf     >>
Download


The supplementary figures and tables of median beta results

Description: For genes with multiple probes, we took the median beta value of the probes annotated to the gene region as the methylation status of the gene. We then calculated the variance of the methylation level (fluctuation coefficient, FC) and extracted the FM and SM genes. Finally, the evolutionary conservation features were compared between the FM genes and the SM genes.
>>Supplementary_MedianBeta_result.pdf     >>
Download
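
The probe-to-gene summarization described above can be sketched as follows for a single sample. The probe identifiers, gene names and function name are placeholders chosen for illustration.

```python
from statistics import median


def gene_beta_from_probes(probe_betas, probe_to_gene):
    """Return {gene: median beta of the probes annotated to that gene}.

    Probes without a gene annotation are skipped.
    """
    grouped = {}
    for probe, beta in probe_betas.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:
            grouped.setdefault(gene, []).append(beta)
    return {gene: median(values) for gene, values in grouped.items()}


# Toy data for one sample: three probes on G1, one on G2, one unannotated.
probe_betas = {"p1": 0.2, "p2": 0.4, "p3": 0.9, "p4": 0.5, "p5": 0.7}
probe_to_gene = {"p1": "G1", "p2": "G1", "p3": "G1", "p4": "G2"}
gene_betas = gene_beta_from_probes(probe_betas, probe_to_gene)
```

Repeating this for every sample gives one beta value per gene per sample, from which the FC can then be calculated.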

Supplementary Table 1. The types of data are shown below:

Data type                          | Download | Available
DNA methylation data of GPL13534   | download | YES
Orthologous genes                  | download | YES
SNP data                           | download | YES
Annotation file of genome          | download | YES
The HapMap Project data            | download | YES
The 1000 Genomes Project data      | download | YES