DeepPIC

This is the code repository for the paper "Highly Automatic and Universal Approach for Extracting Features from LC-MS Data Using Deep Learning". We developed DeepPIC, a deep learning-based pure ion chromatogram (PIC) method that extracts PICs directly and automatically from raw data files. DeepPIC has been integrated into the KPIC2 framework, and the combination provides the entire pipeline from raw data to discriminant models for metabolomic datasets.

Installation

1. Install Anaconda for Python 3.8.13.

2. Install R 4.2.1.

3. Install KPIC2 in R.

For installation instructions, refer to https://github.com/hcji/KPIC2.

  • First, install the dependencies of KPIC2.
    install.packages(c("BiocManager", "devtools", "Ckmeans.1d.dp", "Rcpp", "RcppArmadillo", "mzR", "parallel", "shiny", "plotly", "data.table", "GA", "IRanges",  "dbscan", "randomForest"))
    BiocManager::install(c("mzR","ropls"))
  • Then download the KPIC2 source package at url and install it locally.

4. Create the environment and install the main packages.

  • Open a command line and create the environment.

    conda create --name DeepPIC python=3.8.13
    conda activate DeepPIC
  • Clone the repository and enter it.

    git clone https://github.com/yuxuanliao/DeepPIC.git
    cd DeepPIC
  • Install the main packages listed in requirements.txt with the following command.

    python -m pip install -r requirements.txt
  • Set the environment variables needed to call R through rpy2.

    R_HOME is the installation location of R.

    R_USER is the installation location of the rpy2 package.

    The paths below are examples from a Windows installation; adjust them to your system.

    setx "R_HOME" "C:\Program Files\R\R-4.2.1"
    setx "R_USER" "C:\Users\yxliao\anaconda3\Lib\site-packages\rpy2"
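Note that setx only affects command windows opened afterwards, so the variables are not visible in the shell where the command was run. A minimal sketch for verifying the setup from Python before importing rpy2 is shown here. The helper name check_r_environment is hypothetical and is not part of the DeepPIC repository; it only checks that both variables are set and point to existing directories.

```python
import os
from pathlib import Path


def check_r_environment(env=None):
    """Return a dict of problems found for R_HOME and R_USER (empty if none).

    `env` defaults to os.environ; passing a plain dict makes the check testable.
    This helper is illustrative only and is not part of DeepPIC or rpy2.
    """
    env = os.environ if env is None else env
    problems = {}
    for name in ("R_HOME", "R_USER"):
        value = env.get(name)
        if not value:
            problems[name] = "not set"
        elif not Path(value).is_dir():
            problems[name] = f"directory not found: {value}"
    return problems


if __name__ == "__main__":
    issues = check_r_environment()
    if issues:
        for name, problem in issues.items():
            print(f"{name}: {problem}")
    else:
        print("R_HOME and R_USER look fine.")
```

If a variable is reported as not set right after running setx, opening a new command window and activating the DeepPIC environment again is usually enough.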

DeepPIC

The following files are in the DeepPIC folder:

  • train.py. for model training
  • extract.py. extract PICs from raw LC-MS files
  • predict.py. define the IoU metric for PICs and evaluate the DeepPIC model
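For readers unfamiliar with the IoU (intersection over union) metric mentioned for predict.py, the following is a minimal sketch of the general idea. It assumes that a predicted PIC and its label are each represented as a binary mask of equal length, marking which points belong to the PIC. That representation and the function name iou are assumptions for illustration; the actual definition used by DeepPIC is in predict.py.

```python
def iou(pred_mask, true_mask):
    """Intersection over union of two equal-length binary masks.

    Illustrative only: the mask representation is an assumption,
    not necessarily how predict.py encodes PICs.
    """
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    union = sum(1 for p, t in zip(pred_mask, true_mask) if p or t)
    # Two empty masks agree perfectly; define their IoU as 1.0 by convention.
    return intersection / union if union else 1.0


if __name__ == "__main__":
    predicted = [0, 1, 1, 1, 0]
    labelled = [0, 0, 1, 1, 1]
    print(iou(predicted, labelled))  # 2 shared points / 4 covered points = 0.5
```

An IoU of 1.0 means the predicted points coincide exactly with the labelled ones, and 0.0 means they do not overlap at all.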

KPIC2

The following files are in the KPIC2 folder:

  • KPIC2.py. for integrating DeepPIC into KPIC2 to implement the whole process of metabolomics processing
  • KPIC2.R. the code for the feature detection, alignment, grouping, missing value filling, and building classification models
  • permutation_vip.py. define some functions for file format conversion, permutation test, and biomarker selection
  • files:

Others

The following files are in the others folder:

  • metabolomics.py. the code for the OPLS-DA scores plot, permutation test, biomarker selection and hierarchical cluster analysis
  • quantitative.py. evaluate the quantitative ability of feature extraction methods
  • XCMS.R. the code for XCMS to detect peaks
  • Simulation:
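As background for the permutation test mentioned above, the sketch below shows the general principle on a deliberately simple statistic (the absolute difference of group means). It is not the procedure implemented in permutation_vip.py or metabolomics.py, which works with OPLS-DA models; the function name permutation_p_value and all parameters are hypothetical.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=999, seed=0):
    """Estimate a p-value by shuffling group membership.

    Illustrative only: uses the absolute difference of means as the
    statistic, which is simpler than a model-based permutation test.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible

    def statistic(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = statistic(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Count permutations at least as extreme as the observed split.
        if statistic(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    control = [1, 2, 3, 4, 5]
    treated = [11, 12, 13, 14, 15]
    print(permutation_p_value(control, treated))
```

A small p-value indicates that the observed separation between the groups is unlikely to arise from a random assignment of labels.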

Dataset

The dataset of 200 input-label pairs used to train, validate, and test the DeepPIC model is in the dataset folder. Because the model and the other data exceed the repository's file size limits, we have uploaded the optimized model and the datasets (MM48, simulated MM48, quantitative, metabolomics and different instrumental datasets) to the GitHub release page.

Usage

The example code for model training is included in the train.ipynb.

The example code for feature extraction is included in the extract.ipynb.

The example code for integrating DeepPIC into KPIC2 to implement the whole process of metabolomics processing is included in the Integration_into_KPIC2.ipynb.

From raw LC-MS dataset to discriminant model

By running extract.py, you can use DeepPIC to extract PICs from each LC-MS file in a metabolomics dataset. The whole metabolomics processing workflow can then be carried out by running KPIC2.py directly. In this way, DeepPIC+KPIC2 can be used to process your own data. Please refer to extract.ipynb and Integration_into_KPIC2.ipynb for details.
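The per-file step described above can be pictured as a loop over the raw files of a dataset folder. The following sketch only illustrates that loop. The names find_raw_files and process_dataset, the extract_pics callback, and the file extensions are all assumptions; the real entry points and arguments are shown in extract.ipynb and Integration_into_KPIC2.ipynb.

```python
import tempfile
from pathlib import Path

# Assumption: raw LC-MS files are stored as mzXML or mzML.
RAW_EXTENSIONS = {".mzxml", ".mzml"}


def find_raw_files(dataset_dir, extensions=RAW_EXTENSIONS):
    """Return the raw LC-MS files directly inside dataset_dir, sorted by path."""
    root = Path(dataset_dir)
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )


def process_dataset(dataset_dir, extract_pics):
    """Apply extract_pics to every raw file and collect results by file name.

    extract_pics is a placeholder for whatever extraction routine is used;
    it is not a function provided by DeepPIC under that name.
    """
    return {path.name: extract_pics(path) for path in find_raw_files(dataset_dir)}


if __name__ == "__main__":
    # Demonstration with empty dummy files and a stand-in extraction callback.
    with tempfile.TemporaryDirectory() as folder:
        for name in ("sample1.mzXML", "sample2.mzML", "readme.txt"):
            (Path(folder) / name).write_text("")
        print(process_dataset(folder, lambda path: f"PICs of {path.stem}"))
```

Keeping the extraction routine as a callback leaves the loop independent of how a single file is actually processed.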

Information of maintainers
