scDIV: Single Cell RNA Sequencing Data Demultiplexing using Interindividual Variations

Introduction

This document is an introduction to and user manual for scDIV (Single Cell RNA sequencing data Demultiplexing using Interindividual Variations), an R package that uses inter-individual differential co-expression patterns to demultiplex pooled samples without any extra experimental steps.

Please see the manual for usage of scDIV, including 9 steps.

Installation

  1. Install R (LINK)
  2. Run the following commands in R/RStudio to install scDIV as an R package:
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
    
library(devtools)
install_github("isarnassiri/scDIV")

You can find sample input files in the folder returned by system.file("extdata", package = "scDIV").
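
To check which sample files ship with your installation, a sketch like the following can be used. It relies only on base R; `extdata_dir` and `sample_files` are illustrative variable names, and the files listed will depend on the installed package version.

```r
# Locate the sample-data folder of the installed scDIV package.
# system.file() returns an empty string when the package or folder is not found.
extdata_dir <- system.file("extdata", package = "scDIV")

if (nzchar(extdata_dir)) {
  # List the sample input files bundled with the package.
  sample_files <- list.files(extdata_dir)
  print(sample_files)
} else {
  sample_files <- character(0)
  message("scDIV sample data not found; is the package installed?")
}
```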

Step 1: Infer genetic variants from scRNA-seq data

cellsnp-lite is used to pile up the expressed alleles in single-cell data. The pileup can be used directly for donor deconvolution in multiplexed single-cell RNA-seq data, which assigns cells to donors without a genotyping reference (LINK).

cellsnp-lite takes a BAM file and a list of barcodes as sample-specific inputs, plus a Variant Call Format (VCF) file listing all candidate SNPs (regionsVCF) as a fixed reference input:

cellsnp-lite -s possorted_genome_bam.bam -b barcodes.tsv.gz -O FOLDER-NAME -R regionsVCF -p 22 --minMAF 0.05 --minCOUNT 10 --gzip 

cellsnp-lite generates a VCF file containing the called genetic variants.

Step 2: Demultiplex pooled samples

We use Vireo (Variational Inference for Reconstructing Ensemble Origin) for donor deconvolution using expressed SNPs in multiplexed scRNA-seq data (LINK).

Vireo takes the variant information file produced by cellsnp-lite as input:

vireo -c input-vcf-file -o output-folder --randSeed 2 -N Number-of-donors -t GP  

We use the "donor_ids.tsv" file from the Vireo outputs for downstream analysis.
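
Before moving on, it can help to inspect how many cells were assigned to each donor. The sketch below is a minimal base R example on a made-up miniature table. It assumes that donor_ids.tsv is tab-separated with `cell` and `donor_id` columns and that Vireo labels ambiguous cells as `doublet` or `unassigned`; check these assumptions against your own file.

```r
# Made-up miniature of a Vireo "donor_ids.tsv" table
# (assumed columns: cell, donor_id, prob_max; values are illustrative only).
donor_ids_text <- paste(
  "cell\tdonor_id\tprob_max",
  "AAACCTGAGAAACCAT-1\tdonor0\t0.99",
  "AAACCTGAGAAACCGC-1\tdonor1\t0.97",
  "AAACCTGAGAAACCTA-1\tdoublet\t0.55",
  "AAACCTGAGAAACGAG-1\tdonor0\t0.98",
  "AAACCTGAGAAACGCC-1\tunassigned\t0.60",
  sep = "\n"
)

# For a real run, replace `text = donor_ids_text` with the path to donor_ids.tsv.
donor_ids <- read.delim(text = donor_ids_text, stringsAsFactors = FALSE)

# Number of cells per assignment label.
assignment_counts <- table(donor_ids$donor_id)
print(assignment_counts)

# Keep only cells confidently assigned to a single donor.
singlets <- donor_ids[!donor_ids$donor_id %in% c("doublet", "unassigned"), ]
```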

Step 3: Generate gene-cell count matrix

Raw count data from 10X CellRanger (outs/read_count.csv) or other single-cell experiments have genes as rows (gene names should be human or mouse Ensembl gene IDs) and cells as columns. The cells in the read_count.csv file are those in the filtered feature-barcode matrix generated by Cell Ranger.

You can convert the filtered HDF5 feature-barcode matrix (LINK) to a gene-cell count matrix (read_count.csv) using the cellranger mat2csv command (LINK) provided by 10x Genomics, as follows:

cellranger mat2csv filtered_feature_bc_matrix.h5 read_count.csv
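
As a quick sanity check of the expected layout (genes as rows, cells as columns), the sketch below loads a made-up miniature count table in base R. The `gene_id` header, the barcodes, and the counts are illustrative assumptions; the exact header written by mat2csv may differ.

```r
# Made-up miniature of a gene-cell count matrix in CSV form:
# first column holds Ensembl gene IDs, remaining columns are cell barcodes.
csv_text <- paste(
  "gene_id,AAACCTGAGAAACCAT-1,AAACCTGAGAAACCGC-1,AAACCTGAGAAACCTA-1",
  "ENSG00000243485,0,1,0",
  "ENSG00000237613,2,0,0",
  "ENSG00000186092,0,3,5",
  sep = "\n"
)

# For a real run, replace `text = csv_text` with the path to read_count.csv.
# check.names = FALSE keeps barcodes such as "...-1" unchanged.
counts <- as.matrix(read.csv(text = csv_text, row.names = 1, check.names = FALSE))

# Basic checks: dimensions, cells expressing each gene, total counts per cell.
print(dim(counts))
cells_per_gene <- rowSums(counts > 0)
umis_per_cell <- colSums(counts)
```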

Step 4: Gene Expression Recovery

From this step on, we use functions from the scDIV package. The GeneExpressionRecovery() function performs gene expression recovery using SAVER (single-cell analysis via expression recovery), an expression recovery method for unique molecule index (UMI)-based scRNA-seq data that provides accurate expression estimates for all genes in a scRNA-seq profile.

library("scDIV")
csQCEAdir <- system.file("extdata", package = "scDIV")
Donors='donor6_donor2'
FC='FAI5649A17'
GeneExpressionRecovery( InputDir = csQCEAdir, Donors = Donors, FC = FC )

You can find the results in the SAVER/ folder, in files whose names end in 'AssignedCells.txt' and 'AllCells.txt'.
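
Step 5 takes one of these files as its ERP argument. Judging from the sample file name used there, the name appears to combine the Donors and FC values with the 'AssignedCells.txt' suffix. Assuming that pattern holds for your data, the name can be built rather than typed by hand:

```r
# Assumed naming pattern, inferred from the sample file name used in Step 5:
# <Donors>_<FC>_AssignedCells.txt
Donors <- "donor6_donor2"
FC <- "FAI5649A17"

ERP <- paste0(Donors, "_", FC, "_AssignedCells.txt")
print(ERP)
```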

Step 5: Inter-individual Differential gene Correlation Analysis (IDCA)

The IDCA() function uses correlation coefficients to perform Inter-individual Differential gene Correlation Analysis (IDCA) for a pair of donors (D1 and D2) and a pair of genes (G1 and G2).

library("scDIV")
ERP = "donor6_donor2_FAI5649A17_AssignedCells.txt"
Donors='donor6_donor2'
FC='FAI5649A17'
InputDir = system.file("extdata", package = "scDIV")
IDCA( InputDir, Donors, FC, ERP, TEST = TRUE )

You can find the results in the IDCA_Analysis/ folder.
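
To make the idea of differential correlation concrete, the sketch below compares the correlation of two genes (G1, G2) between two donors (D1, D2) using a Fisher z-test on simulated values. This is a generic textbook comparison written in base R, not the code that IDCA() runs; the simulated data and all variable names are made up for illustration.

```r
set.seed(1)

# Simulated expression values (illustrative only).
n_d1 <- 50  # number of cells from donor D1
n_d2 <- 60  # number of cells from donor D2

# Donor D1: G1 and G2 are co-expressed.
g1_d1 <- rnorm(n_d1)
g2_d1 <- g1_d1 + rnorm(n_d1, sd = 0.5)

# Donor D2: G1 and G2 vary independently.
g1_d2 <- rnorm(n_d2)
g2_d2 <- rnorm(n_d2)

# Pearson correlation of the gene pair within each donor.
r_d1 <- cor(g1_d1, g2_d1)
r_d2 <- cor(g1_d2, g2_d2)

# Fisher z-transform (atanh) and test for a difference between the two correlations.
z_diff <- (atanh(r_d1) - atanh(r_d2)) / sqrt(1 / (n_d1 - 3) + 1 / (n_d2 - 3))
p_value <- 2 * pnorm(-abs(z_diff))

print(c(r_d1 = r_d1, r_d2 = r_d2, z_diff = z_diff, p_value = p_value))
```

A gene pair whose correlation differs strongly between donors, as in this simulation, is the kind of signal that can help tell the donors' cells apart.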

Step 6: Visualization of IDCA outputs

The IDCAvis() function visualizes the outputs of the IDCA step.

library("scDIV")
InputDir = system.file("extdata", package = "scDIV")
IDCAvis( InputDir )

You can find the results in the IDCA_Analysis/IDCA_Plots/ folder as PDF file(s).

Step 7: Expression Aware Demultiplexing per Donor Pair

The EADDonorPair() function uses inter-individual differential co-expression patterns for demultiplexing per donor pair.

library("scDIV")
InputDir = system.file("extdata", package = "scDIV")
EADDonorPair( InputDir )

You can find the results in the IDCA_Analysis/Expression_Aware_Cell_Assignment/ folder, in a file called "Expression_Aware_Cell_Assignment.txt".

Step 8: Expression Aware Demultiplexing per sample pool

The EADDonorPair() function uses inter-individual differential co-expression patterns for demultiplexing the pooled samples.

library("scDIV")
InputDir = system.file("extdata", package = "scDIV")
EADDonorPair( InputDir )

You can find the results in the IDCA_Analysis/Expression_Aware_Cell_Assignment/ folder, in files called "Results_Expression_Aware_Cell_Assignment.txt" and "Summary.txt".

Citation

Nassiri I, Andrew J Kwok, Aneesha Bhandari, Katherine R Bull, Lucy C Garner, Paul Klenerman, Caleb Webber, Laura Parkkinen, Angela W Lee, Yanxia Wu, Benjamin Fairfax, Julian C Knight, David Buck, Paolo Piazza. Demultiplexing of Single Cell RNA Sequencing Data using Interindividual Variation in Gene Expression. Bioinformatics Advances - Oxford Academic. 2024;4(1).
