This documentation is an introduction to and user manual for scDIV (Single Cell RNA Sequencing Data Demultiplexing using Interindividual Variations), an R package that uses inter-individual differential co-expression patterns to demultiplex pooled samples without any extra experimental steps.
Please see the manual for usage of scDIV, which consists of 9 steps.
- Install R (LINK)
- Run the following commands in R/RStudio to install scDIV as an R package:
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
library(devtools)
install_github("isarnassiri/scDIV")
You can find sample input files in the folder returned by system.file("extdata", package = "scDIV").
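As a quick check that the installation worked, the sample input files can be listed from R. This is a minimal sketch that relies only on base R; the variable name extdata_dir is chosen here for illustration.

```r
# Locate the example data directory shipped with scDIV.
# system.file() returns "" when the package or folder cannot be found.
extdata_dir <- system.file("extdata", package = "scDIV")

if (nzchar(extdata_dir)) {
  # Show the bundled input files
  print(list.files(extdata_dir))
} else {
  message("scDIV does not appear to be installed; no example files to list.")
}
```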
cellsnp-lite is used to pile up the expressed alleles in single-cell data. Its output can be used directly for donor deconvolution in multiplexed single-cell RNA-seq data, which assigns cells to donors without a genotyping reference (LINK).
cellsnp-lite takes a BAM file and a list of barcodes as sample-specific inputs, and a Variant Call Format (VCF) file listing all candidate SNPs (regionsVCF) as a fixed reference input:
cellsnp-lite -s possorted_genome_bam.bam -b barcodes.tsv.gz -O FOLDER-NAME -R regionsVCF -p 22 --minMAF 0.05 --minCOUNT 10 --gzip
cellsnp-lite generates a VCF file containing the called genetic variants.
We use Vireo (Variational Inference for Reconstructing Ensemble Origin) for donor deconvolution using expressed SNPs in multiplexed scRNA-seq data (LINK).
Vireo takes the variant information file produced by cellsnp-lite as input:
vireo -c input-vcf-file -o output-folder --randSeed 2 -N Number-of-donors -t GP
We use the "donor_ids.tsv" file from the Vireo outputs for downstream analysis.
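Before moving on, it can help to inspect how many cells Vireo assigned to each donor. The sketch below is illustrative only: it writes a small synthetic stand-in for donor_ids.tsv, and the column names (cell, donor_id, prob_max, prob_doublet) and the "doublet" and "unassigned" labels are assumptions based on typical Vireo output, so check them against your own file.

```r
# Synthetic stand-in for Vireo's donor_ids.tsv (layout is an assumption)
example_path <- tempfile(fileext = ".tsv")
writeLines(c(
  "cell\tdonor_id\tprob_max\tprob_doublet",
  "AAACCTGAGAAACCAT-1\tdonor0\t0.99\t0.00",
  "AAACCTGAGAAACCGC-1\tdonor1\t0.98\t0.01",
  "AAACCTGAGAAACCTA-1\tdonor0\t0.97\t0.02",
  "AAACCTGAGAAACGAG-1\tdoublet\t0.10\t0.90",
  "AAACCTGAGAAACGCC-1\tunassigned\t0.55\t0.30"
), example_path)

# Read the table; replace example_path with the path to your own donor_ids.tsv
donor_ids <- read.delim(example_path, stringsAsFactors = FALSE)

# Number of cells per Vireo label (donors, doublets, unassigned)
cells_per_label <- table(donor_ids$donor_id)
print(cells_per_label)

# Keep only cells assigned to a single donor
singlets <- donor_ids[!donor_ids$donor_id %in% c("doublet", "unassigned"), ]
print(singlets$cell)
```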
Raw count data from 10x Genomics Cell Ranger (outs/read_count.csv) or other single-cell experiments has genes as rows (gene names should be human or mouse Ensembl gene IDs) and cells as columns. The cells in the read_count.csv file are those in the filtered feature-barcode matrix generated by Cell Ranger.
You can convert the filtered HDF5 Feature-Barcode Matrix (LINK) to a gene-cell count matrix (read_count.csv) using the cellranger mat2csv (LINK) command provided by 10x Genomics, as follows:
cellranger mat2csv filtered_feature_bc_matrix.h5 read_count.csv
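The expected layout (Ensembl gene IDs as rows, cells as columns) can be checked in R before running the scDIV functions. The following is a minimal sketch on a tiny synthetic CSV; the exact header layout of a real read_count.csv is an assumption here, and the barcodes and counts are made up for illustration.

```r
# Tiny synthetic gene-by-cell count matrix (values are illustrative only)
example_csv <- tempfile(fileext = ".csv")
writeLines(c(
  ",AAACCTGAGAAACCAT-1,AAACCTGAGAAACCGC-1",
  "ENSG00000243485,0,1",
  "ENSG00000237613,2,0",
  "ENSG00000186092,0,0"
), example_csv)

# First column becomes the row names; check.names = FALSE keeps barcodes as-is
counts <- as.matrix(read.csv(example_csv, row.names = 1, check.names = FALSE))

# Rows should be genes and columns should be cells
cat("genes:", nrow(counts), "cells:", ncol(counts), "\n")

# Row names should look like human (ENSG...) or mouse (ENSMUSG...) Ensembl gene IDs
is_ensembl <- grepl("^ENS(MUS)?G[0-9]+", rownames(counts))
if (!all(is_ensembl)) {
  warning("Some row names do not look like Ensembl gene IDs")
}
```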
From this step onward, we use functions from the scDIV package. The GeneExpressionRecovery() function performs gene expression recovery using SAVER (single-cell analysis via expression recovery), an expression recovery method for unique molecular identifier (UMI)-based scRNA-seq data that provides accurate expression estimates for all genes in an scRNA-seq profile:
library("scDIV")
csQCEAdir <- system.file("extdata", package = "scDIV")
Donors='donor6_donor2'
FC='FAI5649A17'
GeneExpressionRecovery( InputDir = csQCEAdir, Donors = Donors, FC = FC )
You can find the results in the SAVER/ folder, in files whose names end in 'AssignedCells.txt' and 'AllCells.txt'.
The IDCA() function uses correlation coefficients to perform Inter-individual Differential gene Correlation Analysis (IDCA) for two donors (D1 and D2) and a pair of genes (G1 and G2).
library("scDIV")
ERP = "donor6_donor2_FAI5649A17_AssignedCells.txt"
Donors='donor6_donor2'
FC='FAI5649A17'
InputDir = system.file("extdata", package = "scDIV")
IDCA( InputDir, Donors, FC, ERP, TEST = TRUE )
You can find the results in the IDCA_Analysis/ folder.
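For intuition about what a differential correlation between two donors means, the sketch below compares the correlation of one gene pair (G1, G2) in donor D1 with the same pair in donor D2. It uses a generic Fisher z-transformation test on simulated data, which is a stand-in and not necessarily the procedure implemented in IDCA(); the function name diff_cor_test and all values are hypothetical.

```r
set.seed(1)
n_cells <- 200

# Donor D1: G1 and G2 are co-expressed (simulated)
g1_d1 <- rnorm(n_cells)
g2_d1 <- 0.8 * g1_d1 + rnorm(n_cells, sd = 0.5)

# Donor D2: G1 and G2 are independent (simulated)
g1_d2 <- rnorm(n_cells)
g2_d2 <- rnorm(n_cells)

# Hypothetical helper: test whether cor(G1, G2) differs between two donors
diff_cor_test <- function(x1, y1, x2, y2) {
  r1 <- cor(x1, y1)                      # Pearson correlation in donor 1
  r2 <- cor(x2, y2)                      # Pearson correlation in donor 2
  z1 <- atanh(r1)                        # Fisher z-transformation
  z2 <- atanh(r2)
  se <- sqrt(1 / (length(x1) - 3) + 1 / (length(x2) - 3))
  z  <- (z1 - z2) / se                   # standardized difference
  list(r_d1 = r1, r_d2 = r2, z = z, p_value = 2 * pnorm(-abs(z)))
}

result <- diff_cor_test(g1_d1, g2_d1, g1_d2, g2_d2)
str(result)
```

A gene pair whose correlation differs strongly between donors, as in this simulation, is the kind of signal that can help distinguish cells from D1 and D2.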
The IDCAvis() function visualizes the outputs of the IDCA analysis.
library("scDIV")
InputDir = system.file("extdata", package = "scDIV")
IDCAvis( InputDir )
You can find the results in the IDCA_Analysis/IDCA_Plots/ folder as PDF file(s).
The EADDonorPair() function uses inter-individual differential co-expression patterns for demultiplexing per donor pair.
library("scDIV")
InputDir = system.file("extdata", package = "scDIV")
EADDonorPair( InputDir )
You can find the results in the IDCA_Analysis/Expression_Aware_Cell_Assignment/ folder, in a file called "Expression_Aware_Cell_Assignment.txt".
The EADDonorPair() function uses inter-individual differential co-expression patterns for demultiplexing the pooled samples.
library("scDIV")
InputDir = system.file("extdata", package = "scDIV")
EADDonorPair( InputDir )
You can find the results in the IDCA_Analysis/Expression_Aware_Cell_Assignment/ folder, in files called "Results_Expression_Aware_Cell_Assignment.txt" and "Summary.txt".
Nassiri I, Kwok AJ, Bhandari A, Bull KR, Garner LC, Klenerman P, Webber C, Parkkinen L, Lee AW, Wu Y, Fairfax B, Knight JC, Buck D, Piazza P. Demultiplexing of Single Cell RNA Sequencing Data using Interindividual Variation in Gene Expression. Bioinformatics Advances. 2024;4(1).