8000 GitHub - LingyiC/adaptiveAttractor: The adaptive version of attractor algorithm.
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content

LingyiC/adaptiveAttractor

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

21 Commits
 
 
 
 
 
 

Repository files navigation

The repository contains the code for the computational studies in The fibro-adipogenic progenitor APOD+DCN+LUM+ cell population in aggressive carcinomas.

Adaptive Attractor

An adaptive method for finding co-expression signatures in single-cell/bulk RNA-seq datasets.

Overview

An adaptive version of attractor algorithm.

cafr::findAttractor finds a converged attractor based on the seed gene provided. The findAttractor.adaptive gradually decreased the exponent parameter to maximize the strength of the Nth-ranked genes in the converged attractor.

The attractor algorithm was first proposed for identifying co-expression signatures from bulk expression values in samples [Ref.1]. A detailed description and application of the attractor algorithm on single-cell RNA-seq data can be found in [Ref.2]. Users can refer to our recent manuscript [Ref.3] for a detailed description and application of adaptive attractor algorithm.

Tutorials

Quick start

Please see Seed selection for more information about the seed.

source("./findAttractor.adaptive.r")
data <- as.matrix(data)  # a normalized expression matrix with genes in the rows, cells in the columns.
seed <- "LUM"
attr <- findAttractor.adaptive(data, seed)

Examples

R script for generating Supplementary Table 1 in our recent manuscript [link].

## install cafr package 
devtools::install_github("weiyi-bitw/cafr")

## load metadata
# Data website: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE129363
metadata <- data.table::fread("./GSE129363/raw/GSE129363_Discovery_Cohort_CellAnnotation.txt")

## normalize the count matrix 
library("Seurat")
adata <- Read10X("./GSE129363/raw") #including barcodes.tsv, genes.tsv, matrix.mtx
adata <- CreateSeuratObject(counts = adata[, metadata$CellID])    
adata <- NormalizeData(adata)
data_all <- adata[["RNA"]]@data; rm(adata); gc()

## select one sample
sampleID <- "Ate-05-SAT"
data <- data_all[, metadata[which(metadata$SampleName == sampleID), ]$CellID]
  
## apply adaptive attractor algorithm
source("./findAttractor.adaptive.r")
data <- as.matrix(data)  #normalized matrix for each sample
seed <- "LUM"
attr <- findAttractor.adaptive(data, seed, exponent.max = 5, exponent.min = 2, step.large = 1, step.small = 0.1, dominantThreshold = 0.2, seedTopN = 50, compareNth = 10, verbose = FALSE)

## results
head(attr$attractor.final)
attr$final.exponent

Parameters

data An expression matrix with genes in the rows, samples in the columns.
seed Seed gene. Please see Seed selection for more information.
exponent.max The maximum exponent of the mutual information, used to create weight vector for metagenes.
exponent.min The minimum exponent of the mutual information, used to create weight vector for metagenes.
step.large Decreasing step of the first round of scanning.
step.small Decreasing step of the second round of scanning.
dominantThreshold Threshold of dominance.
seedTopN Threshold of divergence.
compareNth It will maxmize the weight of the top Nth gene.
verbose When TRUE, it will show the top 20 genes of the metagene in each iteration.
epsilon Threshold of convergence.
minimize Default FALSE,when minimize = TRUE, the algorithm will minize the difference of metagene[first.idx] and metagene[second.idx] instead of maxmizing the weight of the top Nth gene.
first.idx Default first.idx = 1, the first gene of the metagene in each iteration.
second.idx Default second.idx = 2, the second gene of the metagene in each iteration.

Seed selection

Users can choose some general markers as seed genes, such as:

Fibroblast Pericyte Macrophage Endothelial Mitotic
LUM RGS5 AIF1 PECAM1 TOP2A
DCN

If the dataset has many cells of one cell type (e.g. fibroblasts) and the attractor exponent (a) is fixed, then identical attractors will be found using different general markers for one cell type, like DCN, LUM, and COL1A1.

Demonstration

Here, we applied the attractor finding algorithm using the findAttractor function implemented in the cafr (v0.312) R package [Ref.1] with the general fibroblastic marker gene LUM and DCN as seed.

Description of original attractor algorithm

Detailed descriptions can be found in Ref.1.

The exponent (a) is fixed. For the analysis of UMI based (e.g. 10x) and full-length-based (e.g. Smart-seq2) datasets, we would suggest using a = 3 and a = 5, respectively.

## install cafr package 
devtools::install_github("weiyi-bitw/cafr")
cafr::findAttractor(data, vec, a = 3, epsilon = 1e-7)

About

The adaptive version of attractor algorithm.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

0