LOFTK (Loss-of-Function ToolKit)

This readme

This readme accompanies the paper "LOFTK: a framework for fully automated calculation of predicted Loss-of-Function variants." by Alasiri A. et al. bioRxiv 2021.

Background

Predicted Loss-of-Function (LoF) variants in human genes are important due to their impact on clinical phenotypes and frequent occurrence in the genomes of healthy individuals. Current approaches predict high-confidence LoF variants without identifying the specific genes or the number of copies they affect. Here we present an open source tool, the Loss-of-Function ToolKit (LoFTK), which allows efficient and automated prediction of LoF variants from both genotyped and sequenced genomes, identifying genes that are inactive in one or two copies, and providing summary statistics for downstream analyses.

LoFTK is a pipeline written in Bash and Perl that efficiently identifies loss-of-function (LoF) variants using VEP and LOFTEE. It annotates LoF variants, selects high-confidence (HC) variants, reports which LoF variants are homozygous and which are heterozygous, and calculates summary statistics.

The Loss-of-Function ToolKit Workflow: finding knockouts using genotyped and sequenced genomes.

Installation and Requirements

Install LoFTK

LoFTK has been developed to work under two cluster managers: Simple Linux Utility for Resource Management (SLURM) and Sun Grid Engine (SGE). A separate LoFTK version is available for installation under each cluster manager (SLURM/SGE). See Installation and Requirements in the wiki.
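
Because the LoFTK version to install depends on the cluster manager, it can help to check which one the login node exposes. The following is a minimal sketch, not part of LoFTK: the function name detect_scheduler is hypothetical, and it assumes that the presence of the sbatch command indicates SLURM and the presence of qsub indicates SGE (qsub is also shipped by other schedulers such as PBS, so treat that answer as a hint only).

```shell
#!/usr/bin/env bash
# Hypothetical helper: guess the cluster manager from the submit commands on PATH.
detect_scheduler() {
    if command -v sbatch >/dev/null 2>&1; then
        # sbatch is the SLURM batch submission command.
        echo "SLURM"
    elif command -v qsub >/dev/null 2>&1; then
        # qsub is used by SGE, but also by PBS/Torque, so this is only a hint.
        echo "SGE"
    else
        echo "unknown"
        return 1
    fi
}

# Example (not run automatically):
#   scheduler=$(detect_scheduler) && echo "Install the LoFTK version for $scheduler"
```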

Requirements

All scripts are annotated for debugging purposes and for future reference. The scripts are expected to work within a specific Linux environment; we have tested LoFTK on CentOS7 with SLURM as the cluster manager.

Perl >= 5.10.1
Bash
Ensembl Variant Effect Predictor (VEP)
LOFTEE for GRCh37
- Ancestral sequence (human_ancestor.fa[.gz|.rz])
- PhyloCSF database (phylocsf.sql) for conservation filters
LOFTEE for GRCh38
- GERP scores bigwig (gerp_bigwig)
- Ancestral sequence (human_ancestor_fa)
- PhyloCSF database (loftee.sql.gz)
samtools (must be on path)
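
The requirements above can be checked before a run. The following is a minimal sketch under stated assumptions, not a script shipped with LoFTK: the helper names version_ge and check_commands are hypothetical, the executable name vep is an assumption about how VEP is installed on your system, and sort -V requires GNU coreutils. It does not check the LOFTEE data files, whose locations depend on your setup.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking LoFTK prerequisites.

# Return 0 if dotted version $1 is greater than or equal to dotted version $2.
# Relies on the version sort (-V) of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Print one line per command that is not on PATH; return non-zero if any is missing.
check_commands() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Example (not run automatically; "vep" is an assumed executable name):
#   check_commands perl bash samtools vep
#   perl_version=$(perl -e 'printf "%vd", $^V')
#   version_ge "$perl_version" 5.10.1 || echo "Perl >= 5.10.1 is required"
```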

Usage

The only script you need to run is run_loftk.sh, together with the configuration file LoF.config. Set up LoF.config before running any analysis, following the instructions in the wiki.

You can run LoFTK using the following command:

bash run_loftk.sh $(pwd)/LoF.config

Always Remember

To set all options in the LoF.config file before the run.
To use the full path to the configuration file, e.g. use $(pwd).
You can run LoFTK steps all in one run or separately by setting analysis type in the LoF.config file.
VEP and LOFTEE options can be added and modified in one of these configuration files in ./bin/:
- VEP_LOFTEE_GRCh37.config
- VEP_LOFTEE_GRCh38.config
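
The reminder about using the full path to the configuration file can be enforced with a small wrapper. The following is a minimal sketch, assuming only what is stated above (run_loftk.sh takes the path to LoF.config as its single argument); the function name resolve_config is hypothetical and not part of LoFTK.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a configuration file path into an absolute path.
resolve_config() {
    local cfg="$1"
    if [ ! -f "$cfg" ]; then
        echo "Configuration file not found: $cfg" >&2
        return 1
    fi
    # Enter the file's directory and print its absolute path plus the file name.
    echo "$(cd "$(dirname "$cfg")" && pwd)/$(basename "$cfg")"
}

# Example (not run automatically):
#   config=$(resolve_config LoF.config) && bash run_loftk.sh "$config"
```

Failing early when the file is missing avoids submitting cluster jobs that would stop on the first step.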

Description of files

File	Description	Usage
README.md	Description of project	Human editable
LICENSE	User permissions	Read only
LoF.config	Configuration file	Human editable
run_loftk.sh	Main LoFTK script	Read only
LoF_annotation.sh	Annotation of LoF variants/genes	Read only
allele_to_vcf.sh	Converting IMPUTE2 format to VCF	Read only
descriptive_stat.sh	Descriptive analysis	Read only

Post LoFTK

Merge the counts files of multiple cohorts

This script allows you to merge the counts files of different cohorts. By default it includes only genes that are present in all input files, but you can use the union function to include genes that are present in at least one cohort. In that case, for cohorts that lack a gene, its LoF counts are set to 0 for every individual (which is tricky if the gene was not tested in that cohort) or to a self-specified value.

perl merge_gene_lof_counts.pl -i cohortX.counts,cohortY.counts,cohortZ.counts -o merged_cohorts.counts -c
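
The difference between the default behaviour and the union function comes down to set intersection versus set union over gene names. The following toy sketch illustrates only that distinction; it is not how merge_gene_lof_counts.pl is implemented. The function names genes_in_all and genes_in_any are hypothetical, and the input layout (one gene identifier per line) is an assumption made for illustration, not the format of the counts files.

```shell
#!/usr/bin/env bash
# Toy illustration of intersection vs. union over gene lists.
# Assumed input: plain text files with one gene identifier per line.

# Print genes that occur in every file given as an argument (intersection).
genes_in_all() {
    local n=$# f
    # De-duplicate each file first, then keep genes seen once per file.
    for f in "$@"; do sort -u "$f"; done | sort | uniq -c |
        awk -v n="$n" '$1 == n { print $2 }'
}

# Print genes that occur in at least one file (union).
genes_in_any() {
    cat "$@" | sort -u
}

# Example (not run automatically; file names are placeholders):
#   genes_in_all cohortX.genes cohortY.genes
#   genes_in_any cohortX.genes cohortY.genes
```

Genes printed by genes_in_any but not by genes_in_all are the ones for which a count would have to be filled in for at least one cohort.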

CC-BY-SA-4.0 License

Copyright (c) 2020 University Medical Center Utrecht