DepoScope - Predict and annotate phage depolymerases

General information

This is the repository related to our manuscript DepoScope: accurate phage depolymerase annotation and domain delineation using large language models, published in PLOS Computational Biology (link).

Quick start

The easiest way to get started with DepoScope, whether you want to predict depolymerases from phage genomes or from individual genes, is to run the Google Colab notebook that we provide here. You only need a zip file of your phage genomes to get started. Alternatively, go to the scripts_clean folder and run the VII.DpoDetectionTool.ipynb notebook.

To run the benchmarking against other depolymerase detection tools, go to the benchmark folder and run the benchmark_notebook.ipynb notebook.

Running DepoScope as a script

To run DepoScope as a script, a few steps are required. The following instructions are for a Unix-based system, but they can be adapted to Windows with minimal changes. First, install uv with:

curl -LsSf https://astral.sh/uv/install.sh | sh

Then, make sure that clang is installed (it is required by phanotate) by running which clang. If the command prints nothing, clang is not installed; on Debian or Ubuntu you can install it with sudo apt install clang.
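The two prerequisites above can also be checked in one go before continuing. The following is a minimal sketch, not part of the repository: the helper name missing_tools is hypothetical, and it only tests whether the executables are on your PATH, not whether they work.

```python
import shutil


def missing_tools(tools=("uv", "clang")):
    """Return the names in `tools` that cannot be found on the PATH.

    Hypothetical helper for illustration; it is not shipped with DepoScope.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("uv and clang are both available.")
```

If uv is reported as missing right after installation, opening a new shell so that the updated PATH is picked up may help.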

Now we need to download the pre-trained model weights. First we download and extract the fine-tuned ESM-2 model weights:

wget https://zenodo.org/records/10957073/files/esm2_t12_finetuned_depolymerases.zip
unzip esm2_t12_finetuned_depolymerases.zip

Then we download the pre-trained DepoScope model:

wget https://zenodo.org/records/10957073/files/Deposcope.esm2_t12_35M_UR50D.2203.full.model

Finally, clone the repository and run deposcope-predict.py through uv with the following command:

uv run deposcope-predict.py -i <input_fasta_file> -o <output_file_name> --esm2 <path_to_esm2_checkpoint> --Dpo <path_to_Dpo_model>

where:

<input_fasta_file> is the path to the input fasta file with a single phage genome to be annotated.
<output_file_name> is the desired name for the output file.
<path_to_esm2_checkpoint> is the path to the folder containing the fine-tuned ESM-2 model checkpoint files.
<path_to_Dpo_model> is the path to the pre-trained DepoScope .model file.
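Because the input is expected to contain a single phage genome, a FASTA file holding several genomes needs to be split before use. The following is a minimal sketch under stated assumptions: the helpers split_fasta and build_command and the genome_N.fasta naming are hypothetical and not part of the repository, and the command simply reuses the four options documented above.

```python
from pathlib import Path


def split_fasta(multi_fasta, out_dir):
    """Write each record of a multi-record FASTA file to its own file.

    Hypothetical helper; output files are named genome_1.fasta, genome_2.fasta, ...
    Returns the list of paths written, in input order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    handle = None
    with open(multi_fasta) as src:
        for line in src:
            if line.startswith(">"):
                # A header line starts a new record, so close the previous file.
                if handle is not None:
                    handle.close()
                path = out_dir / f"genome_{len(written) + 1}.fasta"
                handle = open(path, "w")
                written.append(path)
            if handle is not None:
                handle.write(line)
    if handle is not None:
        handle.close()
    return written


def build_command(fasta, output_name, esm2_dir, dpo_model):
    """Assemble the documented deposcope-predict.py call as an argument list."""
    return [
        "uv", "run", "deposcope-predict.py",
        "-i", str(fasta),
        "-o", str(output_name),
        "--esm2", str(esm2_dir),
        "--Dpo", str(dpo_model),
    ]
```

Each list returned by build_command can then be passed to subprocess.run(cmd, check=True) from the root of the cloned repository, one genome at a time.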
