MotifScope

A tool for motif annotation and visualization in tandem repeats.

MotifScope is also available online at https://motifscope.holstegelab.eu.

Installation

To install with conda:
```
cd install/conda
sh INSTALL.sh
```
Conda creates an environment called motifscope with the necessary dependencies installed.
Activate the environment by running conda activate motifscope in the shell.

See the Usage section for how to run MotifScope once the conda environment is activated.

To install with docker:
```
cd install/docker
sh build.sh
```
Docker will create an image called motifscope, in which the necessary dependencies are installed.

An example command for running MotifScope within this docker image is available in run_docker.sh.
Adapt the options in run_docker.sh to your specific use case.

To run it (e.g. with the example files in the example folder of the repository):
```
sh run_docker.sh path/to/example_sequence.fa path/to/example_population.txt output_prefix
```

Usage

To run MotifScope on a set of sequences (reads or assemblies):
```
motifscope [-i input.fa] [-mink 2] [-maxk 10] [-o output.prefix]
```
To annotate sequences with class labels, provide an annotation file with the -p option:
```
motifscope [-i input.fa] [-mink 2] [-maxk 10] [-p classes.txt] [-o output.prefix]
```
The class information will be shown as a separate color-coded column in the figure.
- Each sequence header in input.fa should start with >sample#hap_number#. For example, a header for HG002 could start with >HG002#1#.
- The class annotation file classes.txt should be a tab-separated file with the sample IDs in the first column and the sample class in the second column, for example HG002 and EUR separated by a tab.
- The class annotation file may contain a header line of the form sample <class_name>. The second-column label <class_name> can be chosen freely and is shown in the figure. Without a header, the label defaults to 'population'.
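
The naming rules above can be checked before running MotifScope. The following Python sketch is not part of MotifScope; the helper names sample_id and read_classes, the HG003 header, and the superpopulation label are hypothetical. It assumes that a first field equal to "sample" marks the header line, and it lists the FASTA headers whose sample ID has no entry in the class file.

```python
import csv
import io


def sample_id(header):
    """Return the sample id from a header that starts with >sample#hap_number#."""
    return header.lstrip(">").split("#", 1)[0]


def read_classes(handle):
    """Parse a tab-separated class file into (column_label, {sample: class})."""
    rows = [row for row in csv.reader(handle, delimiter="\t") if len(row) >= 2]
    label = "population"  # default label when the file has no header
    # Assumption: a first field equal to "sample" marks the header line.
    if rows and rows[0][0] == "sample":
        label = rows[0][1]
        rows = rows[1:]
    return label, {row[0]: row[1] for row in rows}


# Hypothetical file content and headers, for illustration only.
classes_txt = "sample\tsuperpopulation\nHG002\tEUR\n"
label, classes = read_classes(io.StringIO(classes_txt))

headers = [">HG002#1#", ">HG003#1#"]
unlabelled = [h for h in headers if sample_id(h) not in classes]
print(label, classes, unlabelled)
```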

To disable sequence clustering and the dendrogram (e.g. for a single sequence), set the -c option to False:
```
motifscope -c False [-i input.fa] [-mink 2] [-maxk 10] [-o output.prefix]
```

Further options:
- To run multiple sequence alignment on the compressed representation of the sequence, set -msa to POAMotif (aligns complete motifs) or POANucleotide (aligns nucleotides).
- To guide the algorithm with a set of known motifs, provide them with -motifs motifs.txt. The motif file motifs.txt should contain the motifs separated by tabs.
- To use random categorical colors for motifs, set -e to random. To project motifs onto a color scale, set -e to UMAP or MDS for dimension reduction based on motif similarities.
- To characterize motif composition without generating a figure, set -figure to False.
- To use the reverse complement of the input fasta, set -reverse to True.

Output

The repeat compositions are written to a fasta file. For example:

>HG002#2#JAHKSD010000034.1:9910981-9913041/rc
G1 A1 G1 C1 A2 G1 A1 C1 T1 C1 T1 G1 T3 C1 A2 AAAAG12 A1 AAAAG1 C1 A1 T1 G1 T2 C1 T1 A3 G1 A1 G1

The motifs are separated by spaces. Each string represents a motif, and the following number indicates how many consecutive copies of that motif occur.
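
As an illustration of this encoding, the Python sketch below splits a composition line into (motif, copy number) pairs and expands them back into a nucleotide string. It is not part of MotifScope; the function names parse_composition and expand are hypothetical, and it assumes every token consists of letters followed by a decimal count, as in the example above.

```python
import re


def parse_composition(line):
    """Split a composition line into (motif, copy_number) pairs."""
    pairs = []
    for token in line.split():
        # Assumption: each token is a run of letters followed by digits.
        match = re.fullmatch(r"([A-Za-z]+)(\d+)", token)
        if match is None:
            raise ValueError(f"unexpected token: {token!r}")
        pairs.append((match.group(1), int(match.group(2))))
    return pairs


def expand(pairs):
    """Rebuild the nucleotide sequence by repeating each motif."""
    return "".join(motif * copies for motif, copies in pairs)


# A fragment of the example composition shown above.
pairs = parse_composition("A2 AAAAG12 A1")
print(pairs)               # [('A', 2), ('AAAAG', 12), ('A', 1)]
print(len(expand(pairs)))  # 63
```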

The motif summary per sequence is output in a tab-separated file. The first column is the sequence header, the second column is the motif, the third column is the amount of sequence covered by the motif, and the fourth column is the count of the motif.
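
A minimal Python sketch for loading this file follows. It is not part of MotifScope; the function name read_motif_summary and the rows in the usage example are hypothetical. It assumes the third and fourth columns are numeric, and it skips any row where they are not (such as a possible header line).

```python
import csv
import io
from collections import defaultdict


def read_motif_summary(handle):
    """Group (motif, covered, count) records by sequence header."""
    per_sequence = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 4:
            continue  # skip blank or incomplete lines
        header, motif, covered, count = row[:4]
        try:
            record = (motif, float(covered), int(count))
        except ValueError:
            continue  # non-numeric columns: assumed to be a header line
        per_sequence[header].append(record)
    return dict(per_sequence)


# Made-up rows for illustration; these are not real MotifScope output.
summary_txt = "HG002#1#\tAAAAG\t65\t13\nHG002#1#\tA\t5\t5\n"
summary = read_motif_summary(io.StringIO(summary_txt))
print(summary)
```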
