This notebook describes how to prepare sequencing data for visualization as region plots. The workflow covers the tools used to convert files into the BED6, BED12, BAM, and BigWig formats. It also provides helper functions for creating .ini files from scratch.
pyGenomeTracks aims to produce high-quality genome browser tracks.
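To give an idea of what a helper for creating .ini files might look like, here is a minimal sketch built on Python's standard `configparser` module. It is not the helper used later in this notebook: the function name `build_tracks_ini`, the section names, and the file names are hypothetical, and the option keys (`file`, `title`, `height`, `file_type`, `where`) are the ones commonly seen in pyGenomeTracks configuration files.

```python
import configparser
import io


def build_tracks_ini(tracks):
    """Return the text of a tracks .ini file.

    `tracks` is a list of (section_name, options) pairs, where `options`
    is a dict of string keys and values. All names here are illustrative.
    """
    # interpolation=None keeps characters such as '%' literal in values
    config = configparser.ConfigParser(interpolation=None)
    for section, options in tracks:
        config[section] = options
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Hypothetical tracks: an x-axis, a BigWig coverage track, and a BED gene track
ini_text = build_tracks_ini([
    ("x-axis", {"where": "top"}),
    ("coverage", {"file": "sample.bw", "title": "coverage",
                  "height": "3", "file_type": "bigwig"}),
    ("genes", {"file": "sample.sorted.bed", "title": "genes",
               "height": "5", "file_type": "bed"}),
])
print(ini_text)
```

Writing the returned text to a file such as `tracks.ini` gives a configuration that can be edited by hand afterwards.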
The main steps:
- Install the requirements.
- Prepare your GFF3 files.
- Convert GFF3 into BED6.
- Convert GFF3 into BED12.
- Sort your BED6/BED12 file.
- Make a BigWig file from BAM/SAM format.
- Prepare the .ini files, either from scratch or by editing the example file.
- pyGenomeTracks: make the tracks file.
- pyGenomeTracks: make the region plot.
HINT: An exclamation mark (!) at the beginning of a line lets you run bash commands from a Jupyter notebook.
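The steps above can also be driven from plain Python instead of `!` lines. The following is a minimal sketch, assuming Python 3.6+, that all tools are on the `PATH`, and that the BAM file is already coordinate-sorted. The function names `pipeline_commands` and `run_pipeline`, the `prefix` argument, and every file name are hypothetical. The BED6 conversion is left out here, and only the BED12 route through genePred is shown.

```python
import shlex
import subprocess


def pipeline_commands(gff3, bam, region, prefix="sample"):
    """Build the shell commands for the main steps as strings."""
    q = shlex.quote  # protect paths that contain spaces or shell characters
    gene_pred = f"{prefix}.genePred"
    bed = f"{prefix}.bed"
    sorted_bed = f"{prefix}.sorted.bed"
    bigwig = f"{prefix}.bw"
    tracks = f"{prefix}.ini"
    plot = f"{prefix}.png"
    return [
        # GFF3 -> genePred -> BED12
        f"gff3ToGenePred {q(gff3)} {q(gene_pred)}",
        f"genePredToBed {q(gene_pred)} {q(bed)}",
        # sortBed writes to standard output, so redirect it to a file
        f"sortBed -i {q(bed)} > {q(sorted_bed)}",
        # bamCoverage needs an indexed BAM file
        f"samtools index {q(bam)}",
        f"bamCoverage -b {q(bam)} -o {q(bigwig)}",
        # tracks file, then the region plot
        f"make_tracks_file --trackFiles {q(sorted_bed)} {q(bigwig)} -o {q(tracks)}",
        f"pyGenomeTracks --tracks {q(tracks)} --region {q(region)} --outFileName {q(plot)}",
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command and execute it unless dry_run is set."""
    for command in commands:
        print(command)
        if not dry_run:
            subprocess.run(command, shell=True, check=True)


commands = pipeline_commands("genes.gff3", "reads.bam", "chr1:1000-2000")
run_pipeline(commands, dry_run=True)  # only prints; set dry_run=False to execute
```

Keeping the commands as strings mirrors what you would type after `!` in a notebook cell, and the dry run makes it easy to inspect them before anything is executed.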
- Python 2.7 or Python 3.x
- pyGenomeTracks
- samtools
- bedtools
- sortBed
- gff3ToGenePred
- genePredToBed
- numpy >= 1.8.0
- scipy >= 0.17.0
- py2bit >= 0.1.0
- pyBigWig >= 0.2.1
- pysam >= 0.8
- matplotlib >= 1.4.0
- deepTools
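Before running the workflow, it can help to check which of the requirements above are actually available. This is a minimal sketch, assuming Python 3.4+ and that the command-line tools are expected on the `PATH`. The function name `missing_requirements` is hypothetical, and the two lists simply restate the requirements using the executable and import names they are usually installed under (for example, `bamCoverage` comes with deepTools).

```python
import importlib.util
import shutil

# Executables expected on the PATH (names are case-sensitive)
COMMAND_LINE_TOOLS = [
    "samtools", "bedtools", "sortBed", "gff3ToGenePred",
    "genePredToBed", "bamCoverage", "pyGenomeTracks",
]

# Import names of the Python packages
PYTHON_MODULES = [
    "numpy", "scipy", "py2bit", "pyBigWig", "pysam", "matplotlib", "deeptools",
]


def missing_requirements(tools, modules):
    """Return (missing_tools, missing_modules) without importing anything."""
    missing_tools = [tool for tool in tools if shutil.which(tool) is None]
    missing_modules = [
        module for module in modules
        if importlib.util.find_spec(module) is None
    ]
    return missing_tools, missing_modules


missing_tools, missing_modules = missing_requirements(
    COMMAND_LINE_TOOLS, PYTHON_MODULES
)
print("Missing tools:", missing_tools or "none")
print("Missing modules:", missing_modules or "none")
```

The check only reports presence, not versions, so the minimum versions listed above still need to be verified separately.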
- GFF3 - General Feature Format Version 3
- BED6/BED12 - Browser Extensible Data with 6 or 12 columns
- BigWig - a format for dense, continuous data that will be displayed in the Genome Browser as a graph
- INI - an informal standard for configuration files
- SAM - Sequence Alignment Map
- BAM - the compressed binary version of the Sequence Alignment/Map (SAM) format, a compact and indexable representation of nucleotide sequence alignments
- genePred - a table format commonly used for gene prediction tracks in the Genome Browser
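To illustrate how two of these formats relate, the sketch below converts a single GFF3 feature line into a BED6 line. The main point is the coordinate change: GFF3 positions are 1-based and inclusive, while BED positions are 0-based and half-open, so only the start is shifted by one. The function name `gff3_line_to_bed6` is hypothetical, using the `ID` attribute as the BED name is an assumption, and the example record is made up. It is not a replacement for the conversion tools listed above.

```python
def gff3_line_to_bed6(line):
    """Convert one GFF3 feature line to a BED6 line.

    Returns None for blank lines and comment/directive lines.
    """
    if not line.strip() or line.startswith("#"):
        return None
    (seqid, _source, _type, start, end,
     score, strand, _phase, attributes) = line.rstrip("\n").split("\t")

    # Assumption: use the ID attribute as the BED name, "." if absent
    name = "."
    for field in attributes.split(";"):
        if field.startswith("ID="):
            name = field[len("ID="):]
            break

    bed_start = int(start) - 1                 # 1-based inclusive -> 0-based
    bed_score = "0" if score == "." else score  # GFF3 score passed through as-is
    bed_strand = strand if strand in ("+", "-") else "."
    return "\t".join([seqid, str(bed_start), end, name, bed_score, bed_strand])


# Made-up example record
gff3_line = "chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=gene1;Name=abc"
print(gff3_line_to_bed6(gff3_line))
```

The end coordinate is unchanged because an inclusive 1-based end and an exclusive 0-based end refer to the same number.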