sasha42/Bio-geolocation

Bio location

Looks up the location of sequences in GenBank and adds it to a FASTA file. This repo contains the bio-geolocation.py script, a demo Jupyter notebook, and example unprocessed and processed files. To use the script, invoke it with python3 and pass your .fas input file:

$ python3 bio-geolocation.py suillus.fas

Bio geolocation
extracting data from 104 sequences
getting geolocation data (this takes a while)
saving into processed.fas
done!

The annotated sequences are then saved in processed.fas.
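
The script itself is not reproduced on this page, so the following is only a minimal sketch of the last step (writing locations back into the FASTA headers), assuming the locations have already been retrieved and are keyed by accession. The function name `annotate_fasta`, the `|` separator, the placeholder accessions, and the assumption that the accession is the first whitespace-delimited token of each header are all illustrative and not taken from bio-geolocation.py.

```python
def annotate_fasta(fasta_text, locations, sep="|"):
    """Append a location to each FASTA header whose accession is known.

    Assumptions (hypothetical, not from bio-geolocation.py):
    - the accession is the first whitespace-delimited token after '>'
    - the location is appended to the end of the header after `sep`
    """
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            header = line[1:].strip()
            accession = header.split()[0] if header else ""
            location = locations.get(accession)
            if location:
                line = f">{header}{sep}{location}"
        # Sequence lines and headers without a known location pass through unchanged.
        out.append(line)
    return "\n".join(out) + "\n"


# Placeholder accessions, for illustration only.
example = ">ABC123.1 Suillus sp.\nACGTACGT\n>XYZ999.1 Suillus sp.\nTTGACA\n"
print(annotate_fasta(example, {"ABC123.1": "USA"}), end="")
```

Headers with no known location are left untouched, so a partially annotated file is still valid FASTA.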

Workflow

You will first need to sequence the genome and get the output from your lab, for example the Mushroom Observer observation with a DNA sequence of a new species of Suillus.

BLAST is then used to find closely related sequences. These are downloaded and looked up in the GenBank database to get the country for each one.
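
As a rough illustration of the country lookup, GenBank flat-file records usually carry the location as a qualifier on the source feature, such as `/country="USA: California"` (newer records may use `/geo_loc_name` instead). The sketch below pulls the country out of record text that has already been downloaded. The helper name `extract_country` and the sample record fragment are hypothetical, and how bio-geolocation.py actually queries GenBank is not shown here.

```python
import re

# Matches the /country="..." qualifier, or /geo_loc_name="..." on newer records.
_LOCATION_QUALIFIER = re.compile(r'/(?:country|geo_loc_name)="([^"]*)"')


def extract_country(genbank_text):
    """Return the country from a GenBank flat-file record, or None if absent.

    Hypothetical helper: it assumes the value looks like "Country: region"
    and keeps only the part before the first colon.
    """
    match = _LOCATION_QUALIFIER.search(genbank_text)
    if match is None:
        return None
    # Long qualifier values can wrap across lines, so collapse whitespace first.
    value = " ".join(match.group(1).split())
    return value.split(":", 1)[0].strip()


# Made-up record fragment, for illustration only.
sample_record = '''
     source          1..650
                     /organism="Suillus sp."
                     /mol_type="genomic DNA"
                     /country="USA: California"
'''
print(extract_country(sample_record))  # prints: USA
```

Records without a location qualifier return None, so the caller can decide whether to skip them or mark them as unknown.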

All of this data is then used to generate phylogenetic trees.


Jupyter notebook setup

You will need python3 and pip as prerequisites. A virtualenv is highly recommended so that these packages do not interfere with your other Python packages.

pip install jupyter
jupyter notebook

Then open your browser and go to Bio geolocation.ipynb.

Next steps

Probably write a server that accepts .fas file uploads and renders the sequence locations on a map.
