This project is part of the Protein Codeathon at ISMB 2022. Its main goal is to create useful Python and/or Node.js scripts that:
(i) add annotations of protein residues, especially post-translational modifications (PTMs);
(ii) scrape existing information, including the 3D domain, interactions, and distance table.
iCn3D (https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html) is a web-based interactive tool for visualizing molecular structures, and it can also be accessed programmatically through Node.js scripts. The current version of iCn3D provides several protein annotations, such as SNPs, ClinVar, conserved domains, functional sites, 3D domains, interactions, disulfide bonds, and cross-linkages. To extract results for a large set of 3D structures, users can use either Python or Node.js scripts. We therefore focus on creating user-friendly Python/Node.js scripts that make bulk analysis easier. In addition, a new PTM annotation has been added to extend the protein annotation functions of iCn3D.
Scraper - From UI to Data
by Jiyao Wang, Raphael Trevizani, Li Chuin Chong
To download the 3D domain, interactions, and distance table automatically from the UI, we created a Node.js script for the first annotation and two Python scripts, which use Selenium and ChromeDriver, for the other two.
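A bulk run needs one iCn3D page address per structure before any scraping can start. The sketch below only builds those addresses. The `pdbid` query parameter, the helper name `icn3d_url`, and the two structure IDs (borrowed from the example output file names in this README) are assumptions for illustration and are not taken from the project's scripts.

```python
from urllib.parse import urlencode

# Base address of the iCn3D viewer, as given in this README.
ICN3D_BASE = "https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html"


def icn3d_url(structure_id, param="pdbid"):
    """Return a viewer URL for one structure.

    `param` is assumed to be the query key iCn3D uses to load a
    structure by ID; adjust it if your input uses a different key.
    """
    return f"{ICN3D_BASE}?{urlencode({param: structure_id})}"


# Hypothetical batch: IDs borrowed from the example output file names.
structure_ids = ["1enh", "7JMO"]
urls = [icn3d_url(sid) for sid in structure_ids]

for url in urls:
    print(url)
```

Each resulting address could then be handed to a browser-automation script, one structure at a time.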
by Sachendra Kumar
The input PTM files were prepared with a Python script from PTM site information downloaded from the PhosphoSitePlus database for Acetylation, Methylation, Phosphorylation, Sumoylation, and Ubiquitination. Each file has a header line followed by its associated data, for example:
Header: UniprotId,PTMaa,PTMpos,Metadata
Data: Q12888,T,100,https://www.phosphosite.org/siteAction.action?id=31887780
As a proof of concept, we added custom PTM annotations to iCn3D based on user-supplied input.
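The input layout described above (header `UniprotId,PTMaa,PTMpos,Metadata`) can be read with Python's standard `csv` module. This is a minimal sketch, not the project's own preparation script: the function name `load_ptm_sites` and the output dictionary keys are hypothetical, and the only data row is the one quoted in the text.

```python
import csv
import io
from collections import defaultdict

# One header line plus the single example row quoted in this README.
SAMPLE = """UniprotId,PTMaa,PTMpos,Metadata
Q12888,T,100,https://www.phosphosite.org/siteAction.action?id=31887780
"""


def load_ptm_sites(handle):
    """Group PTM sites by UniProt accession.

    `handle` is any text file object with the four-column header
    shown above. Positions are converted to integers.
    """
    sites = defaultdict(list)
    for row in csv.DictReader(handle):
        sites[row["UniprotId"]].append(
            {
                "residue": row["PTMaa"],         # modified amino acid
                "position": int(row["PTMpos"]),  # residue number in the sequence
                "source": row["Metadata"],       # link back to the site record
            }
        )
    return dict(sites)


ptm_sites = load_ptm_sites(io.StringIO(SAMPLE))
print(ptm_sites["Q12888"])
```

To read a real file, pass `open(path, newline="")` in place of the `io.StringIO` object.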
- Scraper script for 3D domain: annotation.js
Example output:
For more details, please visit the 3d_domain_annotation sub-directory.
- Scraper script for interactions: download_interactions.py
Example output: 1enh_line_graph.json
For more details, please visit the interaction sub-directory.
- Scraper script for distance table: iCn3D_scapper_forDistance.py
Example output: distance_table_7JMO.csv
For more details, please visit the distance_table sub-directory.
- Annotator script for PTMs: PTMsite_annotation_inputfile_prep.py
Example output: Processed_Phosphorylation.csv
For more details, please visit the PTM_annotation sub-directory.
For the preliminary study, we processed PTM annotation sites for Acetylation (48279), Methylation (20554), Phosphorylation (378481), Sumoylation (8832), and Ubiquitination (126051) so that they can be annotated and displayed in iCn3D through a Node.js script-based user interface.
Scraper: expand this functionality to all other menus.
Annotator: implement the PTM annotation through a Node.js script-based user interface, add novel annotations such as somatic mutations (COSMIC database) and other PTMs (Glycosylation, Succinylation), and automate the process through a REST API user interface.
Jiyao Wang (Team Lead), National Center for Biotechnology Information (NCBI), USA
Li Chuin Chong (Writer), TWINCORE GmbH, Germany
Sachendra Kumar (Writer), Indian Institute of Science, India
Raphael Trevizani, IEDB and Fiocruz, Brazil
David Enoma (Technology Support), Noma Integrated Technology Solutions, Nigeria.
Jack Lin
Would you like a feature added, or do you have some feedback to share? Just open a new issue.