Capel et al., 2022 - Google Patents
ProteinGLUE multi-task benchmark suite for self-supervised protein modeling
- Document ID
- 7120109621260709175
- Author
- Capel H
- Weiler R
- Dijkstra M
- Vleugels R
- Bloem P
- Feenstra K
- Publication year
- 2022
- Publication venue
- Scientific Reports
Snippet
Self-supervised language modeling is a rapidly developing approach for the analysis of protein sequence data. However, work in this area is heterogeneous and diverse, making comparison of models and methods difficult. Moreover, models are often evaluated only on …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G06F17/30864—Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems
- G06F17/30867—Retrieval from the Internet, e.g. browsers by querying, e.g. search engines or meta-search engines, crawling techniques, push systems with filtering and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/28—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for programming tools or database systems, e.g. ontologies, heterogeneous data integration, data warehousing or computing architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/30—Medical informatics, i.e. computer-based analysis or dissemination of patient or disease data
- G06F19/32—Medical data management, e.g. systems or protocols for archival or communication of medical images, computerised patient records or computerised general medical references
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/16—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for molecular structure, e.g. structure alignment, structural or functional relations, protein folding, domain topologies, drug targeting using structure data, involving two-dimensional or three-dimensional structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Systems or methods specially adapted for a specific business sector, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/22—Health care, e.g. hospitals; Social work
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
- G06N5/02—Knowledge representation
- G06N5/022—Knowledge engineering, knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by the preceding groups
- G01N33/48—Investigating or analysing materials by specific methods not covered by the preceding groups biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce, e.g. shopping or e-commerce
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/18—Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines
Similar Documents
Publication | Title |
---|---|
Gupta et al. | Feedback GAN for DNA optimizes protein functions |
Huang et al. | MolTrans: molecular interaction transformer for drug–target interaction prediction |
Clifford et al. | BepiPred‐3.0: Improved B‐cell epitope prediction using protein language models |
Høie et al. | NetSurfP-3.0: accurate and fast prediction of protein structural features by protein language models and deep learning |
Li et al. | Cumulus provides cloud-based data analysis for large-scale single-cell and single-nucleus RNA-seq |
Baldassarre et al. | GraphQA: protein model quality assessment using graph convolutional networks |
Wang et al. | Protein secondary structure prediction using deep convolutional neural fields |
Deng et al. | META-DDIE: predicting drug–drug interaction events with few-shot learning |
Capel et al. | ProteinGLUE multi-task benchmark suite for self-supervised protein modeling |
Zhang et al. | Application of artificial intelligence in drug–drug interactions prediction: a review |
Wang et al. | Multitask joint strategies of self-supervised representation learning on biomedical networks for drug discovery |
Shoombuatong et al. | Towards the revival of interpretable QSAR models |
Hou et al. | Learning the protein language of proteome-wide protein-protein binding sites via explainable ensemble deep learning |
Zhu et al. | DataDTA: a multi-feature and dual-interaction aggregation framework for drug–target binding affinity prediction |
Diaz-Flores et al. | Evolution of artificial intelligence-powered technologies in biomedical research and healthcare |
Islam et al. | Molecular-evaluated and explainable drug repurposing for COVID-19 using ensemble knowledge graph embedding |
Stringer et al. | PIPENN: protein interface prediction from sequence with an ensemble of neural nets |
Alzahrani et al. | Identification of stress response proteins through fusion of machine learning models and statistical paradigms |
Shen et al. | Unbiased organism-agnostic and highly sensitive signal peptide predictor with deep protein language model |
Feng et al. | A bioactivity foundation model using pairwise meta-learning |
Ibtehaz et al. | Domain-PFP allows protein function prediction using function-aware domain embedding representations |
Hong et al. | S-Pred: protein structural property prediction using MSA transformer |
Ahmed et al. | Computational identification of multiple lysine PTM sites by analyzing the instance hardness and feature importance |
Wang et al. | Deciphering the protein landscape with ProtFlash, a lightweight language model |
Wang et al. | SADeepcry: a deep learning framework for protein crystallization propensity prediction using self-attention and auto-encoder networks |