It is our great pleasure to welcome you to the ACM Eighth International Workshop on Data and Text Mining in Biomedical Informatics (DTMBIO'14), held in conjunction with the ACM Conference on Information and Knowledge Management (CIKM'14). The rapid development of medical informatics techniques is tightly coupled with development within several fields in computer science, including data mining, information retrieval, and database management systems, among others. A fundamental topic of research within medical informatics is how to make effective use of the tremendous amount of biological and biomedical data to improve the understanding of biological systems. Such data include, but are not limited to, gene and protein sequences, gene expression profiles from microarray experiments, protein structure predictions resulting from high-throughput computational methods, protein-protein interactions from proteomic studies, single nucleotide polymorphism profiles from SNP arrays, and much bibliographic information from electronic medical journals. The need to automatically and effectively extract, understand, integrate, and make use of information embedded in such heterogeneous unstructured data drives the current research in bioinformatics.
This year's workshop continues the tradition of bringing together researchers who work in the fields of data mining, text mining, and computational biology, and of providing a forum to present and discuss current research topics at the interface of these fields. The mission of DTMBIO is to promote a tighter connection between literature search and data analysis within biomedical informatics. In particular, this year we focus on the following two themes: (1) data and text mining for biomedical applications, and the identification of relevant background knowledge in database annotations or in text documents, such as scientific publications; and (2) knowledge discovery through the integration of heterogeneous biomedical data collected from electronic bulletin boards, scientific publications, and any type of experiments. Furthermore, we put more focus this year on the integration of bioinformatics and medical informatics toward translational research.
The papers accepted for presentation and publication in this volume cover a variety of topics, including bio-text mining, translational bioinformatics, systems bioinformatics, bio-ontology management, sequence analysis for massively parallel sequencing, protein-protein interactions, biomedical data classification, and biomedical information retrieval. We hope that these proceedings will serve as a valuable and up-to-date reference about the application of data- and text-mining techniques within biomedical informatics.
Proceeding Downloads
Parsing Clinical Text: How Good are the State-of-the-Art Parsers?
This paper investigated and compared the performance of three state-of-the-art parsers on two annotated clinical corpora. The Stanford parser achieved the highest F-score among the three parsers.
Entity Linking for Biomedical Literature
The Entity Linking (EL) task links entity mentions from an unstructured document to entities in a knowledge base. Although this problem is well-studied in news and social media, it has not received much attention in the life science domain. ...
Biomedical Named Entity Recognition Based on the Combination of Regional and Global Text Features
Biomedical information extraction, especially Named Entity Recognition (NER), is a primary task in biomedical text mining due to the rapid growth of large-scale literature. Extracting biomedical entities aims at identifying specific entities (words ...
Injury Narrative Text Classification: A Preliminary Study
Descriptions of a patient's injuries are recorded in narrative text form by hospital emergency departments. For statistical reporting, this text data needs to be mapped to pre-defined codes. Existing research in this field uses the Naïve Bayes ...
Systematic Identification of Context-dependent Conflicting Information in Biological Pathways
- Seyeol Yoon,
- Jinmyung Jung,
- Hasun Yu,
- Mijin Kwon,
- Sungji Choo,
- Kyunghyun Park,
- Dongjin Jang,
- Sangwoo Kim,
- Doheon Lee
Interactions between biological entities such as genes, proteins, and metabolites, so-called pathways, are key to understanding the molecular mechanisms of life. As pathway information is being accumulated rapidly through various knowledge resources, ...
Inference of Disease E3s from Integrated Functional Relation Network
Recently, the potential of E3 ligases as therapeutic targets has been increasing. A systematic method to derive disease-related E3s can make a significant contribution to meeting this demand. Several disease gene prediction methods have been introduced but it is ...
Identification of Coexpressed Gene Modules across Multiple Brain Diseases by a Biclustering Analysis on Integrated Gene Expression Data
It has been reported that several brain diseases could share symptoms at the clinical level, suggesting the necessity and possibility of developing therapeutics. In this paper, we carried out an integrated gene expression analysis on several microarray ...
Identifying Cancer Subtypes based on Somatic Mutation Profile
Tumor stratification is one of the basic tasks in cancer genomics for a better understanding of the tumor heterogeneity and better targeted treatments. There are various biological data that can be used to stratify tumors including gene expression and ...
Identification of Genomic Features in the Classification of Loss- and Gain-of-Function Mutation: [Extended Abstract]
In this work, we propose a comprehensive analysis of the genomic features of human mutations to classify loss-of-function (LoF) and gain-of-function (GoF) mutations. Through these genetic mutations, a protein can lose its native function, or it ...
Identification of a Specific Base Sequence of Pathogenic E. Coli through a Genomic Analysis
E. coli sequence type 131 (ST131) is one of the pathogens that cause resistant infections. Comparative genome analyses allow interpretations of the virulence factors of pathogens. Thus, in this study, we analyze the genomic differences between the ...
Grounded Feature Selection for Biomedical Relation Extraction by the Combinative Approach
Relation extraction is an important task in biomedical areas such as protein-protein interaction, gene-disease interactions, and drug-disease interactions. In recent years, it has been widely researched to automatically extract biomedical relations in a ...
An Exploration of the Collaborative Networks for Clinical and Academic Domains in AIDS Research: A Spatial Scientometric Approach
This study investigates worldwide collaborative networks from a geographical perspective, based on clinical tests (CT) and academic research (AR) on acquired immune deficiency syndrome (AIDS). By applying text ...
A Display of Conceptual Structures in the Epidemiologic Literature
Biomedical literature from PubMed contains various types of entities such as diseases or organisms. The rapid growth of their size makes it harder to conceptualize them; however, displaying the natural terms that occurred in the text is more effective in ...
Inferring Undiscovered Public Knowledge by Using Text Mining-driven Graph Model
Due to the recent development of Information Technology, the number of publications is increasing exponentially. In response to the increasing number of publications, there has been a sharp surge in the demand for replacing the existing manual text data ...
Mining the Main Health Trend of the General Public based on Opinion Mining of Korean Blogsphere
These days, social media has become a reasonable standard for understanding public opinion. In particular, people increasingly use internet media and SNS (Twitter, Facebook, blogs, etc.) to share opinions, news, advice, interests, moods, ...
Integrative Database for Exploring Compound Combinations of Natural Products for Medical Effects
- Suhyun Ha,
- Sunyong Yoo,
- Moonshik Shin,
- Jin Sook Kwak,
- Oran Kwon,
- Min Chang Choi,
- Keon Wook Kang,
- Hojung Nam,
- Doheon Lee
Natural products used in dietary supplements, complementary and alternative medicine (CAM) and conventional medicine are composites of multiple chemical compounds. These chemical compounds potentially offer an extensive source for drug discovery with ...
Mining Context-Specific Rules from the Literature for Virtual Human Model Simulation
A computer-based virtual human model is believed to be a promising solution for drug response identification. Literature mining is a competitive method to extract those biological rules for human model simulation, since existing public databases provide ...
Visualization of Zoomable Network for Multi-Compounds and Multi-Targets Analysis
The recent explosive increase in bio-data makes it possible to simulate metabolism on a whole-body scale, which creates a need for bioinformatics tools to visualize and analyze it. For such tools, zooming is a key method for visualizing large and complex ...
Construction of Multi-level Networks Incorporating Molecule, Cell, Organ and Phenotype Properties for Drug-induced Phenotype Prediction
Inferring drug-induced phenotypes via computational approaches can give substantial support to the drug discovery procedure. However, existing computational models that are mainly based on a single cell or a single organ model are thought to be limited ...
Detecting Phosphorylation Determined Active Protein Interaction Network during Cancer Development by Robust Network Component Analysis
Motivation: In recent disease studies, many key pathogen genes/proteins are found not to have significant differential expression, and thus, they tend to be disregarded in conventional differential expression analysis or network analysis. Meanwhile, the ...
TILD: A Strategy to Identify Cancer-related Genes Using Title Information in Literature Data
Since the genome project of the 1990s, gene-related research has progressed. These studies revealed that genes are a cause of disease and that relations between genes and diseases are important. For this reason, we proposed a strategy called TILD ...
Acceptance Rates
Year | Submitted | Accepted | Rate |
---|---|---|---|
DTMBIO '14 | 211 | 22 | 10% |
DTMBIO '13 | 18 | 11 | 61% |
DTMBIO '09 | 18 | 8 | 44% |
Overall | 247 | 41 | 17% |