Keyword: document spanners : Search

13 Results for: Keyword: document spanners

Searched The ACM Guide to Computing Literature (3,856,489 records)

Showing 1 - 13 of 13 Results

research-article
Open Access
November 2024
Revisiting Weighted Information Extraction: A Simpler and Faster Algorithm for Ranked Enumeration
Proceedings of the ACM on Management of Data (PACMMOD), Volume 2, Issue 5, Article No.: 222, Pages 1–19, https://doi.org/10.1145/3695840

Information extraction from textual data, where the query is represented by a finite transducer and the task is to enumerate all results without repetition, and its extension to the weighted case, where each output element has a weight and the output ...
Total Citations: 0 | Total Downloads: 119 | Last 12 Months: 119 | Last 6 weeks: 30
research-article
Open Access
May 2024
Generalized Core Spanner Inexpressibility via Ehrenfeucht-Fraïssé Games for FC
- Sam M. Thompson,
- Dominik D. Freydenberger
Proceedings of the ACM on Management of Data (PACMMOD), Volume 2, Issue 2, Article No.: 80, Pages 1–18, https://doi.org/10.1145/3651143

Despite considerable research on document spanners, little is known about the expressive power of generalized core spanners. In this paper, we use Ehrenfeucht-Fraïssé games to obtain general inexpressibility lemmas for the logic FC (a finite model ...
Total Citations: 1 | Total Downloads: 288 | Last 12 Months: 288 | Last 6 weeks: 37
research-article
June 2022
Efficient Enumeration for Annotated Grammars
PODS '22: Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Pages 291–300, https://doi.org/10.1145/3517804.3526232

We introduce annotated grammars, an extension of context-free grammars which allows annotations on terminals. Our model extends the standard notion of regular spanners, and is more expressive than the extraction grammars recently introduced by ...
Total Citations: 4 | Total Downloads: 104 | Last 12 Months: 13 | Last 6 weeks: 2
research-article
Open Access
June 2022
Document Spanners - A Brief Overview of Concepts, Results, and Recent Developments
- Markus L. Schmid,
- Nicole Schweikardt
PODS '22: Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Pages 139–150, https://doi.org/10.1145/3517804.3526069

The information extraction framework of document spanners was introduced by Fagin, Kimelfeld, Reiss, and Vansummeren (PODS 2013, J. ACM 2015) as a formalisation of the query language AQL, which is used in IBM's information extraction engine SystemT. ...
Total Citations: 3 | Total Downloads: 185 | Last 12 Months: 93 | Last 6 weeks: 20
Supplementary Material: PODS22-gm86.mp4
research-article
Open Access
June 2022
Query Evaluation over SLP-Represented Document Databases with Complex Document Editing
- Markus L. Schmid,
- Nicole Schweikardt
PODS '22: Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Pages 79–89, https://doi.org/10.1145/3517804.3524158

It is known that the query result of a regular spanner over a single document D can be enumerated after O(|D|) preprocessing and with constant delay in data complexity (Florenzano et al., ACM TODS 2020, Amarilli et al., ACM TODS 2021). It has been shown ...
Total Citations: 2 | Total Downloads: 196 | Last 12 Months: 89 | Last 6 weeks: 17
Supplementary Material: PODS22-fp81.mp4
research-article
June 2021
Spanner Evaluation over SLP-Compressed Documents
- Markus L. Schmid,
- Nicole Schweikardt
PODS '21: Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Pages 153–165, https://doi.org/10.1145/3452021.3458325

We consider the problem of evaluating regular spanners over compressed documents, i.e., we wish to solve evaluation tasks directly on the compressed data, without decompression. As compressed forms of the documents we use straight-line programs (SLPs) --...
Total Citations: 4 | Total Downloads: 88 | Last 12 Months: 16 | Last 6 weeks: 3
Supplementary Material: PODS 2021 ACM DL video.mp4
research-article
June 2019
Complexity Bounds for Relational Algebra over Document Spanners
PODS '19: Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Pages 320–334, https://doi.org/10.1145/3294052.3319699

We investigate the complexity of evaluating queries in Relational Algebra (RA) over the relations extracted by regex formulas (i.e., regular expressions with capture variables) over text documents. Such queries, also known as the regular document ...
Total Citations: 6 | Total Downloads: 192 | Last 12 Months: 26 | Last 6 weeks: 6
research-article
May 2018
Document Spanners for Extracting Incomplete Information: Expressiveness and Complexity
PODS '18: Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Pages 125–136, https://doi.org/10.1145/3196959.3196968

Rule-based information extraction has lately received a fair amount of attention from the database community, with several languages appearing in the last few years. Although information extraction systems are intended to deal with semistructured data, ...
Total Citations: 26 | Total Downloads: 219 | Last 12 Months: 11 | Last 6 weeks: 2
research-article
May 2018
Joining Extractions of Regular Expressions
PODS '18: Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, Pages 137–149, https://doi.org/10.1145/3196959.3196967

Regular expressions with capture variables, also known as "regex formulas," extract relations of spans (interval positions) from text. These relations can be further manipulated via the relational Algebra as studied in the context of "document spanners,...
Total Citations: 35 | Total Downloads: 319 | Last 12 Months: 20 | Last 6 weeks: 4
column
May 2016
A Relational Framework for Information Extraction
ACM SIGMOD Record (SIGMOD), Volume 44, Issue 4, Pages 5–16, https://doi.org/10.1145/2935694.2935696

Information Extraction commonly refers to the task of populating a relational schema, having predefined underlying semantics, from textual content. This task is pervasive in contemporary computational challenges associated with Big Data. In this article ...
Total Citations: 6 | Total Downloads: 277 | Last 12 Months: 4 | Last 6 weeks: 0
research-article
May 2015
Document Spanners: A Formal Approach to Information Extraction
Journal of the ACM (JACM), Volume 62, Issue 2, Article No.: 12, Pages 1–51, https://doi.org/10.1145/2699442

An intrinsic part of information extraction is the creation and manipulation of relations extracted from text. In this article, we develop a foundational framework where the central construct is what we call a document spanner (or just spanner for short)...
Total Citations: 59 | Total Downloads: 919 | Last 12 Months: 71 | Last 6 weeks: 5
tutorial
June 2014
Database principles in information extraction
- Benny Kimelfeld
PODS '14: Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, Pages 156–163, https://doi.org/10.1145/2594538.2594563

Information Extraction commonly refers to the task of populating a relational schema, having predefined underlying semantics, from textual content. This task is pervasive in contemporary computational challenges associated with Big Data. This tutorial ...
Total Citations: 9 | Total Downloads: 464 | Last 12 Months: 5 | Last 6 weeks: 0
research-article
June 2014
Cleaning inconsistencies in information extraction via prioritized repairs
PODS '14: Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, Pages 164–175, https://doi.org/10.1145/2594538.2594540

The population of a predefined relational schema from textual content, commonly known as Information Extraction (IE), is a pervasive task in contemporary computational challenges associated with Big Data. Since the textual content varies widely in nature ...
Total Citations: 14 | Total Downloads: 397 | Last 12 Months: 9 | Last 6 weeks: 0
