SIGMOD: Vol 27, No 2

Volume 27, Issue 2, June 1998

Editor:

Ashutosh Tiwary
Boeing Co.; and Univ. of Washington, Seattle

Publisher:

Association for Computing Machinery, New York, NY, United States

ISSN: 0163-5808

article

Free

Query flocks: a generalization of association-rule mining

Pages 1–12

https://doi.org/10.1145/276305.276306

Association-rule mining has proved a highly successful technique for extracting useful information from very large databases. This success is attributed not only to the appropriateness of the objectives, but to the fact that a number of new query-...

article

Free

Exploratory mining and pruning optimizations of constrained associations rules

Pages 13–24

https://doi.org/10.1145/276305.276307

From the standpoint of supporting human-centered discovery of knowledge, the present-day model of mining association rules suffers from the following serious shortcomings: (i) lack of user exploration and control, (ii) lack of focus, and (iii) rigid ...

article

Free

Parallel mining algorithms for generalized association rules with classification hierarchy

Pages 25–36

https://doi.org/10.1145/276305.276308

Association rule mining recently attracted strong attention. Usually, the classification hierarchy over the data items is available. Users are interested in generalized association rules that span different levels of the hierarchy, since sometimes more ...

article

Free

Reusing invariants: a new strategy for correlated queries

Pages 37–48

https://doi.org/10.1145/276305.276309

Correlated queries are very common and important in decision support systems. Traditional nested iteration evaluation methods for such queries can be very time consuming. When they apply, query rewriting techniques have been shown to be much more ...

article

Free

Query unnesting in object-oriented databases

Leonidas Fegaras

Pages 49–60

https://doi.org/10.1145/276305.276310

There is already a sizable body of proposals on OODB query optimization. One of the most challenging problems in this area is query unnesting, where the embedded query can take any form, including aggregation and universal quantification. Although there ...

article

Free

Changing the rules: transformations for rule-based optimizers

Pages 61–72

https://doi.org/10.1145/276305.276311

Rule-based optimizers are extensible because they consist of modifiable sets of rules. For modification to be straightforward, rules must be easily reasoned about (i.e., understood and verified). At the same time, rules must be expressive and efficient (...

article

Free

CURE: an efficient clustering algorithm for large databases

Pages 73–84

https://doi.org/10.1145/276305.276312

Clustering, in data mining, is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favor clusters with spherical shapes and similar sizes, or are very fragile in the ...

article

Free

Efficiently mining long patterns from databases

Roberto J. Bayardo

Pages 85–93

https://doi.org/10.1145/276305.276313

We present a pattern-mining algorithm that scales roughly linearly in the number of maximal patterns embedded in a database irrespective of the length of the longest pattern. In comparison, previous algorithms based on Apriori scale exponentially with ...

article

Free

Automatic subspace clustering of high dimensional data for data mining applications

Pages 94–105

https://doi.org/10.1145/276305.276314

Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical ...

article

Free

Efficient mid-query re-optimization of sub-optimal query execution plans

Pages 106–117

https://doi.org/10.1145/276305.276315

For a number of reasons, even the best query optimizers can very often produce sub-optimal query execution plans, leading to a significant degradation of performance. This is especially true in databases used for complex decision support queries and/or ...

article

Free

Interaction of query evaluation and buffer management for information retrieval

Pages 118–129

https://doi.org/10.1145/276305.276316

The proliferation of the World Wide Web has brought information retrieval (IR) techniques to the forefront of search technology. To the average computer user, “searching” now means using IR-based systems for finding information on the WWW or in other ...

article

Free

Cost-based query scrambling for initial delays

Pages 130–141

https://doi.org/10.1145/276305.276317

Remote data access from disparate sources across a wide-area network such as the Internet is problematic due to the unpredictable nature of the communications medium and the lack of knowledge about the load and potential delays at remote sites. ...

article

Free

The pyramid-technique: towards breaking the curse of dimensionality

Pages 142–153

https://doi.org/10.1145/276305.276318

In this paper, we propose the Pyramid-Technique, a new indexing method for high-dimensional data spaces. The Pyramid-Technique is highly adapted to range query processing using the maximum metric L_max. In contrast to all other index structures, the ...

article

Free

Optimal multi-step k-nearest neighbor search

Pages 154–165

https://doi.org/10.1145/276305.276319

For an increasing number of modern database applications, efficient support of similarity search becomes an important task. Along with the complexity of the objects such as images, molecules and mechanical parts, also the complexity of the similarity ...

article

Free

Dimensionality reduction for similarity searching in dynamic databases

Pages 166–176

https://doi.org/10.1145/276305.276320

Databases are increasingly being used to store multi-media objects such as maps, images, audio and video. Storage and retrieval of these objects is accomplished using multi-dimensional index structures such as R*-trees and SS-trees. As dimensionality ...

article

Free

Your mediators need data conversion!

Pages 177–188

https://doi.org/10.1145/276305.276321

Due to the development of the World Wide Web, the integration of heterogeneous data sources has become a major concern of the database community. Appropriate architectures and query languages have been proposed. Yet, the problem of data conversion which ...

article

Free

Using schematically heterogeneous structures

Renée J. Miller

Pages 189–200

https://doi.org/10.1145/276305.276322

Schematic heterogeneity arises when information that is represented as data under one schema, is represented within the schema (as metadata) in another. Schematic heterogeneity is an important class of heterogeneity that arises frequently in integrating ...

article

Free

Integration of heterogeneous databases without common domains using queries based on textual similarity

William W. Cohen

Pages 201–212

https://doi.org/10.1145/276305.276323

Most databases contain “name constants” like course numbers, personal names, and place names that correspond to entities in the real world. Previous work in integration of heterogeneous databases has assumed that local name constants can be mapped into ...

article

Free

The DEDALE system for complex spatial queries

Pages 213–224

https://doi.org/10.1145/276305.276324

This paper presents DEDALE, a spatial database system intended to overcome some limitations of current systems by providing an abstract and non-specialized data model and query language for the representation and manipulation of spatial objects. DEDALE ...

article

Free

Similarity query processing using disk arrays

Pages 225–236

https://doi.org/10.1145/276305.276325

Similarity queries are fundamental operations that are used extensively in many modern applications, whereas disk arrays are powerful storage media of increasing importance. The basic trade-off in similarity query processing in such a system is that ...

article

Free

Incremental distance join algorithms for spatial databases

Pages 237–248

https://doi.org/10.1145/276305.276326

Two new spatial join operations, distance join and distance semi-join, are introduced where the join output is ordered by the distance between the spatial attribute values of the joined tuples. Incremental algorithms are presented for computing these ...

article

Free

An alternative storage organization for ROLAP aggregate views based on cubetrees

Pages 249–258

https://doi.org/10.1145/276305.276327

The Relational On-Line Analytical Processing (ROLAP) is emerging as the dominant approach in data warehousing with decision support applications. In order to enhance query performance, the ROLAP approach relies on selecting and materializing in summary ...

article

Free

Caching multidimensional queries using chunks

Pages 259–270

https://doi.org/10.1145/276305.276328

Caching has been proposed (and implemented) by OLAP systems in order to reduce response times for multidimensional queries. Previous work on such caching has considered table level caching and query level caching. Table level caching is more suitable ...

article

Free

Simultaneous optimization and evaluation of multiple dimensional queries

Pages 271–282

https://doi.org/10.1145/276305.276329

Database researchers have made significant progress on several research issues related to multidimensional data analysis, including the development of fast cubing algorithms, efficient schemes for creating and maintaining precomputed group-bys, and the ...

article

Free

NoDoSE—a tool for semi-automatically extracting structured and semistructured data from text documents

Brad Adelberg

Pages 283–294

https://doi.org/10.1145/276305.276330

Often interesting structured or semistructured data is not in database systems but in HTML pages, text files, or on paper. The data in these formats is not usable by standard query processing engines and hence users need a way of extracting data from ...

article

Free

Extracting schema from semistructured data

Pages 295–306

https://doi.org/10.1145/276305.276331

Semistructured data is characterized by the lack of any fixed and rigid schema, although typically the data has some implicit structure. While the lack of fixed schema makes extracting semistructured data fairly easy and an attractive goal, presenting ...

article

Free

Enhanced hypertext categorization using hyperlinks

Pages 307–318

https://doi.org/10.1145/276305.276332

A major challenge in indexing unstructured hypertext databases is to automatically extract meta-data that enables structured search using topic taxonomies, circumvents keyword ambiguity, and improves the quality of search and profile-based routing and ...

article

Free

Cost-based optimization of decision support queries using transient-views

Pages 319–330

https://doi.org/10.1145/276305.276333

Next generation decision support applications, besides being capable of processing huge amounts of data, require the ability to integrate and reason over data from multiple, heterogeneous data sources. Often, these data sources differ in a variety of ...

article

Free

New sampling-based summary statistics for improving approximate query answers

Pages 331–342

https://doi.org/10.1145/276305.276334

In large data recording and warehousing environments, it is often advantageous to provide fast, approximate answers to queries, whenever possible. Before DBMSs providing highly-accurate approximate answers can become a reality, many new techniques for ...

article

Free

Integrating association rule mining with relational database systems: alternatives and implications

Pages 343–354

https://doi.org/10.1145/276305.276335

Data mining on large data warehouses is becoming increasingly important. In support of this trend, we consider a spectrum of architectural alternatives for coupling mining with database systems. These alternatives include: loose-coupling through a SQL ...
