DOI: 10.1145/375663.375664

Efficient computation of Iceberg cubes with complex measures

Published: 01 May 2001

Abstract

It is often too expensive to compute and materialize a complete high-dimensional data cube. Computing an iceberg cube, which contains only aggregates above certain thresholds, is an effective way to derive nontrivial multi-dimensional aggregations for OLAP and data mining.
In this paper, we study efficient methods for computing iceberg cubes with some popularly used complex measures, such as average, and develop a methodology that adopts a weaker but anti-monotonic condition for testing and pruning search space. In particular, for efficient computation of iceberg cubes with the average measure, we propose a top-k average pruning method and extend two previously studied methods, Apriori and BUC, to Top-k Apriori and Top-k BUC. To further improve the performance, an interesting hypertree structure, called H-tree, is designed and a new iceberg cubing method, called Top-k H-Cubing, is developed. Our performance study shows that Top-k BUC and Top-k H-Cubing are two promising candidates for scalable computation, and Top-k H-Cubing has better performance in most cases.
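The pruning idea named in the abstract can be illustrated with a small sketch. It assumes an iceberg condition of the form "average of the measure is at least a threshold, over a cell holding at least k tuples"; the function names and the sample values are hypothetical and are not taken from the paper. The plain average is not anti-monotonic, because a sub-cell can have a higher average than its parent. The average of the k largest values in a cell, however, bounds the average of every sub-cell with at least k tuples, so it can serve as the weaker but anti-monotonic test.

```python
def top_k_average(values, k):
    """Average of the k largest measure values, or None if the cell has fewer than k tuples."""
    if len(values) < k:
        return None
    top = sorted(values, reverse=True)[:k]
    return sum(top) / k


def can_prune(values, k, threshold):
    """Return True if no sub-cell with at least k tuples can reach the threshold.

    Any sub-cell with at least k tuples has an average no greater than the
    average of its own k largest values, which in turn is no greater than the
    parent's top-k average. So a parent whose top-k average falls below the
    threshold can be discarded together with all of its descendants.
    """
    bound = top_k_average(values, k)
    return bound is None or bound < threshold


# Hypothetical measure values for one cell.
cell = [10, 9, 2, 1]

# The plain average (5.5) is below 9, yet the sub-cell {10, 9} averages 9.5,
# so the cell must not be pruned for threshold 9.
print(can_prune(cell, k=2, threshold=9.0))

# For threshold 9.6 even the best two tuples fall short, so pruning is safe.
print(can_prune(cell, k=2, threshold=9.6))
```

Sorting is used here only for brevity; a real cubing method would maintain this bound incrementally rather than re-sorting each cell.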

References

[1] S. Agarwal, R. Agrawal, P. M. Deshpande, A. Gupta, J. F. Naughton, R. Ramakrishnan, and S. Sarawagi. On the computation of multidimensional aggregates. VLDB'96.
[2] R. Agrawal and R. Srikant. Fast algorithms for mining association rules. VLDB'94.
[3] R. J. Bayardo, R. Agrawal, and D. Gunopulos. Constraint-based rule mining on large, dense data sets. ICDE'99.
[4] K. Beyer and R. Ramakrishnan. Bottom-up computation of sparse and iceberg cubes. SIGMOD'99.
[5] S. Chaudhuri and U. Dayal. An overview of data warehousing and OLAP technology. ACM SIGMOD Record, 26:65-74, 1997.
[6] M. Fang, N. Shivakumar, H. Garcia-Molina, R. Motwani, and J. D. Ullman. Computing iceberg queries efficiently. VLDB'98.
[7] G. Grahne, L. Lakshmanan, and X. Wang. Efficient mining of constrained correlated sets. ICDE'00.
[8] J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart, M. Venkatrao, F. Pellow, and H. Pirahesh. Data cube: A relational aggregation operator generalizing group-by, cross-tab and sub-totals. Data Mining and Knowledge Discovery, 1:29-54, 1997.
[9] J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. SIGMOD'00.
[10] V. Harinarayan, A. Rajaraman, and J. D. Ullman. Implementing data cubes efficiently. SIGMOD'96.
[11] L. V. S. Lakshmanan, R. Ng, J. Han, and A. Pang. Optimization of constrained frequent set queries with 2-variable constraints. SIGMOD'99.
[12] R. Ng, L. V. S. Lakshmanan, J. Han, and A. Pang. Exploratory mining and pruning optimizations of constrained associations rules. SIGMOD'98.
[13] J. Pei and J. Han. Can we push more constraints into frequent pattern mining? KDD'00.
[14] K. Ross and D. Srivastava. Fast computation of sparse datacubes. VLDB'97.
[15] R. Srikant, Q. Vu, and R. Agrawal. Mining association rules with item constraints. KDD'97.
[16] Y. Zhao, P. M. Deshpande, and J. F. Naughton. An array-based algorithm for simultaneous multidimensional aggregates. SIGMOD'97.

Information

Published In

SIGMOD '01: Proceedings of the 2001 ACM SIGMOD international conference on Management of data
May 2001
630 pages
ISBN: 1581133324
DOI: 10.1145/375663

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Conference

SIGMOD/PODS01

Acceptance Rates

SIGMOD '01 Paper Acceptance Rate 44 of 293 submissions, 15%;
Overall Acceptance Rate 785 of 4,003 submissions, 20%

Cited By

  • (2022) Enabling efficient and general subpopulation analytics in multidimensional data streams. Proceedings of the VLDB Endowment 15:11 (3249-3262). DOI: 10.14778/3551793.3551867. Online publication date: 1-Jul-2022
  • (2021) A Complete Index Base for Querying Data Cube. Intelligent Systems and Applications (486-500). DOI: 10.1007/978-3-030-82196-8_36. Online publication date: 3-Aug-2021
  • (2020) Turbocharging Geospatial Visualization Dashboards via a Materialized Sampling Cube Approach. 2020 IEEE 36th International Conference on Data Engineering (ICDE) (1165-1176). DOI: 10.1109/ICDE48307.2020.00105. Online publication date: Apr-2020
  • (2019) Distributed graph cube generation using Spark framework. The Journal of Supercomputing. DOI: 10.1007/s11227-019-02746-4. Online publication date: 10-Jan-2019
  • (2018) Frequent items counter based on binary decoders. IEICE Electronics Express 15:20 (20180808-20180808). DOI: 10.1587/elex.15.20180808. Online publication date: 2018
  • (2018) An Optimal Algorithm for ℓ1-Heavy Hitters in Insertion Streams and Related Problems. ACM Transactions on Algorithms 15:1 (1-27). DOI: 10.1145/3264427. Online publication date: 22-Oct-2018
  • (2018) VLSI Design of Frequent Items Counting Using Binary Decoders Applied to 8-bit per Item Case-study. 2018 14th Conference on Ph.D. Research in Microelectronics and Electronics (PRIME) (161-164). DOI: 10.1109/PRIME.2018.8430308. Online publication date: Jul-2018
  • (2018) Cost effective, rule based, big data analytical aggregation engine for investment portfolios. Wireless Networks 28:3 (1203-1209). DOI: 10.1007/s11276-018-01904-5. Online publication date: 11-Dec-2018
  • (2018) Efficient OLAP algorithms on GPU-accelerated Hadoop clusters. Distributed and Parallel Databases 37:4 (507-542). DOI: 10.1007/s10619-018-7239-z. Online publication date: 31-Jul-2018
  • (2018) Scalable distributed data cube computation for large-scale multidimensional data analysis on a Spark cluster. Cluster Computing. DOI: 10.1007/s10586-018-1811-1. Online publication date: 1-Feb-2018
