Abstract
Data cube pre-computation is an important concept for supporting OLAP (Online Analytical Processing) and has been studied extensively. It is often not feasible to compute a complete data cube due to the huge storage requirement. Recently proposed quotient cube addressed this issue through a partitioning method that groups cube cells into equivalence partitions. Such an approach not only is useful for distributive aggregate functions such as SUM but also can be applied to the maintenance of holistic aggregate functions like MEDIAN which will require the storage of a set of tuples for each equivalence class. Unfortunately, as changes are made to the data sources, maintaining the quotient cube is non-trivial since the partitioning of the cube cells must also be updated. In this paper, the authors design incremental algorithms to update a quotient cube efficiently for both SUM and MEDIAN aggregate functions. For the aggregate function SUM, concepts are borrowed from the principle of Galois Lattice to develop CPU-efficient algorithms to update a quotient cube. For the aggregate function MEDIAN, the concept of a pseudo class is introduced to further reduce the size of the quotient cube. Coupled with a novel sliding window technique, an efficient algorithm is developed for maintaining a MEDIAN quotient cube that takes up reasonably small storage space. Performance study shows that the proposed algorithms are efficient and scalable over large databases.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Agarwal S, Agrawal R, Deshpande P M et al. On the computation of multidimensional aggregates. In Proc. 1996 Int. Conf. Very Large Data Bases (VLDB'96), Bombay, India, Sept. 1996, pp.506–521.
Barbara D. Quasi-cubes: Exploiting approximation in multidimensional databases. SIGMOD Record, 1997, 26: 12–17.
Beyer K, Ramakrishnan R. Bottom-up computation of sparse and iceberg cubes. In Proc. 1999 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'99), Philadelphia, PA, June 1999, pp.359–370.
Davey A, Priestley H A. Introduction to Lattices and Order. Cambridge University Press, 1990.
Ganter B, Wille R, Franzke C. Formal Concept Analysis: Mathematical Foundations. Springer-Verlag, 1999.
Godin R, Missaoui R, Alaoui H. Incremental concept formation algorithms based on Galois lattices. Computational Intelligence, 1991, 11: 246–267.
Gray J, Chaudhuri S, Bosworth A et al. Data cube: A relational aggregation operator generalizing group-by, cross-tab and sub-totals. Data Mining and Knowledge Discovery, 1997, 1: 29–54.
Harinarayan V, Rajaraman A, Ullman J D. Implementing data cubes efficiently. In Proc. 1996 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'96), Montreal, Canada, June 1996, pp.205–216.
Labio W J, Yang J, Cui Y et al. Performance issues in incremental warehouse maintenance. In Proc. the 26th Int. Conference on Very Large Data Bases (VLDB'00), Cairo, Egypt, 2000, pp.461–472.
Lakshmanan L V S, Pei J, Han J W. Quotient cube: How to summarize the semantics of a data cube. In Proc. 2002 Int. Conf. Very Large Data Bases (VLDB'02), Hong Kong, China, 2002, pp.227–238.
Mumick I S, Quass D, Mumick B S. Maintenance of data cubes and summary tables in a warehouse. In Proc. ACM-SIGMOD Int. Conference on Management of Data, Tucson, Arizona, USA, 1997, pp.100–111.
Palpanas T, Sidle R, Cochrane R, Pirahesh H. Incremental maintenance for non-distributive aggregate functions. In Proc. 2002 Int. Conf. Very Large Data Bases (VLDB'02), Hong Kong, China, 2002, pp.177–188.
Ross K, Srivastava D. Fast computation of sparse datacubes. In Proc. 1997 Int. Conf. Very Large Data Bases (VLDB'97), Athens, Greece, Aug. 1997, pp.116–125.
Ross K A, Srivastava D, Sudarshan S. Materialized view maintenance and integrity constraint checking: Trading space for time. In Proc. ACM-SIGMOD Int. Conf. Management of Data, Montreal, Quebec, Canada, 1996, pp.447–458.
Shanmugasundaram J, Fayyad U, Bradley P S. Compressed data cubes for OLAP aggregate query approximation on continuous dimensions. In Proc. ACM-SIGKDD Int. Conf. Knowledge Discovery and Data Mining, San Diego, CA, USA, 1999, pp.223–232.
Sismanis Y, Deligiannakis A, Roussopoulos N, Kotidis Y. Dwarf: Shrinking the petacube. In Proc. ACM-SIGMOD Int. Conf. Management of Data, Madison, Wisconsin, USA, 2002, pp.464–475.
Anthony K H Tung, Cuiping Li, Gao Cong. Incremental maintenance of quotient cube for Sum and Median. Technical Report 02-1, School of Computing, NUS, Dec. 2, 2002, pp.1–18.
Wang W, Feng J L, Lu H J, Yu J X. Condensed cube: An effective approach to reducing data cube size. In Proc. 2002 Int. Conf. Data Engineering (ICDE'02), San Jose, CA, USA, 2002, pp.155–165.
Yi K, Yu H, Yang J, Xia G, Chen Y. Efficient maintenance of materialized top-k views. In Proc. 2003 Int. Conf. Data Engineering (ICDE'03), Bangalore, India, 2003, pp.189–200.
Zhao Y, Deshpande P M, Naughton J F. An array-based algorithm for simultaneous multidimensional aggregates. In Proc. 1997 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'97), Tucson, Arizona, May 1997, pp.159–170.
Author information
Authors and Affiliations
Corresponding author
Additional information
This paper is supported by the National Natural Science Foundation of China (Grant Nos. 60473069, 60496325, 60273071).
Cui-Ping Li received her Ph.D. degree from the Institute of Computing Technology, the Chinese Academy of Sciences in 2003. She is now an assistant professor of the information School, Renmin University of China. Her research interests include database systems, data warehouse, and data mining.
Shan Wang received her B.S. degree from Peking University in 1968 and her M.S. degree from the Renmin University of China in 1981. She is now the dean and a professor of the Information School, Renmin University of China. She is also a research fellow and a Ph.D. supervisor of the Institute of Computing Technology, the Chinese Academy of Sciences. Her research interests include database systems and knowledge engineering, mobile data management, and data warehousing.
Rights and permissions
About this article
Cite this article
Li, CP., Wang, S. Efficient Incremental Maintenance for Distributive and Non-Distributive Aggregate Functions. J Comput Sci Technol 21, 52–65 (2006). https://doi.org/10.1007/s11390-006-0052-6
Received:
Accepted:
Issue Date:
DOI: https://doi.org/10.1007/s11390-006-0052-6