[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/308386.308455acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
Article
Free access

Statistical estimators for relational algebra expressions

Published: 01 March 1988 Publication History

Abstract

Present database systems process all the data related to a query before giving out responses. As a result, the size of the data to be processed becomes excessive for real-time/time-constrained environments. A new methodology is needed to cut down systematically the time to process the data involved in processing the query. To this end, we propose to use data samples and construct an approximate synthetic response to a given query.
In this paper, we consider only COUNT(E) type queries, where E is an arbitrary relational algebra expression. We make no assumptions about the distribution of attribute values and ordering of tuples in the input relations, and propose consistent and unbiased estimators for arbitrary COUNT(E) type queries. We design a sampling plan based on the cluster sampling method to improve the utilization of sampled data and to reduce the cost of sampling. We also evaluate the performance of the proposed estimators.

References

[1]
Cochmn. w o, "~g T~que,". T~rd ~d John Wdey & Sons, 1977
[2]
Chnstodoulakas, S. "Estunaung Record SelecuvtJes", Informaucm Systems, Vol 8, 1983
[3]
Devote, J L, "Probabthty & Staumcs for Eng, neermg and Sclences", Brook/Cole, 1984
[4]
Datta, A, Fourmer, B. Hc~. W-C, and Ozsoyoglu, G, "The Implementat~n of SSDB". Proc Thml Internauonal Workshop oa Stausucal Databam Management", July 1986.
[5]
Goodman, LA., "On the Esamauon of the N~ of Classes m a Populaum", Ann Math Sta., 1949
[6]
Hou, W-C, Ozsoyoglu, G, and Taneja, B k., "Sta- U~cal Emmators for RelalLtonal Algebra Expres. mons", Tech Rpt. CES-87-15, CWRU, 1987
[7]
Lm, C L, "Intmdu~m to Ccxnbmatonal Mathematics", McGraw-Hall, 196&
[8]
Morgenstem, J P, "Compmer Based Management lnfommtton Systems Embodymg Answer Accuracy As a User Parameter", Ph D Thems, Umv of Cahfornm, Berkeley, 1981
[9]
Olken, F, "Phymcal Database Support for Sc~enufic and Statlmcal Databases", Third Int. Scsenttfic and Statlmcal Databases Workshop, 1986
[10]
Olken, F and Rotem, D, "Ssmple Random Samphng from Relational Databases", VLDB Conf 1986
[11]
Ross, S M, "lntroductloa to Probabshty Models", 2nd Ed., Acndenac Press, 1980
[12]
Rowe, N C, "Antlsamphn8 for Emmattom An overvsew", IEE Trans on Software Eng., Oct 1985
[13]
Scheaffer, MendenhaH, and Ott, "Elementary Survey Samphn8", 3rd Ed, Duxbury press, 1986
[14]
Sukhatme, P V, etc, "Samphng ~ry of Surveys Apphcauon", 3rd Ed, New Delhi, Indm and Iowa State Umv Press, 1984

Cited By

View all
  • (2024)Learning-based Property Estimation with PolynomialsProceedings of the ACM on Management of Data10.1145/36549942:3(1-27)Online publication date: 30-May-2024
  • (2022)SMARTEN—A Sample-Based Approach towards Privacy-Friendly Data RefinementJournal of Cybersecurity and Privacy10.3390/jcp20300312:3(606-628)Online publication date: 15-Aug-2022
  • (2022)Sampling-based Estimation of the Number of Distinct Values in Distributed EnvironmentProceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3534678.3539390(893-903)Online publication date: 14-Aug-2022
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PODS '88: Proceedings of the seventh ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
March 1988
352 pages
ISBN:0897912632
DOI:10.1145/308386
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 March 1988

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Article

Conference

SIGMOD/PODS05

Acceptance Rates

Overall Acceptance Rate 642 of 2,707 submissions, 24%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)73
  • Downloads (Last 6 weeks)5
Reflects downloads up to 14 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Learning-based Property Estimation with PolynomialsProceedings of the ACM on Management of Data10.1145/36549942:3(1-27)Online publication date: 30-May-2024
  • (2022)SMARTEN—A Sample-Based Approach towards Privacy-Friendly Data RefinementJournal of Cybersecurity and Privacy10.3390/jcp20300312:3(606-628)Online publication date: 15-Aug-2022
  • (2022)Sampling-based Estimation of the Number of Distinct Values in Distributed EnvironmentProceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining10.1145/3534678.3539390(893-903)Online publication date: 14-Aug-2022
  • (2021)Rapid Approximate Aggregation with Distribution-Sensitive Interval Guarantees2021 IEEE 37th International Conference on Data Engineering (ICDE)10.1109/ICDE51399.2021.00150(1703-1714)Online publication date: Apr-2021
  • (2021)Join cardinality estimation by combining operator-level deep neural networksInformation Sciences10.1016/j.ins.2020.09.065546(1047-1062)Online publication date: Feb-2021
  • (2017)I've seen "enough"Proceedings of the VLDB Endowment10.14778/3137628.313763710:11(1262-1273)Online publication date: 1-Aug-2017
  • (2015)Rapid sampling for visualizations with ordering guaranteesProceedings of the VLDB Endowment10.14778/2735479.27354858:5(521-532)Online publication date: 1-Jan-2015
  • (2015)Multidimensional Cluster Sampling View on Large Databases for Approximate Query ProcessingProceedings of the 2015 IEEE 19th International Enterprise Distributed Object Computing Conference10.1109/EDOC.2015.24(104-111)Online publication date: 21-Sep-2015
  • (2014)Uncertainty aware query execution time predictionProceedings of the VLDB Endowment10.14778/2733085.27330927:14(1857-1868)Online publication date: 1-Oct-2014
  • (2013)CS2Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data10.1145/2463676.2463701(469-480)Online publication date: 22-Jun-2013
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media