DOI: 10.1145/1755913.1755937
research-article

BOOM Analytics: exploring data-centric, declarative programming for the cloud

Published: 13 April 2010

Abstract

Building and debugging distributed software remains extremely difficult. We conjecture that by adopting a data-centric approach to system design and by employing declarative programming languages, a broad range of distributed software can be recast naturally in a data-parallel programming model. Our hope is that this model can significantly raise the level of abstraction for programmers, improving code simplicity, speed of development, ease of software evolution, and program correctness.
This paper presents our experience with an initial large-scale experiment in this direction. First, we used the Overlog language to implement a "Big Data" analytics stack that is API-compatible with Hadoop and HDFS and provides comparable performance. Second, we extended the system with complex distributed features not yet available in Hadoop, including high availability, scalability, and unique monitoring and debugging facilities. We present both quantitative and anecdotal results from our experience, providing some concrete evidence that both data-centric design and declarative languages can substantially simplify distributed systems programming.

Information

Published In

EuroSys '10: Proceedings of the 5th European conference on Computer systems
April 2010
388 pages
ISBN: 9781605585772
DOI: 10.1145/1755913
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cloud computing
  2. datalog
  3. mapreduce

Qualifiers

  • Research-article

Conference

EuroSys '10
EuroSys '10: Fifth EuroSys Conference 2010
April 13 - 16, 2010
Paris, France

Acceptance Rates

Overall Acceptance Rate 241 of 1,308 submissions, 18%

Cited By

  • (2024) Making Formulog Fast: An Argument for Unconventional Datalog Evaluation. Proceedings of the ACM on Programming Languages 8:OOPSLA2, pp. 1219-1248. DOI: 10.1145/3689754. Online publication date: 8-Oct-2024
  • (2024) Object-Oriented Fixpoint Programming with Datalog. Proceedings of the ACM on Programming Languages 8:OOPSLA2, pp. 60-86. DOI: 10.1145/3689713. Online publication date: 8-Oct-2024
  • (2024) Separate Compilation and Partial Linking: Modules for Datalog IR. Proceedings of the 23rd ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences, pp. 94-106. DOI: 10.1145/3689484.3690737. Online publication date: 21-Oct-2024
  • (2024) CyberDS: Auditable Monitoring in the Cloud. Computer Safety, Reliability, and Security, pp. 100-115. DOI: 10.1007/978-3-031-68606-1_7. Online publication date: 9-Sep-2024
  • (2023) Enhancing datalog reasoning with hypertree decompositions. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, pp. 3383-3393. DOI: 10.24963/ijcai.2023/377. Online publication date: 19-Aug-2023
  • (2023) Scaling a Declarative Cluster Manager Architecture with Query Optimization Techniques. Proceedings of the VLDB Endowment 16:10, pp. 2618-2631. DOI: 10.14778/3603581.3603599. Online publication date: 8-Aug-2023
  • (2023) Interactive Debugging of Datalog Programs. Proceedings of the ACM on Programming Languages 7:OOPSLA2, pp. 745-772. DOI: 10.1145/3622824. Online publication date: 16-Oct-2023
  • (2023) From SMT to ASP: Solver-Based Approaches to Solving Datalog Synthesis-as-Rule-Selection Problems. Proceedings of the ACM on Programming Languages 7:POPL, pp. 185-217. DOI: 10.1145/3571200. Online publication date: 11-Jan-2023
  • (2022) DBOS. Proceedings of the VLDB Endowment 15:1, pp. 21-30. DOI: 10.14778/3485450.3485454. Online publication date: 14-Jan-2022
  • (2022) Transactions across serverless functions leveraging stateful dataflows. Information Systems 108:C. DOI: 10.1016/j.is.2022.102015. Online publication date: 3-Jun-2022
