Essertel et al., 2018 - Google Patents
Flare: Optimizing Apache Spark with Native Compilation for {Scale-Up} Architectures and {Medium-Size} DataEssertel et al., 2018
View PDF- Document ID
- 9221690369370968133
- Author
- Essertel G
- Tahboub R
- Decker J
- Brown K
- Olukotun K
- Rompf T
- Publication year
- Publication venue
- 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18)
External Links
Snippet
In recent years, Apache Spark has become the de facto standard for big data processing. Spark has enabled a wide audience of users to process petabyte-scale workloads due to its flexibility and ease of use: users are able to mix SQL-style relational queries with Scala or …
- 238000000034 method 0 abstract description 14
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30442—Query optimisation
- G06F17/30445—Query optimisation for parallel queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30389—Query formulation
- G06F17/30404—Query languages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30477—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30289—Database design, administration or maintenance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30312—Storage and indexing structures; Management thereof
- G06F17/30321—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30587—Details of specialised database models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/43—Checking; Contextual analysis
- G06F8/436—Semantic checking
- G06F8/437—Type checking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/45—Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
- G06F8/456—Parallelism detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/4421—Execution paradigms
- G06F9/4436—Data-driven
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/362—Software debugging
- G06F11/3636—Software debugging by tracing the execution of the program
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/30—Creation or generation of source code
- G06F8/31—Programming languages or programming paradigms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Essertel et al. | Flare: Optimizing Apache Spark with Native Compilation for {Scale-Up} Architectures and {Medium-Size} Data | |
Begoli et al. | Apache calcite: A foundational framework for optimized query processing over heterogeneous data sources | |
Palkar et al. | Evaluating end-to-end optimization for data analytics applications in weld | |
Klonatos et al. | Building efficient query engines in a high-level language | |
Wanderman-Milne et al. | Runtime Code Generation in Cloudera Impala. | |
Shaikhha et al. | How to architect a query compiler | |
Jahani et al. | Automatic optimization for MapReduce programs | |
Armbrust et al. | Spark sql: Relational data processing in spark | |
Bu et al. | A bloat-aware design for big data applications | |
Alexandrov et al. | The stratosphere platform for big data analytics | |
Shaikhha et al. | Building efficient query engines in a high-level language | |
Xin et al. | Shark: SQL and rich analytics at scale | |
Freedman et al. | Compilation in the Microsoft SQL Server Hekaton Engine. | |
Guo et al. | Spotting Code Optimizations in {Data-Parallel} Pipelines through {PeriSCOPE} | |
Murray et al. | Steno: Automatic optimization of declarative queries | |
Essertel et al. | Flare: Native compilation for heterogeneous workloads in Apache Spark | |
Jungmair et al. | Designing an open framework for query optimization and compilation | |
Phani et al. | Lima: Fine-grained lineage tracing and reuse in machine learning systems | |
Cheng et al. | SCANRAW: A database meta-operator for parallel in-situ processing and loading | |
Palkar et al. | Weld: Rethinking the interface between data-intensive applications | |
Stewart et al. | Comparing high level mapreduce query languages | |
Leeka et al. | Incorporating super-operators in big-data query optimizers | |
Zou et al. | PlinyCompute: A platform for high-performance, distributed, data-intensive tool development | |
Schiavio et al. | Dynamic speculative optimizations for SQL compilation in Apache Spark | |
Onizuka et al. | Optimization for iterative queries on MapReduce |