Hernández et al., 2018 - Google Patents
Using machine learning to optimize parallelism in big data applications
- Document ID
- 17138672356390349428
- Author
- Hernández
- Perez M
- Gupta S
- Muntés-Mulero V
- Publication year
- 2018
- Publication venue
- Future Generation Computer Systems
Snippet
In-memory cluster computing platforms have gained momentum in the last years, due to their ability to analyse big amounts of data in parallel. These platforms are complex and difficult-to-manage environments. In addition, there is a lack of tools to better understand and optimize …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G06F17/30533—Other types of queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30289—Database design, administration or maintenance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3442—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for planning or managing the needed capacity
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management, e.g. organising, planning, scheduling or allocating time, human or machine resources; Enterprise planning; Organisational models
- G06Q10/063—Operations research or analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Hernández et al. | | Using machine learning to optimize parallelism in big data applications |
Herodotou et al. | | A survey on automatic parameter tuning for big data processing systems |
Bakshi | | Considerations for big data: Architecture and approach |
Wu et al. | | A self-tuning system based on application profiling and performance analysis for optimizing hadoop mapreduce cluster configuration |
Bautista Villalpando et al. | | Performance analysis model for big data applications in cloud computing |
US20070022142A1 (en) | | System and method to generate domain knowledge for automated system management by combining designer specifications with data mining activity |
Dartois et al. | | Investigating machine learning algorithms for modeling ssd i/o performance for container-based virtualization |
Chao et al. | | A gray-box performance model for apache spark |
Wang et al. | | Actcap: Accelerating mapreduce on heterogeneous clusters with capability-aware data placement |
Pan et al. | | I/O characterization of big data workloads in data centers |
Ganapathi | | Predicting and optimizing system utilization and performance via statistical machine learning |
Saxena et al. | | Auto-WLM: Machine learning enhanced workload management in Amazon Redshift |
Premchaiswadi et al. | | Optimizing and tuning MapReduce jobs to improve the large‐scale data analysis process |
Noel et al. | | Towards self-managing cloud storage with reinforcement learning |
Noorshams | | Modeling and prediction of i/o performance in virtualized environments |
Li et al. | | Data balancing-based intermediate data partitioning and check point-based cache recovery in Spark environment |
Khattab et al. | | MAG: A performance evaluation framework for database systems |
Kumar et al. | | Replication-Based Query Management for Resource Allocation Using Hadoop and MapReduce over Big Data |
Subedi et al. | | Rise: Reducing i/o contention in staging-based extreme-scale in-situ workflows |
Chen et al. | | ALBERT: an automatic learning based execution and resource management system for optimizing Hadoop workload in clouds |
Cano | | Optimizing distributed systems using machine learning |
Zhang et al. | | Automatic Configuration Tuning on Cloud Database: A Survey |
Bellamkonda Sathyanarayanan et al. | | A novel oppositional chaotic flower pollination optimization algorithm for automatic tuning of Hadoop configuration parameters |
Heidsieck et al. | | Cache-aware scheduling of scientific workflows in a multisite cloud |
Du et al. | | Stargazer: Toward efficient data analytics scheduling via task completion time inference |