- research-article, September 2024
Revamping Sampling-Based PGO with Context-Sensitivity and Pseudo-instrumentation
CGO '24: Proceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization, Pages 322–333, https://doi.org/10.1109/CGO57630.2024.10444807
The ever increasing scale of modern data center demands more effective optimizations, as even a small percentage of performance improvement can result in a significant reduction in data-center cost and its environmental footprint. However, the diverse ...
- research-article, March 2022
Loner: utilizing the CPU vector datapath to process scalar integer data
CC 2022: Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction, Pages 205–217, https://doi.org/10.1145/3497776.3517767
Modern CPUs utilize SIMD vector instructions and hardware extensions to accelerate code with data-level parallelism. This allows for high performance gains in select application domains such as image and signal processing. However, general purpose code ...
- research-article, February 2021
PGZ: automatic zero-value code specialization
CC 2021: Proceedings of the 30th ACM SIGPLAN International Conference on Compiler Construction, Pages 36–46, https://doi.org/10.1145/3446804.3446845
In prior work we proposed Zeroploit, a transform that duplicates code, specializes one path assuming certain key program operands, called versioning variables, are zero, and leaves the other path unspecialized. Dynamically, depending on the versioning ...
- research-article, May 2017
Quantitative Driven Optimization of a Time Warp Kernel
SIGSIM-PADS '17: Proceedings of the 2017 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, Pages 27–38, https://doi.org/10.1145/3064911.3064932
The set of events available for execution in a Parallel Discrete Event Simulation (PDES) are known as the pending event set. In a Time Warp synchronized simulation engine, these pending events are scheduled for execution in an aggressive manner that ...
- Article, June 2006
Large scale Itanium® 2 processor OLTP workload characterization and optimization
DaMoN '06: Proceedings of the 2nd international workshop on Data management on new hardware, Pages 3–es, https://doi.org/10.1145/1140402.1140406
Large scale OLTP workloads on modern database servers are well understood across the industry. Their runtime performance characterizations serve to drive both server side software features and processor specific design decisions but are not understood ...