- Sponsor: SIGOPS
Unikernels: The Next Stage of Linux's Dominance
- Ali Raza,
- Parul Sohal,
- James Cadden,
- Jonathan Appavoo,
- Ulrich Drepper,
- Richard Jones,
- Orran Krieger,
- Renato Mancuso,
- Larry Woodman
Unikernels have demonstrated enormous advantages over Linux in many important domains, causing some to propose that the days of Linux's dominance may be coming to an end. On the contrary, we believe that unikernels' advantages represent the next natural ...
A fork() in the road
The received wisdom suggests that Unix's unusual combination of fork() and exec() for process creation was an inspired design. In this paper, we argue that fork was a clever hack for machines and programs of the 1970s that has long outlived its ...
Can We Prove Time Protection?
Timing channels are a significant and growing security threat in computer systems, with no established solution. We have recently argued that the OS must provide time protection, in analogy to the established memory protection, to protect applications ...
Towards Automatic Inference of Inductive Invariants
Distributed systems are notoriously difficult to design and implement correctly. Formal verification provides correctness proofs, and has recently been successfully applied to various distributed systems. At the heart of a typical formal verification is ...
RedLeaf: Towards An Operating System for Safe and Verified Firmware
RedLeaf is a new operating system being developed from scratch to utilize formal verification for implementing provably secure firmware. RedLeaf is developed in a safe language, Rust, and relies on automated reasoning using satisfiability modulo ...
Synthesizing Cluster Management Code for Distributed Systems
Management planes for data-center systems are complicated to develop, test, maintain, and evolve. They routinely grapple with hard combinatorial optimization problems like load balancing, placement, scheduling, rolling upgrades and configuration ...
Comprehensive and Efficient Runtime Checking in System Software through Watchdogs
Systems software today is composed of numerous modules and exhibits complex failure modes. Existing failure detectors focus on catching simple, complete failures and treat programs uniformly at the process level. In this paper, we argue that modern ...
Automatic Virtualization of Accelerators
Applications are migrating en masse to the cloud, while accelerators such as GPUs, TPUs, and FPGAs proliferate in the wake of Moore's Law. These technological trends are incompatible. Cloud applications run on virtual platforms, but traditional I/O ...
The Case for I/O-Device-as-a-Service
Many computer systems, especially mobile and IoT systems, use a large number of I/O devices. A contemporary OS acts as a security guard for these devices, which trust the OS to correctly implement the "perimeter defense." Moreover, the OS also trusts ...
I'm Not Dead Yet!: The Role of the Operating System in a Kernel-Bypass Era
Researchers have long predicted the demise of the operating system [21, 26, 41]. As datacenter servers increasingly incorporate I/O devices that let applications bypass the OS kernel (e.g., RDMA [12] and DPDK [15] network devices or SPDK storage devices)...
I/O Is Faster Than the CPU: Let's Partition Resources and Eliminate (Most) OS Abstractions
I/O is getting faster in servers that have fast programmable NICs and non-volatile main memory operating close to the speed of DRAM, but single-threaded CPU speeds have stagnated. Applications cannot take advantage of modern hardware capabilities when ...
Towards Multiverse Databases
- Alana Marzoev,
- Lara Timbó Araújo,
- Malte Schwarzkopf,
- Samyukta Yagati,
- Eddie Kohler,
- Robert Morris,
- M. Frans Kaashoek,
- Sam Madden
A multiverse database transparently presents each application user with a flexible, dynamic, and independent view of shared data. This transformed view of the entire database contains only information allowed by a centralized and easily-auditable ...
Isolation and Beyond: Challenges for System Security
System security has historically relied on hardware-provided isolation primitives. However, Meltdown [36] and Spectre [30] demonstrate that basic user/kernel isolation could be bypassed in every widely deployed ISA for decades; they are a caution to ...
Rethinking General-Purpose Decentralized Computing
While showing great promise, smart contracts are difficult to program correctly, as they need a deep understanding of cryptography and distributed algorithms, and offer limited functionality, as they have to be deterministic and cannot operate on secret ...
Fast key-value stores: An idea whose time has come and gone
Remote, in-memory key-value (RINK) stores such as Memcached [6] and Redis [7] are widely used in industry and are an active area of academic research. Coupled with stateless application servers to execute business logic and a database-like system to ...
Designing Far Memory Data Structures: Think Outside the Box
Technologies like RDMA and Gen-Z, which give access to memory outside the box, are gaining in popularity. These technologies provide the abstraction of far memory, where memory is attached to the network and can be accessed by remote processors without ...
Project PBerry: FPGA Acceleration for Remote Memory
- Irina Calciu,
- Ivan Puddu,
- Aasheesh Kolli,
- Andreas Nowatzyk,
- Jayneel Gandhi,
- Onur Mutlu,
- Pratap Subrahmanyam
Recent research efforts propose remote memory systems that pool memory from multiple hosts. These systems rely on the virtual memory subsystem to track application memory accesses and transparently offer remote memory to applications. We outline several ...
Nines are Not Enough: Meaningful Metrics for Clouds
Cloud customers want strong, understandable promises (Service Level Objectives, or SLOs) that their applications will run reliably and with adequate performance, but cloud providers don't want to offer them, because they are technically hard to meet in ...
The Synchronous Data Center
Today, distributed systems are typically designed to be largely asynchronous. Designers assume that the network can drop or significantly delay messages at unpredictable times, that there is no way to know how quickly a node might process a message, or ...
Granular Computing
Granular computing is a new style of computing where applications are composed of large numbers (thousands to millions) of very short-lived (10-100μs) tasks. Today's systems and infrastructure were designed to support millisecond-scale operations and ...
What bugs cause production cloud incidents?
Cloud services have become the backbone of today's computing world. Runtime incidents, which adversely affect the expected service operations, are extremely costly in terms of user impacts and engineering efforts required to resolve them. Hence, such ...
You can't debug what you can't see: Expanding observability with the OmniTable
The effectiveness of a debugging tool is fundamentally limited by what program state it can observe. Yet, for performance reasons, all current debugging tools restrict the program state that can be observed in some way. For example, tools like heap ...
Practical Safe Linux Kernel Extensibility
The ability to extend kernel functionality safely has long been a design goal for operating systems. Modern operating systems, such as Linux, are structured for extensibility to enable sharing a single code base among many environments. Unfortunately, ...
Machine Learning Systems are Stuck in a Rut
In this paper we argue that systems for numerical computing are stuck in a local basin of performance and programmability. Systems researchers are doing an excellent job improving the performance of 5-year-old benchmarks, but gradually making it harder ...
A Case for Managed and Model-less Inference Serving
The number of applications relying on inference from machine learning models, especially neural networks, is already large and expected to keep growing. For instance, Facebook applications issue tens of trillions of inference queries per day with ...
Why and How to Increase SSD Performance Transparency
Even on modern SSDs, I/O scheduling is a first-order performance concern. However, it is unclear how best to optimize I/O patterns for SSDs, because a complex layer of proprietary firmware hides many principal aspects of performance, as well as SSD ...
CPR for SSDs
Modern storage systems are built upon the assumption that the capacity of a storage device does not change. This capacity-invariant interface forces a flash-based storage device to trade performance for reliability when, in fact, it can maintain both if ...
When Should The Network Be The Computer?
Researchers have repurposed programmable network devices to place small amounts of application computation in the network, sometimes yielding orders-of-magnitude performance gains. At the same time, effectively using these devices requires careful use ...
In-Network Compute: Considered Armed and Dangerous
Programmable data planes promise unprecedented flexibility and innovation. But enormous management issues arise when these programmable data-planes, and the in-network compute functionality they enable, are deployed within production networks. In this ...
- Proceedings of the Workshop on Hot Topics in Operating Systems