Rethinking the guest operating system
OSv is the result of a focused effort by a company called Cloudius Systems. Many of the people working on it will be familiar to people in the Linux community; they include Glauber Costa, Pekka Enberg, Avi Kivity, and Christoph Hellwig. Together, they have taken the approach that the operating system stack used for contemporary applications "congealed into existence" and contains a lot of unneeded cruft that only serves to add complexity and slow things down. So they set out to start over and reimplement the operating system with contemporary deployment scenarios in mind.
What that means, in particular, is that they have designed a system that is intended to be run in a virtualized mode under a hypervisor. The fundamental thought appears to be that the host operating system is already handling a lot of the details, including memory management, multitasking, dealing with the hardware, and more. Running a full operating system in the guest duplicates a lot of that work. If that duplication can be cut out of the picture, things should go a lot faster.
OSv is thus designed from the beginning to run under KVM (ports to other hypervisors are in the works), so it does not have to drag along a large set of device drivers. It is designed to run a single application, so a lot of the mechanisms found in a Unix-like system have been deemed unnecessary and tossed out. At the top of the list of casualties is the separation between the kernel and user space. By running everything within a single address space, OSv is able to cut out a lot of the overhead associated with context switches; there is no need for TLB flushes, for example, or to switch between page tables. Eliminating that overhead helps the OSv developers to claim far lower latency than Linux offers.
What about security in this kind of environment? Much of the responsibility for security appears to have been passed to the host, which will run any given virtual machine in the context of a specific user account and limit accesses accordingly. Since OSv only runs a single application, it need not worry about isolation between processes or between users; there are no other processes or users. For the rest, the system seems to target Java applications in particular, so the Java virtual machine (JVM) can also play a part in keeping, for example, a compromised application from running too far out of control.
Speaking of the JVM, the single-address-space design allows the JVM to be integrated into the operating system kernel itself. There are certain synergies that result from this combination; for example, the JVM is able to use the page tables to track memory use and minimize the amount of work that must be done at garbage collection time. Java threads can be managed directly by the core scheduler, so that switching between them is a fast operation. And so on.
The code is BSD licensed and available on GitHub. Quite a bit of it appears to have been written from scratch in C++, but much of the core kernel (including the network stack) is taken from FreeBSD. A fresh start means that a lot of features need to be reimplemented, but it also makes it relatively easy for the system to use modern hardware features (such as huge pages) from the outset. The filesystem of choice would appear to be ZFS, but the presentation slides from CloudOpen suggest that the developers are looking forward to widespread availability of nonvolatile RAM storage systems, which, they say, will reduce the role of the filesystem in an application's management of data.
The cynical among us might be tempted to say that, with all this work, the OSv developers have managed to reimplement MS-DOS. But what they really appear to have is the ultimate expression of the "just enough operating system" concept that allows an application to run on a virtual machine anywhere in whichever cloud may be of interest at the moment. For anybody who is just looking to have a system run on somebody's cloud network, OSv may well look far more appealing than a typical Linux distribution: it does away with the configuration hassles, and claims far better performance as well.
So, in a sense, OSv might indeed be (or become) the best
operating system for cloud-based applications.
But it is not really a replacement for Linux; instead, it could be thought
of as an enhancement that allows Linux-based virtual machines to run more
efficiently and with less effort. Anybody implementing a host will still
need Linux around to manage separation between users, resource control,
hardware, and more. But those who are running as guests might just be
convinced to leave Linux and its complexity behind in favor of a minimal
system like OSv that can run their applications and no more.
Posted Sep 19, 2013 7:50 UTC (Thu)
by aleXXX (subscriber, #2742)
[Link] (5 responses)
How much does that actually differ from, e.g., eCos, which also has (big parts of) a POSIX API and no memory protection?
Alex
Posted Sep 19, 2013 10:32 UTC (Thu)
by lacos (guest, #70616)
[Link] (3 responses)
This is addressed in the presentation linked in the article, slides 38-39:
> Porting a C application to OSv
Posted Sep 19, 2013 23:33 UTC (Thu)
by Karellen (subscriber, #67644)
[Link] (2 responses)
If not, not only does that get in the way of explicitly multi-threaded apps, but surely it also suddenly hobbles functional languages which offer the promise of great scalability performance using transparent parallelism over many threads (e.g. for operations like map/reduce) on today's multi-core systems?
Won't converting those apps to not just multi-process, but multi-(virtual)-machine, systems make them a heck of a lot *worse* than they are now?
Or was this mostly built to run bloody Java?
Posted Sep 20, 2013 0:44 UTC (Fri)
by dlang (guest, #313)
[Link]
As I read it, it was built _only_ to run Java
Posted Sep 20, 2013 5:01 UTC (Fri)
by glommer (guest, #15592)
[Link]
About the whole java thing, I have written a G+ post to clear that up:
We are Java focused, not java only.
Posted Sep 19, 2013 19:02 UTC (Thu)
by xman (guest, #46972)
[Link]
Posted Sep 19, 2013 9:35 UTC (Thu)
by edomaur (subscriber, #14520)
[Link]
Posted Sep 19, 2013 12:28 UTC (Thu)
by walters (subscriber, #7396)
[Link] (1 responses)
The pure virtualization target of this seems like it has the potential to make it much more widely deployed than Azul. Although for both of them, carrying lots of nontrivial kernel and JVM patches has to be difficult; presumably though the benefit to given workloads is quite large. Some benchmarks would be interesting to see.
Posted Sep 20, 2013 5:04 UTC (Fri)
by glommer (guest, #15592)
[Link]
The JVM is another story, though. So far we are running unmodified JVMs. But our goal is definitely to adapt the JVM. When that time comes, of course we will do our best to merge stuff up instead of carrying patches.
Posted Sep 19, 2013 12:29 UTC (Thu)
by bokr (subscriber, #58369)
[Link]
It would be nice to dream of trusting the CPU, the UEFI BIOS, and the
Running this hypothetical lttd would soon reveal that one is trusting quite
ISTM desirable to minimize the root and first branches of the trust tree, so I wonder
Since statically linked signed bootable applications sound like MSDOS to some, maybe
In any case, I want my trust tree rooted in my own signature, and my choice
OTOH, looking at it from the POV of a closed-source software seller/leaser,
I.e., from inside the VM bubble it should be possible to communicate securely to
Hopefully it can evolve into a thinner and thinner securely
Hm, guess it's time to wake up out of my daydream now, and try to do some work ;-)
[1] http://www.linuxfoundation.org/news-media/blogs/browse/20...
Posted Sep 19, 2013 14:46 UTC (Thu)
by jzbiciak (guest, #5246)
[Link] (3 responses)
From this article's description, I have a hard time thinking of OSv as an operating system in its own right. It seems more like a "supercharged, super-contained user space." That is, it seems like what I would end up with if I put a really strong container around a single task (taking away its ability to see other tasks in the process), but gave it much freer rein inside that container. I didn't really understand the JVM vs. Java application segmentation. It sounds like OSv relies on the presence of 2-stage (a.k.a. nested) translation to allow exposing the guest's page tables to the application (JVM in this case), but still leans on a host OS to do 90% of the low-level stuff we expect an OS to do, such as provide device drivers, hardware management, etc.
Posted Sep 20, 2013 9:25 UTC (Fri)
by intgr (subscriber, #39733)
[Link] (2 responses)
Oh wait, why not run simple Unix processes?
Posted Sep 20, 2013 17:18 UTC (Fri)
by pbonzini (subscriber, #60935)
[Link]
Posted Sep 23, 2013 23:22 UTC (Mon)
by zlynx (guest, #2285)
[Link]
I've been saying that for years.
The process appears to have been:
And then:
And soon it will be once again:
Posted Sep 19, 2013 22:17 UTC (Thu)
by jmorris42 (guest, #2203)
[Link]
So, sandbox using KVM. Compared to namespaces, containers, chroot, Java. Really big problem, really big need for something to actually work; why this will succeed where the others failed is, I guess, what I'm wondering.
Posted Sep 20, 2013 11:08 UTC (Fri)
by robert_s (subscriber, #42402)
[Link]
Perhaps they should call it "MULTICS".
Posted Sep 22, 2013 23:39 UTC (Sun)
by skissane (subscriber, #38675)
[Link]
Posted Sep 24, 2013 11:58 UTC (Tue)
by bergwolf (guest, #55931)
[Link] (1 responses)
Posted Oct 1, 2013 21:41 UTC (Tue)
by glommer (guest, #15592)
[Link]
What happens if I fork() ?
> [...]
> 2. May not fork() or exec()
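For what it's worth, a C program that forks a worker can often be ported by moving the child's work into a thread, since threads share the single address space anyway. A minimal sketch, assuming POSIX threads are available to the application; this is not taken from the OSv slides, and struct job, run_job() and run_in_thread() are hypothetical names:

```c
#include <pthread.h>

/* Hypothetical unit of work that a forking program would hand to a child. */
struct job {
    int input;
    int result;
};

/* Thread entry point: does what the forked child would have done. */
static void *run_job(void *arg)
{
    struct job *j = arg;

    j->result = j->input * 2;   /* placeholder for the real work */
    return NULL;
}

/* Replaces a fork()/waitpid() pair: start the job on a new thread and
 * wait for it to finish. Returns 0 on success or a pthread error number. */
int run_in_thread(struct job *j)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, run_job, j);

    if (err != 0)
        return err;
    return pthread_join(tid, NULL);
}
```

The result comes back through shared memory rather than an exit status or a pipe, which is the main thing that changes for the caller. Programs that rely on exec() to start a different binary would need a different approach.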
https://plus.google.com/107787008629542080430/posts/cx4Ro...
of the kernel whose "unmodified KVM" they use to run in.
booted hypervisor kernel, and being able to have a trusted lttd utility
(accessible to a user about to launch a monolithic OSV os/app), which would
list trust tree dependencies like ldd does link dependencies for executables,
and with access to signed manifest metadata to do optional automatic signature
checking of everything the user will have to trust when s/he kicks off something
in a VM box.
a lot when one trusts a securely UEFI-booted image of the kernel, and trusting that
to implement KVM hopefully securely virtually booting the OSV monolithic os/app
and controlling its access to resources. Not to mention the interesting problem
of "trusting trust" [2] which would be part of a comprehensive trust dependency tree.
if there are plans to pare down the kernel that OSV uses to where it contains nothing but
the bare necessities for providing KVM and controlled access to system resources
(including cloud stuff), and a trusted shell for administration and configuration, including configuring for
signed modules for access to new hardware and/or remote resources.
inter-VM comms could be modeled on the 1970s Unibus Bus Window hardware, for controlled
access to each others' memories ;-)
of delegation of trust, not some OEM's.
it would seem in their interest to support an open source UEFI/BIOS/hypervisor
trust tree root, if they could securely verify from within a KVM VM exactly what
they were trusting, and that nothing could penetrate their secure bubble.
get a trustable lttd report on one's own execution.
booted hypervisor system with extra secure special SSL administrative
control that could be configured for all the useful roles, whether
user/owner on a laptop or tamper-evidently booting on a colocated
server remotely managed, or at a library providing net-booted boxes,
or on corporate-owned laptops issued to people or projects, etc.
[2] http://en.wikipedia.org/wiki/Trusting_trust#Reflections_o...
- Supervisor Mode! Protected Memory! Yay! Now we can have security!
- Wah! Security makes programming hard! I need shared memory. I need a way to elevate my security mode. I need to write files.
- Wah! All these features I asked for have made me insecure!
- Virtual Machines! Yay! Now we can have security!
- Wah! Virtual machines are hard! How can I manage all these machines each one running a copy of my application? I need a way for them to share data with the hypervisor! Let them all share a filesystem! I want cut and paste from the consoles! Ooh, wouldn't it be nifty if my virtual machines could share some RAM!
- Wah! All these features have made my virtual machines insecure!
This is not a new idea. BEA had JRockit Virtual Edition - the JRockit JVM was ported to a thin custom OS designed to be used directly under a hypervisor, and that in turn was used to run WebLogic. Albeit, that product has since been discontinued.
[I work for Oracle but I don't speak for them]