OS Technologies To Watch
It’s the new year, and it seems to be a vibrant time for novel Operating System technologies. This is not intended to be an objective list of “the best things”; it’s just some up-and-coming technologies that I’m particularly excited about right now:
I’ve known about NixOS for a while, but not really had much cause to use it. This year, I started using NixOS to deploy some cloud services. I have been completely blown away.
I had so much to say about NixOS that I turned it into its own post: NixOS and stateless deployment, so go read that if you’re interested. But to summarize, NixOS lets you define a computer’s complete OS and configuration, declaratively. Users. Files. Software (both official and your own). Configuration. Services. Disk mounts. Kernel drivers. Every friggin’ thing. It’s all specified in a pure, lazy, strongly typed declarative language with just enough power (functions, modules, etc.) to allow all the abstractions you need, but which often reads just like a trivial configuration file.
And unlike puppet (which is declarative but impure and non-exhaustive), the promise of stateless declarative configuration actually holds true. Taking so much state out of deployment really is an incredible and liberating achievement. I honestly dread the next time I have cause to deploy something that isn’t NixOS.
But while NixOS is incredibly useful for development / deployment, it’s unlikely to be useful as a desktop OS. Eliminating state for a personal desktop machine is nowhere near as critical as it is for servers, and nix has a pretty poor desktop-specific package selection (e.g. still no Gnome3 packages). Update (04/01/2015): Multiple people have pointed out that Gnome3 is packaged and works fine, so that was a bad example ;)
MirageOS is not your standard OS. A mirage binary is basically your application code statically linked against an OS kernel, as one big blob, called a “unikernel”.
This sounds like a crazy idea, but think about it. Many VMs these days on cloud providers run just one service, for isolation and other reasons. So you have the linux kernel, with a full multi-user multi-tasking stack, a host of installed services and only one job (your app). That’s hundreds of megabytes of code, before you even factor in your own software. The amount of incidental complexity you could avoid by cutting all of that out and just linking your app directly against a massively simpler kernel is pretty astounding.
Of course, a massively simpler system like this has limitations. MirageOS only runs OCaml code. You can’t run multiple processes (but you can use a cooperative event-loop library like lwt to perform concurrent tasks in the one process). You don’t even necessarily have a disk (but you can set one up if you need persistent storage).
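To make "concurrent tasks in the one process" a bit more concrete, here is a minimal sketch of cooperative scheduling. It uses Python's asyncio purely as a stand-in (it is not lwt, and MirageOS itself only runs OCaml); the worker names and step counts are made up for illustration.

```python
import asyncio


async def worker(name, steps, log):
    """Do `steps` units of work, yielding to the event loop after each one."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        # Explicit yield point: a task that never awaits would starve the others,
        # because nothing preempts it.
        await asyncio.sleep(0)


async def main():
    log = []
    # Both workers share a single thread in a single process;
    # the event loop switches between them only at await points.
    await asyncio.gather(worker("a", 2, log), worker("b", 2, log))
    return log


def run_demo():
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_demo())
```

The entries from the two workers end up interleaved in the log even though there is only ever one thread of execution, which is the same trade an event-loop library makes: cheap concurrency, as long as every task gives control back voluntarily.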
But if your application can fit in that model, the benefits are pretty exciting. Your entire OS is stateless. Your deployment process is literally just stopping one VM and starting another. And that may sound expensive, but these unikernels are in the order of tens of kilobytes - they can start up quicker than a docker container. You get the benefits of OCaml’s excellent type system and memory safety across your entire OS (no buffer overflows), along with an incredibly small attack surface (no random binaries or C libraries that were written before the internet with the belief that there is no such thing as malicious input). If there’s one thing this year has taught me, it’s that just because code is old and widely used, doesn’t mean it can’t be terribly insecure.
Obviously, writing code in OCaml doesn’t implicitly fix security bugs (aside from memory safety bugs, which is nothing to sneeze at). But the most efficient and least buggy code is code which doesn’t exist, and MirageOS can effectively trim off decades of crusty code, if you can work within its fairly strict requirements. To be honest, I haven’t actually used it myself - I’m keen, but given the restrictions I haven’t yet had anything appropriate to try it out with.
The complete lack of isolation in everyday computing is incredibly alarming. I have many things that I do on and offline - programming, work, banking, playing with new programs and tools, building software, playing games. I would be much more comfortable if (for example) some fun little game I’m trying out were not given full access to my entire user account, which could fairly trivially compromise all of the above without actually needing to subvert any security measures. Obviously when trying out suspicious software I’ll do it in a VM or with an unprivileged user account, but that’s a lot of work, and is very inconvenient.
From what I’ve seen, Qubes could be a much more convenient approach to at least maintaining some walls between activities that clearly have no business interacting with each other (e.g online banking and playing games). On the downside, I believe it comes at a fairly hefty performance (particularly RAM) cost, doesn’t provide the 3D acceleration required for gaming, and doesn’t allow much choice when it comes to the window manager (I use gnome-shell with a tiling window plugin, while Qubes uses KDE which I don’t much care for). While criticizing the window manager of a security oriented OS is clearly missing the point, it’s still going to put me off using it as my primary OS.
So despite the fact that I haven’t used Qubes, I’m hopeful that one day it could be convenient (and efficient) enough to provide vastly better security than we currently put up with.
While Qubes is A Thing That Could Work Right Now (with some annoyances), Genode feels like a thing that could be truly amazing in a handful of years’ time. And once it is generally useful, it would hopefully supersede the current half-measures like Qubes’s VM-based separation (which is not a slight on Qubes; it’s clearly more practical right now).
While Qubes relies on the user organising their actions into explicit categories (“work”, “games”, etc.), Genode is instead a capability-based model. The basic idea here is that instead of having ambient authority like a regular OS (things like a hierarchical file system, network stack or inter-process-communication) which any running process can access using well-known methods, a process in a capability-based system can access only the resources that are explicitly passed to it. It’s kind of like in programming, where instead of passing a file path around and allowing anyone to access any file they wish, you might pass a very restrictive File object instead. Except that in this analogy, it would be impossible to access the filesystem outside of individual references passed to you, making it very explicit which files a procedure can access. Doing this for anything involving authority allows you to keep processes isolated on a very granular level, by only providing them capabilities to the services / powers they actually need, rather than trying to design system-wide security policies like current OSes do.
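The programming analogy can be sketched in a few lines of Python. This is only an illustration of the two styles, not anything from Genode: the function names and the `ReadOnlyLines` wrapper are hypothetical, and Python itself does nothing to stop the second function from calling `open()` anyway, which is exactly the enforcement a capability-based OS would add.

```python
import io


def count_lines_ambient(path):
    """Ambient authority: given any path string, this code can open
    whatever file the whole process is allowed to reach."""
    with open(path) as f:
        return sum(1 for _ in f)


class ReadOnlyLines:
    """Hypothetical restrictive handle: exposes line iteration over one
    stream and nothing else (no write, no seek, no path)."""

    def __init__(self, stream):
        self._stream = stream

    def __iter__(self):
        return iter(self._stream)


def count_lines_capability(lines):
    """Capability style: the caller decides which single resource is visible
    by passing in a handle to it."""
    return sum(1 for _ in lines)


if __name__ == "__main__":
    handle = ReadOnlyLines(io.StringIO("one\ntwo\nthree\n"))
    print(count_lines_capability(handle))
```

Reading the call site is enough to know what `count_lines_capability` can touch, whereas with `count_lines_ambient` you have to trust that the path it was given is the only one it opens.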
I’ve been following this for a while, but it’s a fairly low-level project compared to where I usually spend my time, so it’s hard for me to do much with. I don’t have much love for C++ (in which the entire OS is built), and releases are still featuring low level things like improved USB & networking stacks, filesystem drivers, etc. So while I’m very interested to see where the OS goes, it’s only from the sidelines, as I can’t really do much with it myself.
Honourable mention: Sandstorm.io
Sandstorm normalizes this approach, and provides a way for users to easily run their own web services under their own control, completely sandboxed from each other and the rest of the user’s computer. It’s an interesting direction, and provides a path for users to take control of some hosted services (e.g for things like self-hosted RSS apps). But anything that’s not single-user is going to have to be heavily federated in order to work with sandstorm’s model, and I don’t think that’s terribly likely (especially if you’re dealing with federated “servers” that disappear when a user suspends their computer).
Interestingly, the core model of sandstorm (running isolated web services) is also something that I think could be completely superseded by genode, if it were to take off as a general-purpose OS. Presumably that’s a long-to-infinite time away, though.
And hey, they’re talking about capability-based security for web applications, and that would definitely be an interesting development if it took off.
Less-honourable mention: Docker
I want to love docker, I really do. I was very excited when it was first announced. I’ve long been a fan of the underlying LXC technologies it uses for isolation. But it’s pretty clear now that its features are aimed just a little too far away from what I would actually want.
I do use it. But never in the way that docker seems to want me to, which is a little awkward. The main push of docker seems to be for completely self-contained applications, basically a super cheap and consistent VM. But VMs are a hack, and docker in many ways is just as hacky. Why would I want an operating system (e.g. RHEL) on my host, and a different operating system (perhaps Ubuntu) inside the docker container? I can’t run systemd inside docker [1], so all of the Nice Things that you get with systemd need to be replaced with fairly weak alternatives (like supervisord). And unlike nix, docker’s caching is optimistic (a.k.a “wrong”). You’ll rarely ever get the same actual result with the same build inputs, because docker containers rely on doing very time-relevant things like installing or updating to the “latest version” of some package. And making sure important security updates are applied to a docker container is often even harder than it is for a VM.
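The difference between the two caching styles can be shown with a toy model. To be clear, this is not how Docker or nix are implemented; it is a sketch under the assumption that one cache is keyed only on the text of a build step, while the other is keyed on the step plus every pinned input. All names here (`step_text_key`, `full_input_key`, the `pkg` package) are made up.

```python
import hashlib


def step_text_key(command):
    """Cache key derived only from the text of the build step."""
    return hashlib.sha256(command.encode()).hexdigest()


def full_input_key(command, inputs):
    """Cache key derived from the step text plus every pinned input."""
    h = hashlib.sha256(command.encode())
    for name in sorted(inputs):
        h.update(f"\0{name}={inputs[name]}".encode())
    return h.hexdigest()


def cached_build(cache, key, build):
    """Return the cached result for `key`, building only on a miss."""
    if key not in cache:
        cache[key] = build()
    return cache[key]


def demo():
    command = "install pkg"  # hypothetical build step that fetches "latest"
    text_cache, input_cache = {}, {}
    results = []
    # Upstream publishes a new version between the first and second build.
    for latest in ("1.0", "1.1"):
        build = lambda v=latest: f"pkg-{v}"
        optimistic = cached_build(text_cache, step_text_key(command), build)
        pinned = cached_build(
            input_cache, full_input_key(command, {"pkg": latest}), build
        )
        results.append((optimistic, pinned))
    return results


if __name__ == "__main__":
    print(demo())
```

On the second build the step text is unchanged, so the text-keyed cache happily hands back the old `pkg-1.0`, while the input-keyed cache sees a different key and rebuilds. The flip side is that the text-keyed result depends on when the cache happened to be filled, which is the "same inputs, different result" problem described above.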
And then because you need a lot of containers, suddenly you need cluster management on top of your docker containers. I’ve looked at a few of these, and they’re not really appealing to me. It’s kind of like running OpenStack - a pretty huge amount of additional effort, resources and (not entirely bug-free) code which in most small deployments will just cause more hassle than it saves.
NixOS has some container functionality. But because it’s built on NixOS itself, almost all of what docker provides is completely unnecessary - specifying a container is just like specifying the OS, because that’s already completely stateless and declarative. You don’t need a way to build a root filesystem, because NixOS already does that. You don’t need a way to cache the results of a build and overlay them, because NixOS already does that (but without needing the overlay part). And you don’t need special tricks to apply security updates - the host and container are running the same OS; rebuilding the host also rebuilds the container (but without any actual duplication, thanks to nix’s pervasive caching).
So while I’ll still use docker for development (e.g. a cheap way to test software on an Ubuntu-like environment), I’m no longer excited about where Docker is going.
[1] Update (06/01/2015): It’s been pointed out in the comments that you can run systemd inside docker. I tried and failed in the past, but I think things have gotten better since.