I'm an embedded guy by trade, so the idea of a Unikernel is nothing new to me. But wait... use cases overlapping with general-purpose OSes? nginx benchmarks??? This is exciting.
I know DevOps for bare-metal firmware is a PITA partly because of the tightly-coupled application, kernel, and libraries. I'm hoping someone familiar with Unikraft/OSv/etc could sate my curiosity...
- Do you test your app inside a container before building your Unikraft/OSv image? Or is there a way to create a CI/CD pipeline that builds your unikernel executable + tests the whole thing as a compiled unit?
- How often do bugs appear in Unikraft that don't appear when running the app on a traditional OS? To what extent does the complexity of the app's dependencies affect this?
- In terms of convenience, how does Unikraft/OSv compare to using a highly-customizable general-purpose* OS like Gentoo?
(*edit for clarity: "general-purpose OS" in the sense that it 1) can load arbitrary data using one or more filesystems 2) can execute loaded data as a program 3) has means by which a human or human-controlled machine may cause the OS to load and execute said programs. This definition does not exclude highly-specialized Gentoo/Nix/whatever setups that are tailored to run a particular program)
1. We build unikernels using the 'kraft container', which is a Docker/OCI image[0][1] that has the build tools needed to build Unikraft unikernels. We plug this into Concourse CI, which builds thousands of combinations of unikernels[2] as part of our code review process[3]. In addition, we have ongoing research and tooling to help automatically discover permutations of unikernel builds[4]. We then run the unikernel natively on the target platform/hardware combination or under QEMU emulation.
2. Really great question, but mostly you can expect an application to behave the same when it runs as a unikernel, because the application "thinks" it's still running in a traditional OS environment -- as it should. Check out this documentation[5] (after step 7) about porting; it has snippets showing where the boundary sometimes breaks. Even then, you can run it in POSIX-compatibility mode[6].
3. Well, general-purpose OSes are not suited for deployment environments (which is what Unikraft is suited for). Installing Gentoo (or Ubuntu or Debian, for that matter) is a waste of resources if you only SSH in once to install your desired application.
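To make the "thousands of combinations" in point 1 more concrete, here is a minimal sketch of how a CI job could enumerate a build matrix. The axis names and values are hypothetical placeholders, not Unikraft's actual configuration options; the only point is that a handful of independent choices multiplies quickly.

```python
from itertools import product

# Hypothetical build axes: illustrative names only, not real Unikraft options.
BUILD_AXES = {
    "arch": ["x86_64", "arm64"],
    "platform": ["kvm", "xen", "linuxu"],
    "allocator": ["buddy", "tlsf"],
    "scheduler": ["coop", "preempt"],
}


def build_matrix(axes):
    """Return one dict per combination of the given axes."""
    names = list(axes)
    return [
        dict(zip(names, combo))
        for combo in product(*(axes[name] for name in names))
    ]


matrix = build_matrix(BUILD_AXES)
print(len(matrix))  # 2 * 3 * 2 * 2 = 24 configurations
```

Each entry in `matrix` could then be handed to one CI task that builds and boots that configuration; how that task invokes the build tooling is left out here on purpose.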
I edited my comment to clarify this, but I meant "general-purpose OS" in a very broad sense - i.e. not much more than "can it load, execute, and time-share between arbitrary programs loaded from an attached storage device."
The application code is compiled together with the kernel code, so you can think of a unikernel as a single-process VM: nothing runs other than the application, so it boots straight to `main()`. Unikraft just provides the runtime that lets the application run as a VM or on bare metal. There's no shell, so you can't launch another program from disk. If the application wishes to read from and write to attached storage, it can, but it can't start another process from code it reads there. Starting another process is trickier since there is no `fork()`. Interesting work is being done, however, to enable multi-threading across cores via SMP[0] and to provide a fork-like ability scoped to the application's own logic[1], though not in a wider general multi-processing environment. I hope this clarifies things.
I'm familiar with unikernels as a concept - heck, I wrote (a bad) one for a research project during my undergrad with embedded systems and softcore processors as the target. It's exciting to see unikernels being used with software of a similar scale, but with a much more broad solution space.
We're working hard on adding SMP support to Unikraft which is planned for the next release. The PR you have linked has all the details about the on-going work!
From my fairly naive POV it seems like unikernels are the next logical step in computing. Docker was the last jump, and unikernels are set to be the next, with some form of WASM as a host.
They feel a bit more orthogonal to me, given that unikernels are about sharing a VM hypervisor, whereas containers are about safely sharing a Linux kernel.
They're solving similar problems in terms of reducing image size, startup time, security surface area, and so on. But the mechanism is quite different, so it feels like basically none of the tooling will translate. Like, where is the Kubernetes of unikernel deployments? It would have to be built from scratch, and would probably end up looking more like Terraform than like the Kubernetes of today.
I went with OSv (another unikernel) for a previous pet project and, while I really loved the concept, I found the tooling to be immature. This project’s tooling and documentation do look better, so I look forward to trying it out.
One thing I find missing with these unikernels though is IPSec support and Firewalls. I’d love to throw a unikernel image on DigitalOcean and have a secure software-defined IPSec tunnel.
out of real curiosity - what would be the point of a firewall in a unikernel image? I mean presumably it's to stop people from opening random ports. but if you bundle the application and you don't support a shell or forking processes in general then the only bound ports are those which the application explicitly opens.
so what value in requiring someone to run around the interface and open it in the firewall as well?
You might want to let your metrics system scrape a private endpoint published on a different port. Or you might have a management task that you want to restrict to your internal network. That's possible if you delegate to network hardware, but sometimes those asks are a PITA.
I've recently become thoroughly convinced of the merits of consolidating onto as few machines (physical or virtual) as possible. One reason is that I recently consolidated some of my company's infrastructure onto a single bare-metal server to reduce costs. And then in the middle of that, this post came out:
It seems to me that running lots of small VMs with unikernels is inherently wasteful compared to running many processes on a single machine with a shared kernel that can make optimal use of the machine's resources. Sure, the unikernel-based VMs can be smaller than equivalent Linux VMs, but one still has to allocate a fixed amount of RAM, storage, and (for public cloud platforms) CPU to each VM. We inevitably add some padding to those allocations to ensure that we have headroom, and the total probably adds up to more than we would need to allocate to a single machine (physical or virtual) running all of those processes on a single kernel. And on public cloud platforms, we have to pay for those padded resource allocations.
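A rough way to see the padding argument: if each VM has to be sized for its own peak, the total is the sum of the peaks, whereas a shared machine only has to cover the peak of the combined load. The sketch below uses made-up memory figures for three hypothetical services; it illustrates the arithmetic and is not a measurement of anything.

```python
# Hypothetical memory usage (MB) of three services at four sample times.
usage = {
    "web":   [200, 800, 300, 250],
    "db":    [600, 500, 900, 550],
    "queue": [100, 150, 120, 700],
}


def separate_vms_total(usage):
    """Each VM is sized for its own peak, so the peaks add up."""
    return sum(max(samples) for samples in usage.values())


def shared_machine_total(usage):
    """One machine is sized for the peak of the combined usage."""
    return max(sum(at_time) for at_time in zip(*usage.values()))


print(separate_vms_total(usage))    # 800 + 900 + 700 = 2400
print(shared_machine_total(usage))  # max(900, 1450, 1320, 1500) = 1500
```

The gap only exists when the peaks do not coincide, and it ignores the fixed per-VM overhead and any headroom added on top, so the real difference depends heavily on the workload.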
I've certainly done deployments with lots of small Linux VMs in the past; in my recent migration process, I was replacing such a setup with one big box. Creating lots of small VMs is certainly a convenient and robust way to independently deploy and update several components. But it's obviously not the only way.
The home page of the forthcoming Unikraft Cloud service says, "The cloud is essential to your business but you know you are overpaying." But I think a better answer is to consolidate onto a few big VMs, using container orchestration to keep deployment manageable.
It really depends on your use case here. If the many application processes rely on the same OS libraries, versioned language runtimes (e.g. the same Python version), kernel version, etc. AND you trust the OS, then it may make sense. However, unikernels offer the light weight of a container process with the security of a VM. In addition, memory ballooning[0][1] and other resource-elastic features are available to VMs too, allowing you to under-provision them and then increase resources later when load demands it.
Unikraft unikernels can also be managed using the same orchestration tools as containers; check out the talk at CNCF[2].
> memory ballooning[0][1] and other resource-elastic features are available to VMs too
If I'm using a public cloud platform's hypervisor, those features may benefit the cloud provider, but not me. Or are you targeting users running a hypervisor on bare metal?
I think it can benefit both parties -- under-provisioning for a smaller bill and then increasing resources when demand requires it, so as to prevent degradation in QoS. That said, if load increases exponentially it can incur high costs. Some cloud providers, like AWS, do not offer memory (or any other resource) ballooning, so there you will experience the over-provisioning problem you described. However, we aim to alleviate some of these problems, using the unique properties of unikernels, with our soon-to-be-released Cloud Platform at https://unikraft.io. It will offer features like memory ballooning (with hard upper limits) and deep in-kernel monitoring to understand application performance.
Definitely agree, in fact we tend to use the term "massive consolidation", where we run thousands of VMs on a single server, thus saving costs. Unikraft unikernels are a perfect fit for this since they consume little memory and boot relatively fast. In early work [0] we were booting as many as 8000 hello world VMs/unikernels on the same server on the Xen hypervisor. More recently we have been booting 1K NGINX Unikraft images on a single server.
Sort of sounds like this is a good tool to pair with things like firecracker? The goal with most of these ( aside security ) is a better ability to (bin?) pack these small "blocks" into a larger "box".
Tradeoffs between actual usage costs and devops time cost, imo
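For what "packing small blocks into a larger box" can look like, here is a minimal sketch of first-fit decreasing, a common bin-packing heuristic. It is only an illustration: the guest sizes and host capacity are invented, and nothing here reflects how Firecracker or Unikraft actually place VMs.

```python
def first_fit_decreasing(sizes, capacity):
    """Pack sizes into bins of the given capacity, largest items first."""
    bins = []  # each bin is a list of the sizes placed in it
    for size in sorted(sizes, reverse=True):
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds capacity {capacity}")
        for current in bins:
            if sum(current) + size <= capacity:
                current.append(size)  # fits in an existing bin
                break
        else:
            bins.append([size])  # no existing bin had room, open a new one
    return bins


# Hypothetical guest memory sizes (MB) packed onto hosts with 128 MB free.
placement = first_fit_decreasing([10, 64, 10, 128, 32, 10], capacity=128)
print(placement)  # [[128], [64, 32, 10, 10, 10]]
```

Smaller guests leave less stranded capacity per host, which is the packing benefit being discussed; a real scheduler would also have to account for CPU, storage, and per-VM overhead.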
Yes, in fact we have some early support for Firecracker, where we can at least boot some basic Unikraft images with it (e.g., see page 10 of this paper[0], Figure 10, where we get the shortest boot times with Firecracker). We're still missing networking support on FC, which we're working on.
This is an exciting project, congratulations. I'm looking forward to the docs on embedded usage, and also which languages are supported and how to configure them. For now there seems to be quite a few unikraft/app* repos with such examples.
Agreed, we're working on an embedded page, and we intend to release the code during our May release. I also agree that we should have a page about the different languages we currently support (C/C++, Lua, Go, Python, Ruby) and others we're working on (e.g., Rust, Java). We're always interested to hear which languages and/or frameworks people are interested in.
Kernel becomes a library the application uses instead of something that jumps over the CPU context. Application runs as root with the rump kernel or unikernel linked in. Only portions of kernel actually used need to be present. System calls become function calls. Multitasking support provided by a threading library. You shouldn't run multiple applications in a unikernel.
If you have a number of servers running a hypervisor as a base OS, and your applications on its VMs are network-centric like web servers, load balancers, database services, or microservices, and you don't really use the user-level security of a traditional OS, this can enhance performance by eliminating the user-kernel CPU context switch and consume less RAM.
Everyone moved to containerize their code, and then security organizations in corporations started putting the brakes on that and pushing for a virtualization layer for additional isolation in multi-tenant environments. Since every container ends up being a virtual machine anyway, the only way to slim down is a unikernel.
>Unikraft has been extensively evaluated in terms of performance. Evaluations of using off-the-shelf applications on Unikraft results in a 1.7x-2.7x performance improvement compared to Linux guests. In addition, Unikraft images for these apps are around 1MB, require less than 10MB of RAM to run, and boot in around 1ms on top of the VMM time (total boot time 2ms-40ms).
Says it's secure, Github shows 76% of the code is in C. I see the word "secure" in a few places but it's just stated without any indication as to what about this makes it secure.
Unikraft is based on a small trusted compute base, meaning there is nothing else running with a unikernel, no ssh, no daemons, no Linux, etc.
Towards increasing security, however, we have just introduced native support for Rust[0] in Unikraft, paving the way for more internal libraries to be based on this secure and performant language.
Thanks, I think having a "read more" would be helpful. You do a good job of quickly demonstrating performance with some numbers, but there's nothing about security on there. I think it'd go a long way for people like me who are going to be immediately skeptical of software in C claiming to be safe.
Thanks for the feedback, we're in the process of adding a security section[0] which will go into more detail on the ongoing work, and we'll also work on adding more highlights to the main page.
I need to highlight we have separate research[1][2] which will make its way upstream soon which aims to provide hardening between internal libraries (e.g. isolating the network stack or scheduler) using gates like Intel MPK or separate hardware-accelerated services.
How does the system tolerate vulnerabilities outside the TCB? I thought unikernels often didn't have protections that would shield a TCB from app vulnerabilities.
Hi, no, the statement wasn't to isolate the kernel code from the application, since it's all in the same address space. Instead, it's to reduce the possibility of bugs (but again, not in the application), and reduce the vectors for attack in the underlying stack. For separating the application from the kernel (and from components within the kernel, since Unikraft is modular) we are doing further work called FlexOS, based on Unikraft, and to appear soon at the ASPLOS conference[0]; a short version of the paper appeared at HotOS [1].
what are you trying to protect the kernel for if it only hosts the single application? are you assuming that local root has some distinguished privilege outside this box?
Hi, support for the RPI and perhaps another device should be out by release 0.9 in May, along with the documentation at the link you posted (the code's working, but it needs clean-up).
Unikernels are interesting, but as long as people treat them like "linux without linux" they won't go far.
The real potential of unikernels comes from making apps that are more self-aware and take up some of the functions previously handled by linux (such as monitoring memory usage).
It works both ways with Unikraft: either bring an existing application and let it run with the added performance/security (while it thinks it's on Linux), or write your application with our performance-oriented APIs[0][1].
How does one store data with Unikraft? This is the problem I hit with other unikernel projects. OSv seemed to support ZFS or NFS somehow but I couldn't quite figure out the documentation. I can't find any references to storage at all for Unikraft.
Yes you're right, it's a simple network-like protocol allowing you to mount a path on the host OS to the Unikraft unikernel VM similar to a container volume. In addition, Unikraft's abstract APIs[0] allow for more block devices such as EXT{2..4}, etc. which you mount in a similar way. Alternatively, you can put your filesystem into a CPIO format and mount it as initram and load it into RAM (great for performance and read-only file systems, like webservers).
Hi @edsiper2, if you are running into any problems I'm happy to help, we can chat directly on the Unikraft Discord server: https://bit.ly/UnikraftDiscord
Running the server in the same address space as the (uni)kernel can have major impact on performance for I/O bound apps, cutting off system calls and context switching overhead.
dietPI is still Linux, and all the performance limitations that come with Linux are still there. They compare it with Alpine, which is much slimmer than dietPI, yet Alpine still loses in application performance by quite a margin.
I presume that's for particular benchmarks where syscall overhead is significant. Which is certainly true for some real world applications but not for others.
For a while nanovms was based on Rump[0], which had terrible performance. We haven't benchmarked the new version of nanovms[1], but we should; having said that, even the founder says they "[...] have spent very little time benchmarking" [2]