There are two primary strengths that gVisor provides over the seccomp model, the second of which you've actually alluded to above.
1. Layered security
While seccomp lets users limit the kernel's attack surface, the application still interacts with the kernel directly, and a single bug in any allowed system call can lead to compromise. One of the design principles of gVisor is that no single bug should allow compromise of the host system or user data.
By intercepting and handling all application system calls, the gVisor kernel is the first layer of defense against the application. The gVisor kernel also runs inside its own seccomp sandbox as a second layer of defense, so even if the application escalates privileges into the gVisor kernel, its attack surface on the host is still limited.
The gVisor kernel's seccomp policy [1] allows far fewer system calls than gVisor implements for applications. For example, note that "open" and friends are not allowed at all. File system access is mediated by an external agent [2] which does not trust the gVisor kernel, so even a compromised gVisor kernel has no elevated file system access.
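To make the two layers concrete, here is a minimal Go sketch of the idea. It is a toy model, not gVisor code: the function names (guestCall, hostCall) and both syscall sets are made up for illustration, and the real policy is the one linked in [1].

```go
package main

import "fmt"

// NOTE: everything here is hypothetical and for illustration only.
// These sets are not gVisor's actual syscall tables or seccomp policy.

// guestImplemented models the broad set of syscalls the sandbox kernel
// implements on behalf of the application (first layer).
var guestImplemented = map[string]bool{
	"read": true, "write": true, "open": true, "stat": true, "futex": true,
}

// hostAllowlist models the much narrower seccomp policy applied to the
// sandbox kernel itself (second layer). "open" is deliberately absent.
var hostAllowlist = map[string]bool{
	"read": true, "write": true, "futex": true,
}

// guestCall reports how an application syscall is handled. The call never
// reaches the host directly; the sandbox kernel services it.
func guestCall(name string) string {
	if guestImplemented[name] {
		return "handled in sandbox kernel"
	}
	return "unimplemented"
}

// hostCall reports what happens when the sandbox kernel, possibly
// compromised, issues a syscall to the host.
func hostCall(name string) string {
	if hostAllowlist[name] {
		return "allowed"
	}
	return "blocked by seccomp"
}

func main() {
	fmt.Println("application open:", guestCall("open"))
	fmt.Println("sandbox kernel open on host:", hostCall("open"))
}
```

The point of the sketch is that the application can still call "open" and have it work, while a compromised sandbox kernel issuing "open" against the host is rejected by the second layer.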
2. Ease of use
> > Kernel features like seccomp filters can provide better isolation between the application and host kernel, but they require the user to create a predefined whitelist of system calls.
> Isn't that something you'd effectively have to do anyway if you want a sandbox?
This is something we'd like to challenge with gVisor. gVisor intends to be "secure by default" and configuration-free to the largest extent possible.
gVisor runs and sandboxes arbitrary, unmodified Linux binaries. You don't need to specify a sandbox policy because gVisor safely implements the entire Linux API [3].
Building a sandbox policy can be difficult and time-consuming. It can also be a maintenance burden to keep the policy updated as the application changes over time, especially if you've modified the application to reduce its syscall surface. Additionally, some use cases require sandboxing arbitrary workloads, for which a sandbox policy cannot be defined in advance.
With gVisor, we hope to remove this painful step in sandboxing and enable developers to easily sandbox their workloads.
The former is a set of kernel libraries derived from NetBSD, and the latter is a unikernel built on the former. gVisor is different in a couple of ways: 1) gVisor is written from scratch in Go for its memory and type safety; 2) gVisor aims to be compatible with Linux, which most people use. In theory, gVisor could be restructured as a unikernel, but we prefer to retain the ring privilege boundary for additional isolation. We are working on an academic paper which will have more details.
This is true: Go is not memory safe in the presence of data races, and data races are possible in safe Go.
But they're also generally easy to code-review out. There's definitely a huge difference between C and Go, regardless of this one caveat to Go's memory safety guarantees.
They aren't using single threaded Go from what I can see.
Data races are not easy to "code review out". That is contrary to decades of experience. All you have to do in Go to get a race is to close over a for loop induction variable in a goroutine.
There is not a large difference between C and Go here. In fact, races might be easier in Go than in C, because it's easier for goroutines to close over mutable variables.
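For anyone who wants to see it, here is a minimal sketch of the kind of race being described. It is an illustration, not code from gVisor, and the function names are hypothetical. It captures a shared counter rather than the loop variable itself, since Go 1.22 and later give each loop iteration its own copy of the loop variable.

```go
package main

import (
	"fmt"
	"sync"
)

// racyCount increments a shared counter from n goroutines with no
// synchronization. Each closure captures total by reference, so the
// concurrent read-modify-write is a data race and updates can be lost.
func racyCount(n int) int {
	var wg sync.WaitGroup
	total := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			total++ // unsynchronized write to a captured variable
		}()
	}
	wg.Wait()
	return total
}

// safeCount does the same work but guards the counter with a mutex,
// so every increment is observed and the result is always n.
func safeCount(n int) int {
	var (
		wg sync.WaitGroup
		mu sync.Mutex
	)
	total := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			total++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return total
}

func main() {
	fmt.Println("racy:", racyCount(1000)) // may be less than 1000
	fmt.Println("safe:", safeCount(1000))
}
```

The two functions differ by three lines, which is part of why this is hard to catch by reading code. Running the program with `go run -race` should flag the first function and stay quiet about the second.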
> I haven't really seen this as a big problem in Go.
Go certainly does have problems with data races all the time. Just Google for "golang data race": you'll find many blog posts explaining common data race gotchas in Go.
> Regardless, I think there's a world of difference between C and Go when it comes to memory safety.
(Note: I work on gVisor.)
[1] https://github.com/google/gvisor/blob/master/runsc/boot/filt...
[2] https://github.com/google/gvisor#file-system-access
[3] Note we don't technically fully implement Linux, as work is ongoing, but missing features are simply unimplemented, not left out for security reasons. See https://github.com/google/gvisor#will-my-container-work-with...