2018-01-24 07:36+0100, Martin Schwidefsky:
> On Tue, 23 Jan 2018 21:32:24 +0100
> Radim Krčmář <rkrcmar@xxxxxxxxxx> wrote:
>
> > 2018-01-23 15:21+0100, Christian Borntraeger:
> > > Paolo, Radim,
> > >
> > > this patch not only allows to isolate a userspace process, it also allows us
> > > to add a new interface for KVM that would allow us to isolate a KVM guest CPU
> > > to no longer being able to inject branches in any host or other guests. (while
> > > at the same time QEMU and host kernel can run with full power).
> > > We just have to set the TIF bit TIF_ISOLATE_BP_GUEST for the thread that runs a
> > > given CPU. This would certainly be an addon patch on top of this patch at a later
> > > point in time.
> >
> > I think that the default should be secure, so userspace will be
> > breaking the isolation instead of setting it up and having just one
> > place to screw up would be better -- the prctl could decide which
> > isolation mode to pick.
>
> The prctl is one direction only. Once a task is "secured" there is no way back.

Good point, I was thinking of reversing the direction and having
TIF_NOT_ISOLATE_BP_GUEST prctl, but allowing tasks to subvert security
would be even worse.

> If we start with a default of secure then *all* tasks will run with limited
> branch prediction.

Right, because all of them are untrusted.  What is the performance
impact of BP isolation?

This design seems very fragile to me -- we're forcing userspace to care
about some arcane hardware implementation and isolation in the system is
broken if a task running malicious code doesn't do that for any reason.

> > Maybe we can change the conditions and break logical connection between
> > TIF_ISOLATE_BP and TIF_ISOLATE_BP_GUEST, to make a separate KVM
> > interface useful.
>
> The thinking here is that you use TIF_ISOLATE_BP to make user space secure,
> but you need to close the loophole that you can use a KVM guest to get out of
> the secured mode. That is why you need to run the guest with isolated BP if
> TIF_ISOLATE_BP is set. But if you want to run qemu as always and only the
> KVM guest with isolated BP you need a second bit, thus TIF_ISOLATE_GUEST_BP.

I understand, I was following the misguided idea where we have reversed
logic and then use just TIF_NOT_ISOLATE_GUEST_BP for sie switches.

> > > Do you think something similar would be useful for other architectures as well?
> >
> > It goes against my idea of virtualization, but there probably are users
> > that don't care about isolation and still use virtual machines ...
> > I expect most architectures to have a fairly similar resolution of
> > branch prediction leaks, so the idea should be easily abstractable on
> > all levels.  (At least x86 is.)
>
> Yes.
>
> > > In that case we should try to come up with a cross-architecture interface to enable
> > > that.
> >
> > Makes me think of a generic VM control "prefer performance over
> > security", which would also take care of future problems and let arches
> > decide what is worth the code.
>
> VM as in virtual machine or VM as in virtual memory?

Virtual machine.  (But could be anywhere really, especially the
kernel/user split slowed applications down for too long already. :])

> > A main drawback is that this will introduce dynamic branches to the
> > code, which are going to slow down the common case to speed up a niche.
>
> Where would you place these additional branches? I don't quite get the idea.

The BP* macros contain a branch in them -- avoidable if we only had
isolated virtual machines.

Thanks.
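
For reference, the two-bit logic discussed in this thread can be sketched
roughly as follows.  This is only an illustration of the semantics as I
read them, not code from the patch: the flag names are taken from the
thread, while the bit positions, struct task, and all helper names are
hypothetical.

```c
#include <stdbool.h>

/* Flag names come from the thread; the bit positions are hypothetical. */
#define TIF_ISOLATE_BP        (1UL << 0)  /* task's user space runs with isolated BP */
#define TIF_ISOLATE_BP_GUEST  (1UL << 1)  /* only the KVM guest runs with isolated BP */

/* Hypothetical stand-in for the per-thread flag word. */
struct task {
	unsigned long flags;
};

/*
 * One direction only: the bits can be set, and there is deliberately
 * no helper that clears them again.
 */
static inline void task_isolate_bp(struct task *t)
{
	t->flags |= TIF_ISOLATE_BP;
}

static inline void task_isolate_bp_guest(struct task *t)
{
	t->flags |= TIF_ISOLATE_BP_GUEST;
}

/* The task's own user-space code is isolated only with TIF_ISOLATE_BP. */
static inline bool user_runs_isolated(const struct task *t)
{
	return (t->flags & TIF_ISOLATE_BP) != 0;
}

/*
 * The guest is isolated if either bit is set, so a secured task cannot
 * use a KVM guest as a loophole to leave the secured mode.
 */
static inline bool guest_runs_isolated(const struct task *t)
{
	return (t->flags & (TIF_ISOLATE_BP | TIF_ISOLATE_BP_GUEST)) != 0;
}
```

With only TIF_ISOLATE_BP_GUEST set, qemu itself keeps full branch
prediction while the guest runs isolated; with TIF_ISOLATE_BP set, both
the task and any guest it runs are isolated.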