+arm64 KVM folks

On Mon, Nov 15, 2021, Marc Orr wrote:
> On Mon, Nov 15, 2021 at 10:26 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> >
> > On Mon, Nov 15, 2021, Dr. David Alan Gilbert wrote:
> > > * Sean Christopherson (seanjc@xxxxxxxxxx) wrote:
> > > > On Fri, Nov 12, 2021, Borislav Petkov wrote:
> > > > > On Fri, Nov 12, 2021 at 09:59:46AM -0800, Dave Hansen wrote:
> > > > > > Or, is there some mechanism that prevent guest-private memory from being
> > > > > > accessed in random host kernel code?
> > > >
> > > > Or random host userspace code...
> > > >
> > > > > So I'm currently under the impression that random host->guest accesses
> > > > > should not happen if not previously agreed upon by both.
> > > >
> > > > Key word "should".
> > > >
> > > > > Because, as explained on IRC, if host touches a private guest page,
> > > > > whatever the host does to that page, the next time the guest runs, it'll
> > > > > get a #VC where it will see that that page doesn't belong to it anymore
> > > > > and then, out of paranoia, it will simply terminate to protect itself.
> > > > >
> > > > > So cloud providers should have an interest to prevent such random stray
> > > > > accesses if they wanna have guests. :)
> > > >
> > > > Yes, but IMO inducing a fault in the guest because of _host_ bug is wrong.
> > >
> > > Would it necessarily have been a host bug?  A guest telling the host a
> > > bad GPA to DMA into would trigger this wouldn't it?
> >
> > No, because as Andy pointed out, host userspace must already guard against a bad
> > GPA, i.e. this is just a variant of the guest telling the host to DMA to a GPA
> > that is completely bogus.  The shared vs. private behavior just means that when
> > host userspace is doing a GPA=>HVA lookup, it needs to incorporate the "shared"
> > state of the GPA.  If the host goes and DMAs into the completely wrong HVA=>PFN,
> > then that is a host bug; that the bug happened to be exploited by a buggy/malicious
> > guest doesn't change the fact that the host messed up.
>
> "If the host goes and DMAs into the completely wrong HVA=>PFN, then
> that is a host bug; that the bug happened to be exploited by a
> buggy/malicious guest doesn't change the fact that the host messed
> up."
> ^^^
> Again, I'm flabbergasted that you are arguing that it's OK for a guest
> to exploit a host bug to take down host-side processes or the host
> itself, either of which could bring down all other VMs on the machine.
>
> I'm going to repeat -- this is not OK! Period.

Huh?  At which point did I suggest it's ok to ship software with bugs?  Of
course it's not ok to introduce host bugs that let the guest crash the host
(or host processes).  But _if_ someone does ship buggy host software, it's
not like we can wave a magic wand and stop the guest from exploiting the bug.
That's why such bugs are a big deal.

Yes, in this case a very specific flavor of host userspace bug could be
morphed into a guest exception, but as mentioned ad nauseam, _if_ host
userspace has a bug where it does not properly validate a GPA=>HVA
translation, then any such bug exists and is exploitable today irrespective
of SNP.

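To make the "incorporate the shared state of the GPA" point concrete, here is
a minimal sketch of the userspace-side check, assuming a hypothetical VMM that
tracks per-page shared/private state in a bitmap.  The struct and helper names
below are made up for illustration and are not taken from any real VMM:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: a real VMM tracks this per memslot and per page. */
struct memslot {
	uint64_t gpa_base;
	uint64_t npages;
	uint8_t *hva_base;
	uint64_t *shared_bitmap;	/* one bit per 4KiB guest page */
};

static bool gpa_range_is_shared(const struct memslot *slot, uint64_t gpa,
				size_t len)
{
	uint64_t first = (gpa - slot->gpa_base) >> 12;
	uint64_t last = (gpa + len - 1 - slot->gpa_base) >> 12;

	for (uint64_t i = first; i <= last; i++) {
		if (!(slot->shared_bitmap[i / 64] & (1ull << (i % 64))))
			return false;
	}
	return true;
}

/*
 * Translate a guest-provided GPA into an HVA for a host-initiated access
 * (e.g. emulated DMA).  Refuse a GPA that is out of bounds *or* one that the
 * guest has not converted to shared; either way the host never dereferences
 * a guest-controlled address that lands in private memory.
 */
static void *gpa_to_hva_for_dma(const struct memslot *slot, uint64_t gpa,
				size_t len)
{
	uint64_t slot_size = slot->npages << 12;

	if (!len || gpa + len < gpa ||		/* zero length or overflow */
	    gpa < slot->gpa_base ||
	    gpa + len > slot->gpa_base + slot_size)
		return NULL;			/* bogus GPA */

	if (!gpa_range_is_shared(slot, gpa, len))
		return NULL;			/* private GPA */

	return slot->hva_base + (gpa - slot->gpa_base);
}

The exact bookkeeping (bitmap, memory attributes, separate memslots, etc.)
doesn't matter; the point is that the GPA=>HVA translation fails closed for
private GPAs, exactly as it already must fail for GPAs that aren't backed at
all.
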
> Again, if the community wants to layer some orchestration scheme
> between host userspace, host kernel, and guest, on top of the code to
> inject the #VC into the guest, that's fine. This proposal is not
> stopping that. In fact, the two approaches are completely orthogonal
> and compatible.
>
> But so far I have heard zero reasons why injecting a #VC into the
> guest is wrong. Other than just stating that it's wrong.

It creates a new attack surface, e.g. if the guest mishandles the #VC and
does PVALIDATE on memory that it previously accepted, then userspace can
attack the guest by accessing guest private memory to coerce the guest into
consuming corrupted data (see the sketch appended at the end of this mail).

> Again, the guest must be able to detect buggy and malicious host-side
> writes to private memory. Or else "confidential computing" doesn't
> work.

That assertion assumes the host _hypervisor_ is untrusted, which does not
hold true for all use cases.  The Cc'd arm64 folks are working on a protected
VM model where the host kernel at large is untrusted, but the "hypervisor"
(KVM plus a few other bits) is still trusted by the guest.  Because the
hypervisor is trusted, the guest doesn't need to be hardened against event
injection attacks from the host.

Note, SNP already has a similar concept in its VMPLs.  VMPL3 runs a
confidential VM that is not hardened in any way, and fully trusts VMPL0 to
not inject bogus faults.

And along the lines of arm64's pKVM, I would very much like to get KVM to a
point where it can remove host userspace from the guest's TCB without relying
on hardware.  Anything that can be done in hardware absolutely can be done in
the kernel, and likely can be done with significantly less performance
overhead.

Confidential computing is not a binary thing where the only valid use case is
removing the host kernel from the TCB and trusting only hardware.  There are
undoubtedly use cases where trusting the host kernel but not host userspace
brings tangible value, but where the overhead of TDX/SNP to get the host
kernel out of the TCB is the wrong tradeoff between performance and security.

> Assuming that's not true is not a valid argument to dismiss
> injecting a #VC exception into the guest.

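To illustrate the guest-side hardening implied by the "new attack surface"
point above, here is a minimal sketch of a #VC handler path that refuses to
re-PVALIDATE a page the guest already accepted.  The bitmap and the
pvalidate_page()/guest_request_termination() helpers are hypothetical
stand-ins, not the names used by any real guest implementation:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guest-side state and helpers, declared only for illustration. */
extern uint64_t snp_validated_bitmap[];		/* one bit per 4KiB guest page */
extern void pvalidate_page(uint64_t gpa);	/* wrapper around PVALIDATE */
extern void mark_page_validated(uint64_t gpa);
extern void guest_request_termination(void);	/* e.g. via the GHCB protocol */

static bool page_already_validated(uint64_t gpa)
{
	uint64_t pfn = gpa >> 12;

	return snp_validated_bitmap[pfn / 64] & (1ull << (pfn % 64));
}

/* Called from the guest's #VC handler for a "page not validated" event. */
static void handle_page_not_validated(uint64_t gpa)
{
	/*
	 * If this page was already accepted, the only way to get here is host
	 * (or hardware) misbehavior.  Re-validating would let the host swap
	 * the backing page and feed the guest corrupted data, so treat the
	 * event as fatal instead of "fixing it up".
	 */
	if (page_already_validated(gpa)) {
		guest_request_termination();
		return;
	}

	pvalidate_page(gpa);
	mark_page_validated(gpa);
}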