On Mon, Nov 15, 2021, Joerg Roedel wrote:
> On Sat, Nov 13, 2021 at 06:34:52PM +0000, Sean Christopherson wrote:
> > I'm not treating it nonchalantly, merely acknowledging that (a) some flavors of kernel
> > bugs (or hardware issues!) are inherently fatal to the system, and (b) crashing the
> > host may be preferable to continuing on in certain cases, e.g. if continuing on has a
> > high probability of corrupting guest data.
>
> The problem here is that for SNP host-side RMP faults it will often not
> be clear at fault-time if it was caused by wrong guest or host behavior.
>
> I agree with Marc that crashing the host is not the right thing to do in
> this situation. Instead debug data should be collected to do further
> post-mortem analysis.

Again, I am not saying that any RMP #PF violation is an immediate "crash the host".
It should be handled exactly like any other #PF due to permission violation. The
only wrinkle added by the RMP is that the #PF can be due to permissions on the GPA
itself, but even that is not unique, e.g. see the proposed KVM XO support[*] that
will hopefully still land someday.

If the guest violates the current permissions, it (indirectly) gets a #VC. If host
userspace violates permissions, it gets SIGSEGV. If the host kernel violates
permissions, then it reacts to the #PF in whatever way it can. What I am saying is
that in some cases, there is _zero_ chance of recovery in the host and so crashing
the entire system is inevitable. E.g. if the host kernel hits an RMP #PF when
vectoring a #GP because the IDT lookup somehow triggers an RMP violation, then the
host is going into triple fault shutdown.

[*] https://lore.kernel.org/linux-mm/20191003212400.31130-1-rick.p.edgecombe@xxxxxxxxx/
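
For illustration only, the policy described above could be sketched roughly as
below. This is not actual kernel code: every name here (fault_origin,
fault_action, rmp_fault_action, kernel_can_recover) is hypothetical, and the
sketch assumes the faulting context has already been classified. It only shows
that the RMP case follows the same three-way split as any other permission #PF,
with "fatal" reserved for the host-kernel case where no recovery path exists.

```c
#include <stdbool.h>

/* Hypothetical classification of who triggered the RMP #PF. */
enum fault_origin {
	ORIGIN_GUEST,		/* guest violated the current permissions */
	ORIGIN_HOST_USER,	/* host userspace violated permissions */
	ORIGIN_HOST_KERNEL,	/* host kernel violated permissions */
};

/* Hypothetical outcomes, mirroring the cases listed in the mail. */
enum fault_action {
	ACTION_GUEST_VC,	/* guest (indirectly) gets a #VC */
	ACTION_SIGSEGV,		/* host userspace gets SIGSEGV */
	ACTION_KERNEL_RECOVER,	/* host kernel reacts in whatever way it can */
	ACTION_FATAL,		/* zero chance of recovery, system goes down */
};

/*
 * kernel_can_recover is an assumed input: false models cases such as an RMP
 * violation hit while vectoring another exception, where the host cannot
 * continue no matter what policy is chosen.
 */
enum fault_action rmp_fault_action(enum fault_origin origin,
				   bool kernel_can_recover)
{
	switch (origin) {
	case ORIGIN_GUEST:
		return ACTION_GUEST_VC;
	case ORIGIN_HOST_USER:
		return ACTION_SIGSEGV;
	case ORIGIN_HOST_KERNEL:
		return kernel_can_recover ? ACTION_KERNEL_RECOVER
					  : ACTION_FATAL;
	}

	/* Unknown origin: treat as unrecoverable rather than guess. */
	return ACTION_FATAL;
}
```

Note that the recoverability flag only matters for the host kernel case; guest
and host userspace violations never escalate to a host crash in this sketch.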