On Wed, Jul 28, 2021 at 11:38:35AM +0100, Marc Zyngier wrote: > On Tue, 27 Jul 2021 19:11:33 +0100, > Will Deacon <will@xxxxxxxxxx> wrote: > > > > On Thu, Jul 15, 2021 at 05:31:49PM +0100, Marc Zyngier wrote: > > > As we now keep information in the S2PT, we must be careful not > > > to keep it across a VM reboot, which could otherwise lead to > > > interesting problems. > > > > > > Make sure that the S2 is completely discarded on reset of > > > a vcpu, and remove the flag that enforces the MMIO check. > > > > > > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx> > > > --- > > > arch/arm64/kvm/arm.c | 8 +++++++- > > > 1 file changed, 7 insertions(+), 1 deletion(-) > > > > > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > > > index 97ab1512c44f..b0d2225190d2 100644 > > > --- a/arch/arm64/kvm/arm.c > > > +++ b/arch/arm64/kvm/arm.c > > > @@ -1096,12 +1096,18 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu, > > > * ensuring that the data side is always coherent. We still > > > * need to invalidate the I-cache though, as FWB does *not* > > > * imply CTR_EL0.DIC. > > > + * > > > + * If the MMIO guard was enabled, we pay the price of a full > > > + * unmap to get back to a sane state (and clear the flag). > > > */ > > > if (vcpu->arch.has_run_once) { > > > - if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB)) > > > + if (!cpus_have_final_cap(ARM64_HAS_STAGE2_FWB) || > > > + test_bit(KVM_ARCH_FLAG_MMIO_GUARD, &vcpu->kvm->arch.flags)) > > > stage2_unmap_vm(vcpu->kvm); > > > else > > > icache_inval_all_pou(); > > > + > > > + clear_bit(KVM_ARCH_FLAG_MMIO_GUARD, &vcpu->kvm->arch.flags); > > > > What prevents this racing with another vCPU trying to set the bit? > > Not much. We could take the kvm lock on both ends to serialize it, but > that's pretty ugly. And should we care? What is the semantic of > resetting a vcpu while another is still running? It's definitely weird, but given that this is an attack vector I don't think we can rule out attackers trying whacky stuff like this (although maybe we end up forbidding vcpu reset in pKVM -- dunno). > If we want to support this sort of behaviour, then our tracking is > totally bogus, because it is VM-wide. And you don't even have to play > with that bit from another vcpu: all the information is lost at the > point where we unmap the S2 PTs. > > Maybe an alternative is to move this to the halt/reboot PSCI handlers, > making it clearer what we expect? I think that's probably worth looking at. The race is quite hard to reason about otherwise, so if clearing the bit can be done on the teardown path in single-threaded context then I think that's better. It looks like kvm_prepare_system_event() has all the synchronisation we need there. Will