Re: [PATCH 2/3] KVM: arm64: Handle S1PTW translation with TCR_HA set as a write

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Marc,

On Wed, Dec 21, 2022 at 05:43:03PM +0000, Marc Zyngier wrote:
> Hi Ricardo,
> 
> On Wed, 21 Dec 2022 16:46:06 +0000,
> Ricardo Koller <ricarkol@xxxxxxxxxx> wrote:
> > 
> > Hello,
> > 
> > On Tue, Dec 20, 2022 at 08:09:22PM +0000, Marc Zyngier wrote:
> > > As a minor optimisation, we can retrofit the "S1PTW is a write
> > > even on translation fault" concept *if* the vcpu is using the
> > > HW-managed Access Flag, as setting TCR_EL1.HA is guaranteed
> > > to result in an update of the PTE.
> > > 
> > > However, we cannot do the same thing for DB, as it would require
> > > us to parse the PTs to find out if the DBM bit is set there.
> > > This is not going to happen.
> > > 
> > > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> > > ---
> > >  arch/arm64/include/asm/kvm_emulate.h | 20 +++++++++++++++++++-
> > >  1 file changed, 19 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
> > > index fd6ad8b21f85..4ee467065042 100644
> > > --- a/arch/arm64/include/asm/kvm_emulate.h
> > > +++ b/arch/arm64/include/asm/kvm_emulate.h
> > > @@ -374,6 +374,9 @@ static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
> > >  static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
> > >  {
> > >  	if (kvm_vcpu_abt_iss1tw(vcpu)) {
> > > +		unsigned int afdb;
> > > +		u64 mmfr1;
> > > +
> > >  		/*
> > >  		 * Only a permission fault on a S1PTW should be
> > >  		 * considered as a write. Otherwise, page tables baked
> > > @@ -385,12 +388,27 @@ static inline bool kvm_is_write_fault(struct kvm_vcpu *vcpu)
> > >  		 * to map the page containing the PT (read only at
> > >  		 * first), then a permission fault to allow the flags
> > >  		 * to be set.
> > > +		 *
> > > +		 * We can improve things if the guest uses AF, as this
> > > +		 * is guaranteed to result in a write to the PTE. For
> > > +		 * DB, however, we'd need to parse the guest's PTs,
> > > +		 * and that's not on. DB is crap anyway.
> > >  		 */
> > >  		switch (kvm_vcpu_trap_get_fault_type(vcpu)) {
> > 
> > Nit: fault_status is calculated once when taking the fault, and passed
> > around to all users (like user_mem_abort()). Not sure if this is because
> > of the extra cycles needed to get it, or just style. Anyway, maybe it
> > applies here.
> 
> All these things are just fields in ESR_EL2, which we keep looking at
> all the time. The compiler actually does a pretty good job at keeping
> that around, specially considering that this function is inlined (at
> least here, kvm_handle_guest_abort and kvm_user_mem_abort are merged
> into a single monster).
> 
> So passing the parameter wouldn't change a thing, and I find the above
> more readable (I know that all the information in this function are
> derived from the same data structure).
>

Got it, thanks for the info.

> > 
> > >  		case ESR_ELx_FSC_PERM:
> > >  			return true;
> > >  		default:
> > > -			return false;
> > > +			/* Can't introspect TCR_EL1 with pKVM */
> > > +			if (kvm_vm_is_protected(vcpu->kvm))
> > > +				return false;
> > > +
> > > +			mmfr1 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> > > +			afdb = cpuid_feature_extract_unsigned_field(mmfr1, ID_AA64MMFR1_EL1_HAFDBS_SHIFT);
> > > +
> > > +			if (afdb == ID_AA64MMFR1_EL1_HAFDBS_NI)
> > > +				return false;
> > > +
> > > +			return (vcpu_read_sys_reg(vcpu, TCR_EL1) & TCR_HA);
> > 
> > Also tested this specific case using page_fault_test when the PT page is
> > marked for dirty logging with and without AF. In both cases there's a
> > single _FSC_FAULT (no PERM_FAUT) as expected, and the PT page is marked dirty
> > in the AF case. The RO and UFFD cases also work as expected.
> 
> Ah, thanks for checking this.
> 
> > 
> > Need to send some changes for page_fault_test as many tests assume that
> > any S1PTW is always a PT write, and are failing. Also need to add some new
> > tests for PTs in RO memslots (as it didn't make much sense before this
> > change).
> 
> I think this is what I really quite didn't grok in these tests. They
> seem to verify the KVM behaviour, which is not what we should check
> for.
> 
> Instead, we should check for the architectural behaviour, which is
> that if HAFDBS is enabled, we can observe updates to the PTs even when
> we do not write to them directly.

There are some tests checking that case (e.g., AF set by HW), but they
also do it while interacting with dirty-logging, userfaultfd, and/or RO
memslots. Some checks are clearly dealing with architectural behavior,
while others are not that clear. Let me use this sample test to get more
specific.  This test deals with HW setting the AF bit on a punched hole
backed by userfaultfd:

TEST_UFFD(guest_exec, with_af, CMD_HOLE_PT, ...):
	1. set TCR_EL1.HA and clear the AF in the test-data PTE,
	2. punch a hole on the test-data PTE page, and register it for
	   userfaultfd,
	3. execute code in the test-data page; this triggers a S1PTW,
	4. assert that there is a userfaultfd _write_ fault on the PT page,
	5. assert that the test-data instruction was executed,
	6. assert that the AF is set on the test-data PTE,

IIUC, only checking for architectural behavior implies skipping step 4.
And I agree, it might be a good idea to skip 4, as it actually depends
on whether the kernel has only the previous commit, or both the previous
and this one. The fault is a _write_ at this commit, and a _read_ at the
previous.

The issue is that there are some cases where checking KVM behavior could
be useful. For example, this dirty_logging test can also help (besides
testing dirty-logging) for checking regressions of commit c4ad98e4b72c
("KVM: arm64: Assume write fault on S1PTW permission fault on
instruction fetch"):

TEST_DIRTY_LOG(guest_exec, with_af, ...)
	1. set TCR_EL1.HA and clear the AF in the test-data PTE,
	2. enable dirty logging on the PT memslot,
	3. execute code in the test-data page; this triggers a S1PTW,
	4. assert that the test-data PTE page is dirty on the log,
	5. assert that the test-data instruction was executed,
	6. assert that the AF is set on the test-data PTE,

Step 4 above is not exactly checking architectural behavior, but I think
it's still useful.

So, regarding what to do with page_fault_test. As a start, I need to go
through all tests and make sure they pass at both this and the previous
commit. Next, I have to identify other tests that also need to be
relaxed a bit (removing some test asserts).

Thanks,
Ricardo

> 
> > 
> > >  		}
> > >  	}
> > >  
> > > -- 
> > > 2.34.1
> > > 
> > > _______________________________________________
> > > kvmarm mailing list
> > > kvmarm@xxxxxxxxxxxxxxxxxxxxx
> > > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> > 
> > Reviewed-by: Ricardo Koller <ricarkol@xxxxxxxxxx>
> 
> Thanks!
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm



[Index of Archives]     [Linux KVM]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux