Re: [PATCH v2 8/8] iommu/arm-smmu-v3: Support IOMMU_DOMAIN_NESTED

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Aug 30, 2024 at 04:09:04PM +0000, Mostafa Saleh wrote:
> Hi Jason,
> 
> On Tue, Aug 27, 2024 at 12:51:38PM -0300, Jason Gunthorpe wrote:
> > For SMMUv3 a IOMMU_DOMAIN_NESTED is composed of a S2 iommu_domain acting
> > as the parent and a user provided STE fragment that defines the CD table
> > and related data with addresses translated by the S2 iommu_domain.
> > 
> > The kernel only permits userspace to control certain allowed bits of the
> > STE that are safe for user/guest control.
> > 
> > IOTLB maintenance is a bit subtle here, the S1 implicitly includes the S2
> > translation, but there is no way of knowing which S1 entries refer to a
> > range of S2.
> > 
> > For the IOTLB we follow ARM's guidance and issue a CMDQ_OP_TLBI_NH_ALL to
> > flush all ASIDs from the VMID after flushing the S2 on any change to the
> > S2.
> > 
> > Similarly we have to flush the entire ATC if the S2 is changed.
> > 
> 
> I am still reviewing this patch, but just some quick questions.
> 
> 1) How does userspace do IOTLB maintenance for S1 in that case?

See

https://lore.kernel.org/linux-iommu/cover.1724776335.git.nicolinc@xxxxxxxxxx

Patch 17

Really, this series and that series must be together. We have a patch
planning issue to sort out here as well, all 27 should go together
into the same merge window.

> 2) Is there a reason the UAPI is designed this way?
> The way I imagined this, is that userspace will pass the pointer to the CD
> (+ format) not the STE (or part of it).

Yes, we need more information from the STE than just that. EATS and
STALL for instance. And the cachability below. Who knows what else in
the future.

We also want to support the V=0, Bypass and Abort STE configurations
under the nesting domain (V, CFG required) so that the VIOMMU can
remain affiliated with the STE in all cases. This is necessary to
allow VIOMMU event reporting to always work.

Looking at the masks:

STRTAB_STE_0_NESTING_ALLOWED = 0xf80fffffffffffff
STRTAB_STE_1_NESTING_ALLOWED = 0x380000ff

So we do use alot of the bits. Reformatting from the native HW format
into something else doesn't seem better for VMM or kernel..

This is similar to the invalidation design where we also just forward
the invalidation command as is in native HW format, and how IDR is
done the same.

Overall this sort of direct transparency is how I prefer to see these
kinds of iommufd HW specific interfaces designed. From a lot of
experience here, arbitary marshall/unmarshall is often an
antipattern :)

> Making user space messing with shareability and cacheability of S1 CD access
> feels odd. (Although CD configure page table access which is similar).

As I understand it, the walk of the CD table will be constrained by
the S2FWB, just like all the other accesses by the guest.

So we just take a consistent approach of allowing the guest to provide
memattrs in the vSTE, CD, and S1 page table and rely on the HW's S2FWB
to police it.

As you say there are lots of memattr type bits under direct guest
control, it doesn't necessarily make alot of sense to permit
everything in those contexts and then add extra code to do something
different here.

Though I agree it looks odd, it is self-consistent.

Jason




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux