Re: [PATCH v2 8/8] iommu/arm-smmu-v3: Support IOMMU_DOMAIN_NESTED

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Aug 30, 2024 at 02:04:26PM -0300, Jason Gunthorpe wrote:
> On Fri, Aug 30, 2024 at 04:09:04PM +0000, Mostafa Saleh wrote:
> > Hi Jason,
> > 
> > On Tue, Aug 27, 2024 at 12:51:38PM -0300, Jason Gunthorpe wrote:
> > > For SMMUv3 a IOMMU_DOMAIN_NESTED is composed of a S2 iommu_domain acting
> > > as the parent and a user provided STE fragment that defines the CD table
> > > and related data with addresses translated by the S2 iommu_domain.
> > > 
> > > The kernel only permits userspace to control certain allowed bits of the
> > > STE that are safe for user/guest control.
> > > 
> > > IOTLB maintenance is a bit subtle here, the S1 implicitly includes the S2
> > > translation, but there is no way of knowing which S1 entries refer to a
> > > range of S2.
> > > 
> > > For the IOTLB we follow ARM's guidance and issue a CMDQ_OP_TLBI_NH_ALL to
> > > flush all ASIDs from the VMID after flushing the S2 on any change to the
> > > S2.
> > > 
> > > Similarly we have to flush the entire ATC if the S2 is changed.
> > > 
> > 
> > I am still reviewing this patch, but just some quick questions.
> > 
> > 1) How does userspace do IOTLB maintenance for S1 in that case?
> 
> See
> 
> https://lore.kernel.org/linux-iommu/cover.1724776335.git.nicolinc@xxxxxxxxxx
> 
> Patch 17

Thanks, I had this series on my radar, I will check it by this week.

>
> Really, this series and that series must be together. We have a patch
> planning issue to sort out here as well, all 27 should go together
> into the same merge window.
> 
> > 2) Is there a reason the UAPI is designed this way?
> > The way I imagined this, is that userspace will pass the pointer to the CD
> > (+ format) not the STE (or part of it).
> 
> Yes, we need more information from the STE than just that. EATS and
> STALL for instance. And the cachability below. Who knows what else in
> the future.

But for example if that was extended later, how can user space know
which fields are allowed and which are not?

> 
> We also want to support the V=0, Bypass and Abort STE configurations
> under the nesting domain (V, CFG required) so that the VIOMMU can
> remain affiliated with the STE in all cases. This is necessary to
> allow VIOMMU event reporting to always work.
> 
> Looking at the masks:
> 
> STRTAB_STE_0_NESTING_ALLOWED = 0xf80fffffffffffff
> STRTAB_STE_1_NESTING_ALLOWED = 0x380000ff
> 
> So we do use alot of the bits. Reformatting from the native HW format
> into something else doesn't seem better for VMM or kernel..
> 
> This is similar to the invalidation design where we also just forward
> the invalidation command as is in native HW format, and how IDR is
> done the same.
> 
> Overall this sort of direct transparency is how I prefer to see these
> kinds of iommufd HW specific interfaces designed. From a lot of
> experience here, arbitary marshall/unmarshall is often an
> antipattern :)

Is there any documentation for the (proposed) SMMUv3 UAPI for IOMMUFD?
I can understand reading IDRs from userspace (with some sanitation),
but adding some more logic to map vSTE to STE needs more care of what
kind of semantics are provided.

Also, I am working on similar interface for pKVM where we “paravirtualize”
the SMMU access for guests, it’s different semantics, but I hope we can
align that with IOMMUFD (but it’s nowhere near upstream now)

I see you are talking in LPC about IOMMUFD:
https://lore.kernel.org/linux-iommu/0-v1-01fa10580981+1d-iommu_pt_jgg@xxxxxxxxxx/T/#m2dbb08f3bf8506a492bc7dda2de662e42371e683

Do you have any plans to talk about this also?

Thanks,
Mostafa
> 
> > Making user space messing with shareability and cacheability of S1 CD access
> > feels odd. (Although CD configure page table access which is similar).
> 
> As I understand it, the walk of the CD table will be constrained by
> the S2FWB, just like all the other accesses by the guest.
> 
> So we just take a consistent approach of allowing the guest to provide
> memattrs in the vSTE, CD, and S1 page table and rely on the HW's S2FWB
> to police it.
> 
> As you say there are lots of memattr type bits under direct guest
> control, it doesn't necessarily make alot of sense to permit
> everything in those contexts and then add extra code to do something
> different here.
> 
> Though I agree it looks odd, it is self-consistent.
> 
> Jason




[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux