Re: [RFC 11/20] iommu/iommufd: Add IOMMU_IOASID_ALLOC/FREE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 14, 2021 at 11:52:08AM -0300, Jason Gunthorpe wrote:
> On Thu, Oct 14, 2021 at 03:53:33PM +1100, David Gibson wrote:
> 
> > > My feeling is that qemu should be dealing with the host != target
> > > case, not the kernel.
> > > 
> > > The kernel's job should be to expose the IOMMU HW it has, with all
> > > features accessible, to userspace.
> > 
> > See... to me this is contrary to the point we agreed on above.
> 
> I'm not thinking of these as exclusive ideas.
> 
> The IOCTL interface in iommu can quite happily expose:
>  Create IOAS generically
>  Manipulate IOAS generically
>  Create IOAS with IOMMU driver specific attributes
>  HW specific Manipulate IOAS
> 
> IOCTL commands all together.
> 
> So long as everything is focused on a generic in-kernel IOAS object it
> is fine to have multiple ways in the uAPI to create and manipulate the
> objects.
> 
> When I speak about a generic interface I mean "Create IOAS
> generically" - ie a set of IOCTLs that work on most IOMMU HW and can
> be relied upon by things like DPDK/etc to always work and be portable.
> This is why I like "hints" to provide some limited widely applicable
> micro-optimization.
> 
> When I said "expose the IOMMU HW it has with all features accessible"
> I mean also providing "Create IOAS with IOMMU driver specific
> attributes".
> 
> These other IOCTLs would allow the IOMMU driver to expose every
> configuration knob its HW has, in a natural HW centric language.
> There is no pretense of genericness here, no crazy foo=A, foo=B hidden
> device specific interface.
> 
> Think of it as a high level/low level interface to the same thing.

Ok, I see what you mean.

> > Those are certainly wrong, but they came about explicitly by *not*
> > being generic rather than by being too generic.  So I'm really
> > confused aso to what you're arguing for / against.
> 
> IMHO it is not having a PPC specific interface that was the problem,
> it was making the PPC specific interface exclusive to the type 1
> interface. If type 1 continued to work on PPC then DPDK/etc would
> never learned PPC specific code.

Ok, but the reason this happened is that the initial version of type 1
*could not* be used on PPC.  The original Type 1 implicitly promised a
"large" IOVA range beginning at IOVA 0 without any real way of
specifying or discovering how large that range was.  Since ppc could
typically only give a 2GiB range at IOVA 0, that wasn't usable.

That's why I say the problem was not making type1 generic enough.  I
believe the current version of Type1 has addressed this - at least
enough to be usable in common cases.  But by this time the ppc backend
is already out there, so no-one's had the capacity to go back and make
ppc work with Type1.

> For iommufd with the high/low interface each IOMMU HW should ask basic
> questions:
> 
>  - What should the generic high level interface do on this HW?
>    For instance what should 'Create IOAS generically' do for PPC?
>    It should not fail, it should create *something*
>    What is the best thing for DPDK?
>    I guess the 64 bit window is most broadly useful.

Right, which means the kernel must (at least in the common case) have
the capcity to choose and report a non-zero base-IOVA.

Hrm... which makes me think... if we allow this for the common
kernel-managed case, do we even need to have capcity in the high-level
interface for reporting IO holes?  If the kernel can choose a non-zero
base, it could just choose on x86 to place it's advertised window
above the IO hole.

>  - How to accurately describe the HW in terms of standard IOAS objects
>    and where to put HW specific structs to support this.
> 
>    This is where PPC would decide how best to expose a control over
>    its low/high window (eg 1,2,3 IOAS). Whatever the IOMMU driver
>    wants, so long as it fits into the kernel IOAS model facing the
>    connected device driver.
> 
> QEMU would have IOMMU userspace drivers. One would be the "generic
> driver" using only the high level generic interface. It should work as
> best it can on all HW devices. This is the fallback path you talked
> of.
> 
> QEMU would also have HW specific IOMMU userspace drivers that know how
> to operate the exact HW. eg these drivers would know how to use
> userspace page tables, how to form IOPTEs and how to access the
> special features.
> 
> This is how QEMU could use an optimzed path with nested page tables,
> for instance.

The concept makes sense in general.  The devil's in the details, as usual.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux