Re: [RFC v3 14/25] intel_iommu: add virtual command capability support

Peter Xu <peterx@xxxxxxxxxx> · Thu, 13 Feb 2020 09:31:10 -0500

On Thu, Feb 13, 2020 at 02:40:45AM +0000, Liu, Yi L wrote:
> > From: Peter Xu <peterx@xxxxxxxxxx>
> > Sent: Wednesday, February 12, 2020 5:57 AM
> > To: Liu, Yi L <yi.l.liu@xxxxxxxxx>
> > Subject: Re: [RFC v3 14/25] intel_iommu: add virtual command capability support
> > 
> > On Wed, Jan 29, 2020 at 04:16:45AM -0800, Liu, Yi L wrote:
> > > +/*
> > > + * The basic idea is to let hypervisor to set a range for available
> > > + * PASIDs for VMs. One of the reasons is PASID #0 is reserved by
> > > + * RID_PASID usage. We have no idea how many reserved PASIDs in future,
> > > + * so here just an evaluated value. Honestly, set it as "1" is enough
> > > + * at current stage.
> > > + */
> > > +#define VTD_MIN_HPASID              1
> > > +#define VTD_MAX_HPASID              0xFFFFF
> > 
> > One more question: I see that PASID is defined as 20bits long.  It's
> > fine.  However I start to get confused on how the Scalable Mode PASID
> > Directory could service that much of PASID entries.
> > 
> > I'm looking at spec 3.4.3, Figure 3-8.
> > 
> > Firstly, we only have two levels for a PASID table.  The context entry
> > of a device stores a pointer to the "Scalable Mode PASID Directory"
> > page. I see that there're 2^14 entries in "Scalable Mode PASID
> > Directory" page, each is a "Scalable Mode PASID Table".
> > However... how do we fit in the 4K page if each entry is a pointer of
> > x86_64 (8 bytes) while there're 2^14 entries?  A simple math gives me
> > 4K/8 = 512, which means the "Scalable Mode PASID Directory" page can
> > only have 512 entries, then how the 2^14 come from?  Hmm??
> 
> I checked with Kevin. The spec doesn't say the dir table is 4K. It says 4K
> only for pasid table. Also, if you look at 9.4, scalabe-mode context entry
> includes a PDTS field to specify the actual size of the directory table.

Ah I see.  Then it seems to be lost then in this series.  Say, I think
vtd_sm_pasid_table_walk() should also stop walking until reaching the
size there, and you need to fetch that size info from the context
entry before walk starts.

> 
> > Apart of this: also I just noticed (when reading the latter part of
> > the series) that the time that a pasid table walk can consume will
> > depend on this value too.  I'd suggest to make this as small as we
> > can, as long as it satisfies the usage.  We can even bump it in the
> > future.
> 
> I see. This looks to be an optimization. right? Instead of modify the
> value of this macro,  I think we can do this optimization by tracking
> the allocated PASIDs in QEMU. Thus, the pasid table walk  would be more
> efficient and also no dependency on the VTD_MAX_HPASID. Does it make
> sense to you? :-)

Yeah sounds good. :)

Thanks,

-- 
Peter Xu