Re: [PATCH rfc] vfio-pci: Allow write combining

On Wed, 7 Aug 2024 11:19:10 -0300
Jason Gunthorpe <jgg@xxxxxxxx> wrote:

> On Tue, Aug 06, 2024 at 12:43:02PM -0600, Alex Williamson wrote:
> 
> > > So we don't leak this too much into the drivers? Why should all the
> > > VFIO drivers have to be changed to alter how their region indexes work
> > > just to add a single flag??  
> 
> > I don't know how you're coming to this conclusion.  
> 
> Ideally we'd want to support the WC option basically everywhere.
> 
> > > I fear we might need to do this as there may not be room in the pgoff
> > > space (at least for 32 bit) to duplicate everything....  
> 
> > We'll root out userspace drivers that hard code region offsets in doing
> > this, but otherwise it shouldn't be an issue.  
> 
> The issue is running out of pgoff bits on 32 bit. Maybe this isn't an
> issue for VFIO, but it was for RDMA. We needed tight optimal on-demand
> packing of actual requested mmaps. Allocating gigabytes of address
> space for possible mmaps ran out of pgoff bits. :\

If we only implemented WC for 64-bit, would anyone notice?
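
For a rough sense of the bit budget (a sketch, not taken from the patch):
vfio-pci currently places the region index above bit 40 of the file offset
(VFIO_PCI_OFFSET_SHIFT).  Assuming 4 KiB pages, a pgoff that is an unsigned
long, and a file offset capped at 64 bits, the number of distinct region
indexes works out as below.  PAGE_SHIFT_ASSUMED, FILE_OFFSET_BITS and
max_region_indexes() are names made up for this illustration.

```c
#include <stdint.h>

/* Matches the in-tree vfio-pci layout: region index sits above bit 40. */
#define VFIO_PCI_OFFSET_SHIFT 40

/* Assumptions for illustration only: 4 KiB pages, 64-bit file offsets. */
#define PAGE_SHIFT_ASSUMED 12
#define FILE_OFFSET_BITS   64

/*
 * Number of distinct region indexes that fit, given the width of pgoff
 * in bits (32 on a 32-bit kernel, 64 on a 64-bit one).
 */
static uint64_t max_region_indexes(unsigned int pgoff_bits)
{
	unsigned int offset_bits = pgoff_bits + PAGE_SHIFT_ASSUMED;
	unsigned int index_bits;

	/* The byte offset cannot be wider than the file offset type. */
	if (offset_bits > FILE_OFFSET_BITS)
		offset_bits = FILE_OFFSET_BITS;

	/* Bits left above the per-region offset field hold the index. */
	index_bits = offset_bits > VFIO_PCI_OFFSET_SHIFT ?
		     offset_bits - VFIO_PCI_OFFSET_SHIFT : 0;

	return (uint64_t)1 << index_bits;
}
```

Under those assumptions a 32-bit pgoff leaves room for 16 region indexes,
while a 64-bit offset leaves room for 2^24.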
 
> > How does an "mmap cookie" not duplicate that a device range is
> > accessible through multiple offsets of the vfio device file?  
> 
> pgoff duplication is not really an issue; from an API perspective the
> driver would call a helper to convert the pgoff into a region index
> and mmap flags. It wouldn't matter to any driver how many duplicates
> there are.

Which is exactly my point, whether we call it a region or an mmap
cookie.  In one case we're trying to give a bare pgoff that effectively
aliases to a region with different mapping flags, in the other the
pgoff is exposed through a new region offset that does exactly the same
thing.
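
To illustrate why the two look equivalent to me, here is a minimal sketch
of a decode helper.  The offset macros mirror the existing vfio-pci ones;
everything prefixed demo_/DEMO_ is hypothetical, including the choice to
place the WC aliases directly after the nine fixed vfio-pci region
indexes.

```c
#include <stdint.h>

/* Mirrors the existing vfio-pci offset encoding. */
#define VFIO_PCI_OFFSET_SHIFT		40
#define VFIO_PCI_OFFSET_TO_INDEX(off)	((off) >> VFIO_PCI_OFFSET_SHIFT)
#define VFIO_PCI_OFFSET_MASK		(((uint64_t)1 << VFIO_PCI_OFFSET_SHIFT) - 1)

/* Hypothetical: WC aliases follow the fixed regions; flag value is made up. */
#define DEMO_NUM_BASE_REGIONS	9
#define DEMO_MAP_WC		0x1

struct demo_mapping {
	uint32_t region;	/* underlying region index (HW object) */
	uint32_t flags;		/* mapping attributes, e.g. DEMO_MAP_WC */
	uint64_t offset;	/* byte offset within the region */
};

/*
 * Convert a file offset into (region, flags, offset).  Whether the upper
 * bits are called a "region index" or an "mmap cookie", the driver sees
 * the same decoded result.
 */
static struct demo_mapping demo_decode(uint64_t file_off)
{
	uint32_t idx = (uint32_t)VFIO_PCI_OFFSET_TO_INDEX(file_off);
	struct demo_mapping m = {
		.region = idx,
		.flags = 0,
		.offset = file_off & VFIO_PCI_OFFSET_MASK,
	};

	/* Indexes past the base set alias a base region with WC set. */
	if (idx >= DEMO_NUM_BASE_REGIONS) {
		m.region = idx - DEMO_NUM_BASE_REGIONS;
		m.flags = DEMO_MAP_WC;
	}
	return m;
}
```

With this layout, index 2 and index 11 both resolve to region 2; only the
flags differ.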

> > Well first, we're not talking about a fixed number of additional
> > regions, we're talking about defining region identifiers for each BAR
> > with a WC mapping attribute, but at worst we'd only populate
> > implemented MMIO BARs.  But then we've also mentioned that a device
> > feature could be used to allow a userspace driver to selectively bring
> > these regions into existence.  In any case, an mmap cookie also consumes
> > address space from the vfio device file, so I'm still failing to see
> > how calling them a region vs just an mmap cookie makes a substantive
> > difference.  
> 
> You'd only allocate the mmap cookie when userspace requests it.

I've suggested a mechanism using DEVICE_FEATURE that could do this for
regions.

> My original suggestion was to send a flag to REGION_INFO to
> specifically ask for the different behavior, that (and only that)
> would return new mmap cookies.

Which can't work because flags is only an output field.

> The alternative version of this might be to have a single
> 'GET_REGION_MMAP' that gives a new mmap cookie for a singular
> specified region index. Userspace would call REGION_INFO to learn the
> memory regions and then it could call GET_REGION_MMAP(REQ_WC) and will
> get back a single dynamic mmap cookie that links the WC flags.
> 
> No system call, no cookie allocation. Existing apps don't start seeing
> more regions from REGION_INFO. Drivers keep region indexes 1:1 with HW
> objects. The uAPI has room to add more mmap flags.

Please tell me how this is ultimately different from invoking a
DEVICE_FEATURE call to request that a new device specific region be
created with the desired mappings.  In the short term, if we run out of
pgoff, the user gets an -ENOSPC.  DEVICE_INFO is updated with the new
number of regions, existing region indexes are unchanged, the user
iterates new indexes with REGION_INFO to get the offset and identifies
them using the previously proposed region types.  Thanks,

Alex




