On 15/11/17 03:28, Alex Williamson wrote:
> On Tue, 14 Nov 2017 13:29:02 +1100
> Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx> wrote:
>
>> On Tue, 2017-11-14 at 13:23 +1100, David Gibson wrote:
>>>>>> 1. Allow msix mapping to the userspace (to address non-64k-aligned msix bar)
>>>
>>> We have a new plan on this - I'll discuss it over IRC.
>>>
>>>>>> 2. Allow write combining in vfio for the userspace (kvm guest is kinda
>>>>>> special and may simply ignore mapping flags in some configs but PPC radix
>>>>>> guests still rely on this)
>>>
>>> AIUI this isn't for radix, but for DPDK things that we need this. Ben
>>> talked about it a bit, but I don't know what the outcome was.
>>
>> So this is not a powerpc specific issue. Other archs similarily want to
>> be able to do write combine mappings.
>>
>> The way sysfs does it is that for prefetchable BARs, it exposes both
>> a resourceN and a resourceN_wc file.
>>
>> For VFIO it's a bit more tricky, maybe we need to game the offset using
>> some of it as flags but that's very fishy, or maybe we do some kind of
>> ioctl that selects the attributes used for that fd instance for
>> subsequent mappings...
>>
>> I'll let Alex chose what he feels most appropriate here.
>
> My order of preference would be something like:
>
> - mmap flags provide some way for the user to specify a wc mapping
>   within existing regions

There are plenty of flags but none really matches, checked with Paul.

> - some other mechanism of using the existing regions

I can only think of madvise but it does not have appropriate flags either.

> - additional regions provided for use exclusively with wc attributes
>   (generalizing PCI BAR wc regions within device specific regions)

Adding VFIO_PCI_BAR0_WC_REGION_INDEX for VFIO_PCI_BAR0_REGION_INDEX (and
so on for other BARs) seems a viable option.
However the comment for VFIO_PCI_xxx_REGION_INDEX says:

	VFIO_PCI_NUM_REGIONS = 9 /* Fixed user ABI, region indexes >=9 use */
				 /* device specific cap to define content. */

which limits where I can add new indexes: I cannot just add new _WC
indexes to that enum, can I? I cannot see any existing regions above 9
yet though.

> - additional file descriptors provided for wc access

It could be a capability + ioctl(VFIO_DEVICE_GET_WC_RESOURCE) which would
take a BAR index, check if the BAR is prefetchable and, if so, return an
fd which the userspace could then mmap(). This won't break that ABI with
9 regions but it is the least favourable in the list...

> This isn't at the top of my priority list to figure out the solution,
> so whoever implements it will need to provide justification as they
> move down the list from more to less preferred solutions. Thanks,

I am trying... I was really counting on you guys having this discussed
in Prague :(

-- 
Alexey
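
For reference, the sysfs behaviour Ben describes (a resourceN_wc file next
to resourceN for prefetchable BARs) could be used from userspace roughly
as below. This is only a sketch of the existing sysfs path, not of any
VFIO interface; the helper names wc_resource_path() and map_bar_wc() are
made up for illustration, and it assumes the platform exposes
resourceN_wc at all.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: build "/sys/bus/pci/devices/<bdf>/resource<bar>_wc".
 * Returns 0 on success, -1 if the path did not fit into buf.
 */
static int wc_resource_path(char *buf, size_t len, const char *bdf, int bar)
{
	int n = snprintf(buf, len, "/sys/bus/pci/devices/%s/resource%d_wc",
			 bdf, bar);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: map a whole BAR through its _wc sysfs file.
 * Returns NULL if the file is missing (no such device, BAR not
 * prefetchable, or no WC support on this arch) or if mmap() fails.
 */
static void *map_bar_wc(const char *bdf, int bar, size_t *len_out)
{
	char path[128];
	struct stat st;
	void *p;
	int fd;

	if (wc_resource_path(path, sizeof(path), bdf, bar))
		return NULL;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return NULL;

	/* sysfs reports the BAR length as the file size */
	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return NULL;
	}

	p = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping stays valid after close() */
	if (p == MAP_FAILED)
		return NULL;

	*len_out = st.st_size;
	return p;
}
```

A caller would do something like map_bar_wc("0000:01:00.0", 0, &len) and
fall back to the plain resourceN file when it returns NULL. Whatever VFIO
ends up with needs an equivalent way for the user to ask for the WC
variant of an otherwise identical mapping.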