On 08/12/16 13:36, Auger Eric wrote: > Hi Robin, > > On 08/12/2016 14:14, Robin Murphy wrote: >> On 08/12/16 09:36, Auger Eric wrote: >>> Hi, >>> >>> On 15/11/2016 14:09, Eric Auger wrote: >>>> Following LPC discussions, we now report reserved regions through >>>> iommu-group sysfs reserved_regions attribute file. >>> >>> >>> While I am respinning this series into v4, here is a tentative summary >>> of technical topics for which no consensus was reached at this point. >>> >>> 1) Shall we report the usable IOVA range instead of reserved IOVA >>> ranges. Not discussed at/after LPC. >>> x I currently report reserved regions. Alex expressed the need to >>> report the full usable IOVA range instead (x86 min-max range >>> minus MSI APIC window). I think this is meaningful for ARM >>> too where arm-smmu might not support the full 64b range. >>> x Any objection we report the usable IOVA regions instead? >> >> The issue with that is that we can't actually report "the usable >> regions" at the moment, as that involves pulling together disjoint >> properties of arbitrary hardware unrelated to the IOMMU. We'd be >> reporting "the not-definitely-unusable regions, which may have some >> unusable holes in them still". That seems like an ABI nightmare - I'd >> still much rather say "here are some, but not necessarily all, regions >> you definitely can't use", because saying "here are some regions which >> you might be able to use most of, probably" is what we're already doing >> today, via a single implicit region from 0 to ULONG_MAX ;) >> >> The address space limits are definitely useful to know, but I think it >> would be better to expose them separately to avoid the ambiguity. At >> worst, I guess it would be reasonable to express the limits via an >> "out-of-range" reserved region type for 0 to $base and $top to >> ULONG-MAX. To *safely* expose usable regions, we'd have to start out >> with a very conservative assumption (e.g. only IOVAs matching physical >> RAM), and only expand them once we're sure we can detect every possible >> bit of problematic hardware in the system - that's just too limiting to >> be useful. And if we expose something knowingly inaccurate, we risk >> having another "bogoMIPS in /proc/cpuinfo" ABI burden on our hands, and >> nobody wants that... > Makes sense to me. "out-of-range reserved region type for 0 to $base and > $top to ULONG-MAX" can be an alternative to fulfill the requirement. >> >>> 2) Shall the kernel check collision with MSI window* when userspace >>> calls VFIO_IOMMU_MAP_DMA? >>> Joerg/Will No; Alex yes >>> *for IOVA regions consumed downstream to the IOMMU: everyone says NO >> >> If we're starting off by having the SMMU drivers expose it as a fake >> fixed region, I don't think we need to worry about this yet. We all seem >> to agree that as long as we communicate the fixed regions to userspace, >> it's then userspace's job to work around them. Let's come back to this >> one once we actually get to the point of dynamically sizing and >> allocating 'real' MSI remapping region(s). >> >> Ultimately, the kernel *will* police collisions either way, because an >> underlying iommu_map() is going to fail if overlapping IOVAs are ever >> actually used, so it's really just a question of whether to have a more >> user-friendly failure mode. > That's true on ARM but not on x86 where the APIC MSI region is not > mapped I think. Yes, but the APIC interrupt region is fixed, i.e. it falls under "IOVA regions consumed downstream to the IOMMU" since the DMAR units are physically incapable of remapping those addresses. I take "MSI window" to mean specifically the thing we have to set aside and get a magic remapping cookie for, which is also why it warrants its own special internal type - we definitely *don't* want VFIO trying to set up a remapping cookie on x86. We just don't let userspace know about the difference, yet (if ever). Robin. >>> 3) RMRR reporting in the iommu group sysfs? Joerg: yes; Don: no >>> My current series does not expose them in iommu group sysfs. >>> I understand we can expose the RMRR regions in the iomm group sysfs >>> without necessarily supporting RMRR requiring device assignment. >>> We can also add this support later. >> >> As you say, reporting them doesn't necessitate allowing device >> assignment, and it's information which can already be easily grovelled >> out of dmesg (for intel-iommu at least) - there doesn't seem to be any >> need to hide them, but the x86 folks can have the final word on that. > agreed > > Thanks > > Eric >> >> Robin. >> >>> Thanks >>> >>> Eric >>> >>> >>>> >>>> Reserved regions are populated through the IOMMU get_resv_region callback >>>> (former get_dm_regions), now implemented by amd-iommu, intel-iommu and >>>> arm-smmu. >>>> >>>> The intel-iommu reports the [FEE0_0000h - FEF0_000h] MSI window as an >>>> IOMMU_RESV_NOMAP reserved region. >>>> >>>> arm-smmu reports the MSI window (arbitrarily located at 0x8000000 and >>>> 1MB large) and the PCI host bridge windows. >>>> >>>> The series integrates a not officially posted patch from Robin: >>>> "iommu/dma: Allow MSI-only cookies". >>>> >>>> This series currently does not address IRQ safety assessment. >>>> >>>> Best Regards >>>> >>>> Eric >>>> >>>> Git: complete series available at >>>> https://github.com/eauger/linux/tree/v4.9-rc5-reserved-rfc-v3 >>>> >>>> History: >>>> RFC v2 -> v3: >>>> - switch to an iommu-group sysfs API >>>> - use new dummy allocator provided by Robin >>>> - dummy allocator initialized by vfio-iommu-type1 after enumerating >>>> the reserved regions >>>> - at the moment ARM MSI base address/size is left unchanged compared >>>> to v2 >>>> - we currently report reserved regions and not usable IOVA regions as >>>> requested by Alex >>>> >>>> RFC v1 -> v2: >>>> - fix intel_add_reserved_regions >>>> - add mutex lock/unlock in vfio_iommu_type1 >>>> >>>> >>>> Eric Auger (10): >>>> iommu/dma: Allow MSI-only cookies >>>> iommu: Rename iommu_dm_regions into iommu_resv_regions >>>> iommu: Add new reserved IOMMU attributes >>>> iommu: iommu_alloc_resv_region >>>> iommu: Do not map reserved regions >>>> iommu: iommu_get_group_resv_regions >>>> iommu: Implement reserved_regions iommu-group sysfs file >>>> iommu/vt-d: Implement reserved region get/put callbacks >>>> iommu/arm-smmu: Implement reserved region get/put callbacks >>>> vfio/type1: Get MSI cookie >>>> >>>> drivers/iommu/amd_iommu.c | 20 +++--- >>>> drivers/iommu/arm-smmu.c | 52 +++++++++++++++ >>>> drivers/iommu/dma-iommu.c | 116 ++++++++++++++++++++++++++------- >>>> drivers/iommu/intel-iommu.c | 50 ++++++++++---- >>>> drivers/iommu/iommu.c | 141 ++++++++++++++++++++++++++++++++++++---- >>>> drivers/vfio/vfio_iommu_type1.c | 26 ++++++++ >>>> include/linux/dma-iommu.h | 7 ++ >>>> include/linux/iommu.h | 49 ++++++++++---- >>>> 8 files changed, 391 insertions(+), 70 deletions(-) >>>> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe kvm" in >> the body of a message to majordomo@xxxxxxxxxxxxxxx >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html