On 08/12/16 09:36, Auger Eric wrote: > Hi, > > On 15/11/2016 14:09, Eric Auger wrote: >> Following LPC discussions, we now report reserved regions through >> iommu-group sysfs reserved_regions attribute file. > > > While I am respinning this series into v4, here is a tentative summary > of technical topics for which no consensus was reached at this point. > > 1) Shall we report the usable IOVA range instead of reserved IOVA > ranges. Not discussed at/after LPC. > x I currently report reserved regions. Alex expressed the need to > report the full usable IOVA range instead (x86 min-max range > minus MSI APIC window). I think this is meaningful for ARM > too where arm-smmu might not support the full 64b range. > x Any objection we report the usable IOVA regions instead? The issue with that is that we can't actually report "the usable regions" at the moment, as that involves pulling together disjoint properties of arbitrary hardware unrelated to the IOMMU. We'd be reporting "the not-definitely-unusable regions, which may have some unusable holes in them still". That seems like an ABI nightmare - I'd still much rather say "here are some, but not necessarily all, regions you definitely can't use", because saying "here are some regions which you might be able to use most of, probably" is what we're already doing today, via a single implicit region from 0 to ULONG_MAX ;) The address space limits are definitely useful to know, but I think it would be better to expose them separately to avoid the ambiguity. At worst, I guess it would be reasonable to express the limits via an "out-of-range" reserved region type for 0 to $base and $top to ULONG-MAX. To *safely* expose usable regions, we'd have to start out with a very conservative assumption (e.g. only IOVAs matching physical RAM), and only expand them once we're sure we can detect every possible bit of problematic hardware in the system - that's just too limiting to be useful. And if we expose something knowingly inaccurate, we risk having another "bogoMIPS in /proc/cpuinfo" ABI burden on our hands, and nobody wants that... > 2) Shall the kernel check collision with MSI window* when userspace > calls VFIO_IOMMU_MAP_DMA? > Joerg/Will No; Alex yes > *for IOVA regions consumed downstream to the IOMMU: everyone says NO If we're starting off by having the SMMU drivers expose it as a fake fixed region, I don't think we need to worry about this yet. We all seem to agree that as long as we communicate the fixed regions to userspace, it's then userspace's job to work around them. Let's come back to this one once we actually get to the point of dynamically sizing and allocating 'real' MSI remapping region(s). Ultimately, the kernel *will* police collisions either way, because an underlying iommu_map() is going to fail if overlapping IOVAs are ever actually used, so it's really just a question of whether to have a more user-friendly failure mode. > 3) RMRR reporting in the iommu group sysfs? Joerg: yes; Don: no > My current series does not expose them in iommu group sysfs. > I understand we can expose the RMRR regions in the iomm group sysfs > without necessarily supporting RMRR requiring device assignment. > We can also add this support later. As you say, reporting them doesn't necessitate allowing device assignment, and it's information which can already be easily grovelled out of dmesg (for intel-iommu at least) - there doesn't seem to be any need to hide them, but the x86 folks can have the final word on that. Robin. > Thanks > > Eric > > >> >> Reserved regions are populated through the IOMMU get_resv_region callback >> (former get_dm_regions), now implemented by amd-iommu, intel-iommu and >> arm-smmu. >> >> The intel-iommu reports the [FEE0_0000h - FEF0_000h] MSI window as an >> IOMMU_RESV_NOMAP reserved region. >> >> arm-smmu reports the MSI window (arbitrarily located at 0x8000000 and >> 1MB large) and the PCI host bridge windows. >> >> The series integrates a not officially posted patch from Robin: >> "iommu/dma: Allow MSI-only cookies". >> >> This series currently does not address IRQ safety assessment. >> >> Best Regards >> >> Eric >> >> Git: complete series available at >> https://github.com/eauger/linux/tree/v4.9-rc5-reserved-rfc-v3 >> >> History: >> RFC v2 -> v3: >> - switch to an iommu-group sysfs API >> - use new dummy allocator provided by Robin >> - dummy allocator initialized by vfio-iommu-type1 after enumerating >> the reserved regions >> - at the moment ARM MSI base address/size is left unchanged compared >> to v2 >> - we currently report reserved regions and not usable IOVA regions as >> requested by Alex >> >> RFC v1 -> v2: >> - fix intel_add_reserved_regions >> - add mutex lock/unlock in vfio_iommu_type1 >> >> >> Eric Auger (10): >> iommu/dma: Allow MSI-only cookies >> iommu: Rename iommu_dm_regions into iommu_resv_regions >> iommu: Add new reserved IOMMU attributes >> iommu: iommu_alloc_resv_region >> iommu: Do not map reserved regions >> iommu: iommu_get_group_resv_regions >> iommu: Implement reserved_regions iommu-group sysfs file >> iommu/vt-d: Implement reserved region get/put callbacks >> iommu/arm-smmu: Implement reserved region get/put callbacks >> vfio/type1: Get MSI cookie >> >> drivers/iommu/amd_iommu.c | 20 +++--- >> drivers/iommu/arm-smmu.c | 52 +++++++++++++++ >> drivers/iommu/dma-iommu.c | 116 ++++++++++++++++++++++++++------- >> drivers/iommu/intel-iommu.c | 50 ++++++++++---- >> drivers/iommu/iommu.c | 141 ++++++++++++++++++++++++++++++++++++---- >> drivers/vfio/vfio_iommu_type1.c | 26 ++++++++ >> include/linux/dma-iommu.h | 7 ++ >> include/linux/iommu.h | 49 ++++++++++---- >> 8 files changed, 391 insertions(+), 70 deletions(-) >> -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html