This series improves the way drivers get their FW data. It introduces a series of driver facing APIs and converts every driver to use them and replaces fwspec with this mechanism. The new API allows drivers to do less stuff in alot of cases, and permits all drivers to sequence their operations in a logical way with correct error handling. It doesn't force the idea of pre or post parse on the driver, though I choose to implement post-parse here. Aside from the API change this takes on a few other goals: 1) New API for drivers to simplify drivers and make them more robust 2) Remove all FW parsing from _dma_configure() 3) Make the global lock back into a static 4) Organize things to reduce the probe time memory allocations from fw_spec, N*realloc, dev_iommu, device-private To just allocating device-private. The last step to merge dev_iommu and device-private would be a follow on couple of patches. 5) Get everything setup to allow all devices to probe during bus scan, not during _dma_configure. 6) Better layering between the ACPI and IOMMU code, less iommu related things in the ACPI directory. The drivers break down into two broad approaches for accessing FW in a driver's probe_device() function. AMD, Intel, mtk_v1, omap, S390, smmu-legacy and tegra can "self probe" using code in their probe_device() functions to parse their associated FW descriptions. The rest (DART, smmu-modern, qcom, smmu-v3, exynos, ipmmu, msm, mtk, rockchip, sprd, sun50i, virtio) use the iommu_fwspec system and rely on the core code to do the FW parsing for them before invoking probe_device(). To understand the philosophy of this design it is worth to take a moment and review the 2009 LWN article on Kernel Design patterns (https://lwn.net/Articles/336262/) exploring the "midlayer mistake". While I do not entirely agree that midlayers are unconditionally bad, the discussion does nicely frame the architectural choice I've made in this series. Organize the FW parsing so the drivers have an toolbox style API to extract the information they require from the FW. Create a set of APIs for the drivers to call that do exactly what the current set of drivers need in an efficient and understandable way. The API is designed to significantly clean and simplify the drivers. The API transforms the iommu_fwspec from a long-term allocated struct into a short-term on-stack struct iommu_probe_info. Using a short-term structure means we no longer have to be memory efficient and can store temporary details about the ongoing parsing in the structure which is the key to allowing the new API. Further, philosophically, the API changes to make the iommu driver responsible to provide per-device storage for any parsed FW data. It is stored in whatever format the driver finds appropriate. The API provides concise helpers to make this easy for the standard 'u32 id[]' cases. The unusual cases were already using unique per-driver storage and continue to do so in a cleaner, less convoluted way. The design allows the API to be implemented either as a 'parse on demand' or 'parse and cache in advance' (more like fwspec) approach. I've implemented the 'parse on demand' version here, but I'm not fixed on it. Let's discuss. The choice is deliberately not baked into the driver facing API, iommu_probe_info combined with the generic *_iommu_for_each_id() lower layer provides a lot of flexibility to do whatever organization we want. The specific commit messages provide details, but the drivers break down into three basic idiomatic sequences for their probe_device with the new API. Single iommu with no ID list: iommu_driver = iommu_of_get_single_iommu(pinf, ops, num_cells, iommu_driver_struct, iommu_member); per_driver = kzalloc(..) per_driver->iommu = iommu_driver; dev_iommu_priv_set(dev, per_driver); return &iommu_driver->iommu_member; Single iommu with a simple u32 ID list: iommu_driver = iommu_of_get_single_iommu(pinf, ops, num_cells, iommu_driver_struct, iommu_member); per_driver = iommu_fw_alloc_per_device_ids(pinf, per_driver); per_driver->iommu = iommu_driver; dev_iommu_priv_set(dev, per_driver); return &iommu_driver->iommu_member; The iommu_fw_alloc_per_device_ids() helper allocates a correctly sized per-driver struct and places a u32 ID list into the flex array per_driver->ids[]. This removes the need for a fw_spec allocation. Complex multi-iommu or complex ID list: static int driver_of_xlate(struct iommu_device *iommu, struct of_phandle_args *args, void *priv); per_driver = kzalloc(...); iommu_of_xlate(pinf, &ops, num_cells, &driver_of_xlate, per_driver); dev_iommu_priv_set(dev, per_driver); return &per_driver->iommu; // The first iommu_device parsed The driver will process the given (iommu, args) tuples and store them into the per_driver struct in its own way (eg DART encodes things into a 2D array). Allocating the per_driver struct before running the parse cleans all the tortured logic. The VIOT and IORT ACPI parsers provide the same API. Drivers supporting ACPI will call the ACPI helper (ie iommu_iort_get_single_iommu()) which understands how to parse OF if no ACPI is present. Since the SMMU drivers directly invoke the IORT parser they also get back the extra IORT ACPI data in a "struct iort_params", which eliminates the need to create SW properties or store SMMU only stuff in the fwspec. The bulk of the series is converting each driver to this toolbox API. The design of the toolbox functions are lightweight from a driver perspective and generally replace existing steps that the drivers already had to do. A significant amount of inefficient boiler plate is removed from all drivers related to how the driver obtains the iommu_device pointers. The implementation of the three flavours of FW parsers (OF/IORT/VIOT) are all structured the same. The lower FW level provides a *_iommu_for_each_id() function which parses and iterates over the actual FW table and calls a function pointer for each (instance, id). The iommu layer then uses this for_each primitive to do the IOMMU specific stuff and create the above APIs. To allow the iommu_of_get_single_iommu()/iommu_fw_alloc_per_device_ids() split API the get_single will count all of the IDs and then alloc will size the flex array. The iommu_probe_info has scratch space for some ids which can then be memcpy'd without a reparse. Reparse is kept as a backup so we can handle arbitary complexity without burdening the typical fast path. The removal of all FW parsing from _dma_configure leaves only the initialization of the iommu_probe_info behind. My plan is to add a new bus op 'init iommu probe info' that will do this step and then we can fully execute probe outside the *_dma_configure() context. The big win here is the extensive cleaning of the driver's probe paths. There is alot of "creative" stuff there, this cleans almost all of it away. This is on github: https://github.com/jgunthorpe/linux/commits/iommu_fwspec Cc: Hector Martin <marcan@xxxxxxxxx> Cc: André Draszik <andre.draszik@xxxxxxxxxx> Signed-off-by: Jason Gunthorpe <jgg@xxxxxxxxxx> Jason Gunthorpe (30): iommu/of: Make a of_iommu_for_each_id() ACPI: VIOT: Make a viot_iommu_for_each_id() ACPI: IORT: Make a iort_iommu_for_each_id() ACPI: IORT: Remove fwspec from the reserved region code iommu: Add iommu_probe_info iommu: Make iommu_ops_from_fwnode() return the iommu_device iommu/of: Call of_iommu_get_resv_regions() directly iommu/of: Add iommu_of_get_single_iommu() iommu/rockchip: Move to iommu_of_get_single_iommu() iommu/sprd: Move to iommu_of_get_single_iommu() iommu/sun50i: Move to iommu_of_get_single_iommu() iommu/of: Add iommu_of_xlate() iommu/dart: Move to iommu_of_xlate() iommu/exynos: Move to iommu_of_xlate() iommu/msm: Move to iommu_of_xlate() iommu/tegra: Route tegra_dev_iommu_get_stream_id() through an op iommu: Add iommu_fw_alloc_per_device_ids() iommu/tegra: Move to iommu_fw_alloc_per_device_ids() iommu/mtk: Move to iommu_fw_alloc_per_device_ids() iommu/ipmmu-vmsa: Move to iommu_fw_alloc_per_device_ids() iommu/mtk_v1: Move to iommu_fw_alloc_per_device_ids() iommu/qcom: Move to iommu_fw_alloc_per_device_ids() iommu/viot: Add iommu_viot_get_single_iommu() iommu/virtio: Move to iommu_fw_alloc_per_device_ids() iommu/iort: Add iommu_iort_get_single_iommu() iommu/arm-smmu-v3: Move to iommu_fw_alloc_per_device_ids() iommu/arm-smmu: Move to iommu_of_xlate() iommu: Call all drivers if there is no fwspec iommu: Check for EPROBE_DEFER using the new FW parsers iommu: Remove fwspec and related drivers/acpi/arm64/iort.c | 233 +++++++--------- drivers/acpi/scan.c | 58 +--- drivers/acpi/viot.c | 67 ++--- drivers/iommu/Makefile | 2 + drivers/iommu/apple-dart.c | 62 +++-- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 73 ++--- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 + drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 6 +- drivers/iommu/arm/arm-smmu/arm-smmu.c | 209 +++++++------- drivers/iommu/arm/arm-smmu/arm-smmu.h | 13 +- drivers/iommu/arm/arm-smmu/qcom_iommu.c | 160 +++++------ drivers/iommu/dma-iommu.c | 22 -- drivers/iommu/dma-iommu.h | 6 - drivers/iommu/exynos-iommu.c | 78 +++--- drivers/iommu/iommu.c | 271 ++++++++++++------ drivers/iommu/iort_iommu.c | 99 +++++++ drivers/iommu/ipmmu-vmsa.c | 95 +++---- drivers/iommu/msm_iommu.c | 92 +++---- drivers/iommu/mtk_iommu.c | 115 ++++---- drivers/iommu/mtk_iommu_v1.c | 162 +++++------ drivers/iommu/of_iommu.c | 288 ++++++++++++++------ drivers/iommu/rockchip-iommu.c | 73 ++--- drivers/iommu/sprd-iommu.c | 31 +-- drivers/iommu/sun50i-iommu.c | 59 ++-- drivers/iommu/tegra-smmu.c | 158 ++++------- drivers/iommu/viot_iommu.c | 71 +++++ drivers/iommu/virtio-iommu.c | 71 ++--- include/acpi/acpi_bus.h | 3 - include/linux/acpi_iort.h | 24 +- include/linux/acpi_viot.h | 16 +- include/linux/iommu-driver.h | 250 +++++++++++++++++ include/linux/iommu.h | 101 +------ include/linux/of_iommu.h | 8 - 33 files changed, 1649 insertions(+), 1331 deletions(-) create mode 100644 drivers/iommu/iort_iommu.c create mode 100644 drivers/iommu/viot_iommu.c create mode 100644 include/linux/iommu-driver.h base-commit: 68ec454bc1514f557686b5895dd9719e18d31705 -- 2.42.0