On Wed, Oct 23, 2024 at 12:22:15AM -0700, Nicolin Chen wrote:
> On Thu, Sep 05, 2024 at 03:01:19PM -0300, Jason Gunthorpe wrote:
> > On Tue, Aug 27, 2024 at 10:02:06AM -0700, Nicolin Chen wrote:
> > > The iommufd core provides a lookup helper for an IOMMU driver to find a
> > > device pointer by device's per-viommu virtual ID. Yet a driver may need
> > > an inverted lookup to find a device's per-viommu virtual ID by a device
> > > pointer, e.g. when reporting virtual IRQs/events back to the user space.
> > > In this case, it'd be unsafe for iommufd core to do an inverted lookup,
> > > as the driver can't track the lifecycle of a viommu object or a vdev_id
> > > object.
> > >
> > > Meanwhile, some HW can even support virtual device ID lookup by its HW-
> > > accelerated virtualization feature. E.g. Tegra241 CMDQV HW supports to
> > > execute vanilla guest-issued SMMU commands containing virtual Stream ID
> > > but requires software to configure a link between virtual Stream ID and
> > > physical Stream ID via HW registers. So not only the iommufd core needs
> > > a vdev_id lookup table, drivers will want one too.
> > >
> > > Given the two justifications above, it's the best practice to provide a
> > > pair of set_vdev_id/unset_vdev_id ops in the viommu ops, so a driver
> > > can implement them to control a vdev_id's lifecycle, and configure the
> > > HW properly if required.
> >
> > I think the lifecycle rules should be much simpler.
> >
> > If a nested domain is attached to a STE/RID/device then the vIOMMU
> > affiliated with that nested domain is pinned while the STE is in place.
> >
> > So the driver only needs to provide locking around attach changing the
> > STE's vIOMMU vs async operations translating from a STE to a
> > vIOMMU. This can be a simple driver lock of some kind, ie a rwlock
> > across the STE table.
> >
> > Generally that is how all the async events should work: go from the
> > STE to the vIOMMU to an iommufd callback to the iommufd event
> > queue. iommufd will translate the struct device from the driver to an
> > idev_id (or maybe even a vdevid) the same basic way the PRI code works.
>
> I am trying to draft something following this, and here is what
> it would look like:
>
> ------------------------draft---------------------------
> struct arm_smmu_master {
>         ....
> +       struct rw_semaphore lock;

It would be called vsmmu_lock

> +       struct arm_vsmmu *vsmmu;
>         ....
> };
>
> ->attach_dev() {
>         down_write(&master->lock);
>         if (domain->type == IOMMU_DOMAIN_NESTED)
>                 master->vsmmu = to_smmu_domain(domain)->vsmmu;
>         else
>                 master->vsmmu = NULL;
>         up_write(&master->lock);
> }
>
> isr() {
>         down_read(&master->lock);
>         if (master->vsmmu) {
>                 xa_lock(&master->vsmmu->core.vdevs);
>                 vdev = iommufd_viommu_dev_to_vdev(&master->vsmmu->core,
>                                                   master->dev);
>                 if (vdev) {
>                         struct iommu_virq_arm_smmuv3 virq_data = evt;
>
>                         virq_data.evt[0] &= ~EVTQ_0_SID;
>                         virq_data.evt[0] |= FIELD_PREP(EVTQ_0_SID, vdev->id);
>                         return iommufd_viommu_report_irq(
>                                 vdev->viommu,
>                                 IOMMU_VIRQ_TYPE_ARM_SMMUV3, &virq_data,
>                                 sizeof(virq_data));
>                 } else {
>                         rc = -ENOENT;
>                 }
>                 xa_unlock(&master->vsmmu->core.vdevs);
>         }
>         up_read(&master->lock);
> }
> --------------------------------------------------------

This looks reasonable

> [Comparison]      | This v1                 | Draft
> 1. Adds to master | A lock and vdev ptr     | A lock and viommu ptr
> 2. Set/unset ptr  | In ->vdevice_alloc/free | In all ->attach_dev
> 3. Do dev_to_vdev | master->vdev->id        | attach_handle?
The set/unset ops have the major issue that they can get out of sync
with the domain. The only time things should be routed to the viommu
is when a viommu related domain is attached.

The lock on attach can be reduced:

	iommu_group_mutex_assert(dev);
	if (domain->type == IOMMU_DOMAIN_NESTED)
		new_vsmmu = to_smmu_domain(domain)->vsmmu;
	else
		new_vsmmu = NULL;
	if (new_vsmmu != master->vsmmu) {
		down_write(&master->lock);
		master->vsmmu = new_vsmmu;
		up_write(&master->lock);
	}

And you'd stick this in prepare or commit..

> Both solutions needs a driver-level lock and an extra pointer in
> the master structure. And both ISR routines require that driver-
> level lock to avoid race against attach_dev v.s. vdev alloc/free.
> Overall, taking step.3 into consideration, it seems that letting
> master lock&hold the vdev pointer (i.e. this v1) is simpler?

I'm not sure the vdev pointer should even be visible to the drivers..

> As for the implementation of iommufd_viommu_dev_to_vdev(), I read
> the attach_handle part in the PRI code, yet I don't see the lock
> that protects the handle returned by iommu_attach_handle_get() in
> iommu_report_device_fault/find_fault_handler().

It is the xa_lock and some rules about flushing irqs and work queues
before allowing the dev to disappear:

>   "Callers are required to synchronize the call of
>    iommu_attach_handle_get() with domain attachment
>    and detachment. The attach handle can only be used
>    during its life cycle."
> But the caller iommu_report_device_fault() is an async event that
> cannot guarantee the lifecycle. Would you please shed some light?

The iopf detach function will act as a barrier to ensure that all
the async work has completed, sort of like how RCU works.

But here, I think it is pretty simple, isn't it?

When you update the master->vsmmu you can query the vsmmu to get the
vdev id of that master, then store it in the master struct and
forward it to the iommufd_viommu_report_irq(). That could even search
the xarray since attach is not a performance path.

Then it is locked under the master->lock.

Jason
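
To make that last suggestion concrete, here is a rough sketch of the
attach-time caching, reusing the vsmmu_lock name suggested above. The
vdev_id field in the master, the iommufd_viommu_get_vdev_id() helper,
and both function names are assumed placeholders for illustration, not
code from the posted series:

	/* Attach path, serialized by the iommu group mutex */
	static void arm_smmu_master_set_vsmmu(struct arm_smmu_master *master,
					      struct arm_vsmmu *new_vsmmu)
	{
		unsigned long vdev_id = 0;

		if (new_vsmmu == master->vsmmu)
			return;

		/* Resolve the virtual ID once, outside the event path */
		if (new_vsmmu)
			vdev_id = iommufd_viommu_get_vdev_id(&new_vsmmu->core,
							     master->dev);

		down_write(&master->vsmmu_lock);
		master->vsmmu = new_vsmmu;
		master->vdev_id = vdev_id;
		up_write(&master->vsmmu_lock);
	}

	/* Event path: no xarray lookup, just the cached ID */
	static int arm_smmu_report_vevent(struct arm_smmu_master *master,
					  u64 *evt)
	{
		struct iommu_virq_arm_smmuv3 virq_data;
		int rc = -ENOENT;

		down_read(&master->vsmmu_lock);
		if (master->vsmmu) {
			/* assuming virq_data.evt mirrors the raw event record */
			memcpy(virq_data.evt, evt, sizeof(virq_data.evt));
			virq_data.evt[0] &= ~EVTQ_0_SID;
			virq_data.evt[0] |= FIELD_PREP(EVTQ_0_SID,
						       master->vdev_id);
			rc = iommufd_viommu_report_irq(&master->vsmmu->core,
						       IOMMU_VIRQ_TYPE_ARM_SMMUV3,
						       &virq_data,
						       sizeof(virq_data));
		}
		up_read(&master->vsmmu_lock);
		return rc;
	}

Caching the ID at attach time keeps the xarray lookup out of the event
path entirely; the read side of vsmmu_lock is then the only thing the
ISR has to take to stay serialized against attach.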