On Wed, Nov 15, 2023 at 06:25:44PM +0000, Robin Murphy wrote:
> It turns out there are more subtle races beyond just the main part of
> __iommu_probe_device() itself running in parallel - the dev_iommu_free()
> on the way out of an unsuccessful probe can still manage to trip up
> concurrent accesses to a device's fwspec. Thus, extend the scope of
> iommu_probe_device_lock() to also serialise fwspec creation and initial
> retrieval.
>
> Reported-by: Zhenhua Huang <quic_zhenhuah@xxxxxxxxxxx>
> Link: https://lore.kernel.org/linux-iommu/e2e20e1c-6450-4ac5-9804-b0000acdf7de@xxxxxxxxxxx/
> Fixes: 01657bc14a39 ("iommu: Avoid races around device probe")
> Signed-off-by: Robin Murphy <robin.murphy@xxxxxxx>
> ---
>
> This is my idea of a viable fix, since it does not need a 700-line
> diffstat to make the code do what it was already *trying* to do anyway.
> This stuff should fundamentally not be hanging off driver probe in the
> first place, so I'd rather get on with removing the underlying
> brokenness than waste time and effort polishing it any further.

I'm fine with this as some hacky backport, but I don't want to see
this cross-layer leakage left in the next merge window, i.e. we should
still do my other series on top of, and reverting, this.

I've poked at moving parts of it under probe and I think we can do
substantial amounts in about two more series, and tidy up a bunch of
other stuff too.

Jason