On Wed, Nov 22, 2023 at 02:44:53PM +0000, Greg KH wrote:
> On Thu, Nov 16, 2023 at 01:05:58PM -0400, Jason Gunthorpe wrote:
> > On Wed, Nov 15, 2023 at 06:25:44PM +0000, Robin Murphy wrote:
> > > It turns out there are more subtle races beyond just the main part of
> > > __iommu_probe_device() itself running in parallel - the dev_iommu_free()
> > > on the way out of an unsuccessful probe can still manage to trip up
> > > concurrent accesses to a device's fwspec. Thus, extend the scope of
> > > iommu_probe_device_lock() to also serialise fwspec creation and initial
> > > retrieval.
> > >
> > > Reported-by: Zhenhua Huang <quic_zhenhuah@xxxxxxxxxxx>
> > > Link: https://lore.kernel.org/linux-iommu/e2e20e1c-6450-4ac5-9804-b0000acdf7de@xxxxxxxxxxx/
> > > Fixes: 01657bc14a39 ("iommu: Avoid races around device probe")
> > > Signed-off-by: Robin Murphy <robin.murphy@xxxxxxx>
> > > ---
> > >
> > > This is my idea of a viable fix, since it does not need a 700-line
> > > diffstat to make the code do what it was already *trying* to do anyway.
> > > This stuff should fundamentally not be hanging off driver probe in the
> > > first place, so I'd rather get on with removing the underlying
> > > brokenness than waste time and effort polishing it any further.
> >
> > I'm fine with this as some hacky backport, but I don't want to see
> > this cross-layer leakage left in the next merge window.
> >
> > ie we should still do my other series on top of and reverting this.
> >
> > I've poked at moving parts of it under probe and I think we can do
> > substantial amounts in about two more series and a tidy a bunch of
> > other stuff too.
>
> I agree, it's messy and acpi should not need this, BUT at the moment, I
> can't see any other way to resolve this simply.
>
> So here's a begrudged ack:
>
> Acked-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
>
> and hopefully the larger series should resolve this correctly? Can that
> be rebased on top of this?

Yeah, I'm working on something more along the lines of Robin's desire
for a full reorganization. The existing series has been tested by a
few people now. We can decide which order to put things in maybe next
week if I get the new approach done..

> Also, cc: stable on this for whomever applies it?

Also please update the commit message, the text from here does
describe the race:

https://lore.kernel.org/linux-iommu/11-v2-36a0088ecaa7+22c6e-iommu_fwspec_jgg@xxxxxxxxxx/

Jason