On Fri, Aug 18, 2023 at 08:24:01PM +0200, Joerg Roedel wrote: > On Fri, Aug 18, 2023 at 01:06:43PM -0300, Jason Gunthorpe wrote: > > On Fri, Aug 18, 2023 at 05:56:20PM +0200, Joerg Roedel wrote: > > > On Thu, Aug 17, 2023 at 03:33:16PM -0300, Jason Gunthorpe wrote: > > > > Bascially.. Yikes! > > > > > > Hmm, that is a difficult situation. Even if the problem is a misuse of > > > the APIs we can not just blindly break other drivers by our core > > > changes. > > > > They are not broken, they just throw a lockdep warning and keep going > > as before. This is what triggers: > > > > static inline void device_lock_assert(struct device *dev) > > { > > lockdep_assert_held(&dev->mutex); > > } > > > > So non-debug builds won't even see anything. > > But this still means that a function is called without holding the > proper lock. It has alway been like that. of_dma_configure_id() always required the device_lock to be held! eg as one example: of_iommu_configure of_iommu_configure_device of_iommu_configure_dev of_iommu_xlate iommu_fwspec_init dev_iommu_get It is subtle, but the device_lock is what protects the store to dev->iommu inside dev_iommu_get(). In v6.5-rc1 many callers held the device lock here, and after this series only these broken drivers don't. The driver assumes it has exclusive use of the platform device it steals. Not just for this call, but in general it does other stuff that rests on this assumption. Since it is exclusive it doesn't actually need any locking - this is why it works reliably as is today. Again, there is no practical bug here, the driver works fine. On non-debug kernels there is no warning or functional issue. Debug kernels rightly highlight that the API is being used wrong. It is wrong use of the APIs because someone could go and use sysfs to attach, say, VFIO to the stolen platform_device and cause all kinds of kernel problems. > I can't send anything with known problems upstream. In my view this is not creating a new problem. It is exposing existing problems with a debugging message only on debug kernels. However, if you view the debug message as a problem then I suggest we simply comment out with a note the device_lock_assert() from the iommu code. This would be sad because the vast majority of systems don't use these badly designed drivers. This puts it back to the v6.5-rc1 behavior where of_dma_configure_id() won't make any prints if it is called with wrong locking. Please let me know. Jason