On Wed, Feb 23, 2022 at 10:30:11AM -0400, Jason Gunthorpe wrote:
> On Wed, Feb 23, 2022 at 10:09:01AM -0400, Jason Gunthorpe wrote:
> > On Wed, Feb 23, 2022 at 03:06:35PM +0100, Greg Kroah-Hartman wrote:
> > > On Wed, Feb 23, 2022 at 09:46:27AM -0400, Jason Gunthorpe wrote:
> > > > On Wed, Feb 23, 2022 at 01:04:00PM +0000, Robin Murphy wrote:
> > > >
> > > > > 1 - tmp->driver is non-NULL because tmp is already bound.
> > > > > 1.a - If tmp->driver->driver_managed_dma == 0, the group must currently be
> > > > > DMA-API-owned as a whole. Regardless of what driver dev has unbound from,
> > > > > its removal does not release someone else's DMA API (co-)ownership.
> > > >
> > > > This is an uncommon locking pattern, but it does work. It relies on
> > > > the mutex being an effective synchronization barrier for an unlocked
> > > > store:
> > > >
> > > > WRITE_ONCE(dev->driver, NULL)
> > >
> > > Only the driver core should be messing with the dev->driver pointer as
> > > when it does so, it already has the proper locks held. Do I need to
> > > move that to a "private" location so that nothing outside of the driver
> > > core can mess with it?
> >
> > It would be nice, I've seen a abuse and mislocking of it in drivers
>
> Though to be clear, what Robin is describing is still keeping the
> dev->driver stores in dd.c, just reading it in a lockless way from
> other modules.

"other modules" should never care if a device has a driver bound to it,
because the instant after the check happens, it can change, so whatever
logic wanted to use that knowledge is now working from stale
information.

The exception is when the lock for the bus that the device is on is
held, but that should only be accessible from within the driver core,
as it controls that type of stuff, not any random other part of the
kernel.

And in looking at this, ick, there are loads of places in the kernel
that think that this pointer being set to something actually means
something. Sometimes it does, but in lots of places it doesn't, as it
can change.
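To illustrate the check-then-act problem described above, here is a
minimal userspace sketch. It is not kernel code: struct fake_device,
struct fake_driver and every fake_*() helper are hypothetical stand-ins
invented for this example, and a pthread mutex plays the role of the
lock that the driver core would hold. It only assumes that all stores
to the pointer happen under that lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's structures. */
struct fake_driver {
	const char *name;
};

struct fake_device {
	pthread_mutex_t lock;                 /* stands in for the core's lock */
	_Atomic(struct fake_driver *) driver; /* NULL while unbound */
};

static void fake_device_init(struct fake_device *dev)
{
	pthread_mutex_init(&dev->lock, NULL);
	atomic_init(&dev->driver, NULL);
}

/* Bind and unbind are the only writers, and always hold the lock. */
static void fake_bind(struct fake_device *dev, struct fake_driver *drv)
{
	pthread_mutex_lock(&dev->lock);
	atomic_store_explicit(&dev->driver, drv, memory_order_relaxed);
	pthread_mutex_unlock(&dev->lock);
}

static void fake_unbind(struct fake_device *dev)
{
	pthread_mutex_lock(&dev->lock);
	atomic_store_explicit(&dev->driver, NULL, memory_order_relaxed);
	pthread_mutex_unlock(&dev->lock);
}

/*
 * Racy pattern: a lockless snapshot. The value is well defined, but it
 * may already be stale by the time the caller looks at it.
 */
static int fake_is_bound_unlocked(struct fake_device *dev)
{
	return atomic_load_explicit(&dev->driver, memory_order_relaxed) != NULL;
}

/*
 * Safer pattern: check and act inside one critical section, so the
 * driver cannot be unbound between the check and the callback.
 * Returns 0 if fn ran, -1 if no driver was bound.
 */
static int fake_with_driver(struct fake_device *dev,
			    void (*fn)(struct fake_driver *, void *),
			    void *arg)
{
	struct fake_driver *drv;
	int ret = -1;

	assert(fn != NULL);
	pthread_mutex_lock(&dev->lock);
	drv = atomic_load_explicit(&dev->driver, memory_order_relaxed);
	if (drv) {
		fn(drv, arg);
		ret = 0;
	}
	pthread_mutex_unlock(&dev->lock);
	return ret;
}

/* Example callback: stores the driver name into *(const char **)arg. */
static void fake_record_name(struct fake_driver *drv, void *arg)
{
	*(const char **)arg = drv->name;
}
```

A caller that saves the result of fake_is_bound_unlocked() and then acts
on it later can be wrong if fake_unbind() ran in between, whereas
fake_with_driver() only ever runs the callback while the binding is
guaranteed to be in place.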
In a semi-related incident, we currently have a syzbot failure in the
usb gadget code, where it was manipulating the ->driver pointer directly
and other parts of the kernel are crashing as a result. See
https://lore.kernel.org/r/PH0PR11MB58805E3C4CF7D4C41D49BFCFDA3C9@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
for the thread.

I'll poke at this as a background task to try to clean it up over time.

thanks,

greg k-h