On Thu, Jun 02, 2022 at 10:45:55AM -0600, Logan Gunthorpe wrote:
>
> On 2022-06-02 10:30, Jason Gunthorpe wrote:
> > On Thu, Jun 02, 2022 at 10:16:10AM -0600, Logan Gunthorpe wrote:
> >
> >>> Just stuff the pages into the mmap, and your driver unprobe will
> >>> automatically block until all the mmaps are closed - no different than
> >>> having an open file descriptor or something.
> >>
> >> Oh is that what we want?
> >
> > Yes, it is the typical case - eg if you have a sysfs file open unbind
> > hangs indefinitely. Many drivers can't unbind while they have open file
> > descriptors/etc.
> >
> > A couple drivers go out of their way to allow unbinding while a live
> > userspace exists but this can get complicated. Usually there should be
> > a good reason.
>
> This is not my experience. All the drivers I've worked with do not block
> unbind with open file descriptors (at least for char devices). I know,
> for example, that having a file descriptor open of /dev/nvmeX does not
> cause unbinding to block.

So there are lots of bugs in the kernel, and I've seen many drivers
that think calling cdev_device_del() is all they need to do - and then
happily allow cdev ioctls/etc on a de-initialized driver struct.

Drivers that do take care of this usually have to put a lock around
all their fops to serialize against unbind. RDMA uses SRCU, iirc TPM
used a rwlock. But this is tricky and hurts fops performance.

I don't know what nvme did to protect against this, I didn't notice an
obvious lock.

> I figured this was the expectation as the userspace process doing
> the unbind won't be able to be interrupted seeing there's no way to
> fail on that path. Though, it certainly would make things a lot
> easier if the unbind can block indefinitely as it usually requires
> some complicated locking.

As I said, this is what sysfs does today and I don't see that ever
changing. If your userspace has a sysfs file open then the driver
unbind hangs until the file is closed.
So, doing as bad as sysfs seems like a reasonable baseline to me.

> Do you have an example of this? What mechanisms are developers using to
> block unbind with open file descriptors?

Sysfs maintains a refcount with a bias that is basically a fancy
rwlock.

Most places use some kind of refcount triggering a completion. Sleep
on the completion until the refcount is 0 on unbind kind of thing.

Jason
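
For illustration, here is a minimal userspace sketch of the "refcount
triggering a completion" pattern described above. It is not taken from any
driver: struct demo_dev and the demo_dev_*() helpers are hypothetical names,
and a pthread mutex plus condition variable stand in for what kernel code
would more likely build from a refcount and a struct completion.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of a device whose unbind waits for its users. */
struct demo_dev {
	pthread_mutex_t lock;
	pthread_cond_t released;	/* stands in for a completion */
	unsigned int users;		/* open fds / live mmaps */
	bool dying;			/* set once unbind has started */
};

static void demo_dev_init(struct demo_dev *d)
{
	pthread_mutex_init(&d->lock, NULL);
	pthread_cond_init(&d->released, NULL);
	d->users = 0;
	d->dying = false;
}

/* open()/mmap() path: take a reference, refused once unbind has begun. */
static bool demo_dev_get(struct demo_dev *d)
{
	bool ok;

	pthread_mutex_lock(&d->lock);
	ok = !d->dying;
	if (ok)
		d->users++;
	pthread_mutex_unlock(&d->lock);
	return ok;
}

/* release()/munmap() path: drop a reference, wake unbind on the last one. */
static void demo_dev_put(struct demo_dev *d)
{
	pthread_mutex_lock(&d->lock);
	assert(d->users > 0);	/* a put without a matching get is a bug */
	if (--d->users == 0)
		pthread_cond_broadcast(&d->released);
	pthread_mutex_unlock(&d->lock);
}

/* unbind path: stop new users, then sleep until the count reaches 0. */
static void demo_dev_unbind(struct demo_dev *d)
{
	pthread_mutex_lock(&d->lock);
	d->dying = true;
	while (d->users > 0)
		pthread_cond_wait(&d->released, &d->lock);
	pthread_mutex_unlock(&d->lock);
}
```

As with the sysfs behaviour discussed above, demo_dev_unbind() in this sketch
blocks for as long as any user holds a reference, so a process that never
closes its file descriptor stalls the unbind indefinitely.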