On Mon, Aug 19, 2019 at 09:38:41AM -0300, Jason Gunthorpe wrote:
> On Mon, Aug 19, 2019 at 07:24:09PM +1000, Dave Chinner wrote:
>
> > So that leaves just the normal close() syscall exit case, where the
> > application has full control of the order in which resources are
> > released. We've already established that we can block in this
> > context. Blocking in an interruptible state will allow fatal signal
> > delivery to wake us, and then we fall into the
> > fatal_signal_pending() case if we get a SIGKILL while blocking.
>
> The major problem with RDMA is that it doesn't always wait on close() for the
> MR holding the page pins to be destroyed. This is done to avoid a
> deadlock of the form:
>
>    uverbs_destroy_ufile_hw()
>       mutex_lock()
>        [..]
>         mmput()
>          exit_mmap()
>           remove_vma()
>            fput();
>             file_operations->release()

I think this is wrong, and I'm pretty sure it's an example of why
the final __fput() call is moved out of line.

	fput()
	  fput_many()
	    task_work_add(f, __fput())

and the call chain ends there.

Before the syscall returns to userspace, it then runs the __fput()
call through the task_work_run() interfaces, and hence the call
chain is just:

	task_work_run
	  __fput
>             file_operations->release()
>              ib_uverbs_close()
>               uverbs_destroy_ufile_hw()
>                mutex_lock()   <-- Deadlock

And there is no deadlock because nothing holds the mutex at this
point.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
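
For illustration only, the argument above can be sketched as a small
userspace model. This is not kernel code: every name in it (model_file,
model_fput(), model_release(), model_task_work_run(), simulate_close())
is hypothetical, and the "mutex" is just a flag that records whether a
second lock attempt happened while it was held. It assumes the final
reference is dropped while the mutex is held, as in the quoted call
chain, and compares running release inline against deferring it until
"syscall exit".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these are real kernel definitions. */
struct model_file;
typedef void (*model_work_fn)(struct model_file *);

struct model_file {
    int refcount;           /* models the file reference count */
    model_work_fn pending;  /* models work queued for syscall exit */
    bool released;          /* set once the release path has run */
};

struct sim_result {
    bool deadlocked;        /* lock was requested while already held */
    bool released;          /* release path ran to completion */
};

static bool mutex_held;     /* models the mutex in the destroy path */
static bool deadlocked;

static void model_mutex_lock(void)
{
    if (mutex_held)
        deadlocked = true;  /* a real mutex would block forever here */
    mutex_held = true;
}

static void model_mutex_unlock(void)
{
    mutex_held = false;
}

/* Models ->release(), which ends up taking the same mutex. */
static void model_release(struct model_file *f)
{
    model_mutex_lock();
    f->released = true;
    model_mutex_unlock();
}

/* Drop a reference; on the last one, either defer release or run it inline. */
static void model_fput(struct model_file *f, bool deferred)
{
    assert(f->refcount > 0);
    if (--f->refcount > 0)
        return;
    if (deferred)
        f->pending = model_release;  /* the call chain ends here */
    else
        model_release(f);            /* the chain the quoted text assumes */
}

/* Models running queued work just before returning to userspace. */
static void model_task_work_run(struct model_file *f)
{
    model_work_fn fn = f->pending;

    f->pending = NULL;
    if (fn)
        fn(f);
}

static struct sim_result simulate_close(bool deferred)
{
    struct model_file f = { .refcount = 1, .pending = NULL, .released = false };
    struct sim_result res;

    mutex_held = false;
    deadlocked = false;

    model_mutex_lock();        /* destroy path takes the mutex */
    model_fput(&f, deferred);  /* final reference dropped under the mutex */
    model_mutex_unlock();

    model_task_work_run(&f);   /* syscall exit: nothing holds the mutex now */

    res.deadlocked = deadlocked;
    res.released = f.released;
    return res;
}
```

In this model, simulate_close(true) runs the release path with the mutex
free, while simulate_close(false) records a second lock attempt on a
mutex that is already held, which is the deadlock shape in the quoted
chain.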