2014-03-21 6:23 GMT+04:00 Steven Rostedt <rostedt@xxxxxxxxxxx>:
> On Thu, 20 Mar 2014 17:02:39 -0400
> Jeff Layton <jlayton@xxxxxxxxxx> wrote:
>
>> Eventually the server should just allow the read to complete even if
>> the client doesn't respond to the oplock break. It has to since clients
>> can suddenly drop off the net while holding an oplock. That should
>> allow everything to unwedge eventually (though it may take a while).
>>
>> If that's not happening then I'd be curious as to why...
>
> The problem is that the data is being filled in the page and the reader
> is waiting for the page lock to be released. The kworker for the reader
> will issue the complete() and unlock the page to wake up the reader.
>
> But because the other workqueue callback calls down_read(), and there
> can be a down_write() waiting for the reader to finish, this
> down_read() will block on the lock as well (rwsems are fair locks).
> This blocks the other workqueue callback from issuing the complete and
> page_unlock() that will wake up the reader that is holding the rwsem
> with down_read().
>
> DEADLOCK.

Thank you for reporting and clarifying the issue!

The read and write codepaths both take lock_sem for read, then wait
for their work queued on cifsiod_wq to complete before releasing
lock_sem. They don't do any lock_sem operations inside the work tasks
they queue to cifsiod_wq. The oplock code, however, can take and
release lock_sem inside its work task. That's why I agree with Jeff
and suggest moving the oplock code to a different work queue
(cifsioopd_wq?) while leaving the read and write codepaths on
cifsiod_wq.

-- 
Best regards,
Pavel Shilovsky.
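
To make the dependency chain above easier to follow, here is a small userspace model of it as a wait-for graph. It is only a sketch under stated assumptions, not kernel code: the node names and the helpers build_wait_for() and find_cycle() are made up for illustration, and it assumes the read completion work and the oplock work end up serviced by the same worker when they share a queue. An edge "A -> B" means "A cannot make progress until B does".

```python
def build_wait_for(shared_queue):
    """Build the wait-for graph for the scenario in this thread.

    shared_queue=True models oplock work queued on the same queue
    (and, by assumption, the same worker) as the read completion work.
    """
    graph = {
        # Reader holds lock_sem for read and waits for the page to be
        # unlocked by the read completion work.
        "reader": ["read_completion_work"],
        "read_completion_work": [],
        # Oplock work calls down_read(); with a fair rwsem it queues
        # behind the pending down_write().
        "oplock_work": ["writer"],
        # down_write() waits for the reader to release lock_sem.
        "writer": ["reader"],
    }
    if shared_queue:
        # Assumption: the completion work is stuck behind the blocked
        # oplock work on the shared queue.
        graph["read_completion_work"].append("oplock_work")
    return graph


def find_cycle(waits_for):
    """Return one cycle as a list of nodes, or None if there is none."""
    path, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in path:
            # Found a back edge: slice out the cycle and close it.
            return path[path.index(node):] + [node]
        path.append(node)
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        done.add(node)
        return None

    for start in waits_for:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


if __name__ == "__main__":
    print("shared queue:", find_cycle(build_wait_for(True)))
    print("split queues:", find_cycle(build_wait_for(False)))
```

In this model the shared-queue case contains the cycle reader -> read completion work -> oplock work -> writer -> reader, while the split-queue case has no cycle, because the read completion work no longer depends on anything that touches lock_sem.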