On Fri, 2017-02-17 at 14:15 -0500, Benjamin Coddington wrote: > On 17 Feb 2017, at 14:00, Trond Myklebust wrote: > > > On Fri, 2017-02-17 at 13:46 -0500, Benjamin Coddington wrote: > > > NFS attempts to wait for read and write completion before > > > unlocking > > > in > > > order to ensure that the data returned was protected by the > > > lock. When > > > this waiting is interrupted by a signal, the unlock may never be > > > sent, and > > > messages similar to the following are seen in the kernel ring > > > buffer: > > > > > > [20.167876] Leaked locks on dev=0x0:0x2b ino=0x8dd4c3: > > > [20.168286] POSIX: fl_owner=ffff880078b06940 fl_flags=0x1 > > > fl_type=0x0 > > > fl_pid=20183 > > > [20.168727] POSIX: fl_owner=ffff880078b06680 fl_flags=0x1 > > > fl_type=0x0 > > > fl_pid=20185 > > > > > > For NFSv3, the missing unlock will cause the server to refuse > > > conflicting > > > locks indefinitely. For NFSv4, the leftover lock will be removed > > > by > > > the > > > server after the lease timeout. > > > > > > This patch fixes this for NFSv3 by skipping the wait in order to > > > immediately send the unlock if the FL_CLOSE flag is set when > > > signaled. For > > > NFSv4, this approach may cause the server to see the I/O as > > > arriving > > > with > > > an old stateid, so, for the v4 case the fix is different: the > > > wait on > > > I/O > > > completion is moved into nfs4_locku_ops' > > > rpc_call_prepare(). This > > > will > > > cause the sleep to happen in rpciod context, and a signal to the > > > originally > > > waiting process will not cause the unlock to be skipped. > > > > NACK. I/O waits in rpciod contexts are NOT acceptable. rpciod is > > part > > of the memory reclaim chain, so having it sleep on I/O is deadlock > > prone. > > > > Why is there a need to wait for I/O completion in the first place > > if > > the user has killed the task that held the lock? 'kill -9' will > > cause > > corruption; that's a fact that no amount of paper will cover over. > > To avoid an unnecessary recovery situation where the server asks us > to resend > I/O due to an invalid stateid. > I agree we shouldn't recover in this situation. It would be better to jettison the failed write, and invalidate the page. Can we make use of nfs_wb_page_cancel() together with generic_error_remove_page()? -- Trond Myklebust Linux NFS client maintainer, PrimaryData trond.myklebust@xxxxxxxxxxxxxxx ��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥