Re: [PATCH 1/2] NFS: Add sequence_priviliged_ops for nfs4_proc_sequence()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/12/2012 03:54 PM, Myklebust, Trond wrote:
> On Mon, 2012-11-12 at 15:49 -0500, Andy Adamson wrote:
>> On Mon, Nov 12, 2012 at 3:29 PM, Myklebust, Trond
>> <Trond.Myklebust@xxxxxxxxxx> wrote:
>>> On Mon, 2012-11-12 at 15:22 -0500, Andy Adamson wrote:
>>>> On Mon, Nov 12, 2012 at 1:35 PM,  <bjschuma@xxxxxxxxxx> wrote:
>>>>> From: Bryan Schumaker <bjschuma@xxxxxxxxxx>
>>>>>
>>>>> During recovery the NFS4_SESSION_DRAINING flag might be set on the
>>>>> client structure.  This can cause lease renewal to abort when
>>>
>>> Not lease renewal. It is lease verification (i.e. checking that we have
>>> a valid lease and session) that will deadlock.
>>>
>>>>> nfs41_setup_sequence sees that we are doing recovery.  As a result, the
>>>>> client never recovers and all activity with the NFS server halts.
>>>>
>>>>
>>>> When does this happen? Say the session is draining, and an RPC returns
>>>> from one of the compounds such as nfs_open, nfs_locku etc whose
>>>> rpc_call_done routine calls renew_lease after freeing it's slot.  Like
>>>> all calls on a draining session, the renew_lease sleeps on the session
>>>> slot_tbl_waitq - is this what you mean by "causes lease renewal to
>>>> abort"? How does this cause the client to never recover?
>>>>
>>>> The only other call to renew_lease is from the state manager itself
>>>> which runs one function at a time, and will not call renew_lease until
>>>> the recovery is over....
>>>>
>>>> What am I missing....?
>>>
>>> nfs4_check_lease() is run exclusively by the state manager thread in
>>> order to check that the clientid (and session id on NFSv4.1) are valid.
>>> It will deadlock the state manager thread if NFS4_SESSION_DRAINING is
>>> already set.
>>
>> OK. NFS4_SESSION_DRAINING is also set exclusively by the state manager
>> thread. Why is the state manager told to check the lease when it's
>> also draining a session?
> 
> NFS4_SESSION_DRAINING does not just mean that the session is being
> drained; it remains set until open and lock state recovery is completed.
> 
> IOW: if something happens during state recovery that requires us to
> check the lease (e.g. something returns NFS4ERR_EXPIRED) then the
> current code will deadlock.
> 
>> No matter what the answer, please update the patch description to
>> better explain the problem being solved.
> 
> ACK. I agree that the changelog entry can be improved.
> 

Does this read any better (and should I resend the patch)?

NFS: Add sequence_priviliged_ops for nfs4_proc_sequence()

If I mount an NFS v4.1 server to a single client multiple times and then
run xfstests over each mountpoint I usually get the client into a state
where recovery deadlocks.  The server informs the client of a
cb_path_down sequence error, the client then does a
bind_connection_to_session and checks the status of the lease.

I found that bind_connection_to_session sets the NFS4_SESSION_DRAINING
flag on the client, but this flag is never unset before
nfs4_check_lease() reaches nfs4_proc_sequence().  This causes the client
to deadlock, halting all NFS activity to the server.


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux