On Sat, 2016-09-17 at 15:16 -0400, Oleg Drokin wrote:
> On Sep 17, 2016, at 2:18 PM, Trond Myklebust wrote:
>
> > > On Sep 17, 2016, at 14:04, Oleg Drokin <green@xxxxxxxxxxxxxx> wrote:
> > >
> > > On Sep 17, 2016, at 1:13 AM, Trond Myklebust wrote:
> > >
> > > > According to RFC5661, if any of the SEQUENCE status bits
> > > > SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED,
> > > > SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED,
> > > > SEQ4_STATUS_ADMIN_STATE_REVOKED,
> > > > or SEQ4_STATUS_RECALLABLE_STATE_REVOKED are set, then we need to use
> > > > TEST_STATEID to figure out which stateids have been revoked, so we
> > > > can acknowledge the loss of state using FREE_STATEID.
> > > >
> > > > While we already do this for open and lock state, we have not been doing
> > > > so for all the delegations.
> > > >
> > > > v2: nfs_v4_2_minor_ops needs to set .test_and_free_expired too
> > > > v3: Now with added lock revoke fixes and close/delegreturn/locku fixes
> > > > v4: Close a bunch of corner cases
> > > > v5: Report revoked delegations as invalid in nfs_have_delegation()
> > > >     Fix an infinite loop in nfs_reap_expired_delegations.
> > > >     Fixes for other looping behaviour
> > >
> > > This time around the loop seems to be more tight,
> > > in userspace process:
> > >
> > > [ 9197.256571] --> nfs41_call_sync_prepare data->seq_server ffff8800a73ce000
> > > [ 9197.256572] --> nfs41_setup_sequence
> > > [ 9197.256573] --> nfs4_alloc_slot used_slots=0000 highest_used=4294967295 max_slots=31
> > > [ 9197.256574] <-- nfs4_alloc_slot used_slots=0001 highest_used=0 slotid=0
> > > [ 9197.256574] <-- nfs41_setup_sequence slotid=0 seqid=14013800
> > > [ 9197.256582] encode_sequence: sessionid=1474126170:1:2:0 seqid=14013800 slotid=0 max_slotid=0 cache_this=1
> > > [ 9197.256755] --> nfs4_alloc_slot used_slots=0001 highest_used=0 max_slots=31
> > > [ 9197.256756] <-- nfs4_alloc_slot used_slots=0003 highest_used=1 slotid=1
> > > [ 9197.256757] nfs4_free_slot: slotid 1 highest_used_slotid 0
> > > [ 9197.256758] nfs41_sequence_process: Error 0 free the slot
> > > [ 9197.256760] nfs4_free_slot: slotid 0 highest_used_slotid 4294967295
> > > [ 9197.256779] --> nfs_put_client({2})
> >
> > What operation is the userspace process hanging on? Do you have a
> > stack trace for it?
>
> seems to be open_create->truncate->setattr coming from:
> cp /bin/sleep /mnt/nfs2/racer/12
>
> (gdb) bt
> #0  nfs41_setup_sequence (session=0xffff88005a853800, args=0xffff8800a7253b80,
>     res=0xffff8800a7253b48, task=0xffff8800b0eb0f00)
>     at /home/green/bk/linux-test/fs/nfs/nfs4proc.c:876
> #1  0xffffffff813a751c in nfs41_call_sync_prepare (task=<optimized out>,
>     calldata=0xffff8800a7253b80)
>     at /home/green/bk/linux-test/fs/nfs/nfs4proc.c:966
> #2  0xffffffff8185c639 in rpc_prepare_task (task=<optimized out>)
>     at /home/green/bk/linux-test/net/sunrpc/sched.c:683
> #3  0xffffffff8185f12b in __rpc_execute (task=0xffff88005a853800)
>     at /home/green/bk/linux-test/net/sunrpc/sched.c:775
> #4  0xffffffff818617b4 in rpc_execute (task=0xffff88005a853800)
>     at /home/green/bk/linux-test/net/sunrpc/sched.c:843
> #5  0xffffffff818539b9 in rpc_run_task (task_setup_data=0xffff8800a7253a50)
>     at /home/green/bk/linux-test/net/sunrpc/clnt.c:1052
> #6  0xffffffff813a75e3 in nfs4_call_sync_sequence (clnt=<optimized out>,
>     server=<optimized out>, msg=<optimized out>, args=<optimized out>,
>     res=<optimized out>) at /home/green/bk/linux-test/fs/nfs/nfs4proc.c:1051
> #7  0xffffffff813b4645 in nfs4_call_sync (cache_reply=<optimized out>,
>     res=<optimized out>, args=<optimized out>, msg=<optimized out>,
>     server=<optimized out>, clnt=<optimized out>)
>     at /home/green/bk/linux-test/fs/nfs/nfs4proc.c:1069
> #8  _nfs4_do_setattr (state=<optimized out>, cred=<optimized out>,
>     res=<optimized out>, arg=<optimized out>, inode=<optimized out>)
> ---Type <return> to continue, or q <return> to quit---
>     at /home/green/bk/linux-test/fs/nfs/nfs4proc.c:2916
> #9  nfs4_do_setattr (inode=0xffff880079b152a8, cred=<optimized out>,
>     fattr=<optimized out>, sattr=<optimized out>, state=0xffff880060588e00,
>     ilabel=<optimized out>, olabel=0x0 <irq_stack_union>)
>     at /home/green/bk/linux-test/fs/nfs/nfs4proc.c:2955
> #10 0xffffffff813b4a16 in nfs4_proc_setattr (dentry=<optimized out>,
>     fattr=0xffff8800a7253b80, sattr=0xffff8800a7253b48)
>     at /home/green/bk/linux-test/fs/nfs/nfs4proc.c:3684
> #11 0xffffffff8138f1cb in nfs_setattr (dentry=0xffff8800740c1000,
>

Cool! Does the following help?

8<------------------------------------------------------------
From 98ddf32a99cfe00e9ae108044e2be67522987511 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
Date: Sat, 17 Sep 2016 15:27:10 -0400
Subject: [PATCH] NFS: Don't assume a stateid represents a delegation in
 nfs4_do_handle_exception

If the stateid being passed to the error handler is not a delegation
stateid, we want to mark the locks/open_state it does represent for
recovery.

Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
---
 fs/nfs/nfs4proc.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 7cecb1d7a217..acc572c51735 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -397,6 +397,10 @@ static int nfs4_do_handle_exception(struct nfs_server *server,
 	exception->delay = 0;
 	exception->recovering = 0;
 	exception->retry = 0;
+
+	if (stateid == NULL && state != NULL)
+		stateid = &state->stateid;
+
 	switch(errorcode) {
 		case 0:
 			return 0;
@@ -405,7 +409,7 @@ static int nfs4_do_handle_exception(struct nfs_server *server,
 		case -NFS4ERR_EXPIRED:
 		case -NFS4ERR_BAD_STATEID:
 			if (inode != NULL && stateid != NULL) {
-				nfs_inode_find_delegation_state_and_recover(inode,
+				nfs_inode_find_state_and_recover(inode,
 						stateid);
 				goto wait_on_recovery;
 			}
-- 
2.7.4
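
For anyone following along, the stateid fallback in the patch above can be
modelled in userspace roughly as below. This is only a sketch under stated
assumptions: struct stateid, struct open_state, pick_stateid() and
would_recover() are hypothetical stand-ins invented for illustration, not
the kernel's types or functions, and would_recover() only mirrors the
"inode != NULL && stateid != NULL" guard rather than doing any recovery.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's stateid and open state types. */
struct stateid {
	unsigned int seqid;
	unsigned char other[12];
};

struct open_state {
	struct stateid stateid;
};

/*
 * Models the fallback added by the patch: when the caller passed no
 * explicit stateid but did pass open state, use that state's stateid.
 */
const struct stateid *pick_stateid(const struct stateid *stateid,
				   const struct open_state *state)
{
	if (stateid == NULL && state != NULL)
		stateid = &state->stateid;
	return stateid;
}

/*
 * Models the guard in the EXPIRED/BAD_STATEID branch: returns 1 when
 * both an inode and a (possibly fallen-back) stateid are available,
 * i.e. when the handler would go on to look up state for recovery.
 */
int would_recover(const void *inode, const struct stateid *stateid,
		  const struct open_state *state)
{
	stateid = pick_stateid(stateid, state);
	return inode != NULL && stateid != NULL;
}

/* Small usage example exercising the three interesting cases. */
void demo(void)
{
	struct open_state st = { { 1, { 0 } } };
	struct stateid explicit_id = { 2, { 0 } };
	int inode_token = 0; /* any non-NULL pointer stands in for an inode */

	/* No explicit stateid: fall back to the open state's stateid. */
	assert(pick_stateid(NULL, &st) == &st.stateid);
	/* An explicit stateid always wins over the fallback. */
	assert(pick_stateid(&explicit_id, &st) == &explicit_id);
	/* Without the fallback input, nothing can be recovered. */
	assert(!would_recover(&inode_token, NULL, NULL));
}
```

The point of the sketch is that a setattr-style call which passes only the
open state (no separate stateid) now reaches the recovery branch instead of
skipping it.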