Re: Need help debugging NFS issues new to 4.20 kernel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2019-02-20 at 09:25 -0600, Jason L Tibbitts III wrote:
> Sadly, after a bit more testing I find that 4.20.10 with that patch
> applied still exhibits the problem.  Just found a machine where NFS
> hung
> up just six minutes after the user logged in.  As always: nothing is
> logged to the client or the server.  I turned on the tracing as
> requested previously and see what I think it just the same sequence
> as
> before.  Please let me know if there is any other information I can
> provide.  Right now I'm going to build 4.20.10 with my previously
> posted
> revert patch applied and roll that out to the same 50 hosts and see
> how
> it fares.
> 
>  - J<
> 
>     kworker/u8:0-12608 [001] .... 47884.661073: rpc_request: task:
> 20469@1 nfsv4 SEQUENCE (async)
>     kworker/u8:0-12608 [001] .... 47884.661073: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_reserve
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661074: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_reserveresult
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661074: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_refresh
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661075: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_refreshresult
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661076: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_allocate
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661076: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_encode
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661091: rpc_task_run_action:
> task:20469@1 flags=5281 state=001d status=0 action=call_bind [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661091: rpc_task_run_action:
> task:20469@1 flags=5281 state=001d status=0 action=call_connect
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661092: rpc_task_run_action:
> task:20469@1 flags=5281 state=001d status=0 action=call_transmit
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661106: xprt_transmit:
> peer=[172.21.86.74]:2049 xid=0x0d7ca068 status=0
>     kworker/u8:0-12608 [001] .... 47884.661107: rpc_task_run_action:
> task:20469@1 flags=5a81 state=0015 status=0
> action=call_transmit_status [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661107: rpc_task_sleep: task:
> 20469@1 flags=5a81 state=0015 status=0 timeout=60000
> queue=xprt_pending
>     kworker/u9:0-12884 [002] .... 47884.661371: xprt_lookup_rqst:
> peer=[172.21.86.74]:2049 xid=0x0d7ca068 status=0
>     kworker/u9:0-12884 [002] .... 47884.661372: xprt_complete_rqst:
> peer=[172.21.86.74]:2049 xid=0x0d7ca068 status=112
>     kworker/u9:0-12884 [002] .... 47884.661373: rpc_task_wakeup:
> task:20469@1 flags=5a81 state=0006 status=0 timeout=60000
> queue=xprt_pending
>     kworker/u9:0-12884 [002] .... 47884.661375:
> xs_stream_read_request: peer=[172.21.86.74]:2049 xid=0x0d7ca068
> copied=112 reclen=112 offset=112
>     kworker/u9:0-12884 [002] .... 47884.661376: xs_stream_read_data:
> peer=[172.21.86.74]:2049 err=-11 total=116
>     kworker/u8:0-12608 [001] .... 47884.661388: rpc_task_run_action:
> task:20469@1 flags=5a81 state=0005 status=0 action=xprt_timer
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661389: rpc_task_run_action:
> task:20469@1 flags=5a81 state=0005 status=0 action=call_status
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661389: rpc_task_run_action:
> task:20469@1 flags=5a81 state=0005 status=0 action=call_decode
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661408: rpc_task_run_action:
> task:20469@1 flags=5a81 state=0005 status=-10063 action=rpc_exit_task
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661409: nfs4_sequence_done:
> error=-10063 (SEQ_MISORDERED) session=0x4480e725 slot_nr=0 seq_nr=1
> highest_slotid=0 target_highest_slotid=0 status_flags=0 ()
>     kworker/u8:0-12608 [001] .... 47884.661410: rpc_stats_latency:
> task:20469@1 xid=0x0d7ca068 nfsv4 SEQUENCE backlog=21 rtt=278
> execute=339
>     kworker/u8:0-12608 [001] .... 47884.661412: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=rpc_prepare_task
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661413: nfs4_setup_sequence:
> session=0x4480e725 slot_nr=0 seq_nr=1 highest_used_slotid=0
>     kworker/u8:0-12608 [001] .... 47884.661413: rpc_task_run_action:
> task:20469@1 flags=5281 state=0005 status=0 action=call_start
> [sunrpc]
>     kworker/u8:0-12608 [001] .... 47884.661414: rpc_request: task:
> 20469@1 nfsv4 SEQUENCE (async)

This is not an RPC layer issue. It is a SEQ_MISORDERED error on slot 0.
If the client can't recover then that will hang your NFSv4.1 session
against that server.

Are you sure this isn't what was happening previously?

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux