Re: What is a 10.10.10.20-man process?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is interesting.  Sniffing the network shows:

14:57:05.802390 IP hypervisor3-bus.843 > netapp-bus.nfs: Flags [.],
ack 193, win 58, length 0
14:57:10.802218 IP hypervisor3-bus.843 > netapp-bus.nfs: Flags [P.],
seq 464:580, ack 193, win 58, length 116: NFS request xid 2773655640
112 getattr [|nfs]
14:57:10.802321 IP netapp-bus.nfs > hypervisor3-bus.843: Flags [P.],
seq 193:241, ack 580, win 33580, length 48: NFS reply xid 2773655640
reply ok 44 getattr ERROR: unk 10011
14:57:10.802335 IP hypervisor3-bus.843 > netapp-bus.nfs: Flags [.],
ack 241, win 58, length 0

I turned on some logging for rpc and nfs:

Apr 18 14:49:55 hypervisor3 kernel: nfs4_renew_state: start
Apr 18 14:49:55 hypervisor3 kernel: RPC:       new task initialized,
procpid 30181
Apr 18 14:49:55 hypervisor3 kernel: RPC:       allocated task ffff880aa3b67c00
Apr 18 14:49:55 hypervisor3 kernel: nfs4_renew_state: done
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 __rpc_execute flags=0x5281
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 call_start nfs4 proc
RENEW (async)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 call_reserve (status 0)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 reserved req
ffff883f4de15c00 xid 4e52a058
Apr 18 14:49:55 hypervisor3 kernel: RPC:
wake_up_first(ffff883f65d4c990 "xprt_sending")
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 call_reserveresult (status 0)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 call_refresh (status 0)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 refreshing UNIX cred
ffff883f6637e300
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 call_refreshresult (status 0)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 call_allocate (status 0)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 allocated buffer of
size 444 at ffff8841c052d000
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 call_bind (status 0)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 call_connect xprt
ffff883f65d4c800 is connected
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 call_transmit (status 0)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 xprt_prepare_transmit
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 rpc_xdr_encode (status 0)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 marshaling UNIX cred
ffff883f6637e300
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 using AUTH_UNIX cred
ffff883f6637e300 to wrap rpc data
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 xprt_transmit(116)
Apr 18 14:49:55 hypervisor3 kernel: RPC:       xs_tcp_send_request(116) = 0
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 xmit complete
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 sleep_on(queue
"xprt_pending" time 13230578520)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 added to queue
ffff883f65d4ca58 "xprt_pending"
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 setting alarm for 60000 ms
Apr 18 14:49:55 hypervisor3 kernel: RPC:
wake_up_first(ffff883f65d4c990 "xprt_sending")
Apr 18 14:49:55 hypervisor3 kernel: RPC:       xs_tcp_data_ready...
Apr 18 14:49:55 hypervisor3 kernel: RPC:       xs_tcp_data_recv started
Apr 18 14:49:55 hypervisor3 kernel: RPC:       reading TCP record
fragment of length 44
Apr 18 14:49:55 hypervisor3 kernel: RPC:       reading XID (4 bytes)
Apr 18 14:49:55 hypervisor3 kernel: RPC:       reading reply for XID 4e52a058
Apr 18 14:49:55 hypervisor3 kernel: RPC:       reading CALL/REPLY flag (4 bytes)
Apr 18 14:49:55 hypervisor3 kernel: RPC:       read reply XID 4e52a058
Apr 18 14:49:55 hypervisor3 kernel: RPC:       XID 4e52a058 read 36 bytes
Apr 18 14:49:55 hypervisor3 kernel: RPC:       xprt =
ffff883f65d4c800, tcp_copied = 44, tcp_offset = 44, tcp_reclen = 44
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 xid 4e52a058 complete
(44 bytes received)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 __rpc_wake_up_task (now
13230578520)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 disabling timer
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 removed from queue
ffff883f65d4ca58 "xprt_pending"
Apr 18 14:49:55 hypervisor3 kernel: RPC:       __rpc_wake_up_task done
Apr 18 14:49:55 hypervisor3 kernel: RPC:       xs_tcp_data_recv done
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 __rpc_execute flags=0x5a81
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 call_status (status 44)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 call_decode (status 44)
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 validating UNIX cred
ffff883f6637e300
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 using AUTH_UNIX cred
ffff883f6637e300 to unwrap rpc data
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 call_decode result -10011
Apr 18 14:49:55 hypervisor3 kernel: nfs4_schedule_lease_recovery:
scheduling lease recovery for server hypervisor3-bus
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 return 0, status -10011
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 release task
Apr 18 14:49:55 hypervisor3 kernel: RPC:       freeing buffer of size
444 at ffff8841c052d000
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 release request ffff883f4de15c00
Apr 18 14:49:55 hypervisor3 kernel: RPC:
wake_up_first(ffff883f65d4cb20 "xprt_backlog")
Apr 18 14:49:55 hypervisor3 kernel: RPC:
rpc_release_client(ffff883f64653600)
Apr 18 14:49:55 hypervisor3 kernel: nfs4_schedule_state_renewal:
requeueing work. Lease period = 5
Apr 18 14:49:55 hypervisor3 kernel: --> nfs_put_client({4})
Apr 18 14:49:55 hypervisor3 kernel: RPC:  8269 freeing task

Is this an authentication issue on the NFS server (Netapp)?

On Fri, Apr 15, 2016 at 6:16 AM, Jeff Layton <jlayton@xxxxxxxxxxxxxxx> wrote:
> On Thu, 2016-04-14 at 14:15 -0500, Ziemowit Pierzycki wrote:
>> [<ffffffffa012d158>] rpc_wait_bit_killable+0x38/0x80 [sunrpc]
>> [<ffffffffa012d11d>] __rpc_wait_for_completion_task+0x2d/0x30
>> [sunrpc]
>> [<ffffffffa05caf30>] nfs4_run_open_task+0x130/0x1a0 [nfsv4]
>> [<ffffffffa05cd77d>] nfs4_open_recover_helper+0x16d/0x230 [nfsv4]
>> [<ffffffffa05cd875>] nfs4_open_recover+0x35/0xb0 [nfsv4]
>> [<ffffffffa05cff5d>] nfs4_open_expired.isra.60+0xcd/0x180 [nfsv4]
>> [<ffffffffa05d003e>] nfs40_open_expired+0x2e/0x40 [nfsv4]
>> [<ffffffffa05e3805>] nfs4_do_reclaim+0x175/0x5f0 [nfsv4]
>> [<ffffffffa05e416e>] nfs4_run_state_manager+0x4ee/0x7d0 [nfsv4]
>> [<ffffffff810bc8b8>] kthread+0xd8/0xf0
>> [<ffffffff8177999f>] ret_from_fork+0x3f/0x70
>> [<ffffffffffffffff>] 0xffffffffffffffff
>>
>> Waiting for something to complete but it can't?
>>
>
> Yes. That stack means that it's trying to reclaim an open file (on
> NFSv4.0), and is waiting on the reply.
>
>> On Thu, Apr 14, 2016 at 2:08 PM, Benjamin Coddington
>> <bcodding@xxxxxxxxxx> wrote:
>> >
>> >
>> > On Thu, 14 Apr 2016, Ziemowit PIerzycki wrote:
>> >
>> > >
>> > > I have a strange issue when NFS4 mounts to a Netapp filer freeze
>> > > on my Linux
>> > > KVMs.  Looking over few things I noticed a process "10.10.10.20-
>> > > man" is in D
>> > > state but I can't find anything information about it.  Mainly
>> > > because
>> > > 10.10.10.20 is the IP of the filer and searching for "-man" is
>> > > too generic.
>> > >
>> > > Why would this process be frozen?
>> > This is the nfs4 state manager process.  What's in the file:
>> > /proc/<pid of that process>/stack
>> >
>> > Ben
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs"
>> in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
> --
> Jeff Layton <jlayton@xxxxxxxxxxxxxxx>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux