nfs4 infinite loop in rpc_clnt_iterate_for_each_xprt without multipath

Hello!

   I am hitting a strange problem with 4.7.0-rc1: eventually my NFS4 client
   enters a state where it is stuck in an infinite loop in
   rpc_clnt_iterate_for_each_xprt(), called from nfs4_proc_bind_conn_to_session()
   with nfs4_proc_bind_conn_to_session_callback as the callback.

   The whole backtrace looks like this:
(gdb) bt
#0  xprt_iter_next_entry_multiple (xpi=0xffff880058cf3d80, 
    find_next=0xffffffff81865de0 <xprt_switch_find_next_entry>)
    at /home/green/bk/linux/net/sunrpc/xprtmultipath.c:276
#1  0xffffffff81866085 in xprt_iter_next_entry_all (xpi=<optimized out>)
    at /home/green/bk/linux/net/sunrpc/xprtmultipath.c:306
#2  0xffffffff81865e56 in xprt_iter_get_helper (xpi=0xffff880058cf3d80, 
    fn=0xffffffff81866070 <xprt_iter_next_entry_all>)
    at /home/green/bk/linux/net/sunrpc/xprtmultipath.c:411
#3  0xffffffff818668e6 in xprt_iter_get_next (xpi=0xffff880058cf3d80)
    at /home/green/bk/linux/net/sunrpc/xprtmultipath.c:448
#4  0xffffffff8183ebc2 in rpc_clnt_iterate_for_each_xprt (
    clnt=0xffff88005e313e00, 
    fn=0xffffffff8139d8f0 <nfs4_proc_bind_conn_to_session_callback>, 
    data=0xffff880058cf3dd8) at /home/green/bk/linux/net/sunrpc/clnt.c:776
#5  0xffffffff813adfdb in nfs4_proc_bind_conn_to_session (clp=<optimized out>, 
    cred=<optimized out>) at /home/green/bk/linux/fs/nfs/nfs4proc.c:6917
#6  0xffffffff813bea11 in nfs4_bind_conn_to_session (clp=<optimized out>)
    at /home/green/bk/linux/fs/nfs/nfs4state.c:2311
#7  nfs4_state_manager (clp=<optimized out>)
    at /home/green/bk/linux/fs/nfs/nfs4state.c:2376
#8  nfs4_run_state_manager (ptr=0xffff88003c39d800)
    at /home/green/bk/linux/fs/nfs/nfs4state.c:2457
#9  0xffffffff810af3a1 in kthread (_create=0xffff8800509c62c0)
    at /home/green/bk/linux/kernel/kthread.c:209


   If I enable NFS debugging, I also see a very tight loop like:
[ 4563.114185] --> nfs4_proc_bind_one_conn_to_session
[ 4563.114690] <-- nfs4_proc_bind_one_conn_to_session status= 0
[ 4563.114691] --> nfs4_proc_bind_one_conn_to_session
[ 4563.115177] <-- nfs4_proc_bind_one_conn_to_session status= 0
. . .
   The NFSD side also gets a lot of these back-to-back requests.
   Everything using this NFS export is stuck in D state.

   So I looked around, and I am confused about how this is all supposed to work.

   The loop in rpc_clnt_iterate_for_each_xprt() supposedly iterates over all connections
   for the "import". Looking at xprt_iter_next_entry_multiple(), we can see:
        if (xps->xps_nxprts < 2)
                return xprt_switch_find_first_entry(head);

   This is my case:
$15 = {xps_lock = {{rlock = {raw_lock = {val = {counter = 0}}, 
        magic = 3735899821, owner_cpu = 4294967295, owner = 0xffffffffffffffff, 
        dep_map = {key = 0xffffffff8357e4b0 <__key.23771>, class_cache = {
            0x0 <irq_stack_union>, 0x0 <irq_stack_union>}, 
          name = 0xffffffff81cf96e6 "&(&xps->xps_lock)->rlock", cpu = 4, 
          ip = 6510615555426900570}}, {
        __padding = "\000\000\000\000\255N\255\336\377\377\377\377ZZZZ\377\377\377\377\377\377\377\377", dep_map = {key = 0xffffffff8357e4b0 <__key.23771>, 
          class_cache = {0x0 <irq_stack_union>, 0x0 <irq_stack_union>}, 
          name = 0xffffffff81cf96e6 "&(&xps->xps_lock)->rlock", cpu = 4, 
          ip = 6510615555426900570}}}}, xps_kref = {refcount = {counter = 3}}, 
  xps_nxprts = 1, xps_xprt_list = {next = 0xffff88004f5835e0, 
    prev = 0xffff88004f5835e0}, xps_net = 0xffffffff81f790c0 <init_net>, 
  xps_iter_ops = 0xffffffff81adfb20 <rpc_xprt_iter_singular>, xps_rcu = {
    next = 0x5a5a5a5a5a5a5a5a, func = 0xa55a5a5a5a5a5a5a}}


   So the loop in rpc_clnt_iterate_for_each_xprt(), which terminates only when the
   next element returned is NULL, never sees a NULL when there are no failover
   links, and happily keeps looping forever? Am I reading this right?
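
   The suspected behaviour can be modelled in userspace. The following is a
   minimal sketch, not the kernel code: every name prefixed with model_ is
   hypothetical, the list is a plain singly linked list rather than the
   kernel's list_head, and the multi-transport branch is only an assumption
   about how a cursor-based iterator would normally advance. Only the "fewer
   than two transports" shortcut is taken from the snippet quoted earlier.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a transport; next == NULL ends the list. */
struct model_xprt {
    struct model_xprt *next;
};

/* Hypothetical stand-in for the transport switch. */
struct model_switch {
    unsigned int nxprts;          /* plays the role of xps_nxprts */
    struct model_xprt *first;     /* head of the transport list */
};

/* Hypothetical iterator: remembers the last entry it handed out. */
struct model_iter {
    struct model_switch *xps;
    struct model_xprt *cursor;
};

/*
 * Models the shortcut from the quoted snippet: with fewer than two
 * transports the first entry is returned on every call and the cursor
 * is ignored. The else branch is an assumed "advance from cursor" walk.
 */
static struct model_xprt *model_next_entry(struct model_iter *it)
{
    struct model_xprt *pos;

    if (it->xps->nxprts < 2)
        pos = it->xps->first;
    else
        pos = it->cursor ? it->cursor->next : it->xps->first;

    it->cursor = pos;
    return pos;
}

/*
 * Models a caller that loops until the iterator returns NULL.
 * Returns how many entries were seen; the loop is cut off after
 * `limit` entries so that a non-terminating iterator can be observed.
 */
static unsigned int model_count_until_null(struct model_switch *xps,
                                           unsigned int limit)
{
    struct model_iter it = { xps, NULL };
    unsigned int n = 0;

    assert(xps != NULL);
    while (n < limit && model_next_entry(&it) != NULL)
        n++;
    return n;
}
```

   In this model, a switch with a single transport and a limit of 1000 makes
   model_count_until_null() return 1000, meaning the iterator never produced
   NULL, whereas a switch with two chained transports returns 2.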

   This seems to be fairly new code that landed in Linus' tree only on Mar 22,
   so I imagine that if it really were an endless loop like that, there would be
   a lot more reports already. In fact I don't hit this all the time myself, so
   I wonder if something else is in play?

   Thanks.

Bye,
    Oleg


