pNFS client running out of TCP port numbers

Dear NFS gurus,

we are seeing an interesting problem with the pNFS client.
Our installation has ~600 DSes plus an MDS, plus some
regular NFSv3 and NFSv4 mounts. After some time, the
client nodes can no longer create new mounts:

May 10 16:00:25 bXXX0 automount[5351]: attempting to mount entry /nfs/aaa/bbb
May 10 16:00:26 bXXX0 automount[5351]: >> mount.nfs: mount system call failed
May 10 16:00:26 bXXX0 automount[5351]: >> mount.nfs: mount system call failed

It turned out that the problem is in the RPC layer. There are no free source ports left:

May 10 17:05:04 bXXX0 kernel: RPC: 35575 call_connect xprt ffff880209f0b000 is not connected
May 10 17:05:04 bXXX0 kernel: RPC: 35575 xprt_connect xprt ffff880209f0b000 is not connected
May 10 17:05:04 bXXX0 kernel: RPC: 35575 sleep_on(queue "xprt_pending" time 14685414165)
May 10 17:05:04 bXXX0 kernel: RPC: 35575 added to queue ffff880209f0b258 "xprt_pending"
May 10 17:05:04 bXXX0 kernel: RPC: 35575 setting alarm for 60000 ms
May 10 17:05:04 bXXX0 kernel: RPC:       xs_connect scheduled xprt ffff880209f0b000
May 10 17:05:04 bXXX0 kernel: RPC: 35575 sync task going to sleep
May 10 17:05:04 bXXX0 kernel: RPC:       xs_bind 0.0.0.0:1023: failed (-98)
May 10 17:05:04 bXXX0 kernel: RPC: 35575 __rpc_wake_up_task (now 14685414165)
May 10 17:05:04 bXXX0 kernel: RPC: 35575 disabling timer
May 10 17:05:04 bXXX0 kernel: RPC: 35575 removed from queue ffff880209f0b258 "xprt_pending"
May 10 17:05:04 bXXX0 kernel: RPC:       __rpc_wake_up_task done
May 10 17:05:04 bXXX0 kernel: RPC: 35575 sync task resuming
May 10 17:05:04 bXXX0 kernel: RPC: 35575 xprt_connect_status: error 98 connecting to server 1xx.xx4.xx8.xx3
May 10 17:05:04 bXXX0 kernel: RPC:       wake_up_first(ffff880209f0b190 "xprt_sending")
May 10 17:05:04 bXXX0 kernel: RPC: 35575 call_connect_status (status -5)
May 10 17:05:04 bXXX0 kernel: RPC: 35575 return 0, status -5
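
For readers decoding the trace above: the kernel reports errors as negated errno values. Below is a minimal sketch, assuming the generic Linux errno numbering. The lookup table and the helper name decode_kernel_status are illustrative only and are not part of any kernel or library interface.

```python
# Linux errno values that appear in the trace. The numbering is an
# assumption: it is the generic Linux one and differs on other systems.
LINUX_ERRNO = {
    5: "EIO",
    98: "EADDRINUSE",
}


def decode_kernel_status(status: int) -> str:
    """Map a kernel-style status (0 or a negated errno) to a symbolic name."""
    if status == 0:
        return "OK"
    code = abs(status)
    # Fall back to the raw number for values not in the small table above.
    return LINUX_ERRNO.get(code, f"errno {code}")


if __name__ == "__main__":
    for status in (-98, -5):
        print(status, decode_kernel_status(status))
```

Read this way, xs_bind fails with EADDRINUSE while trying port 1023, and the RPC task then completes with EIO.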


The source port range is bounded by min_resvport and max_resvport, which default to 665 and 1023, respectively.
That leaves at most 359 ports (665 through 1023 inclusive), and hence at most 359 connections. If a client accesses many DSes, the range runs out.
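
The port arithmetic can be sketched as follows. This is only a back-of-the-envelope model: it assumes one reserved source port per server connection (which is what the trace suggests), and the function names resvport_budget and shortfall are made up for illustration.

```python
def resvport_budget(min_resvport: int, max_resvport: int) -> int:
    """Number of source ports in the inclusive range [min, max]."""
    if min_resvport > max_resvport:
        raise ValueError("min_resvport must not exceed max_resvport")
    return max_resvport - min_resvport + 1


def shortfall(servers: int, min_resvport: int = 665,
              max_resvport: int = 1023) -> int:
    """Connections that cannot get a port, assuming one port per server."""
    return max(0, servers - resvport_budget(min_resvport, max_resvport))


if __name__ == "__main__":
    servers = 600 + 1  # ~600 DSes plus the MDS, as in this report
    print("ports available:", resvport_budget(665, 1023))
    print("connections left without a port:", shortfall(servers))
```

Under that assumption, a client that touches every DS needs a range several hundred ports wider than the default, before counting the regular NFSv3 and NFSv4 mounts.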

Questions:

  - Why must the pNFS client use a privileged port number when it talks to a DS?

  - Why does the pNFS client use each port number for only one connection? For a TCP/IP
    connection, only the tuple src_addr+src_port - dst_addr+dst_port must be unique, so
    a source port number could be reused for connections to other servers as well.

  - Should we just bump max_resvport to solve it (which did indeed help)?
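
The second question rests on the fact that TCP identifies a connection by the full address/port 4-tuple. A small userspace sketch of that point follows. It assumes Linux semantics for SO_REUSEADDR and a usable loopback interface, and it says nothing about what the kernel RPC transport actually does. The helper names are hypothetical.

```python
import socket


def listener() -> socket.socket:
    """Start a TCP listener on an ephemeral loopback port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    return srv


def client(src_port: int) -> socket.socket:
    """Create a TCP socket bound to a chosen source port (0 = any)."""
    c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # On Linux, a second bind() to the same local port is accepted only
    # if every socket sharing it set SO_REUSEADDR and none is listening.
    c.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    c.bind(("127.0.0.1", src_port))
    return c


def shared_source_port_demo() -> tuple:
    """Connect to two different destinations from one source port."""
    srv_a, srv_b = listener(), listener()
    c1 = client(0)
    port = c1.getsockname()[1]
    c2 = client(port)  # same source address and port as c1
    try:
        c1.connect(srv_a.getsockname())
        c2.connect(srv_b.getsockname())  # differs only in destination port
        return c1.getsockname(), c2.getsockname()
    finally:
        for s in (c1, c2, srv_a, srv_b):
            s.close()


if __name__ == "__main__":
    first, second = shared_source_port_demo()
    print("local endpoints:", first, second)
```

Both connections end up with the same local endpoint, which is possible because their destinations differ. Pointing both sockets at the same listener would be expected to fail, since the 4-tuple would then no longer be unique.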


Thanks in advance,
   Tigran.