Re: pnfs client running out TCP port numbers

Hi Trond,

Today, during the linux-nfs phone conference, Chuck suggested using the **noresvport**
mount option. It works for the client <=> MDS connection, but is ignored for
the client <=> DS connections:

[root@dcache-lab-wn002 ~]# netstat -tnC
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State      
tcp        0      0 131.169.161.126:860         131.169.191.142:32049       ESTABLISHED 
tcp        0      0 131.169.161.126:49451       131.169.191.144:2049        ESTABLISHED 
tcp        0    200 131.169.161.126:887         131.169.191.141:32049       ESTABLISHED 
[root@dcache-lab-wn002 ~]# 

Looks like it's a trivial change to fix that. I will send a patch after testing.
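
To make the netstat output above easier to read, here is a minimal Python sketch that flags which connections use a reserved source port. It assumes the usual Linux convention that ports below 1024 are privileged, and it assumes the connection to port 2049 is the MDS and the two connections to port 32049 are DSes, which is how the listing reads in context. The function name `classify` is made up for this illustration.

```python
RESERVED_PORT_LIMIT = 1024  # assumption: ports below this are privileged on Linux

# The three connection lines from the netstat listing above.
NETSTAT_OUTPUT = """\
tcp        0      0 131.169.161.126:860         131.169.191.142:32049       ESTABLISHED
tcp        0      0 131.169.161.126:49451       131.169.191.144:2049        ESTABLISHED
tcp        0    200 131.169.161.126:887         131.169.191.141:32049       ESTABLISHED
"""


def classify(netstat_text):
    """Return (local_port, remote_endpoint, uses_reserved_port) for each TCP line."""
    rows = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP connection line.
        if len(fields) < 6 or fields[0] != "tcp":
            continue
        local_port = int(fields[3].rsplit(":", 1)[1])
        rows.append((local_port, fields[4], local_port < RESERVED_PORT_LIMIT))
    return rows


if __name__ == "__main__":
    for port, remote, reserved in classify(NETSTAT_OUTPUT):
        kind = "reserved" if reserved else "non-reserved"
        print(f"{remote:24} source port {port:5} ({kind})")
```

Only the connection to port 2049 comes from a non-reserved source port; both connections to port 32049 still use ports below 1024, which matches the observation that noresvport was not applied to the DS connections.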

Tigran.

----- Original Message -----
> From: "Trond Myklebust" <trondmy@xxxxxxxxxxxxxxx>
> To: "Mkrtchyan, Tigran" <tigran.mkrtchyan@xxxxxxx>, "linux-nfs list" <linux-nfs@xxxxxxxxxxxxxxx>
> Cc: "yves kemp" <yves.kemp@xxxxxxx>
> Sent: Tuesday, May 10, 2016 6:21:14 PM
> Subject: Re: pnfs client running out TCP port numbers

> On 5/10/16, 11:57, "linux-nfs-owner@xxxxxxxxxxxxxxx on behalf of Mkrtchyan,
> Tigran" <linux-nfs-owner@xxxxxxxxxxxxxxx on behalf of tigran.mkrtchyan@xxxxxxx>
> wrote:
> 
>>
>>Dear NFS gurus,
>>
>>we observe a very interesting problem with the pNFS client.
>>We have ~600 DSes in our installation + MDS + some
>>regular NFSv3 and v4 mounts. After some time, the
>>client nodes can no longer create new mounts:
>>
>>May 10 16:00:25 bXXX0 automount[5351]: attempting to mount entry /nfs/aaa/bbb
>>May 10 16:00:26 bXXX0 automount[5351]: >> mount.nfs: mount system call failed
>>May 10 16:00:26 bXXX0 automount[5351]: >> mount.nfs: mount system call failed
>>
>>It turned out that the problem is in the RPC layer. There are no free source ports left:
>>
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 call_connect xprt ffff880209f0b000 is not connected
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 xprt_connect xprt ffff880209f0b000 is not connected
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 sleep_on(queue "xprt_pending" time 14685414165)
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 added to queue ffff880209f0b258 "xprt_pending"
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 setting alarm for 60000 ms
>>May 10 17:05:04 bXXX0 kernel: RPC:       xs_connect scheduled xprt ffff880209f0b000
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 sync task going to sleep
>>May 10 17:05:04 bXXX0 kernel: RPC:       xs_bind 0.0.0.0:1023: failed (-98)
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 __rpc_wake_up_task (now 14685414165)
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 disabling timer
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 removed from queue ffff880209f0b258 "xprt_pending"
>>May 10 17:05:04 bXXX0 kernel: RPC:       __rpc_wake_up_task done
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 sync task resuming
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 xprt_connect_status: error 98 connecting to server 1xx.xx4.xx8.xx3
>>May 10 17:05:04 bXXX0 kernel: RPC:       wake_up_first(ffff880209f0b190 "xprt_sending")
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 call_connect_status (status -5)
>>May 10 17:05:04 bXXX0 kernel: RPC: 35575 return 0, status -5
>>
>>
>>The source port range is limited by min_resvport and max_resvport, which default to 665 and
>>1023, respectively.
>>This gives us at most 359 connections. If a client accesses many DSes, then we have
>>a problem.
>>
>>Questions:
>>
>>  - Why must the pNFS client use a privileged port number when it talks to a DS?
> 
> That's a default requirement on most NFSv3 servers, particularly when using
> AUTH_SYS.
> 
>>
>>  - Why does the pNFS client use each port number for only one connection? For an IP connection,
>>    only the tuple src_addr+src_port - dst_addr+dst_port must be unique, so a source port number
>>    can be reused for other connections as well.
> 
> As you say above, in order to reuse the port, the connection end points need to
> be unique. That can sometimes be tricky if the server is acting both as an MDS
> and a DS.
> 
>>
>>  - Should we just bump max_resvport to solve it (which indeed has helped)?
> 
> You could. We could also look into handling the AUTH_TOOWEAK RPC level error by
> turning on privileged ports. That might allow us to default to not using
> privileged ports.
> 
> 
> 
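
The port budget and the reuse question discussed in the thread can be illustrated with a small Python sketch. It is a toy model and not the kernel's RPC transport code: it assumes a single client source address, takes the min_resvport/max_resvport defaults of 665 and 1023 quoted in the thread, and uses made-up names (`reserved_port_count`, `PortAllocator`, and the destinations `ds1` and `ds2`). Counting both endpoints, the default range holds 359 ports.

```python
import errno
from collections import defaultdict

MIN_RESVPORT = 665   # default quoted in the thread
MAX_RESVPORT = 1023  # default quoted in the thread


def reserved_port_count(lo=MIN_RESVPORT, hi=MAX_RESVPORT):
    """Number of source ports in the inclusive range [lo, hi]."""
    return hi - lo + 1


class PortAllocator:
    """Toy allocator: a source port may be shared across distinct destinations.

    With one client address, a connection is identified by
    (src_port, dst_addr, dst_port), so a source port only has to be
    unique per destination endpoint.
    """

    def __init__(self, lo=MIN_RESVPORT, hi=MAX_RESVPORT):
        self.ports = range(lo, hi + 1)
        self.in_use = defaultdict(set)  # (dst_addr, dst_port) -> source ports taken

    def allocate(self, dst_addr, dst_port):
        used = self.in_use[(dst_addr, dst_port)]
        for port in self.ports:
            if port not in used:
                used.add(port)
                return port
        # Same failure the log reports as "xs_bind ... failed (-98)" on Linux.
        raise OSError(errno.EADDRINUSE, "no free source port for this destination")


if __name__ == "__main__":
    print("ports in default range:", reserved_port_count())
    alloc = PortAllocator(lo=1023, hi=1023)  # a deliberately tiny range of one port
    print(alloc.allocate("ds1", 2049))  # hypothetical DS address
    print(alloc.allocate("ds2", 2049))  # a different destination can share the port
```

In this model a second connection to the same destination endpoint cannot share the port, which is the case Trond points out when one server acts as both MDS and DS.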