Re: pNFS: invalid IP:port selection when talks to DS

Hi Olga,

You did not have the answer, but you gave me an important hint!
I believe all our DSes on a single host report the same server
owner during EXCHANGE_ID. I guess this could be the reason why the
client decides to talk to another DS.
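
To illustrate what I suspect, here is a minimal sketch of a client that keys
its data server connections on the server owner returned by EXCHANGE_ID. It is
a simplified model and not the actual Linux client code: `ServerOwner`,
`ClientCache` and `connect` are hypothetical names, and the owner values are
made up. It only shows how two DSes that report the same owner could end up
sharing one connection.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Address = Tuple[str, int]


@dataclass(frozen=True)
class ServerOwner:
    """Simplified stand-in for the identity a server reports in EXCHANGE_ID."""
    major_id: bytes
    minor_id: int
    scope: bytes


class ClientCache:
    """Hypothetical client-side cache: one connection per distinct server owner."""

    def __init__(self) -> None:
        self._by_owner: Dict[ServerOwner, Address] = {}

    def connect(self, addr: Address, owner: ServerOwner) -> Address:
        # setdefault stores addr only the first time an owner is seen;
        # later calls with an equal owner get the address cached earlier.
        return self._by_owner.setdefault(owner, addr)


cache = ClientCache()
# Assumption: every DS on the host reports this same (made-up) owner.
shared = ServerOwner(b"ds-host", 0, b"scope")

first = cache.connect(("131.169.51.50", 24006), shared)
second = cache.connect(("131.169.51.50", 24005), shared)
print(first, second)
```

In this model the second call asks for port 24005 but gets back the
connection to 24006, which matches what we see on the wire. If each DS
reported its own owner, the two addresses would stay separate.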

Tigran.

----- Original Message -----
> From: "Mkrtchyan, Tigran" <tigran.mkrtchyan@xxxxxxx>
> To: "Olga Kornievskaia" <aglo@xxxxxxxxx>
> Cc: "Linux NFS Mailing list" <linux-nfs@xxxxxxxxxxxxxxx>, "Steve Dickson" <steved@xxxxxxxxxx>
> Sent: Monday, March 20, 2017 9:51:21 PM
> Subject: Re: pNFS: invalid IP:port selection when talks to DS

> Hi Olga,
> 
> ----- Original Message -----
>> From: "Olga Kornievskaia" <aglo@xxxxxxxxx>
>> To: "Mkrtchyan, Tigran" <tigran.mkrtchyan@xxxxxxx>
>> Cc: "Linux NFS Mailing list" <linux-nfs@xxxxxxxxxxxxxxx>, "Steve Dickson"
>> <steved@xxxxxxxxxx>
>> Sent: Monday, March 20, 2017 9:14:34 PM
>> Subject: Re: pNFS: invalid IP:port selection when talks to DS
> 
>> Hi Tigran,
>> 
>> While I don't have an answer to your question, I'd like to point out
>> that 4.9 is when Andy's session trunking patches went in.
>> 
>> I'm curious: did this client, which is now talking to the DS on port
>> 24006 instead of 24005, earlier talk correctly (legally) to the DS
>> that was on 24006?
> 
> Yes, earlier during testing it had legal access to the DS on port 24006.
> 
> Tigran.
> 
>> 
>> On Mon, Mar 20, 2017 at 11:52 AM, Mkrtchyan, Tigran
>> <tigran.mkrtchyan@xxxxxxx> wrote:
>>>
>>>
>>> Dear (p)NFS-ors,
>>>
>>> we observe a VERY unpleasant situation with pNFS in production.
>>> Our hosts run multiple DSes on different ports, usually 24001-24009.
>>> With CentOS7 (3.10.0-514.6.2.el7.x86_64) we see that the client uses
>>> the wrong port number when it talks to a data server.
>>>
>>> If the client uses different DSes on the same host, then at some point it
>>> starts to send data to the wrong port number:
>>>
>>> Client <=> MDS:
>>>
>>>
>>>     1 0.000000000 131.169.251.53 → 131.169.51.35 NFS V4 Call OPEN DH: 0x7cbc716b/MIL-68-onebatch-80C-30s-00057.tif.metadata
>>>     2 0.001469799 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 1) OPEN StateID: 0xec18
>>>     3 0.001578128 131.169.251.53 → 131.169.51.35 NFS V4 Call SETATTR FH: 0x6ccf3dfa
>>>     4 0.002657187 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 3) SETATTR
>>>     5 0.003243819 131.169.251.53 → 131.169.51.35 NFS V4 Call LAYOUTGET
>>>     6 0.014603386 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 5) LAYOUTGET
>>>     7 0.014899121 131.169.251.53 → 131.169.51.35 NFS V4 Call GETDEVINFO
>>>     8 0.015014216 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 7) GETDEVINFO
>>>         Opcode: GETDEVINFO (47)
>>>             Status: NFS4_OK (0)
>>>             layout type: LAYOUT4_NFSV4_1_FILES (1)
>>>             device index: 0
>>>             r_netid: tcp
>>>                 length: 3
>>>                 contents: tcp
>>>                 fill bytes: opaque data
>>>             r_addr: 131.169.51.50.93.197
>>>                 length: 20
>>>                 contents: 131.169.51.50.93.197
>>>             r_netid: tcp
>>>                 length: 3
>>>                 contents: tcp
>>>                 fill bytes: opaque data
>>>             r_addr: 131.169.51.50.93.197
>>>                 length: 20
>>>                 contents: 131.169.51.50.93.197
>>>             notification bitmap: 6
>>>             notification bitmap: 0
>>>     [Main Opcode: GETDEVINFO (47)]
>>>
>>>     9 0.105442455 131.169.251.53 → 131.169.51.35 NFS V4 Call TEST_STATEID
>>>    10 0.105521354 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 9) TEST_STATEID
>>>
>>>
>>>
>>> Note that 131.169.51.50.93.197 corresponds to port 24005 (93 * 256 + 197).
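
For anyone who wants to check the r_addr value, the following is a small
sketch of decoding an IPv4 universal address as shown in the GETDEVINFO reply:
the first four fields are the IP address, the last two are the high and low
bytes of the port. `parse_uaddr` is a hypothetical helper name, and the sketch
assumes the `tcp` netid only (IPv6 addresses are not handled).

```python
from typing import Tuple


def parse_uaddr(uaddr: str) -> Tuple[str, int]:
    """Split an IPv4 universal address 'h1.h2.h3.h4.p1.p2' into (host, port)."""
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError("expected six dot-separated fields: %r" % uaddr)
    fields = [int(p) for p in parts]
    if any(f < 0 or f > 255 for f in fields):
        raise ValueError("every field must be in 0..255: %r" % uaddr)
    host = ".".join(parts[:4])
    # The port is sent as two bytes: high byte first, then low byte.
    port = fields[4] * 256 + fields[5]
    return host, port


host, port = parse_uaddr("131.169.51.50.93.197")
print(host, port)
```

The same helper shows that port 24006 would have been advertised as
131.169.51.50.93.198, which is not what the MDS returned.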
>>>
>>> client <=> DS
>>>
>>> $ tshark -r ds-write.pcap  -n -z conv,tcp
>>>     1   0.000000 131.169.251.53 → 131.169.51.50 NFS V4 Call WRITE StateID: 0xff01 Offset: 0 Len: 3968
>>>     2   0.000090 131.169.51.50 → 131.169.251.53 NFS V4 Reply (Call In 1) WRITE Status: NFS4ERR_BAD_STATEID
>>> ================================================================================
>>> TCP Conversations
>>> Filter:<No Filter>
>>>                                             |       <-      | |       ->      | |     Total     |    Relative    |   Duration   |
>>>                                             | Frames  Bytes | | Frames  Bytes | | Frames  Bytes |      Start     |              |
>>> 131.169.51.50:24006  <-> 131.169.251.53:847         1      4240       1       168       2      4408     0.000000000         0.0001
>>> ================================================================================
>>>
>>> Note that it talks to the DS on port 24006!
>>>
>>> Is there a known fix that is missing in CentOS7? I can't reproduce it with
>>> a 4.9 kernel (or it's harder to reproduce).
>>>
>>>
>>> The packet captures are attached.
>>>
>>> Tigran.
>>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html