Hi Olga,

----- Original Message -----
> From: "Olga Kornievskaia" <aglo@xxxxxxxxx>
> To: "Mkrtchyan, Tigran" <tigran.mkrtchyan@xxxxxxx>
> Cc: "Linux NFS Mailing list" <linux-nfs@xxxxxxxxxxxxxxx>, "Steve Dickson" <steved@xxxxxxxxxx>
> Sent: Monday, March 20, 2017 9:14:34 PM
> Subject: Re: pNFS: invalid IP:port selection when talks to DS

> Hi Tigran,
>
> While I don't have an answer to your question, I'd like to point out
> that 4.9 is when Andy's session trunking patches went in.
>
> I'm curious: did this client, which is now talking to the DS at port
> 24006 instead of 24005, also correctly (legally) talk to a DS on
> 24006 earlier?

Yes, earlier during testing it had legal access to the DS on port 24006.

Tigran.

>
> On Mon, Mar 20, 2017 at 11:52 AM, Mkrtchyan, Tigran
> <tigran.mkrtchyan@xxxxxxx> wrote:
>>
>> Dear (p)NFS-ors,
>>
>> we observe a VERY unpleasant situation with pNFS in production.
>> Our hosts run multiple DSes on different ports, usually 24001-24009.
>> With CentOS7 (3.10.0-514.6.2.el7.x86_64) we see that the client picks
>> the wrong port number when it talks to a data server:
>>
>> If the client uses different DSes on the same host, then at some point
>> it starts to send data to the wrong port number:
>>
>> Client <=> MDS:
>>
>>   1 0.000000000 131.169.251.53 → 131.169.51.35 NFS V4 Call OPEN DH:
>>     0x7cbc716b/MIL-68-onebatch-80C-30s-00057.tif.metadata
>>   2 0.001469799 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 1) OPEN
>>     StateID: 0xec18
>>   3 0.001578128 131.169.251.53 → 131.169.51.35 NFS V4 Call SETATTR FH: 0x6ccf3dfa
>>   4 0.002657187 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 3) SETATTR
>>   5 0.003243819 131.169.251.53 → 131.169.51.35 NFS V4 Call LAYOUTGET
>>   6 0.014603386 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 5) LAYOUTGET
>>   7 0.014899121 131.169.251.53 → 131.169.51.35 NFS V4 Call GETDEVINFO
>>   8 0.015014216 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 7) GETDEVINFO
>>     Opcode: GETDEVINFO (47)
>>         Status: NFS4_OK (0)
>>         layout type: LAYOUT4_NFSV4_1_FILES (1)
>>         device index: 0
>>         r_netid: tcp
>>             length: 3
>>             contents: tcp
>>             fill bytes: opaque data
>>         r_addr: 131.169.51.50.93.197
>>             length: 20
>>             contents: 131.169.51.50.93.197
>>         r_netid: tcp
>>             length: 3
>>             contents: tcp
>>             fill bytes: opaque data
>>         r_addr: 131.169.51.50.93.197
>>             length: 20
>>             contents: 131.169.51.50.93.197
>>         notification bitmap: 6
>>         notification bitmap: 0
>>         [Main Opcode: GETDEVINFO (47)]
>>
>>   9 0.105442455 131.169.251.53 → 131.169.51.35 NFS V4 Call TEST_STATEID
>>  10 0.105521354 131.169.51.35 → 131.169.251.53 NFS V4 Reply (Call In 9)
>>     TEST_STATEID
>>
>> NOTICE that 131.169.51.50.93.197 corresponds to port 24005
>> (93 * 256 + 197 = 24005).
>>
>> Client <=> DS:
>>
>> $ tshark -r ds-write.pcap -n -z conv,tcp
>>   1 0.000000 131.169.251.53 → 131.169.51.50 NFS V4 Call WRITE StateID: 0xff01
>>     Offset: 0 Len: 3968
>>   2 0.000090 131.169.51.50 → 131.169.251.53 NFS V4 Reply (Call In 1) WRITE
>>     Status: NFS4ERR_BAD_STATEID
>> ================================================================================
>> TCP Conversations
>> Filter:<No Filter>
>>                                            |      <-      | |      ->      | |     Total    |  Relative   | Duration |
>>                                            | Frames Bytes | | Frames Bytes | | Frames Bytes |    Start    |          |
>> 131.169.51.50:24006 <-> 131.169.251.53:847      1   4240        1    168        2   4408     0.000000000    0.0001
>> ================================================================================
>>
>> NOTICE that it talks to the DS on port 24006!
>>
>> Is there a known fix which is missing in CentOS7? I can't reproduce it
>> with a 4.9 kernel (or it's harder to reproduce).
>>
>> The packages are attached.
>>
>> Tigran.
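
For anyone checking the traces: the r_addr value in the GETDEVINFO reply is a universal address, where for the tcp netid the last two dot-separated fields are the high and low bytes of the port. A minimal sketch of that conversion follows; the helper names parse_uaddr and format_uaddr are made up for illustration and are not taken from any NFS tool or from the kernel.

```python
from typing import Tuple


def parse_uaddr(uaddr: str) -> Tuple[str, int]:
    """Split an IPv4 universal address "h1.h2.h3.h4.p1.p2" into (host, port).

    Hypothetical helper for illustration only; handles the IPv4/tcp form
    seen in the trace, not IPv6.
    """
    parts = uaddr.split(".")
    if len(parts) != 6:
        raise ValueError("expected 6 dot-separated fields: %r" % uaddr)
    fields = [int(p) for p in parts]
    if any(f < 0 or f > 255 for f in fields):
        raise ValueError("every field must be in 0..255: %r" % uaddr)
    host = ".".join(parts[:4])
    # Port is encoded as high byte, then low byte.
    port = fields[4] * 256 + fields[5]
    return host, port


def format_uaddr(host: str, port: int) -> str:
    """Inverse of parse_uaddr: build the universal address for host:port."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port out of range: %r" % port)
    return "%s.%d.%d" % (host, port >> 8, port & 0xFF)


if __name__ == "__main__":
    # Address advertised by the MDS in GETDEVINFO.
    print(parse_uaddr("131.169.51.50.93.197"))
    # What the MDS would have had to advertise for the port the client used.
    print(format_uaddr("131.169.51.50", 24006))
```

With the address from the GETDEVINFO reply this gives port 24005, while port 24006, which the client actually connected to, would have been advertised as 131.169.51.50.93.198.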
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html