On Mon, Feb 05, 2018 at 08:21:06AM +0000, Terry Barnaby wrote:
> On 01/02/18 08:29, Terry Barnaby wrote:
> > On 01/02/18 01:34, Jeremy Linton wrote:
> > > On 01/31/2018 09:49 AM, J. Bruce Fields wrote:
> > > > On Tue, Jan 30, 2018 at 01:52:49PM -0600, Jeremy Linton wrote:
> > > > > Have you tried this with a '-o nfsvers=3' during mount? Did that help?
> > > > >
> > > > > I noticed a large decrease in my kernel build times across NFS/lan a
> > > > > while back after a machine/kernel/10g upgrade. After playing with
> > > > > mount/export options, filesystem tuning, etc., I got to the point of
> > > > > timing a bunch of these operations vs the older machine, at which
> > > > > point I discovered that simply backing down to NFSv3 solved the
> > > > > problem.
> > > > >
> > > > > AKA an NFSv3 server on a 10 year old 4 disk xfs RAID5 on 1Gb
> > > > > ethernet was faster than a modern machine with an 8 disk xfs RAID5
> > > > > on 10Gb on NFSv4. The effect was enough to change a kernel build
> > > > > from ~45 minutes down to less than 5.
> >
> > Using NFSv3 in async mode is faster than NFSv4 in async mode (still
> > abysmal in sync mode).
> >
> > NFSv3 async: sync; time (tar -xf linux-4.14.15.tar.gz -C /data2/tmp; sync)
> >
> > real    2m25.717s
> > user    0m8.739s
> > sys     0m13.362s
> >
> > NFSv4 async: sync; time (tar -xf linux-4.14.15.tar.gz -C /data2/tmp; sync)
> >
> > real    3m33.032s
> > user    0m8.506s
> > sys     0m16.930s
> >
> > NFSv3 async: wireshark trace
> >
> > No.
> >       Time        Source          Destination     Protocol Length Info
> > 18527 2.815884979 192.168.202.2 192.168.202.1 NFS 216 V3 CREATE Call (Reply In 18528), DH: 0x62f39428/dma.h Mode: EXCLUSIVE
> > 18528 2.816362338 192.168.202.1 192.168.202.2 NFS 328 V3 CREATE Reply (Call In 18527)
> > 18529 2.816418841 192.168.202.2 192.168.202.1 NFS 224 V3 SETATTR Call (Reply In 18530), FH: 0x13678ba0
> > 18530 2.816871820 192.168.202.1 192.168.202.2 NFS 216 V3 SETATTR Reply (Call In 18529)
> > 18531 2.816966771 192.168.202.2 192.168.202.1 NFS 1148 V3 WRITE Call (Reply In 18532), FH: 0x13678ba0 Offset: 0 Len: 934 FILE_SYNC
> > 18532 2.817441291 192.168.202.1 192.168.202.2 NFS 208 V3 WRITE Reply (Call In 18531) Len: 934 FILE_SYNC
> > 18533 2.817495775 192.168.202.2 192.168.202.1 NFS 236 V3 SETATTR Call (Reply In 18534), FH: 0x13678ba0
> > 18534 2.817920346 192.168.202.1 192.168.202.2 NFS 216 V3 SETATTR Reply (Call In 18533)
> > 18535 2.818002910 192.168.202.2 192.168.202.1 NFS 216 V3 CREATE Call (Reply In 18536), DH: 0x62f39428/elf.h Mode: EXCLUSIVE
> > 18536 2.818492126 192.168.202.1 192.168.202.2 NFS 328 V3 CREATE Reply (Call In 18535)
> >
> > This is taking about 2ms per small file write, rather than 3ms for
> > NFSv4. NFSv4 has an extra GETATTR and CLOSE RPC, accounting for the
> > difference.
> >
> > So where I am:
> >
> > 1. NFS in sync mode, at least on my two Fedora 27 systems for my
> > usage, is completely unusable (sync: 2 hours, async: 3 minutes,
> > local disk: 13 seconds).
> >
> > 2. NFS async mode is working, but the small writes are still very slow.
> >
> > 3. NFS in async mode is 30% better with NFSv3 than NFSv4 when writing
> > small files, due to the increased latency caused by NFSv4's two extra
> > RPC calls.
> >
> > I really think that in 2018 we should be able to have better NFS
> > performance when writing many small files, such as are used in
> > software development.
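To put rough numbers on that: the trace above shows ~0.5ms per RPC round trip, 4 RPCs per small file for NFSv3 (CREATE, SETATTR, WRITE, SETATTR) and roughly 6 for NFSv4 (the extra GETATTR and CLOSE). A back-of-envelope sketch (the ~60000-file count for a kernel tree is an estimate, not something measured here):

```shell
# Per-file cost is roughly RPCs-per-file * RPC round trip, since the
# client serializes these operations per file.
awk -v rtt_ms=0.5 -v files=60000 'BEGIN {
    v3 = 4 * rtt_ms; v4 = 6 * rtt_ms        # ms per small file
    printf "per file:   v3 %.1f ms,  v4 %.1f ms\n", v3, v4
    printf "whole tree: v3 %.1f min, v4 %.1f min\n",
           files * v3 / 60000, files * v4 / 60000
}'
# per file:   v3 2.0 ms,  v4 3.0 ms
# whole tree: v3 2.0 min, v4 3.0 min
```

Which is in the same ballpark as the measured 2m25s vs 3m33s, so per-file RPC latency, not bandwidth, is what dominates here.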
> > This would speed up any system using NFS with this sort of workload
> > dramatically, and reduce power usage, all for some improvements in
> > the NFS protocol.
> >
> > I don't know the details of whether this would work, or who is
> > responsible for NFS, but it would be good if possible to have some
> > improvements (NFSv4.3?). Maybe:
> >
> > 1. Have an OPEN-SETATTR-WRITE RPC call all in one, and a
> > SETATTR-CLOSE call all in one. This would reduce the latency of a
> > small file write to 1ms rather than 3ms, thus 66% faster. It would
> > require the client to delay the OPEN/SETATTR until the first WRITE;
> > not sure how possible this is in the implementations. Maybe READs
> > could be improved as well, but getting the OPEN through quickly may
> > be better in that case?
> >
> > 2. Could go further with an OPEN-SETATTR-WRITE-CLOSE RPC call
> > (0.5ms vs 3ms).
> >
> > 3. On sync/async modes, personally I think it would be better for
> > the client to request the mount in sync/async mode. Setting sync on
> > the server side would just enforce sync mode for all clients. If the
> > server is in the default async mode, clients can mount using sync or
> > async according to their requirements. This seems to match normal
> > VFS semantics and usage patterns better.
> >
> > 4. The 0.5ms RPC latency seems a bit high (ICMP pings: 0.12ms).
> > Maybe this is worth investigating in the Linux kernel processing
> > (how?).
> >
> > 5. The 20ms RPC latency I see in sync mode needs a look at on my
> > system, although async mode is fine for my usage. Maybe this ends up
> > as 2 x 10ms drive seeks on ext4 and is thus expected.
>
> Yet another poor NFS performance issue. If I do an "ls -lR" of a
> certain NFS mounted directory over a slow link (NFS over OpenVPN over
> FTTP 80/20Mbps), just after mounting the file system (default NFSv4
> mount with async), it takes about 9 seconds. If I run the same
> "ls -lR" again, just after, it takes about 60 seconds.

A wireshark trace might help.
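Something like the following would do for capturing it on the client while the slow second "ls -lR" runs (a sketch: the interface name is a placeholder for your setup, the server address is taken from the traces below, and the echo is left in so nothing privileged runs by accident):

```shell
# Capture only the NFS traffic to a file that wireshark can open later.
# enp2s0 is a hypothetical interface name; adjust for the client.
IFACE=enp2s0
CMD="tcpdump -i $IFACE -s 0 -w nfs-ls.pcap host 192.168.201.1 and port 2049"
echo "$CMD"    # drop the echo (and run as root) to actually capture
```

The -s 0 keeps full packets so the NFS compounds decode properly.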
Also, is it possible some process is writing while this is happening?

--b.

> So much for caching! I have noticed Makefile based builds (over
> Ethernet 1Gbps) taking a long time, with a second or so between each
> directory; I think this may be why.
>
> Listing the directory using an NFSv3 mount takes 67 seconds on the
> first mount and about the same on subsequent ones. No noticeable
> caching (default mount options with async). At least NFSv4 is fast
> the first time!
>
> NFSv4 directory reads after mount:
>
> No. Time        Source          Destination     Protocol Length Info
> 667 4.560833210 192.168.202.2 192.168.201.1 NFS 304 V4 Call (Reply In 672) READDIR FH: 0xde55a546
> 668 4.582809439 192.168.201.1 192.168.202.2 TCP 1405 2049 → 679 [ACK] Seq=304477 Ack=45901 Win=1452 Len=1337 TSval=2646321616 TSecr=913651354 [TCP segment of a reassembled PDU]
> 669 4.582986377 192.168.201.1 192.168.202.2 TCP 1405 2049 → 679 [ACK] Seq=305814 Ack=45901 Win=1452 Len=1337 TSval=2646321616 TSecr=913651354 [TCP segment of a reassembled PDU]
> 670 4.583003805 192.168.202.2 192.168.201.1 TCP 68 679 → 2049 [ACK] Seq=45901 Ack=307151 Win=1444 Len=0 TSval=913651376 TSecr=2646321616
> 671 4.583265423 192.168.201.1 192.168.202.2 TCP 1405 2049 → 679 [ACK] Seq=307151 Ack=45901 Win=1452 Len=1337 TSval=2646321616 TSecr=913651354 [TCP segment of a reassembled PDU]
> 672 4.583280603 192.168.201.1 192.168.202.2 NFS 289 V4 Reply (Call In 667) READDIR
> 673 4.583291818 192.168.202.2 192.168.201.1 TCP 68 679 → 2049 [ACK] Seq=45901 Ack=308709 Win=1444 Len=0 TSval=913651377 TSecr=2646321616
> 674 4.583819172 192.168.202.2 192.168.201.1 NFS 280 V4 Call (Reply In 675) GETATTR FH: 0xb91bfde7
> 675 4.605389953 192.168.201.1 192.168.202.2 NFS 312 V4 Reply (Call In 674) GETATTR
> 676 4.605491075 192.168.202.2 192.168.201.1 NFS 288 V4 Call (Reply In 677) ACCESS FH: 0xb91bfde7, [Check: RD LU MD XT DL]
> 677 4.626848306 192.168.201.1 192.168.202.2 NFS 240 V4 Reply (Call In 676) ACCESS, [Allowed:
> RD LU MD XT DL]
> 678 4.626993773 192.168.202.2 192.168.201.1 NFS 304 V4 Call (Reply In 679) READDIR FH: 0xb91bfde7
> 679 4.649330354 192.168.201.1 192.168.202.2 NFS 2408 V4 Reply (Call In 678) READDIR
> 680 4.649380840 192.168.202.2 192.168.201.1 TCP 68 679 → 2049 [ACK] Seq=46569 Ack=311465 Win=1444 Len=0 TSval=913651443 TSecr=2646321683
> 681 4.649716746 192.168.202.2 192.168.201.1 NFS 280 V4 Call (Reply In 682) GETATTR FH: 0xb6d01f2a
> 682 4.671167708 192.168.201.1 192.168.202.2 NFS 312 V4 Reply (Call In 681) GETATTR
> 683 4.671281003 192.168.202.2 192.168.201.1 NFS 288 V4 Call (Reply In 684) ACCESS FH: 0xb6d01f2a, [Check: RD LU MD XT DL]
> 684 4.692647455 192.168.201.1 192.168.202.2 NFS 240 V4 Reply (Call In 683) ACCESS, [Allowed: RD LU MD XT DL]
> 685 4.692825251 192.168.202.2 192.168.201.1 NFS 304 V4 Call (Reply In 690) READDIR FH: 0xb6d01f2a
> 686 4.715060586 192.168.201.1 192.168.202.2 TCP 1405 2049 → 679 [ACK] Seq=311881 Ack=47237 Win=1452 Len=1337 TSval=2646321748 TSecr=913651486 [TCP segment of a reassembled PDU]
> 687 4.715199557 192.168.201.1 192.168.202.2 TCP 1405 2049 → 679 [ACK] Seq=313218 Ack=47237 Win=1452 Len=1337 TSval=2646321748 TSecr=913651486 [TCP segment of a reassembled PDU]
> 688 4.715215055 192.168.202.2 192.168.201.1 TCP 68 679 → 2049 [ACK] Seq=47237 Ack=314555 Win=1444 Len=0 TSval=913651509 TSecr=2646321748
> 689 4.715524465 192.168.201.1 192.168.202.2 TCP 1405 2049 → 679 [ACK] Seq=314555 Ack=47237 Win=1452 Len=1337 TSval=2646321749 TSecr=913651486 [TCP segment of a reassembled PDU]
> 690 4.715911571 192.168.201.1 192.168.202.2 NFS 1449 V4 Reply (Call In 685) READDIR
>
> NFS directory reads later:
>
> No.
>     Time        Source          Destination     Protocol Length Info
> 664 9.485593049 192.168.202.2 192.168.201.1 NFS 304 V4 Call (Reply In 669) READDIR FH: 0x1933e99e
> 665 9.507596250 192.168.201.1 192.168.202.2 TCP 1405 2049 → 788 [ACK] Seq=127921 Ack=65730 Win=3076 Len=1337 TSval=2645776572 TSecr=913106316 [TCP segment of a reassembled PDU]
> 666 9.507717425 192.168.201.1 192.168.202.2 TCP 1405 2049 → 788 [ACK] Seq=129258 Ack=65730 Win=3076 Len=1337 TSval=2645776572 TSecr=913106316 [TCP segment of a reassembled PDU]
> 667 9.507733352 192.168.202.2 192.168.201.1 TCP 68 788 → 2049 [ACK] Seq=65730 Ack=130595 Win=1444 Len=0 TSval=913106338 TSecr=2645776572
> 668 9.507987020 192.168.201.1 192.168.202.2 TCP 1405 2049 → 788 [ACK] Seq=130595 Ack=65730 Win=3076 Len=1337 TSval=2645776572 TSecr=913106316 [TCP segment of a reassembled PDU]
> 669 9.508456847 192.168.201.1 192.168.202.2 NFS 989 V4 Reply (Call In 664) READDIR
> 670 9.508472149 192.168.202.2 192.168.201.1 TCP 68 788 → 2049 [ACK] Seq=65730 Ack=132853 Win=1444 Len=0 TSval=913106338 TSecr=2645776572
> 671 9.508880627 192.168.202.2 192.168.201.1 NFS 280 V4 Call (Reply In 672) GETATTR FH: 0x7e9e8300
> 672 9.530375865 192.168.201.1 192.168.202.2 NFS 312 V4 Reply (Call In 671) GETATTR
> 673 9.530564317 192.168.202.2 192.168.201.1 NFS 280 V4 Call (Reply In 674) GETATTR FH: 0xcb837ac9
> 674 9.551906321 192.168.201.1 192.168.202.2 NFS 312 V4 Reply (Call In 673) GETATTR
> 675 9.552064038 192.168.202.2 192.168.201.1 NFS 280 V4 Call (Reply In 676) GETATTR FH: 0xbf951d32
> 676 9.574210528 192.168.201.1 192.168.202.2 NFS 312 V4 Reply (Call In 675) GETATTR
> 677 9.574334117 192.168.202.2 192.168.201.1 NFS 280 V4 Call (Reply In 678) GETATTR FH: 0xd3f3dc3e
> 678 9.595902902 192.168.201.1 192.168.202.2 NFS 312 V4 Reply (Call In 677) GETATTR
> 679 9.596025484 192.168.202.2 192.168.201.1 NFS 280 V4 Call (Reply In 680) GETATTR FH: 0xf534332a
> 680 9.617497794 192.168.201.1 192.168.202.2 NFS 312 V4 Reply (Call
> In 679) GETATTR
> 681 9.617621218 192.168.202.2 192.168.201.1 NFS 280 V4 Call (Reply In 682) GETATTR FH: 0xa7e5bbc5
> 682 9.639157371 192.168.201.1 192.168.202.2 NFS 312 V4 Reply (Call In 681) GETATTR
> 683 9.639279098 192.168.202.2 192.168.201.1 NFS 280 V4 Call (Reply In 684) GETATTR FH: 0xa8050515
> 684 9.660669335 192.168.201.1 192.168.202.2 NFS 312 V4 Reply (Call In 683) GETATTR
> 685 9.660787725 192.168.202.2 192.168.201.1 NFS 304 V4 Call (Reply In 686) READDIR FH: 0x7e9e8300
> 686 9.682612756 192.168.201.1 192.168.202.2 NFS 1472 V4 Reply (Call In 685) READDIR
> 687 9.682646761 192.168.202.2 192.168.201.1 TCP 68 788 → 2049 [ACK] Seq=67450 Ack=135965 Win=1444 Len=0 TSval=913106513 TSecr=2645776747
> 688 9.682906293 192.168.202.2 192.168.201.1 NFS 280 V4 Call (Reply In 689) GETATTR FH: 0xa8050515
>
> Lots of GETATTR calls the second time around (one per file?).
>
> Really, NFS is broken performance-wise these days, and it "appears"
> that significant/huge improvements are possible.
>
> Anyone know what group/who is responsible for the NFS protocol these
> days?
>
> Also, what group/who is responsible for the Linux kernel's
> implementation of it?

_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx