On 05/02/18 14:52, J. Bruce Fields wrote:
>> Yet another poor NFSv4 performance issue. If I do a "ls -lR" of a certain
>> NFS mounted directory over a slow link (NFS over Openvpn over FTTP
>> 80/20Mbps), just after mounting the file system (default NFSv4 mount with
>> async), it takes about 9 seconds. If I run the same "ls -lR" again, just
>> after, it takes about 60 seconds.
>
> A wireshark trace might help.
>
> Also, is it possible some process is writing while this is happening?
>
> --b.
Ok, I have made some wireshark traces and put these at:
https://www.beam.ltd.uk/files/files//nfs/
There are other processes running, obviously, but nothing that should
really affect this.
As a naive observation, it looks like the client is using its cache but
revalidating each file individually with a GETATTR. With one GETATTR
round trip per file in each directory, the latency of these RPC calls
mounts up. I guess it would be possible to check the cache status of all
files in a directory with a single call, something like a "GETATTR_DIR
<dir>" RPC, which would make this faster when a full readdir is in
progress. The extra data returned would probably not slow down a
single-file cache check, as latency rather than the amount of data is
the killer.
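
To put rough numbers on the latency argument, here is a small Python
sketch. The timestamps are the GETATTR call/reply pairs from the second
trace further down in this message; the function names are illustrative
only, and the model assumes the calls are strictly serialized (one
outstanding RPC at a time), which is what the trace appears to show.

```python
# Rough latency model: each serialized RPC costs one full round trip.
# (call_time, reply_time) pairs are the GETATTR exchanges, frames 671-684,
# from the second "ls -lR" trace in this message.
getattr_pairs = [
    (9.508880627, 9.530375865),
    (9.530564317, 9.551906321),
    (9.552064038, 9.574210528),
    (9.574334117, 9.595902902),
    (9.596025484, 9.617497794),
    (9.617621218, 9.639157371),
    (9.639279098, 9.660669335),
]


def mean_rtt(pairs):
    """Average time in seconds between a call and its reply."""
    return sum(reply - call for call, reply in pairs) / len(pairs)


def serialized_time(n_calls, rtt):
    """Total wall time if n_calls RPCs are issued one after another."""
    return n_calls * rtt


def calls_for_duration(duration, rtt):
    """How many serialized RPCs fit into the given wall time."""
    return round(duration / rtt)


rtt = mean_rtt(getattr_pairs)
print(f"mean GETATTR round trip: {rtt * 1000:.1f} ms")
print(f"serialized calls that fit in 60 s: {calls_for_duration(60.0, rtt)}")
```

With a round trip of a little over 20 ms, a few thousand serialized
GETATTR calls are enough to account for a listing that takes about a
minute, independent of link bandwidth.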
So much for caching! I have noticed Makefile-based builds (over 1Gbps
Ethernet) taking a long time, with a second or so between each
directory; I think this may be why.
Listing the directory using an NFSv3 mount takes 67 seconds on the first
mount and about the same on subsequent ones, so there is no noticeable
caching (default mount options with async). At least NFSv4 is fast the
first time!
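
For anyone wanting to reproduce the timings, a minimal Python sketch of
the measurement follows. It is only an approximation of "ls -lR" (it
walks the tree and calls lstat() on every entry), and the function name
and command-line argument are hypothetical, not part of any existing
tool.

```python
import os
import sys
import time


def timed_listing(root):
    """Walk root and lstat() every entry; return (entries, seconds)."""
    start = time.perf_counter()
    entries = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # One attribute lookup per entry, roughly what "ls -l" needs.
            os.lstat(os.path.join(dirpath, name))
            entries += 1
    return entries, time.perf_counter() - start


# Usage (hypothetical script name): python3 time_listing.py /path/to/nfs/dir
# Runs the walk twice so the cold and warm timings can be compared.
if __name__ == "__main__" and len(sys.argv) == 2:
    for run in (1, 2):
        count, seconds = timed_listing(sys.argv[1])
        print(f"run {run}: {count} entries in {seconds:.1f} s")
```

Running it immediately after mounting and then again straight away
should show whether the second pass is slower, as described above.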
NFSv4 directory reads after mount:
No. Time Source Destination Protocol Length Info
667 4.560833210 192.168.202.2 192.168.201.1 NFS 304
V4 Call (Reply In 672) READDIR FH: 0xde55a546
668 4.582809439 192.168.201.1 192.168.202.2 TCP 1405
2049 → 679 [ACK] Seq=304477 Ack=45901 Win=1452 Len=1337 TSval=2646321616
TSecr=913651354 [TCP segment of a reassembled PDU]
669 4.582986377 192.168.201.1 192.168.202.2 TCP 1405
2049 → 679 [ACK] Seq=305814 Ack=45901 Win=1452 Len=1337 TSval=2646321616
TSecr=913651354 [TCP segment of a reassembled PDU]
670 4.583003805 192.168.202.2 192.168.201.1 TCP 68
679 → 2049 [ACK] Seq=45901 Ack=307151 Win=1444 Len=0 TSval=913651376
TSecr=2646321616
671 4.583265423 192.168.201.1 192.168.202.2 TCP 1405
2049 → 679 [ACK] Seq=307151 Ack=45901 Win=1452 Len=1337 TSval=2646321616
TSecr=913651354 [TCP segment of a reassembled PDU]
672 4.583280603 192.168.201.1 192.168.202.2 NFS 289
V4 Reply (Call In 667) READDIR
673 4.583291818 192.168.202.2 192.168.201.1 TCP 68
679 → 2049 [ACK] Seq=45901 Ack=308709 Win=1444 Len=0 TSval=913651377
TSecr=2646321616
674 4.583819172 192.168.202.2 192.168.201.1 NFS 280
V4 Call (Reply In 675) GETATTR FH: 0xb91bfde7
675 4.605389953 192.168.201.1 192.168.202.2 NFS 312
V4 Reply (Call In 674) GETATTR
676 4.605491075 192.168.202.2 192.168.201.1 NFS 288
V4 Call (Reply In 677) ACCESS FH: 0xb91bfde7, [Check: RD LU MD XT DL]
677 4.626848306 192.168.201.1 192.168.202.2 NFS 240
V4 Reply (Call In 676) ACCESS, [Allowed: RD LU MD XT DL]
678 4.626993773 192.168.202.2 192.168.201.1 NFS 304
V4 Call (Reply In 679) READDIR FH: 0xb91bfde7
679 4.649330354 192.168.201.1 192.168.202.2 NFS 2408
V4 Reply (Call In 678) READDIR
680 4.649380840 192.168.202.2 192.168.201.1 TCP 68
679 → 2049 [ACK] Seq=46569 Ack=311465 Win=1444 Len=0 TSval=913651443
TSecr=2646321683
681 4.649716746 192.168.202.2 192.168.201.1 NFS 280
V4 Call (Reply In 682) GETATTR FH: 0xb6d01f2a
682 4.671167708 192.168.201.1 192.168.202.2 NFS 312
V4 Reply (Call In 681) GETATTR
683 4.671281003 192.168.202.2 192.168.201.1 NFS 288
V4 Call (Reply In 684) ACCESS FH: 0xb6d01f2a, [Check: RD LU MD XT DL]
684 4.692647455 192.168.201.1 192.168.202.2 NFS 240
V4 Reply (Call In 683) ACCESS, [Allowed: RD LU MD XT DL]
685 4.692825251 192.168.202.2 192.168.201.1 NFS 304
V4 Call (Reply In 690) READDIR FH: 0xb6d01f2a
686 4.715060586 192.168.201.1 192.168.202.2 TCP 1405
2049 → 679 [ACK] Seq=311881 Ack=47237 Win=1452 Len=1337 TSval=2646321748
TSecr=913651486 [TCP segment of a reassembled PDU]
687 4.715199557 192.168.201.1 192.168.202.2 TCP 1405
2049 → 679 [ACK] Seq=313218 Ack=47237 Win=1452 Len=1337 TSval=2646321748
TSecr=913651486 [TCP segment of a reassembled PDU]
688 4.715215055 192.168.202.2 192.168.201.1 TCP 68
679 → 2049 [ACK] Seq=47237 Ack=314555 Win=1444 Len=0 TSval=913651509
TSecr=2646321748
689 4.715524465 192.168.201.1 192.168.202.2 TCP 1405
2049 → 679 [ACK] Seq=314555 Ack=47237 Win=1452 Len=1337 TSval=2646321749
TSecr=913651486 [TCP segment of a reassembled PDU]
690 4.715911571 192.168.201.1 192.168.202.2 NFS 1449
V4 Reply (Call In 685) READDIR
NFSv4 directory reads later (second "ls -lR"):
No. Time Source Destination Protocol Length Info
664 9.485593049 192.168.202.2 192.168.201.1 NFS 304
V4 Call (Reply In 669) READDIR FH: 0x1933e99e
665 9.507596250 192.168.201.1 192.168.202.2 TCP 1405
2049 → 788 [ACK] Seq=127921 Ack=65730 Win=3076 Len=1337 TSval=2645776572
TSecr=913106316 [TCP segment of a reassembled PDU]
666 9.507717425 192.168.201.1 192.168.202.2 TCP 1405
2049 → 788 [ACK] Seq=129258 Ack=65730 Win=3076 Len=1337 TSval=2645776572
TSecr=913106316 [TCP segment of a reassembled PDU]
667 9.507733352 192.168.202.2 192.168.201.1 TCP 68
788 → 2049 [ACK] Seq=65730 Ack=130595 Win=1444 Len=0 TSval=913106338
TSecr=2645776572
668 9.507987020 192.168.201.1 192.168.202.2 TCP 1405
2049 → 788 [ACK] Seq=130595 Ack=65730 Win=3076 Len=1337 TSval=2645776572
TSecr=913106316 [TCP segment of a reassembled PDU]
669 9.508456847 192.168.201.1 192.168.202.2 NFS 989
V4 Reply (Call In 664) READDIR
670 9.508472149 192.168.202.2 192.168.201.1 TCP 68
788 → 2049 [ACK] Seq=65730 Ack=132853 Win=1444 Len=0 TSval=913106338
TSecr=2645776572
671 9.508880627 192.168.202.2 192.168.201.1 NFS 280
V4 Call (Reply In 672) GETATTR FH: 0x7e9e8300
672 9.530375865 192.168.201.1 192.168.202.2 NFS 312
V4 Reply (Call In 671) GETATTR
673 9.530564317 192.168.202.2 192.168.201.1 NFS 280
V4 Call (Reply In 674) GETATTR FH: 0xcb837ac9
674 9.551906321 192.168.201.1 192.168.202.2 NFS 312
V4 Reply (Call In 673) GETATTR
675 9.552064038 192.168.202.2 192.168.201.1 NFS 280
V4 Call (Reply In 676) GETATTR FH: 0xbf951d32
676 9.574210528 192.168.201.1 192.168.202.2 NFS 312
V4 Reply (Call In 675) GETATTR
677 9.574334117 192.168.202.2 192.168.201.1 NFS 280
V4 Call (Reply In 678) GETATTR FH: 0xd3f3dc3e
678 9.595902902 192.168.201.1 192.168.202.2 NFS 312
V4 Reply (Call In 677) GETATTR
679 9.596025484 192.168.202.2 192.168.201.1 NFS 280
V4 Call (Reply In 680) GETATTR FH: 0xf534332a
680 9.617497794 192.168.201.1 192.168.202.2 NFS 312
V4 Reply (Call In 679) GETATTR
681 9.617621218 192.168.202.2 192.168.201.1 NFS 280
V4 Call (Reply In 682) GETATTR FH: 0xa7e5bbc5
682 9.639157371 192.168.201.1 192.168.202.2 NFS 312
V4 Reply (Call In 681) GETATTR
683 9.639279098 192.168.202.2 192.168.201.1 NFS 280
V4 Call (Reply In 684) GETATTR FH: 0xa8050515
684 9.660669335 192.168.201.1 192.168.202.2 NFS 312
V4 Reply (Call In 683) GETATTR
685 9.660787725 192.168.202.2 192.168.201.1 NFS 304
V4 Call (Reply In 686) READDIR FH: 0x7e9e8300
686 9.682612756 192.168.201.1 192.168.202.2 NFS 1472
V4 Reply (Call In 685) READDIR
687 9.682646761 192.168.202.2 192.168.201.1 TCP 68
788 → 2049 [ACK] Seq=67450 Ack=135965 Win=1444 Len=0 TSval=913106513
TSecr=2645776747
688 9.682906293 192.168.202.2 192.168.201.1 NFS 280
V4 Call (Reply In 689) GETATTR FH: 0xa8050515
Lots of GETATTR calls the second time around (each file ?).
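
To check that impression against a whole capture, the calls can be
tallied per operation. The Python sketch below assumes a packet summary
exported as text with Info fields in the same form as the traces above
("V4 Call (Reply In N) OP ..."); the function name is illustrative and
the sample is a short excerpt of the second trace.

```python
import re
from collections import Counter

# Matches the Info column of an NFSv4 call as it appears in the traces
# above, e.g. "V4 Call (Reply In 672) GETATTR FH: 0x7e9e8300".
CALL_RE = re.compile(r"V4 Call \(Reply In \d+\) (\w+)")


def count_calls(summary_text):
    """Count NFSv4 calls per operation name in a text packet summary."""
    return Counter(CALL_RE.findall(summary_text))


# Excerpt of Info lines from the second trace (replies are ignored).
sample = """\
V4 Call (Reply In 669) READDIR FH: 0x1933e99e
V4 Reply (Call In 664) READDIR
V4 Call (Reply In 672) GETATTR FH: 0x7e9e8300
V4 Reply (Call In 671) GETATTR
V4 Call (Reply In 674) GETATTR FH: 0xcb837ac9
V4 Reply (Call In 673) GETATTR
V4 Call (Reply In 686) READDIR FH: 0x7e9e8300
"""

for op, n in count_calls(sample).most_common():
    print(f"{op}: {n}")
```

Comparing the GETATTR count for the first and second runs against the
number of files in the tree would confirm whether it really is one call
per file.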
NFS really is broken performance-wise these days, and it "appears"
that significant/huge improvements are possible.
Does anyone know which group is responsible for the NFS protocol these
days? And which group is responsible for the Linux kernel's
implementation of it?
--
Dr Terry Barnaby BEAM Ltd
Phone: +44 1454 324512 Northavon Business Center,
Email: terry@xxxxxxxxxxx Dean Rd, Yate
Web: www.beam.ltd.uk Bristol, BS37 5NH, UK
BEAM Engineering: Instrumentation, Electronics/Software/Systems