Re: Fwd: Re: Fedora27: NFS v4 terrible write performance, is async working


 



On 05/02/18 14:52, J. Bruce Fields wrote:
>> Yet another poor NFS performance issue. If I do a "ls -lR" of a certain
>> NFS mounted directory over a slow link (NFS over OpenVPN over FTTP
>> 80/20Mbps), just after mounting the file system (default NFSv4 mount with
>> async), it takes about 9 seconds. If I run the same "ls -lR" again, just
>> after, it takes about 60 seconds.
>
> A wireshark trace might help.
>
> Also, is it possible some process is writing while this is happening?
>
> --b.

OK, I have made some Wireshark traces and put them at:

https://www.beam.ltd.uk/files/files//nfs/

There are other processes running, obviously, but nothing that should really affect this.

As a naive observation, it looks like the client is using a cache but checking the update time of each file individually with GETATTR. Because it issues one GETATTR per file in each directory, the latency of these RPC calls mounts up. I guess it would be possible to check the cache status of all files in a directory with a single call, something like a "GETATTR_DIR <dir>" RPC, which would make this faster when a full readdir is in progress. The extra data returned would probably not noticeably slow down a cache check for a single file, as latency rather than the amount of data is the killer.
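To put rough numbers on the latency argument, here is a minimal Python sketch. The round-trip time of about 21.5 ms is roughly what the GETATTR call/reply pairs in the trace excerpts show; the file and directory counts are purely assumed figures for illustration, and "GETATTR_DIR" is the hypothetical call suggested above, not an existing NFS operation.

```python
# Rough latency model: serialized per-file GETATTRs vs. one hypothetical
# per-directory call. The counts below are assumptions for illustration.

def serialized_seconds(n_rpcs: int, rtt_s: float) -> float:
    """Wait time when each RPC is only sent after the previous reply arrives."""
    return n_rpcs * rtt_s

def batched_seconds(n_dirs: int, rtt_s: float) -> float:
    """Wait time if one (hypothetical) GETATTR_DIR call per directory sufficed."""
    return n_dirs * rtt_s

RTT_S = 0.0215   # ~21.5 ms, roughly what the trace excerpts show
N_FILES = 2500   # assumed number of files (not taken from the traces)
N_DIRS = 100     # assumed number of directories (not taken from the traces)

per_file = serialized_seconds(N_FILES, RTT_S)
per_dir = batched_seconds(N_DIRS, RTT_S)

print(f"one GETATTR per file:     {per_file:.1f} s")
print(f"one call per directory:   {per_dir:.1f} s")
```

The point of the model is only that the cost scales with the number of round trips, not with the bytes transferred, so anything that reduces the RPC count on a high-latency link should help far more than extra bandwidth would.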

So much for caching! I have noticed Makefile-based builds (over 1 Gbps
Ethernet) taking a long time, with a second or so between each directory;
I think this may be why.

Listing the directory over an NFSv3 mount takes 67 seconds the first time
after mounting and about the same on subsequent runs. There is no noticeable
caching (default mount options with async). At least NFSv4 is fast the
first time!

NFSv4 directory reads after mount:

No.     Time           Source                Destination           Protocol Length Info
     667 4.560833210    192.168.202.2         192.168.201.1 NFS      304
V4 Call (Reply In 672) READDIR FH: 0xde55a546
     668 4.582809439    192.168.201.1         192.168.202.2 TCP      1405
2049 → 679 [ACK] Seq=304477 Ack=45901 Win=1452 Len=1337 TSval=2646321616
TSecr=913651354 [TCP segment of a reassembled PDU]
     669 4.582986377    192.168.201.1         192.168.202.2 TCP      1405
2049 → 679 [ACK] Seq=305814 Ack=45901 Win=1452 Len=1337 TSval=2646321616
TSecr=913651354 [TCP segment of a reassembled PDU]
     670 4.583003805    192.168.202.2         192.168.201.1 TCP      68
679 → 2049 [ACK] Seq=45901 Ack=307151 Win=1444 Len=0 TSval=913651376
TSecr=2646321616
     671 4.583265423    192.168.201.1         192.168.202.2 TCP      1405
2049 → 679 [ACK] Seq=307151 Ack=45901 Win=1452 Len=1337 TSval=2646321616
TSecr=913651354 [TCP segment of a reassembled PDU]
     672 4.583280603    192.168.201.1         192.168.202.2 NFS      289
V4 Reply (Call In 667) READDIR
     673 4.583291818    192.168.202.2         192.168.201.1 TCP      68
679 → 2049 [ACK] Seq=45901 Ack=308709 Win=1444 Len=0 TSval=913651377
TSecr=2646321616
     674 4.583819172    192.168.202.2         192.168.201.1 NFS      280
V4 Call (Reply In 675) GETATTR FH: 0xb91bfde7
     675 4.605389953    192.168.201.1         192.168.202.2 NFS      312
V4 Reply (Call In 674) GETATTR
     676 4.605491075    192.168.202.2         192.168.201.1 NFS      288
V4 Call (Reply In 677) ACCESS FH: 0xb91bfde7, [Check: RD LU MD XT DL]
     677 4.626848306    192.168.201.1         192.168.202.2 NFS      240
V4 Reply (Call In 676) ACCESS, [Allowed: RD LU MD XT DL]
     678 4.626993773    192.168.202.2         192.168.201.1 NFS      304
V4 Call (Reply In 679) READDIR FH: 0xb91bfde7
     679 4.649330354    192.168.201.1         192.168.202.2 NFS      2408
V4 Reply (Call In 678) READDIR
     680 4.649380840    192.168.202.2         192.168.201.1 TCP      68
679 → 2049 [ACK] Seq=46569 Ack=311465 Win=1444 Len=0 TSval=913651443
TSecr=2646321683
     681 4.649716746    192.168.202.2         192.168.201.1 NFS      280
V4 Call (Reply In 682) GETATTR FH: 0xb6d01f2a
     682 4.671167708    192.168.201.1         192.168.202.2 NFS      312
V4 Reply (Call In 681) GETATTR
     683 4.671281003    192.168.202.2         192.168.201.1 NFS      288
V4 Call (Reply In 684) ACCESS FH: 0xb6d01f2a, [Check: RD LU MD XT DL]
     684 4.692647455    192.168.201.1         192.168.202.2 NFS      240
V4 Reply (Call In 683) ACCESS, [Allowed: RD LU MD XT DL]
     685 4.692825251    192.168.202.2         192.168.201.1 NFS      304
V4 Call (Reply In 690) READDIR FH: 0xb6d01f2a
     686 4.715060586    192.168.201.1         192.168.202.2 TCP      1405
2049 → 679 [ACK] Seq=311881 Ack=47237 Win=1452 Len=1337 TSval=2646321748
TSecr=913651486 [TCP segment of a reassembled PDU]
     687 4.715199557    192.168.201.1         192.168.202.2 TCP      1405
2049 → 679 [ACK] Seq=313218 Ack=47237 Win=1452 Len=1337 TSval=2646321748
TSecr=913651486 [TCP segment of a reassembled PDU]
     688 4.715215055    192.168.202.2         192.168.201.1 TCP      68
679 → 2049 [ACK] Seq=47237 Ack=314555 Win=1444 Len=0 TSval=913651509
TSecr=2646321748
     689 4.715524465    192.168.201.1         192.168.202.2 TCP      1405
2049 → 679 [ACK] Seq=314555 Ack=47237 Win=1452 Len=1337 TSval=2646321749
TSecr=913651486 [TCP segment of a reassembled PDU]
     690 4.715911571    192.168.201.1         192.168.202.2 NFS      1449
V4 Reply (Call In 685) READDIR

NFSv4 directory reads later:

No.     Time           Source                Destination           Protocol Length Info
     664 9.485593049    192.168.202.2         192.168.201.1 NFS      304
V4 Call (Reply In 669) READDIR FH: 0x1933e99e
     665 9.507596250    192.168.201.1         192.168.202.2 TCP      1405
2049 → 788 [ACK] Seq=127921 Ack=65730 Win=3076 Len=1337 TSval=2645776572
TSecr=913106316 [TCP segment of a reassembled PDU]
     666 9.507717425    192.168.201.1         192.168.202.2 TCP      1405
2049 → 788 [ACK] Seq=129258 Ack=65730 Win=3076 Len=1337 TSval=2645776572
TSecr=913106316 [TCP segment of a reassembled PDU]
     667 9.507733352    192.168.202.2         192.168.201.1 TCP      68
788 → 2049 [ACK] Seq=65730 Ack=130595 Win=1444 Len=0 TSval=913106338
TSecr=2645776572
     668 9.507987020    192.168.201.1         192.168.202.2 TCP      1405
2049 → 788 [ACK] Seq=130595 Ack=65730 Win=3076 Len=1337 TSval=2645776572
TSecr=913106316 [TCP segment of a reassembled PDU]
     669 9.508456847    192.168.201.1         192.168.202.2 NFS      989
V4 Reply (Call In 664) READDIR
     670 9.508472149    192.168.202.2         192.168.201.1 TCP      68
788 → 2049 [ACK] Seq=65730 Ack=132853 Win=1444 Len=0 TSval=913106338
TSecr=2645776572
     671 9.508880627    192.168.202.2         192.168.201.1 NFS      280
V4 Call (Reply In 672) GETATTR FH: 0x7e9e8300
     672 9.530375865    192.168.201.1         192.168.202.2 NFS      312
V4 Reply (Call In 671) GETATTR
     673 9.530564317    192.168.202.2         192.168.201.1 NFS      280
V4 Call (Reply In 674) GETATTR FH: 0xcb837ac9
     674 9.551906321    192.168.201.1         192.168.202.2 NFS      312
V4 Reply (Call In 673) GETATTR
     675 9.552064038    192.168.202.2         192.168.201.1 NFS      280
V4 Call (Reply In 676) GETATTR FH: 0xbf951d32
     676 9.574210528    192.168.201.1         192.168.202.2 NFS      312
V4 Reply (Call In 675) GETATTR
     677 9.574334117    192.168.202.2         192.168.201.1 NFS      280
V4 Call (Reply In 678) GETATTR FH: 0xd3f3dc3e
     678 9.595902902    192.168.201.1         192.168.202.2 NFS      312
V4 Reply (Call In 677) GETATTR
     679 9.596025484    192.168.202.2         192.168.201.1 NFS      280
V4 Call (Reply In 680) GETATTR FH: 0xf534332a
     680 9.617497794    192.168.201.1         192.168.202.2 NFS      312
V4 Reply (Call In 679) GETATTR
     681 9.617621218    192.168.202.2         192.168.201.1 NFS      280
V4 Call (Reply In 682) GETATTR FH: 0xa7e5bbc5
     682 9.639157371    192.168.201.1         192.168.202.2 NFS      312
V4 Reply (Call In 681) GETATTR
     683 9.639279098    192.168.202.2         192.168.201.1 NFS      280
V4 Call (Reply In 684) GETATTR FH: 0xa8050515
     684 9.660669335    192.168.201.1         192.168.202.2 NFS      312
V4 Reply (Call In 683) GETATTR
     685 9.660787725    192.168.202.2         192.168.201.1 NFS      304
V4 Call (Reply In 686) READDIR FH: 0x7e9e8300
     686 9.682612756    192.168.201.1         192.168.202.2 NFS      1472
V4 Reply (Call In 685) READDIR
     687 9.682646761    192.168.202.2         192.168.201.1 TCP      68
788 → 2049 [ACK] Seq=67450 Ack=135965 Win=1444 Len=0 TSval=913106513
TSecr=2645776747
     688 9.682906293    192.168.202.2         192.168.201.1 NFS      280
V4 Call (Reply In 689) GETATTR FH: 0xa8050515

Lots of GETATTR calls the second time around (one per file?).
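As a quick sanity check, the round-trip time of those GETATTR calls can be worked out from the timestamps in the second excerpt (frames 671 to 684). The Python sketch below does that and then asks how many such round trips would fit into the roughly 60 seconds the second "ls -lR" takes. Treating the whole 60 seconds as back-to-back RPC waits is an assumption on my part, so the result is only an order-of-magnitude figure.

```python
from statistics import mean

# (call time, reply time) in seconds for the seven GETATTR pairs
# shown in the second trace excerpt (frames 671-684).
GETATTR_PAIRS = [
    (9.508880627, 9.530375865),
    (9.530564317, 9.551906321),
    (9.552064038, 9.574210528),
    (9.574334117, 9.595902902),
    (9.596025484, 9.617497794),
    (9.617621218, 9.639157371),
    (9.639279098, 9.660669335),
]

# Round-trip time of each call as seen by the client.
rtts = [reply - call for call, reply in GETATTR_PAIRS]
mean_rtt = mean(rtts)

# Assumption: the whole second run is spent waiting on serialized RPCs.
SECOND_RUN_S = 60.0
implied_rpcs = SECOND_RUN_S / mean_rtt

print(f"mean GETATTR round trip: {mean_rtt * 1000:.1f} ms")
print(f"round trips that fit in {SECOND_RUN_S:.0f} s: {implied_rpcs:.0f}")
```

Each call waits a little over 21 ms for its reply and the next call only goes out afterwards, so a few thousand serialized GETATTRs would be enough to account for the whole minute.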

NFS really is broken performance-wise these days, and it "appears" that
significant improvements are possible.

Does anyone know which group or person is responsible for the NFS protocol
these days?

Also, which group or person is responsible for the Linux kernel's
implementation of it?

--

Dr Terry Barnaby            BEAM Ltd
Phone: +44 1454 324512      Northavon Business Center,
Email: terry@xxxxxxxxxxx    Dean Rd, Yate
Web: www.beam.ltd.uk        Bristol, BS37 5NH, UK
BEAM Engineering: Instrumentation, Electronics/Software/Systems



