Hi,
On 01/30/2018 01:03 PM, Terry Barnaby wrote:
Being a daredevil, I have used the NFS async option for 27 years
without an issue on multiple systems :)
I have just mounted my ext4 disk with the same options you were using
and the same NFS export options and the speed here looks the same as I
had previously. As I can't wait 2+ hours, I'm just looking at
ksysguard, which is showing a network rate of about 10 KBytes/s, and
the directory on the server is growing in size very, very slowly.
This is using the current Fedora27 kernel 4.14.14-300.fc27.x86_64.
I will have a look at using wireshark to see if this shows anything.
This is a snippet from a wireshark trace of the NFS when untaring the
linux kernel 4.14.15 sources into an NFSv4.2 mounted directory with
"sync" option on my NFS server. The whole untar would take > 2 hours vs
13 seconds direct to the disk. This is about 850 MBytes of 60k files.
The following is a single, small file write.
No. Time Source Destination Protocol Length Info
1880 11.928600315 192.168.202.2 192.168.202.1 NFS 380
V4 Call (Reply In 1881) OPEN DH: 0xac0502f2/sysfs-c2port
1881 11.950329198 192.168.202.1 192.168.202.2 NFS 408
V4 Reply (Call In 1880) OPEN StateID: 0xaa72
1882 11.950446430 192.168.202.2 192.168.202.1 NFS 304
V4 Call (Reply In 1883) SETATTR FH: 0x825014ee
1883 11.972608880 192.168.202.1 192.168.202.2 NFS 336
V4 Reply (Call In 1882) SETATTR
1884 11.972754709 192.168.202.2 192.168.202.1 TCP
1516 785 → 2049 [ACK] Seq=465561 Ack=183381 Win=8990 Len=1448
TSval=1663691771 TSecr=3103357902 [TCP segment of a reassembled PDU]
1885 11.972763078 192.168.202.2 192.168.202.1 TCP
1516 785 → 2049 [ACK] Seq=467009 Ack=183381 Win=8990 Len=1448
TSval=1663691771 TSecr=3103357902 [TCP segment of a reassembled PDU]
1886 11.972979437 192.168.202.2 192.168.202.1 NFS 332
V4 Call (Reply In 1888) WRITE StateID: 0xafdf Offset: 0 Len: 2931
1887 11.973074490 192.168.202.1 192.168.202.2 TCP
68 2049 → 785 [ACK] Seq=183381 Ack=468721 Win=24557 Len=0
TSval=3103357902 TSecr=1663691771
1888 12.017153631 192.168.202.1 192.168.202.2 NFS 248
V4 Reply (Call In 1886) WRITE
1889 12.017338766 192.168.202.2 192.168.202.1 NFS 260
V4 Call (Reply In 1890) GETATTR FH: 0x825014ee
1890 12.017834411 192.168.202.1 192.168.202.2 NFS 312
V4 Reply (Call In 1889) GETATTR
1891 12.017961690 192.168.202.2 192.168.202.1 NFS 328
V4 Call (Reply In 1892) SETATTR FH: 0x825014ee
1892 12.039456634 192.168.202.1 192.168.202.2 NFS 336
V4 Reply (Call In 1891) SETATTR
1893 12.039536705 192.168.202.2 192.168.202.1 NFS 284
V4 Call (Reply In 1894) CLOSE StateID: 0xaa72
1894 12.039979528 192.168.202.1 192.168.202.2 NFS 248
V4 Reply (Call In 1893) CLOSE
1895 12.040077180 192.168.202.2 192.168.202.1 NFS 392
V4 Call (Reply In 1896) OPEN DH: 0xac0502f2/sysfs-cfq-target-latency
1896 12.061903798 192.168.202.1 192.168.202.2 NFS 408
V4 Reply (Call In 1895) OPEN StateID: 0xaa72
It looks like this takes about 100ms to write this small file. With the
approx 60k files in the archive this would take about 6000 secs, so is
in the 2-hour ballpark of the untar that I am seeing.
Looks like OPEN 21ms, SETATTR 22ms, WRITE 44ms, second SETATTR 21ms: a
lot of time ...
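As a sanity check, summing the per-operation round trips read off the trace above gives roughly the same estimate. This is only a back-of-the-envelope sketch; the per-op latencies and the ~60k file count come from the trace and archive mentioned earlier:

```python
# Rough per-file cost on the "sync" mount, using the round-trip
# latencies read off the trace above (values in milliseconds).
sync_ops_ms = {"OPEN": 21, "SETATTR1": 22, "WRITE": 44, "SETATTR2": 21}

per_file_ms = sum(sync_ops_ms.values())   # ~108 ms per small file
total_s = per_file_ms / 1000 * 60_000     # ~60k files in the tarball

print(f"per file: {per_file_ms} ms, whole untar: ~{total_s / 3600:.1f} h")
```

Which lands at roughly 1.8 hours, consistent with the >2 hour untar observed.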
The following is for an "async" mount:
No. Time Source Destination Protocol Length Info
37393 7.630012608 192.168.202.2 192.168.202.1 NFS 396
V4 Call (Reply In 37394) OPEN DH: 0x1f828ac9/vidioc-dbg-g-chip-info.rst
37394 7.630488451 192.168.202.1 192.168.202.2 NFS 408
V4 Reply (Call In 37393) OPEN StateID: 0xaa72
37395 7.630525117 192.168.202.2 192.168.202.1 NFS 304
V4 Call (Reply In 37396) SETATTR FH: 0x0f65c554
37396 7.630980560 192.168.202.1 192.168.202.2 NFS 336
V4 Reply (Call In 37395) SETATTR
37397 7.631035171 192.168.202.2 192.168.202.1 TCP
1516 785 → 2049 [ACK] Seq=13054241 Ack=3620329 Win=8990 Len=1448
TSval=1664595527 TSecr=3104261711 [TCP segment of a reassembled PDU]
37398 7.631038994 192.168.202.2 192.168.202.1 TCP
1516 785 → 2049 [ACK] Seq=13055689 Ack=3620329 Win=8990 Len=1448
TSval=1664595527 TSecr=3104261711 [TCP segment of a reassembled PDU]
37399 7.631042228 192.168.202.2 192.168.202.1 TCP
1516 785 → 2049 [ACK] Seq=13057137 Ack=3620329 Win=8990 Len=1448
TSval=1664595527 TSecr=3104261711 [TCP segment of a reassembled PDU]
37400 7.631195554 192.168.202.2 192.168.202.1 NFS 448
V4 Call (Reply In 37402) WRITE StateID: 0xafdf Offset: 0 Len: 4493
37401 7.631277423 192.168.202.1 192.168.202.2 TCP
68 2049 → 785 [ACK] Seq=3620329 Ack=13058965 Win=24550 Len=0
TSval=3104261712 TSecr=1664595527
37402 7.631506418 192.168.202.1 192.168.202.2 NFS 248
V4 Reply (Call In 37400) WRITE
37403 7.631529718 192.168.202.2 192.168.202.1 NFS 260
V4 Call (Reply In 37404) GETATTR FH: 0x0f65c554
37404 7.631946710 192.168.202.1 192.168.202.2 NFS 312
V4 Reply (Call In 37403) GETATTR
37405 7.631982683 192.168.202.2 192.168.202.1 NFS 328
V4 Call (Reply In 37406) SETATTR FH: 0x0f65c554
37406 7.632423600 192.168.202.1 192.168.202.2 NFS 336
V4 Reply (Call In 37405) SETATTR
37407 7.632461397 192.168.202.2 192.168.202.1 NFS 284
V4 Call (Reply In 37408) CLOSE StateID: 0xaa72
37408 7.632880138 192.168.202.1 192.168.202.2 NFS 248
V4 Reply (Call In 37407) CLOSE
37409 7.632926994 192.168.202.2 192.168.202.1 NFS 396
V4 Call (Reply In 37410) OPEN DH: 0x1f828ac9/vidioc-dbg-g-register.rst
37410 7.633470097 192.168.202.1 192.168.202.2 NFS 408
V4 Reply (Call In 37409) OPEN StateID: 0xaa72
It looks like this takes about 3ms to write this small file. With the
approx 60k files in the archive this would take about 180 secs, so is in
the 3-minute ballpark that I am seeing.
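The same back-of-the-envelope arithmetic works for the async case, assuming ~0.5 ms per RPC and the six NFS calls per file (OPEN, SETATTR, WRITE, GETATTR, SETATTR, CLOSE) visible in the trace:

```python
# Estimate for the "async" mount: each RPC completes in roughly
# 0.5 ms, and the trace shows six NFS calls per file.
rtt_ms = 0.5
calls_per_file = 6

per_file_ms = rtt_ms * calls_per_file     # ~3 ms per small file
total_s = per_file_ms / 1000 * 60_000     # ~60k files in the tarball

print(f"per file: {per_file_ms} ms, whole untar: ~{total_s:.0f} s")
```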
It looks like each RPC call takes about 0.5ms. Why do there need to be
so many RPC calls for this? The OPEN call could set the attribs, with no
need for the later GETATTR or SETATTR calls. Even the CLOSE could be
integrated with the WRITE, and taking this further, OPEN could do the
OPEN, SETATTR, and some WRITE all in one.
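To put rough numbers on that suggestion: if the per-file sequence could be folded into fewer compound round trips, the wall time scales directly with the call count. A hypothetical model only (the call counts and the 0.5 ms RTT are assumptions taken from the trace; the Linux client does not actually batch NFSv4 compounds this way):

```python
# Hypothetical effect of folding the six-call-per-file sequence
# (OPEN, SETATTR, WRITE, GETATTR, SETATTR, CLOSE) into fewer
# compound round trips, at the ~0.5 ms per-RPC cost seen on the
# async mount.
rtt_ms = 0.5
files = 60_000

def untar_seconds(calls_per_file: int) -> float:
    """Total wall time if each file costs calls_per_file round trips."""
    return calls_per_file * rtt_ms * files / 1000

for calls in (6, 3, 1):  # today; OPEN+SETATTR+WRITE merged; fully merged
    print(f"{calls} calls/file -> ~{untar_seconds(calls):.0f} s")
```

Even halving the call count would cut the latency-bound part of the untar roughly in half.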
Have you tried this with a '-o nfsvers=3' during mount? Did that help?
I noticed a large increase in my kernel build times across NFS/lan a
while back after a machine/kernel/10g upgrade. After playing with
mount/export options, filesystem tuning, etc., I got to the point of
timing a bunch of these operations against the older machine, at which
point I discovered that simply backing down to NFSv3 solved the problem.
In other words, an NFSv3 server on a 10-year-old 4-disk xfs RAID5 on 1Gb
ethernet was faster than a modern machine with an 8-disk xfs RAID5 on
10Gb ethernet running NFSv4. The effect was enough to change a kernel
build from ~45 minutes down to less than 5.
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx