On 07/05/12 09:19, Daniel Pocock wrote:
>
>>> Ok, so the combination of:
>>>
>>> - enable writeback with hdparm
>>> - use ext4 (and not ext3)
>>> - barrier=1 and data=writeback? or data=?
>>>
>>> - is there a particular kernel version (on either client or server
>>>   side) that will offer more stability using this combination of
>>>   features?
>>
>> Not that I'm aware of. As long as you have a kernel > 2.6.29, then LVM
>> should work correctly. The main problem is that some SATA hardware
>> tends to be buggy, defeating the methods used by the barrier code to
>> ensure data is truly on disk. I believe that XFS will therefore
>> actually test the hardware when you mount with write caching and
>> barriers, and should report if the test fails in the syslogs.
>> See http://xfs.org/index.php/XFS_FAQ#Write_barrier_support.
>>
>>> I think there are some other variations of my workflow that I can
>>> attempt too, e.g. I've contemplated compiling C++ code onto a RAM
>>> disk because I don't need to keep the hundreds of object files.
>>
>> You might also consider using something like ccache and set the
>> CCACHE_DIR to a local disk if you have one.
>
> Thanks for the feedback about these options; I am going to look at
> these strategies more closely.

I decided to try taking md and LVM out of the picture, so I tried two
variations:

a) The boot partitions are not mirrored, so I reformatted one of them
   as ext4:
   - enabled the write cache for the whole of sdb
   - mounted it as ext4 with barrier=1,data=ordered
   - exported this volume over NFS

   Unpacking a large source tarball on this volume, iostat reports write
   speeds that are even slower, barely 300 kBytes/sec.

b) I took an external USB HDD (a rough command sequence for this test
   is appended at the end of this mail):
   - created two 20GB partitions, sdc1 and sdc2
   - formatted sdc1 as btrfs
   - formatted sdc2 as ext4
   - mounted sdc2 the same as sdb1 in test (a): ext4,
     barrier=1,data=ordered
   - exported both volumes over NFS

   Unpacking a large source tarball on these two volumes, iostat reports
   write speeds of around 5 MB/sec - much faster than with the original
   problem I was having.

Bottom line: this leaves me with the impression that either
- the server's SATA controller or disks need a firmware upgrade, or
- there is some issue with the kernel barriers and/or cache flushing on
  this specific SATA hardware.

I think it is fair to say that the NFS client is not at fault. However,
I can imagine many people would be tempted to just use `async' when
faced with a problem like this, given that async makes everything just
run fast.
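For reference, the sequence for test (b) was roughly the following. The
mount points, export options and tarball name are illustrative
placeholders rather than the exact values I used, and the hdparm step
on sdc is shown only by analogy with test (a); the mount options are
the ones described above.

  # enable the drive write cache on the USB disk (by analogy with sdb
  # in test a)
  hdparm -W1 /dev/sdc

  # create the two filesystems
  mkfs.btrfs /dev/sdc1
  mkfs.ext4 /dev/sdc2

  # mount with the same options as in test (a)
  mount -t btrfs /dev/sdc1 /srv/test-btrfs
  mount -t ext4 -o barrier=1,data=ordered /dev/sdc2 /srv/test-ext4

  # /etc/exports - server-side sync, not async:
  #   /srv/test-btrfs  *(rw,sync,no_subtree_check)
  #   /srv/test-ext4   *(rw,sync,no_subtree_check)
  exportfs -ra

  # on an NFS client, unpack the tarball into the NFS mount
  tar -xjf some-large-source.tar.bz2 -C /mnt/test-ext4

  # on the server, watch the write throughput to the USB disk
  iostat -dk 2 sdc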