On Thu, 2013-04-11 at 16:58 +0200, Ferry wrote:
> Hi there,
>
> we had 2 SANs (8-disk SATA in RAID-10 (mdadm), only 4GB RAM) running
> IET. Whilst it was on IET the read latency was *always* above the write
> latency. We used buffered fileio on IET as well. This is quite expected
> in my limited view, as the writes would go to RAM (buffer) on the SAN and
> were written to disk a little later (but this isn't visible from VMware
> of course). Reads most of the time actually have to come from the disk,
> so the rust has to move and that takes time.

Please provide the lspci -vvv output, and HBA / disk / mdadm information,
preferably for both the IET and LIO machines if possible.

> Since a couple of months we're running on LIO (on Ubuntu 12.10 with 2 x
> 12 disk SAS in RAID-10 with 16GB RAM) and we notice occasional hiccups,
> as well as the write latency being pretty high. Write latency
> frequently peaks above 100ms, which I don't really get, as there's 16GB
> of memory in the servers and it should have plenty of buffer space with
> the loads we currently have on it. With IET the write latency didn't go
> above 1-2ms until the buffers were full (i.e. when we started
> writing/copying a 50GB VM, for example).

The fact that the buffer cache is 4x larger could be having an effect.
Also, 2 x 12 disks means a single RAID-10 with 24 disks vs. a RAID-10
with 8 disks on the previous setup, yes..? Having this many drives in a
single software RAID could be having an effect here, given that only a
single kernel thread is responsible for doing I/O to all of these drives.
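For collecting the requested HBA / disk / mdadm details in one pass, something like the sketch below could be used (the report filename and /dev/md0 are assumptions; adjust to the actual array devices, and each command is guarded so missing tools are skipped):

```shell
#!/bin/sh
# Gather PCI/HBA, mdadm, and disk information into one report file.
OUT=san-report.txt
: > "$OUT"

# Full PCI device dump, including the HBA capabilities
command -v lspci >/dev/null && lspci -vvv >> "$OUT" 2>&1

# Software RAID state: overall view plus per-array detail
# (replace /dev/md0 with each array device on the box)
[ -r /proc/mdstat ] && cat /proc/mdstat >> "$OUT"
command -v mdadm >/dev/null && mdadm --detail /dev/md0 >> "$OUT" 2>&1

# Per-disk model/size/rotational overview
command -v lsblk >/dev/null && lsblk -o NAME,MODEL,SIZE,ROTA,TYPE >> "$OUT" 2>&1

echo "report written to $OUT"
```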
> For example, looking at the performance on one of the hosts for the
> entire 2 SANs:
>
> SAN1: read latency (avg 30 mins) 4.261ms - max (30 mins) 57ms,
>       write latency (avg 30 mins) 7.194ms - max (30 mins) 83ms
>       (this SAN is the least loaded btw)
> SAN2: read latency (avg 30 mins) 5.756ms - max (30 mins) 54ms,
>       write latency (avg 30 mins) 14.744ms - max (30 mins) 106ms
>
> During normal loads on the previous setup the read latencies were
> *always* higher than the write latencies. The opposite is true now
> (most of the time, anyway).
>
> Any ideas what might cause this? As VMware does sync writes only, these
> latencies seem to hinder performance a lot. Whilst this hardware is not
> new and approximately the same age as the SATA disks, the density is
> lower and there are more platters to spread the load over. Yet it
> performs worse in writes over iSCSI (tests before production showed
> higher throughput on the machine locally) and, oddly enough, is faster
> in reads (there's more memory though, so it will have more in cache).
>
> Any ideas on what might be causing these write latencies? No clue
> where to look for it. One thing worth mentioning is that the OS now runs
> from a USB stick. I don't see any correlation between the USB stick
> performance and LIO/mdadm, but if there is any, that might explain a
> lot, as the stick is horribly slow due to USB 1.1 (the chipset reports
> USB 2.0 support, but I think HP found it too expensive to actually wire
> that to the port - tried all ports - none function as USB 2.0,
> unfortunately).
>
> Btw, if I monitor the disks with iostat every 2 seconds (iostat -x 2
> /dev/md0) whilst pushing lots of data to it, one usually sees: nothing,
> nothing, nothing, 480-520MB/s, nothing, nothing, nothing, 480-520MB/s,
> nothing, etc. So buffers seem to be working just fine. I hardly ever see
> access to the USB stick the OS is on, but when it does happen, await
> times are horrible (1100ms+).
>
> Is it possible to tune the cache/buffer, by the way?
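The burst pattern seen in iostat can be cross-checked against the kernel's dirty-page counters directly; a quick Linux-specific sketch (reads /proc/meminfo; the dirty count should climb between bursts and drop sharply at each flush):

```shell
#!/bin/sh
# Sample the dirty and writeback page-cache counters a few times.
# Dirty  = data buffered in RAM, not yet written to disk
# Writeback = data currently being flushed to disk
for i in 1 2 3; do
    awk '/^(Dirty|Writeback):/ { print $1, $2, $3 }' /proc/meminfo
    sleep 1
done
```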
> I like write cache to speed things up - but it's also a risk, so I
> don't want too much in cache. Preferably not more than 1GB is used for
> write cache, and whatever it can take for read cache. Seeing the pretty
> constant rate at which it writes in iostat, it seems to flush every
> 8-10 secs or so, so it probably doesn't matter much though.

This may be helpful for understanding the buffer cache:

http://serverfault.com/questions/126413/limit-linux-background-flush-dirty-pages

> Kind regards,
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
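On the "not more than 1G write cache" point: the knobs discussed in that serverfault link are the vm.dirty_* sysctls. A sketch of inspecting and capping them (the 1GB/256MB values are illustrative, not recommendations; the *_bytes knobs, when non-zero, override the percent-of-RAM *_ratio knobs):

```shell
#!/bin/sh
# Show the current dirty-page writeback limits.
for k in dirty_background_ratio dirty_ratio \
         dirty_background_bytes dirty_bytes \
         dirty_expire_centisecs dirty_writeback_centisecs; do
    printf '%s = %s\n' "$k" "$(cat /proc/sys/vm/$k)"
done

# To cap dirty page cache at roughly 1GB and start background writeback
# at 256MB (illustrative values; needs root, persist via /etc/sysctl.conf):
# echo $((256*1024*1024))  > /proc/sys/vm/dirty_background_bytes
# echo $((1024*1024*1024)) > /proc/sys/vm/dirty_bytes
```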