Hi there,

We had two SANs (8-disk SATA in RAID-10 via mdadm, only 4GB RAM) running
IET, using buffered fileio. Whilst they were on IET, the read latency was
*always* above the write latency. In my limited view that is quite
expected: writes go to RAM (the buffer) on the SAN and are written to
disk a little later (not that this is visible from VMware, of course),
whereas reads most of the time actually have to come from disk, so the
rust has to move and that takes time.

For the past couple of months we've been running LIO (on Ubuntu 12.10,
2 x 12-disk SAS in RAID-10, 16GB RAM) and we notice occasional hiccups
as well as the write latency being pretty high. Write latency frequently
peaks above 100ms, which I don't really get, as there's 16GB of memory
in the servers and that should give plenty of buffer space for the load
we currently put on them. With IET the write latency didn't go above
1-2ms until the buffers were full (i.e. when we started writing/copying
a 50GB VM, for example).

For example, looking at the performance from one of the hosts for both
SANs:

SAN1 (the least loaded of the two, btw):
  read latency:  avg (30 mins) 4.261ms,  max (30 mins) 57ms
  write latency: avg (30 mins) 7.194ms,  max (30 mins) 83ms
SAN2:
  read latency:  avg (30 mins) 5.756ms,  max (30 mins) 54ms
  write latency: avg (30 mins) 14.744ms, max (30 mins) 106ms

Under normal load on the previous setup the read latencies were *always*
higher than the write latencies. The opposite is true now (most of the
time, anyway).

Any ideas what might cause this? As VMware does sync writes only, these
latencies seem to hinder performance a lot. Whilst this hardware is not
new and roughly the same age as the SATA disks, its density is lower and
there are more platters to spread the load over. Yet over iSCSI it
performs worse on writes (tests before production showed higher
throughput when run locally on the machine), and oddly enough it is
faster on reads (there's more memory, though, so more will be in cache).

Any ideas on what might be causing these write latencies? I have no clue
where to look. One thing worth mentioning is that the OS now runs from a
USB stick. I don't see how USB stick performance could correlate with
LIO/mdadm, but if it does, that might explain a lot, as the stick is
horribly slow because it only runs at USB 1.1 speed (the chipset reports
USB 2.0 support, but I think HP found it too expensive to actually wire
that up to the ports - I tried all of them and none work at USB 2.0,
unfortunately).

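In case it matters, this is roughly how the negotiated port speed can be
double-checked (the exact bus/device layout will of course differ per
box):

  # show the USB topology with the negotiated speed per device;
  # 12M = USB 1.1 full speed, 480M = USB 2.0 high speed
  lsusb -t
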
Btw, if I monitor the disks with iostat every 2 seconds (iostat -x 2
/dev/md0) whilst pushing lots of data to them, what I usually see is:
nothing, nothing, nothing, 480-520MB/s, nothing, nothing, nothing,
480-520MB/s, nothing, etc. So the buffers seem to be working just fine.
I hardly ever see access to the USB stick the OS is on, but when it does
happen the await times are horrible (1100ms+).

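For completeness, this is the kind of thing I watch (with /dev/sdX just
standing in for whatever device the USB stick shows up as on the box):

  # extended per-device stats every 2 seconds: the array plus the OS stick
  iostat -x 2 /dev/md0 /dev/sdX
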
Is it possible to tune the cache/buffer, by the way? I like write cache
for speeding things up, but it's also a risk, so I don't want too much of
it: preferably no more than 1GB used for write cache and whatever is left
for read cache. Given the pretty constant rate at which it writes in
iostat, it seems to flush every 8-10 seconds or so, so it probably
doesn't matter much anyway.

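To be concrete, what I had in mind is something along the lines of the
vm.dirty_* sysctls - assuming the write buffering involved here is simply
the kernel page cache:

  # cap the dirty page cache (buffered writes not yet on disk) at ~1GB
  # and start background writeback at ~256MB - values only a sketch
  sysctl -w vm.dirty_bytes=1073741824
  sysctl -w vm.dirty_background_bytes=268435456

No idea whether LIO's buffering actually goes through those knobs, though.
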
Kind regards,