On Thu, 2013-04-11 at 20:22 +0200, Ferry wrote:
> Hi,
>
> Well, things stall occasionally (when apps are dependent on input/output). This didn't happen as much with IET (of course, if there's resource contention things might slow down). Occasionally I see write latency peaks of over 500ms whilst at the same time read latencies are still below 5ms, which I find very odd - especially since there's 16GB of RAM, so it should have quite some buffer space. Then again, I might just be unlucky in monitoring and it might just be requesting blocks that are in cache.
>
> On average it seems to outperform IET though; for example, extracting the MySQL installer earlier went at 90MB/s+ consistently, which is pretty nice considering it's connected over gigabit (2 separate wires on separate subnets, so it sees 2 links (1 per adapter) instead of the 4 (1 -> 1, 1 -> 2, 2 -> 1, 2 -> 2) it would see if they were in the same subnet or could route between them). We could get these speeds with IET too, but they weren't as sustainable (the rate would usually drop pretty quickly). It's configured using round-robin (the default number of I/Os before hopping to the next path is IIRC 1000 - didn't tune that yet) to distribute load a bit.

FYI, given that FILEIO is using the same logic in both implementations, I'm pretty certain that these write latency anomalies are not specific to iscsi-target. My guess is that something between writeback and the RAID10 is blocking incoming WRITEs.

> But occasionally things just hang for a short time (1 - 2 sec). Unfortunately customers notice that much more than the increased overall throughput, as they work on terminal servers and it's pretty noticeable if things freeze for a short while.
>
> Maybe I should try a new kernel - but since it runs production with ~20 VMs on it, taking it down requires quite a bit of planning or additional storage. Usually we have some overcapacity so I could just svmotion, but due to a SAN crash at a customer they have our overcapacity until their new SAN comes in. Very bad timing :/.
>
> Might try turning off round-robin for some time and see if that makes a difference. I see more or less the same on 3 hosts though.
>
> Thanks for the reply.

Also, you'll want to verify the I/O scheduler on the backend devices for the RAID10, and turning down nr_requests for those devices (as mentioned in the serverfault URL) might be useful as well. A quick sketch of checking both is interleaved below.

> On 2013-04-11 18:59, Marc Fleischmann wrote:
> > Hi Ferry,
> >
> > Thank you for your posting. The latency fluctuation is a bit strange indeed, but hard to assess without more information.
> >
> > May I ask you: "For the past couple of months we've been running on LIO [...] and we notice occasional hiccups [...]"
> >
> > What kind of hiccups are you finding with LIO? Can you please give us a bit more details?
> >
> > Thanks very much,
> >
> > Marc
> >
> > -----Original Message-----
> > From: target-devel-owner@xxxxxxxxxxxxxxx [mailto:target-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Ferry
> > Sent: Thursday, April 11, 2013 7:58 AM
> > To: target-devel@xxxxxxxxxxxxxxx
> > Subject: Buffered fileio I/O write latency over LIO
> >
> > Hi there,
> >
> > We had 2 SANs (8-disk SATA in RAID-10 (mdadm), only 4GB RAM) running IET. Whilst it was on IET, the read latency was *always* above the write latency. We used buffered fileio on IET as well.
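(Here is the sketch mentioned above for the scheduler/nr_requests check. It is only illustrative: /sys/block/sd[b-m] is a placeholder for whatever member disks your md RAID10 is actually built from, and 64 is an example value, not a tested recommendation.)

  # See which member disks back the md array
  cat /proc/mdstat

  # Current elevator and queue depth for each member disk
  for d in /sys/block/sd[b-m]; do
      echo "$d: $(cat $d/queue/scheduler) nr_requests=$(cat $d/queue/nr_requests)"
  done

  # Example: switch to deadline and shrink the per-device queue
  # (needs root; re-measure write latency afterwards)
  for d in /sys/block/sd[b-m]; do
      echo deadline > $d/queue/scheduler
      echo 64 > $d/queue/nr_requests
  done

A smaller nr_requests means individual requests spend less time waiting behind a long per-device queue, at the cost of some burst absorption, which is usually the right trade-off when the initiator is latency-sensitive.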
> > This is quite expected in my limited view, as the writes would go to RAM (buffer) on the SAN and were written to disk a little later (but this isn't visible from vmware, of course). Reads most of the time actually have to come from the disk, so the rust has to move and that takes time.
> >
> > For the past couple of months we've been running on LIO (on Ubuntu 12.10, with 2 x 12-disk SAS in RAID-10 and 16GB RAM) and we notice occasional hiccups as well as the write latency being pretty high too. Write latency frequently peaks above 100ms, which I don't really get, as there's 16GB of memory in the servers and it should have plenty of buffer space with the loads we currently have on it. With IET the write latency didn't go above 1-2ms until the buffers were full (i.e. when we started writing/copying a 50GB VM, for example).
> >
> > For example, looking at the performance on one of the hosts for the entire 2 SANs:
> >
> > SAN1: read latency (avg 30 mins) 4.261ms - max (30 mins) 57ms; write latency (avg 30 mins) 7.194ms - max (30 mins) 83ms (this SAN is the least loaded, btw)
> > SAN2: read latency (avg 30 mins) 5.756ms - max (30 mins) 54ms; write latency (avg 30 mins) 14.744ms - max (30 mins) 106ms
> >
> > During normal loads on the previous setup the read latencies were *always* higher than the write latencies. The opposite is true now (most of the time, anyway).
> >
> > Any ideas what might cause this? As vmware does sync writes only, these latencies seem to hinder performance a lot. Whilst this hardware is not new and approximately the same age as the SATA disks, the density is lower and there are more platters to spread the load over. Yet over iSCSI it performs worse in writes (tests before production showed higher throughput on the machine locally) and, oddly enough, is faster in reads (there's more memory though, so it will have more in cache).
> >
> > Any ideas on what might be causing these write latencies? No clue on where to look for it. One thing worth mentioning is that the OS now runs from a USB stick. I don't see any correlation between the USB stick performance and LIO/mdadm, but if there is any, that might explain a lot, as the stick is horribly slow due to USB 1.1 (the chipset reports USB 2.0 support but I think HP found it too expensive to actually wire that to the port - tried all ports - none function as USB 2.0 unfortunately).
> >
> > Btw, if I monitor the disks with iostat every 2 seconds (iostat -x 2 /dev/md0) whilst pushing lots of data to it, one usually sees: nothing, nothing, nothing, 480-520MB/s, nothing, nothing, nothing, 480-520MB/s, nothing, etc. So buffers seem to be working just fine. I hardly ever see access to the USB stick the OS is on, but when it does happen the await times are horrible (1100ms+).
> >
> > Is it possible to tune the cache/buffer, by the way? I like write cache to speed things up - but it's also a risk, so I don't want too much in cache. Preferably no more than 1G is used for write cache, and whatever it can take for read cache. Given the pretty constant rate at which it writes in iostat, it seems to flush every 8-10 secs or so, so it probably doesn't matter much though.
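On the cache/buffer question: with buffered FILEIO the write cache is just the kernel page cache, so the knobs are the generic vm.dirty_* sysctls rather than anything LIO-specific. A rough sketch of capping dirty data around the 1G you mention (the exact values here are illustrative assumptions, not tested recommendations):

  # Current writeback settings
  sysctl vm.dirty_background_bytes vm.dirty_bytes \
         vm.dirty_expire_centisecs vm.dirty_writeback_centisecs

  # Example: start background flushing at ~256MB of dirty data and
  # throttle writers at ~1GB.  (Needs root; setting the *_bytes knobs
  # overrides the corresponding *_ratio ones.)
  sysctl -w vm.dirty_background_bytes=268435456
  sysctl -w vm.dirty_bytes=1073741824

Lower dirty limits make flushing start earlier and in smaller bursts, which should smooth out the periodic 480-520MB/s dumps you see in iostat at the cost of some peak burst throughput; if they help, make the values persistent in /etc/sysctl.conf.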
> > Kind regards,

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html