Re: Several LIO(/mdadm) issues

On Thu, 2013-01-31 at 09:51 +0100, Ferry wrote:
> Hi,
> 
> running into some weird issues. At first I set up a 12 disk md raid-10 
> (/dev/md0) and exported it with LIO with buffered fileio. It did 119MB/s 
> (GBe full) (just 1 IP / portal).
> 
> Rebooted, had a disk being removed from the md target, trying to re-add 
> it segfaulted md. It has some bad sectors in the first 4096 sectors 
> region - never seen mdadm segfault on that tho'... Rebooted again, it's 
> now degraded. Added another IP on the portal, activated multipathing, 
> all looked well, performance creating a eager zero'd vmdk is now 8-9MB/s 
> not 119MB/s - that's quite a difference. Tried switching paths, but that 
> made no difference.
> 

You're hitting a bug in lio-utils: the FILEIO buffered-mode setting is
not being saved into /etc/target/tcm_start.sh across reboots.  See below.

> I'm suspecting it's no longer buffered - don't know how to verify this 
> though. After initial creating it seems to be no longer visible in 
> targetcli.

You can manually re-enable FILEIO buffered I/O by appending
',fd_buffered_io=1' to each of your FILEIO backends in
/etc/target/tcm_start.sh, like so:

tcm_node --establishdev fileio_0/fileio_test0  \
fd_dev_name=/usr/src/fileio_test0,fd_dev_size=21474836480,fd_buffered_io=1
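If you have several backends, the append can be scripted. A minimal
sketch, run here against a hypothetical sample line rather than a real
config; point sed at your actual /etc/target/tcm_start.sh only after
taking a backup:

```shell
# Sketch: append ',fd_buffered_io=1' to every fd_dev_name= line that
# does not already carry it. The sample file below is hypothetical;
# substitute your real /etc/target/tcm_start.sh (after backing it up).
cat > /tmp/tcm_start.sh <<'EOF'
tcm_node --establishdev fileio_0/fileio_test0 \
fd_dev_name=/usr/src/fileio_test0,fd_dev_size=21474836480
EOF
# Only touch lines that carry fd_dev_name= and lack the option already.
sed -i '/fd_dev_name=/ { /fd_buffered_io=1/! s/$/,fd_buffered_io=1/ }' /tmp/tcm_start.sh
# prints: fd_dev_name=/usr/src/fileio_test0,fd_dev_size=21474836480,fd_buffered_io=1
grep fd_buffered_io /tmp/tcm_start.sh
```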

This setting can also be checked via configfs directly:

# cat /sys/kernel/config/target/core/fileio_0/fileio_test0/info 
Status: ACTIVATED  Max Queue Depth: 32  SectorSize: 512  HwMaxSectors: 1024
        TCM FILEIO ID: 0        File: /usr/src/fileio_test0  Size: 21474836480  Mode: Buffered-WCE
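To check every FILEIO backend in one pass, a grep across configfs works
too (a sketch; it prints nothing, rather than erroring, on a box where
the target configfs tree is not present):

```shell
# Sketch: report the caching mode of every FILEIO backend at once.
# 'Mode: Buffered-WCE' means buffered FILEIO is active; 'Mode: O_DSYNC'
# means it reverted. '-s' silences errors when configfs is absent.
grep -Hs 'Mode:' /sys/kernel/config/target/core/fileio_*/*/info || true
```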

I've pushed a bug-fix into lio-utils.git that correctly saves
fd_buffered_io=1 into /etc/target/tcm_start.sh when generating a saved
configuration, once the option has been enabled at creation time or
added manually to tcm_start.sh:

http://www.risingtidesystems.com/git/?p=lio-utils.git;a=commitdiff;h=5d0f4829aa130619e81edad3fe0aaa697fa00be4

Please give it a shot.

> What's even worse is that the rebuild of mdadm dropped to 1MB/s whilst 
> the iSCSI initiator was doing 8-9MB/s. iostat -x 2 showed the load on 
> the disk (last value) being around 20% on average, nothing above 25-30%, 
> one would say that would leave plenty of performance for md to at least 
> go over 1MB/s (minimum rebuild/sync) - but it did not. Now I don't know 
> how accurate these iostat values are, but I can tell you this does not 
> happen that badly with IET. Not by a long shot.
> 

Yes, that is definitely because FILEIO reverted to O_DSYNC by default.
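The cost is easy to reproduce with dd, which can force O_DSYNC per
write via oflag=dsync (the /tmp path and sizes here are arbitrary
examples; absolute numbers will vary by machine, but the gap is
usually dramatic on rotating disks):

```shell
# Rough comparison: O_DSYNC writes vs. normal buffered writes.
# oflag=dsync makes GNU dd open the file with O_DSYNC, so every 4k
# write must reach stable storage before the next one is issued.
dd if=/dev/zero of=/tmp/dsync_demo bs=4k count=256 oflag=dsync 2>&1 | tail -n1
# Same I/O pattern, buffered through the page cache.
dd if=/dev/zero of=/tmp/dsync_demo bs=4k count=256 2>&1 | tail -n1
rm -f /tmp/dsync_demo
```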

> Btw, I've also never seen mdadm segfault on a bad disk - until now that 
> is, had some issues in the past too with 3.2 kernel (in combination with 
> mdadm -  haven't seen such issues in 15+ years - only when used with 
> LIO, might be coincidence though).

I would recommend reporting the mdadm segfault on a bad disk as a
separate bug, considering FILEIO is simply doing vectored reads/writes
to struct file here.

> At that point I went back to IET and 
> was hoping 3.5 on ubuntu, being out for ~3 months now, would have 
> stabilized a bit.
> 
> This array is a 12 disk RAID-10 consisting of 1TB SAS drives.
> 
> On another target - which is less important to me - I see a similar 
> drop in performance (hence I suspect buffered not being restored, can't 
> see this in targetcli though, wanted to copy the sys config fs for the 
> target so I could diff them after resetting it up, but cp doesn't copy 
> as the file would have changed below it - all of em).
> 
> Anyways, figured I'd just quickly delete the backstore and recreate it. 
> After 20 mins the delete still hangs:
> 
> /backstores/fileio> ls
> o- fileio ............................................ [2 Storage Objects]
>    o- BACKUPVOL1 ..................................... [/dev/md4 activated]
>    o- BACKUPVOL2 ..................................... [/dev/md5 activated]
> /backstores/fileio> delete BACKUPVOL2
> ^C
> 
> ^C
> 
> ^C^C^C^C^C^C^C^C^C^C^C^C^C^C
> 
> 
> ^C
> ^C
> ^C
> ^C
> <remains hanging>
> 

Mmmmm.

Deleting backends on the fly with TPG demo-mode has some known issues.
If you're using TPG demo-mode operation, I would recommend shutting down
the TPG containing the LUNs first, and then removing the backend device.

Otherwise, if this is occurring without active I/O from an initiator, it
means the backend device is not returning outstanding I/Os back to the
target.
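In interactive targetcli terms, the shutdown-then-delete order would
look roughly like the session below. This is a sketch: the IQN is a
placeholder for your target's, and it assumes your targetcli version
offers a TPG-level disable command; if it doesn't, remove the TPG's LUN
mappings before deleting the backstore instead.

```
/> cd /iscsi/<target_iqn>/tpg1
/iscsi/<target_iqn>/tpg1> disable
/iscsi/<target_iqn>/tpg1> cd /backstores/fileio
/backstores/fileio> delete BACKUPVOL2
```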

> Although the delete still hangs - the I/O on the device immediately 
> died!! All performance counters towards the volume just flat lined 
> immediately.
> 
> Starting targetcli at this point from another console hangs too:
> 
>   Copyright (c) 2011 by RisingTide Systems LLC.
> 
> Visit us at http://www.risingtidesystems.com.
> 
> Using qla2xxx fabric module.
> Using loopback fabric module.
> Using iscsi fabric module.
> <hangs>
> 
> 
> So basically I'm left with some questions:
> * How prime time ready is LIO? The vmware ready certification that some 
> devices get with it seem to imply whole different things than what I'm 
> seeing now.
> * Can I verify buffered mode is on? Synchronous iSCSI kills 
> performance, this is well known. IIRC buffered mode on blockio has been 
> removed, but should have returned in 3.7, did that actually happen? I'll 
> try the 3.7 kernel with buffered blockio if it exists. I know the risks, 
> don't bother :).

Please confirm which kernel version you're using.  It sounds like the
target kernel side is fine for enabling buffered FILEIO mode, but that
user-space is not saving it.
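A quick way to capture both data points (the module version path is a
guess at what your build exposes; it may simply not be present, in
which case the fallback message is printed):

```shell
# Report the running kernel; buffered FILEIO save/restore also depends
# on the lio-utils user-space version, so note that separately.
uname -r
# If target_core_mod is loaded and exports a version, show it too
# (this sysfs path may not exist on every build).
cat /sys/module/target_core_mod/version 2>/dev/null \
    || echo "target_core_mod version not available"
```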

> * Why are there weird issues with mdadm? Like segfaults and huge sync 
> performance drops?

O_DSYNC is going to be the cause of the performance drop in MD-RAID.

> 
> This is all running on Ubuntu 12.10 server (64 bit) as I wanted/needed 
> a somewhat recent kernel for LIO and don't really do anything else with 
> the box anyways. Fully updated yesterday.
> 
> Will be able to test / debug some things for maybe a couple of days. 
> Any advise is appreciated :). 

The above should get you going again with FILEIO buffered mode across
restarts.

Thanks,

--nab

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

