Hi,
running into some weird issues. At first I set up a 12-disk md RAID-10
(/dev/md0) and exported it via LIO with buffered fileio. It did 119MB/s
(saturating gigabit Ethernet) with just one IP / portal.
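For reference, the setup was roughly along these lines (device and
volume names are placeholders from memory, and the exact spelling of
the buffered option seems to vary between targetcli versions - newer
ones appear to call it write_back):

mdadm --create /dev/md0 --level=10 --raid-devices=12 /dev/sd[b-m]

and then, in targetcli:

/backstores/fileio> create name=VMVOL file_or_dev=/dev/md0 buffered=true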
Rebooted, a disk got removed from the md array, and trying to re-add
it segfaulted mdadm. The disk has some bad sectors in the first 4096
sectors region - I've never seen mdadm segfault on that though...
Rebooted again; the array is now degraded. I added another IP on the
portal and activated multipathing, and all looked well, but performance
while creating an eager-zeroed vmdk is now 8-9MB/s instead of 119MB/s -
that's quite a difference. Tried switching paths, but that made no
difference.
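In case it helps reproduce the segfault, this is roughly how I checked
the disk and tried to put it back (/dev/sdX stands in for the actual
member, commands from memory):

cat /proc/mdstat
mdadm --detail /dev/md0
smartctl -a /dev/sdX                           # pending/reallocated sectors show up here
dd if=/dev/sdX of=/dev/null bs=512 count=4096  # reading the first 4096 sectors trips the errors
mdadm /dev/md0 --re-add /dev/sdX               # this is the step that segfaulted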
I suspect it's no longer buffered - I don't know how to verify this,
though. After the initial create the buffered setting no longer seems
to be visible in targetcli.
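My best guess for checking it is reading the backstore info nodes in
configfs directly - no idea how authoritative that is, and the paths
and wording may differ per kernel, but I'd expect the Mode line to
distinguish buffered from synchronous:

for f in /sys/kernel/config/target/core/fileio_*/*/info; do
    echo "== $f"; cat "$f"
done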
What's even worse is that the md rebuild dropped to 1MB/s whilst the
iSCSI initiator was doing 8-9MB/s. iostat -x 2 showed disk utilization
(the last column, %util) at around 20% on average, nothing above
25-30%; you'd think that leaves plenty of headroom for md to at least
exceed 1MB/s (the minimum rebuild/sync speed) - but it did not. I don't
know how accurate these iostat values are, but I can tell you it does
not get this bad with IET. Not by a long shot.
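I'll double-check the md throttles next, something along these lines
(paths from memory):

cat /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max
cat /sys/block/md0/md/sync_speed                   # current rebuild speed in KB/s
echo 50000 > /proc/sys/dev/raid/speed_limit_min    # e.g. force a ~50MB/s floor

but with the disks at only ~20% utilization I wouldn't expect the
default limits to be the bottleneck.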
Btw, I'd also never seen mdadm segfault on a bad disk - until now, that
is. I had some issues with mdadm in the past on the 3.2 kernel as well,
issues of a kind I hadn't seen in 15+ years and only when used with
LIO, though that might be coincidence. At that point I went back to
IET, and I was hoping that 3.5 on Ubuntu, being out for ~3 months now,
would have stabilized a bit.
This array is a 12-disk RAID-10 consisting of 1TB SAS drives.
On another target - which is less important to me - I see a similar
drop in performance (hence my suspicion that buffered mode was not
restored; I can't see this in targetcli though). I wanted to copy the
configfs tree for the target so I could diff it after setting it up
again, but cp refuses because the files keep changing underneath it -
all of them.
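As a workaround for getting a snapshot to diff (since cp chokes on
configfs) I'll just dump every readable file into one text file,
something like:

find /sys/kernel/config/target -type f | while read f; do
    echo "== $f"; cat "$f" 2>/dev/null
done > target-config-before.txt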
Anyway, I figured I'd just quickly delete the backstore and recreate
it. After 20 minutes the delete is still hanging:
/backstores/fileio> ls
o- fileio ................................................ [2 Storage Objects]
  o- BACKUPVOL1 .......................................... [/dev/md4 activated]
  o- BACKUPVOL2 .......................................... [/dev/md5 activated]
/backstores/fileio> delete BACKUPVOL2
^C
^C
^C^C^C^C^C^C^C^C^C^C^C^C^C^C
^C
^C
^C
^C
<remains hanging>
Although the delete still hangs, the I/O on the device died
immediately: all performance counters for the volume flatlined at
once.
Starting targetcli at this point from another console hangs too:
Copyright (c) 2011 by RisingTide Systems LLC.
Visit us at http://www.risingtidesystems.com.
Using qla2xxx fabric module.
Using loopback fabric module.
Using iscsi fabric module.
<hangs>
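While it's wedged I can still gather some state if that's useful - I
was planning on something like this (sysrq needs to be enabled for the
last one):

dmesg | tail -n 100                    # hung-task warnings, if any
cat /proc/<pid-of-targetcli>/stack     # kernel stack of the stuck process
echo w > /proc/sysrq-trigger           # dump all blocked tasks to dmesg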
So basically I'm left with some questions:
* How ready for prime time is LIO? The VMware Ready certification that
some devices get with it seems to imply something quite different from
what I'm seeing now.
* Can I verify that buffered mode is on? Synchronous iSCSI kills
performance; this is well known. IIRC buffered mode on blockio was
removed but is supposed to return in 3.7 - did that actually happen?
I'll try the 3.7 kernel with buffered blockio if it exists. I know the
risks, don't bother :).
* Why are there weird issues with mdadm, like segfaults and huge sync
performance drops?
This is all running on Ubuntu 12.10 server (64 bit) as I wanted/needed
a somewhat recent kernel for LIO and don't really do anything else with
the box anyways. Fully updated yesterday.
I'll be able to test/debug some things for maybe a couple of days; any
advice is appreciated :). After that I'll need it running again, which
will probably mean moving back to IET.
Kind regards,