Re: how to configure a drbd tgtd kvm cluster to make sure one machine doesn't eat up all the IO and hang other nodes

On 25/05/14 15:14, Jelle de Jong wrote:
> I have a drbd cluster with tgtd managed by pacemaker as my iscsi
> target.
> 
> I then have three kvm host servers connected with open-iscsi to
> this target.
> 
> All servers use a dual-nic 1G LACP (mode 4 bonding) connection to
> a cisco switch.
> 
> On the servers the iscsi disk is managed by lvm.
> 
> I use the settings below for all the disks in the drbd cluster
> servers and also for all the disks on the kvm hosts (so including
> the iscsi disks).
> 
> echo deadline > /sys/block/sdi/queue/scheduler
> echo 0 > /sys/block/sdi/queue/iosched/front_merges
> echo 150 > /sys/block/sdi/queue/iosched/read_expire
> echo 1500 > /sys/block/sdi/queue/iosched/write_expire
> 
> My first problem is that if I run dd[1] on one of the KVM hosts,
> all the kvm guests on the other machines drop dead because they
> can't get any IO any more.
> 
> My second problem is that if I run a heavy IO job[2] inside a KVM
> guest, it can take all the IO and a KVM guest on another KVM host
> can't get any IO any more.
> 
> [1] ionice -c 3 dd oflag=direct bs=4M
> if=/dev/lvm2-vol/kvm07-snapshot of=/dev/lvm3-vol/kvm07-disk
> 
> [2] nice -n 19 ionice -c2 -n7 pvmove /dev/vdb /dev/vdc
> 
> Now my guess is that this is because there is no IO scheduling
> between the KVM hosts: they do IO scheduling on their own server
> but just pass the load on to open-iscsi and tgtd. However, if
> multiple servers are connected to tgtd, it seems that one server
> can take it all and the other connections just have to wait...
> (until a heavy dd job that takes hours finishes...)
> 
> So how should I configure my cluster and tgtd so that a heavy IO
> job doesn't get all the IO and the other requests get handled as
> well? Is this an impossibility?
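
For reference, the per-disk tuning quoted above boils down to the
following loop, run on each machine; this is only a sketch, and the
device list is a placeholder that has to match the actual data disks
on each host:

  # example device list, adjust per host
  for dev in sdb sdc sdi; do
      echo deadline > /sys/block/$dev/queue/scheduler
      echo 0    > /sys/block/$dev/queue/iosched/front_merges
      echo 150  > /sys/block/$dev/queue/iosched/read_expire
      echo 1500 > /sys/block/$dev/queue/iosched/write_expire
  done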

As an update, I tried the noop and cfq schedulers for the iscsi disks
on the kvm hosts (with deadline on the drbd tgtd iscsi cluster), and
did lots of tests with ionice and kernel parameters like the cfq
quantum, slice_idle and back_seek_max; nothing worked.
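
The cfq attempts looked roughly like the lines below; the exact values
varied between tests, so treat these purely as an illustration of the
knobs that were touched, not as the settings that were actually in use:

  # illustrative values only
  echo cfq > /sys/block/sdi/queue/scheduler
  echo 1   > /sys/block/sdi/queue/iosched/quantum
  echo 0   > /sys/block/sdi/queue/iosched/slice_idle
  echo 16  > /sys/block/sdi/queue/iosched/back_seek_max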

iostat -dkx /dev/sdi 5 showed a big request queue.

echo 1 > /sys/block/sdi/device/queue_depth

I set the queue_depth to 1 and the other kvm guests get a bit of IO
again and stay working; with the queue_depth set to 2 they starve
again...
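
To apply that working setting to all iscsi disks on a host, something
like the loop below can be used (again only a sketch; the device names
are placeholders for whatever sd* devices open-iscsi created):

  # placeholder device names for the iscsi-backed disks
  for dev in sdi sdj; do
      echo 1 > /sys/block/$dev/device/queue_depth
  done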

Kind regards,

Jelle de Jong