On Wed, Jun 1, 2011 at 2:20 PM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> On Tue, May 31, 2011 at 06:30:09PM -0500, Anthony Liguori wrote:
>
> [..]
>> The level of consistency will then depend on whether you overcommit
>> your hardware and how you have it configured.
>
> Agreed.
>
>>
>> Consistency is very hard because at the end of the day, you still
>> have shared resources. Even with blkio, I presume one guest can
>> still impact another guest by forcing the disk to do excessive
>> seeking or something of that nature.
>>
>> So absolute consistency can't be the requirement for the use-case.
>> The use-cases we are really interested in are more about providing caps
>> than anything else.
>
> I think both qemu and kernel can do the job. The only thing which
> seriously favors a throttling implementation in qemu is the ability
> to handle a wide variety of backend files (NFS, qcow, libcurl based
> devices etc).
>
> So what I am arguing is that your previous reason that qemu can do
> a better job because it knows the effective IOPS of the guest is not
> necessarily a very good reason. To me the simplicity of being able to
> handle everything as a file and do the throttling there is the most
> compelling reason to do this implementation in qemu.

The variety of backends is the reason to go for a QEMU-based approach.
If there were kernel mechanisms to handle non-block backends, that would
be great - cgroups for NFS, perhaps? But for something like Sheepdog or
Ceph it becomes quite hard to do in the kernel at all, since they are
userspace libraries that speak their protocol over sockets, and from the
kernel you really have no insight into which I/O operations they are
performing.

One issue that concerns me is how effective iops and throughput are as
capping mechanisms. If you cap throughput you mostly affect sequential
I/O but do little against random I/O, which can hog the disk with a
seeky access pattern. If you limit iops you can cap random I/O, but you
artificially limit sequential I/O, which may be able to sustain a high
number of iops without hogging the disk at all because it barely seeks.
One proposed solution (I think Christoph Hellwig suggested it) is to
merge sequential I/Os for accounting purposes, so that a run of
sequential requests only counts as 1 iop - a rough sketch of that
accounting is appended at the end of this mail.

I like the idea of a proportional share of disk utilization, but doing
that from QEMU is problematic: we only know when we issued an I/O to the
kernel, not when the disk actually services it. There could be queue
wait times in the block layer that we don't know about, so we end up
with a magic number for disk utilization which may not be very
meaningful.

So given the constraints and the backends we need to support, iops and
throughput limits implemented in QEMU seem like the approach we need. A
token-bucket sketch of what that accounting could look like is also
appended below.

Stefan
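
P.S. To make the sequential-merge accounting concrete, here is a rough,
self-contained sketch. The names (SeqAccount, seq_account_charge) are
invented for illustration and are not existing QEMU or kernel APIs; the
idea is simply that a request starting exactly where the previous one
ended is charged 0 iops, and everything else is charged 1.

/*
 * Sketch of sequential-merge iop accounting: a run of requests where each
 * one starts where the previous one ended is charged as a single iop.
 * SeqAccount and seq_account_charge() are invented names for illustration.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint64_t next_offset;   /* where a sequential successor would start */
    bool     have_last;     /* false until the first request is seen */
} SeqAccount;

/*
 * Return the number of iops to charge for a request: 0 if it continues
 * the previous request sequentially, 1 otherwise.
 */
static unsigned seq_account_charge(SeqAccount *s, uint64_t offset,
                                   uint64_t len)
{
    bool sequential = s->have_last && offset == s->next_offset;

    s->have_last = true;
    s->next_offset = offset + len;

    return sequential ? 0 : 1;
}

int main(void)
{
    SeqAccount s = { 0 };
    unsigned iops = 0;

    /* Three sequential 64 KB reads followed by one random 4 KB read. */
    iops += seq_account_charge(&s, 0, 65536);
    iops += seq_account_charge(&s, 65536, 65536);
    iops += seq_account_charge(&s, 131072, 65536);
    iops += seq_account_charge(&s, 1048576, 4096);

    /* The sequential run counts once and the seek counts once: 2 iops. */
    printf("charged %u iops for 4 requests\n", iops);
    return 0;
}

In practice you would want per-device state and probably some tolerance
for small gaps or reordered requests, but the accounting itself is just
this.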
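
P.P.S. Along the same lines, here is a sketch of the dual token-bucket
accounting that per-drive iops and throughput limits in QEMU could be
built on. Again, the IoLimit struct and iolimit_* helpers are made-up
names rather than existing QEMU code: each request is charged against
both an iops bucket and a bytes/s bucket and has to wait when either
bucket is empty.

/*
 * Minimal sketch of dual token-bucket accounting for per-drive iops and
 * throughput limits.  IoLimit and the iolimit_* helpers are invented
 * names for illustration; this is not existing QEMU code.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    double iops_limit;   /* requests per second, 0 = unlimited */
    double bps_limit;    /* bytes per second,    0 = unlimited */
    double io_tokens;    /* accumulated request credits */
    double byte_tokens;  /* accumulated byte credits */
    double last_ns;      /* timestamp of the last refill */
} IoLimit;

/* Refill both buckets according to the time elapsed since the last call. */
static void iolimit_refill(IoLimit *l, double now_ns)
{
    double delta_s = (now_ns - l->last_ns) / 1e9;

    l->io_tokens += l->iops_limit * delta_s;
    l->byte_tokens += l->bps_limit * delta_s;

    /* Cap the burst at one second's worth of credit. */
    if (l->io_tokens > l->iops_limit) {
        l->io_tokens = l->iops_limit;
    }
    if (l->byte_tokens > l->bps_limit) {
        l->byte_tokens = l->bps_limit;
    }
    l->last_ns = now_ns;
}

/*
 * Decide whether a request of 'bytes' may be submitted now.  If not, the
 * caller would queue it and retry later (e.g. from a timer callback).
 */
static bool iolimit_allow(IoLimit *l, double now_ns, uint64_t bytes)
{
    iolimit_refill(l, now_ns);

    if (l->iops_limit > 0 && l->io_tokens < 1.0) {
        return false;
    }
    if (l->bps_limit > 0 && l->byte_tokens < (double)bytes) {
        return false;
    }
    if (l->iops_limit > 0) {
        l->io_tokens -= 1.0;
    }
    if (l->bps_limit > 0) {
        l->byte_tokens -= (double)bytes;
    }
    return true;
}

int main(void)
{
    /* 100 iops and 1 MB/s; both buckets start empty. */
    IoLimit l = { .iops_limit = 100, .bps_limit = 1 << 20 };
    double now = 0;
    int submitted = 0;

    /* Try to issue a 4 KB request every millisecond for one second. */
    for (int i = 0; i < 1000; i++) {
        now += 1e6;                     /* advance by 1 ms in nanoseconds */
        if (iolimit_allow(&l, now, 4096)) {
            submitted++;
        }
    }
    printf("submitted %d requests in 1s with a 100 iops cap\n", submitted);
    return 0;
}

In QEMU the "reject" path would queue the request and resubmit it from a
timer callback rather than dropping it; main() here just simulates one
second of 4 KB requests to show the iops bucket becoming the binding
limit.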