Am 26.09.2011 10:01, schrieb Zhi Yong Wu: > On Fri, Sep 23, 2011 at 11:32 PM, Kevin Wolf <kwolf@xxxxxxxxxx> wrote: >> Am 08.09.2011 12:11, schrieb Zhi Yong Wu: >>> Signed-off-by: Zhi Yong Wu <wuzhy@xxxxxxxxxxxxxxxxxx> >>> --- >>> Makefile.objs | 2 +- >>> block/blk-queue.c | 201 +++++++++++++++++++++++++++++++++++++++++++++++++++++ >>> block/blk-queue.h | 59 ++++++++++++++++ >>> block_int.h | 27 +++++++ >>> 4 files changed, 288 insertions(+), 1 deletions(-) >>> create mode 100644 block/blk-queue.c >>> create mode 100644 block/blk-queue.h >>> >>> diff --git a/Makefile.objs b/Makefile.objs >>> index 26b885b..5dcf456 100644 >>> --- a/Makefile.objs >>> +++ b/Makefile.objs >>> @@ -33,7 +33,7 @@ block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vv >>> block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o >>> block-nested-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o >>> block-nested-y += qed-check.o >>> -block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o >>> +block-nested-y += parallels.o nbd.o blkdebug.o sheepdog.o blkverify.o blk-queue.o >>> block-nested-$(CONFIG_WIN32) += raw-win32.o >>> block-nested-$(CONFIG_POSIX) += raw-posix.o >>> block-nested-$(CONFIG_CURL) += curl.o >>> diff --git a/block/blk-queue.c b/block/blk-queue.c >>> new file mode 100644 >>> index 0000000..adef497 >>> --- /dev/null >>> +++ b/block/blk-queue.c >>> @@ -0,0 +1,201 @@ >>> +/* >>> + * QEMU System Emulator queue definition for block layer >>> + * >>> + * Copyright (c) IBM, Corp. 2011 >>> + * >>> + * Authors: >>> + * Zhi Yong Wu <wuzhy@xxxxxxxxxxxxxxxxxx> >>> + * Stefan Hajnoczi <stefanha@xxxxxxxxxxxxxxxxxx> >>> + * >>> + * Permission is hereby granted, free of charge, to any person obtaining a copy >>> + * of this software and associated documentation files (the "Software"), to deal >>> + * in the Software without restriction, including without limitation the rights >>> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell >>> + * copies of the Software, and to permit persons to whom the Software is >>> + * furnished to do so, subject to the following conditions: >>> + * >>> + * The above copyright notice and this permission notice shall be included in >>> + * all copies or substantial portions of the Software. >>> + * >>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR >>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, >>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL >>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER >>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, >>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN >>> + * THE SOFTWARE. >>> + */ >>> + >>> +#include "block_int.h" >>> +#include "block/blk-queue.h" >>> +#include "qemu-common.h" >>> + >>> +/* The APIs for block request queue on qemu block layer. >>> + */ >>> + >>> +struct BlockQueueAIOCB { >>> + BlockDriverAIOCB common; >>> + QTAILQ_ENTRY(BlockQueueAIOCB) entry; >>> + BlockRequestHandler *handler; >>> + BlockDriverAIOCB *real_acb; >>> + >>> + int64_t sector_num; >>> + QEMUIOVector *qiov; >>> + int nb_sectors; >>> +}; >> >> The idea is that each request is first queued on the QTAILQ, and at some >> point it's removed from the queue and gets a real_acb. But it never has >> both at the same time. Correct? > NO. if block I/O throttling is enabled and I/O rate at runtime exceed > this limits, this request will be enqueued. > It represents the whole lifecycle of one enqueued request. What are the conditions under which the request will still be enqueued, but has a real_acb at the same time? >>> + >>> +typedef struct BlockQueueAIOCB BlockQueueAIOCB; >>> + >>> +struct BlockQueue { >>> + QTAILQ_HEAD(requests, BlockQueueAIOCB) requests; >>> + bool req_failed; >>> + bool flushing; >>> +}; >> >> I find req_failed pretty confusing. Needs documentation at least, but >> most probably also a better name. > OK. request_has_failed? No, that doesn't describe what it's really doing. You set req_failed = true by default and then on some obscure condition clear it or not. It's tracking something, but I'm not sure what meaning it has during the whole process. >>> + >>> +static void qemu_block_queue_dequeue(BlockQueue *queue, >>> + BlockQueueAIOCB *request) >>> +{ >>> + BlockQueueAIOCB *req; >>> + >>> + assert(queue); >>> + while (!QTAILQ_EMPTY(&queue->requests)) { >>> + req = QTAILQ_FIRST(&queue->requests); >>> + if (req == request) { >>> + QTAILQ_REMOVE(&queue->requests, req, entry); >>> + break; >>> + } >>> + } >>> +} >> >> Is it just me or is this an endless loop if the request isn't the first >> element in the list? > queue->requests is only used to store requests which exceed the limits. > Why is the request not the first evlement? Why do you have a loop if it's always the first element? >>> +void qemu_del_block_queue(BlockQueue *queue) >>> +{ >>> + BlockQueueAIOCB *request, *next; >>> + >>> + QTAILQ_FOREACH_SAFE(request, &queue->requests, entry, next) { >>> + QTAILQ_REMOVE(&queue->requests, request, entry); >>> + qemu_aio_release(request); >>> + } >>> + >>> + g_free(queue); >>> +} >> >> Can we be sure that no AIO requests are in flight that still use the now >> released AIOCB? > Yeah, since qemu core code is serially performed, i think that when > qemu_del_block_queue is performed, no requests are in flight. Right? Patch 2 has this code: +void bdrv_io_limits_disable(BlockDriverState *bs) +{ + bs->io_limits_enabled = false; + + if (bs->block_queue) { + qemu_block_queue_flush(bs->block_queue); + qemu_del_block_queue(bs->block_queue); + bs->block_queue = NULL; + } Does this mean that you can't disable I/O limits while the VM is running? >>> + >>> +BlockDriverAIOCB *qemu_block_queue_enqueue(BlockQueue *queue, >>> + BlockDriverState *bs, >>> + BlockRequestHandler *handler, >>> + int64_t sector_num, >>> + QEMUIOVector *qiov, >>> + int nb_sectors, >>> + BlockDriverCompletionFunc *cb, >>> + void *opaque) >>> +{ >>> + BlockDriverAIOCB *acb; >>> + BlockQueueAIOCB *request; >>> + >>> + if (queue->flushing) { >>> + queue->req_failed = false; >>> + return NULL; >>> + } else { >>> + acb = qemu_aio_get(&block_queue_pool, bs, >>> + cb, opaque); >>> + request = container_of(acb, BlockQueueAIOCB, common); >>> + request->handler = handler; >>> + request->sector_num = sector_num; >>> + request->qiov = qiov; >>> + request->nb_sectors = nb_sectors; >>> + request->real_acb = NULL; >>> + QTAILQ_INSERT_TAIL(&queue->requests, request, entry); >>> + } >>> + >>> + return acb; >>> +} >>> + >>> +static int qemu_block_queue_handler(BlockQueueAIOCB *request) >>> +{ >>> + int ret; >>> + BlockDriverAIOCB *res; >>> + >>> + res = request->handler(request->common.bs, request->sector_num, >>> + request->qiov, request->nb_sectors, >>> + qemu_block_queue_callback, request); >>> + if (res) { >>> + request->real_acb = res; >>> + } >>> + >>> + ret = (res == NULL) ? 0 : 1; >>> + >>> + return ret; >> >> You mean return (res != NULL); and want to have bool as the return value >> of this function. > Yeah, thanks. i will modify as below: > ret = (res == NULL) ? false : true; ret = (res != NULL) is really more readable. > and > static bool qemu_block_queue_handler() > >> >>> +} >>> + >>> +void qemu_block_queue_flush(BlockQueue *queue) >>> +{ >>> + queue->flushing = true; >>> + while (!QTAILQ_EMPTY(&queue->requests)) { >>> + BlockQueueAIOCB *request = NULL; >>> + int ret = 0; >>> + >>> + request = QTAILQ_FIRST(&queue->requests); >>> + QTAILQ_REMOVE(&queue->requests, request, entry); >>> + >>> + queue->req_failed = true; >>> + ret = qemu_block_queue_handler(request); >>> + if (ret == 0) { >>> + QTAILQ_INSERT_HEAD(&queue->requests, request, entry); >>> + if (queue->req_failed) { >>> + qemu_block_queue_callback(request, -EIO); >>> + break; >>> + } >>> + } >>> + } >>> + >>> + queue->req_failed = true; >>> + queue->flushing = false; >>> +} >>> + >>> +bool qemu_block_queue_has_pending(BlockQueue *queue) >>> +{ >>> + return !queue->flushing && !QTAILQ_EMPTY(&queue->requests); >>> +} >> >> Why doesn't the queue have pending requests in the middle of a flush >> operation? (That is, the flush hasn't completed yet) > It is possible for the queue to have pending requests. if yes, how about? Sorry, can't parse this. I don't understand why the !queue->flushing part is correct. Kevin -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html