Re: [PATCH] rbd: add queuing delay

On Tue, Jun 22, 2010 at 09:50:24PM +0200, Christian Brunner wrote:
> > while running tests with qemu-io I've been experiencing a lot of
> > messages when running a large writev request (several hundred MB in
> > a single call):
> >
> > 10.06.20 22:10:07.337108 b67dcb70 client4136.objecter  pg 3.437e on [0] is laggy: 33
> > 10.06.20 22:10:07.337708 b67dcb70 client4136.objecter  pg 3.2553 on [0] is laggy: 19
> > [...]
> >
> > Everything is working fine, though. I think that the large number of
> > queued requests is the cause for this behaviour and I would propose to
> > delay further requests (see attached patch).
> >
> > What do you think about it?
> 
> It seems that the osd is lagging behind. The usleep might work for you
> as you avoid the pressure, but it's also somewhat random and will
> probably hurt performance on other setups. I'd rather see a
> configurable solution that lets you specify a total in-flight bytes or
> some other resizable window scheme.

I'm not sure I understand what "lagging behind" means here. If the in-flight
bytes are the sum of all requests currently in the queue, a solution could look
like this (although the limit isn't configurable yet).

Christian

---
 block/rbd.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index 10daf20..f87e84c 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -50,6 +50,7 @@ int eventfd(unsigned int initval, int flags);
  */
 
 #define OBJ_MAX_SIZE (1UL << OBJ_DEFAULT_OBJ_ORDER)
+#define MAX_QUEUE_SIZE 33554432 // 32MB
 
 typedef struct RBDAIOCB {
     BlockDriverAIOCB common;
@@ -79,6 +80,7 @@ typedef struct BDRVRBDState {
     uint64_t size;
     uint64_t objsize;
     int qemu_aio_count;
+    uint64_t queuesize;
 } BDRVRBDState;
 
 typedef struct rbd_obj_header_ondisk RbdHeader1;
@@ -334,6 +336,7 @@ static int rbd_open(BlockDriverState *bs, const char *filename, int flags)
     le64_to_cpus((uint64_t *) & header->image_size);
     s->size = header->image_size;
     s->objsize = 1 << header->options.order;
+    s->queuesize = 0;
 
     s->efd = eventfd(0, 0);
     if (s->efd < 0) {
@@ -443,6 +446,7 @@ static void rbd_finish_aiocb(rados_completion_t c, RADOSCB *rcb)
     int i;
 
     acb->aiocnt--;
+    acb->s->queuesize -= rcb->segsize;
     r = rados_aio_get_return_value(c);
     rados_aio_release(c);
     if (acb->write) {
@@ -560,6 +564,12 @@ static BlockDriverAIOCB *rbd_aio_rw_vector(BlockDriverState *bs,
         rcb->segsize = segsize;
         rcb->buf = buf;
 
+        while (s->queuesize > MAX_QUEUE_SIZE) {
+            usleep(100);
+        }
+
+        s->queuesize += segsize;
+
         if (write) {
             rados_aio_create_completion(rcb, NULL,
                                         (rados_callback_t) rbd_finish_aiocb,
-- 
1.7.0.4
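
For discussion, here is a minimal sketch of the "configurable total in-flight
bytes" idea from the review, written as standalone C rather than against
block/rbd.c. It is not part of the patch above. The names InflightWindow,
window_init, window_try_acquire, window_acquire and window_release are
hypothetical and do not exist in QEMU or librados. It assumes completions
arrive on a different thread than submissions, so the counter is protected by
a mutex, and the submitter sleeps on a condition variable instead of polling
with usleep(). Unlike the patch, it checks the limit before adding the new
segment, and it admits an oversized request when nothing else is queued so
that a request larger than the limit cannot wait forever.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical helper (not a QEMU or librados type): caps the number of
 * bytes that have been queued but not yet completed. */
typedef struct InflightWindow {
    pthread_mutex_t lock;
    pthread_cond_t drained;  /* signalled whenever bytes are released */
    uint64_t inflight;       /* bytes currently queued */
    uint64_t limit;          /* configurable cap, e.g. 32 MB */
} InflightWindow;

static void window_init(InflightWindow *w, uint64_t limit)
{
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->drained, NULL);
    w->inflight = 0;
    w->limit = limit;
}

/* Non-blocking variant: returns 1 and accounts for the bytes if they fit
 * (or if the window is empty), otherwise returns 0 and changes nothing. */
static int window_try_acquire(InflightWindow *w, uint64_t bytes)
{
    int ok;

    pthread_mutex_lock(&w->lock);
    ok = (w->inflight == 0 || w->inflight + bytes <= w->limit);
    if (ok) {
        w->inflight += bytes;
    }
    pthread_mutex_unlock(&w->lock);
    return ok;
}

/* Submission path: sleep until the request fits in the window. */
static void window_acquire(InflightWindow *w, uint64_t bytes)
{
    pthread_mutex_lock(&w->lock);
    while (w->inflight != 0 && w->inflight + bytes > w->limit) {
        pthread_cond_wait(&w->drained, &w->lock);
    }
    w->inflight += bytes;
    pthread_mutex_unlock(&w->lock);
}

/* Completion path: give the bytes back and wake any waiting submitter. */
static void window_release(InflightWindow *w, uint64_t bytes)
{
    pthread_mutex_lock(&w->lock);
    assert(w->inflight >= bytes);
    w->inflight -= bytes;
    pthread_cond_broadcast(&w->drained);
    pthread_mutex_unlock(&w->lock);
}
```

In terms of the patch, window_acquire() would take the place of the usleep()
loop before each segment is submitted, and window_release() would take the
place of the queuesize decrement in the completion callback, with the limit
read from a configuration option instead of MAX_QUEUE_SIZE. Blocking in the
submission path still stalls the caller, just as the usleep() loop does; the
sketch only removes the polling and the unsynchronized counter.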
