Re: [Qemu-devel] [PATCH] ceph/rbd block driver for qemu-kvm (v4)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/07/2010 01:41 PM, Yehuda Sadeh Weinraub wrote:
On Thu, Oct 7, 2010 at 11:38 AM, Anthony Liguori<anthony@xxxxxxxxxxxxx>  wrote:
On 10/07/2010 01:08 PM, Yehuda Sadeh Weinraub wrote:
On Thu, Oct 7, 2010 at 7:12 AM, Anthony Liguori<anthony@xxxxxxxxxxxxx>
  wrote:

On 08/03/2010 03:14 PM, Christian Brunner wrote:

+#include "qemu-common.h"
+#include "qemu-error.h"
+#include<sys/types.h>
+#include<stdbool.h>
+
+#include<qemu-common.h>


This looks to be unnecessary.  Generally, system includes shouldn't be
required so all of these should go away except rado/librados.h

Removed.



+
+#include "rbd_types.h"
+#include "module.h"
+#include "block_int.h"
+
+#include<stdio.h>
+#include<stdlib.h>
+#include<rados/librados.h>
+
+#include<signal.h>
+
+
+int eventfd(unsigned int initval, int flags);


This is not quite right.  Depending on eventfd is curious but in the very
least, you need to detect the presence of eventfd in configure and
provide a
wrapper that redefines it as necessary.

Can fix that, though please see my later remarks.

+static int create_tmap_op(uint8_t op, const char *name, char
**tmap_desc)
+{
+    uint32_t len = strlen(name);
+    /* total_len = encoding op + name + empty buffer */
+    uint32_t total_len = 1 + (sizeof(uint32_t) + len) +
sizeof(uint32_t);
+    char *desc = NULL;


char is the wrong type to use here as it may be signed or unsigned.  That
can have weird effects with binary data when you're directly manipulating
it.

Well, I can change it to uint8_t, so that it matches the op type, but
that'll require adding some other castings. In any case, you usually
get such a weird behavior when you cast to types of different sizes
and have the sign bit padded which is not the case in here.



+
+    desc = qemu_malloc(total_len);
+
+    *tmap_desc = desc;
+
+    *desc = op;
+    desc++;
+    memcpy(desc,&len, sizeof(len));
+    desc += sizeof(len);
+    memcpy(desc, name, len);
+    desc += len;
+    len = 0;
+    memcpy(desc,&len, sizeof(len));
+    desc += sizeof(len);


Shouldn't endianness be a concern?

Right. Fixed that.



+
+    return desc - *tmap_desc;
+}
+
+static void free_tmap_op(char *tmap_desc)
+{
+    qemu_free(tmap_desc);
+}
+
+static int rbd_register_image(rados_pool_t pool, const char *name)
+{
+    char *tmap_desc;
+    const char *dir = RBD_DIRECTORY;
+    int ret;
+
+    ret = create_tmap_op(CEPH_OSD_TMAP_SET, name,&tmap_desc);
+    if (ret<      0) {
+        return ret;
+    }
+
+    ret = rados_tmap_update(pool, dir, tmap_desc, ret);
+    free_tmap_op(tmap_desc);
+
+    return ret;
+}


This ops are all synchronous?  IOW, rados_tmap_update() call blocks until
the operation is completed?

Yeah. And this is only called from the rbd_create() callback.


+            header_snap += strlen(header_snap) + 1;
+            if (header_snap>      end)
+                error_report("bad header, snapshot list broken");


Missing curly braces here.

Fixed.


+    if (strncmp(hbuf + 68, RBD_HEADER_VERSION, 8)) {
+        error_report("Unknown image version %s", hbuf + 68);
+        r = -EMEDIUMTYPE;
+        goto failed;
+    }
+
+    RbdHeader1 *header;



Don't mix variable definitions with code.

Fixed.


+    s->efd = eventfd(0, 0);
+    if (s->efd<      0) {
+        error_report("error opening eventfd");
+        goto failed;
+    }
+    fcntl(s->efd, F_SETFL, O_NONBLOCK);
+    qemu_aio_set_fd_handler(s->efd, rbd_aio_completion_cb, NULL,
+        rbd_aio_flush_cb, NULL, s);


It looks like you just use the eventfd to signal aio completion
callbacks.
  A better way to do this would be to schedule a bottom half.  eventfds
are
Linux specific and specific to recent kernels.

Digging back why we introduced the eventfd, it was due to some issues
seen with do_savevm() hangs on qemu_aio_flush(). The reason seemed
that we had no fd associated with the block device, which seemed to
not work well with the qemu aio model. If that assumption is wrong,
we'd be happy to change it. In any case, there are other more portable
ways to generate fds, so if it's needed we can do that.

There's no fd at all?   How do you get notifications about an asynchronous
event completion?

Regards,

Anthony Liguori

(resending to list, sorry)

The fd is hidden deep under in librados. We get callback notifications
for events completion.

How is that possible? Are the callbacks delivered in the context of a different thread? If so, don't you need locking?

Regards,

Anthony Liguori

Thanks,
Yehuda

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux