Re: Heads up: libvirt produces unusable images from RBD pool on Ubuntu trusty


On 04/22/2015 03:38 PM, Wido den Hollander wrote:
> On 04/22/2015 03:20 PM, Florian Haas wrote:
>> On Wed, Apr 22, 2015 at 1:02 PM, Wido den Hollander <wido@xxxxxxxx> wrote:
>>> On 04/22/2015 12:07 PM, Florian Haas wrote:
>>>> Hi everyone,
>>>>
>>>> I don't think this has been posted to this list before, so just
>>>> writing it up so it ends up in the archives.
>>>>
>>>> tl;dr: Using RBD storage pools with libvirt is currently broken on
>>>> Ubuntu trusty (LTS), and any other platform using libvirt 1.2.2.
>>>>
>>>> In libvirt 1.2.2, the rbd_create3 function is invoked, on volume
>>>> creation from a pool, with the stripe_count and stripe_unit parameters
>>>> reversed. So if you have an rbd storage pool, and you do "virsh
>>>> vol-create-as" or something equivalent, then instead of a stripe count
>>>> of one and a stripe size of 4MB, you get 4194304 1-byte stripes.
>>>> Needless to say, this renders the volume excruciatingly slow to the
>>>> point of not being usable. Volume deletion also takes on the order of
>>>> minutes even for an empty volume.
>>>>
>>>> This issue was introduced in libvirt 1.2.1, and was fixed for 1.2.4,
>>>> but Ubuntu 14.04 LTS (which is on 1.2.2) evidently never backported
>>>> that fix.
>>>>
>>>
>>> Oops... Did I do this? I think so. I messed up with the arguments.
>>>
>>> My apologies!
>>
>> No need to apologize; after all it *is* fixed upstream, and AFAICT no
>> other major distros are affected. It's just that Ubuntu really needs to
>> backport that one-line fix if they want libvirt RBD pool functionality
>> to work in their current LTS.
>>
>> I'm not entirely sure, though, why virStorageBackendRBDCreateImage()
>> enables striping unconditionally; could you explain the reasoning
>> behind that?
>>
> 
> When working on this with Josh some time ago we had to come up with a
> way to create RBD format 2 images before the RBD default format option
> was available.
> 
> The only way to do it was to set the stripe size and count explicitly
> and check in libvirt which librbd version was in use.
> 
> This code could be changed again if we want by doing:
> 
> rados_conf_set(ptr->cluster, "rbd_default_format", "2");
> rbd_create(.....);
> 
> Any recent librbd library would create format 2 images then. Since there
> is no way to feed any metadata to libvirt when creating an image this is
> the only way to go.
> 

Something like the attached patch should work. Comments?

Wido

> Wido
> 
>> Cheers,
>> Florian
>>
> 
> 


-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
From 9067cf5f2ed89940a0846fc535637af196b76f61 Mon Sep 17 00:00:00 2001
From: Wido den Hollander <wido@xxxxxxxxxxxx>
Date: Wed, 22 Apr 2015 15:43:16 +0200
Subject: [PATCH] rbd: Use rbd_create for creating images

Newer librbd versions (since 0.67 / Dumpling) support the option
rbd_default_format. By setting this option to 2 we will create
RBD format 2 images if supported.

Almost every librbd version deployed currently supports it, so the
chance that an image is created with format 1 is very small.

Signed-off-by: Wido den Hollander <wido@xxxxxxxxxxxx>
---
 src/storage/storage_backend_rbd.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/storage/storage_backend_rbd.c b/src/storage/storage_backend_rbd.c
index ae4bcb3..7c83f80 100644
--- a/src/storage/storage_backend_rbd.c
+++ b/src/storage/storage_backend_rbd.c
@@ -151,6 +151,15 @@ static int virStorageBackendRBDOpenRADOSConn(virStorageBackendRBDStatePtr ptr,
                            "auth_supported");
             goto cleanup;
         }
+
+        VIR_DEBUG("Setting rbd_default_format to 2. Will create RBD "
+                  "format 2 images if supported");
+        if (rados_conf_set(ptr->cluster, "rbd_default_format", "2") < 0) {
+            virReportError(VIR_ERR_INTERNAL_ERROR,
+                           _("failed to set RADOS option: %s"),
+                           "rbd_default_format");
+            goto cleanup;
+        }
     } else {
         VIR_DEBUG("Not using cephx authorization");
         if (rados_create(&ptr->cluster, NULL) < 0) {
@@ -475,16 +484,7 @@ static int virStorageBackendRBDCreateImage(rados_ioctx_t io,
                                            char *name, long capacity)
 {
     int order = 0;
-#if LIBRBD_VERSION_CODE > 260
-    uint64_t features = 3;
-    uint64_t stripe_count = 1;
-    uint64_t stripe_unit = 4194304;
-
-    if (rbd_create3(io, name, capacity, features, &order,
-                    stripe_unit, stripe_count) < 0) {
-#else
     if (rbd_create(io, name, capacity, &order) < 0) {
-#endif
         return -1;
     }
 
-- 
1.9.1
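
To make the effect of the swapped arguments in the original report concrete, here is a minimal sketch in C. It does not call librbd. The struct and both helpers (striping, make_striping, stripe_units_needed) are hypothetical names used only for illustration; the only things taken from the thread are the values 4194304 and 1 and the (stripe_unit, stripe_count) argument order shown in the rbd_create3() call that the patch removes. The 1 GiB image size is an assumed example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration type; not a librbd or libvirt structure. */
struct striping {
    uint64_t stripe_unit;   /* bytes per stripe unit */
    uint64_t stripe_count;  /* number of objects a stripe is spread across */
};

/* Stand-in for a callee that expects (stripe_unit, stripe_count) in that
 * order. Both parameters share a type, so the compiler cannot notice
 * when a caller passes them the other way around. */
static struct striping make_striping(uint64_t stripe_unit,
                                     uint64_t stripe_count)
{
    struct striping s = { stripe_unit, stripe_count };
    return s;
}

/* Number of stripe units needed to cover image_size bytes, rounded up. */
static uint64_t stripe_units_needed(uint64_t image_size, struct striping s)
{
    assert(s.stripe_unit != 0);
    return (image_size + s.stripe_unit - 1) / s.stripe_unit;
}
```

Called as make_striping(4194304, 1), an assumed 1 GiB image is covered by 256 stripe units of 4 MiB each. Called with the arguments reversed, make_striping(1, 4194304), the same image needs 1073741824 one-byte stripe units, which matches the "4194304 1-byte stripes" layout described in the report.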

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
