Re: RBD EC images for a ZFS pool

On Thu, Jan 9, 2020 at 2:52 PM Kyriazis, George
<george.kyriazis@xxxxxxxxx> wrote:
>
> Hello ceph-users!
>
> My setup is that I’d like to use RBD images as a replication target for a FreeNAS zfs pool.  I have a second FreeNAS server (in a VM) acting as the backup target, on which I mount the RBD image.  All of this (except the source FreeNAS server) runs in Proxmox.
>
> Since I am using RBD as a backup target, performance is not really critical, but I still don’t want the backup to take months to complete.  My source pool is on the order of ~30 TB.
>
> I’ve set up an EC RBD pool (and the matching replicated pool) and created an image with no problems.  However, with the stock 4MB object size, backup speed is quite slow.  I tried creating an image with a 4K object size, but even for a relatively small image size (of 1TB), I get:
>
> # rbd -p rbd_backup create vm-118-disk-0 --size 1T --object-size 4K --data-pool rbd_ec
> 2020-01-09 07:40:27.120 7f3e4aa15f40 -1 librbd::image::CreateRequest: validate_layout: image size not compatible with object map
> rbd: create error: (22) Invalid argument
> #

Yeah, this is an object map limitation.  Given that this is a backup
target, you don't really need the object map feature.  Disable it with
"rbd feature disable vm-118-disk-0 object-map" and you should be able
to create an image of any size.
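
Alternatively (just a sketch, reusing the rbd_backup/rbd_ec pools and the
image name from your transcript above), the image could be created with a
reduced feature set up front, so the object map never gets enabled:

  # sketch: pool/image names taken from the failing command above
  rbd -p rbd_backup create vm-118-disk-0 --size 1T --object-size 4K \
      --data-pool rbd_ec --image-feature layering,exclusive-lock

Note that fast-diff depends on object-map, so if you go the "feature
disable" route on an existing image you may have to disable fast-diff
as well.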

That said, are you sure that object size is the issue?  If you expect
small sequential writes and want them to go to different OSDs, look at
using a fancy striping pattern instead of changing the object size:

  https://docs.ceph.com/docs/master/man/8/rbd/#striping

E.g. with --stripe-unit 4K --stripe-count 8, the first 4K will go to
object 1, the second 4K to object 2, etc.  The ninth 4K will return to
object 1, the tenth to object 2, etc.  When objects 1-8 become full, it
will move on to objects 9-16, then to 17-24, etc.
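
As a concrete sketch (same pool and image names as in your transcript,
keeping the stock 4M objects), that would look something like:

  # sketch: default object size, 4K stripe unit spread across 8 objects
  rbd -p rbd_backup create vm-118-disk-0 --size 1T --data-pool rbd_ec \
      --stripe-unit 4K --stripe-count 8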

This way you get the increased parallelism without the very significant
overhead of tons of small objects (if your OSDs are capable enough).

Thanks,

                Ilya



