I would actually recommend the exact opposite configuration for a high-performance, journaled image: a small but fast SSD/NVMe-backed pool for the journal data, and a large pool for your image data.

With the librbd in-memory writeback cache enabled, IO operations are completed as soon as they are stored in the cache. This helps to alleviate some of the extra latency from appending the journal events. However, if your cache is full, writeback will be paused until the associated journal events are safely committed to disk, so that your image can remain consistent upon failure.

There are a few configuration knobs that can be used to batch journal append operations to the OSDs to help reduce the IOPS load: "rbd_journal_object_flush_interval", "rbd_journal_object_flush_bytes", and "rbd_journal_object_flush_age", which control the maximum number of events to batch, the maximum number of pending journal event bytes to batch, and the maximum age in seconds to batch, respectively. For example, if your workload consists mostly of 512-byte IOs, setting "rbd_journal_object_flush_interval = 30" in your config file would collapse thirty ~512-byte journal append operations into a single ~15KB journal append operation.

Also note that the forthcoming 10.2.4 release should include a noticeable performance boost for the journal, since it reduces lock contention and will automatically batch events based upon the latency of the OSD responses.
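To make that concrete, here is a rough sketch of what the client-side settings and image creation might look like. The pool name "ssd-journal" and image "sas/myimage" are hypothetical, the values are illustrative rather than tuned recommendations, and the "--journal-pool" argument and "rbd journal pool" setting are the ones described in the quoted reply further down this thread:

    # ceph.conf (client side) -- illustrative values, tune for your workload
    [client]
    rbd cache = true                          # librbd in-memory writeback cache
    rbd journal object flush interval = 30    # batch up to 30 pending journal events
    #rbd journal object flush bytes = 16384   # alternatively, batch by pending bytes
    #rbd journal object flush age = 0.25      # alternatively, batch by age in seconds
    rbd journal pool = ssd-journal            # default pool for new journal objects

    # Create a journaled image whose journal objects live on the fast pool
    # (journaling requires the exclusive-lock feature):
    rbd create --size 10240 --image-feature exclusive-lock,journaling \
        --journal-pool ssd-journal sas/myimage

    # Or enable journaling (and place its journal) on an existing image:
    rbd feature enable sas/myimage journaling --journal-pool ssd-journal

The three flush knobs act as independent thresholds -- an append is issued when whichever is configured trips first -- so they can be combined to bound both the batching delay and the amount of pending event data.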
On Mon, Oct 10, 2016 at 9:14 PM, Christian Balzer <chibi@xxxxxxx> wrote:
>
> Hello,
>
> On Tue, 11 Oct 2016 01:07:16 +0000 Cory Hawkless wrote:
>
>> Thanks Jason, works perfectly.
>>
>> Do you know if Ceph blocks the client IO until the journal has acknowledged its write? I.e., can I store my journal on slower disks, or will that have a negative impact on performance?
>>
> Knowing nothing about this, the little detail that it's a generic pool would suggest that all the usual rules and suspects apply.
>
> One assumes the RBD mirror needs to keep a crash-safe state, so even if its writes were allowed to be asynchronous, how much of a backlog (and thus memory consumption) would be permissible?
>
> So my guess is that slow disks and journals would be a no-no.
>
> Let's see what Jason has to say.
>
> Christian
>
>> Is there perhaps a hole in the documentation here? I've not been able to find anything in the man page for RBD, nor on the Ceph website.
>>
>> Regards,
>> Cory
>>
>>
>> -----Original Message-----
>> From: Jason Dillaman [mailto:jdillama@xxxxxxxxxx]
>> Sent: Tuesday, 11 October 2016 7:57 AM
>> To: Cory Hawkless <Cory@xxxxxxxxxxxxxx>
>> Cc: ceph-users@xxxxxxxxxxxxxx
>> Subject: Re: RBD-Mirror - Journal location
>>
>> Yes, the "journal_data" objects can be stored in a separate pool from the image. The rbd CLI allows you to use the "--journal-pool" argument when creating, copying, cloning, or importing an image with journaling enabled. You can also specify the journal data pool when dynamically enabling the journaling feature using the same argument.
>> Finally, there is a Ceph config setting of "rbd journal pool = XYZ" that allows you to default new journals to a specific pool.
>>
>> Jason
>>
>> On Mon, Oct 10, 2016 at 1:59 AM, Cory Hawkless <Cory@xxxxxxxxxxxxxx> wrote:
>> > I've enabled RBD mirroring on my test clusters and it seems to be working well. My question is: can we store the RBD mirror journal on a different pool?
>> >
>> > Currently when I do something like "rados ls -p sas" I see:
>> >
>> > rbd_data.a67d02eb141f2.0000000000000bd1
>> > rbd_data.a67d02eb141f2.0000000000000b73
>> > rbd_data.a67d02eb141f2.000000000000036d
>> > rbd_data.a67d02eb141f2.000000000000074e
>> > journal_data.75.a67d02eb141f2.175
>> > rbd_data.a67d02eb141f2.0000000000000bb6
>> > rbd_data.a67d02eb141f2.0000000000000bae
>> > rbd_data.a67d02eb141f2.0000000000000313
>> > rbd_data.a67d02eb141f2.0000000000000bb3
>> >
>> > Depending on how far behind the remote cluster is on sync, there are more or fewer of these journal entries.
>> >
>> > I am worried about the overhead of storing the journal on the same set of disks as the actual RBD images.
>> >
>> > My understanding is that enabling journaling is going to double the IOPS on the disks; is that correct?
>> >
>> > Any assistance appreciated.
>> >
>> > Regards,
>> > Cory
>>
>> --
>> Jason
>
> --
> Christian Balzer           Network/Systems Engineer
> chibi@xxxxxxx              Global OnLine Japan/Rakuten Communications
> http://www.gol.com/

--
Jason
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com