Re: about rbd journaling

Jason Dillaman <jdillama@xxxxxxxxxx> · Fri, 6 May 2016 08:28:31 -0400

The rationale for requiring the data to be safely committed to the
journal before permitting writes to the image itself is to ensure
consistent state upon a crash/failure.  If, for example, the write
modified the image and there was a failure before ensuring the journal
entry was safely committed to disk, the replicated image would no
longer be consistent with the source (since there would be no log of
the change for the replicated image to replay).
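
To make the ordering argument concrete, here is a minimal toy sketch in
Python. It is not librbd code: ToyImage, replay, write and simulate are
hypothetical names made up for illustration, and the sketch assumes the
primary also replays its journal when it recovers from the failure. It
simulates a failure between the two steps of a write and checks whether
the primary and the replica agree afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class ToyImage:
    """Hypothetical stand-in for an RBD image: offset -> data."""
    extents: dict = field(default_factory=dict)


def replay(journal, image):
    """Apply every journaled write to the given image, in order."""
    for offset, data in journal:
        image.extents[offset] = data


def write(journal, image, offset, data, journal_first, crash_between_steps):
    """Perform the two steps of a write in the chosen order."""
    steps = [
        lambda: journal.append((offset, data)),           # commit journal entry
        lambda: image.extents.__setitem__(offset, data),  # modify the image
    ]
    if not journal_first:
        steps.reverse()
    steps[0]()
    if crash_between_steps:
        return  # simulated failure before the second step runs
    steps[1]()


def simulate(journal_first):
    """Return True if primary and replica match after a mid-write failure."""
    journal, primary, replica = [], ToyImage(), ToyImage()
    write(journal, primary, 0, b"new", journal_first, crash_between_steps=True)
    replay(journal, primary)  # assumed recovery step on the primary
    replay(journal, replica)  # what the replicated image would replay
    return primary.extents == replica.extents


print(simulate(journal_first=True))   # True: the journal holds the change
print(simulate(journal_first=False))  # False: the image changed with no log entry
```

With the journal written first, the worst case is a journal entry that has
not yet reached the image, which replay can fix. With the image written
first, the change exists only in the source image and nothing can carry
it to the replica.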

With the librbd cache enabled, we still immediately ACK the write to
the client since read-after-write requests can be correctly satisfied
by the cache.  The cache writeback for the affected extents will be
blocked until the journal entry is safely stored -- so a flush would
block until the journal event is safe on disk.
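
The cache behavior described above can be sketched with a small toy
model. Again, this is only an illustration under assumptions: the class
ToyWritebackCache and its methods are hypothetical and do not correspond
to the real librbd cache interface, and the blocking flush is modeled as
a simple "would block" check rather than real waiting.

```python
class ToyWritebackCache:
    """Hypothetical model of a writeback cache gated on journal commits."""

    def __init__(self):
        self.dirty = {}            # offset -> data not yet written back
        self.journal_safe = set()  # offsets whose journal entry is committed
        self.image = {}            # data that has reached the backing image

    def write(self, offset, data):
        """Cache the write and ACK immediately; the journal append is in flight."""
        self.dirty[offset] = data
        self.journal_safe.discard(offset)
        return "ack"

    def read(self, offset):
        """Read-after-write is satisfied from the cache before writeback."""
        return self.dirty.get(offset, self.image.get(offset))

    def on_journal_safe(self, offset):
        """Called once the journal entry for this extent is safely stored."""
        self.journal_safe.add(offset)

    def writeback(self):
        """Write back only extents whose journal entry is safe."""
        ready = [o for o in self.dirty if o in self.journal_safe]
        for offset in ready:
            self.image[offset] = self.dirty.pop(offset)
        return ready

    def flush_would_block(self):
        """A flush has to wait while any dirty extent is not yet journaled."""
        return any(o not in self.journal_safe for o in self.dirty)


cache = ToyWritebackCache()
print(cache.write(0, b"abc"))     # ack
print(cache.read(0))              # b'abc'
print(cache.writeback())          # []
print(cache.flush_would_block())  # True
cache.on_journal_safe(0)
print(cache.writeback())          # [0]
print(cache.flush_would_block())  # False
```

The point of the sketch is that the client-visible ACK and the read path
never wait on the journal, while anything that moves data to the image
does.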

On Thu, May 5, 2016 at 11:17 PM, Jaze Lee <jazeltq@xxxxxxxxx> wrote:
> Hi all,
>    I found that rbd journaling first sends data to the journal pool (in
> append_io_event, which returns a future from the journaler append) and
> then, once the append is safe, sends it to the real data pool.
>    Can someone explain a little more about this? I do not understand
> why it is done this way. Why not write to the data pool first, then to
> the journal pool? Maybe putting the data in the journal pool first lets
> rbd-mirror sync first?
>
> Thanks a lot
>
>
>
>
> --
> 谦谦君子
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Jason