The rationale for requiring the data to be safely committed to the journal before permitting writes to the image itself is to ensure a consistent state after a crash or failure. If, for example, the write modified the image and a failure occurred before the journal entry was safely committed to disk, the replicated image would no longer be consistent with the source, since there would be no log of the change for the replicated image to replay.

With the librbd cache enabled, we still immediately ACK the write to the client, since read-after-write requests can be correctly satisfied by the cache. The cache writeback for the affected extents is blocked until the journal entry is safely stored, so a flush would block until the journal event is safe on disk.

On Thu, May 5, 2016 at 11:17 PM, Jaze Lee <jazeltq@xxxxxxxxx> wrote:
> Hi all,
> I found that rbd journaling first sends data to the journal pool (in
> append_io_event, which returns a future from the journaler append) and
> then (when the append is safe) sends it to the real data pool.
> Can someone explain a little more about this? I do not know why it is
> done this way. Why not write first to the data pool, then to the
> journal pool? Maybe putting data in the journal pool first lets
> rbd-mirror sync first?
>
> Thanks a lot
>
> --
> 谦谦君子
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Jason
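
To make the ordering argument in the reply concrete, here is a minimal toy sketch in Python. It is not librbd code: Journal, Image, journaled_write, image_first_write and replay are hypothetical names invented for this illustration, and "safe on disk" is modelled simply as an entry being present in an in-memory list. It only shows why a crash between the two steps is recoverable when the journal is written first, and not when the image is written first.

```python
# Toy model of journal-before-image ordering.
# All classes and functions here are hypothetical; none are librbd APIs.


class Journal:
    """Append-only log; an appended entry is treated as safe on disk."""

    def __init__(self):
        self.entries = []

    def append(self, offset, data):
        self.entries.append((offset, bytes(data)))


class Image:
    """Fixed-size block image backed by a bytearray."""

    def __init__(self, size):
        self.data = bytearray(size)

    def write(self, offset, data):
        self.data[offset:offset + len(data)] = data


def journaled_write(journal, image, offset, data, crash_before_image=False):
    """Journal first, then image (the ordering described in the reply)."""
    journal.append(offset, data)   # step 1: event is safe in the journal
    if crash_before_image:
        return                     # simulated failure between the steps
    image.write(offset, data)      # step 2: only now modify the image


def image_first_write(journal, image, offset, data, crash_before_journal=False):
    """Reverse ordering, shown only to illustrate the failure mode."""
    image.write(offset, data)      # image modified first
    if crash_before_journal:
        return                     # the change was never logged
    journal.append(offset, data)


def replay(journal, image):
    """Apply every journaled event to an image, in order (idempotent here)."""
    for offset, data in journal.entries:
        image.write(offset, data)


if __name__ == "__main__":
    # Journal-first: a crash leaves the event in the journal, so replaying
    # the journal brings both primary and replica to the same state.
    journal, primary, replica = Journal(), Image(8), Image(8)
    journaled_write(journal, primary, 0, b"ab")
    journaled_write(journal, primary, 2, b"cd", crash_before_image=True)
    replay(journal, primary)
    replay(journal, replica)
    print("journal-first consistent:", primary.data == replica.data)

    # Image-first: a crash leaves a change in the image that no journal
    # entry describes, so the replica can never learn about it.
    journal, primary, replica = Journal(), Image(8), Image(8)
    image_first_write(journal, primary, 0, b"zz", crash_before_journal=True)
    replay(journal, replica)
    print("image-first consistent:", primary.data == replica.data)
```

In this model, the journal-first case recovers because replay is driven entirely by the journal; the image-first case diverges because the only record of the write is the image itself. The librbd cache behaviour mentioned in the reply (ACK early, hold back writeback until the journal event is safe) is not modelled here.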