Hey folks,
Recently we ran into an issue with I/O ordering: because the cache operates
asynchronously, the order in which write operations are flushed to the OSDs
is not guaranteed [0].
To resolve this, @Jianpeng Ma has a PR that uses BlockGuard, which protects an
extent by restricting and ordering concurrent IO to the same block [1].
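For readers unfamiliar with the idea, the per-block ordering described above
can be illustrated with a toy model. This is only a sketch under assumptions:
ToyBlockGuard, acquire and release are hypothetical names, not the librbd
BlockGuard API, and blocks are plain integers rather than extents.

```python
from collections import defaultdict, deque


class ToyBlockGuard:
    """Hypothetical guard: one in-flight op per block, the rest queued FIFO."""

    def __init__(self):
        self._busy = set()                  # blocks with an op in flight
        self._waiting = defaultdict(deque)  # block -> ops waiting, arrival order

    def acquire(self, block, op):
        """Return True if op may start now, else queue it and return False."""
        if block in self._busy:
            self._waiting[block].append(op)
            return False
        self._busy.add(block)
        return True

    def release(self, block):
        """Finish the in-flight op; return the next queued op, or None."""
        queue = self._waiting[block]
        if queue:
            # The block stays busy; ownership passes to the oldest waiter.
            return queue.popleft()
        self._busy.discard(block)
        del self._waiting[block]
        return None


guard = ToyBlockGuard()
first = guard.acquire(7, "w1")   # block 7 is free, w1 starts
second = guard.acquire(7, "w2")  # same block, w2 has to wait
other = guard.acquire(8, "w3")   # different block, not restricted
resumed = guard.release(7)       # w1 done, w2 is handed the block
```

Note that in this toy model ordering is only enforced between ops that touch
the same block; ops on different blocks proceed independently.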
However, past discussions suggest that the persistent cache is supposed to
support ordered retiring [2]. Some relevant context from the etherpad [3]:
```
Conventional WB policies may not be suitable as they offer no strict durability (possible loss of recent updates on failures), or ordering guarantees for dirty data eviction (which may cause storage inconsistencies on client crashes/failures).
The alternative proposed here is a "journaled" WB policy as described in [2] which ensures that blocks get evicted from flash cache (and synced to the networked storage) in the original write order, the networked storage state atomically transitions from one consistency point to the next (for crash-consistency) and allows for optimizations (e.g. write coalescing in the cache)
```
If that is the case, going forward with the BlockGuard approach might deviate
from the original idea of the persistent writeback cache.
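To make "evicted in the original write order" concrete, here is a minimal
sketch of a log that retires dirty entries strictly oldest-first. It is an
illustration under assumptions, not the design from [2] or the existing cache
code: ToyWriteLog and its methods are hypothetical names, a dict stands in for
the cluster, and neither atomic consistency points nor write coalescing are
modelled.

```python
from collections import deque


class ToyWriteLog:
    """Hypothetical write-back log that retires entries in write order."""

    def __init__(self):
        self._log = deque()  # (seq, block, data) tuples, oldest first
        self._next_seq = 0

    def write(self, block, data):
        """Append a write to the log and return its sequence number."""
        seq = self._next_seq
        self._log.append((seq, block, data))
        self._next_seq += 1
        return seq

    def retire(self, backend, max_entries=None):
        """Flush the oldest entries to backend; return their sequence numbers."""
        retired = []
        while self._log and (max_entries is None or len(retired) < max_entries):
            seq, block, data = self._log.popleft()
            backend[block] = data
            retired.append(seq)
        return retired


log = ToyWriteLog()
log.write(1, "a")
log.write(2, "b")
log.write(1, "c")

backend = {}
first_batch = log.retire(backend, max_entries=2)  # flushes the two oldest writes
snapshot = dict(backend)                          # backend state after a partial flush
second_batch = log.retire(backend)                # flushes whatever is left
```

Because entries only ever leave from the head of the log, the backend in this
sketch always reflects a prefix of the write history, even if flushing stops
partway through.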
* Is ordering of ops a guarantee of the persistent cache? What would be the best way to resolve this issue?
Any other thoughts on this are appreciated!
Thanks,
Deepika
[0] https://tracker.ceph.com/issues/53108
[1] https://github.com/ceph/ceph/blob/master/src/librbd/BlockGuard.h
[2] https://www.snia.org/sites/default/files/SDC/2018/presentations/PM/Peterson_Scott_Using_Persistent_Memory_and_RDMA_for_Ceph_Client_Writeback_Caching.pdf
[3] https://pad.ceph.com/p/rbd_persistent_cache
[4] https://github.com/ceph/ceph/pull/44103#issuecomment-980841893