Hi,
I am experiencing an issue with CephFS on top of a cache tier, where kernel clients sometimes read files filled entirely with zeros.
The setup:
ceph 0.94.3
created the cephfs_metadata replicated pool
created the cephfs_data replicated pool
created CephFS on the above two pools and populated it with files, then:
created the cephfs_ssd_cache replicated pool
and added the cache tier:
ceph osd tier add cephfs_data cephfs_ssd_cache
ceph osd tier cache-mode cephfs_ssd_cache writeback
ceph osd tier set-overlay cephfs_data cephfs_ssd_cache
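For completeness, the pools and the filesystem were created roughly like this (the pg counts below are placeholders, not the exact values I used):

ceph osd pool create cephfs_metadata 128 128 replicated
ceph osd pool create cephfs_data 512 512 replicated
ceph fs new cephfs cephfs_metadata cephfs_data
ceph osd pool create cephfs_ssd_cache 128 128 replicated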
While the cephfs_ssd_cache pool is empty, multiple kernel clients on different hosts open the same small file (<10k) at approximately the same time. Some of those clients see, at the OS level, a completely empty file. However, if I run rados -p {cache pool} ls to list the cached objects and then rados -p {cache pool} get {object} /tmp/file, I see the complete contents of the file.
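Concretely, the check against the cache pool looks like this (the object name below is just an example of the inode.offset naming CephFS uses, not an actual object from my cluster):

rados -p cephfs_ssd_cache ls
rados -p cephfs_ssd_cache get 10000000123.00000000 /tmp/file

and /tmp/file contains the full, correct data even while the kernel clients see zeros.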
I can reproduce this by setting cache-mode to forward, running rados -p {cache pool} cache-flush-evict-all, confirming with rados -p {cache pool} ls that no objects remain in the cache, setting cache-mode back to writeback on the now-empty pool, and then having multiple clients open the same file again.
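The exact reset sequence is roughly:

ceph osd tier cache-mode cephfs_ssd_cache forward
rados -p cephfs_ssd_cache cache-flush-evict-all
rados -p cephfs_ssd_cache ls    # confirm the cache pool is empty
ceph osd tier cache-mode cephfs_ssd_cache writeback

and then the same small file is opened from several kernel clients at the same time.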
Has anyone seen this issue? It looks like a race condition where the object has not yet been completely promoted into the cache pool, so the cache pool serves out an incomplete object.
If anyone can shed some light on this or suggest ways to debug it further, that would be very helpful.
Thanks,
Arthur