Understanding CephFS snapshot workflow and performance

Hello all,
Our team is currently deciding whether or not to enable snapshots on CephFS
directories, so we are trying to understand the effects and performance
impact snapshots have on the cluster.
Our main concern is: how is the cluster affected when data is written to a
file that is under a snapshot? We found that Ceph uses a copy-on-write
mechanism for snapshots, so my question is: if, for example, I have a 1 GB
file under a snapshot and I append another 10 MB of data to it, how much
data will be copied because of that new write?

My understanding is that, since Ceph stripes a file across multiple objects,
only the object containing the last stripe_unit (assuming it is not
completely filled) will be copied and the new data appended to it; Ceph then
somehow returns the new object when I read the current version of the file
and the old object when I read the file from the snapshot. In that case the
data copied would be O(10 MB), i.e. on the order of the data written, plus a
few metadata changes.
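
To make the arithmetic concrete, here is a rough sketch (my own
back-of-the-envelope Python, not anything taken from Ceph) of which RADOS
objects such an append would touch, assuming the default layout of 4 MiB
objects with stripe_count = 1:

# Which RADOS objects does an append touch, assuming 4 MiB objects,
# stripe_count = 1 (the default CephFS file layout)?
OBJECT_SIZE = 4 * 1024 * 1024  # assumed default object_size

def objects_touched(file_size, write_len):
    """Return (existing objects rewritten, new objects created) for an append."""
    first = file_size // OBJECT_SIZE                   # object where the append starts
    last = (file_size + write_len - 1) // OBJECT_SIZE  # object where it ends
    partial = 1 if file_size % OBJECT_SIZE else 0      # last existing object, if not already full
    return partial, (last - first + 1) - partial

print(objects_touched(10**9, 10**7))  # 1 GB file + 10 MB append -> (1, 2)

So under this reading, only one existing object would need to be cloned, and
the rest of the append lands in brand-new objects.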

Or, since Ceph now uses BlueStore as the storage layer, does it have even
better optimizations than the above? For example, when editing the object
corresponding to the last stripe_unit, will Ceph simply write the new data
to a fresh location on disk and update the object's metadata to point at it,
while keeping snapshot-based versions of that metadata so it can serve the
file contents as of earlier points in time? In that case the data
copied/written would be just the 10 MB, plus some additional metadata
changes compared to the case above.

Or is it the case that Ceph copies the whole file and edits the new copy,
i.e. the data copied is 1 GB + 10 MB? I am assuming this is not what
happens, because it would clearly be suboptimal for large files.
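
To summarize the three scenarios numerically (again my own rough numbers
under the same assumed 4 MiB object size, not measurements):

MiB = 1024 * 1024
OBJECT_SIZE = 4 * MiB                 # assumed default CephFS object size
file_size, append = 10**9, 10**7      # 1 GB file, 10 MB append

tail = file_size % OBJECT_SIZE        # bytes already in the last, partially filled object

scenarios = {
    "whole-object copy-on-write": tail + append,       # clone the last object, then write the append
    "extent-level sharing":       append,              # only the new data hits disk
    "full file copy":             file_size + append,  # the pessimistic case
}
for name, data in scenarios.items():
    print(f"{name}: ~{data / MiB:.1f} MiB written, plus metadata")

Knowing which of these is closest to reality is exactly what we are trying
to find out.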

PS: Any resources on measuring the effect of snapshots on the cluster, or on
the internals of Ceph snapshots, would be very much appreciated. I've
searched the internet extensively but couldn't find any relevant data, and I
tried reading the code, but you can probably guess how that went.
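
For context, this is roughly the experiment we were planning to run
ourselves (a minimal, untested sketch; the mount point and names are made
up):

import os

DIR = "/mnt/cephfs/testdir"   # hypothetical CephFS mount + test directory

# CephFS snapshots are created by making a directory under the .snap dir
os.mkdir(os.path.join(DIR, ".snap", "before-append"))

# append 10 MB to an existing file that is now covered by the snapshot
with open(os.path.join(DIR, "bigfile"), "ab") as f:
    f.write(os.urandom(10 * 1024 * 1024))

# then compare `rados df` / `ceph df detail` output taken before and after
# this write to see how much data was actually stored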
Thank you.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


