On Mon, Mar 1, 2021 at 3:07 PM Pawel S <pejotes@xxxxxxxxx> wrote:
>
> Hello Jason!
>
> On Mon, Mar 1, 2021, 19:48 Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>
> > On Mon, Mar 1, 2021 at 1:35 PM Pawel S <pejotes@xxxxxxxxx> wrote:
> > >
> > > Hello!
> > >
> > > I'm trying to understand how Bluestore cooperates with RBD image
> > > clones, so my test is simple:
> > >
> > > 1. create an image (2G) and fill it with data
> > > 2. create a snapshot
> > > 3. protect it
> > > 4. create a clone of the image
> > > 5. write a small portion of data (4K) to the clone
> > > 6. check how it changed and whether just 4K is used, to prove that
> > >    CoW allocated a new extent instead of copying out the snapped data
> > >
> > > Unfortunately, rbd du reports that 4M was changed and the clone
> > > consumes 4M of data instead of the expected 4K:
> > > '''
> > > rbd du rbd/clone1
> > > NAME    PROVISIONED  USED
> > > clone1        2 GiB  4 MiB
> > > '''
> > >
> > > How can I trace/prove that Bluestore CoW really works in this case,
> > > and prevent copying the rest of the 4M stripe as Filestore did?
> >
> > RBD clones are not the same as RBD snapshots. Writing to an
> > unallocated extent within an RBD clone image will always copy up to
> > the "object-size" amount of data from the parent image. In that
> > respect, there is no difference between Bluestore and Filestore, since
> > this logic is agnostic to the underlying OSD object store.
>
> Thanks for the clarification; so the only way to reduce the impact here
> is to decrease the object size?

Yes, although since the Octopus release this copy-up from parent to
child is a sparse-read from the parent followed by a sparse-write to the
child (i.e. if the overlapping parent object extent was empty, no data
would need to be copied to the child). However, in your test case, where
there is a full 4MiB of data in the overlapping parent extent, it would
all need to be copied to the child.

> > RBD snapshots, however, do result in copy-on-write-like semantics
> > within Bluestore, although it's technically a redirect-on-write: the
> > older data is not moved in order to preserve the snapshot history;
> > instead, the new write request is redirected to a newly allocated
> > block within Bluestore and the metadata is updated to reference the
> > new block.
>
> Thanks!!! So I wrongly assumed the same for clones.
>
> --
> Pawel S.

--
Jason
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
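
For reference, a minimal shell sketch of the test discussed above; it is not
taken verbatim from the thread and assumes a pool named "rbd", the kernel RBD
client for mapping, and illustrative names (base1, snap1, clone1). The
--object-size option at image-creation time is the knob the thread refers to:
the first write to each clone object copies up at most object-size bytes from
the parent.
'''
# 1-2. Create a 2G parent image and fill it with data.
#      --object-size defaults to 4M; decreasing it (e.g. to 1M) is the
#      "decrease the object size" option discussed above.
rbd create rbd/base1 --size 2G --object-size 4M
rbd bench --io-type write --io-size 4M --io-total 2G rbd/base1

# 3-4. Snapshot, protect, and clone.
rbd snap create rbd/base1@snap1
rbd snap protect rbd/base1@snap1
rbd clone rbd/base1@snap1 rbd/clone1

# 5. Write 4K into the clone. Because the overlapping 4M parent object is
#    fully populated, the whole 4M is copied up to the clone (the Octopus
#    sparse copy-up only helps when the parent extent is empty).
DEV=$(sudo rbd device map rbd/clone1)
sudo dd if=/dev/urandom of="$DEV" bs=4K count=1 oflag=direct
sudo rbd device unmap "$DEV"

# 6. Check usage: expect USED to show roughly 4 MiB, not 4 KiB.
rbd du rbd/clone1
'''
The trade-off with a smaller object size is more RADOS objects (and thus more
OSD metadata) per image, so it is usually tuned per workload rather than set
very small across the board.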