Re: RBD clone on Bluestore

On Mon, Mar 1, 2021 at 3:07 PM Pawel S <pejotes@xxxxxxxxx> wrote:
>
> Hello Jason!
>
> On Mon, Mar 1, 2021, 19:48 Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>
> > On Mon, Mar 1, 2021 at 1:35 PM Pawel S <pejotes@xxxxxxxxx> wrote:
> > >
> > > hello!
> > >
> > > I'm trying to understand how Bluestore cooperates with RBD image clones, so
> > > my test is simple:
> > >
> > > 1. create an image (2G) and fill with data
> > > 2. create a snapshot
> > > 3. protect it
> > > 4. create a clone of the image
> > > 5. write a small portion of data (4K) to clone
> > > 6. check how it changed and whether just 4K is used, to prove that CoW
> > > allocated a new extent instead of copying out the snapped data.
> > >
> > > Unfortunately, it turns out that rbd du, at least, reports that 4M was
> > > changed and the clone consumes 4M of data instead of the expected 4K...
> > > '''
> > > rbd du rbd/clone1
> > > NAME    PROVISIONED  USED
> > > clone1        2 GiB  4 MiB
> > > '''
> > >
> > > How can I trace/prove that Bluestore CoW really works in this case, and
> > > prevent copying the rest of the 4M stripe like Filestore did?
> >
> > RBD clones are not the same as RBD snapshots. Writing to an
> > unallocated extent within an RBD clone image will always copy up to
> > the "object-size" amount of data from the parent image. In that
> > respect, there is no difference between Bluestore and Filestore since
> > this logic is agnostic to the underlying OSD object store.
> >
>
> Thanks for the clarification. So the only way to reduce the impact here is
> to decrease the object size?

Yes, although since the Octopus release the copy-up operation from
parent to child is a sparse read from the parent followed by a sparse
write to the child (i.e. if the overlapping parent object extent is
empty, no data needs to be copied to the child). However, in your test
case, where the overlapping parent extent holds a full 4 MiB of data,
all of it has to be copied to the child.
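
As a rough illustration of the arithmetic, here is a minimal Python sketch
of how much data a single write to a clone could pull up from the parent.
It is a simplified model and not librbd code: the function names and the
parent_data mapping are hypothetical, and it assumes default striping (one
backing object per "object-size" range), that the touched child objects do
not exist yet, and that the whole write falls within the parent overlap.

```python
MiB = 1024 * 1024


def touched_objects(offset, length, object_size):
    """Indices of the backing objects that a write of `length` bytes at `offset` lands in."""
    if length <= 0:
        return range(0)
    first = offset // object_size
    last = (offset + length - 1) // object_size
    return range(first, last + 1)


def copy_up_bytes(offset, length, object_size, parent_data=None):
    """Estimate the bytes copied from parent to child for one write to a clone.

    parent_data is a hypothetical {object_index: bytes_of_data_in_parent} map
    used to model the sparse copy-up; None models a fully written parent,
    as in the test case from this thread.
    """
    total = 0
    for obj in touched_objects(offset, length, object_size):
        if parent_data is None:
            # Fully written parent: the whole object is copied up.
            total += object_size
        else:
            # Sparse copy-up: only data that exists in the parent is copied.
            total += min(parent_data.get(obj, 0), object_size)
    return total


if __name__ == "__main__":
    print(copy_up_bytes(0, 4096, 4 * MiB))                  # 4194304
    print(copy_up_bytes(0, 4096, 1 * MiB))                  # 1048576
    print(copy_up_bytes(0, 4096, 4 * MiB, parent_data={}))  # 0
```

Under these assumptions, a 4K write with the default 4 MiB object size
copies 4 MiB, which matches the rbd du output above, while a 1 MiB object
size would reduce that to 1 MiB per touched object.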

>
> > RBD snapshots, however, do result in copy-on-write-like semantics
> > within Bluestore, although it's technically a redirect-on-write: older
> > data is not moved to preserve the snapshot history. Instead, the new
> > write request is redirected to a newly allocated block within
> > Bluestore, and the metadata is updated to reference the new block.
> >
> >
> Thanks!!! So I had wrongly assumed the same for clones.
>
> --
> Pawel S.
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>


-- 
Jason


