Re: [PATCH 0/3] krbd discard optimizations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 30, 2019 at 8:16 PM Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>
> On Wed, Jan 30, 2019 at 9:41 AM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> >
> > On Wed, Jan 30, 2019 at 2:53 PM Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
> > >
> > > Should we try to ensure that librbd and krbd have (semi) matching
> > > discard behavior? In librbd, a full object discard on an overlapping
> > > child will truncate the child object to size zero instead of issuing a
> > > delete. With this krbd change, if a child object had previously been
> > > written and now was discarded, the parent data will become visible
> > > again since the child object is deleted. This could be prevented by
> >
> > Right, this is what librbd was doing until about a year ago when
> > I fixed the parent data exposure with a compound create+truncate.
> >
> > > issuing a "assert_exists" + "truncate(0)" op to prevent the implicit
> > > zeroing of a full object extent if it was never written and would
> > > prevent the parent data from becoming visible again after a discard.
> >
> > truncate on a non-existent object is a no-op, so I'm not sure how
> > assert_exists+truncate(0) is different from plain truncate(0).
>
> Thanks, I forgot about that. My intent was to ensure an existing
> object would be truncated to zero but not create an object if it
> doesn't exist.
>
> > > Obviously the data is still consistent either way but it would be good
> > > to match.
> >
> > I thought about this, but decided to drop reverse mapping and overlap
> > calculation code from rbd_obj_setup_discard() simply because it would
> > have been semi-matching at best.  It also seems clearer to me: in the
> > zeroout case we worry about copyups and parent blocks, in the discard
> > case we don't.  Special casing whole-object discards felt like mixing
> > them up without a good reason.
>
> I was only minimally concerned about the case where the copy-up had
> already occurred so your delete-on-discard would now re-expose the
> parent object extent (e.g. parent has all 'A's, child had all 'B's, so
> the discard switches the child to all 'A's instead of zeroes).
> Effectively, a full object discard will cause your object extent to go
> back-in-time.

Well, returning stale (or random) data from the discarded region
technically doesn't violate any specs.  On a second thought, I think
you are right though -- I only dropped create+truncate because I wanted
to avoid having to calculate the reverse mapping for mostly hand wavy
reasons that definitely don't stand against the sneakiness of going
back in time.  I'll put the "delete or create+truncate" logic back
tomorrow.

Thanks,

                Ilya



[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux