Re: A way to reduce compression overhead

On Tue, 15 Nov 2016, Igor Fedotov wrote:
> > > > I'm curious what you have in mind!  The blob_depth as currently
> > > > implemented is not terribly reliable...
> > > The general idea is to estimate the allocated vs. stored ratio for the
> > > blob(s) under the extent being written, where stored and allocated are
> > > measured in allocation units and can be calculated using the blobs'
> > > ref_maps.
> > > If that ratio is greater than 1 (+- some correction), we need to perform
> > > GC for these blobs. Given that we do this after compression
> > > preprocessing, it's expensive to merge the compressed extent being
> > > written with the old shards.
> > > Hence those shards are written as standalone extents, as opposed to the
> > > current implementation, where we try to merge both new and existing
> > > extents into a single entity. Not a big drawback IMHO. Evidently this is
> > > valid only for new compressed extents (which are AU aligned).
> > > Uncompressed ones can be merged in any fashion.
> > > This is just a draft, hence comments are highly appreciated.
> > Yeah, I think this is a more sensible approach (focusing on allocated vs
> > referenced).  It seems like the most straightforward thing to do is
> > actually look at the old_extents in the wctx--since those are the ref_maps
> > that will become less referenced than before--in order to identify which
> > blobs might need rewriting.  Avoiding the merge case vastly simplifies it.
> > There also isn't any persistent metadata that we have to maintain (that
> > might become incorrect or inconsistent).
> > 
> > We'd probably do the _do_write_data (which will do the various
> > punch_hole's), then check for any gc work, then do the final
> > _do_alloc_write and _wctx_finish?
> > 
> Sounds good. Still need a detailed consistent algorithm though - working on
> that.
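
For illustration only, the allocated-vs-referenced check discussed above might be sketched roughly as follows. This is not BlueStore code: the function names, the (offset, length) representation of a ref_map, the 4 KiB allocation unit, and the `slack` parameter (standing in for the "+- some correction") are all assumptions made for the example, as is the reading of "stored" as the number of allocation units the still-referenced bytes would need.

```python
def referenced_bytes(ref_map):
    """Sum the lengths of the still-referenced (offset, length) ranges.

    ref_map is a hypothetical stand-in for a blob's ref_map: a list of
    non-overlapping (offset, length) tuples.
    """
    return sum(length for _offset, length in ref_map)


def to_aus(nbytes, au_size):
    """Round a byte count up to whole allocation units (integer ceil)."""
    return -(-nbytes // au_size)


def needs_gc(allocated_bytes, ref_map, au_size=4096, slack=0.0):
    """Return True if the blob holds noticeably more AUs than it still needs.

    allocated_bytes: space the blob currently occupies on disk (assumed input).
    slack: hypothetical tolerance added to the threshold of 1.
    """
    allocated_aus = to_aus(allocated_bytes, au_size)
    stored_aus = to_aus(referenced_bytes(ref_map), au_size)
    if stored_aus == 0:
        # Nothing is referenced any more; the blob would simply be
        # released, so there is nothing to rewrite.
        return False
    return allocated_aus / stored_aus > 1.0 + slack
```

With these assumptions, a blob that occupies four allocation units but has only 4 KiB still referenced would be flagged, while a fully referenced blob would not.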

In the meantime, perhaps we should remove the blob_depth code for now so 
it doesn't end up in the on-disk format.

sage


