RE: Adding compression support for bluestore.

As described earlier, we can easily afford the cost of setting min_alloc_size to 4KB. I don't see any advantage in handling larger allocation sizes -- only disadvantages.
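
To put rough numbers on the rounding loss mentioned in the quoted message below (on average, 1/2 of min_alloc_size per compressed block), here is a minimal sketch. The function names are hypothetical, not BlueStore code, and it assumes compressed lengths are spread uniformly over the block size, which real data will not match exactly.

```python
KIB = 1024


def allocated_bytes(compressed_len: int, min_alloc_size: int) -> int:
    """Round a compressed blob up to a whole number of allocation units."""
    units = -(-compressed_len // min_alloc_size)  # ceiling division
    return units * min_alloc_size


def average_waste_fraction(compression_block: int, min_alloc_size: int) -> float:
    """Average rounding loss as a fraction of the uncompressed block size.

    Assumption (illustrative only): every compressed length from 1 byte up
    to compression_block is equally likely.
    """
    total_waste = 0
    for compressed_len in range(1, compression_block + 1):
        total_waste += allocated_bytes(compressed_len, min_alloc_size) - compressed_len
    average_waste = total_waste / compression_block
    return average_waste / compression_block


if __name__ == "__main__":
    block = 256 * KIB
    for alloc in (64 * KIB, 32 * KIB, 4 * KIB):
        frac = average_waste_fraction(block, alloc)
        print(f"min_alloc_size={alloc // KIB}KB: average loss {frac:.2%} of a {block // KIB}KB block")
```

Under that assumption, a 64KB min_alloc_size gives back roughly 12.5% of each 256KB compression block to rounding, while a 4KB min_alloc_size loses under 1%.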

Allen Samuels
Software Architect, Fellow, Systems and Software Solutions 

2880 Junction Avenue, San Jose, CA 95134
T: +1 408 801 7030| M: +1 408 780 6416
allen.samuels@xxxxxxxxxxx


> -----Original Message-----
> From: Sage Weil [mailto:sage@xxxxxxxxxxxx]
> Sent: Wednesday, March 16, 2016 2:15 PM
> To: Allen Samuels <Allen.Samuels@xxxxxxxxxxx>
> Cc: Igor Fedotov <ifedotov@xxxxxxxxxxxx>; ceph-devel <ceph-
> devel@xxxxxxxxxxxxxxx>
> Subject: RE: Adding compression support for bluestore.
> 
> On Wed, 16 Mar 2016, Allen Samuels wrote:
> > > A potential issue with using WAL for compressed block overwrites is
> > > a significant increase in WAL data volume. IIUC, currently a WAL
> > > record can have up to 2*bluestore_min_alloc_size (i.e. 128KB) of
> > > client data per single write request (overlapped head and tail).
> > > In the case of compressed blocks this will be up to
> > > 2*bluestore_max_compressed_block (i.e. 8MB), as you can't simply
> > > overwrite fully overlapped extents - one has to operate on whole
> > > compression blocks now...
> > >
> > > Seems attractive otherwise...
> >
> > This is one of the fundamental tradeoffs with compression. When your
> > compression block size exceeds the minimum I/O size you either have to
> > consume time (RMW + uncompress/recompress) or you have to consume
> > space (overlapping extents). Sage's current code essentially starts out
> > by consuming space and then assumes in the background that he'll
> > consume time to recover the space.
> > Of course if you set the compression block size equal to or smaller
> > than the minimum I/O size you can avoid these problems -- but you create
> > others (including poor compression, needing to track very small chunks
> > of space, etc.) and nobody seriously believes that this is a viable
> > alternative.
> 
> My inclination would be to set min_alloc_size to something smallish (if not
> 64KB, then 32KB perhaps) and the compression_block to something also
> reasonable (256KB or 512KB at most).  That means you lose some of the
> savings (on average, 1/2 of min_alloc_size) which is more significant if
> compression_block is not >> min_alloc_size, but it avoids the expensive
> r/m/w cases and big read + decompress for a small read request...
> 
> sage
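
For reference, the worst-case WAL arithmetic quoted above can be sketched as follows. This is only an illustration: `worst_case_wal_bytes` is a hypothetical helper, the 64KB min_alloc_size comes from the 128K figure in the quote, and the 4MB maximum compressed block is inferred from the quoted 8MB figure rather than read from any config.

```python
KIB = 1024
MIB = 1024 * KIB


def worst_case_wal_bytes(unit_size: int) -> int:
    """Upper bound on client data journaled for a single write.

    A write can partially overlap one unit at its head and one at its
    tail, so at most two whole units go through the WAL. unit_size is
    min_alloc_size for uncompressed data, or the compression block size
    when the overwritten extents are compressed.
    """
    return 2 * unit_size


if __name__ == "__main__":
    uncompressed = worst_case_wal_bytes(64 * KIB)  # assumed min_alloc_size
    compressed = worst_case_wal_bytes(4 * MIB)     # assumed max compressed block
    print(f"uncompressed: {uncompressed // KIB}KB per write")
    print(f"compressed:   {compressed // MIB}MB per write")
    print(f"increase:     {compressed // uncompressed}x")
```

With those assumed sizes the per-write WAL bound grows by a factor of 64, which is the volume increase the quoted concern is about; a smaller compression block shrinks the bound proportionally.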