Re: per-pool bluestore options

On Mon, 15 Aug 2016, Igor Fedotov wrote:
> On 13.08.2016 21:56, Sage Weil wrote:
> > Hi Igor,
> > 
> > I took another look at
> > 
> > 	https://github.com/ceph/ceph/pull/10556
> > 
> > You define three settings:
> > 
> >   compress_hint - determines if the pool contains compressible / incompressible data
> >   compress_algorithm - permits specifying a different compression algorithm
> >   compress_ratio - specifies the maximum compression ratio
> > 
> > I think we should extend this to include csum-related options.  And use a
> > consistent naming scheme that aligns with the config options where we just
> > strip off the bluestore_ prefix.  The relevant options are:
> Sounds good!
> The major question here is how these per-pool options correlate with the
> corresponding per-store ones?
> One might suggest that per-pool options have higher priority over per-store ones.
> But I'm not sure that's the best option.  Sometimes a user might want to
> disable/alter the corresponding option without enumerating all the pools, by
> simply switching it at the per-store level. Hence we need to consider some means for that.


> >   bluestore_csum = {true, false}
> >   bluestore_csum_type = {crc32c, crc32c_{8,16}, ...}
> >   bluestore_csum_min_chunk_size = 4k      (*)
> >   bluestore_csum_max_chunk_size = 64k     (*)
> >   bluestore_compression = {force, aggressive, passive, none}
> Actually this option along with compression_hint results in a single flag:
> compress or not. Any rationale for not using that simple flag?

There are currently two persistent hints: compressible and incompressible.
Aggressive will compress unless the object is hinted incompressible, whereas
passive will compress only if it is hinted compressible.  Either way the
decision varies per object, so a single per-pool flag isn't sufficient.
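
Roughly, the decision ends up being a function of both the mode and the
per-object hint, something like this (illustrative C++ only; the names are
made up and this is not the actual BlueStore code):

  #include <cstdint>

  enum class CompressionMode { NONE, PASSIVE, AGGRESSIVE, FORCE };

  // hypothetical per-object hint flags
  const uint32_t FLAG_COMPRESSIBLE   = 1u << 0;
  const uint32_t FLAG_INCOMPRESSIBLE = 1u << 1;

  bool should_compress(CompressionMode mode, uint32_t object_flags) {
    switch (mode) {
    case CompressionMode::FORCE:      return true;
    case CompressionMode::AGGRESSIVE: return !(object_flags & FLAG_INCOMPRESSIBLE);
    case CompressionMode::PASSIVE:    return (object_flags & FLAG_COMPRESSIBLE) != 0;
    default:                          return false;
    }
  }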

> My original idea here was to have bluestore_compression on a per-store basis and
> compression_hint on a per-pool one. This way one gets pretty flexible
> control at both the store and pool level - see my question above.

Yeah, I'm not sure how complex it's worth getting.  To get complete 
control, we probably need a default *and* the min/max allowed range for 
each option in order to bound what you can choose per-pool, but I think 
that is probably overkill.

A bit easier would be to have

 bluestore_csum_override = {,true,false}
 bluestore_csum_type_override = {, crc32c, crc32c_{8,16}}
 bluestore_compression_override = {, force, aggressive, passive, none}

and have them blank by default (no override).  For the numeric options 
we'd need to use 0 to mean 'no override'.
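
To illustrate the idea (hypothetical helper, not existing code): an empty
value means defer to the pool, anything else wins:

  #include <cstdint>
  #include <string>

  // empty string / zero means "no override"; a non-empty override wins
  // over whatever the pool asked for
  std::string resolve(const std::string& override_val, const std::string& pool_val) {
    return override_val.empty() ? pool_val : override_val;
  }
  uint64_t resolve(uint64_t override_val, uint64_t pool_val) {
    return override_val == 0 ? pool_val : override_val;
  }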

What do you think?

> >   bluestore_compression_algorithm = {snappy, zlib, ...}
> >   bluestore_compression_min_blob_size = 256k
> >   bluestore_compression_max_blob_size = 4M
> >   bluestore_compression_required_ratio = .875
> > 
> > (*) These currently have a different name but aren't used yet.  Working on
> > a PR to change that.
> > 
> > What's missing is your 'compress_hint'.  We can call that
> > 'compression_hint' to align with the names above?
> > 
> >   compression_hint = {compressible, incompressible, ...}
> > 
> > The main changes from your PR that I think we need to make are:
> > 
> > * These options should be part of the pool_opts_t structure in pg_pool_t
> > (which is a set of optional key/value-like parameters for the pool).
> > 
> > * We can add a new ObjectStore operation that passes down parameters for a
> > collection, and have the OSD pass these all in for each PG collection when
> > the pool properties change.  That way ObjectStore doesn't need to persist
> > these options at all--just store the ones it understands in memory, and
> > the OSD will always reset them on startup etc.
> One, probably silly, question here - do pool and collection have a 1-to-1
> relation? It seemed to me that they don't, and hence we can't store per-pool
> settings at the collection level without some additional mapping: pool -> setting.
> Also this requires some means to remove the pool settings entry when the pool
> goes away...

It's a 1:1 mapping of PG to collection, but there are lots of PGs per pool.  So
when the OSD gets a pool change, it'll set the same flags for all local PGs in
that pool.  A bit of duplication, but it keeps the interface
uncomplicated (no need to teach ObjectStore about pools).
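
Roughly like this (a sketch only; the types and method names are hypothetical):

  #include <cstdint>
  #include <map>
  #include <string>
  #include <vector>

  struct PoolOpts { std::map<std::string, std::string> kv; };

  struct PGStub {
    int64_t pool_id;
    // would queue the new ObjectStore op against this PG's collection
    void set_collection_opts(const PoolOpts& opts) {}
  };

  void on_pool_change(int64_t pool, const PoolOpts& opts,
                      std::vector<PGStub*>& local_pgs) {
    for (PGStub* pg : local_pgs)
      if (pg->pool_id == pool)
        pg->set_collection_opts(opts);  // same opts duplicated per PG
  }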

> And another question - is there any means to receive a notification when a pool
> setting changes, similar to the one we have for OSD settings? We need that to
> trigger the new ObjectStore operation to update the pool settings.

There is already some machinery to do this (the pg_pool_t struct has an 
epoch field).  See void PGPool::update(OSDMapRef map) in PG.cc.
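
The pattern is basically: cache the pool's change epoch and react only when it
moves.  Something like this sketch (struct and field names are illustrative,
not the exact pg_pool_t layout):

  struct pool_info_stub { unsigned last_change = 0; /* ...pool options... */ };

  struct pgpool_stub {
    unsigned cached_epoch = 0;
    void update(const pool_info_stub& pi) {
      if (pi.last_change > cached_epoch) {
        cached_epoch = pi.last_change;
        // pool options changed: here we'd issue the new ObjectStore op
        // to refresh the collection settings for the local PGs
      }
    }
  };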

sage