Re: per-pool bluestore options

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 18 Aug 2016, Igor Fedotov wrote:
> On 17.08.2016 19:27, Sage Weil wrote:
> > On Mon, 15 Aug 2016, Igor Fedotov wrote:
> > > On 13.08.2016 21:56, Sage Weil wrote:
> > > > Hi Igor,
> > > > 
> > > > I took another look at
> > > > 
> > > > 	https://github.com/ceph/ceph/pull/10556
> > > > 
> > > > You define three settings:
> > > > 
> > > >    compress_hint - determines if pool contains compressible /
> > > > incompressible
> > > > data
> > > >    compress_algorithm - permits to specify different compression
> > > > algorithm
> > > >    compress_ratio - specifies maximum compression ratio
> > > > 
> > > > I think we should extend this to include csum-related options.  And use
> > > > a
> > > > consistent naming scheme that aligns with the config options where we
> > > > just
> > > > strip off the bluestore_ prefix.  The relevant options are:
> > > Sounds good!
> > > The major questions here is how do these per-pool options correlate with
> > > corresponding per-store ones?
> > > One might suggest per-pool options to have higher priority over per-store
> > > one.
> > > But I'm not sure that the best option.  Sometimes user might want to
> > > disable/alter corresponding option without enumerating all the pools by
> > > simple
> > > switching at per-store level. Hence we need to consider some means for
> > > that.
> > 
> > > >    bluestore_csum = {true, false}
> > > >    bluestore_csum_type = {crc32c, crc32c_{8,16}, ...}
> > > >    bluestore_csum_min_chunk_size = 4k      (*)
> > > >    bluestore_csum_max_chunk_size = 64k     (*)
> > > >    bluestore_compression = {force, aggressive, passive, none}
> > > Actually this option along with compression_hint result in a single flag:
> > > compress or not. Any rationale for not using that simple flag?
> > There are currently two persistent hints: compressible and incompressible.
> > Aggressive will compress unless incompressible, whereas passive will
> > compress if compressible.  Either way it varies per object, though, so a
> > single flag isn't sufficient.
> So final approach is to have bluestore_compression switch at pool level (and
> per-store too) and compression_hint at object one, right? Hence we don't need
> compression_hint per pool. In fact my original point was that the latter(along
> with bluestore_compression) is IMHO an overkill and can be substituted with
> simple enable_compression flag.

I think so.  Having a pool compression_hint = compressible will accomplish 
the same thing as setting the pool compression = aggressive.

> > > My original idea here was to have bluestore_compression on per-store basis
> > > and
> > > compression_hint of per pool one. This way one can receive pretty flexible
> > > control at both storage and pool level - see my question above.
> > Yeah, I'm not sure how complex it's worth getting.  To get complete
> > control, we probably need a default *and* the min/max allowed range for
> > each option in order to bound what you can choose per-pool, but I think
> > that is probably overkill.
> > 
> > A bit easier would be to have
> > 
> >   bluestore_csum_override = {,true,false}
> >   bluestore_csum_type_override = {, crc32c, crc32c_{8,16}}
> >   bluestore_compression_override = {, force, aggressive, passive, none}
> > 
> > and have them blank by default (no override).  For the numeric options
> > we'd need to use 0 to mean 'no override'.
> > 
> > What do you think?
> IMO per-pool settings (if set) have to override corresponding per-store 
> one by default. But one should be able to turn off that behavior with 
> 'one-click' and force specific per-store setting to prevail.
> Hence we don't need 'override' settings at per-pool level but should have one
> at per-store one. E.g.
> 
> Pool A has:
> bluestore_csum = true
> Pool B:
> bluestore_csum=<not set>
> OSD:
> bluestore_csum=  false [/force]
> 
> if /'force' is omitted - actual settings are
> Pool A:
> bluestore_csum=true
> Pool B:
> bluestore_csum=false
> 
> else
> Pool A:
> bluestore_csum=false
> Pool B:
> bluestore_csum=false

Hmm, this makes parsing more complicated but is probably less confusing.  
But what about numeric values?  

 bluestore_compression_max_blob_size=1048576/force?

This is partly why I like the (implementation) simplicity of

 bluestore_compression_max_blob_size_force=1048576

(or _override, whatever).

> An additional 'bluestore_force_all_settings=true' can be introduced to force
> all per-store settings to prevail too.

Or maybe

 bluestore_ignore_per_collection_settings=true
or the converse
 bluestore_per_collection_settings=false

to be precise?

> > > >    bluestore_compression_algorithm = {snappy, zlib, ...}
> > > >    bluestore_compression_min_blob_size = 256k
> > > >    bluestore_compression_max_blob_size = 4M
> > > >    bluestore_compression_required_ratio = .875
> > > > 
> > > > (*) These currently have a different name but aren't used yet.  Working
> > > > on
> > > > a PR to change that.
> > > > 
> > > > What's missing is your 'compress_hint'.  We can call that
> > > > 'compression_hint' to align with the names above?
> > > > 
> > > >    compression_hint = {compressible, incompressible, ...}
> > > > 
> > > > The main changes from your PR that I think we need to make are:
> > > > 
> > > > * These options should be part of the pool_opts_t structure in pg_pool_t
> > > > (which is a set of optional key/value-like parameters for the pool).
> > > > 
> > > > * We can add a new ObjectStore operation that passes down parameters for
> > > > a
> > > > collection, and have the OSD pass these all in for each PG collection
> > > > when
> > > > the pool properties change.  That way ObjectStore doesn't need to
> > > > persist
> > > > these options at all--just store the ones it understands in memory, and
> > > > the OSD will always reset them on startup etc.
> > > One, probably silly, question here - do pool and collection have 1 to 1
> > > relation? It seemed to me that they don't and hence we can't store
> > > per-pool
> > > settings at collection level without some additional mapping: pool ->
> > > setting.
> > > Also this requires some means to remove pool settings entry when pool goes
> > > away...
> > It's 1:1 mapping of PG to collection, but lots of PGs per pool.  So when
> > the OSD gets a pool change, it'll set the same flags for all local PGs in
> > that pool.  A bit of duplication, but it keeps the interface
> > uncomplicated (no need to teach ObjectStore about pools).
> And PG belongs to single pool only, right?

Right.

sage
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux