Re: per-pool bluestore options

On 18.08.2016 18:26, Sage Weil wrote:
On Thu, 18 Aug 2016, Igor Fedotov wrote:
On 17.08.2016 19:27, Sage Weil wrote:
On Mon, 15 Aug 2016, Igor Fedotov wrote:
On 13.08.2016 21:56, Sage Weil wrote:
Hi Igor,

I took another look at

	https://github.com/ceph/ceph/pull/10556

You define three settings:

    compress_hint - determines if pool contains compressible / incompressible data
    compress_algorithm - permits specifying a different compression algorithm
    compress_ratio - specifies maximum compression ratio

I think we should extend this to include csum-related options.  And use a
consistent naming scheme that aligns with the config options where we just
strip off the bluestore_ prefix.  The relevant options are:
Sounds good!
The major question here is how these per-pool options correlate with the
corresponding per-store ones.
One might suggest that per-pool options take priority over per-store ones.
But I'm not sure that's the best option.  Sometimes a user might want to
disable/alter the corresponding option without enumerating all the pools, by
simply switching it at the per-store level. Hence we need to consider some
means for that.
    bluestore_csum = {true, false}
    bluestore_csum_type = {crc32c, crc32c_{8,16}, ...}
    bluestore_csum_min_chunk_size = 4k      (*)
    bluestore_csum_max_chunk_size = 64k     (*)
    bluestore_compression = {force, aggressive, passive, none}
Actually this option along with compression_hint results in a single flag:
compress or not. Any rationale for not using that simple flag?
There are currently two persistent hints: compressible and incompressible.
Aggressive will compress unless incompressible, whereas passive will
compress if compressible.  Either way it varies per object, though, so a
single flag isn't sufficient.
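
A minimal sketch of the per-object decision described above, assuming the mode comes from bluestore_compression and the hint is stored per object. The function name should_compress is hypothetical, and the behavior of 'force' (always compress) and 'none' (never compress) is an assumption inferred from the option names; only 'aggressive' and 'passive' are spelled out in this thread.

```python
from typing import Optional


def should_compress(mode: str, hint: Optional[str]) -> bool:
    """Decide whether to attempt compression for one object write.

    mode: one of 'force', 'aggressive', 'passive', 'none'.
    hint: 'compressible', 'incompressible', or None when no hint is set.
    """
    if mode == "force":
        return True  # assumption: hints are ignored
    if mode == "aggressive":
        return hint != "incompressible"  # compress unless told not to
    if mode == "passive":
        return hint == "compressible"  # compress only when told to
    if mode == "none":
        return False  # assumption: hints are ignored
    raise ValueError(f"unknown compression mode: {mode!r}")
```

With no hint on the object, 'aggressive' compresses and 'passive' does not, which is why the outcome varies per object and cannot be captured by a single per-pool flag.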
So the final approach is to have the bluestore_compression switch at the pool
level (and per-store too) and compression_hint at the object level, right?
Hence we don't need compression_hint per pool. In fact my original point was
that the latter (along with bluestore_compression) is IMHO overkill and can be
substituted with a simple enable_compression flag.
I think so.  Having a pool compression_hint = compressible will accomplish
the same thing as setting the pool compression = aggressive.
OK, sounds good!
Still, there is an open question of how to assign such a hint to objects from the user's perspective....

My original idea here was to have bluestore_compression on a per-store basis
and compression_hint on a per-pool one. This way one gets pretty flexible
control at both the store and pool level - see my question above.
Yeah, I'm not sure how complex it's worth getting.  To get complete
control, we probably need a default *and* the min/max allowed range for
each option in order to bound what you can choose per-pool, but I think
that is probably overkill.

A bit easier would be to have

   bluestore_csum_override = {,true,false}
   bluestore_csum_type_override = {, crc32c, crc32c_{8,16}}
   bluestore_compression_override = {, force, aggressive, passive, none}

and have them blank by default (no override).  For the numeric options
we'd need to use 0 to mean 'no override'.

What do you think?
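
A rough sketch of how the *_override idea could resolve a value, assuming enum and boolean options are carried as strings and numeric options as integers. The function name apply_override is hypothetical; the sentinels (blank string, 0) are the ones proposed above.

```python
from typing import Union

Value = Union[str, int]


def apply_override(pool_value: Value, override: Value) -> Value:
    """Return the per-store override unless it is the 'no override' sentinel.

    A blank string means 'no override' for string options; 0 means
    'no override' for numeric options.
    """
    if isinstance(override, bool):
        # False == 0 in Python, so reject bools rather than misread them.
        raise TypeError("pass booleans as the strings 'true' / 'false'")
    if override == "" or override == 0:
        return pool_value
    return override
```

One consequence of the numeric sentinel is that 0 itself can never be forced as a value, which is acceptable for sizes such as bluestore_compression_max_blob_size.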
IMO per-pool settings (if set) have to override the corresponding per-store
ones by default. But one should be able to turn off that behavior with
'one click' and force a specific per-store setting to prevail.
Hence we don't need 'override' settings at the per-pool level but should have
one at the per-store level. E.g.

Pool A:
bluestore_csum = true
Pool B:
bluestore_csum = <not set>
OSD:
bluestore_csum = false[/force]

If '/force' is omitted, the actual settings are
Pool A:
bluestore_csum = true
Pool B:
bluestore_csum = false

else
Pool A:
bluestore_csum = false
Pool B:
bluestore_csum = false
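
The example above can be sketched as follows, assuming the per-store value carries an optional '/force' suffix. The value/force syntax is only a proposal in this thread, and parse_store_value and effective_value are hypothetical helper names.

```python
from typing import Optional, Tuple


def parse_store_value(raw: str) -> Tuple[str, bool]:
    """Split 'false/force' into ('false', True) and 'false' into ('false', False)."""
    value, sep, suffix = raw.strip().partition("/")
    if sep and suffix.strip() != "force":
        raise ValueError(f"unexpected suffix in {raw!r}")
    return value.strip(), bool(sep)


def effective_value(pool_value: Optional[str], store_raw: str) -> str:
    """Per-pool value wins when set, unless the per-store value is forced."""
    store_value, forced = parse_store_value(store_raw)
    if forced or pool_value is None:
        return store_value
    return pool_value


# Pool A sets bluestore_csum = true, Pool B leaves it unset (None).
print(effective_value("true", "false"))        # Pool A, no force
print(effective_value(None, "false"))          # Pool B, no force
print(effective_value("true", "false/force"))  # Pool A, forced
```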
Hmm, this makes parsing more complicated but is probably less confusing.
But what about numeric values?

  bluestore_compression_max_blob_size=1048576/force?

This is partly why I like the (implementation) simplicity of

  bluestore_compression_max_blob_size_force=1048576

(or _override, whatever).
This way one gets parameter name duplication for each per-store parameter that has a per-pool clone. E.g.

 bluestore_compression_max_blob_size for default settings

and
 bluestore_compression_max_blob_size_force for override ones.

Can't say that's very elegant.
What about a single generic parameter that denotes the per-collection params to be banned, e.g.

bluestore_force_per_store_settings = *
bluestore_force_per_store_settings = bluestore_compression
bluestore_force_per_store_settings = bluestore_compression, bluestore_csum, bluestore_csum_type


or bluestore_disable_per_collection_settings = a,b,c ?

An alternative per-collection enable option seems a bit less convenient here IMHO.
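
A sketch of how the single generic parameter could be interpreted, assuming its value is either '*' or a comma-separated list of option names. bluestore_force_per_store_settings is only a proposed name here, and is_forced and resolve are hypothetical helpers.

```python
from typing import Dict


def is_forced(name: str, force_raw: str) -> bool:
    """True if the per-store value of `name` must win over per-collection ones."""
    raw = force_raw.strip()
    if raw == "*":
        return True
    forced = {item.strip() for item in raw.split(",") if item.strip()}
    return name in forced


def resolve(name: str, store: Dict[str, str], pool: Dict[str, str],
            force_raw: str) -> str:
    """Pick the per-pool value when present and not banned, else the per-store one."""
    if name in pool and not is_forced(name, force_raw):
        return pool[name]
    return store[name]
```

An empty value bans nothing, so per-pool settings keep their default priority; a single '*' restores pure per-store behavior without enumerating pools.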

An additional 'bluestore_force_all_settings=true' can be introduced to force
all per-store settings to prevail too.
Or maybe

  bluestore_ignore_per_collection_settings=true
or the converse
  bluestore_per_collection_settings=false

to be precise?

    bluestore_compression_algorithm = {snappy, zlib, ...}
    bluestore_compression_min_blob_size = 256k
    bluestore_compression_max_blob_size = 4M
    bluestore_compression_required_ratio = .875

(*) These currently have a different name but aren't used yet.  Working on
a PR to change that.

What's missing is your 'compress_hint'.  We can call that
'compression_hint' to align with the names above?

    compression_hint = {compressible, incompressible, ...}

The main changes from your PR that I think we need to make are:

* These options should be part of the pool_opts_t structure in pg_pool_t
(which is a set of optional key/value-like parameters for the pool).

* We can add a new ObjectStore operation that passes down parameters for a
collection, and have the OSD pass these all in for each PG collection when
the pool properties change.  That way ObjectStore doesn't need to persist
these options at all--just store the ones it understands in memory, and
the OSD will always reset them on startup etc.
One, probably silly, question here - do pool and collection have a 1-to-1
relation? It seemed to me that they don't, and hence we can't store per-pool
settings at the collection level without some additional mapping: pool ->
setting.
Also this requires some means to remove the pool settings entry when a pool
goes away...
It's 1:1 mapping of PG to collection, but lots of PGs per pool.  So when
the OSD gets a pool change, it'll set the same flags for all local PGs in
that pool.  A bit of duplication, but it keeps the interface
uncomplicated (no need to teach ObjectStore about pools).
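
The fan-out described above can be sketched like this, with an in-memory stand-in for the store. InMemoryStore, set_collection_opts, and on_pool_change are hypothetical names, not the actual ObjectStore interface, and the sketch assumes PG ids of the form '<pool>.<seed>'.

```python
from typing import Dict, Iterable


class InMemoryStore:
    """Hypothetical stand-in for an ObjectStore; options live in memory only."""

    # Options this store understands; anything else is silently dropped.
    KNOWN = {"csum_type", "compression", "compression_algorithm"}

    def __init__(self) -> None:
        self.collection_opts: Dict[str, Dict[str, str]] = {}

    def set_collection_opts(self, cid: str, opts: Dict[str, str]) -> None:
        self.collection_opts[cid] = {
            k: v for k, v in opts.items() if k in self.KNOWN
        }


def on_pool_change(store: InMemoryStore, pool_id: int,
                   pool_opts: Dict[str, str],
                   local_pgs: Iterable[str]) -> None:
    """Push the pool's options to every local PG collection in that pool."""
    for pgid in local_pgs:
        if pgid.split(".", 1)[0] == str(pool_id):
            store.set_collection_opts(pgid, pool_opts)
```

Because the store never persists these options, the OSD would call something like on_pool_change for every pool on startup as well as on each pool update, and the store needs no pool-to-setting mapping of its own.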
And a PG belongs to a single pool only, right?
Right.

sage
