On Mon, 15 Aug 2016, Igor Fedotov wrote:
> On 13.08.2016 21:56, Sage Weil wrote:
> > Hi Igor,
> >
> > I took another look at
> >
> >   https://github.com/ceph/ceph/pull/10556
> >
> > You define three settings:
> >
> >   compress_hint - determines whether a pool contains compressible or
> >     incompressible data
> >   compress_algorithm - selects the compression algorithm
> >   compress_ratio - specifies the maximum compression ratio
> >
> > I think we should extend this to include csum-related options, and use
> > a consistent naming scheme that aligns with the config options, where
> > we just strip off the bluestore_ prefix. The relevant options are:
> Sounds good!
> The major question here is how these per-pool options correlate with the
> corresponding per-store ones.
> One might suggest that per-pool options take priority over per-store
> ones, but I'm not sure that is the best option. Sometimes a user might
> want to disable or alter an option without enumerating all the pools,
> by simply switching it at the per-store level. Hence we need to consider
> some means for that.
> > bluestore_csum = {true, false}
> > bluestore_csum_type = {crc32c, crc32c_{8,16}, ...}
> > bluestore_csum_min_chunk_size = 4k (*)
> > bluestore_csum_max_chunk_size = 64k (*)
> > bluestore_compression = {force, aggressive, passive, none}
> Actually this option along with compression_hint results in a single
> flag: compress or not. Any rationale for not using that simple flag?

There are currently two persistent hints: compressible and incompressible.
Aggressive will compress unless the hint is incompressible, whereas
passive will compress only if it is compressible. Either way the hint
varies per object, so a single flag isn't sufficient.

> My original idea here was to have bluestore_compression on a per-store
> basis and compression_hint on a per-pool one. This way one gets pretty
> flexible control at both the storage and the pool level - see my
> question above.

Yeah, I'm not sure how complex it's worth getting. For complete control,
we would probably need a default *and* a min/max allowed range for each
option in order to bound what you can choose per-pool, but I think that
is probably overkill. A bit easier would be to have

  bluestore_csum_override = {, true, false}
  bluestore_csum_type_override = {, crc32c, crc32c_{8,16}}
  bluestore_compression_override = {, force, aggressive, passive, none}

and have them blank by default (no override). For the numeric options
we'd need to use 0 to mean 'no override'.

What do you think?
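Roughly, the resolution could look like this (untested sketch; the
struct and helper names below are just placeholders, not actual
BlueStore code):

  // Untested sketch: picking the effective value of an option from the
  // store default, the per-pool choice, and the store-wide override.
  #include <cstdint>
  #include <string>

  struct StoreConf {
    std::string compression = "passive";    // bluestore_compression
    std::string compression_override;       // "" means no override
    uint64_t csum_max_chunk_size = 64 * 1024;
    uint64_t csum_max_chunk_size_override = 0;  // 0 means no override
  };

  // The override (if set) beats the pool choice; the pool choice (if
  // set) beats the store default.
  std::string effective_compression(const StoreConf& conf,
                                    const std::string& pool_choice) {
    if (!conf.compression_override.empty())
      return conf.compression_override;  // store-wide switch wins
    if (!pool_choice.empty())
      return pool_choice;                // per-pool setting
    return conf.compression;             // store default
  }

  // Same shape for the numeric options, with 0 as 'no override'.
  uint64_t effective_csum_max_chunk(const StoreConf& conf,
                                    uint64_t pool_choice) {
    if (conf.csum_max_chunk_size_override != 0)
      return conf.csum_max_chunk_size_override;
    if (pool_choice != 0)
      return pool_choice;
    return conf.csum_max_chunk_size;
  }

That keeps the common case a plain per-pool setting, while still giving
the admin a store-wide switch that doesn't require touching every pool.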
> > bluestore_compression_algorithm = {snappy, zlib, ...}
> > bluestore_compression_min_blob_size = 256k
> > bluestore_compression_max_blob_size = 4M
> > bluestore_compression_required_ratio = .875
> >
> > (*) These currently have a different name but aren't used yet. Working
> > on a PR to change that.
> >
> > What's missing is your 'compress_hint'. We can call that
> > 'compression_hint' to align with the names above:
> >
> >   compression_hint = {compressible, incompressible, ...}
> >
> > The main changes from your PR that I think we need to make are:
> >
> > * These options should be part of the pool_opts_t structure in
> >   pg_pool_t (which is a set of optional key/value-like parameters for
> >   the pool).
> >
> > * We can add a new ObjectStore operation that passes down parameters
> >   for a collection, and have the OSD pass these all in for each PG
> >   collection when the pool properties change. That way ObjectStore
> >   doesn't need to persist these options at all: it just keeps the ones
> >   it understands in memory, and the OSD will always reset them on
> >   startup etc.
> One, probably silly, question here - do a pool and a collection have a
> 1:1 relation? It seemed to me that they don't, and hence we can't store
> per-pool settings at the collection level without some additional
> mapping: pool -> setting.
> Also this requires some means to remove a pool's settings entry when the
> pool goes away...

It's a 1:1 mapping of PG to collection, but there are lots of PGs per
pool. So when the OSD sees a pool change, it'll set the same flags for
all local PGs in that pool. A bit of duplication, but it keeps the
interface uncomplicated (no need to teach ObjectStore about pools).

> And another question - is there any means to receive a notification when
> a pool setting changes, similar to the one we have for OSD settings? We
> need that to trigger the new ObjectStore operation that updates the pool
> settings.

There is already some machinery to do this (the pg_pool_t struct has an
epoch field). See void PGPool::update(OSDMapRef map) in PG.cc.
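To make that flow concrete, here is a self-contained toy of the idea
(not actual Ceph code; PoolInfo, PoolOpts and set_collection_opts()
stand in for pg_pool_t, pool_opts_t and the proposed ObjectStore
operation):

  #include <cstdint>
  #include <iostream>
  #include <map>
  #include <string>
  #include <vector>

  struct PoolOpts {                     // stand-in for pool_opts_t
    std::string compression_hint;       // e.g. "compressible"
    std::string compression_algorithm;  // e.g. "snappy"
  };

  struct PoolInfo {                     // stand-in for pg_pool_t
    uint64_t last_change = 0;           // epoch of the last pool change
    PoolOpts opts;
  };

  // Stand-in for the proposed op: set options on one collection.
  void set_collection_opts(const std::string& coll, const PoolOpts& opts) {
    std::cout << coll << " <- " << opts.compression_algorithm << "\n";
  }

  struct Osd {
    std::map<uint64_t, uint64_t> applied_epoch;              // pool -> epoch pushed
    std::map<uint64_t, std::vector<std::string>> local_pgs;  // pool -> PG colls

    // Called when a new OSDMap arrives; PGPool::update() plays this
    // role in the real code.
    void on_pool_update(uint64_t pool_id, const PoolInfo& info) {
      if (info.last_change <= applied_epoch[pool_id])
        return;  // pool options unchanged since we last pushed them
      // 1:1 PG-to-collection mapping, so push the same opts to every
      // local PG collection of this pool (some duplication, by design).
      for (const auto& coll : local_pgs[pool_id])
        set_collection_opts(coll, info.opts);
      applied_epoch[pool_id] = info.last_change;
    }
  };

  int main() {
    Osd osd;
    osd.local_pgs[1] = {"1.0_head", "1.1_head"};
    PoolInfo p{5, {"compressible", "snappy"}};
    osd.on_pool_update(1, p);  // pushes opts to both collections
    osd.on_pool_update(1, p);  // no-op: epoch hasn't advanced
  }

The duplication per PG is deliberate: ObjectStore only ever sees
collections, never pools.

sage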