Re: Cache Tier configuration

Mateusz Skała <mateusz.skala@xxxxxxxxxxx> · Tue, 12 Jul 2016 11:01:30 +0200

Thank you for the reply. Answers below.

> -----Original Message-----
> From: Christian Balzer [mailto:chibi@xxxxxxx]
> Sent: Tuesday, July 12, 2016 3:37 AM
> To: ceph-users@xxxxxxxxxxxxxx
> Cc: Mateusz Skała <mateusz.skala@xxxxxxxxxxx>
> Subject: Re:  Cache Tier configuration
> 
> 
> Hello,
> 
> On Mon, 11 Jul 2016 16:19:58 +0200 Mateusz Skała wrote:
> 
> > Hello Cephers.
> >
> > Can someone help me with my cache tier configuration? I have 4 identical
> > 176GB SSD drives (184196208K) in the SSD pool; how do I determine
> > target_max_bytes?
> 
> What exact SSD models are these?
> What version of Ceph?

Intel DC S3610 (SSDSC2BX200G401), ceph version 9.2.1 (752b6a3020c3de74e07d2a8b4c5e48dab5a6b6fd) 

> > I assume
> > that it should be (4 drives * 188616916992 bytes) / 3 replicas =
> > 251489222656 bytes * 85% (because of the full disk warning).
> 
> In theory correct, but you might want to consider (like with all pools) the
> impact of losing a single SSD.
> In short, backfilling and then the remaining 3 getting full anyway.
> 

OK, so it is better to set target_max_bytes lower than the space I actually have? For example 170GB? Then I will have one OSD in reserve.
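
The sizing arithmetic from this exchange can be written down as a small Python sketch. The function name and the `reserve_osds` parameter are made up for illustration, the 85% figure is the warning level mentioned above, and the model assumes data spreads perfectly evenly across the OSDs, which real clusters do not quite achieve.

```python
def cache_target_max_bytes(osd_bytes, n_osds, replicas, full_pct=85, reserve_osds=0):
    """Rough upper bound for target_max_bytes (hypothetical helper).

    Assumes an even data spread; reserve_osds is the number of OSDs
    the pool should be able to lose and still stay under full_pct.
    """
    usable_osds = n_osds - reserve_osds
    if usable_osds < 1:
        raise ValueError("no OSDs left after subtracting the reserve")
    raw_bytes = osd_bytes * usable_osds       # raw capacity that may be counted on
    pool_bytes = raw_bytes // replicas        # capacity after replication
    return pool_bytes * full_pct // 100       # stay below the warning level


OSD_BYTES = 184196208 * 1024  # 184196208K per SSD, as quoted in this thread

no_reserve = cache_target_max_bytes(OSD_BYTES, n_osds=4, replicas=3)
one_reserve = cache_target_max_bytes(OSD_BYTES, n_osds=4, replicas=3, reserve_osds=1)

print(no_reserve)   # the figure from the original question
print(one_reserve)  # same calculation with one SSD held in reserve
```

Under these assumptions, holding one SSD in reserve gives roughly 160 * 10^9 bytes (about 149 GiB), which is lower than the 170GB suggested above.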

> > It will be 213765839257 bytes, ~200GB. I set it a little bit lower
> > (160GB) and after some time the whole cluster stopped with a full disk error.
> > One of the SSD drives was full. I see that space usage across the OSDs is not equal:
> >
> > ID WEIGHT   REWEIGHT  SIZE    USE   AVAIL  %USE  VAR PGS
> > 32 0.17099  1.00000   175G   127G 49514M 72.47 1.77  95
> > 42 0.17099  1.00000   175G   120G 56154M 68.78 1.68  90
> > 37 0.17099  1.00000   175G   136G 39670M 77.95 1.90 102
> > 47 0.17099  1.00000   175G   130G 46599M 74.09 1.80  97
> >
> 
> What's the exact error message?
> 
> None of these are over 85 or 95%, how are they full?

osd.37 was full at 96% when the error occurred (health ERR, 1 full osd). Then I set target_max_bytes to 100GB. Flushing reduced the used space and the cluster is now working OK, but I want to clarify my configuration.

> 
> If the above is a snapshot of when Ceph thinks something is "full", it may be
> an indication that you've reached target_max_bytes and Ceph simply has no
> clean (flushed) objects ready to evict.
> Which means a configuration problem (all ratios, not the defaults, for this
> pool please) or your cache filling up faster than it can flush.
> 
The snapshot above is from now, while the cluster is working OK. Filling faster than flushing is very possible: when the error occurred I had both "min recency for promote" settings at 1 in the config, like this:

    "osd_tier_default_cache_min_read_recency_for_promote": "1",
    "osd_tier_default_cache_min_write_recency_for_promote": "1",

I have now changed this to 3 and it looks like it is working: 3 days without a near-full OSD.
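
On the ratios asked about above: flushing and eviction in a cache pool are driven by cache_target_dirty_ratio and cache_target_full_ratio, both taken relative to target_max_bytes. The Python sketch below only illustrates that relationship. It assumes the stock values of 0.4 and 0.8 (this thread does not show what the pool actually uses), the helper name is hypothetical, and it treats the pool as one unit although Ceph evaluates these limits in smaller pieces, so real behaviour will be less smooth.

```python
def cache_thresholds(target_max_bytes, dirty_pct=40, full_pct=80):
    """Byte levels at which flushing and eviction would start (illustrative only).

    dirty_pct / full_pct stand in for cache_target_dirty_ratio and
    cache_target_full_ratio; 40 and 80 are assumed defaults, not values
    read from this cluster.
    """
    return {
        "flush_starts_at": target_max_bytes * dirty_pct // 100,
        "evict_starts_at": target_max_bytes * full_pct // 100,
    }


TARGET_BYTES = 182536110080  # target_bytes of the 'ssd' pool in this mail (170 GiB)

thresholds = cache_thresholds(TARGET_BYTES)
for name, value in thresholds.items():
    print(f"{name}: {value} bytes ({value / 2**30:.0f} GiB)")
```

If writes arrive faster than dirty objects can be flushed to the backing pool, the cache can grow past these levels, which would fit the "filling faster than flushing" explanation.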

> Space is never equal with Ceph, you need a high enough number of PGs for
> starters and then some fine-tuning.
> 
> After fiddling with the weights my cache-tier SSD OSDs are all very close to
> each other:
> ---
> ID WEIGHT  REWEIGHT SIZE  USE    AVAIL  %USE  VAR
> 18 0.64999  1.00000  679G   543G   136G 79.96 4.35
> 19 0.67000  1.00000  679G   540G   138G 79.61 4.33
> 20 0.64999  1.00000  679G   534G   144G 78.70 4.28
> 21 0.64999  1.00000  679G   536G   142G 79.03 4.30
> 26 0.62999  1.00000  679G   540G   138G 79.57 4.33
> 27 0.62000  1.00000  679G   538G   140G 79.30 4.32
> 28 0.67000  1.00000  679G   539G   140G 79.35 4.32
> 29 0.69499  1.00000  679G   536G   142G 78.96 4.30
> ---
In your snapshot the used space is nearly equal, only about 1% difference; I have nearly 10% difference in used space. Does it depend on the number of PGs, or maybe on the weights?
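
To put numbers on that comparison, here is a short Python sketch that uses only the %USE and PG-count columns from the two listings quoted in this mail (the last column of the four-OSD listing is taken to be the PG count). The variable and function names are arbitrary.

```python
# (osd id, %USE, PGs) for the four cache-tier SSDs listed earlier in this mail
small_cluster = [(32, 72.47, 95), (42, 68.78, 90), (37, 77.95, 102), (47, 74.09, 97)]

# %USE column of the eight reweighted OSDs in the quoted listing
tuned_cluster_use = [79.96, 79.61, 78.70, 79.03, 79.57, 79.30, 79.35, 78.96]


def spread(values):
    """Difference between the fullest and the emptiest OSD, in percentage points."""
    return max(values) - min(values)


small_spread = spread([use for _, use, _ in small_cluster])
tuned_spread = spread(tuned_cluster_use)

# How much %USE each PG contributes on every OSD
use_per_pg = {osd: use / pgs for osd, use, pgs in small_cluster}
total_pgs = sum(pgs for _, _, pgs in small_cluster)

print(f"spread, 4 SSDs: {small_spread:.2f} points")
print(f"spread, 8 tuned OSDs: {tuned_spread:.2f} points")
print(f"PG copies on the 4 SSDs: {total_pgs}")
for osd, ratio in use_per_pg.items():
    print(f"osd.{osd}: {ratio:.3f} %USE per PG")
```

The %USE per PG comes out almost identical on all four SSDs, which suggests the imbalance simply follows how many PGs each OSD received (384 PG copies = 128 PGs x 3 replicas over 4 OSDs). That would be consistent with the advice about PG count and weights, though it is only a reading of these four rows.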

> 
> >
> >
> > My setup:
> >
> > ceph --admin-daemon /var/run/ceph/ceph-osd.32.asok config show | grep
> > cache
> >
> >
> Nearly all of these are irrelevant, output of "ceph osd pool ls detail"
> please, at least for the cache pool.

ceph osd pool ls detail
pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 68565 flags hashpspool min_read_recency_for_promote 1 min_write_recency_for_promote 1 stripe_width 0
        removed_snaps [1~2,4~12,17~2e,46~ad,f9~2,fd~2,101~2]
pool 4 'ssd' replicated size 3 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 128 pgp_num 128 last_change 68913 flags hashpspool,incomplete_clones tier_of 5 cache_mode writeback target_bytes 182536110080 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 600s x6 stripe_width 0
        removed_snaps [1~3,6~2,9~2,d~8,17~6,1f~10,33~8,3f~a,4d~2,55~22,79~2]
pool 5 'sata' replicated size 3 min_size 1 crush_ruleset 2 object_hash rjenkins pg_num 128 pgp_num 128 last_change 68910 lfor 66807 flags hashpspool tiers 4 read_tier 4 write_tier 4 stripe_width 0
        removed_snaps [1~3,6~2,9~2,d~8,17~6,1f~10,33~8,3f~a,4d~2,55~22,79~2]

The 'ssd' pool is the cache tier for the 'sata' pool.

> 
> Have you read the documentation and my thread in this ML labeled "Cache
> tier operation clarifications"?

I have read the documentation and an Intel blog post (https://software.intel.com/en-us/blogs/2015/03/03/ceph-cache-tiering-introduction). I will now search for your thread and read it.

> 
> >
> > Can someone help? Any ideas? Is it normal that the whole cluster stops on a
> > disk full error in the cache tier? I was thinking that only that one pool
> > would stop and the others without a cache tier would still work.
> >
> Once you activate a cache-tier it becomes for all intents and purposes
> the pool it's caching for.
> So any problem with it will be fatal.

OK.

> 
> Christian
> --
> Christian Balzer        Network/Systems Engineer
> chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
> http://www.gol.com/

Thank you for your help.
Mateusz

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com