Hello,

Ceph 0.94.5, for the record.

As some may remember, I phased in a 2TB cache tier 5 weeks ago.
By now it has reached about 60% usage, which is what I have
cache_target_dirty_ratio set to. And for the last 3 days I could see
some writes (op_in_bytes) to the backing storage (aka the HDD pool),
which hadn't seen any write activity for the aforementioned 5 weeks.

Alas, my graphite dashboard showed no flushes (tier_flush), whereas
tier_promote on the cache pool could always be matched more or less to
op_out_bytes on the HDD pool.

The documentation (RH site) just parrots the names of the various perf
counters, so no help there.

OK, let's look at what we've got:
---
    "tier_promote": 49776,
    "tier_flush": 0,
    "tier_flush_fail": 0,
    "tier_try_flush": 558,
    "tier_try_flush_fail": 0,
    "agent_flush": 558,
    "tier_evict": 0,
    "agent_evict": 0,
---

Lots of promotions, that's fine.
Not a single tier_flush, er, wot? So what does that counter denote then?

OK, clearly tier_try_flush and agent_flush are where the flushing is
actually recorded (in my test cluster they differ, as I have run that
one against the wall several times).
No evictions yet, that will happen at 90% usage.

So now I changed the graph data source for flushes to tier_try_flush;
however, that does not match most of the op_in_bytes (or any other
counter I tried!) on the HDDs. As in, there are flushes but no activity
on the HDD OSDs as far as Ceph seems to be concerned.

I can, however, match the flushes to actual disk activity on the HDDs
(gathered by collectd), which are otherwise totally dormant.

Can somebody shed some light on this? Is it a known problem, in need of
a bug report?

Christian
-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/
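
P.S. In case anyone wants to eyeball the same counters on their own
OSDs: below is a rough sketch of pulling them off the OSD admin socket
("ceph daemon osd.N perf dump") and printing them in graphite's
plaintext format. The OSD id, the metric path and the assumption that
these counters sit under the "osd" section of the perf dump are just
examples, adjust for your own setup and collector.
---
#!/usr/bin/env python
# Rough sketch: read cache-tier perf counters from one OSD's admin
# socket and print them as graphite plaintext ("<path> <value> <ts>").
import json
import subprocess
import time

OSD_ID = 0  # example OSD id, point this at a cache-tier OSD

def perf_dump(osd_id):
    """Return the parsed JSON from 'ceph daemon osd.N perf dump'."""
    out = subprocess.check_output(
        ["ceph", "daemon", "osd.%d" % osd_id, "perf", "dump"])
    return json.loads(out)

# Assumption: the tier_*/agent_* counters live under the "osd" section.
counters = perf_dump(OSD_ID).get("osd", {})
now = int(time.time())
for name in ("tier_promote", "tier_flush", "tier_try_flush",
             "agent_flush", "tier_evict", "agent_evict"):
    print("ceph.osd.%d.%s %s %d"
          % (OSD_ID, name, counters.get(name, 0), now))
---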