Re: Cache tier operation clarifications

Christian Balzer <chibi@xxxxxxx> · Mon, 7 Mar 2016 11:21:50 +0900

Hello,

I'd like to get some insights and confirmations from people here who are
either familiar with the code or have tested this more empirically than I
have (the VM/client node of my test cluster is currently pining for the
fjords).

When it comes to flushing/evicting we already established that this is
triggered by per-PG utilization, not by pool-wide utilization.
So for example in a pool with 1024GB capacity (set via target_max_bytes)
and 1024 PGs and a cache_target_dirty_ratio of 0.5, flushing will start
when the first PG reaches 512MB utilization.
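
To make that arithmetic concrete, here is a minimal Python sketch. It
assumes target_max_bytes is simply divided evenly among the PGs; the
function name and that even split are my assumptions for illustration, not
taken from the Ceph code.

```python
GiB = 1024 ** 3
MiB = 1024 ** 2


def per_pg_flush_threshold(target_max_bytes, pg_num, dirty_ratio):
    """Dirty bytes one PG may hold before flushing starts.

    Assumption: the pool-wide target is split evenly across all PGs.
    """
    per_pg_capacity = target_max_bytes / pg_num
    return int(per_pg_capacity * dirty_ratio)


# 1024 GiB cache pool, 1024 PGs, cache_target_dirty_ratio = 0.5
threshold = per_pg_flush_threshold(1024 * GiB, 1024, 0.5)
print(threshold // MiB, "MiB")
```

With these inputs each PG gets 1 GiB of the target, so the per-PG trigger
sits at half of that.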

However, while the documentation states that the least recently used
objects are evicted when things hit the cache_target_full_ratio, it is less
than clear (understatement of the year) where flushing is concerned.
To quote:
"When the cache pool consists of a certain percentage of modified (or
dirty) objects, the cache tiering agent will flush them to the storage pool."

How do we read this?
When hitting 50% (as in the example above) all of the dirty objects will
get flushed? 
That doesn't match what I'm seeing nor would it be a sensible course of
action to unleash such a potentially huge torrent of writes.

If we interpret this as "get the dirty objects below the threshold" (which
is what seems to happen) there are 2 possible courses of action here:

1. Flush dirty object(s) from the PG that has reached the threshold. 
A sensible course of action in terms of reducing I/Os, but it may keep
flushing the same objects over and over again if they happen to be on the
"full" PG.

2. Flush dirty objects from all PGs (most likely in a least recently used
fashion) and stop when we're eventually under the threshold by having
finally hit the "full" PG. 
Results in a lot more IO but will of course create more clean objects
available for eviction if needed.
This is what I think is happening.
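
The difference in I/O between the two options can be shown with a toy
model in Python. This is purely a hypothetical sketch: each PG is a list of
last-write timestamps of its dirty objects, the limit is an object count
rather than bytes, and neither function claims to mirror what the tiering
agent really does.

```python
def flush_local(pgs, limit):
    """Option 1: flush oldest dirty objects only in PGs over the limit."""
    flushed = 0
    for objs in pgs.values():
        objs.sort()                  # oldest (smallest timestamp) first
        while len(objs) > limit:
            objs.pop(0)
            flushed += 1
    return flushed


def flush_global_lru(pgs, limit):
    """Option 2: flush oldest dirty objects pool-wide until no PG is over."""
    flushed = 0
    while any(len(objs) > limit for objs in pgs.values()):
        # Pick the PG that holds the globally oldest dirty object.
        pg = min((p for p in pgs if pgs[p]), key=lambda p: min(pgs[p]))
        pgs[pg].remove(min(pgs[pg]))
        flushed += 1
    return flushed


def sample_pgs():
    # pg0 is the "full" PG, but its dirty objects are the most recent ones.
    return {"pg0": [10, 11, 12, 13, 14], "pg1": [1, 2, 3], "pg2": [4, 5]}


print("option 1 flushes:", flush_local(sample_pgs(), limit=4))
print("option 2 flushes:", flush_global_lru(sample_pgs(), limit=4))
```

In this made-up layout option 1 gets away with a single flush, while option
2 has to work through every older object in the other PGs before it
finally touches the "full" one.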

So, is there any "least recently used" consideration in effect here,
or is setting "cache_min_flush_age" accordingly the only way to avoid
(pointless) flushes?

Unlike for flushes above, the documentation clearly states that eviction
goes by "least recently used".
If eviction operates per PG, that would violate this promise, as people of
course expect it to be pool wide.
And if it is indeed pool wide, the same effect as above will occur:
evictions will continue until the "full" PG gets hit, evicting far more
than would have been needed.

Something to maybe consider would be a target value, for example with
"cache_target_full_ratio" at 0.80 and "cache_target_full_ratio_target" at
0.78, evicting things until it reaches the target ratio. 
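
As a rough Python sketch of that idea (the lower "target" setting does not
exist in Ceph today, and the function and parameter names here are
hypothetical): once usage reaches the high-water mark, evict down to the
low-water mark and then stop. Percentages are kept as integers to avoid
floating-point rounding.

```python
def bytes_to_evict(used, capacity, full_pct=80, target_pct=78):
    """Bytes to evict under a high/low water mark scheme.

    full_pct   ~ cache_target_full_ratio (trigger, existing option)
    target_pct ~ the proposed lower target (hypothetical option)
    """
    high_water = capacity * full_pct // 100
    low_water = capacity * target_pct // 100
    if used < high_water:
        return 0                 # trigger not reached, nothing to do
    return used - low_water      # evict down to the lower target


print(bytes_to_evict(used=850, capacity=1000))  # above the trigger
print(bytes_to_evict(used=790, capacity=1000))  # below the trigger
```

The gap between the two marks bounds how much gets evicted per trigger,
instead of leaving it to whenever the "full" PG happens to be reached.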

Lastly, while we have perf counters like "tier_dirty", a gauge for dirty
and clean objects/bytes would be quite useful to me at least.
And clearly the cache tier agent already has those numbers. 
Right now I'm guesstimating that most of my cache objects are actually
clean (from VM reboots, only read, never written to), but I have no way
to tell for sure.

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/