On Mon, Mar 20, 2017 at 4:20 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
Just a few corrections, hope you don't mind
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@lists.ceph.com] On Behalf Of
> Mike Lovell
> Sent: 20 March 2017 20:30
> To: Webert de Souza Lima <webert.boss@xxxxxxxxx>
> Cc: ceph-users <ceph-users@xxxxxxxxxxxxxx>
> Subject: Re: cephfs cache tiering - hitset
>
> i'm not an expert but here is my understanding of it. a hit_set keeps track of
> whether or not an object was accessed during the timespan of the hit_set.
> for example, if you have a hit_set_period of 600, then the hit_set covers a
> period of 10 minutes. the hit_set_count defines how many of the hit_sets to
> keep a record of. setting this to a value of 12 with the 10 minute
> hit_set_period would mean that there is a record of objects accessed over a
> 2 hour period. the min_read_recency_for_promote, and its newer
> min_write_recency_for_promote sibling, define how many of these hit_sets
> an object must be in before that object is promoted from the storage pool
> into the cache pool. if this were set to 6 with the previous examples, it means
> that the cache tier will promote an object if that object has been accessed at
> least once in 6 of the 12 10-minute periods. it doesn't matter how many
> times the object was used in each period and so 6 requests in one 10-minute
> hit_set will not cause a promotion. it would be any number of accesses in 6
> separate 10-minute periods over the 2 hours.
Sort of: recency looks at the N most recent hitsets. So if it is set to 6, the object would have to be in all of the last 6 hitsets. Because of this, during testing I found that setting recency above 2 or 3 made the behavior quite binary. If an object was hot enough, it would probably be in every hitset; if it was only warm, it would never be in enough hitsets in a row. I did experiment with X-out-of-N promotion logic, i.e. the object must be in 3 hitsets out of the last 10, not necessarily sequential. If you could find the right numbers to configure, you could get improved cache behavior, but if not, there was a large chance it would be worse.
For promotion I think having more hitsets probably doesn't add much value, but they may help when it comes to determining what to flush.
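To make the difference between the two promotion rules discussed above concrete, here is a minimal toy model in Python. It is only a sketch of the behavior as described in this thread, not Ceph's actual implementation: the function names (window_seconds, in_last_n, in_x_of_n), the representation of hitsets as Python sets ordered newest first, and the choice to refuse promotion when fewer than N hitsets exist are all assumptions made for illustration.

```python
from typing import List, Set


def window_seconds(hit_set_period: int, hit_set_count: int) -> int:
    """Total time span covered by the retained hitsets, in seconds."""
    return hit_set_period * hit_set_count


def in_last_n(hit_sets: List[Set[str]], obj: str, recency: int) -> bool:
    """Consecutive rule: obj must appear in ALL of the `recency` newest hitsets.

    hit_sets is ordered newest first (an assumption of this toy model).
    """
    if recency > len(hit_sets):
        # Assumption: not enough history to satisfy the rule.
        return False
    return all(obj in hs for hs in hit_sets[:recency])


def in_x_of_n(hit_sets: List[Set[str]], obj: str, x: int, n: int) -> bool:
    """Experimental rule: obj must appear in at least x of the n newest hitsets."""
    return sum(obj in hs for hs in hit_sets[:n]) >= x


# 12 hitsets of 600 s each, newest first.
# "hot" is touched in every period, "warm" only in every other period.
hit_sets = [{"hot", "warm"} if i % 2 == 0 else {"hot"} for i in range(12)]

coverage = window_seconds(600, 12)            # 2 hours of history
hot_consecutive = in_last_n(hit_sets, "hot", 6)
warm_consecutive = in_last_n(hit_sets, "warm", 6)
warm_recency_1 = in_last_n(hit_sets, "warm", 1)
warm_x_of_n = in_x_of_n(hit_sets, "warm", 3, 10)
```

In this model the warm object never passes the consecutive rule once recency is 2 or more, because it is missing from every second hitset, while the 3-out-of-10 rule would promote it. That matches the "quite binary" behavior described above.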
that's good to know. i just made an assumption without actually digging into the code. do you recommend keeping the number of hitsets equal to the max of min_read_recency_for_promote and min_write_recency_for_promote? how are the hitsets checked during flush and/or eviction?
mike