Re: Fwd: Fwd: Reduce read latency and bandwidth for ec pool


 



I have another question about the ec pool. With the current strategy, we
find that it takes a long time to train the hit-sets (about one hour per
hit-set; if we set a small hit_set_period, the training result is poor)
before we reach a relatively high hit rate. During the first several
hours it seems that some of the most recently promoted objects get
evicted, because list_object just scans the objects in the pg and does
not take into account which objects were promoted most recently. This
means that after the system is initialized, the performance of the
cache_pool only improves after three or four hours and becomes stable
after about ten hours.
So does LRU or LRU2 make sense here? Or is there another strategy that
would make the training process converge faster?
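
To illustrate what I mean, a minimal sketch of LRU-style recency tracking
(toy Python, not Ceph code; the class and method names are made up):

    import collections

    class RecencyTracker:
        # Track promoted objects by recency so the tiering agent could
        # evict the least recently used object instead of whatever a
        # plain PG scan happens to return first.
        def __init__(self):
            self.order = collections.OrderedDict()   # oid -> True, oldest first

        def touch(self, oid):
            # Called on promote and on every cache hit: move to the MRU end.
            self.order.pop(oid, None)
            self.order[oid] = True

        def eviction_candidate(self):
            # The LRU object sits at the front of the OrderedDict.
            return next(iter(self.order), None)

        def evicted(self, oid):
            self.order.pop(oid, None)

With something like this, a newly promoted object would not be evicted
before older, colder objects, even while the hit-sets are still warming up.
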
Regards
Ning Yao


2015-03-18 2:18 GMT+08:00 Josh Durgin <jdurgin@xxxxxxxxxx>:
> On 03/17/2015 01:58 AM, Loic Dachary wrote:
>>
>>
>>
>> On 17/03/2015 09:45, Xinze Chi wrote:
>>>
>>> Sorry, I have not measured it.
>>>
>>> But I think it should really reduce latency when a read misses in the
>>> cache pool and is handled by do_proxy_read.
>>
>>
>> Interesting. I bet Jason or Josh have an opinion about this.
>
>
> Yes, it sounds like a great idea! It seems like we'd need this
> for other potential optimizations in the future anyway:
>
>  * partial-object promotes for cache tiers
>  * client-side ec to eliminate another network hop
>
> This would also enable efficient reads from replicas for
> EC pools in general. That could be useful for rbd parent snapshots
> stored in a cache tier.
>
> This makes me wonder if it would be useful to add write-once
> append-only rbd images that could be stored directly on EC pools, for
> use as parent images.
>
> Josh
>
>
>>
>>>
>>> 2015-03-17 16:39 GMT+08:00 Loic Dachary <loic@xxxxxxxxxxx>:
>>>>
>>>>
>>>>
>>>> On 17/03/2015 09:05, Xinze Chi wrote:
>>>>>
>>>>> RBD.
>>>>
>>>>
>>>> Did you measure that RBD does a significant amount of reads that would
>>>> be optimized in this way?
>>>>
>>>>> Maybe we could use a tier pool.
>>>>>
>>>>> Thanks
>>>>>
>>>>> 2015-03-17 16:02 GMT+08:00 Loic Dachary <loic@xxxxxxxxxxx>:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 17/03/2015 08:52, Xinze Chi wrote:
>>>>>>>
>>>>>>> ---------- Forwarded message ----------
>>>>>>> From: Xinze Chi <xmdxcxz@xxxxxxxxx>
>>>>>>> Date: 2015-03-17 15:52 GMT+08:00
>>>>>>> Subject: Re: Fwd: Reduce read latency and bandwidth for ec pool
>>>>>>> To: Loic Dachary <loic@xxxxxxxxxxx>
>>>>>>>
>>>>>>>
>>>>>>> Yes, in my VDI environment the client reads 4k at a time. If we could
>>>>>>> read the object from only one shard, it would reduce the latency and
>>>>>>> bandwidth a lot.
>>>>>>
>>>>>>
>>>>>> I'm curious about your workload. Are you using RadosGW? RBD?
>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> 2015-03-17 15:48 GMT+08:00 Loic Dachary <loic@xxxxxxxxxxx>:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 17/03/2015 08:27, Xinze Chi wrote:
>>>>>>>>>
>>>>>>>>> hi, loic:
>>>>>>>>>
>>>>>>>>>     I have an idea which could reduce read latency and bandwidth
>>>>>>>>> for ec pool.
>>>>>>>>>
>>>>>>>>>     But, I don't know whether it is feasible.
>>>>>>>>>
>>>>>>>>>     Suppose an ec pool with stripe_width = 16384 = 4 * 4096, K = 4, M = 2.
>>>>>>>>>
>>>>>>>>>     So ceph will partition the 16384 bytes into 4 data chunks and
>>>>>>>>> encode 2 parity chunks:
>>>>>>>>>
>>>>>>>>>     shard_0 include 0 - (4096-1) in original data;
>>>>>>>>>     shard_1 include 4096 - (4096*2 - 1) in original data;
>>>>>>>>>     shard_2 include 4096*2 - (4096 * 3 -1) in original data;
>>>>>>>>>     shard_3 include 4096*3 - (4096 * 4 - 1) in original data
>>>>>>>>>     shard_4 include parity chunk
>>>>>>>>>     shard_5 include parity chunk
>>>>>>>>>
>>>>>>>>>      Now if a client reads (offset 0, len 4096) from the object, it
>>>>>>>>> has to read 4 shards (0-3) and decode all 4 chunks.
>>>>>>>>>
>>>>>>>>>      But in this example, maybe we can compute the destination shard
>>>>>>>>> from the ec pool config, the read offset and the read length, and only
>>>>>>>>> read shard_0 and return it to the client, because shard_0 already
>>>>>>>>> contains all the data the client needs.
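>>>>>>>>>
>>>>>>>>>      To illustrate, a rough sketch in Python of the shard computation
>>>>>>>>> (assuming the contiguous chunk mapping described above; the function
>>>>>>>>> and parameter names are made up, this is not Ceph code):
>>>>>>>>>
>>>>>>>>>      def shards_for_read(offset, length, stripe_width=16384, k=4):
>>>>>>>>>          # Assumes the read does not cross a stripe boundary.
>>>>>>>>>          chunk_size = stripe_width // k        # 4096 in this example
>>>>>>>>>          first = (offset % stripe_width) // chunk_size
>>>>>>>>>          last = ((offset + length - 1) % stripe_width) // chunk_size
>>>>>>>>>          return list(range(first, last + 1))
>>>>>>>>>
>>>>>>>>>      # shards_for_read(0, 4096)    -> [0]      only shard_0 is needed
>>>>>>>>>      # shards_for_read(4096, 4096) -> [1]      only shard_1 is needed
>>>>>>>>>      # shards_for_read(4000, 200)  -> [0, 1]   the read spans two chunks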
>>>>>>>>
>>>>>>>>
>>>>>>>> That optimization makes sense to me. I guess you're interested in
>>>>>>>> having small objects in the pool and only reading a few bytes at a time?
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>>
>>>>>>>>>
>>>>>>>>>      Wait for your comment.
>>>>>>>>>
>>>>>>>>>      Thanks.
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>>>
>>>>
>>>> --
>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>
>>
>



