Hi Sage. I uploaded a document that describes my overall approach; please take a look and give me feedback.

slide: https://www.slideshare.net/secret/JZcy3yYEDIHPyg

thanks

2017-01-31 23:24 GMT+09:00 Sage Weil <sage@xxxxxxxxxxxx>:
> On Thu, 26 Jan 2017, myoungwon oh wrote:
>> I have two questions.
>>
>> 1. I would like to ask about the CAS location. Our current implementation
>> stores content-addressed objects (CAOs) in the storage tier. However, if
>> we store the CAOs in the cache tier, we can get a performance advantage.
>> Do you think we can create CAOs in the cache tier, or should we create a
>> separate storage pool for CAS?
>
> It depends on the design. If you are naming the objects at the
> librados client side, then you can use the rados cluster itself
> unmodified (with or without a cache tier). This is roughly how I have
> anticipated implementing the CAS storage portion. If you are doing the
> chunking and hashing within the OSD itself, then you can't do the CAS
> at the first tier because the requests won't be directed at the right OSD.
>
>> 2. The results below are performance results for our current implementation.
>>
>> Experiment setup:
>> PROXY (inline dedup), WRITEBACK (lazy dedup, target_max_bytes: 50MB),
>> ORIGINAL (without the dedup feature and without a cache tier);
>> fio, 512K block, sequential I/O, single thread.
>>
>> One thing to note is that the writeback case is slower than the proxy
>> case. We think there are three problems, as follows.
>>
>> A. The current implementation creates a fingerprint by reading the entire
>> object when flushing. Therefore, there is a problem that reads and writes
>> are mixed.
>
> I expect this is a small factor compared to the fact that in writeback
> mode you have to *write* to the cache tier, which is 3x replicated,
> whereas in proxy mode those writes don't happen at all.
>
>> B. When a client requests a read, the promote_object function reads the
>> object and writes it back to the cache tier, which also causes a mix of
>> reads and writes.
>
> This can be mitigated by setting the min_read_recency_for_promote pool
> property to something >1. Then reads will be proxied unless the object
> appears to be hot (because it has been touched over multiple
> hitset intervals).
>
>> C. When flushing, unchanged parts are rewritten because the flush
>> operation is performed per object.
>
> Yes.
>
> Is there a description of your overall approach somewhere?
>
> sage
>
>
>>
>> Do I have something wrong, or could you give me suggestions to improve
>> performance?
>>
>>
>> a. Write performance (KB/s)
>>
>> dedup_ratio       0       20      40      60      80     100
>> PROXY         45586    47804   51120   52844   56167   55302
>> WRITEBACK     13151    11078    9531   13010    9518    8319
>> ORIGINAL     121209   124786  122140  121195  122540  132363
>>
>>
>> b. Read performance (KB/s)
>>
>> dedup_ratio       0       20      40      60      80     100
>> PROXY        112231   118994  118070  120071  117884  132748
>> WRITEBACK     34040    29109   19104   26677   24756   21695
>> ORIGINAL     285482   284398  278063  277989  271793  285094
>>
>>
>> thanks,
>> Myoungwon Oh
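
For concreteness, the client-side naming Sage describes above (chunk the
data, fingerprint each chunk, and use the fingerprint as the RADOS object
name, so CRUSH directs each chunk to the right OSD with no OSD changes)
could look roughly like the sketch below. The chunk size, the pool name,
and the choice of SHA-256 are illustrative assumptions, not part of our
current implementation:

    import hashlib
    import rados

    CHUNK_SIZE = 512 * 1024  # illustrative; matches the 512K fio block size

    # "cas-pool" is a hypothetical pool dedicated to content-addressed chunks.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('cas-pool')

    def cas_write(data):
        """Store data as fixed-size chunks named by their fingerprint, so
        identical chunks map to the same RADOS object (and the same OSD,
        via CRUSH). Returns the fingerprint list, i.e. the recipe needed
        to reassemble the object on read."""
        recipe = []
        for off in range(0, len(data), CHUNK_SIZE):
            chunk = data[off:off + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()
            # Rewriting an existing fingerprint stores identical bytes,
            # which is how dedup falls out of the naming scheme.
            ioctx.write_full(fp, chunk)
            recipe.append(fp)
        return recipe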
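
The min_read_recency_for_promote knob Sage mentions for problem B is an
ordinary cache-pool property; something like the following should make
reads proxy unless an object appears in two or more recent hit sets. The
pool name and the exact hit-set values here are illustrative:

    # "cache-pool" is hypothetical; recency is judged against the pool's
    # hit sets, so min_read_recency_for_promote must not exceed hit_set_count.
    ceph osd pool set cache-pool hit_set_type bloom
    ceph osd pool set cache-pool hit_set_count 4
    ceph osd pool set cache-pool hit_set_period 600
    ceph osd pool set cache-pool min_read_recency_for_promote 2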
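
For reproducibility, the workload in the experiment setup above (fio, 512K
block, sequential I/O, single thread) could be expressed as a fio job along
these lines. The rbd engine, the image name, and the use of fio's
dedupe_percentage option to generate the dedup_ratio axis are assumptions
about how such a test might be run, not a record of our setup:

    [global]
    ioengine=rbd
    clientname=admin
    pool=rbd
    # hypothetical test image
    rbdname=fio-test
    bs=512k
    numjobs=1
    iodepth=1
    # sweep 0..100 to reproduce the dedup_ratio axis
    dedupe_percentage=40

    [seq-write]
    rw=write

    [seq-read]
    # start only after the write pass completes
    stonewall
    rw=read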