RE: rbd performance drop a lot with objectmap

James,
It was discussed earlier on ceph-devel that with exclusive-lock enabled, multi-threaded (multi-job) performance will suffer. You should increase the queue depth to gain parallelism rather than adding threads.
I brought that up some time back but couldn't provide a use case where multiple clients would access an image in parallel; it would be great if you have one. Exporting RBD for a database use case could be one (?).
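
As an illustration of that suggestion, here is a minimal fio sketch (pool and image names are placeholders, not from this thread): keep numjobs=1 so a single librbd client holds the exclusive lock, and scale iodepth inside that one job to add parallelism.

[global]
ioengine=rbd
clientname=admin
; placeholder pool/image names for illustration only
pool=test-pool
rbdname=some-image
rw=randwrite
bs=16k
direct=1
time_based
runtime=120
group_reporting

[qd_scaling]
; one job = one librbd client = no exclusive-lock ping-pong between clients
numjobs=1
; raise this (e.g. 32/64/128) instead of numjobs to increase parallelism
iodepth=32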

Thanks & Regards
Somnath

-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of LIU, Fei
Sent: Monday, February 13, 2017 2:37 PM
To: dillaman@xxxxxxxxxx
Cc: Ceph Development
Subject: Re: rbd performance drop a lot with objectmap

Hi Jason,
   It makes sense. By the way, we did a random write with QD=1 initially; the performance dropped almost 3x.
  We also ran the fio write test 10 times against one image using the fio configuration below, with each fio run taking 30 seconds.
[global]
ioengine=rbd
clientname=admin
pool=test-pool
rbdname=image-with-objmap
rw=write
bs=16k
direct=1
runtime=1200
ramp_time=30
group_reporting
time_based
[with_objectmap_rbd_iodepth1_numjobs1]
iodepth=128
numjobs=1

  IOPS increased from 180 to 1200 by the end of the 10 fio rbd write runs. Once the object map is warmed up, it lowers latency a lot and increases IOPS accordingly.
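
For reference, a sketch of the matching baseline job against an image created without object-map/exclusive-lock (the image name image-without-objmap is assumed for illustration; all other settings mirror the config above):

[global]
ioengine=rbd
clientname=admin
pool=test-pool
; assumed baseline image created with only the layering feature
rbdname=image-without-objmap
rw=write
bs=16k
direct=1
runtime=1200
ramp_time=30
group_reporting
time_based

[without_objectmap_rbd_iodepth128_numjobs1]
iodepth=128
numjobs=1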
  
However, we also tested random writes with rbd bench, and we found the performance dropped almost 10x when more contention was involved:

1.	rbd create test-pool/no-map-image --size 100G --object-size 16K --image-format 2 --image-feature layering  
2.	rbd create test-pool/with-map-image --size 100G --object-size 16K --image-format 2 --image-feature layering --image-feature exclusive-lock --image-feature object-map
      
Using rbd bench to write 1G of data randomly:
rbd bench-write -p test-pool --image no-map-image --io-size 16K --io-threads 16 --io-total 1G --io-pattern rand

 We found the performance dropped almost 10x in terms of IOPS. It shows that performance gets worse with more jobs, which makes us wonder whether the lock is the killing factor. Any thoughts?
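
For a direct comparison, a sketch of the corresponding run against the object-map image, using the same flags as the command above (only the image name differs), plus a feature check:

# same workload against the image that has exclusive-lock and object-map enabled
rbd bench-write -p test-pool --image with-map-image --io-size 16K --io-threads 16 --io-total 1G --io-pattern rand
# confirm which features are enabled on each image
rbd info test-pool/with-map-image
rbd info test-pool/no-map-image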


   Regards,
   James

This email and its attachments contain confidential information from Alibaba Group, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this email in error, please notify the sender by phone or email immediately and delete it.

On 2/12/17, 9:59 AM, "Jason Dillaman" <jdillama@xxxxxxxxxx> wrote:

    On Fri, Feb 10, 2017 at 9:50 AM, LIU, Fei <james.liu@xxxxxxxxxxxxxxx> wrote:
    > With FIO, single job, queue depth 1 (W/ vs. W/O), IOPS dropped 3x and
    > latency increased 3x. With more jobs, IOPS drops further and
    > latency keeps climbing.
    
    Assuming you have a random write workload, within a QD=1, it is
    entirely expected for the first write to the object to incur a
    performance penalty since it requires an additional round-trip
    operation to the backing OSDs. Since you only hit this penalty for the
    very first write to the object, its cost is amortized over future
    writes. This is similar to cloned images and the amortized cost of
    copying up the backing parent object to the clone on the first write.
    
    > The objectmap_locker in the pre and post stages, and the object-map update per IO, really hurt
    > performance. A lockless queue and a new way of caching the object map? Any thoughts?
    
    By definition, with a QD=1, there is zero contention on that lock. The
    lock is really only held for a minuscule amount of time and is dropped
    while the OSD operation is in-progress. Do you actually have any
    performance metrics to back up this claim? Note that the post state is
    only hit when you issue a remove / trim / discard operation.
    
    -- 
    Jason
    

