If you search through the archives, there have been a couple of other
people who have run into this with Jewel as well. With the librbd
engine, you are much better off using iodepth and/or multiple fio processes
rather than numjobs. Even pre-Jewel there were gotchas that might not be
immediately apparent. If you, for instance, increase numjobs and do
sequential reads, then after the first job reads some data it gets cached on
the OSD, and all subsequent jobs will re-read the same cached data
unless you explicitly change the offsets.
I.e., it was probably never a good idea to use numjobs, but now it's really
apparent that it's not a good idea. :)
Mark
On 08/04/2016 03:48 PM, Warren Wang - ISD wrote:
Wow, thanks. I think that's the tidbit of info I needed to explain why
increasing numjobs doesn't scale performance as expected anymore.
Warren Wang
On 8/4/16, 7:49 AM, "ceph-users on behalf of Jason Dillaman"
<ceph-users-bounces@xxxxxxxxxxxxxx on behalf of jdillama@xxxxxxxxxx> wrote:
With exclusive-lock, only a single client can have write access to the
image at a time. Therefore, if you are using multiple fio processes
against the same image, they will be passing the lock back and forth
between each other and you can expect bad performance.
If you have a use-case where you really need to share the same image
between multiple concurrent clients, you will need to disable the
exclusive-lock feature (this can be done with the rbd CLI on existing
images or by passing "--image-shared" when creating new images).
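A minimal sketch of both options with the rbd CLI (pool and image names are placeholders; depending on the image's feature set, features that depend on exclusive-lock, such as object-map and fast-diff, may need to be disabled first):

    # Existing image: drop the dependent features, then exclusive-lock itself.
    rbd feature disable rbd/testimg object-map fast-diff exclusive-lock

    # New image: create it shareable from the start (size is in MB by default).
    rbd create --size 10240 --image-shared rbd/testimg2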
On Thu, Aug 4, 2016 at 5:52 AM, Alexandre DERUMIER <aderumier@xxxxxxxxx>
wrote:
Hi,
I think this is because of the exclusive-lock feature, which has been
enabled by default on RBD images since Jewel.
----- Original Message -----
From: "Zhiyuan Wang" <zhiyuan.wang@xxxxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Thursday, 4 August 2016 11:37:04
Subject: Bad performance when two fio write to the same image
Hi Guys
I am testing the performance of Jewel (10.2.2) with fio, but found that
the performance drops dramatically when two processes write to the same
image.
My environment:
1. Server:
One mon and four OSDs running on the same server.
Intel P3700 400GB SSD with 4 partitions, one per OSD journal
(journal size is 10GB)
Intel P3700 400GB SSD with 4 partitions, each formatted as XFS for
one OSD's data (90GB per OSD)
10GbE network
CPU: Intel(R) Xeon(R) CPU E5-2660 (it is not the bottleneck)
Memory: 256GB (it is not the bottleneck)
2. Client
10GbE network
CPU: Intel(R) Xeon(R) CPU E5-2660 (it is not the bottleneck)
Memory: 256GB (it is not the bottleneck)
3. Ceph
Default configuration, except for using the async messenger (also tried
the simple messenger, with nearly the same result)
10GB image in a pool with 256 PGs
Test Case
1. One fio process: bs 4KB; iodepth 256; direct 1; ioengine rbd;
randwrite
The performance is nearly 60MB/s and IOPS is nearly 15K
The four OSDs are nearly equally busy
2. Two fio processes: bs 4KB; iodepth 256; direct 1; ioengine rbd;
randwrite (both writing to the same image; see the invocation sketch
after the test cases)
The performance is nearly 4MB/s each, and IOPS is nearly 1.5K each.
Terrible.
I also found that only one OSD is busy; the other three are much more
idle on CPU.
Running the two fio processes from two separate clients gives the same result.
3. Two fio processes: bs 4KB; iodepth 256; direct 1; ioengine rbd;
randwrite (one to image1, one to image2)
The performance is nearly 35MB/s each and IOPS is nearly 8.5K each.
Reasonable.
The four OSDs are nearly equally busy
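For reference, a minimal sketch of how test cases 2 and 3 might be invoked (pool and image names are placeholders; all flags shown are standard fio options):

    # Test 2: two fio processes against the SAME image (lock ping-pong while
    # exclusive-lock is enabled)
    fio --name=job1 --ioengine=rbd --clientname=admin --pool=rbd \
        --rbdname=image1 --rw=randwrite --bs=4k --iodepth=256 --direct=1 \
        --runtime=60 --time_based &
    fio --name=job2 --ioengine=rbd --clientname=admin --pool=rbd \
        --rbdname=image1 --rw=randwrite --bs=4k --iodepth=256 --direct=1 \
        --runtime=60 --time_based &
    wait

    # Test 3: identical, except the second process uses --rbdname=image2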
Could someone help explain the result of test case 2?
Thanks
--
Jason