Re: [EXTERNAL] Ceph performance is too good (impossible..)...

The same result with ioengine=libaio..
see:
A: (g=0): rw=read, bs=5M-5M/5M-5M/5M-5M, ioengine=libaio, iodepth=1
...
fio-2.2.10
Starting 16 processes

A: (groupid=0, jobs=16): err= 0: pid=27579: Mon Dec 12 20:36:10 2016
  mixed: io=122515MB, bw=6120.3MB/s, iops=1224, runt= 20018msec

I think in the end, the only way to solve this issue is to write to the image before the read test... as suggested (rough sketch below)...

I have no clue why the rbd engine does not work...
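
A rough prefill sketch (my own, untested; the [prefill] job name is a placeholder, and it assumes the images are still mapped as /dev/rbd0 and /dev/rbd1) would be one sequential write pass over each device before starting the read jobs:

[prefill]
ioengine=libaio
direct=1
filename=/dev/rbd0
rw=write
bs=4MB
size=100%

sudo fio prefill.job    (then the same again with filename=/dev/rbd1)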

On Mon, Dec 12, 2016 at 4:23 PM, Will.Boege <Will.Boege@xxxxxxxxxx> wrote:

Try adding --ioengine=libaio
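
For example (just the obvious way to apply it, untested): add this one line to the [A] and [B] sections of your existing job files:

ioengine=libaio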

 

From: V Plus <v.plussharp@xxxxxxxxx>
Date: Monday, December 12, 2016 at 2:40 PM
To: "Will.Boege" <Will.Boege@xxxxxxxxxx>
Subject: Re: [EXTERNAL] Ceph performance is too good (impossible..)...

 

Hi Will,

Thanks very much.

However, I tried both of your suggestions.

Neither of them works...

1. with FIO rbd engine:
[RBD_TEST]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=fio_test
invalidate=1   
direct=1
group_reporting=1
unified_rw_reporting=1
time_based=1
rw=read
bs=4MB
numjobs=16
ramp_time=10
runtime=20

Then I ran "sudo fio rbd.job" and got:


RBD_TEST: (g=0): rw=read, bs=4M-4M/4M-4M/4M-4M, ioengine=rbd, iodepth=1
...
fio-2.2.10
Starting 16 processes
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
rbd engine: RBD version: 0.1.10
Jobs: 12 (f=7): [R(5),_(1),R(4),_(1),R(2),_(2),R(1)] [100.0% done] [11253MB/0KB/0KB /s] [2813/0/0 iops] [eta 00m:00s]
RBD_TEST: (groupid=0, jobs=16): err= 0: pid=17504: Mon Dec 12 15:32:52 2016
  mixed: io=212312MB, bw=10613MB/s, iops=2653, runt= 20005msec

2. with blockalign
[A]
direct=1
group_reporting=1
unified_rw_reporting=1
size=100%
time_based=1
filename=/dev/rbd0
rw=read
bs=5MB
numjobs=16
ramp_time=5
runtime=20
blockalign=512b

[B]
direct=1
group_reporting=1
unified_rw_reporting=1
size=100%
time_based=1
filename=/dev/rbd1
rw=read
bs=5MB
numjobs=16
ramp_time=5
runtime=20
blockalign=512b

sudo fio fioA.job --output a.txt & sudo fio fioB.job --output b.txt & wait

Then I got:
A: (groupid=0, jobs=16): err= 0: pid=19320: Mon Dec 12 15:35:32 2016
  mixed: io=88590MB, bw=4424.7MB/s, iops=884, runt= 20022msec

B: (groupid=0, jobs=16): err= 0: pid=19324: Mon Dec 12 15:35:32 2016
  mixed: io=88020MB, bw=4395.6MB/s, iops=879, runt= 20025msec

..............

 

On Mon, Dec 12, 2016 at 10:45 AM, Will.Boege <Will.Boege@xxxxxxxxxx> wrote:

My understanding is that when using direct=1 on a raw block device, FIO (i.e., you) has to handle all the sector alignment, or the request will get buffered to perform the alignment.

 

Try adding the --blockalign=512b option to your jobs, or better yet just use the native FIO RBD engine.

 

Something like this (untested) -

 

[A]

ioengine=rbd

clientname=admin

pool=rbd

rbdname=fio_test

direct=1

group_reporting=1

unified_rw_reporting=1

time_based=1

rw=read

bs=4MB

numjobs=16

ramp_time=10

runtime=20

 

From: ceph-users <ceph-users-bounces@lists.ceph.com> on behalf of V Plus <v.plussharp@xxxxxxxxx>
Date: Sunday, December 11, 2016 at 7:44 PM
To: "ceph-users@xxxxxxxxxxxxxx" <ceph-users@xxxxxxxxxxxxxx>
Subject: [EXTERNAL] Ceph performance is too good (impossible..)...

 

Hi Guys,

We have a Ceph cluster with 6 machines (6 OSDs per host).

1. I created 2 images in Ceph and mapped them to another host A (outside the Ceph cluster). On host A, I got /dev/rbd0 and /dev/rbd1.

2. I started two fio jobs to perform a READ test on rbd0 and rbd1 (the fio job descriptions can be found below):

"sudo fio fioA.job -output a.txt & sudo  fio fioB.job -output b.txt  & wait"

3. After the test, in a.txt we got bw=1162.7MB/s, and in b.txt we got bw=3579.6MB/s.

The results do NOT make sense: that is roughly 4.7 GB/s combined, while host A has only one NIC and its limit is 10 Gbps (about 1.25 GB/s).

 

I suspect it is because of the cache setting.

But I am sure that in the file /etc/ceph/ceph.conf on host A, I already added:

[client]

rbd cache = false
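
(For what it's worth, rbd cache = false only affects librbd's cache. A freshly created image that has never been written is also sparse, so reads of unallocated extents return zeros without moving any data payload over the network, and that alone can report bandwidths far above the NIC limit. A quick check, assuming your rbd CLI has the du subcommand and with <image> as a placeholder for the image name, is to see how much of the image is actually allocated:)

rbd du rbd/<image>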

 

Could anyone give me a hint about what is missing? Why...

Thank you very much.

 

fioA.job:

[A]

direct=1

group_reporting=1

unified_rw_reporting=1

size=100%

time_based=1

filename=/dev/rbd0

rw=read

bs=4MB

numjobs=16

ramp_time=10

runtime=20

 

fioB.job:

[B]

direct=1

group_reporting=1

unified_rw_reporting=1

size=100%

time_based=1

filename=/dev/rbd1

rw=read

bs=4MB

numjobs=16

ramp_time=10

runtime=20

 

Thanks...

 


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
