Re: Poor read performance.

How does your rados bench look? 
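
For reference, something along these lines (pool name, runtime and thread
count are just placeholders) would show raw object-store throughput with
rbd out of the picture:

# write first so there are objects to read back, and keep them around
rados bench -p mypool 60 write -t 16 --no-cleanup

# then a random-read pass against those objects
rados bench -p mypool 60 rand -t 16

# remove the benchmark objects afterwards
rados -p mypool cleanup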

Have you tried playing around with read ahead and striping?
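
If it's mostly small random reads inside the guests, the librbd readahead
settings and the in-guest block readahead are worth a look; a rough sketch
(the values, device name and image name are only examples, not tuned
recommendations):

# librbd readahead knobs in ceph.conf on the client side
[client]
rbd readahead max bytes = 4194304
rbd readahead disable after bytes = 52428800

# inside a VM, bump the block device readahead
echo 4096 > /sys/block/vda/queue/read_ahead_kb

# striping is chosen at image creation time, e.g.
rbd create mypool/newvol --size 102400 --stripe-unit 65536 --stripe-count 16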


On Tue, 24 Apr 2018 17:53 Jonathan Proulx, <jon@xxxxxxxxxxxxx> wrote:
Hi All,

I seem to be seeing consistently poor read performance on my cluster
relative to both write performance and the read performance of a single
backend disk, by quite a lot.

The cluster is Luminous with 174 7.2k SAS drives across 12 storage servers
with 10G ethernet and jumbo frames. Drives are a mix of 4T and 2T,
bluestore with DB on SSD.

The performance I really care about is over rbd for VMs in my
OpenStack, but 'rbd bench' seems to line up pretty well with 'fio' tests
inside VMs, so here is a more or less typical random write rbd bench (run
from a monitor node with a 10G connection on the same net as the osds):

rbd bench  --io-total=4G --io-size 4096 --io-type write \
--io-pattern rand --io-threads 16 mypool/myvol

<snip />

elapsed:   361  ops:  1048576  ops/sec:  2903.82  bytes/sec: 11894034.98

The same for random read is an order of magnitude lower:

rbd bench  --io-total=4G --io-size 4096 --io-type read \
--io-pattern rand --io-threads 16  mypool/myvol

elapsed:  3354  ops:  1048576  ops/sec:   312.60  bytes/sec: 1280403.47

(sequential reads and a bigger io-size help, but not by a lot)
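
For the record, a sequential pass with a larger io-size looks something
like this (same image, only the pattern and io-size changed):

rbd bench  --io-total=4G --io-size 65536 --io-type read \
--io-pattern seq --io-threads 16  mypool/myvol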

ceph -s from during the read bench, to get a sense of the comparative traffic:

  cluster:
    id:     <UUID>
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph-mon0,ceph-mon1,ceph-mon2
    mgr: ceph-mon0(active), standbys: ceph-mon2, ceph-mon1
    osd: 174 osds: 174 up, 174 in
    rgw: 3 daemon active

  data:
    pools:   19 pools, 10240 pgs
    objects: 17342k objects, 80731 GB
    usage:   240 TB used, 264 TB / 505 TB avail
    pgs:     10240 active+clean

  io:
    client:   4296 kB/s rd, 417 MB/s wr, 1635 op/s rd, 3518 op/s wr


During deep-scrubs overnight I can see the disks doing >500MBps reads
and ~150 read IOPS each at peak, while during the read bench (including
all traffic from ~1k VMs) individual osd data partitions peak around 25
read IOPS and 1.5MBps read bandwidth, so it seems like there should be
performance to spare.
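
For anyone wanting to reproduce the per-disk observation, something like
iostat from sysstat (watching r/s and rMB/s on the OSD data devices) is
one way to see it:

# extended stats in MB, refreshed every 5 seconds
iostat -xm 5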

Obviously, given my disk choices, this isn't designed as a particularly
high performance setup, but I do expect a bit more performance out of
it.

Are my expectations wrong? If not, any clues as to what I've done (or
failed to do) that is wrong?

Pretty sure read/write was much more symmetric in earlier versions (on a
subset of the same hardware with the filestore backend), but that used a
different perf tool so I don't want to make direct comparisons.

-Jon

--
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
