Hi all,

I seem to be seeing consistently poor read performance on my cluster relative to both write performance and the read performance of a single backend disk, by quite a lot. The cluster is Luminous with 174 7.2k SAS drives across 12 storage servers, on 10G ethernet with jumbo frames. The drives are a mix of 4T and 2T, BlueStore with DB on SSD.

The performance I really care about is over rbd for VMs in my OpenStack, but 'rbd bench' seems to line up pretty well with 'fio' tests inside the VMs. So, a more or less typical random write rbd bench (from a monitor node with a 10G connection on the same net as the OSDs):

  rbd bench --io-total=4G --io-size 4096 --io-type write \
      --io-pattern rand --io-threads 16 mypool/myvol
  <snip />
  elapsed:   361  ops:  1048576  ops/sec:  2903.82  bytes/sec: 11894034.98

The same for random read is an order of magnitude lower:

  rbd bench --io-total=4G --io-size 4096 --io-type read \
      --io-pattern rand --io-threads 16 mypool/myvol
  elapsed:  3354  ops:  1048576  ops/sec:   312.60  bytes/sec: 1280403.47

(Sequential reads and a bigger io-size help, but not a lot.)

'ceph -s' from during the read bench, to give a sense of the relative traffic:

  cluster:
    id:     <UUID>
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph-mon0,ceph-mon1,ceph-mon2
    mgr: ceph-mon0(active), standbys: ceph-mon2, ceph-mon1
    osd: 174 osds: 174 up, 174 in
    rgw: 3 daemons active

  data:
    pools:   19 pools, 10240 pgs
    objects: 17342k objects, 80731 GB
    usage:   240 TB used, 264 TB / 505 TB avail
    pgs:     10240 active+clean

  io:
    client:   4296 kB/s rd, 417 MB/s wr, 1635 op/s rd, 3518 op/s wr

During deep scrubs overnight I can see the disks doing >500 MBps of reads and ~150 read iops each at peak, while during the read bench (including all traffic from ~1k VMs) individual OSD data partitions peak at around 25 read iops and 1.5 MBps of read bandwidth, so it seems like there should be performance to spare.

Obviously, given my disk choices, this isn't designed as a particularly high-performance setup, but I do expect a bit more performance out of it. Are my expectations wrong? If not, any clues as to what I've done (or failed to do) that is wrong?

I'm pretty sure reads and writes were much more symmetric in earlier versions (on a subset of the same hardware with the FileStore backend), but I used a different perf tool then, so I don't want to make direct comparisons.

-Jon
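
P.S. For anyone who wants to reproduce the in-guest side of the comparison: a fio run along the lines below should be roughly equivalent to the 4k random-read rbd bench above. The device path and exact parameters here are placeholders for illustration, not necessarily what I ran inside the guests:

  # 4k random reads against an rbd-backed data disk inside a guest
  # (/dev/vdb is a placeholder device name)
  fio --name=randread --filename=/dev/vdb --direct=1 --ioengine=libaio \
      --rw=randread --bs=4k --iodepth=16 --runtime=120 --time_based \
      --group_reporting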
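
If it's useful for comparison, the per-disk read iops and bandwidth during the bench can be watched on a storage node with something like the below (device names are placeholders for the OSD data devices):

  # extended per-device stats in MB, refreshed every 5 seconds;
  # r/s and rMB/s are the columns of interest
  iostat -xm 5 /dev/sdb /dev/sdc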