If this is a file server, why not use Samba on CephFS? That's what we do. RBD is a very cumbersome extra layer for storing files that costs a lot of performance. Add an SSD to each node for the metadata and primary data pool, and use an EC pool on HDDs for the bulk data. This will be much better. Still, EC and small writes don't go well together; you may need to consider 3-times replication.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: athreyavc <athreyavc@xxxxxxxxx>
Sent: 10 November 2020 20:20:55
To: dillaman@xxxxxxxxxx
Cc: ceph-users
Subject: Re: Ceph RBD - High IOWait during the Writes

Thanks for the reply.

We are not really expecting the level of performance needed for virtual machines or databases. We want to use it as a file store where an app writes into the mounts. I think it is slow for file write operations and we see high IO waits.

Is there anything I can do to increase the throughput?

Thanks and regards,

Athreya

On Tue, Nov 10, 2020 at 7:10 PM Jason Dillaman <jdillama@xxxxxxxxxx> wrote:

> On Tue, Nov 10, 2020 at 1:52 PM athreyavc <athreyavc@xxxxxxxxx> wrote:
> >
> > Hi All,
> >
> > We have recently deployed a new Ceph cluster, Octopus 15.2.4, which consists of:
> >
> > 12 OSD nodes (16 cores + 200 GB RAM, 30x 14 TB disks, CentOS 8)
> > 3 MON nodes (8 cores + 15 GB RAM, CentOS 8)
> >
> > We use an erasure-coded pool and RBD block devices.
> >
> > 3 Ceph clients use the RBD devices; each has 25 RBDs and each RBD is 10 TB. Each RBD is formatted with the EXT4 file system.
> >
> > Cluster health is OK, and the hardware is new and good.
> >
> > All the machines have a 10 Gbps (active/passive) bond interface configured.
> >
> > Read performance of the cluster is OK; however, writes are very slow.
> >
> > On one of the RBDs we ran a perf test.
> >
> > fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k -iodepth=128 -rw=randread -runtime=60 -filename=/dev/rbd40
> >
> > Run status group 0 (all jobs):
> >    READ: bw=401MiB/s (420MB/s), 401MiB/s-401MiB/s (420MB/s-420MB/s), io=23.5GiB (25.2GB), run=60054-60054msec
> >
> > fio -ioengine=libaio -direct=1 -invalidate=1 -name=test -bs=4k -iodepth=128 -rw=randwrite -runtime=60 -filename=/dev/rbd40
> >
> > Run status group 0 (all jobs):
> >   WRITE: bw=217KiB/s (222kB/s), 217KiB/s-217KiB/s (222kB/s-222kB/s), io=13.2MiB (13.9MB), run=62430-62430msec
> >
> > I see high IO wait on the client.
> >
> > Any suggestions/pointers to address this issue are really appreciated.
>
> EC pools + small random writes + performance: pick two of the three. ;-)
>
> Writes against an EC pool require the chunk to be re-written via an expensive read/modify/write cycle.
>
> > Thanks and Regards,
> >
> > Athreya
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>
> --
> Jason
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
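
For reference, a minimal sketch of the layout Frank describes (replicated pools on SSD for the CephFS metadata and primary data pool, an EC pool on HDDs for the bulk file data, exported via Samba) might look like the following. The pool names, PG counts, the 4+2 EC profile and the mount path are illustrative assumptions, not values taken from this thread:

# Replicated pools on SSD for metadata and the primary data pool
ceph osd crush rule create-replicated ssd-rule default host ssd
ceph osd pool create cephfs_metadata 64 64 replicated ssd-rule
ceph osd pool create cephfs_data 64 64 replicated ssd-rule

# EC pool on HDDs for the bulk data (4+2 chosen as an example)
ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-device-class=hdd crush-failure-domain=host
ceph osd pool create cephfs_data_ec 256 256 erasure ec-4-2
ceph osd pool set cephfs_data_ec allow_ec_overwrites true

# Create the file system and attach the EC pool as an additional data pool
ceph fs new myfs cephfs_metadata cephfs_data
ceph fs add_data_pool myfs cephfs_data_ec

# Direct file data under a given directory to the EC pool via a file layout
setfattr -n ceph.dir.layout.pool -v cephfs_data_ec /mnt/cephfs/shared

Samba would then export the kernel-mounted CephFS path (here /mnt/cephfs/shared) as an ordinary share, instead of layering ext4 on top of RBD.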