Yes, I agree that there are many knobs for fine-tuning Ceph performance.
The problem is that we don't have data on which workloads benefit most
from the WAL/DB on an SSD vs. on the same spinning drive, and by how much.
Does it really help in a cluster that is mostly used for object
storage/RGW? Or is it perhaps only block storage/RBD workloads that
benefit most?
IMHO we need some cost-benefit analysis here, because the cost of placing
the WAL/DB on an SSD is quite noticeable: multiple OSDs fail when the
shared SSD fails, and capacity is reduced by the slot it occupies.
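To make that trade-off concrete, here is a minimal back-of-the-envelope
sketch; the slot count, HDD size, and the "one SSD serves N OSDs" ratio
are assumptions for illustration only, not measurements:

    # Rough cost model: all slots as HDD OSDs vs. giving up slots for WAL/DB SSDs.
    # All numbers are assumptions for illustration only.
    import math

    SLOTS = 12            # drive bays per node (assumed)
    HDD_TB = 8            # usable TB per HDD OSD (assumed)
    OSDS_PER_DB_SSD = 5   # HDD OSDs sharing one WAL/DB SSD (assumed ratio)

    # Option A: every slot holds an HDD OSD, WAL/DB colocated on the HDD.
    a_osds = SLOTS
    a_tb = a_osds * HDD_TB

    # Option B: reserve slots for WAL/DB SSDs, one per OSDS_PER_DB_SSD HDD OSDs.
    b_ssds = math.ceil(SLOTS / (OSDS_PER_DB_SSD + 1))
    b_osds = SLOTS - b_ssds
    b_tb = b_osds * HDD_TB

    print(f"A: {a_osds} OSDs, {a_tb} TB raw, 1 OSD lost per HDD failure")
    print(f"B: {b_osds} OSDs, {b_tb} TB raw, "
          f"up to {OSDS_PER_DB_SSD} OSDs lost if one DB SSD fails")
    print(f"Capacity given up: {a_tb - b_tb} TB ({(a_tb - b_tb) / a_tb:.0%})")

Numbers like these only quantify the cost side; the benefit side is
exactly the per-workload data that seems to be missing.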
Thanks.
On 16/09/20 14.45, Janne Johansson wrote:
On Wed 16 Sep 2020 at 06:27, Danni Setiawan
<danni.n.setiawan@xxxxxxxxx> wrote:
Hi all,
I'm trying to find the performance penalty of HDD OSDs when the WAL/DB is
on a faster device (SSD/NVMe) vs. the WAL/DB on the same device (HDD), for
different workloads (RBD, RGW with the bucket index in an SSD pool, and
CephFS with metadata in an SSD pool). I want to know whether giving up a
disk slot for a WAL/DB device is worth it vs. adding more OSDs.
Unfortunately I cannot find benchmarks for these kinds of workloads. Has
anyone ever done this benchmark?
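A minimal sketch of one way such a comparison might be scripted, assuming
two test pools whose CRUSH rules place them on HDD OSDs with and without
an SSD-backed WAL/DB; the pool names, block size, and thread count are
illustrative placeholders:

    # Sketch: compare small-write performance of two pools with rados bench.
    # Pool names and parameters are placeholders; adjust for the real cluster.
    import re
    import subprocess

    POOLS = {
        "bench-hdd-only": "WAL/DB colocated on the HDD",
        "bench-hdd-ssddb": "WAL/DB on a shared SSD",
    }
    SECONDS = 60
    BLOCK_SIZE = 4096   # small writes, where WAL/DB placement matters most
    THREADS = 16

    def write_iops(pool: str) -> float:
        """Run a write benchmark against `pool` and return the average IOPS."""
        out = subprocess.run(
            ["rados", "bench", "-p", pool, str(SECONDS), "write",
             "-b", str(BLOCK_SIZE), "-t", str(THREADS), "--no-cleanup"],
            check=True, capture_output=True, text=True,
        ).stdout
        m = re.search(r"Average IOPS:\s+([\d.]+)", out)
        return float(m.group(1)) if m else float("nan")

    for pool, layout in POOLS.items():
        print(f"{pool} ({layout}): {write_iops(pool):.0f} write IOPS")

This only exercises the RADOS layer directly, so it is a rough proxy at
best and not a substitute for benchmarking the real RBD, RGW, or CephFS
workloads.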
I think this is probably too vague and broad a question. If you ask
"will my cluster handle far more write IOPS if I have the WAL/DB (or
journal) on SSD/NVMe instead of on the same drive as the data", then
almost everyone will agree that yes, a flash WAL/DB will make your writes
(and recoveries) a lot quicker, since NVMe/SSD will do anything from 10x
to 100x the number of small writes per second compared to the best
spinning HDDs. But how this affects any single end-user experience behind
S3 or CephFS, without diving into a ton of implementation details like
"how much RAM cache does the MDS have for CephFS, how many RGWs and S3
streams are you using in parallel in order to speed up S3/RGW
operations", will be very hard to express in pure numbers.
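As a rough illustration of why that 10x-100x gap dominates small-write
behaviour, a tiny back-of-the-envelope calculation; the per-device IOPS
figures are ballpark assumptions, not measurements from any cluster:

    # Back-of-the-envelope: time to commit a burst of small writes.
    # IOPS figures are rough ballpark assumptions for illustration only.
    SMALL_WRITES = 100_000

    DEVICES = {
        "7200 rpm HDD (colocated WAL/DB)": 200,
        "SATA SSD WAL/DB": 10_000,
        "NVMe WAL/DB": 50_000,
    }

    for device, iops in DEVICES.items():
        print(f"{device:32s} ~{iops:>6} IOPS -> "
              f"{SMALL_WRITES / iops:7.1f} s for {SMALL_WRITES} small writes")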
Also, even if flash devices are "only" used for speeding up writes,
normal clusters see a lot of mixed IO, so if writes theoretically take
0 ms you get a lot more free time to do reads on the HDDs, and reads can
often be accelerated with RAM caches in various places.
So like any other storage system, if you put a flash device in front of
the spinners you will see improvements, especially for many small write
ops, but whether your use case is "copy these 100 10G images to this pool
every night" or "every hour we unzip the sources of a large program,
checksum the files and then clean the directory" will have a large impact
on how much flash helps your cluster.
Also, more boxes add more performance in more ways than just "more
disk": every extra CPU, every GB of RAM, every extra network port means
the overall performance of the cluster goes up by sharing the total load
better. This will not show up in simple single-threaded tests, but as you
get 2-5-10-100 active clients doing IO it will be noticeable.
--
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx