Re: How to minimise the impact of compaction in ‘rocksdb options’?

I’ll give the script a try.

Yes, I only have RGW, with data on EC 4:2, and a huge amount of small objects.

One bucket has 1.2 billion objects, another has 300 million, and there are other buckets besides.

Most of the slow ops’ time is spent in “waiting for readable” on the PG, sometimes for minutes.
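
One way to confirm where those ops are waiting is the OSD admin socket
(a sketch; osd.12 is a placeholder, and this assumes the daemon socket
is available at its default path):

    # in-flight ops with their event timeline ("waiting for readable", etc.)
    ceph daemon osd.12 dump_ops_in_flight
    # recently completed ops, with per-event durations
    ceph daemon osd.12 dump_historic_ops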

All my OSDs spilled over when I used an NVMe device in front of the SAS SSDs as WAL+DB, so I migrated the WAL+DB back to the block device to avoid spillover.

So what would be the best practice/solution to handle it?

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------

On 2021. Nov 15., at 17:58, Mark Nelson <mnelson@xxxxxxxxxx> wrote:


Hi,


Compaction can block reads, but on the write path you should be able to
absorb a certain amount of writes via the WAL before rocksdb starts
throttling writes.  The larger and more numerous your WAL buffers, the
more writes you can absorb, but bigger buffers also take more CPU to
keep in sorted order, and more aggregate buffer space uses more RAM, so
it's a double-edged sword.  I'd suggest measuring how much time you
actually spend in compaction.  For clusters that are primarily serving
block via RBD, there's a good chance it's fairly minimal.  For RGW
(especially with lots of small objects and/or erasure coding) you might
be spending more time in compaction, but it's important to measure.
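
If the buffers do turn out to matter, they are tuned through the
bluestore_rocksdb_options string.  As a rough sketch (the values are
illustrative, not a recommendation; setting this option replaces the
entire default string, so start from your release's default and change
only the fields you mean to, and it needs an OSD restart to take
effect):

    # illustrative only: 256 MiB memtables, up to 8 of them before
    # rocksdb stalls writes; keep the rest of your release's default
    # options in place of the "..."
    ceph config set osd bluestore_rocksdb_options \
        "...,write_buffer_size=268435456,max_write_buffer_number=8"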


FWIW, you can try running the following script against your OSD log to
see a summary of compaction events:

https://github.com/ceph/cbt/blob/master/tools/ceph_rocksdb_log_parser.py
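
It just takes the OSD log as input; invocation is something like the
following (the log path is an example, and the exact arguments may
differ, so check the script itself):

    python3 ceph_rocksdb_log_parser.py /var/log/ceph/ceph-osd.0.log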


Mark


On 11/15/21 10:48 AM, Szabo, Istvan (Agoda) wrote:
Hello,

If I’m not mistaken, in my cluster this can block I/O on an OSD when that OSD holds a huge amount of objects.

How can I change the values to minimise the impact?

I guess it needs an OSD restart to take effect, and the “rocksdb options” are the values that need to be tuned, but what should be changed?

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



