Hi Mark,

I've picked a few OSDs that were reported as failed today. This is the start of the log parser output, how does it look?

Compaction Statistics: /var/log/ceph/ceph-osd.26.log
  Total OSD Log Duration (seconds)    41766.799
  Number of Compaction Events         90
  Avg Compaction Time (seconds)       3.934152933333334
  Total Compaction Time (seconds)     354.07376400000004
  Avg Output Size (MB)                557.684199587504
  Total Output Size (MB)              50191.577962875366
  Total Input Records                 447697809
  Total Output Records                430832247
  Avg Output Throughput (MB/s)        145.05270280158922
  Avg Input Records/second            1433801.859746764
  Avg Output Records/second           1211797.1846839893
  Avg Output/Input Ratio              0.9493287032959218

And another one:

Compaction Statistics: /var/log/ceph/ceph-osd.19.log
  Total OSD Log Duration (seconds)    42004.52
  Number of Compaction Events         23
  Avg Compaction Time (seconds)       4.211967217391304
  Total Compaction Time (seconds)     96.87524599999999
  Avg Output Size (MB)                590.1112143060435
  Total Output Size (MB)              13572.557929039001
  Total Input Records                 68160780
  Total Output Records                66981318
  Avg Output Throughput (MB/s)        123.39584478881285
  Avg Input Records/second            952025.1911276178
  Avg Output Records/second           910383.7987128447
  Avg Output/Input Ratio              0.9612154436271764

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------

-----Original Message-----
From: Szabo, Istvan (Agoda) <Istvan.Szabo@xxxxxxxxx>
Sent: Tuesday, November 16, 2021 1:33 AM
To: Mark Nelson <mnelson@xxxxxxxxxx>
Cc: ceph-users@xxxxxxx
Subject: Re: How to minimise the impact of compaction in ‘rocksdb options’?

I'll give the script a try. Yes, I only have RGW, with data on EC 4:2, and a huge amount of small objects. One bucket has 1.2 billion objects, another has 300 million, plus further buckets.

The slow ops spend most of their time in “waiting for readable” on a PG, sometimes even minutes. I had all my OSDs spill over when I used an NVMe device in front of the SAS SSDs as WAL+DB, so I migrated the DBs back to the block device to avoid the spillovers.

So what would be the best practice/solution to handle it?

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------

On 2021. Nov 15., at 17:58, Mark Nelson <mnelson@xxxxxxxxxx> wrote:

Hi,

Compaction can block reads, but on the write path you should be able to absorb a certain amount of writes via the WAL before RocksDB starts throttling them. The larger and more numerous the WAL buffers, the more writes you can absorb, but bigger buffers also take more CPU to keep in sorted order, and more aggregate buffer space uses more RAM, so it's a double-edged sword.

I'd suggest looking at how much time you actually spend in compaction. For clusters that primarily serve block via RBD, there's a good chance it's fairly minimal. For RGW (especially with lots of small objects and/or erasure coding) you might be spending more time in compaction, but it's important to measure how much.
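To make that concrete, below is a rough, minimal sketch of the kind of per-log summary I mean. It assumes the OSD log contains RocksDB EVENT_LOG_v1 lines with a "compaction_finished" event and fields such as compaction_time_micros, total_output_size, num_input_records and num_output_records; those names come from RocksDB's event log and can vary between versions, so treat it as illustrative only rather than something to rely on:

    #!/usr/bin/env python3
    # Rough sketch: total up RocksDB compaction events from a Ceph OSD log.
    # Assumes EVENT_LOG_v1 JSON lines with event == "compaction_finished";
    # field names may differ across RocksDB versions.
    import json
    import re
    import sys

    EVENT_RE = re.compile(r'EVENT_LOG_v1\s+(\{.*\})')

    def summarize(path):
        events = 0
        total_time_s = 0.0
        total_output_mb = 0.0
        input_records = 0
        output_records = 0
        with open(path) as f:
            for line in f:
                m = EVENT_RE.search(line)
                if not m:
                    continue
                try:
                    ev = json.loads(m.group(1))
                except ValueError:
                    continue
                if ev.get('event') != 'compaction_finished':
                    continue
                events += 1
                total_time_s += ev.get('compaction_time_micros', 0) / 1e6
                total_output_mb += ev.get('total_output_size', 0) / (1024 * 1024)
                input_records += ev.get('num_input_records', 0)
                output_records += ev.get('num_output_records', 0)
        print('Number of Compaction Events     ', events)
        print('Total Compaction Time (seconds) ', total_time_s)
        print('Total Output Size (MB)          ', total_output_mb)
        print('Total Input Records             ', input_records)
        print('Total Output Records            ', output_records)

    if __name__ == '__main__':
        summarize(sys.argv[1])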
FWIW, you can try running the following script against your OSD log to see a summary of compaction events:

https://github.com/ceph/cbt/blob/master/tools/ceph_rocksdb_log_parser.py

Mark

On 11/15/21 10:48 AM, Szabo, Istvan (Agoda) wrote:

Hello,

If I'm not mistaken, in my cluster compaction can block IO on an OSD that holds a huge amount of objects. How can I change the values to minimise the impact? I guess it needs an OSD restart to take effect, and the “rocksdb options” are the values that need to be tuned, but what should they be changed to?

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------
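On the "what should be changed" question above: the RocksDB options Ceph passes to BlueStore live in the bluestore_rocksdb_options setting, and changing it does require an OSD restart. As a hedged sketch only, the WAL-related knobs Mark refers to are the standard RocksDB write_buffer_size and max_write_buffer_number parameters; the values below are placeholders rather than recommendations, and setting the option replaces the entire default string, so start from the defaults of your release (see ceph config help bluestore_rocksdb_options):

    [osd]
    # Illustrative only: bigger/more write buffers absorb more writes before
    # RocksDB starts throttling, at the cost of extra RAM and CPU.
    # This string replaces the built-in default, so merge it with the
    # default options for your Ceph release before applying.
    bluestore_rocksdb_options = compression=kNoCompression,write_buffer_size=268435456,max_write_buffer_number=4,min_write_buffer_number_to_merge=1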