Re: octopus garbage collector makes slow ops

Thanks for your help. Our HDD OSDs have separate NVMe disks for DB use.

On Mon, Jul 26, 2021 at 3:49 PM Igor Fedotov <ifedotov@xxxxxxx> wrote:

> Unfortunately I'm not an expert in RGW, hence nothing to recommend from
> that side.
>
> Apparently your issues are caused by bulk data removal - it appears that
> RocksDB can hardly sustain such massive removals and its performance
> degrades. We've seen that plenty of times before.
>
> So far there are two known workarounds - manual DB compaction using
> ceph-kvstore-tool, and setting bluefs_buffered_io to true. The latter makes
> sense for some Ceph releases which have that parameter set to false by
> default; v15.2.12 is one of them. And indeed that setting might cause high
> RAM usage in some cases - you might want to look for the relevant recent
> PRs at github or ask Mark Nelson from RH for more details.
>
> Nevertheless, the current upstream recommendation/default is to have it set
> to true, as it greatly improves DB performance.
>
>
> So you might want to try to compact RocksDB as per the above, but please
> note that's a temporary workaround - the DB might start to degrade again if
> removals keep going on.
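>
> A minimal sketch of both workarounds, assuming a non-containerized
> deployment and the default OSD data path (adjust the OSD id and paths to
> your setup):
>
> ```
> # Offline compaction of one OSD's RocksDB (the OSD must be stopped first)
> systemctl stop ceph-osd@<id>
> ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> compact
> systemctl start ceph-osd@<id>
>
> # Enable buffered BlueFS reads cluster-wide; depending on the release this
> # may only take effect after an OSD restart
> ceph config set osd bluefs_buffered_io true
> ```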
>
> There is also a PR to address the bulk removal issue in general:
>
> 1) https://github.com/ceph/ceph/pull/37496 (still pending review and
> unlikely to be backported to Octopus).
>
>
> One more question - do your HDD OSDs have additional fast (SSD/NVMe)
> drives for DB volumes, or do their DBs reside on spinning drives only? If
> the latter is true, I would strongly encourage you to fix that by adding
> respective fast disks - RocksDB tends to work badly when not deployed on
> SSDs...
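>
> If unsure, the OSD metadata should show it; a rough check (the exact field
> names may differ slightly between releases):
>
> ```
> # Does osd.60 have a dedicated, non-rotational DB device?
> ceph osd metadata 60 | grep -E 'bluefs_dedicated_db|bluefs_db_rotational'
> ```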
>
>
> Thanks,
>
> Igor
>
>
> On 7/26/2021 1:28 AM, mahnoosh shahidi wrote:
>
> Hi Igor,
> Thanks for your response. This problem happens on my OSDs with HDD disks.
> I set bluefs_buffered_io to true just for these OSDs, but it caused my
> bucket index disks (which are SSD) to produce slow ops. I also tried to set
> bluefs_buffered_io to true on the bucket index OSDs, but they filled the
> entire memory (256G), so I had to set bluefs_buffered_io back to false on
> all OSDs. Is that the only way to handle the garbage collector problem? Do
> you have any ideas for the bucket index problem?
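>
> (For reference, one way to scope such a setting to the HDD OSDs only is the
> device-class mask of the config database - a rough sketch, assuming the
> OSDs carry the usual hdd/ssd device classes:)
>
> ```
> # Enable buffered BlueFS reads for HDD OSDs only, and keep it off for the
> # SSD bucket-index OSDs
> ceph config set osd/class:hdd bluefs_buffered_io true
> ceph config set osd/class:ssd bluefs_buffered_io false
> ```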
>
> On Thu, Jul 22, 2021 at 3:37 AM Igor Fedotov <ifedotov@xxxxxxx> wrote:
>
>> Hi Mahnoosh,
>>
>> you might want to set bluefs_buffered_io to true for every OSD.
>>
>> It looks like it's false by default in v15.2.12.
>>
>>
>> Thanks,
>>
>> Igor
>>
>> On 7/18/2021 11:19 PM, mahnoosh shahidi wrote:
>> > We have a Ceph cluster with 408 OSDs, 3 mons and 3 RGWs. We updated our
>> > cluster from Nautilus 14.2.14 to Octopus 15.2.12 a few days ago. After
>> > upgrading, the garbage collector process, which runs after the lifecycle
>> > process, causes slow ops and makes some OSDs restart. In each run the
>> > garbage collector deletes about 1 million objects. Below is one of the
>> > OSD's logs before it restarts.
>> >
>> > ```
>> > 2021-07-18T00:44:38.807+0430 7fd1cda76700  1 osd.60 1092400 is_healthy
>> > false -- internal heartbeat failed
>> > 2021-07-18T00:44:38.807+0430 7fd1cda76700  1 osd.60 1092400 not
>> > healthy; waiting to boot
>> > 2021-07-18T00:44:39.847+0430 7fd1cda76700  1 heartbeat_map is_healthy
>> > 'OSD::osd_op_tp thread 0x7fd1b4243700' had timed out after 15
>> > 2021-07-18T00:44:39.847+0430 7fd1cda76700  1 osd.60 1092400 is_healthy
>> > false -- internal heartbeat failed
>> > 2021-07-18T00:44:39.847+0430 7fd1cda76700  1 osd.60 1092400 not
>> > healthy; waiting to boot
>> > 2021-07-18T00:44:40.895+0430 7fd1cda76700  1 heartbeat_map is_healthy
>> > 'OSD::osd_op_tp thread 0x7fd1b4243700' had timed out after 15
>> > 2021-07-18T00:44:40.895+0430 7fd1cda76700  1 osd.60 1092400 is_healthy
>> > false -- internal heartbeat failed
>> > 2021-07-18T00:44:40.895+0430 7fd1cda76700  1 osd.60 1092400 not
>> > healthy; waiting to boot
>> > 2021-07-18T00:44:41.859+0430 7fd1cda76700  1 heartbeat_map is_healthy
>> > 'OSD::osd_op_tp thread 0x7fd1b4243700' had timed out after 15
>> > 2021-07-18T00:44:41.859+0430 7fd1cda76700  1 osd.60 1092400 is_healthy
>> > false -- internal heartbeat failed
>> > 2021-07-18T00:44:41.859+0430 7fd1cda76700  1 osd.60 1092400 not
>> > healthy; waiting to boot
>> > 2021-07-18T00:44:42.811+0430 7fd1cda76700  1 heartbeat_map is_healthy
>> > 'OSD::osd_op_tp thread 0x7fd1b4243700' had timed out after 15
>> > 2021-07-18T00:44:42.811+0430 7fd1cda76700  1 osd.60 1092400 is_healthy
>> > false -- internal heartbeat failed
>> >
>> > ```
>> > What is a suitable configuration for GC in such a heavy delete process
>> > so that it doesn't cause slow ops? We had the same delete load on
>> > Nautilus but we didn't have any problems with it.
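>> >
>> > (For context, the kind of knobs I mean are the rgw_gc_* options; a rough
>> > sketch with placeholder values, not a recommendation:)
>> >
>> > ```
>> > # Throttle how aggressively the RGW garbage collector issues deletes
>> > # (placeholder values; some of these may need an RGW restart to apply)
>> > ceph config set client.rgw rgw_gc_max_concurrent_io 5
>> > ceph config set client.rgw rgw_gc_max_trim_chunk 16
>> > ceph config set client.rgw rgw_gc_processor_max_time 1800
>> > ```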
>>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


