After reading a lot about this I still don't understand how it happened and what I can do to fix it.

The following only trims the pglog, but not the duplicates:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-41 --op trim-pg-log --pgid 8.664

I also tried recreating the OSDs (sync out, crush rm, wipe disk, create new OSD, sync in), but the osd_pglog_items value keeps growing after everything is synced back in (my 8TB disks are at around 10 million items one day after I synced them back in). It has not reached the old value of around 50 million yet, but it is still growing.

Is there anything I can do on an Octopus cluster, or is upgrading the only way? And why does this happen?

On Tue, 21 Feb 2023 at 18:31, Boris Behrens <bb@xxxxxxxxx> wrote:

> Thanks a lot Josh. That really seems like my problem.
> That does not look healthy in the cluster. oof.
>
> ~# ceph tell osd.* perf dump | grep 'osd_pglog\|^osd\.[0-9]'
> osd.0: {
>     "osd_pglog_bytes": 459617868,
>     "osd_pglog_items": 2955043,
> osd.1: {
>     "osd_pglog_bytes": 598414548,
>     "osd_pglog_items": 4315956,
> osd.2: {
>     "osd_pglog_bytes": 357056504,
>     "osd_pglog_items": 1942486,
> osd.3: {
>     "osd_pglog_bytes": 436198324,
>     "osd_pglog_items": 2863501,
> osd.4: {
>     "osd_pglog_bytes": 373516972,
>     "osd_pglog_items": 2127588,
> osd.5: {
>     "osd_pglog_bytes": 335471560,
>     "osd_pglog_items": 1822608,
> osd.6: {
>     "osd_pglog_bytes": 391814808,
>     "osd_pglog_items": 2394209,
> osd.7: {
>     "osd_pglog_bytes": 541849048,
>     "osd_pglog_items": 3880437,
> ...
>
> On Tue, 21 Feb 2023 at 18:21, Josh Baergen <jbaergen@xxxxxxxxxxxxxxxx> wrote:
>
>> Hi Boris,
>>
>> This sounds a bit like https://tracker.ceph.com/issues/53729.
>> https://tracker.ceph.com/issues/53729#note-65 might help you diagnose
>> whether this is the case.
>>
>> Josh
>>
>> On Tue, Feb 21, 2023 at 9:29 AM Boris Behrens <bb@xxxxxxxxx> wrote:
>> >
>> > Hi,
>> > today I wanted to increase the PGs from 2k -> 4k, and random OSDs went
>> > offline in the cluster. After some investigation we saw that the OSDs
>> > got OOM killed (I've seen a host go from 90GB of used memory to 190GB
>> > before the OOM kills happened).
>> >
>> > We have around 24 SSD OSDs per host and 128GB/190GB/265GB memory in
>> > these hosts. All of them experienced OOM kills.
>> > All hosts are Octopus / Ubuntu 20.04.
>> >
>> > At every step new OSDs crashed with OOM (we have now set
>> > pg_num/pgp_num to 2516 to stop the process).
>> > The OSD logs do not show why this might be happening.
>> > Some OSDs also segfault.
>> >
>> > I have now started to stop all OSDs on a host and run a
>> > "ceph-bluestore-tool repair" and a "ceph-kvstore-tool bluestore-kv
>> > compact" on every OSD. This takes around 30 minutes per 8TB OSD.
>> > When I start the OSDs again I instantly get a lot of slow ops from
>> > all the other OSDs as they come up (the 8TB OSDs take around 10
>> > minutes in "load_pgs").
>> >
>> > I am unsure what I can do to restore normal cluster performance.
>> > Any ideas, suggestions, or maybe even known bugs? Or a string I
>> > could search for in the logs?
>> >
>> > Cheers
>> > Boris
>>
>
> --
> The self-help group "UTF-8-Probleme" will meet this time, as an exception,
> in the groüen hall.

--
The self-help group "UTF-8-Probleme" will meet this time, as an exception,
in the groüen hall.
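For reference: the diagnostic Josh points to in https://tracker.ceph.com/issues/53729#note-65 essentially amounts to dumping a PG's log with ceph-objectstore-tool and counting its "dups" entries. A rough sketch is below, reusing the same data path as the trim-pg-log command above; it assumes jq is installed, must be run while the OSD is stopped, and the JSON field names (pg_log_t.log / pg_log_t.dups) are an assumption that may differ between releases.

  # Run only while the OSD is down.
  OSD_PATH=/var/lib/ceph/osd/ceph-41
  for pg in $(ceph-objectstore-tool --data-path "$OSD_PATH" --op list-pgs); do
      # Dump the PG log as JSON and print how many regular log entries
      # and how many dup entries this PG carries.
      ceph-objectstore-tool --data-path "$OSD_PATH" --op log --pgid "$pg" \
        | jq -r --arg pg "$pg" \
            '"\($pg) log=\(.pg_log_t.log | length) dups=\(.pg_log_t.dups | length)"'
  done

If individual PGs report dup counts in the millions while the regular log stays in the low thousands, that matches the tracker issue: the osd_pglog_items counter from "ceph tell osd.* perf dump" is then dominated by dups rather than by the log itself.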
_______________________________________________ ceph-users mailing list -- ceph-users@xxxxxxx To unsubscribe send an email to ceph-users-leave@xxxxxxx