I checked the cluster for other snaptrim operations and they happen all
over the place, so to me it looks like they just happened to be running
when the issue occurred, and were not the driving factor.

On Tue, Sep 13, 2022 at 12:04 Boris Behrens <bb@xxxxxxxxx> wrote:

> Because someone mentioned that the attachments did not go through, I
> created pastebin links:
>
> monlog: https://pastebin.com/jiNPUrtL
> osdlog: https://pastebin.com/dxqXgqDz
>
> On Tue, Sep 13, 2022 at 11:43 Boris Behrens <bb@xxxxxxxxx> wrote:
>
>> Hi, I need your help really badly.
>>
>> We are currently experiencing very bad cluster hangups that happen
>> sporadically (once on 2022-09-08 around midday, 48 hours after the
>> upgrade, and once on 2022-09-12 in the evening).
>> We use krbd without cephx for the qemu clients, and when the OSDs get
>> laggy the krbd connection comes to a grinding halt, to the point that
>> all IO stalls and we cannot even unmap the rbd device.
>>
>> From the logs it looks like the cluster starts to snaptrim a lot of
>> PGs, then those PGs become laggy, and the cluster snowballs into
>> laggy OSDs. I have attached the monitor log and the log of one OSD
>> around the time it happened.
>>
>> - Is this a known issue?
>> - What can I do to debug it further?
>> - Can I downgrade back to Nautilus?
>> - Should I increase pg_num for the pool to 4096 or 8192?
>>
>> The cluster contains a mixture of 2, 4, and 8 TB SSDs (no rotating
>> disks), where the 8 TB disks hold ~120 PGs and the 2 TB disks ~30 PGs.
>> All hosts have at least 128 GB RAM, and the kernel logs of all ceph
>> hosts show nothing for the timeframe.
>>
>> Cluster stats:
>>
>>   cluster:
>>     id:     74313356-3b3d-43f3-bce6-9fb0e4591097
>>     health: HEALTH_OK
>>
>>   services:
>>     mon: 3 daemons, quorum ceph-rbd-mon4,ceph-rbd-mon5,ceph-rbd-mon6 (age 25h)
>>     mgr: ceph-rbd-mon5(active, since 4d), standbys: ceph-rbd-mon4, ceph-rbd-mon6
>>     osd: 149 osds: 149 up (since 6d), 149 in (since 7w)
>>
>>   data:
>>     pools:   4 pools, 2241 pgs
>>     objects: 25.43M objects, 82 TiB
>>     usage:   231 TiB used, 187 TiB / 417 TiB avail
>>     pgs:     2241 active+clean
>>
>>   io:
>>     client: 211 MiB/s rd, 273 MiB/s wr, 1.43k op/s rd, 8.80k op/s wr
>>
>> --- RAW STORAGE ---
>> CLASS  SIZE     AVAIL    USED     RAW USED  %RAW USED
>> ssd    417 TiB  187 TiB  230 TiB   231 TiB      55.30
>> TOTAL  417 TiB  187 TiB  230 TiB   231 TiB      55.30
>>
>> --- POOLS ---
>> POOL                   ID  PGS   STORED   OBJECTS  USED     %USED  MAX AVAIL
>> isos                    7    64  455 GiB  117.92k  1.3 TiB   1.17     38 TiB
>> rbd                     8  2048   76 TiB   24.65M  222 TiB  66.31     38 TiB
>> archive                 9   128  2.4 TiB  669.59k  7.3 TiB   6.06     38 TiB
>> device_health_metrics  10     1   25 MiB      149   76 MiB      0     38 TiB
>>
>> --
>> This time, as an exception, the self-help group "UTF-8-Probleme" will
>> meet in the groüen hall.
>
> --
> This time, as an exception, the self-help group "UTF-8-Probleme" will
> meet in the groüen hall.

--
This time, as an exception, the self-help group "UTF-8-Probleme" will
meet in the groüen hall.
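
For reference, a minimal sketch of the snaptrim check mentioned at the
top of the thread, assuming a Pacific-era ceph CLI (`ceph pg ls` accepts
PG states as filters; the exact state names available can vary by
release):

    # PGs currently trimming, and PGs queued to trim
    ceph pg ls snaptrim
    ceph pg ls snaptrim_wait

    # or grep the full PG dump for any snaptrim-related state
    ceph pg dump pgs 2>/dev/null | grep snaptrim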
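
Likewise, a sketch of the unmap step that stalls (the device path
/dev/rbd0 is hypothetical); krbd supports a force option for detaching a
wedged device, at the cost of failing any outstanding IO:

    rbd unmap /dev/rbd0

    # last resort when the device no longer responds; in-flight IO is lost
    rbd unmap -o force /dev/rbd0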
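
And a sketch of the pg_num change asked about in the quoted mail,
assuming the target is the rbd pool from the stats above; since
Nautilus, pgp_num follows automatically and the mgr splits PGs
gradually, so the change itself adds backfill load for a while:

    ceph osd pool set rbd pg_num 4096

    # watch the splits progress
    ceph osd pool get rbd pg_num
    ceph -s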