I was trying to keep things clear, and I was aware of the login issue with the pastebin. Sorry about that.

You're right that the OSDs are not full. They do need balancing, but I can't activate the balancer because of this issue.

ceph osd df tree | grep 'CLASS\|ssd'
 ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META     AVAIL    %USE   VAR   PGS  STATUS  TYPE NAME
 19  ssd    0.87320  1.00000   894 GiB  401 GiB  155 GiB  238 GiB  8.6 GiB  493 GiB  44.88  0.83  102  up      osd.19
208  ssd    0.87329  1.00000   894 GiB  229 GiB  112 GiB  116 GiB  1.5 GiB  665 GiB  25.64  0.48   95  up      osd.208
209  ssd    0.87329  1.00000   894 GiB  228 GiB  110 GiB  115 GiB  3.3 GiB  666 GiB  25.54  0.48   65  up      osd.209
199  ssd    0.87320  1.00000   894 GiB  348 GiB  155 GiB  191 GiB  1.3 GiB  546 GiB  38.93  0.72  103  up      osd.199
202  ssd    0.87329  1.00000   894 GiB  340 GiB  116 GiB  223 GiB  1.7 GiB  554 GiB  38.04  0.71   97  up      osd.202
218  ssd    0.87329  1.00000   894 GiB  214 GiB   95 GiB  118 GiB  839 MiB  680 GiB  23.92  0.44   37  up      osd.218
 39  ssd    0.87320  1.00000   894 GiB  381 GiB  114 GiB  261 GiB  6.4 GiB  514 GiB  42.57  0.79   91  up      osd.39
207  ssd    0.87329  1.00000   894 GiB  277 GiB  115 GiB  155 GiB  6.2 GiB  618 GiB  30.94  0.58   81  up      osd.207
210  ssd    0.87329  1.00000   894 GiB  346 GiB  138 GiB  207 GiB  1.6 GiB  548 GiB  38.73  0.72   99  up      osd.210
 59  ssd    0.87320  1.00000   894 GiB  423 GiB  166 GiB  254 GiB  2.9 GiB  471 GiB  47.29  0.88   97  up      osd.59
203  ssd    0.87329  1.00000   894 GiB  363 GiB  127 GiB  229 GiB  7.7 GiB  531 GiB  40.63  0.76  104  up      osd.203
211  ssd    0.87329  1.00000   894 GiB  257 GiB   76 GiB  179 GiB  1.9 GiB  638 GiB  28.70  0.53   81  up      osd.211
 79  ssd    0.87320  1.00000   894 GiB  459 GiB  144 GiB  313 GiB  2.0 GiB  435 GiB  51.32  0.95  102  up      osd.79
206  ssd    0.87329  1.00000   894 GiB  339 GiB  140 GiB  197 GiB  2.0 GiB  556 GiB  37.88  0.70   94  up      osd.206
212  ssd    0.87329  1.00000   894 GiB  301 GiB  107 GiB  192 GiB  1.5 GiB  593 GiB  33.68  0.63   80  up      osd.212
 99  ssd    0.87320  1.00000   894 GiB  282 GiB   96 GiB  180 GiB  6.2 GiB  612 GiB  31.59  0.59   85  up      osd.99
205  ssd    0.87329  1.00000   894 GiB  309 GiB  115 GiB  186 GiB  7.5 GiB  585 GiB  34.56  0.64   95  up      osd.205
213  ssd    0.87329  1.00000   894 GiB  335 GiB  119 GiB  213 GiB  2.5 GiB  559 GiB  37.44  0.70   95  up      osd.213
114  ssd    0.87329  1.00000   894 GiB  374 GiB  163 GiB  207 GiB  3.9 GiB  520 GiB  41.84  0.78   99  up      osd.114
200  ssd    0.87329  1.00000   894 GiB  271 GiB  104 GiB  163 GiB  3.0 GiB  624 GiB  30.26  0.56   90  up      osd.200
214  ssd    0.87329  1.00000   894 GiB  336 GiB  135 GiB  199 GiB  2.7 GiB  558 GiB  37.59  0.70  100  up      osd.214
139  ssd    0.87320  1.00000   894 GiB  320 GiB  128 GiB  189 GiB  3.6 GiB  574 GiB  35.82  0.67   96  up      osd.139
204  ssd    0.87329  1.00000   894 GiB  362 GiB  153 GiB  206 GiB  3.1 GiB  532 GiB  40.47  0.75  104  up      osd.204
215  ssd    0.87329  1.00000   894 GiB  236 GiB   99 GiB  133 GiB  3.4 GiB  659 GiB  26.35  0.49   81  up      osd.215
119  ssd    0.87329  1.00000   894 GiB  242 GiB  139 GiB  101 GiB  2.1 GiB  652 GiB  27.09  0.50   99  up      osd.119
159  ssd    0.87329  1.00000   894 GiB  253 GiB  127 GiB  123 GiB  2.7 GiB  642 GiB  28.25  0.53   93  up      osd.159
216  ssd    0.87329  1.00000   894 GiB  378 GiB  137 GiB  239 GiB  1.8 GiB  517 GiB  42.22  0.79  101  up      osd.216
179  ssd    0.87329  1.00000   894 GiB  473 GiB  112 GiB  348 GiB   12 GiB  421 GiB  52.91  0.98  104  up      osd.179
201  ssd    0.87329  1.00000   894 GiB  348 GiB  137 GiB  203 GiB  8.5 GiB  546 GiB  38.92  0.72  103  up      osd.201
217  ssd    0.87329  1.00000   894 GiB  301 GiB  105 GiB  194 GiB  2.5 GiB  593 GiB  33.64  0.63   89  up      osd.217
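For reference, this is roughly how I check the imbalance and how I intend to enable the balancer once the allocator problem is sorted out. This is only a sketch; I'm assuming upmap mode, which needs all clients to be Luminous or newer:

# read-only: current distribution score and balancer state
ceph balancer eval
ceph balancer status

# what I would run to enable automatic balancing in upmap mode
ceph osd set-require-min-compat-client luminous
ceph balancer mode upmap
ceph balancer on

The first two commands don't change anything, so they are safe to run even while the balancer stays off.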
prosergey07 <prosergey07@xxxxxxxxx> wrote on Tue, 9 Nov 2021 at 03:02:

> Are those problematic OSDs getting almost full? I do not have an Ubuntu
> account to check their pastebin.
>
> Sent from a Galaxy device
>
> -------- Original message --------
> From: mhnx <morphinwithyou@xxxxxxxxx>
> Date: 08.11.21 15:31 (GMT+02:00)
> To: Ceph Users <ceph-users@xxxxxxx>
> Subject: allocate_bluefs_freespace failed to allocate
>
> Hello.
>
> I'm using Nautilus 14.2.16.
> I have 30 SSDs in my cluster and I use them as BlueStore OSDs for the RGW
> index. Almost every week I lose an OSD (it goes down), and when I check
> the OSD log I see:
>
> -6> 2021-11-06 19:01:10.854 7fa799989c40  1 bluefs _allocate failed to
> allocate 0xf4f04 on bdev 1, free 0xb0000; fallback to bdev 2
> -5> 2021-11-06 19:01:10.854 7fa799989c40  1 bluefs _allocate unable to
> allocate 0xf4f04 on bdev 2, free 0xffffffffffffffff; fallback to slow
> device expander
> -4> 2021-11-06 19:01:10.854 7fa799989c40 -1 bluestore(/var/lib/ceph/osd/ceph-218)
> allocate_bluefs_freespace failed to allocate on 0x80000000 min_size
> 0x100000 > allocated total 0x0 bluefs_shared_alloc_size 0x10000
> allocated 0x0 available 0xa497aab000
> -3> 2021-11-06 19:01:10.854 7fa799989c40 -1 bluefs _allocate failed to
> expand slow device to fit +0xf4f04
>
> Full log: https://paste.ubuntu.com/p/MpJfVjMh7V/plain/
>
> The OSD does not start without an offline compaction.
> Offline compaction log: https://paste.ubuntu.com/p/vFZcYnxQWh/plain/
>
> After the offline compaction I tried to start the OSD with the bitmap
> allocator, but it did not come up because of "FAILED ceph_assert(available
> >= allocated)".
> Log: https://paste.ubuntu.com/p/2Bbx983494/plain/
>
> Then I started the OSD with the hybrid allocator and let it recover.
> When the recovery was done I stopped the OSD and started it with the
> bitmap allocator. This time it came up, but I got "80 slow ops, oldest one
> blocked for 116 sec, osd.218 has slow ops", so I increased
> osd_recovery_sleep to 10 to give the cluster a breather, and the cluster
> marked the OSD down (it was still working); after a while the OSD was
> marked up again and the cluster went back to normal. While it was
> recovering, other OSDs started to report slow ops, and I played with
> osd_recovery_sleep between 0.1 and 10 to keep the cluster stable until the
> recovery finished.
>
> Ceph osd df tree before: https://paste.ubuntu.com/p/4K7JXcZ8FJ/plain/
> Ceph osd df tree after osd.218 = bitmap:
> https://paste.ubuntu.com/p/5SKbhrbgVM/plain/
>
> If I want to change every other OSD's allocator to bitmap, I need to
> repeat this process 29 times and it will take too much time.
> I don't want to keep healing OSDs with offline compaction; I will do it if
> that's the solution, but I want to be sure before doing a lot of work, and
> maybe with this issue I can provide helpful logs and information for the
> developers.
>
> Have a nice day.
> Thanks.
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
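PS: For anyone hitting the same problem, this is the rough per-OSD sequence I have been repeating to move one OSD to the bitmap allocator. It is only a sketch: I'm assuming systemd-managed OSDs, the default /var/lib/ceph/osd/ceph-<id> data path, and osd.218 as the example.

# prevent the cluster from marking the OSD out and rebalancing while it is stopped
ceph osd set noout
systemctl stop ceph-osd@218

# offline compaction of the OSD's RocksDB (the step each OSD currently needs before it will start)
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-218 compact

# switch the allocator for this OSD only, then bring it back
ceph config set osd.218 bluestore_allocator bitmap
systemctl start ceph-osd@218
ceph osd unset noout

# throttle recovery if slow ops appear, relax it again once things settle
ceph config set osd.218 osd_recovery_sleep 0.1

With 30 OSDs that is still 29 more rounds of compaction and recovery, which is exactly the amount of work I'd like to avoid if there is a better fix.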