Dear Igor, thanks a lot for your assistance. We're still trying to
bring OSDs back up... the cluster is not in a great shape right now.

> so from the log in the ticket I can see a huge (400+ MB) bluefs log
> kept over many small non-adjacent extents.
> Presumably it was caused by either setting small bluefs_alloc_size or
> high disk space fragmentation or both. Now I'd like more details on
> your OSDs.

> Could you please collect OSD startup log with debug_bluefs set to 20?

Yes, I now have such a log from an OSD that crashed with the assertion
in the subject after about 30 seconds. The log file is about 850'000
lines / 100 MB in size. How can I make it available to you?

> Also please run the following commands for the broken OSD (need
> results only, no need to collect the log unless they're failing):

> ceph-bluestore-tool --path <path-to-osd> --command bluefs-bdev-sizes

----------------------------------------------------------------------
inferring bluefs devices from bluestore path
 slot 2 /var/lib/ceph/osd/ceph-46/block -> /dev/dm-7
 slot 1 /var/lib/ceph/osd/ceph-46/block.db -> /dev/dm-17
1 : device size 0xa74c00000 : own 0x[2000~6b4bfe000] = 0x6b4bfe000 : using 0x6b4bfe000(27 GiB)
2 : device size 0x74702000000 : own 0x[37e3e600000~4a85400000] = 0x4a85400000 : using 0x4a85400000(298 GiB)
----------------------------------------------------------------------

> ceph-bluestore-tool --path <path-to-osd> --command free-score

----------------------------------------------------------------------
block:
{
    "fragmentation_rating": 0.84012572151981013
}
bluefs-db:
{
    "fragmentation_rating": -nan
}
failure querying 'bluefs-wal'
2020-05-29 16:31:54.882 7fec3c89cd80 -1 asok(0x55c4ec574000) AdminSocket: request '{"prefix": "bluestore allocator score bluefs-wal"}' not defined
----------------------------------------------------------------------

See anything interesting?
--
Simon.

> Thanks,
> Igor

> On 5/29/2020 1:05 PM, Simon Leinen wrote:
>> Colleague of Harry's here...
>>
>> Harald Staub writes:
>>> This is again about our bad cluster, with too many objects, and the
>>> hdd OSDs have a DB device that is (much) too small (e.g. 20 GB, i.e.
>>> 3 GB usable). Now several OSDs do not come up any more.
>>> Typical error message:
>>> /build/ceph-14.2.8/src/os/bluestore/BlueFS.cc: 2261: FAILED
>>> ceph_assert(h->file->fnode.ino != 1)
>>
>> The context of that line is "we should never run out of log space here":
>>
>>   // previously allocated extents.
>>   bool must_dirty = false;
>>   if (allocated < offset + length) {
>>     // we should never run out of log space here; see the min runway check
>>     // in _flush_and_sync_log.
>>     ceph_assert(h->file->fnode.ino != 1);
>>
>> So I guess we are violating that "should", and the BlueStore code
>> doesn't handle that case. And the "min runway" check may not be
>> reliable. Should we file a bug?
>>
>> Again, help on how to proceed would be greatly appreciated...
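
PS: to save others the hex arithmetic on the bluefs-bdev-sizes output
above, here is a minimal standalone C++ sketch (a hypothetical helper,
not part of Ceph; the byte counts are just copied from the output)
that converts the reported sizes:

  // hexsizes.cc -- decode the hex byte counts printed by
  // "ceph-bluestore-tool --command bluefs-bdev-sizes" into GiB.
  // Standalone illustration; values copied from the output above.
  #include <cstdint>
  #include <cstdio>

  int main() {
      struct { const char* label; uint64_t bytes; } vals[] = {
          {"dev 1 (block.db) size ", 0xa74c00000},   // ~41.8 GiB
          {"dev 1 owned by bluefs ", 0x6b4bfe000},   // ~26.8 GiB ("27 GiB")
          {"dev 2 (block) size    ", 0x74702000000}, // ~7.3 TiB
          {"dev 2 owned by bluefs ", 0x4a85400000},  // ~298 GiB
      };
      for (const auto& v : vals)
          std::printf("%s: %llu bytes = %.1f GiB\n", v.label,
                      (unsigned long long)v.bytes,
                      v.bytes / (double)(1ULL << 30));
  }

If I read that right, BlueFS owns about 27 of the ~42 GiB on the DB
device, plus a single ~298 GiB extent on the main (hdd) device, i.e.
the DB has already spilled over onto the slow device.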
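
PPS: for anyone following the assertion discussion, this is how I
understand the invariant behind that assert. A heavily simplified
sketch (the names and structure are made up; this is not the actual
BlueFS code):

  // Illustration of the invariant behind the failed assert; not
  // actual BlueFS code -- names and structure are invented.
  #include <cassert>
  #include <cstdint>

  struct File {
      uint64_t ino;        // ino 1 is the bluefs log itself
      uint64_t allocated;  // bytes of extents already reserved
  };

  // _flush_and_sync_log is supposed to keep a minimum "runway" of
  // reserved-but-unwritten log space before more data is appended.
  void ensure_runway(File& log, uint64_t used, uint64_t min_runway) {
      if (log.allocated - used < min_runway) {
          // allocate more extents here; on a badly fragmented device
          // the allocator may fail to provide them
      }
  }

  void flush_range(File& f, uint64_t offset, uint64_t length) {
      if (f.allocated < offset + length) {
          // Allocating mid-flush is fine for ordinary files, but must
          // never happen for the log: the runway check should already
          // have guaranteed the space. This is the assert we hit.
          assert(f.ino != 1);
          // ... allocate new extents for ordinary files ...
      }
      // ... write the data ...
  }

  int main() {
      File log{1, 4096};           // the log file, 4 KiB reserved
      flush_range(log, 4096, 512); // append outruns the reservation:
  }                                // aborts, much like our OSDs do

If that reading is correct, our OSDs hit the assert because the log
outran its preallocated extents, which would fit Igor's fragmentation
hypothesis above.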