Re: cannot create new OSDs - ceph version 17.2.6 (810db68029296377607028a6c6da1ec06f5a2b27) quincy (stable)


 



Hi Martin,

it looks like you're using custom OSD settings, namely:

- bluestore_allocator set to bitmap (which is fine)

- bluestore_min_alloc_size set to 128K

The latter is apparently out of sync with bluefs_shared_alloc_size (64K by default), which causes the assertion at some point due to an unexpectedly aligned allocation request.

Generally, bluefs_shared_alloc_size should be equal to or higher than (and aligned to) bluestore_min_alloc_size.
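
For reference, a sketch of how to confirm the mismatch for this OSD, assuming the values live in the cluster's central config store rather than in ceph.conf:

    ceph config get osd.43 bluestore_min_alloc_size   # would show 131072 if set to 128K
    ceph config get osd.43 bluefs_shared_alloc_size   # 65536 (64K) by default

If they were set in /etc/ceph/ceph.conf instead (common on Proxmox), check its [osd]/[global] sections.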

I'm curious why you raised bluestore_min_alloc_size, though. I can't recall any case where we've seen a benefit from that, particularly for SSD devices...

I'd recommend setting it back to the default.
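
A minimal sketch of reverting it, assuming it was set via the central config store; note that bluestore_min_alloc_size only takes effect when an OSD is created, so the affected OSD has to be recreated afterwards:

    ceph config rm osd bluestore_min_alloc_size      # or osd.43, if it was set per OSD
    # then recreate the OSD so it is formatted with the default (4K for SSDs on Quincy)

If it was put into ceph.conf instead, drop it from there before recreating the OSD.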


Thanks,

Igor

On 12/09/2023 19:44, Konold, Martin wrote:
Hi Igor,

I recreated the log with full debugging enabled.

https://www.konsec.com/download/full-debug-20-ceph-osd.43.log.gz

and another without the debug settings

https://www.konsec.com/download/failed-ceph-osd.43.log.gz

I hope you can draw some conclusions from it; I'm looking forward to your response.

Regards
ppa. Martin Konold

--
Martin Konold - Prokurist, CTO
KONSEC GmbH - make things real
Amtsgericht Stuttgart, HRB 23690
Geschäftsführer: Andreas Mack
Im Köller 3, 70794 Filderstadt, Germany

On 2023-09-11 22:08, Igor Fedotov wrote:
Hi Martin,

could you please share the full existing log, and also set debug_bluestore and debug_bluefs to 20 and collect a new OSD startup log?
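
For reference, on a packaged (non-cephadm) deployment like Proxmox this would be something along the lines of the following sketch, assuming osd.43 is the failing OSD:

    ceph config set osd.43 debug_bluestore 20/20
    ceph config set osd.43 debug_bluefs 20/20
    systemctl restart ceph-osd@43
    # then grab /var/log/ceph/ceph-osd.43.log from the startup attempt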


Thanks,

Igor

On 11/09/2023 20:53, Konold, Martin wrote:
Hi,

I want to create a new OSD on a 4TB Samsung MZ1L23T8HBLA-00A07 enterprise NVMe device in a hyper-converged Proxmox 8 environment.

Creating the OSD works, but it cannot be initialized and therefore cannot be started.

In the log I see an entry about a failed assert.

./src/os/bluestore/fastbmap_allocator_impl.cc: 405: FAILED ceph_assert((aligned_extent.length % l0_granularity) == 0)

Is this the culprit?

In addition, at the end of the log file a failed mount and a failed OSD init are mentioned.

2023-09-11T16:30:04.708+0200 7f99aa28f3c0 -1 bluefs _check_allocations OP_FILE_UPDATE_INC invalid extent 1: 0x140000~10000: duplicate reference, ino 30
2023-09-11T16:30:04.708+0200 7f99aa28f3c0 -1 bluefs mount failed to replay log: (14) Bad address
2023-09-11T16:30:04.708+0200 7f99aa28f3c0 20 bluefs _stop_alloc
2023-09-11T16:30:04.708+0200 7f99aa28f3c0 -1 bluestore(/var/lib/ceph/osd/ceph-43) _open_bluefs failed bluefs mount: (14) Bad address
2023-09-11T16:30:04.708+0200 7f99aa28f3c0 10 bluefs maybe_verify_layout no memorized_layout in bluefs superblock
2023-09-11T16:30:04.708+0200 7f99aa28f3c0 -1 bluestore(/var/lib/ceph/osd/ceph-43) _open_db failed to prepare db environment:
2023-09-11T16:30:04.708+0200 7f99aa28f3c0  1 bdev(0x5565c261fc00 /var/lib/ceph/osd/ceph-43/block) close
2023-09-11T16:30:04.940+0200 7f99aa28f3c0 -1 osd.43 0 OSD:init: unable to mount object store
2023-09-11T16:30:04.940+0200 7f99aa28f3c0 -1  ** ERROR: osd init failed: (5) Input/output error

I verified that the hardware of the new NVMe device is working fine.




