Re: Ceph OSDs take 10+ minutes to start on reboot

Interestingly, this took a really long time (22 minutes overall).

However, since that restart, OSD 10 now restarts with no problem. I've put
the log below...

2022-03-16T15:26:09.181+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _mount path /var/lib/ceph/osd/ceph-10
2022-03-16T15:26:09.181+0000 7f0a2f26bf00  0
bluestore(/var/lib/ceph/osd/ceph-10) _open_db_and_around read-only:0
repair:0
2022-03-16T15:26:09.181+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _set_cache_sizes cache_size 3221225472
meta 0.45 kv 0.45 data 0.06
2022-03-16T15:26:09.533+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _prepare_db_environment set db_paths
to db,3648715856281 db.slow,3648715856281
2022-03-16T15:37:09.709+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _open_db opened rocksdb path db
options
compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,write_buffer_size=64M,compaction_readahead_size=2M
2022-03-16T15:37:09.725+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _open_super_meta old nid_max 158590
2022-03-16T15:37:09.725+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _open_super_meta old blobid_max 72235
2022-03-16T15:37:09.725+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _open_super_meta freelist_type bitmap
2022-03-16T15:37:09.725+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _open_super_meta ondisk_format 4
compat_ondisk_format 3
2022-03-16T15:37:09.725+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _open_super_meta min_alloc_size 0x1000
2022-03-16T15:37:09.789+0000 7f0a2f26bf00  1 freelist init
2022-03-16T15:37:09.789+0000 7f0a2f26bf00  1 freelist _read_cfg
2022-03-16T15:37:09.789+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _init_alloc opening allocation metadata
2022-03-16T15:37:17.365+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _init_alloc loaded 3.2 TiB in 700598
extents, allocator type hybrid, capacity 0x37e3ec00000, block size 0x1000,
free 0x32cd5323000, fragmentation 0.000821991
2022-03-16T15:37:18.173+0000 7f0a2f26bf00  1 HybridAllocator
_spillover_range constructing fallback allocator
2022-03-16T15:37:18.997+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _prepare_db_environment set db_paths
to db,3648715856281 db.slow,3648715856281
2022-03-16T15:48:20.853+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _open_db opened rocksdb path db
options
compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,write_buffer_size=64M,compaction_readahead_size=2M
2022-03-16T15:48:20.853+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _upgrade_super from 4, latest 4
2022-03-16T15:48:20.853+0000 7f0a2f26bf00  1
bluestore(/var/lib/ceph/osd/ceph-10) _upgrade_super done
2022-03-16T15:48:20.909+0000 7f0a22f5c700  5
bluestore.MempoolThread(0x559d9a1a6b90) _resize_shards cache_size:
2836324857 kv_alloc: 1140850688 kv_used: 7367872 kv_onode_alloc: 167772160
kv_onode_used: 4932200 meta_alloc: 1124073472 meta_used: 19397 data_alloc:
218103808 data_used: 0
2022-03-16T15:48:20.945+0000 7f0a2f26bf00  0 <cls>
./src/cls/hello/cls_hello.cc:316: loading cls_hello
2022-03-16T15:48:20.957+0000 7f0a2f26bf00  0 <cls>
./src/cls/cephfs/cls_cephfs.cc:201: loading cephfs
2022-03-16T15:48:20.957+0000 7f0a2f26bf00  0 _get_class not permitted to
load kvs
2022-03-16T15:48:20.965+0000 7f0a2f26bf00  0 _get_class not permitted to
load lua
2022-03-16T15:48:20.969+0000 7f0a2f26bf00  0 _get_class not permitted to
load sdk
2022-03-16T15:48:20.973+0000 7f0a2f26bf00  0 osd.10 2628 crush map has
features 432629239337189376, adjusting msgr requires for clients
2022-03-16T15:48:20.973+0000 7f0a2f26bf00  0 osd.10 2628 crush map has
features 432629239337189376 was 8705, adjusting msgr requires for mons
2022-03-16T15:48:20.973+0000 7f0a2f26bf00  0 osd.10 2628 crush map has
features 3314933000854323200, adjusting msgr requires for osds
2022-03-16T15:48:21.761+0000 7f0a2f26bf00  0 osd.10 2628 load_pgs
2022-03-16T15:48:25.917+0000 7f0a22f5c700  5
bluestore.MempoolThread(0x559d9a1a6b90) _resize_shards cache_size:
2845415696 kv_alloc: 1258291200 kv_used: 145565552 kv_onode_alloc:
234881024 kv_onode_used: 64844176 meta_alloc: 1107296256 meta_used: 114923
data_alloc: 218103808 data_used: 81920
2022-03-16T15:48:28.377+0000 7f0a2f26bf00  0 osd.10 2628 load_pgs opened
140 pgs
2022-03-16T15:48:28.377+0000 7f0a2f26bf00 -1 osd.10 2628 log_to_monitors
{default=true}
2022-03-16T15:48:29.249+0000 7f0a2f26bf00  0 osd.10 2628 done with init,
starting boot process
2022-03-16T15:48:29.257+0000 7f0a27774700 -1 osd.10 2628 set_numa_affinity
unable to identify public interface '' numa node: (2) No such file or
directory
2022-03-16T15:48:29.345+0000 7f0a29778700  0 log_channel(cluster) log [DBG]
: purged_snaps scrub starts
2022-03-16T15:48:29.349+0000 7f0a29778700  0 log_channel(cluster) log [DBG]
: purged_snaps scrub ok
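
For what it's worth, both of the long pauses above sit between
_prepare_db_environment and "_open_db opened rocksdb" (15:26:09 -> 15:37:09
and again 15:37:18 -> 15:48:20), so roughly 11 minutes apiece is going into
opening RocksDB. Below is a rough Python sketch of how I'd pull the longest
gaps out of an OSD log to confirm where the time goes; the log path is just
an example for our layout, and the regex assumes the default Ceph timestamp
format shown above:

#!/usr/bin/env python3
# Rough sketch: find the longest pauses in an OSD log by comparing the
# timestamps of consecutive lines. Assumes the default Ceph log timestamp
# format (e.g. 2022-03-16T15:26:09.181+0000); the path is just an example.
import re
from datetime import datetime

LOG_PATH = "/var/log/ceph/ceph-osd.10.log"  # adjust for your cluster
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)\+\d{4}")

prev_ts, prev_line, gaps = None, None, []
with open(LOG_PATH) as log:
    for line in log:
        m = TS_RE.match(line)
        if not m:
            continue  # wrapped continuation lines have no timestamp
        ts = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S.%f")
        if prev_ts is not None:
            gaps.append((ts - prev_ts, prev_line.rstrip()))
        prev_ts, prev_line = ts, line

# Show the five longest pauses and the line each pause followed.
for delta, line in sorted(gaps, reverse=True)[:5]:
    print(delta, "after:", line[:120])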

On Wed, 16 Mar 2022 at 15:33, Chris Page <sirhc.page@xxxxxxxxx> wrote:

>
> Hi Igor,
>
> > Curious what happened to these OSDs and/or the node before the issue
> > first appeared? A Ceph upgrade? Unexpected shutdown? Anything else notable?
>
> We originally had Ceph set up with some drives that we chose not to
> pursue. On removing all of the drives, things went a bit wayward, and we
> ended up completely reinstalling Ceph on each node.
>
> > Have you tried restarting a single OSD multiple times and seen that long
> > delay every time? Or was that just a single attempt per OSD?
>
> It's only something that I've noticed when rebooting our pve01 node.
> However, having just tested an OSD on another node, we're seeing the same
> thing. I'm currently waiting on an OSD to restart:
>
> 2022-03-16T15:26:08.853+0000 7f0a2f26bf00  0 osd.10:6.OSDShard using op
> scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
> 2022-03-16T15:26:08.853+0000 7f0a2f26bf00  1
> bluestore(/var/lib/ceph/osd/ceph-10) _set_cache_sizes cache_size 3221225472
> meta 0.45 kv 0.45 data 0.06
> 2022-03-16T15:26:09.181+0000 7f0a2f26bf00  0 osd.10:7.OSDShard using op
> scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
> 2022-03-16T15:26:09.181+0000 7f0a2f26bf00  1
> bluestore(/var/lib/ceph/osd/ceph-10) _mount path /var/lib/ceph/osd/ceph-10
> 2022-03-16T15:26:09.181+0000 7f0a2f26bf00  0
> bluestore(/var/lib/ceph/osd/ceph-10) _open_db_and_around read-only:0
> repair:0
> 2022-03-16T15:26:09.181+0000 7f0a2f26bf00  1
> bluestore(/var/lib/ceph/osd/ceph-10) _set_cache_sizes cache_size 3221225472
> meta 0.45 kv 0.45 data 0.06
> 2022-03-16T15:26:09.533+0000 7f0a2f26bf00  1
> bluestore(/var/lib/ceph/osd/ceph-10) _prepare_db_environment set db_paths
> to db,3648715856281 db.slow,3648715856281
>
> Once this has restarted, I'll test to see if it happens every time.
>
> > And you missed my question about the Ceph version?
>
> We're currently running 16.2.7.
>
> Appreciate your help,
> Chris.
>
>
> On Wed, 16 Mar 2022 at 15:07, Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:
>
>> Chris,
>>
>>
>> hmm... so I can't see any fsck output, hence that's apparently not the
>> root cause.
>>
>> Curious what happened to these OSDs and/or the node before the issue
>> first appeared? A Ceph upgrade? Unexpected shutdown? Anything else notable?
>>
>> Have you tried restarting a single OSD multiple times and seen that long
>> delay every time? Or was that just a single attempt per OSD?
>>
>> What about the other nodes? Have you not seen any issues with them, or
>> have you just not tried?
>>
>> And you missed my question about the Ceph version?
>>
>>
>> Thanks,
>>
>> Igor
>>
>>
>> On 3/16/2022 5:37 PM, Chris Page wrote:
>> > Thanks Igor,
>> >
>> > So I turned the debug level up to 5 and rebooted, and suddenly the OSDs
>> > are coming back in no time again. Might this be because they were so
>> > recently rebooted? I've added the log with debug below:
>> >
>> > 2022-03-16T14:31:30.031+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _mount path /var/lib/ceph/osd/ceph-9
>> > 2022-03-16T14:31:30.031+0000 7f739fd28f00  0
>> > bluestore(/var/lib/ceph/osd/ceph-9) _open_db_and_around read-only:0
>> repair:0
>> > 2022-03-16T14:31:30.031+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _set_cache_sizes cache_size
>> 3221225472
>> > meta 0.45 kv 0.45 data 0.06
>> > 2022-03-16T14:31:30.083+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _prepare_db_environment set
>> db_paths to
>> > db,3648715856281 db.slow,3648715856281
>> > 2022-03-16T14:31:45.884+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _open_db opened rocksdb path db
>> options
>> >
>> compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,write_buffer_size=64M,compaction_readahead_size=2M
>> > 2022-03-16T14:31:45.884+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _open_super_meta old nid_max 164243
>> > 2022-03-16T14:31:45.884+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _open_super_meta old blobid_max
>> 97608
>> > 2022-03-16T14:31:45.896+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _open_super_meta freelist_type
>> bitmap
>> > 2022-03-16T14:31:45.896+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _open_super_meta ondisk_format 4
>> > compat_ondisk_format 3
>> > 2022-03-16T14:31:45.896+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _open_super_meta min_alloc_size
>> 0x1000
>> > 2022-03-16T14:31:45.904+0000 7f739fd28f00  1 freelist init
>> > 2022-03-16T14:31:45.904+0000 7f739fd28f00  1 freelist _read_cfg
>> > 2022-03-16T14:31:45.904+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _init_alloc opening allocation
>> metadata
>> > 2022-03-16T14:31:50.032+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _init_alloc loaded 3.2 TiB in 772662
>> > extents, allocator type hybrid, capacity 0x37e3ec00000, block size
>> 0x1000,
>> > free 0x328e41eb000, fragmentation 0.000910959
>> > 2022-03-16T14:31:50.172+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _prepare_db_environment set
>> db_paths to
>> > db,3648715856281 db.slow,3648715856281
>> > 2022-03-16T14:32:04.996+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _open_db opened rocksdb path db
>> options
>> >
>> compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,write_buffer_size=64M,compaction_readahead_size=2M
>> > 2022-03-16T14:32:04.996+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _upgrade_super from 4, latest 4
>> > 2022-03-16T14:32:04.996+0000 7f739fd28f00  1
>> > bluestore(/var/lib/ceph/osd/ceph-9) _upgrade_super done
>> > 2022-03-16T14:32:05.036+0000 7f7393a19700  5
>> > bluestore.MempoolThread(0x55b44a7d0b90) _resize_shards cache_size:
>> > 2843595409 kv_alloc: 1174405120 kv_used: 2484424 kv_onode_alloc:
>> 184549376
>> > kv_onode_used: 4864776 meta_alloc: 1174405120 meta_used: 20181
>> data_alloc:
>> > 218103808 data_used: 0
>> > 2022-03-16T14:32:05.040+0000 7f739fd28f00  0 _get_class not permitted to
>> > load sdk
>> > 2022-03-16T14:32:05.044+0000 7f739fd28f00  0 _get_class not permitted to
>> > load lua
>> > 2022-03-16T14:32:05.044+0000 7f739fd28f00  0 _get_class not permitted to
>> > load kvs
>> > _______________________________________________
>> > ceph-users mailing list -- ceph-users@xxxxxxx
>> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>>
>> --
>> Igor Fedotov
>> Ceph Lead Developer
>>
>> Looking for help with your Ceph cluster? Contact us at https://croit.io
>>
>> croit GmbH, Freseniusstr. 31h, 81247 Munich
>> CEO: Martin Verges - VAT-ID: DE310638492
>> Com. register: Amtsgericht Munich HRB 231263
>> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
>>
>>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


