Re: Ceph OSDs take 10+ minutes to start on reboot

Hi Igor,

> Curious what happened to these OSDs and/or node before the first issue
> appeared? Ceph upgrade? Unexpected shutdown? Anything else notable?

We originally set Ceph up with some drives that we later decided not to use.
Removing all of those drives went a bit wayward, and we ended up completely
reinstalling Ceph on each node.

> Have you tried restarting a single OSD multiple times and seen that long
> delay every time? Or was that just a single attempt per OSD?

It's only something I'd noticed when rebooting our pve01 node. However,
having just tested an OSD on another node, we are seeing the same thing.
I'm currently waiting for an OSD to restart:

2022-03-16T15:26:08.853+0000 7f0a2f26bf00  0 osd.10:6.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2022-03-16T15:26:08.853+0000 7f0a2f26bf00  1 bluestore(/var/lib/ceph/osd/ceph-10) _set_cache_sizes cache_size 3221225472 meta 0.45 kv 0.45 data 0.06
2022-03-16T15:26:09.181+0000 7f0a2f26bf00  0 osd.10:7.OSDShard using op scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2022-03-16T15:26:09.181+0000 7f0a2f26bf00  1 bluestore(/var/lib/ceph/osd/ceph-10) _mount path /var/lib/ceph/osd/ceph-10
2022-03-16T15:26:09.181+0000 7f0a2f26bf00  0 bluestore(/var/lib/ceph/osd/ceph-10) _open_db_and_around read-only:0 repair:0
2022-03-16T15:26:09.181+0000 7f0a2f26bf00  1 bluestore(/var/lib/ceph/osd/ceph-10) _set_cache_sizes cache_size 3221225472 meta 0.45 kv 0.45 data 0.06
2022-03-16T15:26:09.533+0000 7f0a2f26bf00  1 bluestore(/var/lib/ceph/osd/ceph-10) _prepare_db_environment set db_paths to db,3648715856281 db.slow,3648715856281

Once this has restarted I'll test to see if it happens every time.
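
For reference, here's roughly how one could time repeated restarts of a
single OSD (just a sketch, assuming systemd-managed OSDs; osd.10 is simply
the one from the log above, and the grep assumes its line in ceph osd tree
is unambiguous):

# restart osd.10 three times and record how long each cycle takes
for i in 1 2 3; do
    start=$(date +%s)
    systemctl restart ceph-osd@10
    # wait until the OSD reports "up" in the cluster map again
    until ceph osd tree | grep -w 'osd\.10' | grep -q up; do
        sleep 5
    done
    echo "attempt $i: $(( $(date +%s) - start ))s"
done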

> And did you miss my question about the Ceph version?

We're currently running 16.2.7.
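
For what it's worth, ceph versions summarises what the running daemons
actually report, which is a quick way to confirm every OSD really is on
16.2.7:

ceph versions              # per-daemon-type summary of running versions
ceph tell 'osd.*' version  # or ask each OSD individually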

Appreciate your help,
Chris.


On Wed, 16 Mar 2022 at 15:07, Igor Fedotov <igor.fedotov@xxxxxxxx> wrote:

> Chris,
>
>
> Hmm... I can't see any fsck output, so that's apparently not the root
> cause.
>
> Curious what happened to these OSDs and/or node before the first issue
> appeared? Ceph upgrade? Unexpected shutdown? Anything else notable?
>
> Have you tried restarting a single OSD multiple times and seen that long
> delay every time? Or was that just a single attempt per OSD?
>
> What about the other nodes? Have you not seen any issues with them, or
> have you just not tried?
>
> And did you miss my question about the Ceph version?
>
>
> Thanks,
>
> Igor
>
>
> On 3/16/2022 5:37 PM, Chris Page wrote:
> > Thanks Igor,
> >
> > So I turned the debug level up to 5 and rebooted, and suddenly the OSDs
> > are coming back in no time again. Might this be because they were so
> > recently rebooted? I've added the log with debug below:
> >
> > 2022-03-16T14:31:30.031+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _mount path /var/lib/ceph/osd/ceph-9
> > 2022-03-16T14:31:30.031+0000 7f739fd28f00  0 bluestore(/var/lib/ceph/osd/ceph-9) _open_db_and_around read-only:0 repair:0
> > 2022-03-16T14:31:30.031+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _set_cache_sizes cache_size 3221225472 meta 0.45 kv 0.45 data 0.06
> > 2022-03-16T14:31:30.083+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _prepare_db_environment set db_paths to db,3648715856281 db.slow,3648715856281
> > 2022-03-16T14:31:45.884+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _open_db opened rocksdb path db options compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,write_buffer_size=64M,compaction_readahead_size=2M
> > 2022-03-16T14:31:45.884+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _open_super_meta old nid_max 164243
> > 2022-03-16T14:31:45.884+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _open_super_meta old blobid_max 97608
> > 2022-03-16T14:31:45.896+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _open_super_meta freelist_type bitmap
> > 2022-03-16T14:31:45.896+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _open_super_meta ondisk_format 4 compat_ondisk_format 3
> > 2022-03-16T14:31:45.896+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _open_super_meta min_alloc_size 0x1000
> > 2022-03-16T14:31:45.904+0000 7f739fd28f00  1 freelist init
> > 2022-03-16T14:31:45.904+0000 7f739fd28f00  1 freelist _read_cfg
> > 2022-03-16T14:31:45.904+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _init_alloc opening allocation metadata
> > 2022-03-16T14:31:50.032+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _init_alloc loaded 3.2 TiB in 772662 extents, allocator type hybrid, capacity 0x37e3ec00000, block size 0x1000, free 0x328e41eb000, fragmentation 0.000910959
> > 2022-03-16T14:31:50.172+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _prepare_db_environment set db_paths to db,3648715856281 db.slow,3648715856281
> > 2022-03-16T14:32:04.996+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _open_db opened rocksdb path db options compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,write_buffer_size=64M,compaction_readahead_size=2M
> > 2022-03-16T14:32:04.996+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _upgrade_super from 4, latest 4
> > 2022-03-16T14:32:04.996+0000 7f739fd28f00  1 bluestore(/var/lib/ceph/osd/ceph-9) _upgrade_super done
> > 2022-03-16T14:32:05.036+0000 7f7393a19700  5 bluestore.MempoolThread(0x55b44a7d0b90) _resize_shards cache_size: 2843595409 kv_alloc: 1174405120 kv_used: 2484424 kv_onode_alloc: 184549376 kv_onode_used: 4864776 meta_alloc: 1174405120 meta_used: 20181 data_alloc: 218103808 data_used: 0
> > 2022-03-16T14:32:05.040+0000 7f739fd28f00  0 _get_class not permitted to load sdk
> > 2022-03-16T14:32:05.044+0000 7f739fd28f00  0 _get_class not permitted to load lua
> > 2022-03-16T14:32:05.044+0000 7f739fd28f00  0 _get_class not permitted to load kvs
>
> --
> Igor Fedotov
> Ceph Lead Developer
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


