Re: osds won't start


 



My clusters are self-rolled.  My start command is as follows:

podman run -it --privileged --pid=host --cpuset-cpus 0,1 --memory 2g \
    --name ceph_osd0 --hostname ceph_osd0 \
    -v /dev:/dev \
    -v /etc/localtime:/etc/localtime:ro \
    -v /etc/ceph:/etc/ceph/ \
    -v /var/lib/ceph/osd/ceph-0:/var/lib/ceph/osd/ceph-0 \
    -v /var/log/ceph:/var/log/ceph \
    -v /run/udev/:/run/udev/ \
    ceph/ceph:v16.2.7-20220201 \
    ceph-osd --id 0 -c /etc/ceph/ceph.conf --cluster ceph -f


I jumped from the Octopus image to the 16.2.7 image and had been running
well for a while with no issues.  The cluster was clean: no backfills in
progress, etc.  After this latest zypper up and reboot, I have OSDs that
don't start.

podman image ls
quay.io/ceph/ceph  v16.2.7           231fd40524c4  9 days ago  1.39 GB
quay.io/ceph/ceph  v16.2.7-20220201  231fd40524c4  9 days ago  1.39 GB


bluefs fails to mount, I guess?  The device labels are still readable via
ceph-bluestore-tool:

ceph-bluestore-tool show-label --dev /dev/mapper/ceph-0block
{
    "/dev/mapper/ceph-0block": {
        "osd_uuid": "1234abcd-1234-abcd-1234-1234abcd1234",
        "size": 6001171365888,
        "btime": "2019-04-11T08:46:36.013428-0700",
        "description": "main",
        "bfm_blocks": "1465129728",
        "bfm_blocks_per_key": "128",
        "bfm_bytes_per_block": "4096",
        "bfm_size": "6001171365888",
        "bluefs": "1",
        "ceph_fsid": "1234abcd-1234-abcd-1234-1234abcd1234",
        "kv_backend": "rocksdb",
        "magic": "ceph osd volume v026",
        "mkfs_done": "yes",
        "ready": "ready",
        "require_osd_release": "16",
        "whoami": "0"
    }
}
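Since show-label still works, one cheap sanity check before attempting any repair is to confirm that every device belonging to the OSD (block, plus any separate db/wal device) reports the same osd_uuid and ceph_fsid; a mismatch would point at a device-mapping problem after the reboot rather than on-disk corruption. A minimal sketch, assuming you first save each label dump to a file (e.g. `ceph-bluestore-tool show-label --dev /dev/mapper/ceph-0block > block.json`); the helper name and file names here are mine, not from the thread:

```shell
#!/bin/sh
# Print the osd_uuid stored in a saved show-label JSON dump.
# show-label nests the label fields under the device path, so we
# take the first (only) top-level key and read osd_uuid from it.
extract_osd_uuid() {
    python3 - "$1" <<'PY'
import json, sys
with open(sys.argv[1]) as f:
    label = json.load(f)
dev = next(iter(label))          # e.g. "/dev/mapper/ceph-0block"
print(label[dev]["osd_uuid"])
PY
}

# Usage: run against each saved dump; all printed UUIDs should match:
#   extract_osd_uuid block.json
#   extract_osd_uuid wal.json
```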


On Fri, Feb 11, 2022 at 1:06 AM Eugen Block <eblock@xxxxxx> wrote:

> Can you share some more information about how exactly you upgraded? It
> looks like a cephadm-managed cluster. Did you install OS updates on all
> nodes without waiting for the first one to recover? Maybe I'm misreading,
> so please clarify what your update process looked like.
>
>
> Zitat von Mazzystr <mazzystr@xxxxxxxxx>:
>
> > I applied the latest OS updates and rebooted my hosts.  Now all my OSDs
> > fail to start.
> >
> > # cat /etc/os-release
> > NAME="openSUSE Tumbleweed"
> > # VERSION="20220207"
> > ID="opensuse-tumbleweed"
> > ID_LIKE="opensuse suse"
> > VERSION_ID="20220207"
> >
> > # uname -a
> > Linux cube 5.16.5-1-default #1 SMP PREEMPT Thu Feb 3 05:26:48 UTC 2022
> > (1af4009) x86_64 x86_64 x86_64 GNU/Linux
> >
> > container image: v16.2.7 / v16.2.7-20220201
> >
> > osd debug log shows the following
> >   -11> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 bluefs add_block_device
> > bdev 0 path /var/lib/ceph/osd/ceph-0/block.wal size 50 GiB
> >    -10> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
> > max_total_wal_size = 1073741824
> >     -9> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
> > compaction_readahead_size = 2097152
> >     -8> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
> > max_write_buffer_number = 4
> >     -7> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
> > max_background_compactions = 2
> >     -6> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
> > compression = kNoCompression
> >     -5> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
> > writable_file_max_buffer_size = 0
> >     -4> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
> > min_write_buffer_number_to_merge = 1
> >     -3> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
> > recycle_log_file_num = 4
> >     -2> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1  set rocksdb option
> > write_buffer_size = 268435456
> >     -1> 2022-02-10T19:14:48.383-0800 7ff1be4c3080  1 bluefs mount
> >      0> 2022-02-10T19:14:48.387-0800 7ff1be4c3080 -1 *** Caught signal
> > (Aborted) **
> >  in thread 7ff1be4c3080 thread_name:ceph-osd
> >
> >  ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific
> > (stable)
> >  1: /lib64/libpthread.so.0(+0x12c20) [0x7ff1bc465c20]
> >  2: gsignal()
> >  3: abort()
> >  4: /lib64/libstdc++.so.6(+0x9009b) [0x7ff1bba7c09b]
> >  5: /lib64/libstdc++.so.6(+0x9653c) [0x7ff1bba8253c]
> >  6: /lib64/libstdc++.so.6(+0x96597) [0x7ff1bba82597]
> >  7: /lib64/libstdc++.so.6(+0x967f8) [0x7ff1bba827f8]
> >  8: ceph-osd(+0x56301f) [0x559ff6d6301f]
> >  9: (BlueFS::_open_super()+0x18c) [0x559ff745f08c]
> >  10: (BlueFS::mount()+0xeb) [0x559ff748085b]
> >  11: (BlueStore::_open_bluefs(bool, bool)+0x94) [0x559ff735e464]
> >  12: (BlueStore::_prepare_db_environment(bool, bool,
> > std::__cxx11::basic_string<char, std::char_traits<char>,
> > std::allocator<char> >*, std::__cxx11::basic_string<char,
> > std::char_traits<char>, std::allocator<char> >*)+0x6d9) [0x559ff735f5b9]
> >  13: (BlueStore::_open_db(bool, bool, bool)+0x155) [0x559ff73608b5]
> >  14: (BlueStore::_open_db_and_around(bool, bool)+0x273) [0x559ff73cba33]
> >  15: (BlueStore::_mount()+0x204) [0x559ff73ce974]
> >  16: (OSD::init()+0x380) [0x559ff6ea2400]
> >  17: main()
> >  18: __libc_start_main()
> >  19: _start()
> >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
> > to interpret this.
> >
> >
> > The process log shows the following
> > 2022-02-10T19:33:31.852-0800 7f22869e8080 -1 WARNING: the following
> > dangerous and experimental features are enabled: bluestore,rocksdb
> > 2022-02-10T19:33:31.852-0800 7f22869e8080 -1 WARNING: the following
> > dangerous and experimental features are enabled: bluestore,rocksdb
> > 2022-02-10T19:33:31.852-0800 7f22869e8080 -1 WARNING: the following
> > dangerous and experimental features are enabled: bluestore,rocksdb
> > terminate called after throwing an instance of
> > 'ceph::buffer::v15_2_0::malformed_input'
> >   what():  void
> > bluefs_super_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) no
> > longer understand old encoding version 2 < 143: Malformed input
> > *** Caught signal (Aborted) **
> >  in thread 7f22869e8080 thread_name:ceph-osd
> >  ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific
> > (stable)
> >  1: /lib64/libpthread.so.0(+0x12c20) [0x7f228498ac20]
> >  2: gsignal()
> >  3: abort()
> >  4: /lib64/libstdc++.so.6(+0x9009b) [0x7f2283fa109b]
> >  5: /lib64/libstdc++.so.6(+0x9653c) [0x7f2283fa753c]
> >  6: /lib64/libstdc++.so.6(+0x96597) [0x7f2283fa7597]
> >  7: /lib64/libstdc++.so.6(+0x967f8) [0x7f2283fa77f8]
> >  8: ceph-osd(+0x56301f) [0x55e6faf6301f]
> >  9: (BlueFS::_open_super()+0x18c) [0x55e6fb65f08c]
> >  10: (BlueFS::mount()+0xeb) [0x55e6fb68085b]
> >  11: (BlueStore::_open_bluefs(bool, bool)+0x94) [0x55e6fb55e464]
> >  12: (BlueStore::_prepare_db_environment(bool, bool,
> > std::__cxx11::basic_string<char, std::char_traits<char>,
> > std::allocator<char> >*, std::__cxx11::basic_string<char,
> > std::char_traits<char>, std::allocator<char> >*)+0x6d9) [0x55e6fb55f5b9]
> >  13: (BlueStore::_open_db(bool, bool, bool)+0x155) [0x55e6fb5608b5]
> >  14: (BlueStore::_open_db_and_around(bool, bool)+0x273) [0x55e6fb5cba33]
> >  15: (BlueStore::_mount()+0x204) [0x55e6fb5ce974]
> >  16: (OSD::init()+0x380) [0x55e6fb0a2400]
> >  17: main()
> >  18: __libc_start_main()
> >  19: _start()
> > 2022-02-10T19:33:34.620-0800 7f22869e8080 -1 *** Caught signal (Aborted) **
> >  in thread 7f22869e8080 thread_name:ceph-osd
> >
> >
> > Does anyone have any ideas what could be going on here?
> >
> > Thanks,
> > /Chris
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>
>


