OSDs failing to start due to crc32 and osdmap error


 



Hi

We are having trouble starting some OSDs on one node of our Ceph Quincy 17.2.7 cluster. Some OSDs on that node run fine, but others fail to start.

It looks like a crc32c checksum error on the OSD superblock, followed by a failure to load the OSD map. I found some discussions about this, but nothing in them helped.

I've also tried to inject the current OSD map, but that ends with an error:

# CEPH_ARGS="--bluestore-ignore-data-csum" ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-888/ --op set-osdmap --file osdmap
osdmap (#-1:20684533:::osdmap.931991:0#) does not exist.

The log is below.

Any ideas please?

Thank you


From log file:

2023-11-27T16:01:47.691+0100 7f3f17aa13c0 -1 Falling back to public interface

2023-11-27T16:01:51.439+0100 7f3f17aa13c0 -1 bluestore(/var/lib/ceph/osd/ceph-888) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0xb1701b42, expected 0x9ee5ece2, device location [0x10000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#

2023-11-27T16:01:51.439+0100 7f3f17aa13c0 -1 osd.888 0 failed to load OSD map for epoch 927580, got 0 bytes

/build/ceph-17.2.7/src/osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f3f17aa13c0 time 2023-11-27T16:01:51.443522+0100
/build/ceph-17.2.7/src/osd/OSD.h: 696: FAILED ceph_assert(ret)
 ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14f) [0x561ad07d2624]
 2: ceph-osd(+0xc2e836) [0x561ad07d2836]
 3: (OSD::init()+0x4026) [0x561ad08e5a86]
 4: main()
 5: __libc_start_main()
 6: _start()
*** Caught signal (Aborted) **
 in thread 7f3f17aa13c0 thread_name:ceph-osd
2023-11-27T16:01:51.443+0100 7f3f17aa13c0 -1 /build/ceph-17.2.7/src/osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f3f17aa13c0 time 2023-11-27T16:01:51.443522+0100
/build/ceph-17.2.7/src/osd/OSD.h: 696: FAILED ceph_assert(ret)

 ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14f) [0x561ad07d2624]
 2: ceph-osd(+0xc2e836) [0x561ad07d2836]
 3: (OSD::init()+0x4026) [0x561ad08e5a86]
 4: main()
 5: __libc_start_main()
 6: _start()


 ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f3f1814b420]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1b7) [0x561ad07d268c]
 5: ceph-osd(+0xc2e836) [0x561ad07d2836]
 6: (OSD::init()+0x4026) [0x561ad08e5a86]
 7: main()
 8: __libc_start_main()
 9: _start()
2023-11-27T16:01:51.447+0100 7f3f17aa13c0 -1 *** Caught signal (Aborted) **
 in thread 7f3f17aa13c0 thread_name:ceph-osd

 ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f3f1814b420]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1b7) [0x561ad07d268c]
 5: ceph-osd(+0xc2e836) [0x561ad07d2836]
 6: (OSD::init()+0x4026) [0x561ad08e5a86]
 7: main()
 8: __libc_start_main()
 9: _start()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.


  -558> 2023-11-27T16:01:47.691+0100 7f3f17aa13c0 -1 Falling back to public interface

    -5> 2023-11-27T16:01:51.439+0100 7f3f17aa13c0 -1 bluestore(/var/lib/ceph/osd/ceph-888) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0xb1701b42, expected 0x9ee5ece2, device location [0x10000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#

    -2> 2023-11-27T16:01:51.439+0100 7f3f17aa13c0 -1 osd.888 0 failed to load OSD map for epoch 927580, got 0 bytes

    -1> 2023-11-27T16:01:51.443+0100 7f3f17aa13c0 -1 /build/ceph-17.2.7/src/osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f3f17aa13c0 time 2023-11-27T16:01:51.443522+0100
/build/ceph-17.2.7/src/osd/OSD.h: 696: FAILED ceph_assert(ret)

 ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14f) [0x561ad07d2624]
 2: ceph-osd(+0xc2e836) [0x561ad07d2836]
 3: (OSD::init()+0x4026) [0x561ad08e5a86]
 4: main()
 5: __libc_start_main()
 6: _start()


     0> 2023-11-27T16:01:51.447+0100 7f3f17aa13c0 -1 *** Caught signal (Aborted) **
 in thread 7f3f17aa13c0 thread_name:ceph-osd

 ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f3f1814b420]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1b7) [0x561ad07d268c]
 5: ceph-osd(+0xc2e836) [0x561ad07d2836]
 6: (OSD::init()+0x4026) [0x561ad08e5a86]
 7: main()
 8: __libc_start_main()
 9: _start()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.


  -562> 2023-11-27T16:01:47.691+0100 7f3f17aa13c0 -1 Falling back to public interface

    -9> 2023-11-27T16:01:51.439+0100 7f3f17aa13c0 -1 bluestore(/var/lib/ceph/osd/ceph-888) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0xb1701b42, expected 0x9ee5ece2, device location [0x10000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#

    -6> 2023-11-27T16:01:51.439+0100 7f3f17aa13c0 -1 osd.888 0 failed to load OSD map for epoch 927580, got 0 bytes

    -5> 2023-11-27T16:01:51.443+0100 7f3f17aa13c0 -1 /build/ceph-17.2.7/src/osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f3f17aa13c0 time 2023-11-27T16:01:51.443522+0100
/build/ceph-17.2.7/src/osd/OSD.h: 696: FAILED ceph_assert(ret)

 ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14f) [0x561ad07d2624]
 2: ceph-osd(+0xc2e836) [0x561ad07d2836]
 3: (OSD::init()+0x4026) [0x561ad08e5a86]
 4: main()
 5: __libc_start_main()
 6: _start()


    -4> 2023-11-27T16:01:51.447+0100 7f3f17aa13c0 -1 *** Caught signal (Aborted) **
 in thread 7f3f17aa13c0 thread_name:ceph-osd

 ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f3f1814b420]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1b7) [0x561ad07d268c]
 5: ceph-osd(+0xc2e836) [0x561ad07d2836]
 6: (OSD::init()+0x4026) [0x561ad08e5a86]
 7: main()
 8: __libc_start_main()
 9: _start()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.


Aborted
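
For triage across several failing OSDs, the key fields can be pulled out of the messages above with a small script. This is only a minimal sketch: the regular expressions assume the exact message wording shown in this post (Ceph 17.2.7), and the names `summarize`, `SAMPLE`, and the dictionary keys are made up for illustration; they are not part of any Ceph tool.

```python
import re

# Patterns assume the exact wording of the messages quoted in this post.
CSUM_RE = re.compile(
    r"_verify_csum bad (?P<algo>\w+)/0x[0-9a-f]+ checksum at blob offset 0x[0-9a-f]+, "
    r"got (?P<got>0x[0-9a-f]+), expected (?P<expected>0x[0-9a-f]+), "
    r"device location \[(?P<dev_offset>0x[0-9a-f]+)~[0-9a-f]+\], "
    r"logical extent 0x[0-9a-f]+~[0-9a-f]+, object (?P<object>\S+)"
)
MAP_RE = re.compile(
    r"osd\.(?P<osd>\d+) \d+ failed to load OSD map for epoch (?P<epoch>\d+)"
)
TOOL_RE = re.compile(
    r"osdmap \(#-1:[0-9a-f]+:::osdmap\.(?P<epoch>\d+):0#\) does not exist"
)


def summarize(text):
    """Collect checksum and epoch details from OSD log / tool output text."""
    found = {}
    m = CSUM_RE.search(text)
    if m:
        found["csum_algo"] = m.group("algo")
        found["csum_object"] = m.group("object")
        found["csum_got"] = int(m.group("got"), 16)
        found["csum_expected"] = int(m.group("expected"), 16)
        found["device_offset"] = int(m.group("dev_offset"), 16)
    m = MAP_RE.search(text)
    if m:
        found["osd_id"] = int(m.group("osd"))
        found["wanted_epoch"] = int(m.group("epoch"))  # epoch the OSD tried to load
    m = TOOL_RE.search(text)
    if m:
        found["tool_epoch"] = int(m.group("epoch"))  # epoch named by set-osdmap
    return found


# Lines copied from this post (timestamps trimmed).
SAMPLE = """\
osdmap (#-1:20684533:::osdmap.931991:0#) does not exist.
bluestore(/var/lib/ceph/osd/ceph-888) _verify_csum bad crc32c/0x1000 checksum at blob offset 0x0, got 0xb1701b42, expected 0x9ee5ece2, device location [0x10000~1000], logical extent 0x0~1000, object #-1:7b3f43c4:::osd_superblock:0#
osd.888 0 failed to load OSD map for epoch 927580, got 0 bytes
"""

result = summarize(SAMPLE)
for key, value in sorted(result.items()):
    print(f"{key}: {value}")
```

Run against the lines in this post, it reports that the checksum failure is on the `osd_superblock` object, that the OSD asked for epoch 927580, and that the `set-osdmap` attempt referred to epoch 931991.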
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



