Please describe your hardware. Also, are you talking about an orderly "shutdown -r" reboot, or a kernel / system crash or power loss?

Often corruptions like this are a result of:

* Using non-enterprise SSDs that lack power loss protection
* Buggy / defective RAID HBAs
* Enabling volatile write cache on drives

A quick way to check the last item is sketched at the end of this message.

> On Aug 5, 2024, at 4:54 AM, Reza Bakhshayeshi <reza.b2008@xxxxxxxxx> wrote:
>
> Hello,
>
> Whenever a node reboots in the cluster I get some corrupted OSDs. Is there
> any config I should set to prevent this from happening that I am not aware
> of?
>
> Here is the error log:
>
> # kubectl logs rook-ceph-osd-1-5dcbd99cc7-2l5g2 -c expand-bluefs
>
> ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)
> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x135) [0x7f969977ce15]
> 2: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7f969977cfdb]
> 3: (BlueStore::expand_devices(std::ostream&)+0x5ff) [0x55ce89d1f3ff]
> 4: main()
> 5: __libc_start_main()
> 6: _start()
>
> 0> 2024-07-31T08:39:19.840+0000 7f969b1c0980 -1 *** Caught signal
> (Aborted) **
> in thread 7f969b1c0980 thread_name:ceph-bluestore-
>
> ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef
> (stable)
> 1: /lib64/libpthread.so.0(+0x12d20) [0x7f969843fd20]
> 2: gsignal()
> 3: abort()
> 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x18f) [0x7f969977ce6f]
> 5: /usr/lib64/ceph/libceph-common.so.2(+0x2a9fdb) [0x7f969977cfdb]
> 6: (BlueStore::expand_devices(std::ostream&)+0x5ff) [0x55ce89d1f3ff]
> 7: main()
> 8: __libc_start_main()
> 9: _start()
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed
> to interpret this.
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
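
P.S. Regarding the volatile write cache point above, here is a rough sketch of how you might check (and disable) it on each OSD node. The device name /dev/sda is just an example; adjust for your hardware, and note that hdparm/sdparm changes are generally not persistent across reboots.

    # SATA / ATA drives
    smartctl -g wcache /dev/sda     # report whether the volatile write cache is enabled
    hdparm -W /dev/sda              # query write-caching state
    hdparm -W 0 /dev/sda            # disable the volatile write cache

    # SAS / SCSI drives
    sdparm --get=WCE /dev/sda       # WCE: 1 means the volatile write cache is on
    sdparm --clear=WCE /dev/sda     # turn it off

Whether you actually want the cache off depends on the drives; enterprise SSDs with power loss protection can safely leave it enabled.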