Great job to everyone involved in tracking this down!

Mark

On 11/14/19 10:10 AM, Sage Weil wrote:
Hi everyone,

We've identified a data corruption bug[1], first introduced[2] (by yours truly) in 14.2.3 and affecting both 14.2.3 and 14.2.4. The corruption appears as an assertion failure that looks like

os/bluestore/fastbmap_allocator_impl.h: 750: FAILED ceph_assert(available >= allocated)

or, in some cases, as a rocksdb checksum error. It only affects BlueStore OSDs that have a separate 'db' or 'wal' device.

We have a fix[3] that is working its way through testing, and will expedite the next Nautilus point release (14.2.5) once it is ready.

If you are running 14.2.2 or 14.2.1 and use BlueStore OSDs with separate 'db' volumes, you should consider waiting to upgrade until 14.2.5 is released.

A big thank you to Igor Fedotov and several *extremely* helpful users who managed to reproduce and track down this problem!

sage

[1] https://tracker.ceph.com/issues/42223
[2] https://github.com/ceph/ceph/commit/096033b9d931312c0688c2eea7e14626bfde0ad7#diff-618db1d3389289a9d25840a4500ef0b0
[3] https://github.com/ceph/ceph/pull/31621
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx