Dear Ceph users,
On behalf of the Ceph developer community, I have to inform you about a
recently discovered severe bug that might cause data corruption. The
issue occurs during OMAP format conversion on clusters upgraded to
Pacific; newly deployed clusters aren't affected. The conversion is
triggered by BlueStore's repair/quick-fix functionality, which can be
invoked either manually via ceph-bluestore-tool or automatically by the
OSD if 'bluestore_fsck_quick_fix_on_mount' is set to true.
Both OSD and MDS daemons are known to suffer from the issue, and other
daemons, e.g. RGW, might be affected as well. The major symptom is a
daemon's inability to start up or continue operating after some OSDs
have been "repaired".
More details on the bug and its status tracking can be found at:
https://tracker.ceph.com/issues/53062
We're currently working on a fix, which is expected to be available in
the upcoming v16.2.7 release.
Meanwhile, please DO NOT SET bluestore_fsck_quick_fix_on_mount to true
(if it is already set, please switch it to false immediately) and DO NOT
RUN ceph-bluestore-tool's repair/quick-fix commands.
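As a rough illustration of how one might check for the setting, the
Python sketch below scans the output of "ceph config dump --format json"
for places where the option is enabled. The JSON shape it expects (a
list of objects with "section", "name" and "value" keys) and the helper
name find_enabled are assumptions made for this example, so please
verify them against your own cluster's output. It does not cover values
set in a local ceph.conf.

```python
import json

OPTION = "bluestore_fsck_quick_fix_on_mount"
TRUTHY = {"true", "1", "yes", "on"}


def find_enabled(config_dump_json, option=OPTION):
    """Return the config sections (e.g. 'osd', 'osd.3') where `option` is on.

    ASSUMPTION: the input is a JSON array of objects carrying 'section',
    'name' and 'value' keys, as we expect from `ceph config dump --format json`.
    """
    entries = json.loads(config_dump_json)
    return [
        entry.get("section", "?")
        for entry in entries
        if entry.get("name") == option
        and str(entry.get("value", "")).strip().lower() in TRUTHY
    ]


if __name__ == "__main__":
    # Hypothetical sample standing in for real `ceph config dump` output.
    sample = json.dumps([
        {"section": "osd", "name": OPTION, "value": "true"},
        {"section": "osd.1", "name": OPTION, "value": "false"},
    ])
    for section in find_enabled(sample):
        # Print a suggested command; nothing is changed automatically.
        print(f"ceph config set {section} {OPTION} false")
```

The script only prints suggested "ceph config set" commands and leaves
it to the operator to review and run them.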
Apologies for any trouble this may cause.
--
Igor Fedotov
Ceph Lead Developer