Re: [IMPORTANT NOTICE] Potential data corruption in Pacific

Hi Tobias,

thanks a lot for your input. The critical issue notification process certainly needs improvement.

There is an active discussion in the dev community (the CLT group specifically) on how to better arrange such notifications. It would probably be good to have a wider audience for that topic, so I would suggest using the Ceph User+Dev Monthly meeting, which Neha announced yesterday.

The next meeting will be held on Nov 18. Please feel free to join and/or update the agenda at

https://pad.ceph.com/p/ceph-user-dev-monthly-minutes


Thanks,

Igor

On 10/29/2021 12:23 PM, Tobias Fischer wrote:
As this is very important information, it should receive special attention.

I think the risk is quite high that such important news, when posted on ceph-users, gets lost in the shuffle.

I would propose either creating a separate mailing list for this kind of information from the Ceph dev community or using a mailing list with less traffic, e.g. ceph-announce.

What do you think?

On 28.10.21 at 17:37, Igor Fedotov wrote:
Dear Ceph users.

On behalf of the Ceph developer community, I have to inform you about a recently discovered severe bug which might cause data corruption. The issue occurs during OMAP format conversion on clusters upgraded to Pacific; new clusters aren't affected. The conversion is triggered by BlueStore's repair/quick-fix functionality, which can be invoked either manually via ceph-bluestore-tool or automatically by the OSD if 'bluestore_fsck_quick_fix_on_mount' is set to true.
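
For example, to check whether that automatic trigger is currently enabled in the cluster configuration database, something like the following should work (just a sketch; the option might also be set in a local ceph.conf, which would have to be checked separately):

    # show the value OSDs will use for the automatic quick-fix on mount
    ceph config get osd bluestore_fsck_quick_fix_on_mount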

Both OSD and MDS daemons are known to suffer from the issue, and others, e.g. RGW, might be affected as well. The major symptom is a daemon's inability to start up or keep operating after some OSDs have been "repaired".

More details on the bug and its status tracking can be found at: https://tracker.ceph.com/issues/53062

We're currently working on the fix which is expected to be available in the upcoming v16.2.7 release.

Meanwhile, please DO NOT SET bluestore_fsck_quick_fix_on_mount to true (and switch it back to false immediately if it is already set), and DO NOT RUN ceph-bluestore-tool's repair/quick-fix commands.
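
If the option has already been enabled via the cluster configuration database, the following sketch shows how it could be switched back off (only an illustration; any override in a local ceph.conf would need to be removed as well):

    # disable the automatic quick-fix conversion on OSD mount
    ceph config set osd bluestore_fsck_quick_fix_on_mount false
    # verify the new value
    ceph config get osd bluestore_fsck_quick_fix_on_mount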

Apologies for any trouble this may cause.

best regards, Tobi

Clyso GmbH - Ceph Foundation Member
support@xxxxxxxxx
https://www.clyso.com

--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


