Thanks for the idea. I tried it with 1 thread, and it shredded another OSD.
I've updated the tracker ticket :)

At least non-race-condition bugs are hopefully easier to spot...

I wouldn't just disable the fsck and upgrade anyway until the cause is rooted out.

-- Jonas


On 29/03/2021 14.34, Dan van der Ster wrote:
> Hi,
>
> Saw that, looks scary!
>
> I have no experience with that particular crash, but I was thinking
> that if you have already backfilled the degraded PGs, and can afford
> to try another OSD, you could try:
>
>     "bluestore_fsck_quick_fix_threads": "1", # because
> https://github.com/facebook/rocksdb/issues/5068 showed a similar crash
> and the dev said it occurs because WriteBatch is not thread safe.
>
>     "bluestore_fsck_quick_fix_on_mount": "false", # should disable the
> fsck during upgrade. See https://github.com/ceph/ceph/pull/40198
>
> -- Dan
>
> On Mon, Mar 29, 2021 at 2:23 PM Jonas Jelten <jelten@xxxxxxxxx> wrote:
>>
>> Hi!
>>
>> After upgrading MONs and MGRs successfully, the first OSD host I upgraded on Ubuntu Bionic from 14.2.16 to 15.2.10
>> shredded all OSDs on it by corrupting RocksDB, and they now refuse to boot.
>> RocksDB complains "Corruption: unknown WriteBatch tag".
>>
>> The initial crash/corruption occurred when the automatic fsck ran and committed the changes for a lot of "zombie spanning blobs".
>>
>> Tracker issue with logs: https://tracker.ceph.com/issues/50017
>>
>>
>> Has anyone else encountered this error? I've "suspended" the upgrade for now :)
>>
>> -- Jonas
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx