Hi Frank,

Thank you very much for this! :)

> we just completed a third upgrade test. There are 2 ways to convert
> the OSDs:
>
> A) convert along with the upgrade (quick-fix-on-start=true)
> B) convert after setting require-osd-release=octopus
>    (quick-fix-on-start=false until require-osd-release is set to
>    octopus, then restart to initiate the conversion)
>
> There is a variation A' of A: follow A, then initiate manual
> compaction and restart all OSDs.
>
> Our experiments show that paths A and B do *not* yield the same
> result. Following path A leads to a severely performance-degraded
> cluster. As of now, we cannot confirm that A' fixes this. It seems
> that the only way out is to zap and re-deploy all OSDs, basically
> what Boris is doing right now.
>
> We have now extended our procedure by adding
>
> bluestore_fsck_quick_fix_on_mount = false
>
> to every ceph.conf file and executing
>
> ceph config set osd bluestore_fsck_quick_fix_on_mount false
>
> to catch any accidents. After the daemon upgrade, we set
> bluestore_fsck_quick_fix_on_mount = true host by host in the
> ceph.conf and restart the OSDs.
>
> This procedure works like a charm.
>
> I don't know what the difference between A and B is. It is possible
> that B executes an extra step that is missing in A. The performance
> degradation only shows up when snaptrim is active, but then it is
> very severe. I suspect that many users who complained about snaptrim
> in the past have at least one A-converted OSD in their cluster.
>
> If you have a cluster upgraded with B-converted OSDs, it works like a
> native octopus cluster. There is very little performance reduction
> compared with mimic. If anything, I have the impression that it
> operates more stably.
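
For anyone else following the thread who wants to reproduce the B path,
my reading of the procedure above in concrete commands would be roughly
the following. This is only a sketch, not Frank's exact script: it
assumes systemd-managed OSDs and plain ceph.conf handling, so adapt it
to your own deployment tooling.

  # before upgrading any daemon, make sure no OSD starts a conversion on restart
  ceph config set osd bluestore_fsck_quick_fix_on_mount false
  #   (and "bluestore_fsck_quick_fix_on_mount = false" in every ceph.conf as a safety net)

  # upgrade mons, mgrs and OSDs to octopus, then allow the new release features
  ceph osd require-osd-release octopus

  # host by host: set "bluestore_fsck_quick_fix_on_mount = true" in that host's
  # ceph.conf, then restart its OSDs so they perform the OMAP conversion
  systemctl restart ceph-osd.target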
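
Regarding the A' variant: in case someone wants to experiment with
manual compaction of already A-converted OSDs, the offline way I am
aware of is ceph-kvstore-tool with the OSD stopped (<id> and the data
path below are placeholders for the default layout); online, the admin
socket command "ceph daemon osd.<id> compact" should trigger the same.

  systemctl stop ceph-osd@<id>
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> compact
  systemctl start ceph-osd@<id>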
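
And since zap/re-deploy was mentioned as the last resort, the rough
shape of that would be something like the lines below, assuming
ceph-volume/LVM OSDs and with <id> and /dev/sdX as placeholders; how
the replacement OSD is created at the end of course depends on your
deployment tooling.

  ceph osd out <id>
  # wait for the data to migrate off the OSD, then:
  systemctl stop ceph-osd@<id>
  ceph osd purge <id> --yes-i-really-mean-it
  ceph-volume lvm zap --destroy /dev/sdX
  ceph-volume lvm create --data /dev/sdX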