There was actually a change made nearly a year ago to allow upgrading OSDs in
a more parallel fashion (https://github.com/ceph/ceph/pull/39726). It made its
way into Pacific but not Octopus, which could explain the discrepancy here. I
guess we need a flag to have the upgrade not do this, for users who'd like to
maintain higher I/O throughput at the cost of upgrade speed.

On Mon, Feb 14, 2022 at 11:21 AM Eugen Block <eblock@xxxxxx> wrote:

> It does update only one OSD at a time, I did that in my little test
> cluster on Octopus today. I haven’t played too much with Pacific yet,
> maybe some things have changed there?
>
> Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>
> > Hi Eugen,
> >
> > Thanks for this. All of our pools are size=3 and min_size=2, failure domain
> > is host. For example, we experience random I/O stalls on this pool during
> > upgrades: https://pastebin.com/iVVxJ9TF (I pasted pool and crush info into
> > pastebin for better readability), which in theory shouldn't be happening as
> > there always are 2 more hosts with 2 more OSDs per PG when OSDs on 1 host
> > are being upgraded. The output of `ceph pg ls-by-pool` is rather lengthy as
> > there are 256 PGs in this particular pool, but I personally verified each
> > PG to be supported by 3 distinct OSDs, each of the 3 on a different host.
> >
> > I was hoping that by forcing cephadm to upgrade 1 OSD at a time instead of
> > 1 host at a time we could resolve this issue.
> >
> > /Z
> >
> > On Mon, Feb 14, 2022 at 4:26 PM Eugen Block <eblock@xxxxxx> wrote:
> >
> >> Hi,
> >>
> >> what are your rulesets for the affected pools? As far as I remember
> >> the orchestrator updates one OSD node at a time, but not multiple OSDs
> >> at once, only one by one. It checks with the "ok-to-stop" command if
> >> an upgrade of that daemon can proceed, so as long as you have host as
> >> failure domain there should be no I/O disruption for clients. Maybe
> >> you have some pools with size = 2 and min_size = 2?
> >>
> >> Regards,
> >> Eugen
> >>
> >>
> >> Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
> >>
> >> > Hi!
> >> >
> >> > Sometimes when we upgrade our cephadm-managed 16.2.x cluster, cephadm
> >> > decides that it's safe to upgrade a bunch of OSDs at a time, as a result
> >> > sometimes RBD-backed OpenStack VMs appear to get I/O stalls and read-only
> >> > filesystems. Is there a way to make cephadm upgrade fewer OSDs at a time,
> >> > or perhaps upgrade them one by one? I don't care if that takes a lot more
> >> > time, as long as there's no I/O interruption.
> >> >
> >> > I would appreciate any advice.
> >> >
> >> > Best regards,
> >> > Zakhar
> >> > _______________________________________________
> >> > ceph-users mailing list -- ceph-users@xxxxxxx
> >> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
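
For reference, the size=3 / min_size=2 argument made in the thread can be
sketched as a small check. This is only a toy model in Python, not how
`ceph osd ok-to-stop` is implemented: the function name
`pgs_below_min_size`, the host names, and the PG layout are all made up for
illustration. It assumes that a PG stops serving I/O once fewer than
min_size of its replicas are up.

```python
from typing import Dict, Iterable, List, Set


def pgs_below_min_size(
    pg_to_osds: Dict[str, List[int]],
    stopping: Iterable[int],
    min_size: int,
) -> List[str]:
    """Return the PGs that would have fewer than min_size replicas left."""
    stopping_set: Set[int] = set(stopping)
    blocked = []
    for pg, osds in pg_to_osds.items():
        # Replicas that stay up while the chosen OSDs are restarted.
        remaining = [o for o in osds if o not in stopping_set]
        if len(remaining) < min_size:
            blocked.append(pg)
    return sorted(blocked)


# Hypothetical layout: 3 hosts with 2 OSDs each, every PG on 3 different hosts.
host_to_osds = {"host-a": [0, 1], "host-b": [2, 3], "host-c": [4, 5]}
pg_to_osds = {"1.0": [0, 2, 4], "1.1": [1, 3, 5], "1.2": [0, 3, 5]}

# Stopping every OSD on one host leaves 2 replicas per PG, so nothing blocks.
print(pgs_below_min_size(pg_to_osds, host_to_osds["host-a"], min_size=2))

# Stopping OSDs on two different hosts at once drops PG 1.2 to one replica.
print(pgs_below_min_size(pg_to_osds, [0, 3], min_size=2))
```

Under this model, restarting all OSDs of a single host in parallel should
not block client I/O by itself with host as the failure domain, while
restarting OSDs that span more than one host can.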