Quoting Janne Johansson (icepic.dz@xxxxxxxxx):
> Yes, when you add a drive (or 10), some PGs decide they should have
> one or more replicas on the new drives, a new empty PG copy is
> created there, and _then_ that copy will make the PG go "degraded",
> meaning if it had 3 fine active+clean replicas before, it now has 2
> active+clean and one needing backfill to get into shape.
>
> It is a slight mistake to report this the same way as an error, even
> if it looks to the cluster just as if it was in error and needs
> fixing. This gives new ceph admins a sense of urgency or danger,
> whereas adding space to a cluster should be perfectly normal. Also,
> it could have chosen to add a fourth replica to a repl=3 PG, fill the
> new empty copy from the one going out, and keep 3 working replicas
> the whole time, but ceph chooses to first discard one replica and
> then backfill into the empty one, leading to this kind of "error"
> report.

Thanks for the explanation. I agree with you that it would be safer to
first backfill to the new PG copy instead of just assuming the new OSD
will be fine and discarding a perfectly healthy copy.

We do have max_size 3 in the CRUSH ruleset ... I wonder if Ceph would
behave differently if we had max_size 4 ... to actually allow a fourth
copy in the first place (a sketch of such a rule is below the
signature).

Gr. Stefan

--
| BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx
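
P.S. Purely as a sketch (the rule name, ruleset id and host failure
domain below are assumptions, not copied from our actual crushmap), a
replicated rule that allows up to a fourth copy would look roughly
like this in the decompiled crushmap:

    rule replicated_ruleset {
            ruleset 0
            type replicated
            min_size 1
            max_size 4              # was 3; accept pools with up to 4 copies
            step take default
            step chooseleaf firstn 0 type host
            step emit
    }

The usual way to try that would be something like:

    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # edit max_size in crushmap.txt, then recompile and inject:
    crushtool -c crushmap.txt -o crushmap.new
    ceph osd setcrushmap -i crushmap.new

As far as I understand, min_size/max_size on a rule only bound the
pool sizes the rule will accept, so whether this actually changes the
"discard first, backfill later" behaviour described above is exactly
the open question.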