Quoting Peter Sabaini (peter@xxxxxxxxxx):
> What kind of commit/apply latency increases have you seen when adding a
> large numbers of OSDs? I'm nervous how sensitive workloads might react
> here, esp. with spinners.

You mean when there is backfilling going on? Instead of doing "a big
bang" you can also use Dan van der Ster's trick with the upmap balancer:

https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py

See https://www.slideshare.net/Inktank_Ceph/ceph-day-berlin-mastering-ceph-operations-upmap-and-the-mgr-balancer

So you would still set norebalance / nobackfill / norecover and turn the
ceph balancer off. Then you run the script as many times as necessary to
get "HEALTH_OK" again (on clusters other than Nautilus) and until there
are no more remapped PGs. Unset the flags and enable the ceph balancer
... now the balancer will slowly move PGs to the new OSDs.

We've used this trick to increase the number of PGs on a pool, and will
use it to expand the cluster in the near future.

This only works if you can use the balancer in "upmap" mode. Note that
using upmap requires that all clients be Luminous or newer. If you are
using the CephFS kernel client it might report as not compatible
(jewel), but recent Linux distributions work well (Ubuntu 18.04 /
CentOS 7).

Gr. Stefan

--
| BIT BV  https://www.bit.nl/    Kamer van Koophandel 09090351
| GPG: 0xD14839C6                +31 318 648 688 / info@xxxxxx
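The workflow Stefan describes could be sketched roughly as follows. This is a hedged outline, not a verbatim procedure from the thread: it assumes the upmap-remapped.py script is run from the current directory and that it prints ceph CLI commands on stdout for you to review and pipe to a shell, as in Dan van der Ster's talk.

```shell
# 1. Prevent data movement while the cluster is changed.
ceph osd set norebalance
ceph osd set nobackfill
ceph osd set norecover
ceph balancer off

# ... add the new OSDs (or increase pg_num on the pool) here ...

# 2. Map remapped PGs back to their current OSDs via upmap entries.
#    Repeat until no PGs remain in the "remapped" state.
./upmap-remapped.py | sh

# 3. Unset the flags and let the balancer move PGs to the
#    new OSDs gradually.
ceph osd unset norebalance
ceph osd unset nobackfill
ceph osd unset norecover
ceph balancer mode upmap
ceph balancer on
```

Review the script's output before piping it to sh; these commands act on a live cluster.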