On 26.07.19 15:03, Stefan Kooman wrote:
> Quoting Peter Sabaini (peter@xxxxxxxxxx):
>> What kind of commit/apply latency increases have you seen when adding a
>> large number of OSDs? I'm nervous about how sensitive workloads might
>> react here, esp. with spinners.
>
> You mean when there is backfilling going on? Instead of doing "a big

Yes, exactly. I usually tune down the max rebalance and max recovery
active knobs to lessen the impact (the specific options are sketched at
the bottom of this mail), but even so I have found that the additional
write load can substantially increase I/O latencies. Not all workloads
like this.

> bang" you can also use Dan van der Ster's trick with the upmap balancer:
> https://github.com/cernceph/ceph-scripts/blob/master/tools/upmap/upmap-remapped.py
>
> See
> https://www.slideshare.net/Inktank_Ceph/ceph-day-berlin-mastering-ceph-operations-upmap-and-the-mgr-balancer

Thanks, that's interesting -- though I wish it weren't necessary. (A
rough sketch of the sequence follows at the end of this mail.)

cheers,
peter.

> So you would still have norebalance / nobackfill / norecover set and the
> ceph balancer off. Then you run the script as many times as necessary to
> get "HEALTH_OK" again (on clusters other than nautilus) and there are no
> more PGs remapped. Unset the flags and enable the ceph balancer ... now
> the balancer will slowly move PGs to the new OSDs.
>
> We've used this trick to increase the number of PGs on a pool, and will
> use it to expand the cluster in the near future.
>
> This only works if you can use the balancer in "upmap" mode. Note that
> using upmap requires that all clients be Luminous or newer. If you are
> using the cephfs kernel client it might report as not compatible (jewel),
> but recent Linux distributions work well (Ubuntu 18.04 / CentOS 7).
>
> Gr. Stefan
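
A minimal sketch of the throttling I mean above; the values are only
illustrative, and the "ceph config" form assumes a cluster recent enough
to have the config database (older releases take the same options via
injectargs):

  # limit concurrent backfill/recovery per OSD (illustrative values)
  ceph config set osd osd_max_backfills 1
  ceph config set osd osd_recovery_max_active 1
  # extra pause between recovery ops on spinners
  ceph config set osd osd_recovery_sleep_hdd 0.1

  # on releases without the config database, inject into running OSDs instead
  ceph tell 'osd.*' injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'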
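
And a rough sketch of the upmap-remapped sequence Stefan describes, as I
read the script and slides; the "./upmap-remapped.py | sh" invocation is
my assumption (the script prints "ceph osd pg-upmap-items" commands), so
check the repo before running anything:

  # upmap needs Luminous or newer clients
  ceph features
  ceph osd set-require-min-compat-client luminous

  # pause data movement and the balancer before adding the new OSDs
  ceph osd set norebalance
  ceph osd set nobackfill
  ceph osd set norecover
  ceph balancer off

  # ... add the new OSDs here ...

  # run the script (repeatedly if needed) until no PGs are left remapped;
  # it prints pg-upmap-items commands which the shell then executes
  ./upmap-remapped.py | sh

  # unset the flags and let the balancer slowly move PGs to the new OSDs
  ceph osd unset norebalance
  ceph osd unset nobackfill
  ceph osd unset norecover
  ceph balancer mode upmap
  ceph balancer on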