On Wed, 14 Aug 2019 at 09:49, Simon Oosthoek <s.oosthoek@xxxxxxxxxxxxx> wrote:
Hi all,
Yesterday I marked out all the osds on one node in our new cluster to
reconfigure them with WAL/DB on their NVMe devices, but it is taking
ages to rebalance.
> ceph tell 'osd.*' injectargs '--osd-max-backfills 16'
> ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
Since the cluster is currently hardly loaded, backfilling can take up
all the unused bandwidth as far as I'm concerned...
Is it a good idea to give the above commands or other commands to speed
up the backfilling? (e.g. like increasing "osd max backfills")

"osd max backfills" is going to have a very large effect on recovery time, so that
would be the obvious knob to twist first. Check what it defaults to now, then raise it
in steps (4, 8, 12, 16) and check after each step that the higher value doesn't end up
slowing the rebalancing down.
Spinning drives without any SSD/NVMe journal/WAL/DB should perhaps have 1 or 2 at most;
SSDs can take more than that, and NVMe even more, before diminishing returns set in.
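
The stepwise approach could be scripted roughly as below. This is only a sketch under some assumptions: the variable names (DRY_RUN, STEPS, PAUSE, OSD_ID) and the helper functions are made up for illustration, osd.0 is just a placeholder OSD, and the "ceph daemon" call only works on the host where that OSD runs. By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: raise osd_max_backfills step by step, looking at cluster status in between.
# All variable and function names here are hypothetical, not part of Ceph.
DRY_RUN="${DRY_RUN:-1}"        # 1 = only print the commands (default), 0 = run them
STEPS="${STEPS:-4 8 12 16}"    # values to try, lowest first
PAUSE="${PAUSE:-600}"          # seconds to watch each step when really running
OSD_ID="${OSD_ID:-0}"          # placeholder: any OSD local to this host

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

step_backfills() {
    # Show the current value first (uses the admin socket of a local OSD).
    run ceph daemon "osd.$OSD_ID" config get osd_max_backfills
    for n in $STEPS; do
        run ceph tell 'osd.*' injectargs "--osd-max-backfills $n"
        # Check recovery throughput and client I/O before going higher.
        run ceph -s
        if [ "$DRY_RUN" != "1" ]; then
            sleep "$PAUSE"
        fi
    done
}

step_backfills
```

Running it as-is just lists what it would do; DRY_RUN=0 applies the changes for real. In practice you would probably stop at the first step where "ceph -s" shows recovery throughput no longer improving or client I/O suffering, rather than blindly going all the way to 16.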
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com