Re: strange backfill delay after outing one node

On Wed, 14 Aug 2019 at 09:49, Simon Oosthoek <s.oosthoek@xxxxxxxxxxxxx> wrote:
> Hi all,
>
> Yesterday I marked out all the osds on one node in our new cluster to
> reconfigure them with WAL/DB on their NVMe devices, but it is taking
> ages to rebalance.
>
> ceph tell 'osd.*' injectargs '--osd-max-backfills 16'
> ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'
>
> Since the cluster is currently hardly loaded, backfilling can take up
> all the unused bandwidth as far as I'm concerned...
> Is it a good idea to give the above commands or other commands to speed
> up the backfilling? (e.g. like increasing "osd max backfills")

"osd max backfills" is going to have a very large effect on recovery time, so that
would be the obvious knob to twist first. Check what it is set to now, then raise it
to 4, 8, 12, 16 in steps, checking at each step that it doesn't end up slowing the
rebalancing down.
Spinning drives without any SSD/NVMe journal/WAL/DB should perhaps have 1 or 2 at most;
SSDs can take more than that, and NVMe even more, before diminishing returns set in.
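
A rough sketch of that stepwise approach, not a tested procedure. The helper
names (run, step_backfills) and the DRY_RUN and PAUSE variables are made up
for illustration, and osd.0 is only an example ID: "ceph daemon" talks to the
admin socket, so it has to be run on the host that carries that OSD. By default
the script only prints the commands it would issue.

```shell
#!/bin/sh
# Sketch: raise osd_max_backfills in steps instead of jumping straight to 16.
# DRY_RUN and PAUSE are hypothetical knobs for this example, not ceph settings.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands (quoting is not shown)
PAUSE="${PAUSE:-600}"     # seconds to watch recovery between steps (assumption)

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Apply each value given as an argument, showing cluster status after each.
step_backfills() {
    for n in "$@"; do
        run ceph tell 'osd.*' injectargs "--osd-max-backfills $n"
        run ceph -s
        [ "$DRY_RUN" = "1" ] || sleep "$PAUSE"
    done
}

# Current value first (run on the host where osd.0 lives), then the steps.
run ceph daemon osd.0 config get osd_max_backfills
step_backfills 4 8 12 16
```

Run with DRY_RUN=0 to actually apply the changes. Stop at the last step where
the recovery rate in "ceph -s" still improved; on plain spinning drives that
may well be before 4.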

--
May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
