Uneven OSD usage

ceph osd reweight-by-utilization is OK to use, as long as it's temporary.
I've used it while waiting for new hardware to arrive.  It adjusts the
override weight (the REWEIGHT column in ceph osd tree, a value between 0
and 1), but not the CRUSH weight stored in the crushmap.  Yeah, there are
two different weights for an OSD.  Leave the crushmap weight as the size of
the disk in TB, and just adjust the override weight (what I call the tree
weight below).
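
To see both weights side by side, something like the following could work.
It is only a sketch: it assumes the JSON printed by ceph osd tree -f json
has a "nodes" list whose OSD entries carry "crush_weight" and "reweight"
fields, so check the output of your own release first.  The helper name
osd_weights and the sample data are made up for illustration.

```python
import json

def osd_weights(tree):
    """Return {osd_name: (crush_weight, reweight)} from a parsed osd tree."""
    weights = {}
    for node in tree.get("nodes", []):
        if node.get("type") != "osd":
            continue  # skip hosts, racks, roots
        weights[node["name"]] = (node["crush_weight"], node["reweight"])
    return weights

# Made-up sample standing in for the output of: ceph osd tree -f json
SAMPLE = json.loads("""
{"nodes": [
  {"id": -1, "name": "default", "type": "root"},
  {"id": 9, "name": "osd.9", "type": "osd", "crush_weight": 3.64, "reweight": 0.95},
  {"id": 10, "name": "osd.10", "type": "osd", "crush_weight": 3.64, "reweight": 1.0}
]}
""")

for name, (crush_w, override_w) in sorted(osd_weights(SAMPLE).items()):
    print(f"{name}: crush weight {crush_w}, reweight {override_w}")
```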


It will cause data migration (obviously, that's what you want).  I prefer
to use ceph osd reweight rather than reweight-by-utilization.  I can slowly
dial down the weight, one OSD at a time.
If I recall, I started with something like ceph osd reweight 9 0.975, and
lowered the weight by 0.025 each step.  Most OSDs were fine after one or
two steps, but some made it down to 0.80 before I was happy with them.  It
was an iterative process; sometimes reweighting the next OSD pushed data
back to the OSD I'd just finished reweighting.
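
The stepping described above can be sketched roughly like this.  It only
prints the commands; it does not run them.  The function name
reweight_steps and the 0.80 floor are my own choices for the example (0.80
is just the lowest value I ended up at), and Decimal is used so the 0.025
steps don't pick up floating point noise.

```python
from decimal import Decimal

def reweight_steps(osd_id, start="0.975", step="0.025", floor="0.80"):
    """Return the ceph osd reweight commands for one OSD, highest weight first."""
    weight, step, floor = Decimal(start), Decimal(step), Decimal(floor)
    commands = []
    while weight >= floor:
        commands.append(f"ceph osd reweight {osd_id} {weight}")
        weight -= step
    return commands

# Show the first few steps for OSD 9.
for cmd in reweight_steps(9)[:3]:
    print(cmd)
```

Run one step, let the backfill settle, look at the utilization again, and
only then decide whether the next step is needed.  Most OSDs never get
anywhere near the floor.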


I do remember running into problems with backfill_toofull though.  Doing a
reweight changes where CRUSH places PGs (the rules themselves stay the
same).  If I recall, I got into a state where two OSDs wanted to exchange
PGs, but they were both too full to accept them.  They had other PGs they
could vacate, but because I had osd_max_backfills = 1, it was stuck.  In the
end, I increased the osd backfill full ratio, like you did.  You can do it
without restarting the daemons, using:
ceph tell osd.\* injectargs '--osd_backfill_full_ratio 0.90'

I also recall telling the monitors too (I don't recall why though), using:
ceph tell mon.\* injectargs '--mon_osd_nearfull_ratio 0.90'



Be aware that marking an OSD as OUT will set its tree weight to 0, and
marking it IN will set the weight to 1.  Once you start using ceph osd
reweight, it's a good idea to keep track of the weights outside of ceph.
If any OSDs go OUT and come back IN, you'll want to manually set the
weight again, preferably before the OSD backfills itself toofull.
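
One way to keep track is a small record of the weights you set, compared
against what the cluster currently reports.  A minimal sketch, assuming you
maintain the two dictionaries yourself (OSD id mapped to override weight);
restore_commands is a made-up helper, and it only builds the command
strings.  OSDs currently at weight 0 are skipped on the assumption that
they are deliberately OUT.

```python
def restore_commands(saved, current):
    """Return commands that put drifted OSDs back on their recorded weight.

    saved and current both map an OSD id to its override weight (0 to 1).
    """
    commands = []
    for osd_id in sorted(saved):
        wanted = saved[osd_id]
        now = current.get(osd_id)
        if now is None or now == 0:
            continue  # unknown or still OUT: leave it alone
        if now != wanted:
            commands.append(f"ceph osd reweight {osd_id} {wanted}")
    return commands

saved = {9: 0.85, 12: 0.925, 17: 1.0}    # weights recorded earlier
current = {9: 1.0, 12: 0.925, 17: 1.0}   # osd.9 went OUT and came back IN at 1
print(restore_commands(saved, current))
```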


Once you get your new hardware, you should return all the OSD weights to
1, and just live with the uneven distribution until you can take
Christian's suggestion for chooseleaf_vary_r.