Re: Strategy for add new osds

I have to say I'm reading quite a few interesting strategies in this
thread, and I'd like to take a moment to compare them briefly:

1) Adding OSDs one by one

- Least amount of PG rebalancing per step
- Will potentially re-rebalance data that has just been redistributed
  when the next OSD is phased in
- Limits the impact if you hit a bug in the HDD/SSD series

The biggest problem with this approach is that you will re-re-re-balance
data over and over again, which slows down the process significantly.

2) Reweighted phase-in

- Start slowly by reweighting the new OSD to a small fraction of its
  final weight
- Lets you see how the new OSD performs
- Needs manual intervention to grow the weight
- Possibly delays the phase-in for "longer than necessary"

We use this approach when phasing in multiple, larger OSDs from a newer
/ not so well-known series of disks; a rough sketch follows below.
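Roughly, assuming osd.42 is the new disk with a target CRUSH weight of
about 9.1 (both the id and the weights here are made up), the manual
growing looks like this; setting "osd crush initial weight = 0"
beforehand keeps the OSD from taking data until the first reweight:

# start the new OSD at a small fraction of its target weight
ceph osd crush reweight osd.42 1.0
# watch recovery and client latency, then grow the weight step by step
ceph osd crush reweight osd.42 3.0
ceph osd crush reweight osd.42 6.0
# ... until the final weight for the disk size
ceph osd crush reweight osd.42 9.1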

3) noin / norebalance based phase-in

- Interesting approach to delay rebalancing until the "proper/final" new
  storage is in place
- Unclear how much of a difference it makes if you insert the new set of
  OSDs within a short timeframe (e.g. adding the 1st OSD at minute 0,
  the 2nd at minute 1, etc.); see the sketch below
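For reference, my understanding of the flag handling, with made-up OSD
ids; the actual OSD creation step is left out:

ceph osd set noin
ceph osd set norebalance
# deploy all new OSDs; with noin set they come up but stay "out"
# and take no data yet
ceph osd in osd.42
ceph osd in osd.43
ceph osd unset noin
# only now let the data move, against the final layout
ceph osd unset norebalance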


4) All at once / randomly

- Least amount of manual tuning
- In a way something one "would expect" Ceph to do right (but in
  practice it doesn't always)
- Might (likely) cause short-term re-adjustments
- Might cause client I/O slowdown (see next point)

5) General slowing down

What we actually do at datacenterlight.ch is slow down phase-ins by
default via the following tunings:

# Restrain recovery operations so that normal cluster I/O is not affected
[osd]
# at most one concurrent backfill per OSD
osd max backfills = 1
# at most one active recovery request per OSD at a time
osd recovery max active = 1
# lower priority for recovery ops relative to client ops (default is 3)
osd recovery op priority = 2

This works well in about 90% of the cases for us.
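On reasonably recent releases the same values can also be applied at
runtime instead of via ceph.conf, e.g.:

ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1
ceph config set osd osd_recovery_op_priority 2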

Quite an interesting thread, thanks everyone for sharing!

Cheers,

Nico


Anthony D'Atri <anthony.datri@xxxxxxxxx> writes:

>> Hi,
>>
>> as far as I understand it,
>>
>> you get no real benefit with doing them one by one, as each osd add can cause a lot of data to be moved to a different osd, even though you just rebalanced it.
>
> Less than with older releases, but yeah.
>
> I’ve known someone who advised against doing them in parallel because one would — for a time — have PGs with multiple remaps in the acting set.  The objection may have been paranoia, I’m not sure.
>
> One compromise is to upweight the new OSDs one node at a time, so the churn is limited to one failure domain at a time.
>
> — aad


--
Sustainable and modern Infrastructures by ungleich.ch
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



