On 12.01.2018 at 21:21, Sage Weil wrote:
> On Fri, 12 Jan 2018, Stefan Priebe - Profihost AG wrote:
>> On 11.01.2018 at 21:37, Sage Weil wrote:
>>> On Thu, 11 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>> OK, it wasn't the balancer.
>>>>
>>>> It happens after executing all the reweight and crush compat commands.
>>>>
>>>> And even on a much bigger cluster it's 6% again. Some rounding issue? I
>>>> migrated with a script, so it's not a typo.
>>>
>>> Maybe.. can you narrow down which command it is? I'm guessing that one of
>>> the 'ceph osd crush weight-set reweight-compat ...' commands does it, but
>>> it would be nice to confirm whether it is a rounding issue or if something
>>> is broken!
>>
>> Hi Sage,
>>
>> it happens while executing:
>> ceph osd crush weight-set reweight-compat <osd> <optimized-weight>
>> ceph osd crush reweight <osd> <target-weight>
>> right after the first command (reweight-compat optimized-weight) for the
>
> It does this when the optimized-weight is *exactly* the same as the
> current (normal) weight? If it matches, it should be a no-op. Can you do
> a 'ceph osd crush tree' before and after the command so we can compare?
> (In fact, I think that first step is pointless, because when you create the
> compat weight-set it is populated with the regular CRUSH weights, which in
> your situation *are* the optimized weights.)
>
> Actually, looking at this more closely, it looks like the normal
> 'reweight' command sets the value in the compat weight-set too, so in
> reality we want to reorder those commands (and do them quickly in
> succession), e.g.:
>
>   ceph osd crush reweight osd.1 <target-weight>
>   ceph osd crush weight-set reweight-compat osd.1 <optimized-weight>
>
> but, again, the before and after PG layout should match. A 'ceph osd
> crush tree' dump before, after, and between will help sort out what is
> going on.

Here we go (attached) - the dumps are in JSON instead of plain text,
because my script had already written them out that way. I hope this helps.
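
For reference, a minimal sketch of the reordered per-OSD sequence, with
'ceph osd crush tree' dumps taken before, between, and after each step. The
weights file and its layout are purely illustrative (they are not from this
thread); only the ceph commands themselves are the ones discussed above.

  #!/bin/bash
  # Sketch only: weights.txt is a hypothetical file with one line per OSD,
  # formatted as "osd.N <target-weight> <optimized-weight>".
  WEIGHTS_FILE=weights.txt

  while read -r osd target optimized; do
      # JSON dump of the tree before touching this OSD.
      ceph osd crush tree -f json > "tree-${osd}-before.json"

      # Reordered as suggested: set the real CRUSH weight first ...
      ceph osd crush reweight "${osd}" "${target}"
      ceph osd crush tree -f json > "tree-${osd}-between.json"

      # ... then put the optimized weight back into the compat weight-set.
      ceph osd crush weight-set reweight-compat "${osd}" "${optimized}"
      ceph osd crush tree -f json > "tree-${osd}-after.json"
  done < "${WEIGHTS_FILE}"

The three dumps per OSD make it easy to see which of the two commands
actually changes the layout.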
Thanks,
Stefan

> Thanks!
> sage
>
>> 1st OSD it gets:
>> 0.4%
>> after the second (reweight target-weight):
>> 0.7%
>>
>> for each OSD it gets worse...
>>
>> Stefan
>>
>>> sage
>>>
>>>> Stefan
>>>>
>>>> On 11.01.2018 at 21:21, Stefan Priebe - Profihost AG wrote:
>>>>> Hi,
>>>>>
>>>>> On 11.01.2018 at 21:10, Sage Weil wrote:
>>>>>> On Thu, 11 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>> On 11.01.2018 at 20:58, Sage Weil wrote:
>>>>>>>> On Thu, 11 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>>>> Hi Sage,
>>>>>>>>>
>>>>>>>>> this did not work as expected. I tested it in another, smaller cluster
>>>>>>>>> and it resulted in about 6% misplaced objects.
>>>>>>>>
>>>>>>>> Can you narrow down at what stage the misplaced objects happened?
>>>>>>>
>>>>>>> ouch, I saw this:
>>>>>>> # ceph balancer status
>>>>>>> {
>>>>>>>     "active": true,
>>>>>>>     "plans": [
>>>>>>>         "auto_2018-01-11_19:52:28"
>>>>>>>     ],
>>>>>>>     "mode": "crush-compat"
>>>>>>> }
>>>>>>>
>>>>>>> so might it be the balancer being executed while I was modifying the tree?
>>>>>>> Can I stop it and re-execute it manually?
>>>>>>
>>>>>> You can always 'ceph balancer off'. And I probably wouldn't turn it on
>>>>>> until after you've cleaned this up, because it will balance with the
>>>>>> current weights being the 'target' weights (when in your case they're not
>>>>>> (yet)).
>>>>>>
>>>>>> To manually see what the balancer would do, you can run:
>>>>>>
>>>>>>   ceph balancer optimize foo
>>>>>>   ceph balancer show foo
>>>>>>   ceph balancer eval foo    # (see numerical analysis)
>>>>>>
>>>>>> and, if it looks good,
>>>>>>
>>>>>>   ceph balancer execute foo
>>>>>>
>>>>>> to actually apply the changes.
>>>>>
>>>>> ok, thanks, but it seems there are still leftovers somewhere:
>>>>> [expo-office-node1 ~]# ceph balancer optimize stefan
>>>>> Error EINVAL: Traceback (most recent call last):
>>>>>   File "/usr/lib/ceph/mgr/balancer/module.py", line 303, in handle_command
>>>>>     self.optimize(plan)
>>>>>   File "/usr/lib/ceph/mgr/balancer/module.py", line 596, in optimize
>>>>>     return self.do_crush_compat(plan)
>>>>>   File "/usr/lib/ceph/mgr/balancer/module.py", line 658, in do_crush_compat
>>>>>     orig_ws = self.get_compat_weight_set_weights()
>>>>>   File "/usr/lib/ceph/mgr/balancer/module.py", line 837, in get_compat_weight_set_weights
>>>>>     raise RuntimeError('could not find bucket %s' % b['bucket_id'])
>>>>> RuntimeError: could not find bucket -6
>>>>>
>>>>> [expo-office-node1 ~]# ceph osd tree
>>>>> ID CLASS WEIGHT   TYPE NAME                   STATUS REWEIGHT PRI-AFF
>>>>> -1       14.55957 root default
>>>>> -4        3.63989     host expo-office-node1
>>>>>  8   ssd  0.90997         osd.8                   up  1.00000 1.00000
>>>>>  9   ssd  0.90997         osd.9                   up  1.00000 1.00000
>>>>> 10   ssd  0.90997         osd.10                  up  1.00000 1.00000
>>>>> 11   ssd  0.90997         osd.11                  up  1.00000 1.00000
>>>>> -2        3.63989     host expo-office-node2
>>>>>  0   ssd  0.90997         osd.0                   up  1.00000 1.00000
>>>>>  1   ssd  0.90997         osd.1                   up  1.00000 1.00000
>>>>>  2   ssd  0.90997         osd.2                   up  1.00000 1.00000
>>>>>  3   ssd  0.90997         osd.3                   up  1.00000 1.00000
>>>>> -3        3.63989     host expo-office-node3
>>>>>  4   ssd  0.90997         osd.4                   up  1.00000 1.00000
>>>>>  5   ssd  0.90997         osd.5                   up  1.00000 1.00000
>>>>>  6   ssd  0.90997         osd.6                   up  1.00000 1.00000
>>>>>  7   ssd  0.90997         osd.7                   up  1.00000 1.00000
>>>>> -5        3.63989     host expo-office-node4
>>>>> 12   ssd  0.90997         osd.12                  up  1.00000 1.00000
>>>>> 13   ssd  0.90997         osd.13                  up  1.00000 1.00000
>>>>> 14   ssd  0.90997         osd.14                  up  1.00000 1.00000
>>>>> 15   ssd  0.90997         osd.15                  up  1.00000 1.00000
>>>>>
>>>>> Stefan
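
A side note on the 'could not find bucket -6' error above: a quick way to
check which bucket ids the compat weight-set still references, compared with
the buckets that actually exist, is to look at the crush dump. The jq filters
below are only a sketch; the exact choose_args layout may differ between
releases.

  # Bucket ids referenced by the weight-set entries (choose_args) ...
  ceph osd crush dump | jq '[.choose_args[][].bucket_id]'

  # ... versus the bucket ids that actually exist in the map.
  ceph osd crush dump | jq '[.buckets[].id]'

Any id that shows up in the first list but not in the second (like -6 here)
is a leftover. If that turns out to be the case, dropping and recreating the
compat weight-set ('ceph osd crush weight-set rm-compat' followed by
'ceph osd crush weight-set create-compat') might be one way to clean it up,
but that is only a guess based on the traceback, not something confirmed in
this thread.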
>>>>>
>>>>>> sage
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> sage
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Any ideas?
>>>>>>>>>
>>>>>>>>> Stefan
>>>>>>>>>
>>>>>>>>> On 11.01.2018 at 08:09, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>> Hi,
>>>>>>>>>> Thanks! Can this be done while still having Jewel clients?
>>>>>>>>>>
>>>>>>>>>> Stefan
>>>>>>>>>>
>>>>>>>>>> Excuse my typos - sent from my mobile phone.
>>>>>>>>>>
>>>>>>>>>> On 10.01.2018 at 22:56, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Wed, 10 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>>> On 10.01.2018 at 22:23, Sage Weil wrote:
>>>>>>>>>>>>> On Wed, 10 Jan 2018, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>>>>> OK,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> in the past we used the python-crush optimize tool to reweight the OSD
>>>>>>>>>>>>>> usage - it inserted a second tree with $hostname-target-weight as the
>>>>>>>>>>>>>> host names.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you attach a 'ceph osd crush tree' (or partial output) so I can see
>>>>>>>>>>>>> what you mean?
>>>>>>>>>>>>
>>>>>>>>>>>> Sure - attached.
>>>>>>>>>>>
>>>>>>>>>>> Got it.
>>>>>>>>>>>
>>>>>>>>>>>>>> Now the questions are:
>>>>>>>>>>>>>> 1.) Can we remove the tree? How?
>>>>>>>>>>>>>> 2.) Can we do this now, or only after all clients are running Luminous?
>>>>>>>>>>>>>> 3.) Is it enough to enable the mgr balancer module?
>>>>>>>>>>>
>>>>>>>>>>> First,
>>>>>>>>>>>
>>>>>>>>>>>   ceph osd crush weight-set create-compat
>>>>>>>>>>>
>>>>>>>>>>> then, for each osd,
>>>>>>>>>>>
>>>>>>>>>>>   ceph osd crush weight-set reweight-compat <osd> <optimized-weight>
>>>>>>>>>>>   ceph osd crush reweight <osd> <target-weight>
>>>>>>>>>>>
>>>>>>>>>>> That won't move any data but will keep your current optimized weights in
>>>>>>>>>>> the compat weight-set where they belong.
>>>>>>>>>>>
>>>>>>>>>>> Then you can remove the *-target-weight buckets. For each osd,
>>>>>>>>>>>
>>>>>>>>>>>   ceph osd crush rm <osd> <ancestor>-target-weight
>>>>>>>>>>>
>>>>>>>>>>> and then, for each remaining bucket,
>>>>>>>>>>>
>>>>>>>>>>>   ceph osd crush rm <foo>-target-weight
>>>>>>>>>>>
>>>>>>>>>>> Finally, turn on the balancer (or test it to see what it wants to do
>>>>>>>>>>> with the optimize command).
>>>>>>>>>>>
>>>>>>>>>>> HTH!
>>>>>>>>>>> sage
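
For completeness, the removal of the duplicate *-target-weight tree from the
quoted instructions can be scripted roughly as below. The host list, the root
bucket name, and the use of 'ceph osd ls-tree' to enumerate the OSDs under
each duplicate bucket are my own assumptions, not part of the instructions
above.

  #!/bin/bash
  # Sketch only: adjust HOSTS and the root bucket name to your own map.
  HOSTS="node1 node2 node3 node4"

  for host in ${HOSTS}; do
      # Unlink every OSD from the duplicate host bucket ...
      for id in $(ceph osd ls-tree "${host}-target-weight"); do
          ceph osd crush rm "osd.${id}" "${host}-target-weight"
      done
      # ... then remove the now-empty duplicate host bucket itself.
      ceph osd crush rm "${host}-target-weight"
  done

  # Finally remove the remaining root-level bucket of the duplicate tree,
  # e.g. if it was rooted at "default-target-weight":
  ceph osd crush rm default-target-weight

After that, 'ceph osd crush tree' should only show the real hosts.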
Attachment:
maps.tar.gz
Description: application/gzip