Re: un-even data filled on OSDs

Thanks, Blair. Yes, I will plan to upgrade my cluster.

Thanks
Swami

On Fri, Jun 10, 2016 at 7:40 AM, Blair Bethwaite
<blair.bethwaite@xxxxxxxxx> wrote:
> Hi Swami,
>
> That's a known issue, which I believe is much improved in Jewel thanks
> to a priority queue added somewhere in the OSD op path. If I were you,
> I'd be planning to get off Firefly and upgrade.
>
> Cheers,
>
> On 10 June 2016 at 12:08, M Ranga Swami Reddy <swamireddy@xxxxxxxxx> wrote:
>> Blair - Thanks for the details. I already set a low priority for
>> recovery during rebalance/recovery activity.
>> Even though I set the recovery op priority to 5 (instead of 1) and
>> the client op priority to 63, some of my customers complained that
>> their VMs were unreachable for a few seconds/minutes during the
>> rebalancing. I'm not sure these low-priority settings are doing their
>> job as expected.
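>>
>> One way to double-check the values an OSD is actually running (this
>> assumes the default admin socket path) is, for example:
>>
>> # query the live config of osd.0 via its admin socket
>> ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show \
>>     | grep op_priority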
>>
>> Thanks
>> Swami
>>
>> On Thu, Jun 9, 2016 at 5:50 PM, Blair Bethwaite
>> <blair.bethwaite@xxxxxxxxx> wrote:
>>> Swami,
>>>
>>> Run it with the help option for more context:
>>> "./crush-reweight-by-utilization.py --help". In your example below
>>> it's reporting to you what changes it would make to your OSD reweight
>>> values based on the default option settings (because you didn't
>>> specify any options). To make the script actually apply those weight
>>> changes you need the "-d -r" or "--doit --really" flags.
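>>>
>>> For example (assuming the script is sitting in your current
>>> directory):
>>>
>>> # dry run (default): only prints the reweights it would apply
>>> ./crush-reweight-by-utilization.py
>>> # actually apply the proposed reweights
>>> ./crush-reweight-by-utilization.py --doit --really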
>>>
>>> If you want to get an idea of the impact the weight changes will have
>>> before actually starting to move data, I suggest setting norecover and
>>> nobackfill ("ceph osd set ...") on your cluster before making the
>>> weight changes. You can then examine the "ceph -s" output (looking at
>>> "objects misplaced") to determine the scale of recovery required.
>>> Unset the flags once you're ready to start, or back out the reweight
>>> settings if you change your mind. You'll also want to lower these
>>> recovery and backfill tunables to reduce the impact on client I/O (and
>>> if possible, avoid doing the reweight change during peak I/O hours):
>>> ceph tell osd.* injectargs '--osd-max-backfills 1'
>>> ceph tell osd.* injectargs '--osd-recovery-threads 1'
>>> ceph tell osd.* injectargs '--osd-recovery-op-priority 1'
>>> ceph tell osd.* injectargs '--osd-client-op-priority 63'
>>> ceph tell osd.* injectargs '--osd-recovery-max-active 1'
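>>>
>>> A rough outline of that preview workflow (the reweight commands
>>> themselves depend on what the script proposes) would be:
>>>
>>> ceph osd set norecover
>>> ceph osd set nobackfill
>>> # ...apply the proposed reweight changes here...
>>> ceph -s            # check "objects misplaced" for the expected scale
>>> # then either proceed:
>>> ceph osd unset nobackfill
>>> ceph osd unset norecover
>>> # ...or back out the reweight changes before unsetting the flags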
>>>
>>> Cheers,
>>>
>>> On 9 June 2016 at 20:20, M Ranga Swami Reddy <swamireddy@xxxxxxxxx> wrote:
>>>> Hi Blair,
>>>> I ran the script and the results are below:
>>>> ==
>>>> ./crush-reweight-by-utilization.py
>>>> average_util: 0.587024, overload_util: 0.704429, underload_util: 0.587024.
>>>> reweighted:
>>>> 43 (0.852690 >= 0.704429) [1.000000 -> 0.950000]
>>>> 238 (0.845154 >= 0.704429) [1.000000 -> 0.950000]
>>>> 104 (0.827908 >= 0.704429) [1.000000 -> 0.950000]
>>>> 173 (0.817063 >= 0.704429) [1.000000 -> 0.950000]
>>>> ==
>>>>
>>>> Is the above script saying to reweight osd.43 to 0.95?
>>>>
>>>> Thanks
>>>> Swami
>>>>
>>>> On Wed, Jun 8, 2016 at 10:34 AM, M Ranga Swami Reddy
>>>> <swamireddy@xxxxxxxxx> wrote:
>>>>> Blair - Thanks for the script... Btw, does this script have an option for a dry run?
>>>>>
>>>>> Thanks
>>>>> Swami
>>>>>
>>>>> On Wed, Jun 8, 2016 at 6:35 AM, Blair Bethwaite
>>>>> <blair.bethwaite@xxxxxxxxx> wrote:
>>>>>> Swami,
>>>>>>
>>>>>> Try https://github.com/cernceph/ceph-scripts/blob/master/tools/crush-reweight-by-utilization.py,
>>>>>> that'll work with Firefly and will allow you to tune down the
>>>>>> weight of only a specific number of overfull OSDs.
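>>>>>>
>>>>>> For example, to grab and inspect it (the clone location is just an
>>>>>> example):
>>>>>>
>>>>>> git clone https://github.com/cernceph/ceph-scripts.git
>>>>>> cd ceph-scripts/tools
>>>>>> ./crush-reweight-by-utilization.py --help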
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> On 7 June 2016 at 23:11, M Ranga Swami Reddy <swamireddy@xxxxxxxxx> wrote:
>>>>>>> OK, understood...
>>>>>>> To fix the nearfull warning, I am reducing the weight of the
>>>>>>> specific OSD that is filled >85%.
>>>>>>> Is this workaround advisable?
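>>>>>>>
>>>>>>> FWIW, the reduction itself is done with something along these
>>>>>>> lines, where the OSD id and weight are just placeholders:
>>>>>>>
>>>>>>> ceph osd reweight <osd-id> 0.90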
>>>>>>>
>>>>>>> Thanks
>>>>>>> Swami
>>>>>>>
>>>>>>> On Tue, Jun 7, 2016 at 6:37 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>>>>>>>> On Tue, 7 Jun 2016, M Ranga Swami Reddy wrote:
>>>>>>>>> Hi Sage,
>>>>>>>>> >Jewel and the latest hammer point release have an improved
>>>>>>>>> >reweight-by-utilization (ceph osd test-reweight-by-utilization ... to dry
>>>>>>>>> > run) to correct this.
>>>>>>>>>
>>>>>>>>> Thank you... But we are not planning to upgrade the cluster soon.
>>>>>>>>> So, in this case, are there any tunable options that would help,
>>>>>>>>> like "crush tunables optimal" or so?
>>>>>>>>> Or would any other configuration change help?
>>>>>>>>
>>>>>>>> Firefly also has reweight-by-utilization... it's just a bit less friendly
>>>>>>>> than the newer versions.  CRUSH tunables don't generally help here unless
>>>>>>>> you have lots of OSDs that are down+out.
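>>>>>>>>
>>>>>>>> On firefly that is roughly the following (the threshold is a
>>>>>>>> percentage of the average utilization; 110 is just an example):
>>>>>>>>
>>>>>>>> ceph osd reweight-by-utilization 110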
>>>>>>>>
>>>>>>>> Note that firefly is no longer supported.
>>>>>>>>
>>>>>>>> sage
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Swami
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jun 7, 2016 at 6:00 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
>>>>>>>>> > On Tue, 7 Jun 2016, M Ranga Swami Reddy wrote:
>>>>>>>>> >> Hello,
>>>>>>>>> >> I have around 100 OSDs in my Ceph cluster. A few of the OSDs are
>>>>>>>>> >> filled with >85% of data, while a few others are only filled with
>>>>>>>>> >> ~60-70%.
>>>>>>>>> >>
>>>>>>>>> >> Any reason why the OSDs are filling unevenly? Do I need to make any
>>>>>>>>> >> configuration tweaks to fix this? Please advise.
>>>>>>>>> >>
>>>>>>>>> >> PS: Ceph version is 0.80.7
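>>>>>>>>> >>
>>>>>>>>> >> For reference, I've been checking the per-OSD usage roughly like
>>>>>>>>> >> this ("ceph osd df" is not available this far back, as far as I
>>>>>>>>> >> know):
>>>>>>>>> >>
>>>>>>>>> >> ceph health detail     # lists the nearfull/full OSD warnings
>>>>>>>>> >> ceph pg dump osds      # per-OSD kb_used / kb_avail
>>>>>>>>> >> ceph osd tree          # current CRUSH weights and reweight values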
>>>>>>>>> >
>>>>>>>>> > Jewel and the latest hammer point release have an improved
>>>>>>>>> > reweight-by-utilization (ceph osd test-reweight-by-utilization ... to dry
>>>>>>>>> > run) to correct this.
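>>>>>>>>> >
>>>>>>>>> > After upgrading, the workflow would be roughly:
>>>>>>>>> >
>>>>>>>>> > ceph osd test-reweight-by-utilization   # dry run, shows proposed changes
>>>>>>>>> > ceph osd reweight-by-utilization        # apply them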
>>>>>>>>> >
>>>>>>>>> > sage
>>>>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Cheers,
>>>>>> ~Blairo
>>>
>>>
>>>
>>> --
>>> Cheers,
>>> ~Blairo
>
>
>
> --
> Cheers,
> ~Blairo