Re: ceph status reporting non-existing osd

Andrey Korolyov <andrey@xxxxxxx> · Mon, 16 Jul 2012 22:55:12 +0400



On Mon, Jul 16, 2012 at 10:48 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
> "ceph pg set_full_ratio 0.95"
> "ceph pg set_nearfull_ratio 0.94"
>
>
> On Monday, July 16, 2012 at 11:42 AM, Andrey Korolyov wrote:
>
>> On Mon, Jul 16, 2012 at 8:12 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>> > On Saturday, July 14, 2012 at 7:20 AM, Andrey Korolyov wrote:
>> > > On Fri, Jul 13, 2012 at 9:09 PM, Sage Weil <sage@xxxxxxxxxxx> wrote:
>> > > > On Fri, 13 Jul 2012, Gregory Farnum wrote:
>> > > > > On Fri, Jul 13, 2012 at 1:17 AM, Andrey Korolyov <andrey@xxxxxxx> wrote:
>> > > > > > Hi,
>> > > > > >
>> > > > > > Recently I've reduced my test suite from 6 to 4 osds at ~60% usage on
>> > > > > > a six-node cluster,
>> > > > > > and I removed a bunch of rbd objects during recovery to avoid
>> > > > > > overfilling.
>> > > > > > Right now I'm constantly receiving a warning about a nearfull state on
>> > > > > > a non-existent osd:
>> > > > > >
>> > > > > > health HEALTH_WARN 1 near full osd(s)
>> > > > > > monmap e3: 3 mons at
>> > > > > > {0=192.168.10.129:6789/0,1=192.168.10.128:6789/0,2=192.168.10.127:6789/0},
>> > > > > > election epoch 240, quorum 0,1,2 0,1,2
>> > > > > > osdmap e2098: 4 osds: 4 up, 4 in
>> > > > > > pgmap v518696: 464 pgs: 464 active+clean; 61070 MB data, 181 GB
>> > > > > > used, 143 GB / 324 GB avail
>> > > > > > mdsmap e181: 1/1/1 up {0=a=up:active}
>> > > > > >
>> > > > > > HEALTH_WARN 1 near full osd(s)
>> > > > > > osd.4 is near full at 89%
>> > > > > >
>> > > > > > Needless to say, osd.4 remains only in ceph.conf, not in the crushmap.
>> > > > > > The reduction was done 'on-line', i.e. without restarting the entire cluster.
>> > > > >
>> > > > > Whoops! It looks like Sage has written some patches to fix this, but
>> > > > > for now you should be good if you just update your ratios to a larger
>> > > > > number, and then bring them back down again. :)
>> > > >
>> > > > Restarting ceph-mon should also do the trick.
>> > > >
>> > > > Thanks for the bug report!
>> > > > sage
>> > >
>> > > Should I restart mons simultaneously?
>> > I don't think restarting will actually do the trick for you; you will actually need to set the ratios again.
>> >
>> > > Restarting one by one has no
>> > > effect, same as filling the data pool up to ~95 percent (btw, when I
>> > > deleted this 50 GB file on cephfs, the mds got stuck permanently and usage
>> > > remained the same until I dropped and recreated the data pool; I hope it's
>> > > one of the known posix layer bugs). I also deleted the entry from the
>> > > config and then restarted the mons, with no effect. Any suggestions?
>> >
>> > I'm not sure what you're asking about here?
>> > -Greg
>>
>> Oh, sorry, I misread and thought that you suggested filling up the
>> osds. How can I set the full/nearfull ratios correctly?
>>
>> $ ceph injectargs '--mon_osd_full_ratio 96'
>> parsed options
>> $ ceph injectargs '--mon_osd_near_full_ratio 94'
>> parsed options
>>
>> $ ceph pg dump | grep 'full'
>> full_ratio 0.95
>> nearfull_ratio 0.85
>>
>> Setting the parameters in ceph.conf and then restarting the mons does not
>> affect the ratios either.
>

Thanks, it worked, but setting the values back to their previous settings brings the warning back.
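
To illustrate why the warning depends on the ratio, here is a minimal sketch. It assumes the monitor simply compares the stale 89% utilization it still holds for osd.4 against the configured ratios; that comparison rule is an assumption, and the function names are hypothetical, not part of Ceph. The ratio values are the ones quoted in this thread.

```python
def parse_ratios(dump_text):
    """Pull full_ratio / nearfull_ratio out of `ceph pg dump | grep full` style text."""
    ratios = {}
    for line in dump_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("full_ratio", "nearfull_ratio"):
            ratios[parts[0]] = float(parts[1])
    return ratios


def osd_state(utilization, ratios):
    """Classify an OSD by utilization (0.0 to 1.0).

    Assumption: an OSD is flagged once its utilization reaches the ratio.
    """
    if utilization >= ratios["full_ratio"]:
        return "full"
    if utilization >= ratios["nearfull_ratio"]:
        return "near full"
    return "ok"


# Ratios as reported by `ceph pg dump | grep full` earlier in the thread.
before = parse_ratios("full_ratio 0.95\nnearfull_ratio 0.85\n")
# Ratios after "ceph pg set_full_ratio 0.95" and "ceph pg set_nearfull_ratio 0.94".
after = {"full_ratio": 0.95, "nearfull_ratio": 0.94}

stale_osd4 = 0.89  # "osd.4 is near full at 89%"

print(osd_state(stale_osd4, before))
print(osd_state(stale_osd4, after))
```

Under that assumption, 0.89 is above 0.85 but below 0.94, so raising the nearfull ratio hides the warning and lowering it again brings the warning back for as long as the stale osd.4 entry is kept.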