On Wed, May 1, 2013 at 1:32 PM, Dino Yancey <dino2gnt@xxxxxxxxx> wrote:
> Hi Wyatt,
>
> This is almost certainly a configuration issue. If I recall, there is a
> min_size setting in the CRUSH rules for each pool that defaults to two,
> which you may also need to reduce to one. I don't have the documentation
> in front of me, so that's just off the top of my head...

Hmm, no. The min_size should be set automatically to 1/2 of the
specified size (rounded up), which would be 1 in this case.

What's the full output of "ceph -s"? Can you pastebin the output of
"ceph pg dump", please?
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

>
> Dino
>
>
> On Wed, May 1, 2013 at 3:19 PM, Wyatt Gorman <wyattgorman@xxxxxxxxxxxxxxx>
> wrote:
>>
>> Okay! Dino, thanks for your response. I reduced my metadata pool size and
>> data pool size to 1, which eliminated the "recovery 21/42 degraded
>> (50.000%)" at the end of my HEALTH_WARN error. So now, when I run "ceph
>> health" I get the following:
>>
>> HEALTH_WARN 384 pgs degraded; 384 pgs stale; 384 pgs stuck unclean
>>
>> So this seems to be from one single root cause. Any ideas? Again, is this
>> a corrupted drive issue that I can clean up, or is this still a ceph
>> configuration error?
>>
>>
>> On Wed, May 1, 2013 at 12:52 PM, Dino Yancey <dino2gnt@xxxxxxxxx> wrote:
>>>
>>> Hi Wyatt,
>>>
>>> You need to reduce the replication level on your existing pools to 1, or
>>> bring up another OSD. The default configuration specifies a replication
>>> level of 2, and the default CRUSH rules want to place a replica on two
>>> distinct OSDs. With one OSD, CRUSH can't determine placement for the
>>> replica, and so Ceph is reporting a degraded state.
>>>
>>> Dino
>>>
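For reference, the commands for what Dino describes would look roughly like
the following on a Bobtail-era cluster -- assuming the default "data",
"metadata", and "rbd" pools; substitute your own pool names if they differ:

    ceph osd dump | grep pool          # list pools and their replication settings
    ceph osd pool set data size 1
    ceph osd pool set metadata size 1
    ceph osd pool set rbd size 1

Running with size 1 is only reasonable on a throwaway test cluster, since the
data then has no redundancy; adding a second OSD and keeping size 2 is the
better long-term fix.
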
>>> On Wed, May 1, 2013 at 11:45 AM, Wyatt Gorman
>>> <wyattgorman@xxxxxxxxxxxxxxx> wrote:
>>>>
>>>> Well, those points solved the issue of the redefined host and the
>>>> unidentified protocol. The
>>>>
>>>> "HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean; recovery 21/42
>>>> degraded (50.000%)"
>>>>
>>>> error is still an issue, though. Is this something simple, like some
>>>> hard drive corruption that I can clean up with an fsck, or is this a
>>>> ceph issue?
>>>>
>>>>
>>>> On Wed, May 1, 2013 at 12:31 PM, Mike Dawson
>>>> <mike.dawson@xxxxxxxxxxxxxxxx> wrote:
>>>>>
>>>>> Wyatt,
>>>>>
>>>>> A few notes:
>>>>>
>>>>> - Yes, the second "host = ceph" under mon.a is redundant and should
>>>>> be deleted.
>>>>>
>>>>> - "auth client required = cephx [osd]" should be simply
>>>>> "auth client required = cephx".
>>>>>
>>>>> - Looks like you only have one OSD. You need at least as many (and
>>>>> hopefully more) OSDs as the highest replication level among your
>>>>> pools.
>>>>>
>>>>> Mike
>>>>>
>>>>>
>>>>> On 5/1/2013 12:23 PM, Wyatt Gorman wrote:
>>>>>>
>>>>>> Here is my ceph.conf. I just figured out that the second "host ="
>>>>>> isn't necessary, though it is like that in the 5-minute quick start
>>>>>> guide... (Perhaps I'll submit the couple of fixes that I've had to
>>>>>> implement so far.) That fixes the "redefined host" issue, but none
>>>>>> of the others.
>>>>>>
>>>>>> [global]
>>>>>>     # For version 0.55 and beyond, you must explicitly enable or
>>>>>>     # disable authentication with "auth" entries in [global].
>>>>>>     auth cluster required = cephx
>>>>>>     auth service required = cephx
>>>>>>     auth client required = cephx [osd]
>>>>>>     osd journal size = 1000
>>>>>>
>>>>>>     # The following assumes ext4 filesystem.
>>>>>>     filestore xattr use omap = true
>>>>>>
>>>>>>     # For Bobtail (v 0.56) and subsequent versions, you may add
>>>>>>     # settings for mkcephfs so that it will create and mount the
>>>>>>     # file system on a particular OSD for you. Remove the comment
>>>>>>     # `#` character for the following settings and replace the
>>>>>>     # values in braces with appropriate values, or leave the
>>>>>>     # following settings commented out to accept the default
>>>>>>     # values. You must specify the --mkfs option with mkcephfs in
>>>>>>     # order for the deployment script to utilize the following
>>>>>>     # settings, and you must define the 'devs' option for each osd
>>>>>>     # instance; see below.
>>>>>>     # osd mkfs type = {fs-type}
>>>>>>     # osd mkfs options {fs-type} = {mkfs options}   # default for xfs is "-f"
>>>>>>     # osd mount options {fs-type} = {mount options} # default mount option is "rw,noatime"
>>>>>>     # For example, for ext4, the mount option might look like this:
>>>>>>     # osd mkfs options ext4 = user_xattr,rw,noatime
>>>>>>
>>>>>>     # Execute $ hostname to retrieve the name of your host, and
>>>>>>     # replace {hostname} with the name of your host. For the
>>>>>>     # monitor, replace {ip-address} with the IP address of your
>>>>>>     # host.
>>>>>>
>>>>>> [mon.a]
>>>>>>     host = ceph
>>>>>>     mon addr = 10.81.2.100:6789
>>>>>>
>>>>>> [osd.0]
>>>>>>     host = ceph
>>>>>>     # For Bobtail (v 0.56) and subsequent versions, you may add
>>>>>>     # settings for mkcephfs so that it will create and mount the
>>>>>>     # file system on a particular OSD for you. Remove the comment
>>>>>>     # `#` character for the following setting for each OSD and
>>>>>>     # specify a path to the device if you use mkcephfs with the
>>>>>>     # --mkfs option.
>>>>>>     # devs = {path-to-device}
>>>>>>
>>>>>> [osd.1]
>>>>>>     host = ceph
>>>>>>     # devs = {path-to-device}
>>>>>>
>>>>>> [mds.a]
>>>>>>     host = ceph
>>>>>>
>>>>>>
>>>>>> On Wed, May 1, 2013 at 12:14 PM, Mike Dawson
>>>>>> <mike.dawson@xxxxxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>>     Wyatt,
>>>>>>
>>>>>>     Please post your ceph.conf.
>>>>>>
>>>>>>     - mike
>>>>>>
>>>>>>
>>>>>>     On 5/1/2013 12:06 PM, Wyatt Gorman wrote:
>>>>>>
>>>>>>         Hi everyone,
>>>>>>
>>>>>>         I'm setting up a test ceph cluster and am having trouble
>>>>>>         getting it running (great for testing, huh?). I went through
>>>>>>         the installation on Debian squeeze, and had to modify the
>>>>>>         mkcephfs script a bit because it calls monmaptool with too
>>>>>>         many parameters in the $args variable (mine had
>>>>>>         "--add a [ip address]:[port] [osd1]", and I had to get rid
>>>>>>         of the [osd1] part for the monmaptool command to take it).
>>>>>>         Anyway, I got it installed, started the service, waited a
>>>>>>         little while for it to build the fs, and ran "ceph health",
>>>>>>         and got (and am still getting after a day and a reboot) the
>>>>>>         following error. (Note: I have also been getting the first
>>>>>>         line in various calls; I'm unsure why it is complaining, as
>>>>>>         I followed the instructions...)
>>>>>>
>>>>>>         warning: line 34: 'host' in section 'mon.a' redefined
>>>>>>         2013-05-01 12:04:39.801102 b733b710 -1 WARNING: unknown auth
>>>>>>         protocol defined: [osd]
>>>>>>         HEALTH_WARN 384 pgs degraded; 384 pgs stuck unclean;
>>>>>>         recovery 21/42 degraded (50.000%)
>>>>>>
>>>>>>         Can anybody tell me the root of this issue, and how I can
>>>>>>         fix it? Thank you!
>>>>>>
>>>>>>         - Wyatt Gorman
>>>>>>
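As an aside, the "unknown auth protocol defined: [osd]" warning above comes
from the "[osd]" section header having been folded onto the end of the auth
line in the posted ceph.conf. With Mike's fix applied, and assuming the osd
settings were meant to sit under [osd] as in the quick start guide, that part
of the file would look roughly like this (shown only as an illustration):

    [global]
        auth cluster required = cephx
        auth service required = cephx
        auth client required = cephx

    [osd]
        osd journal size = 1000
        filestore xattr use omap = true
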
>>>
>>> --
>>> ______________________________
>>> Dino Yancey
>>> 2GNT.com Admin
>
>
> --
> ______________________________
> Dino Yancey
> 2GNT.com Admin
>

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com