Re: ceph osd metadata fails if any osd is down

Done - http://tracker.ceph.com/issues/17685

thanks!

On Mon, Oct 24, 2016 at 1:17 PM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> On Mon, 24 Oct 2016, Wyllys Ingersoll wrote:
>> I think there is still a bug in the "osd metadata" reporting in 10.2.3:
>> the JSON structure returned is not terminated when an OSD has been added
>> but is not yet running or not yet in the crush map.
>>
>> It's an odd condition to get into, but if an issue causes the add
>> operation for a new disk to fail partway (for example, the permissions
>> on /var/lib/ceph/osd/osd/XXX being incorrectly set to root:root instead
>> of ceph:ceph), the metadata output does not terminate with the final
>> closing bracket "]".
>>
>> Here is the end of the (truncated) output from "ceph osd tree", showing
>> the recently added disk with no weight and marked "down":
>
> It looks like the code in OSDMonitor.cc is just buggy: there is some odd
> error handling that probably doesn't need to be there, and it does a
> "goto reply" instead of a simple "break", so we skip the close_section().
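>
> The shape of the problem, as a standalone sketch (hypothetical names and
> simplified output, not the actual OSDMonitor.cc code):
>
> // Sketch: emit a JSON array of per-OSD objects.  An error path that jumps
> // straight to the reply label skips the closing "]", which matches the
> // truncated output below.  Replacing "goto reply" with "break" keeps the
> // array terminated even when one OSD has no metadata to report.
> #include <iostream>
> #include <sstream>
> #include <vector>
>
> static int dump_osd_metadata(const std::vector<int>& osds, std::ostream& out)
> {
>   int r = 0;
>   out << "[";
>   for (size_t i = 0; i < osds.size(); ++i) {
>     if (i)
>       out << ",";
>     out << "{\"id\":" << osds[i] << "}";
>     if (osds[i] == 173) {    // pretend this OSD is down / has no metadata
>       r = -2;                // -ENOENT
>       goto reply;            // BUG: skips the close below; "break" would not
>     }
>   }
>   out << "]";                // the close_section() equivalent
>  reply:
>   return r;
> }
>
> int main()
> {
>   std::ostringstream out;
>   int r = dump_osd_metadata({0, 6, 173, 184}, out);
>   // prints: [{"id":0},{"id":6},{"id":173}  (r=-2)   -- note the missing "]"
>   std::cout << out.str() << "  (r=" << r << ")" << std::endl;
> }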
>
> Should be a quick fix and backport.  Do you mind opening a tracker ticket?
>
> Thanks!
> sage
>
>
>
>>  -2 130.67999             host ic1ss06
>>   0   3.62999                 osd.0         up  1.00000          1.00000
>>   6   3.62999                 osd.6         up  1.00000          1.00000
>>   7   3.62999                 osd.7         up  1.00000          1.00000
>>  13   3.62999                 osd.13        up  1.00000          1.00000
>>  21   3.62999                 osd.21        up  1.00000          1.00000
>>  27   3.62999                 osd.27        up  1.00000          1.00000
>>  33   3.62999                 osd.33        up  1.00000          1.00000
>>  39   3.62999                 osd.39        up  1.00000          1.00000
>>  46   3.62999                 osd.46        up  1.00000          1.00000
>>  48   3.62999                 osd.48        up  1.00000          1.00000
>>  55   3.62999                 osd.55        up  1.00000          1.00000
>>  60   3.62999                 osd.60        up  1.00000          1.00000
>>  66   3.62999                 osd.66        up  1.00000          1.00000
>>  72   3.62999                 osd.72        up  1.00000          1.00000
>>  75   3.62999                 osd.75        up  1.00000          1.00000
>>  81   3.62999                 osd.81        up  1.00000          1.00000
>>  88   3.62999                 osd.88        up  1.00000          1.00000
>>  97   3.62999                 osd.97        up  1.00000          1.00000
>>  99   3.62999                 osd.99        up  1.00000          1.00000
>> 102   3.62999                 osd.102       up  1.00000          1.00000
>> 110   3.62999                 osd.110       up  1.00000          1.00000
>> 120   3.62999                 osd.120       up  1.00000          1.00000
>> 127   3.62999                 osd.127       up  1.00000          1.00000
>> 129   3.62999                 osd.129       up  1.00000          1.00000
>> 136   3.62999                 osd.136       up  1.00000          1.00000
>> 140   3.62999                 osd.140       up  1.00000          1.00000
>> 147   3.62999                 osd.147       up  1.00000          1.00000
>> 155   3.62999                 osd.155       up  1.00000          1.00000
>> 165   3.62999                 osd.165       up  1.00000          1.00000
>> 166   3.62999                 osd.166       up  1.00000          1.00000
>> 174   3.62999                 osd.174       up  1.00000          1.00000
>> 184   3.62999                 osd.184       up  1.00000          1.00000
>> 190   3.62999                 osd.190       up  1.00000          1.00000
>> 194   3.62999                 osd.194       up  1.00000          1.00000
>> 202   3.62999                 osd.202       up  1.00000          1.00000
>> 209   3.62999                 osd.209       up  1.00000          1.00000
>> 173         0 osd.173                     down  1.00000          1.00000
>>
>>
>> Now when I run "ceph osd metadata", note that the closing "]" is missing:
>>
>> $ ceph osd metadata
>>
>> [
>> ...
>>         "osd": {
>>             "id": 213,
>>             "arch": "x86_64",
>>             "back_addr": "10.10.21.54:6861\/168468",
>>             "backend_filestore_dev_node": "unknown",
>>             "backend_filestore_partition_path": "unknown",
>>             "ceph_version": "ceph version 10.2.3
>> (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)",
>>             "cpu": "Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz",
>>             "distro": "Ubuntu",
>>             "distro_codename": "trusty",
>>             "distro_description": "Ubuntu 14.04.3 LTS",
>>             "distro_version": "14.04",
>>             "filestore_backend": "xfs",
>>             "filestore_f_type": "0x58465342",
>>             "front_addr": "10.10.20.54:6825\/168468",
>>             "hb_back_addr": "10.10.21.54:6871\/168468",
>>             "hb_front_addr": "10.10.20.54:6828\/168468",
>>             "hostname": "ic1ss04",
>>             "kernel_description": "#26~14.04.1-Ubuntu SMP Fri Jul 24
>> 21:16:20 UTC 2015",
>>             "kernel_version": "3.19.0-25-generic",
>>             "mem_swap_kb": "15998972",
>>             "mem_total_kb": "131927464",
>>             "os": "Linux",
>>             "osd_data": "\/var\/lib\/ceph\/osd\/ceph-213",
>>             "osd_journal": "\/var\/lib\/ceph\/osd\/ceph-213\/journal",
>>             "osd_objectstore": "filestore"
>>         },
>>         "osd": {
>>             "id": 214,
>>             "arch": "x86_64",
>>             "back_addr": "10.10.21.55:6877\/177645",
>>             "backend_filestore_dev_node": "unknown",
>>             "backend_filestore_partition_path": "unknown",
>>             "ceph_version": "ceph version 10.2.3
>> (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)",
>>             "cpu": "Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz",
>>             "distro": "Ubuntu",
>>             "distro_codename": "trusty",
>>             "distro_description": "Ubuntu 14.04.3 LTS",
>>             "distro_version": "14.04",
>>             "filestore_backend": "xfs",
>>             "filestore_f_type": "0x58465342",
>>             "front_addr": "10.10.20.55:6844\/177645",
>>             "hb_back_addr": "10.10.21.55:6879\/177645",
>>             "hb_front_addr": "10.10.20.55:6848\/177645",
>>             "hostname": "ic1ss05",
>>             "kernel_description": "#26~14.04.1-Ubuntu SMP Fri Jul 24
>> 21:16:20 UTC 2015",
>>             "kernel_version": "3.19.0-25-generic",
>>             "mem_swap_kb": "15998972",
>>             "mem_total_kb": "131927464",
>>             "os": "Linux",
>>             "osd_data": "\/var\/lib\/ceph\/osd\/ceph-214",
>>             "osd_journal": "\/var\/lib\/ceph\/osd\/ceph-214\/journal",
>>             "osd_objectstore": "filestore"
>>         }
>>     }
>> ^^^^
>> Missing closing "]"
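>>
>> A quick standalone way to check whether the dump is well-formed JSON; this
>> sketch assumes the single-header nlohmann/json library (3.x), which is not
>> something that ships with Ceph:
>>
>> // check_json.cpp -- read a blob on stdin and report whether it parses.
>> // Build: g++ -std=c++11 check_json.cpp -o check_json
>> // Use:   ceph osd metadata | ./check_json
>> #include <iostream>
>> #include <nlohmann/json.hpp>
>>
>> int main()
>> {
>>   try {
>>     nlohmann::json j;
>>     std::cin >> j;  // throws nlohmann::json::parse_error on truncated input
>>     std::cout << "valid JSON, " << j.size() << " top-level entries" << std::endl;
>>   } catch (const nlohmann::json::parse_error& e) {
>>     std::cerr << "invalid JSON: " << e.what() << std::endl;
>>     return 1;
>>   }
>>   return 0;
>> }
>>
>> Against truncated output like the above it should report a parse error at
>> the unexpected end of input rather than print the entry count.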
>>
>>
>> -Wyllys Ingersoll
>>  Keeper Technology, LLC
>>
>>
>> On Wed, Sep 21, 2016 at 5:12 PM, John Spray <jspray@xxxxxxxxxx> wrote:
>> > On Wed, Sep 21, 2016 at 6:29 PM, Wyllys Ingersoll
>> > <wyllys.ingersoll@xxxxxxxxxxxxxx> wrote:
>> >> In 10.2.2, when running "ceph osd metadata" (which defaults to fetching
>> >> metadata for "all" OSDs), if even one OSD is currently marked "down", the
>> >> entire command fails and returns an error:
>> >>
>> >> $ ceph osd metadata
>> >> Error ENOENT:
>> >>
>> >> - One OSD in the cluster was "down"; I removed that OSD and re-ran the
>> >> command successfully.
>> >>
>> >> It seems that the "metadata" command should be able to dump the data for
>> >> the OSDs that are up and ignore the ones that are down.  Is this a known
>> >> bug?
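>> >>
>> >> For what it is worth, here is the behavior I would expect for the
>> >> all-OSDs case, as a rough self-contained sketch (get_metadata() here is
>> >> made up for illustration; this is not the actual monitor code):
>> >>
>> >> #include <iostream>
>> >> #include <string>
>> >> #include <vector>
>> >>
>> >> // Stand-in for the monitor's per-OSD metadata lookup.
>> >> static int get_metadata(int id, std::string *out)
>> >> {
>> >>   if (id == 173)   // pretend this OSD is down and has nothing stored
>> >>     return -2;     // -ENOENT
>> >>   *out = "{\"id\":" + std::to_string(id) + "}";
>> >>   return 0;
>> >> }
>> >>
>> >> static void dump_all_osd_metadata(const std::vector<int>& osds,
>> >>                                   std::ostream& out)
>> >> {
>> >>   out << "[";
>> >>   bool first = true;
>> >>   for (int id : osds) {
>> >>     std::string meta;
>> >>     if (get_metadata(id, &meta) < 0)
>> >>       continue;    // skip OSDs with no metadata instead of failing the dump
>> >>     out << (first ? "" : ",") << meta;
>> >>     first = false;
>> >>   }
>> >>   out << "]";
>> >> }
>> >>
>> >> int main()
>> >> {
>> >>   dump_all_osd_metadata({0, 6, 173, 184}, std::cout);
>> >>   std::cout << std::endl;   // prints [{"id":0},{"id":6},{"id":184}]
>> >> }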
>> >
>> > Probably fixed by
>> > https://github.com/ceph/ceph/commit/f5db5a4b0bb52fed544f277c28ab5088d1c3fc79
>> > (which is in 10.2.3).
>> >
>> > John
>> >
>> >>
>> >> -Wyllys Ingersoll
>> >>  Keeper Technology, LLC