Re: Fwd: Upgrade Woes on suse leap with OBS ceph.

Kefu has just pointed out that this has the hallmarks of
https://github.com/ceph/ceph/pull/13275

On Fri, Feb 24, 2017 at 3:00 PM, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
> Hmm,
>
> What's interesting is the feature set reported by the servers has only
> changed from
>
> e0106b84a846a42
>
> Bit 1 set Bit 6 set Bit 9 set Bit 11 set Bit 13 set Bit 14 set Bit 18
> set Bit 23 set Bit 25 set Bit 27 set Bit 30 set Bit 35 set Bit 36 set
> Bit 37 set Bit 39 set Bit 41 set Bit 42 set Bit 48 set Bit 57 set Bit
> 58 set Bit 59 set
>
> to
>
> e0106b84a846a52
>
> Bit 1 set Bit 4 set Bit 6 set Bit 9 set Bit 11 set Bit 13 set Bit 14
> set Bit 18 set Bit 23 set Bit 25 set Bit 27 set Bit 30 set Bit 35 set
> Bit 36 set Bit 37 set Bit 39 set Bit 41 set Bit 42 set Bit 48 set Bit
> 57 set Bit 58 set Bit 59 set
>
> So all it's done is *added* Bit 4 which is DEFINE_CEPH_FEATURE( 4, 1,
> SUBSCRIBE2)
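
The comparison above can be reproduced with a few lines of Python. This is only a sketch: the function name set_bits and the variable names are made up for illustration, and the two masks are the server values quoted in this thread.

```python
def set_bits(mask: int) -> list[int]:
    """Return the positions of the bits set in mask, lowest bit first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Server feature masks as reported in the two kernel log excerpts in this thread.
first_report = 0xE0106B84A846A42
second_report = 0xE0106B84A846A52

# Bits present in the second mask but not the first, and the other way round.
added = set_bits(second_report & ~first_report)
removed = set_bits(first_report & ~second_report)

print("first :", set_bits(first_report))
print("second:", set_bits(second_report))
print("added :", added)
print("removed:", removed)
```

The added list contains only bit 4, and nothing is removed, which agrees with the two bit lists above.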
>
>
> On Fri, Feb 24, 2017 at 1:40 PM, Schlacta, Christ <aarcane@xxxxxxxxxxx> wrote:
>> # begin crush map
>> tunable choose_local_tries 0
>> tunable choose_local_fallback_tries 0
>> tunable choose_total_tries 50
>> tunable chooseleaf_descend_once 1
>> tunable chooseleaf_vary_r 1
>> tunable straw_calc_version 1
>> tunable allowed_bucket_algs 54
>>
>> # devices
>> device 0 osd.0
>> device 1 osd.1
>> device 2 osd.2
>>
>> # types
>> type 0 osd
>> type 1 host
>> type 2 chassis
>> type 3 rack
>> type 4 row
>> type 5 pdu
>> type 6 pod
>> type 7 room
>> type 8 datacenter
>> type 9 region
>> type 10 root
>>
>> # buckets
>> host densetsu {
>>         id -2           # do not change unnecessarily
>>         # weight 0.293
>>         alg straw
>>         hash 0  # rjenkins1
>>         item osd.0 weight 0.146
>>         item osd.1 weight 0.146
>> }
>> host density {
>>         id -3           # do not change unnecessarily
>>         # weight 0.145
>>         alg straw
>>         hash 0  # rjenkins1
>>         item osd.2 weight 0.145
>> }
>> root default {
>>         id -1           # do not change unnecessarily
>>         # weight 0.438
>>         alg straw
>>         hash 0  # rjenkins1
>>         item densetsu weight 0.293
>>         item density weight 0.145
>> }
>>
>> # rules
>> rule replicated_ruleset {
>>         ruleset 0
>>         type replicated
>>         min_size 1
>>         max_size 10
>>         step take default
>>         step chooseleaf firstn 0 type host
>>         step emit
>> }
>>
>> # end crush map
>>
>> On Thu, Feb 23, 2017 at 7:37 PM, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
>>> Did you dump out the crushmap and look?
>>>
>>> On Fri, Feb 24, 2017 at 1:36 PM, Schlacta, Christ <aarcane@xxxxxxxxxxx> wrote:
>>>> Insofar as I can tell, yes.  Everything indicates that the changes are in effect.
>>>>
>>>> On Thu, Feb 23, 2017 at 7:14 PM, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
>>>>> Is your change reflected in the current crushmap?
>>>>>
>>>>> On Fri, Feb 24, 2017 at 12:07 PM, Schlacta, Christ <aarcane@xxxxxxxxxxx> wrote:
>>>>>> ---------- Forwarded message ----------
>>>>>> From: Schlacta, Christ <aarcane@xxxxxxxxxxx>
>>>>>> Date: Thu, Feb 23, 2017 at 6:06 PM
>>>>>> Subject: Re:  Upgrade Woes on suse leap with OBS ceph.
>>>>>> To: Brad Hubbard <bhubbard@xxxxxxxxxx>
>>>>>>
>>>>>>
>>>>>> So setting the above to 0 by sheer brute force didn't work, so it's
>>>>>> not a crush or osd problem. Also, the errors still say mon0, so I
>>>>>> suspect it's related to communication between libceph in the kernel and
>>>>>> the mon.
>>>>>>
>>>>>> aarcane@densetsu:/etc/target$ sudo ceph --cluster rk osd crush tunables hammer
>>>>>> adjusted tunables profile to hammer
>>>>>> aarcane@densetsu:/etc/target$ ceph --cluster rk osd crush show-tunables
>>>>>> {
>>>>>>     "choose_local_tries": 0,
>>>>>>     "choose_local_fallback_tries": 0,
>>>>>>     "choose_total_tries": 50,
>>>>>>     "chooseleaf_descend_once": 1,
>>>>>>     "chooseleaf_vary_r": 1,
>>>>>>     "chooseleaf_stable": 0,
>>>>>>     "straw_calc_version": 1,
>>>>>>     "allowed_bucket_algs": 54,
>>>>>>     "profile": "hammer",
>>>>>>     "optimal_tunables": 0,
>>>>>>     "legacy_tunables": 0,
>>>>>>     "minimum_required_version": "firefly",
>>>>>>     "require_feature_tunables": 1,
>>>>>>     "require_feature_tunables2": 1,
>>>>>>     "has_v2_rules": 0,
>>>>>>     "require_feature_tunables3": 1,
>>>>>>     "has_v3_rules": 0,
>>>>>>     "has_v4_buckets": 0,
>>>>>>     "require_feature_tunables5": 0,
>>>>>>     "has_v5_rules": 0
>>>>>> }
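
To see exactly what switching profiles changed, the output above can be compared with the jewel-profile output quoted further down. A minimal sketch, assuming both outputs have been pasted into dictionaries by hand; changed_keys is a hypothetical helper, not a ceph command.

```python
# Tunables as shown in this thread under the jewel profile (before the change).
JEWEL = {
    "choose_local_tries": 0, "choose_local_fallback_tries": 0,
    "choose_total_tries": 50, "chooseleaf_descend_once": 1,
    "chooseleaf_vary_r": 1, "chooseleaf_stable": 1,
    "straw_calc_version": 1, "allowed_bucket_algs": 54,
    "profile": "jewel", "optimal_tunables": 1, "legacy_tunables": 0,
    "minimum_required_version": "jewel",
    "require_feature_tunables": 1, "require_feature_tunables2": 1,
    "has_v2_rules": 0, "require_feature_tunables3": 1, "has_v3_rules": 0,
    "has_v4_buckets": 0, "require_feature_tunables5": 1, "has_v5_rules": 0,
}

# The same tunables after "osd crush tunables hammer": start from the jewel
# values and override only the keys whose values differ in the hammer output.
HAMMER = dict(JEWEL)
HAMMER.update({
    "chooseleaf_stable": 0, "profile": "hammer", "optimal_tunables": 0,
    "minimum_required_version": "firefly", "require_feature_tunables5": 0,
})


def changed_keys(old: dict, new: dict) -> dict:
    """Map each key whose value differs to an (old, new) pair."""
    keys = sorted(old.keys() | new.keys())
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


for key, (old, new) in changed_keys(JEWEL, HAMMER).items():
    print(f"{key}: {old} -> {new}")
```

Five keys differ, and require_feature_tunables5 is among them, so the profile change did take effect even though the map attempt below still fails.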
>>>>>>
>>>>>> aarcane@densetsu:/etc/target$ sudo rbd --cluster rk map rt1
>>>>>> rbd: sysfs write failed
>>>>>> In some cases useful info is found in syslog - try "dmesg | tail" or so.
>>>>>> rbd: map failed: (110) Connection timed out
>>>>>> aarcane@densetsu:~$ dmesg | tail
>>>>>> [10118.778868] libceph: mon0 10.0.0.67:6789 feature set mismatch, my
>>>>>> 40106b84a842a52 < server's e0106b84a846a52, missing a00000000004000
>>>>>> [10118.779597] libceph: mon0 10.0.0.67:6789 missing required protocol features
>>>>>> [10119.834634] libceph: mon0 10.0.0.67:6789 feature set mismatch, my
>>>>>> 40106b84a842a52 < server's e0106b84a846a52, missing a00000000004000
>>>>>> [10119.835174] libceph: mon0 10.0.0.67:6789 missing required protocol features
>>>>>> [10120.762983] libceph: mon0 10.0.0.67:6789 feature set mismatch, my
>>>>>> 40106b84a842a52 < server's e0106b84a846a52, missing a00000000004000
>>>>>> [10120.763707] libceph: mon0 10.0.0.67:6789 missing required protocol features
>>>>>> [10121.787128] libceph: mon0 10.0.0.67:6789 feature set mismatch, my
>>>>>> 40106b84a842a52 < server's e0106b84a846a52, missing a00000000004000
>>>>>> [10121.787847] libceph: mon0 10.0.0.67:6789 missing required protocol features
>>>>>> [10122.911117] libceph: mon0 10.0.0.67:6789 feature set mismatch, my
>>>>>> 40106b84a842a52 < server's e0106b84a846a52, missing a00000000004000
>>>>>> [10122.911872] libceph: mon0 10.0.0.67:6789 missing required protocol features
>>>>>> aarcane@densetsu:~$
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 23, 2017 at 5:56 PM, Schlacta, Christ <aarcane@xxxxxxxxxxx> wrote:
>>>>>>> They're from the suse leap ceph team.  They maintain ceph, and build
>>>>>>> up to date versions for suse leap.  What I don't know is how to
>>>>>>> disable it.  When I try, I get the following mess:
>>>>>>>
>>>>>>> aarcane@densetsu:/etc/target$ ceph --cluster rk osd crush set-tunable
>>>>>>> require_feature_tunables5 0
>>>>>>> Invalid command:  require_feature_tunables5 not in straw_calc_version
>>>>>>> osd crush set-tunable straw_calc_version <int> :  set crush tunable
>>>>>>> <tunable> to <value>
>>>>>>> Error EINVAL: invalid command
>>>>>>>
>>>>>>> On Thu, Feb 23, 2017 at 5:54 PM, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
>>>>>>>> On Fri, Feb 24, 2017 at 11:00 AM, Schlacta, Christ <aarcane@xxxxxxxxxxx> wrote:
>>>>>>>>> aarcane@densetsu:~$ ceph --cluster rk osd crush show-tunables
>>>>>>>>> {
>>>>>>>>>     "choose_local_tries": 0,
>>>>>>>>>     "choose_local_fallback_tries": 0,
>>>>>>>>>     "choose_total_tries": 50,
>>>>>>>>>     "chooseleaf_descend_once": 1,
>>>>>>>>>     "chooseleaf_vary_r": 1,
>>>>>>>>>     "chooseleaf_stable": 1,
>>>>>>>>>     "straw_calc_version": 1,
>>>>>>>>>     "allowed_bucket_algs": 54,
>>>>>>>>>     "profile": "jewel",
>>>>>>>>>     "optimal_tunables": 1,
>>>>>>>>>     "legacy_tunables": 0,
>>>>>>>>>     "minimum_required_version": "jewel",
>>>>>>>>>     "require_feature_tunables": 1,
>>>>>>>>>     "require_feature_tunables2": 1,
>>>>>>>>>     "has_v2_rules": 0,
>>>>>>>>>     "require_feature_tunables3": 1,
>>>>>>>>>     "has_v3_rules": 0,
>>>>>>>>>     "has_v4_buckets": 0,
>>>>>>>>>     "require_feature_tunables5": 1,
>>>>>>>>
>>>>>>>> I suspect setting the above to 0 would resolve the issue with the
>>>>>>>> client but there may be a reason why this is set?
>>>>>>>>
>>>>>>>> Where did those packages come from?
>>>>>>>>
>>>>>>>>>     "has_v5_rules": 0
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> On Thu, Feb 23, 2017 at 4:45 PM, Brad Hubbard <bhubbard@xxxxxxxxxx> wrote:
>>>>>>>>>> On Thu, Feb 23, 2017 at 5:18 PM, Schlacta, Christ <aarcane@xxxxxxxxxxx> wrote:
>>>>>>>>>>> So I updated suse leap, and now I'm getting the following error from
>>>>>>>>>>> ceph.  I know I need to disable some features, but I'm not sure what
>>>>>>>>>>> they are.  Looks like 14, 57, and 59, but I can't figure out what
>>>>>>>>>>> they correspond to, nor, therefore, how to turn them off.
>>>>>>>>>>>
>>>>>>>>>>> libceph: mon0 10.0.0.67:6789 feature set mismatch, my 40106b84a842a42
>>>>>>>>>>> < server's e0106b84a846a42, missing a00000000004000
>>>>>>>>>>
>>>>>>>>>> http://cpp.sh/2rfy says...
>>>>>>>>>>
>>>>>>>>>> Bit 14 set
>>>>>>>>>> Bit 57 set
>>>>>>>>>> Bit 59 set
>>>>>>>>>>
>>>>>>>>>> Comparing this to
>>>>>>>>>> https://github.com/ceph/ceph/blob/master/src/include/ceph_features.h
>>>>>>>>>> shows...
>>>>>>>>>>
>>>>>>>>>> DEFINE_CEPH_FEATURE(14, 2, SERVER_KRAKEN)
>>>>>>>>>> DEFINE_CEPH_FEATURE(57, 1, MON_STATEFUL_SUB)
>>>>>>>>>> DEFINE_CEPH_FEATURE(57, 1, MON_ROUTE_OSDMAP) // overlap
>>>>>>>>>> DEFINE_CEPH_FEATURE(57, 1, OSDSUBOP_NO_SNAPCONTEXT) // overlap
>>>>>>>>>> DEFINE_CEPH_FEATURE(57, 1, SERVER_JEWEL) // overlap
>>>>>>>>>> DEFINE_CEPH_FEATURE(59, 1, FS_BTIME)
>>>>>>>>>> DEFINE_CEPH_FEATURE(59, 1, FS_CHANGE_ATTR) // overlap
>>>>>>>>>> DEFINE_CEPH_FEATURE(59, 1, MSG_ADDR2) // overlap
>>>>>>>>>>
>>>>>>>>>> $ echo "obase=16;ibase=16;$(echo e0106b84a846a42-a00000000004000|tr
>>>>>>>>>> '[a-z]' '[A-Z]')"|bc -qi
>>>>>>>>>> obase=16;ibase=16;E0106B84A846A42-A00000000004000
>>>>>>>>>> 40106B84A842A42
>>>>>>>>>>
>>>>>>>>>> So "me" (the client kernel) does not have the above features that are
>>>>>>>>>> present on the servers.
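
The same check can be done without bc or an online tool. The sketch below parses the log line quoted above and names the missing bits. The regular expression and the helper parse_mismatch are assumptions written for this one line format, and the name table only covers the bits quoted from ceph_features.h in this message. That "missing" equals the server bits the client lacks is inferred from the numbers in the log, not taken from the kernel source.

```python
import re

# Names copied from the ceph_features.h lines quoted above; some bits are shared.
FEATURE_NAMES = {
    14: ["SERVER_KRAKEN"],
    57: ["MON_STATEFUL_SUB", "MON_ROUTE_OSDMAP",
         "OSDSUBOP_NO_SNAPCONTEXT", "SERVER_JEWEL"],
    59: ["FS_BTIME", "FS_CHANGE_ATTR", "MSG_ADDR2"],
}

LOG_LINE = ("libceph: mon0 10.0.0.67:6789 feature set mismatch, "
            "my 40106b84a842a42 < server's e0106b84a846a42, "
            "missing a00000000004000")

PATTERN = re.compile(r"my ([0-9a-f]+) < server's ([0-9a-f]+), missing ([0-9a-f]+)")


def parse_mismatch(line: str) -> tuple[int, int, int]:
    """Return (client, server, missing) masks from a mismatch log line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a feature set mismatch line")
    client, server, missing = (int(group, 16) for group in match.groups())
    return client, server, missing


client, server, missing = parse_mismatch(LOG_LINE)

# For this line, the three values agree with each other in both ways.
assert missing == (server & ~client)
assert server - missing == client

missing_bits = [bit for bit in range(missing.bit_length()) if (missing >> bit) & 1]
for bit in missing_bits:
    print(bit, ", ".join(FEATURE_NAMES.get(bit, ["(not in table)"])))
```
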
>>>>>>>>>>
>>>>>>>>>> Can you post the output of "ceph osd crush show-tunables"?
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> SuSE Leap 42.2 is Up to date as of tonight, no package updates available.
>>>>>>>>>>> All the ceph packages have the following version:
>>>>>>>>>>>
>>>>>>>>>>> 11.1.0+git.1486588482.ba197ae-72.1
>>>>>>>>>>>
>>>>>>>>>>> And the kernel has version:
>>>>>>>>>>>
>>>>>>>>>>> 4.4.49-16.1
>>>>>>>>>>>
>>>>>>>>>>> It was working perfectly before the upgrade.
>>>>>>>>>>>
>>>>>>>>>>> Thank you very much
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> ceph-users mailing list
>>>>>>>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Cheers,
>>>>>>>>>> Brad
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Cheers,
>>>>>>>> Brad
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Cheers,
>>>>> Brad
>>>
>>>
>>>
>>> --
>>> Cheers,
>>> Brad
>
>
>
> --
> Cheers,
> Brad



-- 
Cheers,
Brad


