Re: strange remap on host failure

See the release notes for the jewel releases which include
instructions for upgrading from hammer.
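
Roughly, the order described there is: upgrade and restart the monitors
first, then the OSDs, then any MDS/RGW daemons. A minimal sketch only,
assuming a straightforward hammer -> jewel move; the notes for the exact
point release you install are authoritative:

    ceph osd set noout          # optional: avoid rebalancing while daemons restart
    # upgrade the ceph packages and restart ceph-mon on each monitor host,
    # then ceph-osd on each OSD host
    ceph -s                     # confirm all daemons are back up and healthy
    ceph osd set sortbitwise    # per the jewel notes, once every OSD runs jewel
    ceph osd unset noout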

On Wed, May 31, 2017 at 1:53 PM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx> wrote:
> Hi Brad,
>
> Thank you for the answer.
> We are aware of the fact that hammer is close to retirement, and we are
> planning for the upgrade. BTW: can you recommend some documentation to read
> before the hammer -> jewel upgrade? I know
> http://docs.ceph.com/docs/jewel/install/upgrading-ceph/ and that google is
> my friend, but I'm asking just to be sure I'm not missing something that
> could be important.
>
> Thank you.
> Laszlo
>
>
> On 31.05.2017 02:51, Brad Hubbard wrote:
>>
>> It should also be noted that hammer is pretty close to retirement and
>> is a poor choice for new clusters.
>>
>> On Wed, May 31, 2017 at 6:17 AM, Gregory Farnum <gfarnum@xxxxxxxxxx>
>> wrote:
>>>
>>> On Mon, May 29, 2017 at 4:58 AM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx>
>>> wrote:
>>>>
>>>>
>>>> Hello all,
>>>>
>>>> We have a ceph cluster with 72 OSDs distributed across 6 hosts in 3
>>>> chassis. In our crush map we are distributing the PGs across chassis
>>>> (complete crush map below):
>>>>
>>>> # rules
>>>> rule replicated_ruleset {
>>>>          ruleset 0
>>>>          type replicated
>>>>          min_size 1
>>>>          max_size 10
>>>>          step take default
>>>>          step chooseleaf firstn 0 type chassis
>>>>          step emit
>>>> }
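>>>>
>>>> As a quick offline sanity check of this rule (a rough sketch; rule 0 and
>>>> three replicas assumed here, adjust --num-rep to the pool size), the
>>>> placement can be simulated with crushtool:
>>>>
>>>>     ceph osd getcrushmap -o crush.bin
>>>>     crushtool -i crush.bin --test --rule 0 --num-rep 3 --show-mappings | head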
>>>>
>>>> We had a host failure, and I can see that ceph is using 2 OSDs from the
>>>> same chassis for a lot of the remapped PGs. Even worse, I can see that
>>>> there are cases when a PG is using two OSDs from the same host like here:
>>>>
>>>> 3.5f6  37  0  4  37  0  149446656  3040  3040  active+remapped
>>>>   2017-05-26 11:29:23.122820  61820'222074  61820:158025
>>>>   [52,39]  52  [52,39,3]  52  61488'198356  2017-05-23 23:51:56.210597
>>>>   61488'198356  2017-05-23 23:51:56.210597
>>>>
>>>> I have this in the log:
>>>> 2017-05-26 11:26:53.244424 osd.52 10.12.193.69:6801/7044 1510 : cluster
>>>> [INF] 3.5f6 restarting backfill on osd.39 from (0'0,0'0] MAX to 61488'203000
>>>>
>>>> What can be wrong?
>>>
>>>
>>> It's not clear from the output you've provided whether your pools have
>>> size 2 or 3. From what you've shown, I'm guessing you have size 2, and
>>> the OSD failure prompted a move of the PG in question away from OSD 3
>>> to OSD 39. Since 39 doesn't have any of the data yet, OSD 3 is being
>>> kept in the acting set to maintain redundancy, but it will go away
>>> once the backfill is done.
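>>>
>>> To confirm both quickly (a sketch; replace <pool> with the pool's actual
>>> name):
>>>
>>>     ceph osd pool get <pool> size    # replicated size of the pool
>>>     ceph pg map 3.5f6                # prints the up set vs the acting set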
>>>
>>> In general, it's a failure of CRUSH's design goals if you see replicas
>>> move within buckets that didn't experience a failure, but it does
>>> sometimes happen. There have been a lot of improvements over the years
>>> to reduce how often that happens; some are supported by Hammer but not
>>> on by default (because enabling them prevents the use of older clients),
>>> and some are only in very new code like the Luminous dev releases. I
>>> suspect you'd find your cluster behaves better if you upgrade to Jewel
>>> and set the CRUSH tunables it recommends to you.
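>>>
>>> Something along these lines after the upgrade (a sketch only; changing
>>> tunables can trigger a lot of data movement and restricts which client
>>> versions can connect, so check the tunables documentation first):
>>>
>>>     ceph osd crush show-tunables     # see which profile is in effect now
>>>     ceph osd crush tunables optimal  # adopt the profile for the running release
>>>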
>>> -Greg
>>>
>>>>
>>>>
>>>> Our crush map looks like this:
>>>>
>>>> # begin crush map
>>>> tunable choose_local_tries 0
>>>> tunable choose_local_fallback_tries 0
>>>> tunable choose_total_tries 50
>>>> tunable chooseleaf_descend_once 1
>>>> tunable straw_calc_version 1
>>>>
>>>> # devices
>>>> device 0 osd.0
>>>> device 1 osd.1
>>>> device 2 osd.2
>>>> device 3 osd.3
>>>> ....
>>>> device 69 osd.69
>>>> device 70 osd.70
>>>> device 71 osd.71
>>>>
>>>> # types
>>>> type 0 osd
>>>> type 1 host
>>>> type 2 chassis
>>>> type 3 rack
>>>> type 4 row
>>>> type 5 pdu
>>>> type 6 pod
>>>> type 7 room
>>>> type 8 datacenter
>>>> type 9 region
>>>> type 10 root
>>>>
>>>> # buckets
>>>> host tv-c1-al01 {
>>>>          id -7           # do not change unnecessarily
>>>>          # weight 21.840
>>>>          alg straw
>>>>          hash 0  # rjenkins1
>>>>          item osd.5 weight 1.820
>>>>          item osd.11 weight 1.820
>>>>          item osd.17 weight 1.820
>>>>          item osd.23 weight 1.820
>>>>          item osd.29 weight 1.820
>>>>          item osd.35 weight 1.820
>>>>          item osd.41 weight 1.820
>>>>          item osd.47 weight 1.820
>>>>          item osd.53 weight 1.820
>>>>          item osd.59 weight 1.820
>>>>          item osd.65 weight 1.820
>>>>          item osd.71 weight 1.820
>>>> }
>>>> host tv-c1-al02 {
>>>>          id -3           # do not change unnecessarily
>>>>          # weight 21.840
>>>>          alg straw
>>>>          hash 0  # rjenkins1
>>>>          item osd.1 weight 1.820
>>>>          item osd.7 weight 1.820
>>>>          item osd.13 weight 1.820
>>>>          item osd.19 weight 1.820
>>>>          item osd.25 weight 1.820
>>>>          item osd.31 weight 1.820
>>>>          item osd.37 weight 1.820
>>>>          item osd.43 weight 1.820
>>>>          item osd.49 weight 1.820
>>>>          item osd.55 weight 1.820
>>>>          item osd.61 weight 1.820
>>>>          item osd.67 weight 1.820
>>>> }
>>>> chassis tv-c1 {
>>>>          id -8           # do not change unnecessarily
>>>>          # weight 43.680
>>>>          alg straw
>>>>          hash 0  # rjenkins1
>>>>          item tv-c1-al01 weight 21.840
>>>>          item tv-c1-al02 weight 21.840
>>>> }
>>>> host tv-c2-al01 {
>>>>          id -5           # do not change unnecessarily
>>>>          # weight 21.840
>>>>          alg straw
>>>>          hash 0  # rjenkins1
>>>>          item osd.3 weight 1.820
>>>>          item osd.9 weight 1.820
>>>>          item osd.15 weight 1.820
>>>>          item osd.21 weight 1.820
>>>>          item osd.27 weight 1.820
>>>>          item osd.33 weight 1.820
>>>>          item osd.39 weight 1.820
>>>>          item osd.45 weight 1.820
>>>>          item osd.51 weight 1.820
>>>>          item osd.57 weight 1.820
>>>>          item osd.63 weight 1.820
>>>>          item osd.70 weight 1.820
>>>> }
>>>> host tv-c2-al02 {
>>>>          id -2           # do not change unnecessarily
>>>>          # weight 21.840
>>>>          alg straw
>>>>          hash 0  # rjenkins1
>>>>          item osd.0 weight 1.820
>>>>          item osd.6 weight 1.820
>>>>          item osd.12 weight 1.820
>>>>          item osd.18 weight 1.820
>>>>          item osd.24 weight 1.820
>>>>          item osd.30 weight 1.820
>>>>          item osd.36 weight 1.820
>>>>          item osd.42 weight 1.820
>>>>          item osd.48 weight 1.820
>>>>          item osd.54 weight 1.820
>>>>          item osd.60 weight 1.820
>>>>          item osd.66 weight 1.820
>>>> }
>>>> chassis tv-c2 {
>>>>          id -9           # do not change unnecessarily
>>>>          # weight 43.680
>>>>          alg straw
>>>>          hash 0  # rjenkins1
>>>>          item tv-c2-al01 weight 21.840
>>>>          item tv-c2-al02 weight 21.840
>>>> }
>>>> host tv-c1-al03 {
>>>>          id -6           # do not change unnecessarily
>>>>          # weight 21.840
>>>>          alg straw
>>>>          hash 0  # rjenkins1
>>>>          item osd.4 weight 1.820
>>>>          item osd.10 weight 1.820
>>>>          item osd.16 weight 1.820
>>>>          item osd.22 weight 1.820
>>>>          item osd.28 weight 1.820
>>>>          item osd.34 weight 1.820
>>>>          item osd.40 weight 1.820
>>>>          item osd.46 weight 1.820
>>>>          item osd.52 weight 1.820
>>>>          item osd.58 weight 1.820
>>>>          item osd.64 weight 1.820
>>>>          item osd.69 weight 1.820
>>>> }
>>>> host tv-c2-al03 {
>>>>          id -4           # do not change unnecessarily
>>>>          # weight 21.840
>>>>          alg straw
>>>>          hash 0  # rjenkins1
>>>>          item osd.2 weight 1.820
>>>>          item osd.8 weight 1.820
>>>>          item osd.14 weight 1.820
>>>>          item osd.20 weight 1.820
>>>>          item osd.26 weight 1.820
>>>>          item osd.32 weight 1.820
>>>>          item osd.38 weight 1.820
>>>>          item osd.44 weight 1.820
>>>>          item osd.50 weight 1.820
>>>>          item osd.56 weight 1.820
>>>>          item osd.62 weight 1.820
>>>>          item osd.68 weight 1.820
>>>> }
>>>> chassis tv-c3 {
>>>>          id -10          # do not change unnecessarily
>>>>          # weight 43.680
>>>>          alg straw
>>>>          hash 0  # rjenkins1
>>>>          item tv-c1-al03 weight 21.840
>>>>          item tv-c2-al03 weight 21.840
>>>> }
>>>> root default {
>>>>          id -1           # do not change unnecessarily
>>>>          # weight 131.040
>>>>          alg straw
>>>>          hash 0  # rjenkins1
>>>>          item tv-c1 weight 43.680
>>>>          item tv-c2 weight 43.680
>>>>          item tv-c3 weight 43.680
>>>> }
>>>>
>>>> # rules
>>>> rule replicated_ruleset {
>>>>          ruleset 0
>>>>          type replicated
>>>>          min_size 1
>>>>          max_size 10
>>>>          step take default
>>>>          step chooseleaf firstn 0 type chassis
>>>>          step emit
>>>> }
>>>>
>>>> # end crush map
>>>>
>>>>
>>>> Thank you,
>>>> Laszlo



-- 
Cheers,
Brad
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


