osd id == 2147483647 (2^31 - 1)

Hi Loic,

Thanks for the quick response. Before I started putting OSDs in/out there
were no such problems and the cluster health was OK. The second thing is
that I'm using the *rack* failure domain (and there are three racks), so
shouldn't there be two missing OSDs?
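
If it helps, I can also dump the up/acting sets for one of the stuck PGs
(e.g. 4.26 from the health detail below) and post the output - the
up/acting sets are where the 2147483647 placeholders appear:

ceph pg map 4.26
ceph pg 4.26 query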


mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph osd pool get ecpool
erasure_code_profile
erasure_code_profile: caas



mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph osd
erasure-code-profile get caas
directory=/usr/lib/ceph/erasure-code
k=3
m=2
plugin=jerasure
ruleset-failure-domain=rack
technique=reed_sol_van
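
For reference, an equivalent profile could be re-created from the dump
above with something like this (just a sketch reconstructed from the
listed keys, not necessarily the exact command I originally used):

ceph osd erasure-code-profile set caas \
    k=3 m=2 \
    plugin=jerasure \
    technique=reed_sol_van \
    ruleset-failure-domain=rack

With k=3 and m=2 each PG needs k+m = 5 shards, which is why I would
expect two empty slots when there are only three racks.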



mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph osd pool get ecpool
crush_ruleset
crush_ruleset: 2



mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph osd crush rule dump
[
    { "rule_id": 0,
      "rule_name": "replicated_ruleset",
      "ruleset": 0,
      "type": 1,
      "min_size": 1,
      "max_size": 10,
      "steps": [
            { "op": "take",
              "item": -1,
              "item_name": "default"},
            { "op": "chooseleaf_firstn",
              "num": 0,
              "type": "rack"},
            { "op": "emit"}]},
    { "rule_id": 1,
      "rule_name": "erasure-code",
      "ruleset": 1,
      "type": 3,
      "min_size": 3,
      "max_size": 20,
      "steps": [
            { "op": "set_chooseleaf_tries",
              "num": 5},
            { "op": "take",
              "item": -1,
              "item_name": "default"},
            { "op": "chooseleaf_indep",
              "num": 0,
              "type": "host"},
            { "op": "emit"}]},
    { "rule_id": 2,
      "rule_name": "ecpool",
      "ruleset": 2,
      "type": 3,
      "min_size": 3,
      "max_size": 20,
      "steps": [
            { "op": "set_chooseleaf_tries",
              "num": 5},
            { "op": "take",
              "item": -1,
              "item_name": "default"},
            { "op": "chooseleaf_indep",
              "num": 0,
              "type": "rack"},
            { "op": "emit"}]}]





Regards,
PS

On 05/26/2015 11:36 AM, Loic Dachary wrote:
> Hi,
>
> When an erasure coded pool
>
>    pool 4 'ecpool' erasure size 5 min_
>
> does not have enough OSDs to map a PG, the missing OSDs show up as 2147483647, and that's what you have in
>
>    [7,2,2147483647,6,10]
>
> In the case of a replicated pool, the missing OSDs would be omitted instead. In Hammer, 2147483647 shows as NONE, which is less confusing.
>
> Cheers
>
> On 26/05/2015 09:16, Paweł Sadowski wrote:
>> Has anyone seen something like this: osd id == 2147483647
>> (2147483647 == 2^31 - 1)? It looks like some int casting bug,
>> but I have no idea where to look for it (and I don't know the
>> exact steps to reproduce it - I was just doing osd in/osd out
>> multiple times to test recovery speed under some client load).
>>
>>
>>
>> mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph --version
>> ceph version 0.80.9 (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047)
>>
>>
>>
>> mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph osd tree
>> # id    weight    type name    up/down    reweight
>> -1    65.04    root default
>> -3    21.68        rack v2_2
>> -2    21.68            host cephhost-v3-6T-1600-1
>> 0     5.42                osd.0    up    1   
>> 5     5.42                osd.5    up    1   
>> 7     5.42                osd.7    up    1   
>> 9     5.42                osd.9    up    1   
>> -5    21.68        rack v2_1
>> -4    21.68            host cephhost-v3-6T-1600-3
>> 1     5.42                osd.1    up    1   
>> 3     5.42                osd.3    up    1   
>> 6     5.42                osd.6    up    1   
>> 10    5.42                osd.10   up    1   
>> -7    21.68        rack v2_0
>> -6    21.68            host cephhost-v3-6T-1600-2
>> 2     5.42                osd.2    up    1   
>> 4     5.42                osd.4    up    1   
>> 8     5.42                osd.8    up    1   
>> 11    5.42                osd.11   up    1   
>>
>>
>>
>> mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph -s
>>     cluster 01525673-bc76-433e-8a68-12578d797b1c
>>      health HEALTH_WARN 7 pgs stuck unclean; recovery 7018/585688
>> objects degraded (1.198%)
>>      monmap e1: 3 mons at
>> {mon-01-01525673-bc76-433e-8a68-12578d797b1c=xx.yy.192.50:6789/0,mon-02-01525673-bc76-433e-8a68-12578d797b1c=xx.yy.196.150:6789/0,mon-03-01525673-bc76-433e-8a68-12578d797b1c=xx.yy.196.50:6789/0},
>> election epoch 124, quorum 0,1,2
>> mon-01-01525673-bc76-433e-8a68-12578d797b1c,mon-03-01525673-bc76-433e-8a68-12578d797b1c,mon-02-01525673-bc76-433e-8a68-12578d797b1c
>>      osdmap e749: 12 osds: 12 up, 12 in
>>       pgmap v354398: 256 pgs, 4 pools, 657 GB data, 164 kobjects
>>             1553 GB used, 65097 GB / 66651 GB avail
>>             7018/585688 objects degraded (1.198%)
>>                  249 active+clean
>>                    7 active+remapped
>>   client io 8564 kB/s wr, 4282 op/s
>>
>>
>>
>> mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph health detail
>> HEALTH_WARN 7 pgs stuck unclean; recovery 7018/585688 objects degraded
>> (1.198%)
>> pg 4.6 is stuck unclean for 65882.220407, current state active+remapped,
>> last acting [2,5,2147483647,3,6]
>> pg 4.3d is stuck unclean for 329578.358665, current state
>> active+remapped, last acting [10,11,6,2147483647,7]
>> pg 4.26 is stuck unclean for 65921.804191, current state
>> active+remapped, last acting [7,2,2147483647,6,10]
>> pg 4.4 is stuck unclean for 67215.534376, current state active+remapped,
>> last acting [0,6,2147483647,8,1]
>> pg 4.2e is stuck unclean for 67215.524683, current state
>> active+remapped, last acting [4,0,6,2147483647,1]
>> pg 4.1f is stuck unclean for 67215.518614, current state
>> active+remapped, last acting [11,1,5,2147483647,6]
>> pg 4.b is stuck unclean for 65882.230284, current state active+remapped,
>> last acting [0,2147483647,3,6,2]
>> recovery 7018/585688 objects degraded (1.198%)
>>
>>
>>
>> mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph osd dump
>> epoch 749
>> fsid 01525673-bc76-433e-8a68-12578d797b1c
>> created 2015-05-14 10:15:08.505823
>> modified 2015-05-26 07:11:55.702597
>> flags
>> pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash
>> rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool
>> crash_replay_interval 45 stripe_width 0
>> pool 1 'metadata' replicated size 3 min_size 2 crush_ruleset 0
>> object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool
>> stripe_width 0
>> pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash
>> rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
>> pool 4 'ecpool' erasure size 5 min_size 3 crush_ruleset 2 object_hash
>> rjenkins pg_num 64 pgp_num 64 last_change 37 flags hashpspool
>> stripe_width 4128
>> max_osd 12
>> osd.0 up   in  weight 1 up_from 562 up_thru 728 down_at 560
>> last_clean_interval [556,559) xx.yy.196.51:6800/13911
>> xx.yy.210.50:6800/13911 xx.yy.210.50:6801/13911 xx.yy.196.51:6801/13911
>> exists,up 9383bc9b-e9b5-43ed-9508-51dc03027be4
>> osd.1 up   in  weight 1 up_from 717 up_thru 742 down_at 687
>> last_clean_interval [562,686) xx.yy.196.151:6800/12122
>> xx.yy.210.150:6800/12122 xx.yy.210.150:6801/12122
>> xx.yy.196.151:6801/12122 exists,up 870f7364-e258-47bc-83db-bd08f0881480
>> osd.2 up   in  weight 1 up_from 562 up_thru 748 down_at 560
>> last_clean_interval [557,559) xx.yy.192.51:6800/19295
>> xx.yy.208.50:6800/19295 xx.yy.208.50:6802/19295 xx.yy.192.51:6801/19295
>> exists,up 0f47c22b-b1e8-4329-93f6-1f0be320b4fe
>> osd.3 up   in  weight 1 up_from 675 up_thru 729 down_at 660
>> last_clean_interval [562,662) xx.yy.196.152:6800/13559
>> xx.yy.210.151:6800/13559 xx.yy.210.151:6801/13559
>> xx.yy.196.152:6801/13559 exists,up 714322e7-9b30-42c1-849d-10051dd4c070
>> osd.4 up   in  weight 1 up_from 562 up_thru 738 down_at 560
>> last_clean_interval [557,559) xx.yy.192.52:6800/19660
>> xx.yy.208.51:6800/19660 xx.yy.208.51:6801/19660 xx.yy.192.52:6801/19660
>> exists,up db93930d-fe02-499d-ab23-a68d023336c8
>> osd.5 up   in  weight 1 up_from 561 up_thru 735 down_at 560
>> last_clean_interval [557,559) xx.yy.196.52:6800/29702
>> xx.yy.210.51:6800/29702 xx.yy.210.51:6802/29702 xx.yy.196.52:6801/29702
>> exists,up 688ffcd6-ce68-4714-afa1-ea5963f40faf
>> osd.6 up   in  weight 1 up_from 562 up_thru 741 down_at 560
>> last_clean_interval [558,559) xx.yy.196.153:6800/2705
>> xx.yy.210.152:6800/2705 xx.yy.210.152:6802/2705 xx.yy.196.153:6801/2705
>> exists,up 9b9f09cf-0a0f-4e34-85d9-e95660bc2e0a
>> osd.7 up   in  weight 1 up_from 563 up_thru 744 down_at 562
>> last_clean_interval [557,559) xx.yy.196.53:6801/29560
>> xx.yy.210.52:6800/29560 xx.yy.210.52:6801/29560 xx.yy.196.53:6802/29560
>> exists,up c6e7bbd0-b060-4ddf-90e7-dacbb0c38c0a
>> osd.8 up   in  weight 1 up_from 562 up_thru 740 down_at 560
>> last_clean_interval [558,559) xx.yy.192.53:6800/15749
>> xx.yy.208.52:6801/15749 xx.yy.208.52:6802/15749 xx.yy.192.53:6801/15749
>> exists,up 5e40b427-b93d-4465-91ec-457395b71ddd
>> osd.9 up   in  weight 1 up_from 563 up_thru 707 down_at 562
>> last_clean_interval [557,559) xx.yy.196.54:6801/28813
>> xx.yy.210.53:6800/28813 xx.yy.210.53:6802/28813 xx.yy.196.54:6802/28813
>> exists,up 184e6b4d-4b19-4ee1-af94-507b349344d0
>> osd.10 up   in  weight 1 up_from 672 up_thru 689 down_at 658
>> last_clean_interval [561,659) xx.yy.196.154:6800/882
>> xx.yy.210.153:6800/882 xx.yy.210.153:6801/882 xx.yy.196.154:6801/882
>> exists,up 8a0807b3-548b-47fd-a4a2-f40c981753b0
>> osd.11 up   in  weight 1 up_from 563 up_thru 722 down_at 562
>> last_clean_interval [558,559) xx.yy.192.54:6800/15610
>> xx.yy.208.53:6800/15610 xx.yy.208.53:6802/15610 xx.yy.192.54:6802/15610
>> exists,up f1e462c0-e844-4290-8c71-f35d778641bd
>> pg_temp 4.4 [0,6,2147483647,8,1]
>> pg_temp 4.6 [2,5,2147483647,3,6]
>> pg_temp 4.b [0,2147483647,3,6,2]
>> pg_temp 4.1f [11,1,5,2147483647,6]
>> pg_temp 4.26 [7,2,2147483647,6,10]
>> pg_temp 4.2e [4,0,6,2147483647,1]
>> pg_temp 4.3d [10,11,6,2147483647,7]

