Hi,

When an erasure coded pool (here, pool 4 'ecpool', erasure size 5) does not have enough OSDs to map a PG, each missing OSD shows as 2147483647, and that is what you have in [7,2,2147483647,6,10]. In a replicated pool, the missing OSDs would be omitted instead. In Hammer, 2147483647 shows as NONE, which is less confusing.

Cheers

On 26/05/2015 09:16, Paweł Sadowski wrote:
> Has anyone seen something like this: osd id == 2147483647
> (2147483647 == 2^31 - 1). Looks like some int casting bug
> but I have no idea where to look for it (and I don't know
> exact steps to reproduce this - I was just doing osd in/osd out
> multiple times to test recovery speed under some client load).
>
> mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph --version
> ceph version 0.80.9 (b5a67f0e1d15385bc0d60a6da6e7fc810bde6047)
>
> mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph osd tree
> # id   weight  type name                              up/down  reweight
> -1     65.04   root default
> -3     21.68       rack v2_2
> -2     21.68           host cephhost-v3-6T-1600-1
> 0      5.42                osd.0                      up       1
> 5      5.42                osd.5                      up       1
> 7      5.42                osd.7                      up       1
> 9      5.42                osd.9                      up       1
> -5     21.68       rack v2_1
> -4     21.68           host cephhost-v3-6T-1600-3
> 1      5.42                osd.1                      up       1
> 3      5.42                osd.3                      up       1
> 6      5.42                osd.6                      up       1
> 10     5.42                osd.10                     up       1
> -7     21.68       rack v2_0
> -6     21.68           host cephhost-v3-6T-1600-2
> 2      5.42                osd.2                      up       1
> 4      5.42                osd.4                      up       1
> 8      5.42                osd.8                      up       1
> 11     5.42                osd.11                     up       1
>
> mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph -s
> cluster 01525673-bc76-433e-8a68-12578d797b1c
> health HEALTH_WARN 7 pgs stuck unclean; recovery 7018/585688
> objects degraded (1.198%)
> monmap e1: 3 mons at
> {mon-01-01525673-bc76-433e-8a68-12578d797b1c=xx.yy.192.50:6789/0,mon-02-01525673-bc76-433e-8a68-12578d797b1c=xx.yy.196.150:6789/0,mon-03-01525673-bc76-433e-8a68-12578d797b1c=xx.yy.196.50:6789/0},
> election epoch 124, quorum 0,1,2
> mon-01-01525673-bc76-433e-8a68-12578d797b1c,mon-03-01525673-bc76-433e-8a68-12578d797b1c,mon-02-01525673-bc76-433e-8a68-12578d797b1c
> osdmap e749: 12 osds:
> 12 up, 12 in
> pgmap v354398: 256 pgs, 4 pools, 657 GB data, 164 kobjects
> 1553 GB used, 65097 GB / 66651 GB avail
> 7018/585688 objects degraded (1.198%)
> 249 active+clean
> 7 active+remapped
> client io 8564 kB/s wr, 4282 op/s
>
> mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph health detail
> HEALTH_WARN 7 pgs stuck unclean; recovery 7018/585688 objects degraded
> (1.198%)
> pg 4.6 is stuck unclean for 65882.220407, current state active+remapped,
> last acting [2,5,2147483647,3,6]
> pg 4.3d is stuck unclean for 329578.358665, current state
> active+remapped, last acting [10,11,6,2147483647,7]
> pg 4.26 is stuck unclean for 65921.804191, current state
> active+remapped, last acting [7,2,2147483647,6,10]
> pg 4.4 is stuck unclean for 67215.534376, current state active+remapped,
> last acting [0,6,2147483647,8,1]
> pg 4.2e is stuck unclean for 67215.524683, current state
> active+remapped, last acting [4,0,6,2147483647,1]
> pg 4.1f is stuck unclean for 67215.518614, current state
> active+remapped, last acting [11,1,5,2147483647,6]
> pg 4.b is stuck unclean for 65882.230284, current state active+remapped,
> last acting [0,2147483647,3,6,2]
> recovery 7018/585688 objects degraded (1.198%)
>
> mon-01-01525673-bc76-433e-8a68-12578d797b1c:~ # ceph osd dump
> epoch 749
> fsid 01525673-bc76-433e-8a68-12578d797b1c
> created 2015-05-14 10:15:08.505823
> modified 2015-05-26 07:11:55.702597
> flags
> pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool
> crash_replay_interval 45 stripe_width 0
> pool 1 'metadata' replicated size 3 min_size 2 crush_ruleset 0
> object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool
> stripe_width 0
> pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash
> rjenkins pg_num 64 pgp_num 64 last_change 1 flags hashpspool stripe_width 0
> pool 4 'ecpool' erasure size 5 min_size 3 crush_ruleset 2 object_hash
> rjenkins
> pg_num 64 pgp_num 64 last_change 37 flags hashpspool
> stripe_width 4128
> max_osd 12
> osd.0 up in weight 1 up_from 562 up_thru 728 down_at 560
> last_clean_interval [556,559) xx.yy.196.51:6800/13911
> xx.yy.210.50:6800/13911 xx.yy.210.50:6801/13911 xx.yy.196.51:6801/13911
> exists,up 9383bc9b-e9b5-43ed-9508-51dc03027be4
> osd.1 up in weight 1 up_from 717 up_thru 742 down_at 687
> last_clean_interval [562,686) xx.yy.196.151:6800/12122
> xx.yy.210.150:6800/12122 xx.yy.210.150:6801/12122
> xx.yy.196.151:6801/12122 exists,up 870f7364-e258-47bc-83db-bd08f0881480
> osd.2 up in weight 1 up_from 562 up_thru 748 down_at 560
> last_clean_interval [557,559) xx.yy.192.51:6800/19295
> xx.yy.208.50:6800/19295 xx.yy.208.50:6802/19295 xx.yy.192.51:6801/19295
> exists,up 0f47c22b-b1e8-4329-93f6-1f0be320b4fe
> osd.3 up in weight 1 up_from 675 up_thru 729 down_at 660
> last_clean_interval [562,662) xx.yy.196.152:6800/13559
> xx.yy.210.151:6800/13559 xx.yy.210.151:6801/13559
> xx.yy.196.152:6801/13559 exists,up 714322e7-9b30-42c1-849d-10051dd4c070
> osd.4 up in weight 1 up_from 562 up_thru 738 down_at 560
> last_clean_interval [557,559) xx.yy.192.52:6800/19660
> xx.yy.208.51:6800/19660 xx.yy.208.51:6801/19660 xx.yy.192.52:6801/19660
> exists,up db93930d-fe02-499d-ab23-a68d023336c8
> osd.5 up in weight 1 up_from 561 up_thru 735 down_at 560
> last_clean_interval [557,559) xx.yy.196.52:6800/29702
> xx.yy.210.51:6800/29702 xx.yy.210.51:6802/29702 xx.yy.196.52:6801/29702
> exists,up 688ffcd6-ce68-4714-afa1-ea5963f40faf
> osd.6 up in weight 1 up_from 562 up_thru 741 down_at 560
> last_clean_interval [558,559) xx.yy.196.153:6800/2705
> xx.yy.210.152:6800/2705 xx.yy.210.152:6802/2705 xx.yy.196.153:6801/2705
> exists,up 9b9f09cf-0a0f-4e34-85d9-e95660bc2e0a
> osd.7 up in weight 1 up_from 563 up_thru 744 down_at 562
> last_clean_interval [557,559) xx.yy.196.53:6801/29560
> xx.yy.210.52:6800/29560 xx.yy.210.52:6801/29560 xx.yy.196.53:6802/29560
> exists,up c6e7bbd0-b060-4ddf-90e7-dacbb0c38c0a
> osd.8 up in weight 1 up_from 562 up_thru 740 down_at 560
> last_clean_interval [558,559) xx.yy.192.53:6800/15749
> xx.yy.208.52:6801/15749 xx.yy.208.52:6802/15749 xx.yy.192.53:6801/15749
> exists,up 5e40b427-b93d-4465-91ec-457395b71ddd
> osd.9 up in weight 1 up_from 563 up_thru 707 down_at 562
> last_clean_interval [557,559) xx.yy.196.54:6801/28813
> xx.yy.210.53:6800/28813 xx.yy.210.53:6802/28813 xx.yy.196.54:6802/28813
> exists,up 184e6b4d-4b19-4ee1-af94-507b349344d0
> osd.10 up in weight 1 up_from 672 up_thru 689 down_at 658
> last_clean_interval [561,659) xx.yy.196.154:6800/882
> xx.yy.210.153:6800/882 xx.yy.210.153:6801/882 xx.yy.196.154:6801/882
> exists,up 8a0807b3-548b-47fd-a4a2-f40c981753b0
> osd.11 up in weight 1 up_from 563 up_thru 722 down_at 562
> last_clean_interval [558,559) xx.yy.192.54:6800/15610
> xx.yy.208.53:6800/15610 xx.yy.208.53:6802/15610 xx.yy.192.54:6802/15610
> exists,up f1e462c0-e844-4290-8c71-f35d778641bd
> pg_temp 4.4 [0,6,2147483647,8,1]
> pg_temp 4.6 [2,5,2147483647,3,6]
> pg_temp 4.b [0,2147483647,3,6,2]
> pg_temp 4.1f [11,1,5,2147483647,6]
> pg_temp 4.26 [7,2,2147483647,6,10]
> pg_temp 4.2e [4,0,6,2147483647,1]
> pg_temp 4.3d [10,11,6,2147483647,7]

-- 
Loïc Dachary, Artisan Logiciel Libre
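
P.S. To make the 2147483647 point concrete, here is a minimal Python sketch, not Ceph code. It assumes the placeholder equals 0x7fffffff (2^31 - 1), which to my knowledge is what the CRUSH source defines as CRUSH_ITEM_NONE. The helper names format_acting and missing_shards are made up for illustration. In an erasure coded pool the position in the acting set matters, so a hole is kept as a placeholder rather than dropped. The sketch prints an acting set the way Hammer does and lists which positions have no OSD.

```python
# Assumption: the placeholder for "no OSD mapped" is 0x7fffffff (2**31 - 1).
CRUSH_ITEM_NONE = 0x7FFFFFFF


def format_acting(acting):
    """Render an acting set, showing the placeholder as NONE (Hammer-style)."""
    parts = ["NONE" if osd == CRUSH_ITEM_NONE else str(osd) for osd in acting]
    return "[" + ",".join(parts) + "]"


def missing_shards(acting):
    """Return the positions in the acting set that have no OSD mapped."""
    return [i for i, osd in enumerate(acting) if osd == CRUSH_ITEM_NONE]


if __name__ == "__main__":
    acting = [7, 2, 2147483647, 6, 10]  # pg 4.26 from the report above
    print(format_acting(acting))
    print(missing_shards(acting))
```

For pg 4.26 this reports position 2 as the one without an OSD; the other four positions are mapped.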