On 02/23/2018 12:42 AM, Mike Lovell wrote:
> Was the pg-upmap feature used to force a PG to get mapped to a
> particular OSD?
Yes it was. This is a semi-production cluster where the balancer module
has been enabled with the upmap feature.
It seems to have remapped PGs to OSDs on the same host.
root@man:~# ceph osd dump|grep pg_upmap|grep 1.41
pg_upmap_items 1.41 [9,15,11,7,10,2]
root@man:~#
I don't know exactly what I have to extract from that output, but it
does seem to be the case here.
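A minimal sketch for reading that line, assuming the list after the PG id is a flat sequence of (from, to) OSD pairs, which is how pg_upmap_items entries are laid out. The function name decode_upmap_items is made up for illustration:

```shell
# Split a pg_upmap_items line from `ceph osd dump` into "from -> to" pairs.
# Assumed input format: pg_upmap_items <pgid> [from1,to1,from2,to2,...]
decode_upmap_items() {
    awk '$1 == "pg_upmap_items" {
        list = $3
        gsub(/\[/, "", list)          # strip the surrounding brackets
        gsub(/\]/, "", list)
        n = split(list, ids, ",")
        for (i = 1; i + 1 <= n; i += 2)
            printf "%s: osd.%s -> osd.%s\n", $2, ids[i], ids[i + 1]
    }'
}

# The entry from this cluster:
echo 'pg_upmap_items 1.41 [9,15,11,7,10,2]' | decode_upmap_items
```

Read that way, the entry replaces osd.9 with osd.15, osd.11 with osd.7 and osd.10 with osd.2. If CRUSH alone picked something like [9,11,4] for this PG (an assumption, the raw mapping is not shown anywhere in this thread), those substitutions would yield the observed up set [15,7,4], with osd.15 and osd.4 both on n02.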
I removed the upmap entry for this PG, which fixed it:
$ ceph osd rm-pg-upmap-items 1.41
I also disabled the balancer for now (will report an issue) and removed
all other upmap entries:
$ ceph osd dump|grep pg_upmap_items|awk '{print $2}'|xargs -n 1 ceph osd rm-pg-upmap-items
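For anyone who would rather review the removals before running them, here is a small dry-run sketch. The helper name upmap_removal_commands is hypothetical, and the second sample line (PG 1.7f) is invented for the demo; only the 1.41 entry comes from this cluster:

```shell
# Turn `ceph osd dump` output into the rm-pg-upmap-items commands that would
# be run, without executing anything.
upmap_removal_commands() {
    awk '$1 == "pg_upmap_items" { print "ceph osd rm-pg-upmap-items", $2 }'
}

# Demo on canned input; on a live cluster the input would be `ceph osd dump`.
printf '%s\n' \
    'pg_upmap_items 1.41 [9,15,11,7,10,2]' \
    'pg_upmap_items 1.7f [3,8]' | upmap_removal_commands
```

Once the printed list looks right, it can be piped to sh to actually apply it.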
Thanks for the hint!
Wido
mike
On Thu, Feb 22, 2018 at 10:28 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
Hi,
I have a situation with a cluster which was recently upgraded to
Luminous and has a PG mapped to OSDs on the same host.
root@man:~# ceph pg map 1.41
osdmap e21543 pg 1.41 (1.41) -> up [15,7,4] acting [15,7,4]
root@man:~#
root@man:~# ceph osd find 15|jq -r '.crush_location.host'
n02
root@man:~# ceph osd find 7|jq -r '.crush_location.host'
n01
root@man:~# ceph osd find 4|jq -r '.crush_location.host'
n02
root@man:~#
As you can see, OSD 15 and 4 are both on the host 'n02'.
This PG went inactive when the machine hosting both OSDs went down
for maintenance.
My first suspect was the CRUSHMap and the rules, but those are fine:
rule replicated_ruleset {
        id 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
This is the only rule in the CRUSHMap.
ID CLASS WEIGHT   TYPE NAME       STATUS REWEIGHT PRI-AFF
-1       19.50325 root default
-2        2.78618     host n01
 5   ssd  0.92999         osd.5       up  1.00000 1.00000
 7   ssd  0.92619         osd.7       up  1.00000 1.00000
14   ssd  0.92999         osd.14      up  1.00000 1.00000
-3        2.78618     host n02
 4   ssd  0.92999         osd.4       up  1.00000 1.00000
 8   ssd  0.92619         osd.8       up  1.00000 1.00000
15   ssd  0.92999         osd.15      up  1.00000 1.00000
-4        2.78618     host n03
 3   ssd  0.92999         osd.3       up  0.94577 1.00000
 9   ssd  0.92619         osd.9       up  0.82001 1.00000
16   ssd  0.92999         osd.16      up  0.84885 1.00000
-5        2.78618     host n04
 2   ssd  0.92999         osd.2       up  0.93501 1.00000
10   ssd  0.92619         osd.10      up  0.76031 1.00000
17   ssd  0.92999         osd.17      up  0.82883 1.00000
-6        2.78618     host n05
 6   ssd  0.92999         osd.6       up  0.84470 1.00000
11   ssd  0.92619         osd.11      up  0.80530 1.00000
18   ssd  0.92999         osd.18      up  0.86501 1.00000
-7        2.78618     host n06
 1   ssd  0.92999         osd.1       up  0.88353 1.00000
12   ssd  0.92619         osd.12      up  0.79602 1.00000
19   ssd  0.92999         osd.19      up  0.83171 1.00000
-8        2.78618     host n07
 0   ssd  0.92999         osd.0       up  1.00000 1.00000
13   ssd  0.92619         osd.13      up  0.86043 1.00000
20   ssd  0.92999         osd.20      up  0.77153 1.00000
Here you see osd.15 and osd.4 on the same host 'n02'.
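The same check can be done mechanically. This is a rough sketch, assuming the Luminous-style plain `ceph osd tree` layout shown above (ID, CLASS, WEIGHT, then the bucket or OSD name); the function name same_host_osds is made up for illustration, and the demo feeds it only the n01 and n02 rows from this cluster:

```shell
# Report hosts that hold more than one OSD of a given up/acting set.
# $1: comma-separated OSD ids, e.g. "15,7,4"; stdin: plain `ceph osd tree` output.
same_host_osds() {
    awk -v ids="$1" '
        BEGIN { n = split(ids, a, ","); for (i = 1; i <= n; i++) wanted[a[i]] = 1 }
        $3 == "host" { host = $4; next }       # bucket row: ID WEIGHT host NAME
        $4 ~ /^osd\./ && ($1 in wanted) {      # device row: ID CLASS WEIGHT osd.N ...
            count[host]++
            members[host] = members[host] " " $4
        }
        END { for (h in count) if (count[h] > 1) print h ":" members[h] }
    '
}

# Demo with two hosts from the tree above and the acting set of PG 1.41.
same_host_osds 15,7,4 <<'EOF'
-2 2.78618 host n01
5 ssd 0.92999 osd.5 up 1.00000 1.00000
7 ssd 0.92619 osd.7 up 1.00000 1.00000
14 ssd 0.92999 osd.14 up 1.00000 1.00000
-3 2.78618 host n02
4 ssd 0.92999 osd.4 up 1.00000 1.00000
8 ssd 0.92619 osd.8 up 1.00000 1.00000
15 ssd 0.92999 osd.15 up 1.00000 1.00000
EOF
# prints: n02: osd.4 osd.15
```

An empty result means every OSD in the set sits on a different host.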
This cluster was upgraded from Hammer to Jewel and now Luminous and
it doesn't have the latest tunables yet, but should that matter? I
never encountered this before.
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable chooseleaf_stable 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54
I don't want to touch this yet in case this is a bug or glitch
in the matrix somewhere.
I hope it's just an admin mistake, but so far I'm not able to find a
clue pointing to that.
root@man:~# ceph osd dump|head -n 12
epoch 21545
fsid 0b6fb388-6233-4eeb-a55c-476ed12bdf0a
created 2015-04-28 14:43:53.950159
modified 2018-02-22 17:56:42.497849
flags sortbitwise,recovery_deletes,purged_snapdirs
crush_version 22
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client luminous
min_compat_client luminous
require_osd_release luminous
root@man:~#
I also downloaded the CRUSHmap and ran crushtool with --test and
--show-mappings, but that didn't show any PG mapped to the same host.
Any ideas on what might be going on here?
Wido
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com