Hi,
I got into a weird and unexpected situation today. I added 6 hosts to
an existing Pacific cluster (16.2.13, 20 existing OSD hosts across 2
DCs). The hosts were added to the root=default subtree; their
designated location is one of the two datacenters underneath the
default root. Nothing unusual, I believe many people use different
subtrees to organize their cluster, as we do in our own (and we
haven't seen this issue there yet).
The main application is RGW; the main pool is erasure-coded (k=7,
m=11). The crush rule looks like this:
rule rule-ec-k7m11 {
    id 1
    type erasure
    min_size 3
    max_size 18
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default class hdd
    step choose indep 2 type datacenter
    step chooseleaf indep 9 type host
    step emit
}
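
(In case someone wants to check a rule like this offline: crushtool
can simulate the mappings against the compiled crush map. The file
name below is only an example.)

    ceph osd getcrushmap -o crushmap.bin
    # simulate rule id 1 with 18 chunks (k=7 + m=11)
    crushtool -i crushmap.bin --test --rule 1 --num-rep 18 --show-mappings
    # print only mappings that don't satisfy the rule (ideally no output)
    crushtool -i crushmap.bin --test --rule 1 --num-rep 18 --show-bad-mappings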
After almost all peering had finished, the status showed 6 inactive +
peering PGs for a while. I had to fail the mgr because it didn't
report correct stats anymore; after that it showed 16 unknown PGs. The
application noticed the (unexpected) disruption. After moving the
hosts into their designated crush bucket (datacenter) the situation
resolved, but I can't make any sense of it. I tried to reproduce it in
my lab environment (Quincy), but to no avail. In my tests it behaves
as expected: after the new OSDs become active there are remapped PGs,
but nothing happens until I add them to their designated location.
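
For reference, these are the generic commands to inspect such PGs and
to fail over the active mgr (nothing cluster-specific here):

    # list unhealthy PGs and their states
    ceph health detail
    ceph pg dump_stuck inactive
    ceph pg dump_stuck unclean
    # fail over the active mgr
    ceph mgr fail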
I know I could have prevented this by either setting
osd_crush_initial_weight = 0, then moving the new hosts into their
crush buckets and reweighting the OSDs, or by adding the crush buckets
first, but usually I don't need to bother with these things.
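
Just to spell out those two options (host/datacenter names and the
weight below are made up):

    # option 1: let new OSDs come up with crush weight 0 ...
    ceph config set osd osd_crush_initial_weight 0
    # ... then move the new host into its datacenter and reweight the OSDs
    ceph osd crush move host07 datacenter=dc1
    ceph osd crush reweight osd.200 16.4

    # option 2: create the host bucket in the right place before deploying OSDs
    ceph osd crush add-bucket host07 host
    ceph osd crush move host07 datacenter=dc1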
Does anyone have an explanation? I'd appreciate any comments.
Thanks!
Eugen