Re: crushmap history

Hi,

I'm not sure if or how this could help, but there is a get-crushmap command for ceph-monstore-tool:

[ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ show-versions -- --map-type crushmap > show-versions

[ceph: root@host1 /]# cat show-versions
first committed:        0
last  committed:        0

[ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ get-crushmap --version 0 > crushmap-version-0

[ceph: root@host1 /]# cat crushmap-version-0
ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
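
For comparison, a crushmap fetched the usual way decompiles into the familiar tunables/devices/buckets/rules sections. Just as a sketch, with placeholder file names and assuming crushtool is available inside the cephadm shell:

# ceph osd getcrushmap -o /tmp/crushmap.bin
# crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt
# less /tmp/crushmap.txt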

I don't have the option to shut down a MON in production right now to compare whether there are more committed versions or something. And obviously, the result is not what I would usually expect from a crushmap. I also injected a modified crushmap to provoke a new version:

# ceph osd setcrushmap -i 20240417-crushmap.new
363
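
Just to confirm the injected map itself is active, independent of the mon store, the live CRUSH map can also be inspected directly from the cluster; and if I remember correctly, the plain osdmap dump reports the current crush_version, which should match the number setcrushmap returns. A quick sketch (nothing here touches the monstore):

# ceph osd crush dump | head             # live CRUSH map as JSON
# ceph osd dump | grep crush_version     # should now report 363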

But the ceph-monstore-tool result doesn't really change, so I'm not sure how that can help:

[ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ get-crushmap --version 363
ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)

It seems that all the commands print the same output:

[ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ get-crushmap --version 5885
ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
[ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ get-osdmap --version 5885
ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
[ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ get-monmap --version 5885
ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)


Maybe one of the devs can shed some light on whether there's a way.
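
Until then, one way to look at CRUSH history that might still work (as long as the mons hold the relevant osdmap epochs) is to pull old osdmaps and extract the crushmap embedded in each of them. A rough sketch, where the epoch number and file names are placeholders:

# ceph osd getmap 12345 -o /tmp/osdmap.12345
# osdmaptool /tmp/osdmap.12345 --export-crush /tmp/crush.12345
# crushtool -d /tmp/crush.12345 -o /tmp/crush.12345.txt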

Regards,
Eugen

Quoting Blair Bethwaite <blair.bethwaite@xxxxxxxxx>:

Hi all,

Do the Mons store any crushmap history, and if so how does one get at it
please?

I ask because we've recently encountered an issue in a medium-scale (~5PB
raw) EC-based, RGW-focused cluster where "something" happened (we still
don't know what) that suddenly caused us to see 94% of objects (5.4
billion of them) misplaced. We've tracked down the first log message of
that pgmap state change:

Mar 29 10:30:31 mon1 bash[5804]: debug 2024-03-29T10:30:31.152+0000 7f3b6e378700  0 log_channel(cluster) log [DBG] : pgmap v44327: 2273 pgs: 225 active+clean, 2038 active+remapped+backfill_wait, 10 active+remapped+backfilling; 1.6 PiB data, 2.1 PiB used, 2.2 PiB / 4.3 PiB avail; 5426274136/5752755429 objects misplaced (94.325%); 248 MiB/s, 109 objects/s recovering

This appears to have been preceded (aside from a single HTTP HEAD request
coming into RGW) by a five-minute gap in the logs where either journald
couldn't keep up with debug messages or the Mons were stuck. The last log
before that gap seems to be a compaction event kicking off:

mon1 bash[25927]: Int       0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0
Mar 29 10:24:14 mon1 bash[25927]: ** Compaction Stats [L] **
Mar 29 10:24:14 mon1 bash[25927]: Priority    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
Mar 29 10:24:14 mon1 bash[25927]: -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Mar 29 10:24:14 mon1 bash[25927]: Low       0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0    116.0     11.4      0.02              0.01         7    0.003     490    462
Mar 29 10:24:14 mon1 bash[25927]: High      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      1.9      1.23              1.20        28    0.044       0      0
Mar 29 10:24:14 mon1 bash[25927]: User      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0     16.4      0.00              0.00         1    0.001       0      0

We're left wondering what the heck happened to cause such a huge
redistribution of data in the cluster when we made no corresponding
changes, so we want to see if there are any breadcrumbs we can find.

Appreciate any pointers!

--
Cheers,
~Blairo
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

