Thanks Eugen,

That indeed looks like it should be relevant. Will take a look at what it
gives us on our cluster/s.
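For anyone landing on this thread later, another way to dig at crush map
history (a sketch only, untested on our clusters so far; the epoch number
below is a placeholder, and the mons only retain a limited window of OSD
map epochs, so it may not reach back far enough) is to pull older OSD maps
and extract the crush map from each:

  ceph osd stat                          # note the current OSD map epoch
  ceph osd getmap 5000 -o osdmap.5000    # fetch an older epoch, if the mons still have it
  osdmaptool osdmap.5000 --export-crush crush.5000
  crushtool -d crush.5000 -o crush.5000.txt
  ceph osd getcrushmap -o crush.now
  crushtool -d crush.now -o crush.now.txt
  diff crush.5000.txt crush.now.txt

If the epochs around 29 March have already been trimmed this won't help,
but it should at least show whether the crush map differs from what we
expect today.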
Cheers,
Blair

On Wed, 17 Apr 2024, 18:29 Eugen Block, <eblock@xxxxxx> wrote:

> Hi,
>
> I'm not sure if and how that could help, but there's a get-crushmap
> command for the ceph-monstore-tool:
>
> [ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ show-versions -- --map-type crushmap > show-versions
>
> [ceph: root@host1 /]# cat show-versions
> first committed: 0
> last committed: 0
>
> [ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ get-crushmap --version 0 > crushmap-version-0
>
> [ceph: root@host1 /]# cat crushmap-version-0
> ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
>
> I don't have the option to shut down a MON in production right now to
> compare if there are more committed versions or something. And obviously,
> the result is not what I would usually expect from a crushmap. I also
> injected a modified crushmap to provoke a new version:
>
> # ceph osd setcrushmap -i 20240417-crushmap.new
> 363
>
> But the result doesn't really change, so I'm not sure how that can help:
>
> [ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ get-crushmap --version 363
> ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
>
> It seems that all the commands print the same output:
>
> [ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ get-crushmap --version 5885
> ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
> [ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ get-osdmap --version 5885
> ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
> [ceph: root@host1 /]# ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ get-monmap --version 5885
> ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
>
> Maybe one of the devs can shed some light on whether there's a way.
>
> Regards,
> Eugen
>
> Zitat von Blair Bethwaite <blair.bethwaite@xxxxxxxxx>:
>
> > Hi all,
> >
> > Do the Mons store any crushmap history, and if so, how does one get at
> > it, please?
> >
> > I ask because we've recently encountered an issue in a medium-scale
> > (~5PB raw), EC-based, RGW-focused cluster where "something" happened
> > (we still don't know what) that suddenly caused us to see 94% of
> > objects (5.4 billion of them) misplaced. We've tracked down the first
> > log message of that pgmap state change:
> >
> > Mar 29 10:30:31 mon1 bash[5804]: debug 2024-03-29T10:30:31.152+0000 7f3b6e378700 0 log_channel(cluster) log [DBG] : pgmap v44327: 2273 pgs: 225 active+clean, 2038 active+remapped+backfill_wait, 10 active+remapped+backfilling; 1.6 PiB data, 2.1 PiB used, 2.2 PiB / 4.3 PiB avail; 5426274136/5752755429 objects misplaced (94.325%); 248 MiB/s, 109 objects/s recovering
> >
> > This appears to have been preceded (aside from a single HTTP HEAD
> > request coming into RGW) by a 5-minute gap in logs, where either
> > journald couldn't keep up with debug messages or the Mons were stuck.
> > The last log before that gap seems to be a compaction event kicking
> > off:
> >
> > mon1 bash[25927]: Int    0/0   0.00 KB   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.00   0.00   0   0.000   0   0
> > Mar 29 10:24:14 mon1 bash[25927]: ** Compaction Stats [L] **
> > Mar 29 10:24:14 mon1 bash[25927]: Priority   Files   Size   Score   Read(GB)   Rn(GB)   Rnp1(GB)   Write(GB)   Wnew(GB)   Moved(GB)   W-Amp   Rd(MB/s)   Wr(MB/s)   Comp(sec)   CompMergeCPU(sec)   Comp(cnt)   Avg(sec)   KeyIn   KeyDrop
> > Mar 29 10:24:14 mon1 bash[25927]: ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > Mar 29 10:24:14 mon1 bash[25927]: Low    0/0   0.00 KB   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   116.0   11.4   0.02   0.01   7    0.003   490   462
> > Mar 29 10:24:14 mon1 bash[25927]: High   0/0   0.00 KB   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0     1.9    1.23   1.20   28   0.044   0     0
> > Mar 29 10:24:14 mon1 bash[25927]: User   0/0   0.00 KB   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0     16.4   0.00   0.00   1    0.001   0     0
> >
> > We're left wondering what the heck has happened to cause such a huge
> > redistribution of data in the cluster when we've not made any
> > corresponding changes, so we want to see whether there are any
> > breadcrumbs we can find.
> >
> > Appreciate any pointers!
> >
> > --
> > Cheers,
> > ~Blairo
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx