On 1/30/23 15:15, Ana Aviles wrote:
> Hi,
>
> Josh already suggested it, but I will suggest it one more time: we had
> similar behaviour upgrading from Nautilus to Pacific. In our case,
> compacting the OSDs did the trick.
Thanks for chiming in! Unfortunately, in my case neither an online
compaction (ceph tell osd.ID compact) nor an offline repair
(ceph-bluestore-tool repair --path /var/lib/ceph/osd/OSD_ID) helps.
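For reference, these are the kinds of invocations I mean (just a sketch;
the OSD id and path are placeholders, and the offline variant assumes the
OSD is stopped first):

  # online, via the cluster (or "ceph daemon osd.0 compact" on the OSD host)
  ceph tell osd.0 compact

  # offline: stop the OSD, compact the bluestore RocksDB directly, restart
  systemctl stop ceph-osd@0
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 compact
  systemctl start ceph-osd@0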
Compaction does seem to reclaim some space. The OSD log dumps information
about the size of RocksDB; it went from this:
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0    0.00 KB   0.0      0.0     0.0      0.0       3.9      3.9       0.0   1.0      0.0     62.2     64.81             61.59        89    0.728       0      0
  L1      3/0  132.84 MB   0.5      7.0     3.9      3.1       5.0      2.0       0.0   1.3     63.8     46.1    112.11            108.52        23    4.874     56M  7276K
  L2     12/0  690.99 MB   0.8      6.5     1.8      4.7       5.6      0.9       0.1   3.2     21.4     18.5    310.78            307.14        28   11.099    165M  3077K
  L3     54/0    3.37 GB   0.1      0.9     0.3      0.6       0.5     -0.1       0.0   1.6     35.9     20.2     24.84             24.49         4    6.210     24M    15M
 Sum     69/0    4.17 GB   0.0     14.4     6.0      8.3      15.1      6.7       0.1   3.8     28.7     30.1    512.54            501.74       144    3.559    246M    26M
 Int      0/0    0.00 KB   0.0      0.8     0.3      0.5       0.6      0.1       0.0  14.1     27.5     20.7     31.13             30.73         4    7.783     18M  4086K
To this:
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      2/0   72.42 MB   0.5      0.0     0.0      0.0       0.1      0.1       0.0   1.0      0.0     63.2      1.14              0.84         2    0.572       0      0
  L3     48/0    3.10 GB   0.1      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0
 Sum     50/0    3.17 GB   0.0      0.0     0.0      0.0       0.1      0.1       0.0   1.0      0.0     63.2      1.14              0.84         2    0.572       0      0
 Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0
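As a side note, roughly the same size information can be read from the
bluefs perf counters, which is easier to poll than grepping the log (a
minimal sketch, assuming the admin socket is reachable on the OSD host and
jq is installed):

  # db_used_bytes / db_total_bytes reflect the RocksDB footprint on the DB device
  ceph daemon osd.0 perf dump | jq .bluefs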
Still, it feels "too big" compared to OSDs in other similarly sized
clusters, which makes me think there is some kind of "garbage" that is
making the trimming go crazy.
> For us there was no performance impact running the compaction (ceph
> daemon osd.0 compact), although we ran them in batches and not all at
> once on all OSDs, just in case. Also, there is no need to restart OSDs
> for this operation.
Yes, compacting had no perceived impact on client performance, just some
higher CPU usage for the OSD process.
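In case it is useful, batching the compactions is easy to script (a rough
sketch; the OSD id list and the pause are placeholders, adjust to taste):

  # compact a few OSDs at a time, pausing between them
  for id in 0 1 2; do
      ceph tell osd.${id} compact
      sleep 60
  done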
Does anyone know, by any chance, the meaning of "num_pgmeta_omap" in the
ceph daemon osd.ID calc_objectstore_db_histogram output? As I mentioned,
the OSDs in this cluster have very different values in that field, while
all our other clusters have much more uniform values:
osd.0: "num_pgmeta_omap": 17526766,
osd.1: "num_pgmeta_omap": 2653379,
osd.2: "num_pgmeta_omap": 12358703,
osd.3: "num_pgmeta_omap": 6404975,
osd.6: "num_pgmeta_omap": 19845318,
osd.7: "num_pgmeta_omap": 6043083,
osd.12: "num_pgmeta_omap": 18666776,
osd.13: "num_pgmeta_omap": 615846,
osd.14: "num_pgmeta_omap": 13190188,
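For anyone who wants to compare on their own cluster, something like this
collects the same field (a sketch; it assumes the loop runs on the host
that owns those OSDs):

  for id in 0 1 2 3 6 7 12 13 14; do
      echo -n "osd.${id}: "
      ceph daemon osd.${id} calc_objectstore_db_histogram | grep num_pgmeta_omap
  done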
Thanks a lot!