Too many PGs during filestore=>bluestore migration

Hi,

I'm migrating my OSDs to bluestore (Luminous 12.2.10) by recreating the
OSDs. Everything looked good (just a few OSDs left), but yesterday a
weird thing happened after I set a wrong weight on a newly migrated
OSD: it should have been 2, but I put 6 (hardcoded in my Salt state,
oops). I got slow requests, and as soon as I realized the mistake I
stopped this OSD.

I found entries like this in my logs:

osd.125 492578 maybe_wait_for_max_pg withhold creation of pg 5.95e: 750 >= 750
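
If I understand correctly, this message comes from the per-OSD PG hard
limit (mon_max_pg_per_osd times osd_max_pg_per_osd_hard_ratio, I
believe), which can be checked on the OSD itself with something like:

# ceph daemon osd.125 config show | egrep 'mon_max_pg_per_osd|osd_max_pg_per_osd_hard_ratio'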

So I destroyed this OSD again (once I realized the weight was 6) and
recreated it, this time with weight 2. Things were OK again...
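
(In hindsight I suppose I could have just corrected the CRUSH weight in
place instead of destroying it, with something like the command below,
using osd.125 from the log above as the example, but recreating seemed
cleaner in the middle of the migration.)

# ceph osd crush reweight osd.125 2.0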

Then I continued migrating the OSDs, but after a few more on this same
node I got slow requests again (all the weights are 2 now). This time
the slow requests stopped only when I marked out all the recently
migrated OSDs on this node.

Looking into why, I found the same kind of log entries, and then
something I don't understand:

# ceph osd pool get <pool> pg_num
pg_num: 4096

*but* counting the PGs (using ceph osd df or the mgr dashboard) I get 8964!
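
To be clear about how I'm counting: it's roughly the commands below,
summing the PGS column of ceph osd df (per-OSD PG instances, so
replicas included, and across all pools on those OSDs) and
cross-checking against the PGs the pool actually has.

# ceph osd df | awk '$1 ~ /^[0-9]+$/ {sum += $NF} END {print sum}'
# ceph pg ls-by-pool <pool> | wc -l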

Why? Obviously this should not have happened.

These OSDs are part of a group of 3 nodes that serve only a small
specialized pool (not part of my main/large cluster). As I'm using
only 4 TB, I can easily migrate the data to another pool and then
completely destroy (and recreate) this buggy pool.
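
If it comes to that, my rough plan would be something like the
following (assuming rados cppool is good enough for a pool this small
and I'm not missing any caveats):

# ceph osd pool create <pool>.new <pg_num>
# rados cppool <pool> <pool>.new
# ceph osd pool delete <pool> <pool> --yes-i-really-really-mean-it
# ceph osd pool rename <pool>.new <pool>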

Anyway, I just want to understand how/why this happened. Any clues?

--
Herbert