Basically, these are the steps to remove all OSDs from that host (the OSDs
are not "replaced", so they are not marked "destroyed") [1]; a scripted
sketch of the loop follows the list:
1) ceph osd out $id
2) systemctl stop ceph-osd@$id
3) ceph osd purge $id --yes-i-really-mean-it
4) ceph-volume lvm zap --osd-id $id --destroy
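This is not how DeepSea runs it internally, just a minimal shell sketch of
that loop; it assumes it is run on the OSD host itself with an admin keyring
present, and that the CRUSH host bucket matches the short hostname:

  # collect the OSD IDs that live under this host's CRUSH bucket
  HOST=$(hostname -s)
  for id in $(ceph osd ls-tree "$HOST"); do
      ceph osd out "$id"                              # 1) mark the OSD out
      systemctl stop ceph-osd@"$id"                   # 2) stop the daemon
      ceph osd purge "$id" --yes-i-really-mean-it     # 3) remove it from CRUSH and the osdmap
      ceph-volume lvm zap --osd-id "$id" --destroy    # 4) wipe the backing LVs
  done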
After all disks have been wiped, a Salt runner deploys all available OSDs
on that host again [2]. All OSDs are created with their normal weight. All
of the OSD restarts I did were on different hosts, not on the rebuilt host.
The only difference I can think of that might have an impact is that this
cluster consists of two datacenters, while the others were not divided into
several buckets. Could that be an issue?
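To judge whether the two-datacenter layout plays a role, it might help to
look at the CRUSH hierarchy and at the rules used by the affected pools; for
example (the pool name is just a placeholder):

  ceph osd tree                              # shows the datacenter/host buckets and weights
  ceph osd crush rule dump                   # shows the failure domain each rule uses
  ceph osd pool get <poolname> crush_rule    # which rule a given pool uses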
[1] https://github.com/SUSE/DeepSea/blob/master/srv/modules/runners/osd.py#L179
[2] https://github.com/SUSE/DeepSea/blob/master/srv/salt/_modules/dg.py#L1396
Quoting Josh Baergen <jbaergen@xxxxxxxxxxxxxxxx>:
On Wed, Apr 6, 2022 at 11:20 AM Eugen Block <eblock@xxxxxx> wrote:
I'm pretty sure that their cluster isn't anywhere near the limit for
mon_max_pg_per_osd; they currently have around 100 PGs per OSD, and the
configs have not been touched, so it's a pretty basic setup.
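Just for completeness, those figures can be double-checked with standard
commands (nothing DeepSea-specific assumed here):

  ceph config get mon mon_max_pg_per_osd    # configured per-OSD PG limit
  ceph osd df                               # PGS column shows the actual PGs per OSD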
How is the host being "rebuilt"? Depending on the CRUSH rule, if the
host's OSDs are all marked destroyed and then re-created one at a time
with normal weight, CRUSH may decide to put a large number of PGs on
the first OSD that is created, and so on, until the rest of the host's
OSDs are available to take those PGs.
Josh
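One way to see whether that is what happens here would be to watch the PG
distribution while the rebuilt host's OSDs come back one by one, e.g. (the
OSD id is a placeholder):

  ceph osd df tree             # PGS column per OSD, grouped by CRUSH bucket
  ceph pg ls-by-osd osd.<id>   # PGs currently mapped to a specific OSD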
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx