Thank you very much,
manually changing all the config files got us back into production.
Recovery should finish within the next few hours.
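For reference, the manual fix was essentially doing by hand what the
reconfig should have done (a rough sketch; osd.28 is just one example
daemon, the other OSD/MON/MGR directories on each node were treated the
same way):

# edit the local config of each daemon, e.g.:
vi /var/lib/ceph/10489760-1723-11ec-8050-cb54d51756be/osd.28/config
#   -> point mon_host at the new MON addresses, e.g.
#      mon_host = [v2:129.217.31.186:3300/0,v1:129.217.31.186:6789/0] ... (and the other three new MONs)
# then restart the daemons on that node:
systemctl restart ceph.target

Current cluster status: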
  cluster:
    id:     10489760-1723-11ec-8050-cb54d51756be
    health: HEALTH_WARN
            545 pgs not deep-scrubbed in time
            545 pgs not scrubbed in time

  services:
    mon: 4 daemons, quorum ml2rsn06,ml2rsn03,ml2rsn05,ml2rsn07 (age 9m)
    mgr: ml2rsn05.rivwqx(active, since 17m), standbys: ml2rsn03.ufxzjh
    mds: 1/1 daemons up, 2 standby
    osd: 36 osds: 36 up (since 8m), 36 in (since 8m); 101 remapped pgs

  data:
    volumes: 1/1 healthy
    pools:   3 pools, 545 pgs
    objects: 58.64M objects, 73 TiB
    usage:   221 TiB used, 282 TiB / 503 TiB avail
    pgs:     11654543/175922865 objects misplaced (6.625%)
             444 active+clean
             101 active+remapped+backfilling

  io:
    recovery: 2.3 GiB/s, 7.17k keys/s, 1.91k objects/s
Cheers
Dominik
On 19.12.2022 at 19:49, Eugen Block wrote:
Maybe in this case you actually should update the local config
(/var/lib/ceph/<FSID>/<SERVICE>/config) to reflect the new MONs. It
seems that adding the MONs with their new address didn't update the
local ceph.conf for some reason, but try it manually and restart the
OSD services. I haven't had to go through these steps in a while (with
cephadm only once, in a test cluster), so I'm not sure what we could be
missing here.
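For a quick cross-check of which addresses the MONs currently advertise
(and therefore what a regenerated minimal ceph.conf should contain),
something like this from within the cephadm shell should be enough:

ceph mon dump                        # lists each MON with its v2/v1 addresses
ceph config get mon public_network   # the MON addresses should fall into this network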
Quoting Dominik Baack <dominik.baack@xxxxxxxxxxxxxxxxxx>:
Hi,
The OSDs (9 SSDs) and the MONs are currently on the same nodes, with two
dedicated 200 GbE cards and a third out-of-band 10 GbE connection.
I changed the public network to
public_network 129.217.31.176/28
in between as a cross-check, but have now done it again to capture the
steps and the output; sadly, there are no visible changes.
I ran the operations in the following order:
./cephadm shell
ceph config dump
ceph config set global public_network 129.217.31.176/28
ceph config set global cluster_network 129.217.31.184/29
ceph config dump # check output
<-
restart ceph.target on all nodes
ceph orch daemon reconfig osd.28
Check osd.28 container's ceph.conf:
[root@ml2rsn05 /]# cat /etc/ceph/ceph.conf
# minimal ceph.conf for ...
[global]
fsid = ...
mon_host = [v2:129.217.31.171:3300/0,v1:129.217.31.171:6789/0] [v2:129.217.31.172:3300/0,v1:129.217.31.172:6789/0] [v2:129.217.31.175:3300/0,v1:129.217.31.175:6789/0]
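(These are still the old MON addresses. If the reconfig had picked up the
change, I would expect the regenerated minimal ceph.conf to list the new
MONs instead, roughly:

mon_host = [v2:129.217.31.186:3300/0,v1:129.217.31.186:6789/0] [v2:129.217.31.188:3300/0,v1:129.217.31.188:6789/0] [v2:129.217.31.189:3300/0,v1:129.217.31.189:6789/0] [v2:129.217.31.190:3300/0,v1:129.217.31.190:6789/0]

i.e. the addresses the MONs were moved to.)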
Cheers
Dominik
On 19.12.2022 at 18:55, Eugen Block wrote:
I just looked at the previous mails again; is it possible that you
mixed up the public and the cluster network? MONs only require access to
the public network, except when they are colocated with OSDs, of course.
You stated this earlier:
I could move the MONs over to the new address range and they
connect into the cluster network. The OSDs create more of a problem,
even after setting
public_network 129.217.31.176/29
cluster_network 129.217.31.184/29
and this:
mon_host = [v2:129.217.31.186:3300/0,v1:129.217.31.186:6789/0] [v2:129.217.31.188:3300/0,v1:129.217.31.188:6789/0] [v2:129.217.31.189:3300/0,v1:129.217.31.189:6789/0] [v2:129.217.31.190:3300/0,v1:129.217.31.190:6789/0]
The MONs' addresses are from the cluster network but should be from
the public network. If 129.217.31.184/29 is supposed to be the
public network, you should modify both networks and restart the OSD
services.
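Concretely, that swap could look something like this (just a sketch using
the same commands you already ran; afterwards restart the OSD services on
each node):

ceph config set global public_network 129.217.31.184/29
ceph config set global cluster_network 129.217.31.176/29
ceph orch daemon reconfig osd.<ID>   # or restart ceph.target per node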
Quoting Dominik Baack <dominik.baack@xxxxxxxxxxxxxxxxxx>:
Hi,
Reconfiguration
ceph orch daemon reconfig osd.28
Scheduled to reconfig osd.28 on host 'ml2rsn05'
cephadm ['--image',
'quay.io/ceph/ceph@sha256:12a0a4f43413fd97a14a3d47a3451b2d2df50020835bb93db666209f3f77617a',
'deploy', '--fsid', '10489760-1723-11ec-8050-cb54d51756be',
'--name', 'osd.28', '--meta-json', '{"service_name": "osd",
"ports": [], "ip": null, "deployed_by":
["quay.io/ceph/ceph@sha256:12a0a4f43413fd97a14a3d47a3451b2d2df50020835bb93db666209f3f77617a"],
"rank": null, "rank_generation": null, "extra_container_args":
null}', '--config-json', '-', '--osd-fsid',
'0c15ce43-ed2d-4348-88b2-785c25159894', '--reconfig']
2022-12-19 16:48:23,502 7f1cb6f4c740 DEBUG Acquiring lock
139761289301680 on
/run/cephadm/10489760-1723-11ec-8050-cb54d51756be.lock
2022-12-19 16:48:23,502 7f1cb6f4c740 DEBUG Lock 139761289301680
acquired on /run/cephadm/10489760-1723-11ec-8050-cb54d51756be.lock
2022-12-19 16:48:23,513 7f1cb6f4c740 DEBUG systemctl: enabled
2022-12-19 16:48:23,523 7f1cb6f4c740 DEBUG systemctl: active
2022-12-19 16:48:23,524 7f1cb6f4c740 INFO Reconfig daemon osd.28 ...
2022-12-19 16:48:23,714 7f1cb6f4c740 DEBUG stat: 167 167
2022-12-19 16:48:23,777 7f1cb6f4c740 DEBUG firewalld does not
appear to be present
2022-12-19 16:48:23,777 7f1cb6f4c740 DEBUG Not possible to enable
service <osd>. firewalld.service is not available
2022-12-19 16:49:24,592 7f9a07c9f740 DEBUG
--------------------------------------------------------------------------------
The reconfig seems to be applied, but it has no effect on the ceph.conf
file present in the OSD's container.
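(Presumably the host-side file that cephadm bind-mounts into the container
is where the reconfig should land; if I am not mistaken it lives at
something like the path below, so a quick grep should show whether
anything was rewritten at all:

grep mon_host /var/lib/ceph/10489760-1723-11ec-8050-cb54d51756be/osd.28/config
)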
Cheers
Dominik
On 19.12.2022 at 17:23, Eugen Block wrote:
ceph orch daemon reconfig osd.<ID>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx