Re: After dockerized ceph cluster to Pacific, the fsid changed in the output of 'ceph -s'

Eugen Block <eblock@xxxxxx> · Thu, 02 May 2024 12:42:50 +0000

Hi,

did you maybe have leftovers from test clusters on the hosts, so that
cephadm might have picked up the wrong FSID?
Does that mean you adopted all daemons and only afterwards looked at
ceph -s? I would have adopted the first daemon and checked
immediately whether everything was still as expected.
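
A minimal sketch of such a check after adopting the first daemon, assuming the config is in /etc/ceph/ceph.conf and the ceph CLI can reach the cluster. The helper names conf_fsid and fsid_matches are made up for illustration; `ceph fsid` prints the FSID the cluster reports.

```shell
# conf_fsid (hypothetical helper): print the fsid value from a ceph.conf-style file.
conf_fsid() {
    awk -F= '/^[[:space:]]*fsid[[:space:]]*=/ { gsub(/[[:space:]]/, "", $2); print $2; exit }' "$1"
}

# fsid_matches (hypothetical helper): succeed only if both FSIDs are non-empty and identical.
fsid_matches() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# On a host with a reachable cluster (not run here):
#   if fsid_matches "$(conf_fsid /etc/ceph/ceph.conf)" "$(ceph fsid)"; then
#       echo "FSID ok"
#   else
#       echo "FSID mismatch"
#   fi
```

To look for leftovers, `cephadm ls` lists the daemons cephadm finds on a host together with the FSID each one belongs to.
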
What you can't do is inject a different FSID directly into the cluster:

quincy-1:~ # ceph config set mon fsid <FSID>
Error EINVAL: fsid is special and cannot be stored by the mon

I assume you would have to update the monmap, but I've never tried  
that. Something like:

- stop one mon
- cephadm shell --name mon.<MON>
- ceph-monstore-tool /var/lib/ceph/mon/ceph-host1/ get monmap -- --out monmap.bin (as backup)
- monmaptool --create --fsid <DIFFERENT_FSID> --addv host1 [v2:IP:3300/0,v1:IP:6789/0] --addv host2 [v2:IP:3300/0,v1:IP:6789/0] --addv host3 [v2:IP:3300/0,v1:IP:6789/0] --set-min-mon-release 17 monmap.new
- ceph-mon -i host1 --inject-monmap monmap.new
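
Before injecting, it may be worth confirming which FSID each map actually carries. A small sketch, assuming `monmaptool --print <file>` emits a line of the form "fsid <uuid>" (treat the exact output format as an assumption); monmap_fsid is a made-up helper name.

```shell
# monmap_fsid (hypothetical helper): read `monmaptool --print` output on stdin
# and print the value of the first "fsid <uuid>" line, if there is one.
monmap_fsid() {
    awk '$1 == "fsid" { print $2; exit }'
}

# Inside the mon's cephadm shell, with the files from the steps above (not run here):
#   monmaptool --print monmap.bin | monmap_fsid    # FSID stored in the backup
#   monmaptool --print monmap.new | monmap_fsid    # should print <DIFFERENT_FSID>
```

If the second command does not print the FSID you intended, the new map should not be injected.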

This is just an example from a quincy test cluster. I have no idea
whether it will work while the other two MONs still have the other
FSID. If it does, I assume the orchestrator would reconfigure the rest
automatically, but again, I don't know. If you decide to try this
approach, let us know how it went.

Regards,
Eugen

Quoting wjsherry075@xxxxxxxxxxx:

Hello,
I ran into a problem after finishing the 'cephadm adopt' of the mon
and mgr from services to docker containers. The fsid shown by
`ceph -s` is not the same as the one in /etc/ceph/ceph.conf. The
ceph.conf is correct, but `ceph -s` is incorrect. I followed
https://docs.ceph.com/en/quincy/cephadm/adoption/
```
2024-04-25T19:49:17.460652+0000 mgr.cloud-lab-test-mon01 (mgr.65113) 109 : cephadm [ERR] cephadm exited with an error code: 1, stderr:ERROR: fsid does not match ceph.conf
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1538, in _remote_connection
    yield (conn, connr)
  File "/usr/share/ceph/mgr/cephadm/serve.py", line 1426, in _run_cephadm
    code, '\n'.join(err)))
orchestrator._interface.OrchestratorError: cephadm exited with an error code: 1, stderr:ERROR: fsid does not match ceph.conf
# ceph health detail shows the warning:
[WRN] CEPHADM_REFRESH_FAILED: failed to probe daemons or devices
```

Does anyone else have any ideas?

Thanks,
Sherry
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
