Hi,
> If I look at the container name in Docker, the dots have been changed to
> hyphens, but if I try to connect using the hyphenated name it doesn't
> work either:
That is correct; the switch from dots to hyphens was introduced in
Pacific [1]. Can you share the content of the unit.run file for that
container? Can you enter other containers whose names were changed?
Maybe the conversion doesn't work as expected?
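The unit.run file should be in the daemon's data directory; assuming the
default cephadm layout, something like:

cat /var/lib/ceph/d1fd0678-88c0-47fb-90da-e40a7a364442/mds.cephfs.ceph-mds2.cjpsjm/unit.run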
[1] https://github.com/ceph/ceph/pull/42242
Quoting Luis Calero Muñoz <luis.calero@xxxxxxxxxxxxxx>:
Hello Eugen, thanks for your answer. I was able to connect the way you
showed me until I updated my cluster to ceph version 16.2.10
(Pacific), but now it doesn't work anymore:
root@ceph-mds2:~# cephadm ls |grep ceph-mds | grep name
"name": "mds.cephfs.ceph-mds2.cjpsjm",
root@ceph-mds2:~# cephadm enter --name mds.cephfs.ceph-mds2.cjpsjm
Inferring fsid d1fd0678-88c0-47fb-90da-e40a7a364442
Error: No such container:
ceph-d1fd0678-88c0-47fb-90da-e40a7a364442-mds.cephfs.ceph-mds2.cjpsjm
If I look at the container name in Docker, the dots have been changed to
hyphens, but if I try to connect using the hyphenated name it doesn't
work either:
root@ceph-mds2:~# docker ps
CONTAINER ID   IMAGE               COMMAND                  CREATED             STATUS             PORTS   NAMES
38635f6de533   quay.io/ceph/ceph   "/usr/bin/ceph-mds -…"   About an hour ago   Up About an hour           ceph-d1fd0678-88c0-47fb-90da-e40a7a364442-mds-cephfs-ceph-mds2-cjpsjm
root@ceph-mds2:~# cephadm enter --name mds-cephfs-ceph-mds2-cjpsjm
ERROR: must pass --fsid to specify cluster
root@ceph-mds2:~# cephadm enter --name mds-cephfs-ceph-mds2-cjpsjm --fsid d1fd0678-88c0-47fb-90da-e40a7a364442
Traceback (most recent call last):
  File "/usr/sbin/cephadm", line 6158, in <module>
    r = args.func()
  File "/usr/sbin/cephadm", line 1309, in _infer_fsid
    return func()
  File "/usr/sbin/cephadm", line 3580, in command_enter
    (daemon_type, daemon_id) = args.name.split('.', 1)
ValueError: not enough values to unpack (expected 2, got 1)
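If I read that last frame right, cephadm expects the --name value in the
form <daemon_type>.<daemon_id> and splits it on the first dot, so the
all-hyphen container name has nothing to split on. A quick shell check of
my reading (just illustrating the split; this is not cephadm code):

name=mds-cephfs-ceph-mds2-cjpsjm
[[ "$name" == *.* ]] && echo "splits into ${name%%.*} / ${name#*.}" || echo "no dot, nothing to split"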
What can I do?
--
Luis
Luis Calero Muñoz
Head of Infrastructure
luis.calero@xxxxxxxxxxxxxx
T. +34 91 787 0000
C/ Apolonio Morales 13C - 28036 Madrid
letsrebold.com
On Thu, 3 Nov 2022 at 11:48, Eugen Block (<eblock@xxxxxx>) wrote:
Hi,
you can use cephadm for that now [1]. To attach to a running daemon,
run the following ('cephadm ls' shows all cephadm daemons):
cephadm enter --name <DAEMON> [--fsid <FSID>]
There you can query the daemon as you used to:
storage01:~ # cephadm ls |grep mds
"name": "mds.cephfs.storage01.ozpeev",
storage01:~ # cephadm enter --name mds.cephfs.storage01.ozpeev
Inferring fsid 877636d0-d118-11ec-83c7-fa163e990a3e
[ceph: root@storage01 /]# ceph daemon mds.cephfs.storage01.ozpeev ops
{
"ops": [],
"num_ops": 0
}
You can still restart the daemons with systemctl:
storage01:~ # systemctl restart
ceph-877636d0-d118-11ec-83c7-fa163e990a3e@mds.cephfs.storage01.ozpeev.service
Regards,
Eugen
[1] https://docs.ceph.com/en/latest/man/8/cephadm/?highlight=cephadm%20enter
Quoting Luis Calero Muñoz <luis.calero@xxxxxxxxxxxxxx>:
> Hello, I'm running a ceph 15.2.15 Octopus cluster, and in preparation to
> update it I've first transformed it to cephadm following the instructions
> in the website. All went well but now i'm having a problem running "ceph
> daemon mds.* dump_ops_in_flight" because it gives me an error:
>
> root@ceph-mds2:~# ceph -s |grep mds
>     mds: cephfs:2 {0=cephfs.ceph-mds1.edwbhe=up:active,1=cephfs.ceph-mds2.cjpsjm=up:active} 2 up:standby
>
> root@ceph-mds2:~# ceph daemon mds.cephfs.ceph-mds2.cjpsjm
> dump_ops_in_flight
>
> admin_socket: exception getting command descriptions: [Errno 2] No such
> file or directory
>
> One thing I've noticed is that the names of the MDS daemons have changed:
> before cephadm I would refer to them as mds.ceph-mds2, and now they're
> called something like mds.cephfs.ceph-mds2.cjpsjm, where the last part is
> a random string that changes when the daemon is restarted. Running strace
> on the ceph daemon command, I found out that the problem is that the
> command is looking for a socket in a location that doesn't exist:
>
> root@ceph-mds2:~# strace ceph daemon mds.cephfs.ceph-mds2.cjpsjm
> dump_ops_in_flight
> [...]
> connect(3, {sa_family=AF_UNIX,
> sun_path="/var/run/ceph/ceph-mds.cephfs.ceph-mds2.cjpsjm.asok"}, 53) = -1
> ENOENT (No such file or directory)
> write(2, "admin_socket: exception getting "..., 90admin_socket: exception
> getting command descriptions: [Errno 2] No such file or directory
>
>
> Because the socket is actually in a folder inside /var/run/ceph:
>
> root@ceph-mds2:~# ls /var/run/ceph/
>
> d1fd0678-88c0-47fb-90da-e40a7a364442/
>
>
> root@ceph-mds2:~# ls /var/run/ceph/d1fd0678-88c0-47fb-90da-e40a7a364442/
>
> ceph-mds.cephfs.ceph-mds2.cjpsjm.asok
>
> So if I link the socket to
> /var/run/ceph/ceph-mds.cephfs.ceph-mds2.cjpsjm.asok, the command runs
> without problems. That would be a fix, but I would need to recreate the
> link every time the daemon restarts, so I think something is not right
> here and it should work out of the box. What can I do?
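>
> For reference, the link I recreate after each restart looks like this
> (with my fsid, using the paths from the strace and ls output above):
>
> ln -s /var/run/ceph/d1fd0678-88c0-47fb-90da-e40a7a364442/ceph-mds.cephfs.ceph-mds2.cjpsjm.asok /var/run/ceph/ceph-mds.cephfs.ceph-mds2.cjpsjm.asok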
>
> Besides that, I've noticed that after switching to cephadm and Docker I
> can't restart the MDS servers with "service ceph-mds@ceph-mds1 restart"
> anymore. What's the proper method to restart them now?
>
> Regards.
>
>
> --
> Luis
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx