Re: Can't connect to MDS admin socket after updating to cephadm

Hi,

you can use cephadm for that now [1]. To attach to a running daemon, run the following (use 'cephadm ls' to see all cephadm daemons):

cephadm enter --name <DAEMON> [--fsid <FSID>]

There you can query the daemon as you used to:

storage01:~ # cephadm ls |grep mds
        "name": "mds.cephfs.storage01.ozpeev",

storage01:~ # cephadm enter --name mds.cephfs.storage01.ozpeev
Inferring fsid 877636d0-d118-11ec-83c7-fa163e990a3e
[ceph: root@storage01 /]# ceph daemon mds.cephfs.storage01.ozpeev ops
{
    "ops": [],
    "num_ops": 0
}
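
If you'd rather not enter the container, you can also point 'ceph daemon' at the full path of the admin socket on the host; cephadm places the sockets in a per-cluster directory named after the fsid. A minimal sketch, assuming the ceph CLI is installed on the host itself:

storage01:~ # ceph daemon /var/run/ceph/877636d0-d118-11ec-83c7-fa163e990a3e/ceph-mds.cephfs.storage01.ozpeev.asok ops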

You can still restart the daemons with systemctl:

storage01:~ # systemctl restart ceph-877636d0-d118-11ec-83c7-fa163e990a3e@mds.cephfs.storage01.ozpeev.service
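
Alternatively, cephadm's orchestrator can restart a daemon by name, which saves you from looking up the fsid:

storage01:~ # ceph orch daemon restart mds.cephfs.storage01.ozpeev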

Regards,
Eugen

[1] https://docs.ceph.com/en/latest/man/8/cephadm/?highlight=cephadm%20enter

Quoting Luis Calero Muñoz <luis.calero@xxxxxxxxxxxxxx>:

Hello, I'm running a Ceph 15.2.15 Octopus cluster, and in preparation for updating it I've first converted it to cephadm following the instructions on the website. All went well, but now I'm having a problem running "ceph daemon mds.* dump_ops_in_flight" because it gives me an error:

root@ceph-mds2:~# ceph -s |grep mds
    mds: cephfs:2 {0=cephfs.ceph-mds1.edwbhe=up:active,1=cephfs.ceph-mds2.cjpsjm=up:active} 2 up:standby

root@ceph-mds2:~# ceph daemon mds.cephfs.ceph-mds2.cjpsjm dump_ops_in_flight
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory

  One thing I've noticed is that the names of the MDS daemons have changed: before cephadm I could refer to them as mds.ceph-mds2, and now they're called something like mds.cephfs.ceph-mds2.cjpsjm, where the last part is a random string that changes when the daemon is restarted. Running strace on the ceph daemon command, I've found that the problem is that the command is looking for a socket in a location that doesn't exist:

root@ceph-mds2:~# strace ceph daemon mds.cephfs.ceph-mds2.cjpsjm dump_ops_in_flight
[...]
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mds.cephfs.ceph-mds2.cjpsjm.asok"}, 53) = -1 ENOENT (No such file or directory)
write(2, "admin_socket: exception getting "..., 90admin_socket: exception getting command descriptions: [Errno 2] No such file or directory


  This is because the socket is actually in a folder inside /var/run/ceph:

root@ceph-mds2:~# ls /var/run/ceph/
d1fd0678-88c0-47fb-90da-e40a7a364442/

root@ceph-mds2:~# ls /var/run/ceph/d1fd0678-88c0-47fb-90da-e40a7a364442/
ceph-mds.cephfs.ceph-mds2.cjpsjm.asok

   So if I symlink the socket to /var/run/ceph/ceph-mds.cephfs.ceph-mds2.cjpsjm.asok, the command runs without problems. That would be a workaround, but I would need to recreate the link every time the daemon restarts, so I think something is not right here; it should work out of the box. What can I do?

   Besides that, I've noticed that after updating to cephadm and Docker I can't restart the MDS servers with "service ceph-mds@ceph-mds1 restart" anymore. What's the proper method to restart them now?

  Regards.


--
  Luis
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


