admin_socket: exception getting command descriptions: [Errno 2] No such file or directory

Hi all,

two days ago we upgraded our cluster from octopus to pacific. Everything went well and we see lots of improvements. Thanks for releasing the last stable version with all its fixes. I do have some questions though and this hiccup is one for starters:

After the upgrade to pacific we started getting the error message "admin_socket: exception getting command descriptions: [Errno 2] No such file or directory" when using the ceph daemon command. Here is the output of a full session:

[root@ceph-adm:ceph-26 ~]# ceph daemon mon.ceph-26 version | jq .release 
"pacific"

[root@ceph-adm:ceph-26 ~]# ceph --id admin daemon mon.ceph-26 version | jq .release
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory

[root@ceph-adm:ceph-26 ~]# ceph --id admin daemon /var/run/ceph/ceph-mon.ceph-26.asok version | jq .release
"pacific"

[root@ceph-adm:ceph-26 ~]# ceph daemon /var/run/ceph/ceph-mon.ceph-26.asok version | jq .release
"pacific"

We observe that the ceph daemon command cannot be used in its simple form (with a daemon name) whenever a --id argument is present. This, unfortunately, creates an unnecessary restriction: we can't use non-admin users any more. Here is why this fails:

[root@ceph-adm:ceph-26 ~]# strace ceph daemon mon.ceph-26 version |& grep asok
stat("/var/run/ceph/ceph-mon.ceph-26.asok", {st_mode=S_IFSOCK|0755, st_size=0, ...}) = 0
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.ceph-26.asok"}, 37) = 0
getpeername(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.ceph-26.asok"}, [110 => 38]) = 0
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.ceph-26.asok"}, 37) = 0
getpeername(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.ceph-26.asok"}, [110 => 38]) = 0

[root@ceph-adm:ceph-26 ~]# strace ceph --id admin daemon mon.ceph-26 version |& grep asok
stat("/var/run/ceph/ceph-mon.admin.asok", 0x7fffa65e9f00) = -1 ENOENT (No such file or directory)
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/ceph/ceph-mon.admin.asok"}, 35) = -1 ENOENT (No such file or directory)

As you can see, the daemon name "ceph-26" was replaced with the user name "admin" passed with the argument to --id. As a result the command looks for a non-existent file. Passing the full path "fixes" this. This is clearly a bug and I wonder if there is a way out, for example, by setting an explicit daemon path template in the config.
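
Until this is fixed, the full-path workaround can be wrapped in a small shell helper so that scripts can keep addressing daemons by name. This is only a minimal sketch: the function name ceph_asok_path is hypothetical, and it assumes the socket layout visible in the strace output above (run directory /var/run/ceph, cluster name "ceph", file name <cluster>-<daemon name>.asok). If admin_socket is set to something else in the config, the path has to be adjusted accordingly.

```shell
# Hypothetical helper, not part of ceph: build the admin socket path for a
# daemon name such as "mon.ceph-26".
#   $1  daemon name (type.id)
#   $2  run directory, assumed default /var/run/ceph (as seen in the strace)
#   $3  cluster name, assumed default "ceph"
ceph_asok_path() {
    printf '%s/%s-%s.asok\n' "${2:-/var/run/ceph}" "${3:-ceph}" "$1"
}

# Usage, equivalent to the full-path call in the session above:
#   ceph --id admin daemon "$(ceph_asok_path mon.ceph-26)" version | jq .release
```

Because the path is built outside of ceph, the value passed to --id no longer influences which socket file is opened.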

I will open a tracker if a user on quincy or newer confirms that this is present in newer versions as well. I wonder if this is a fall-out of https://docs.ceph.com/en/latest/releases/pacific/#id39 Point 3: "$pid expansion in config paths like admin_socket will now properly expand to the daemon pid for commands like ceph-mds or ceph-osd. Previously only ceph-fuse/rbd-nbd expanded $pid with the actual daemon pid."

Thanks for any pointers on how to work around this issue.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14