Failure to start ceph-mon in docker

We are trying to set up a new Nautilus cluster using ceph-ansible with containers. We got things deployed, but I couldn't run `ceph -s` on the host, so I decided to `apt install ceph-common`, which installed the Luminous version from Ubuntu 18.04. For some reason the docker container that was running the monitor then restarted and now won't come back up. I added the repo for Nautilus and upgraded ceph-common, but the problem persists. The Manager and OSD docker containers don't seem to be affected at all. I see this in the journal:

Aug 28 20:40:55 sun-gcs02-osd01 systemd[1]: Starting Ceph Monitor...
Aug 28 20:40:55 sun-gcs02-osd01 docker[2926]: Error: No such container: ceph-mon-sun-gcs02-osd01
Aug 28 20:40:55 sun-gcs02-osd01 systemd[1]: Started Ceph Monitor.
Aug 28 20:40:55 sun-gcs02-osd01 docker[2949]: WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
Aug 28 20:40:56 sun-gcs02-osd01 docker[2949]: 2019-08-28 20:40:56  /opt/ceph-container/bin/entrypoint.sh: Existing mon, trying to rejoin cluster...
Aug 28 20:40:56 sun-gcs02-osd01 docker[2949]: warning: line 41: 'osd_memory_target' in section 'osd' redefined
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: 2019-08-28 20:41:03  /opt/ceph-container/bin/entrypoint.sh: /etc/ceph/ceph.conf is already memory tuned
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: 2019-08-28 20:41:03  /opt/ceph-container/bin/entrypoint.sh: SUCCESS
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: exec: PID 368: spawning /usr/bin/ceph-mon --cluster ceph --default-log-to-file=false --default-mon-cluster-log-to-file=false --setuser ceph --setgroup ceph -d --mon-cluster-log-to-stderr --log-stderr-prefix=debug  -i sun-gcs02-osd01 --mon-data /var/lib/ceph/mon/ceph-sun-gcs02-osd01 --public-addr 10.65.101.21
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: exec: Waiting 368 to quit
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: warning: line 41: 'osd_memory_target' in section 'osd' redefined
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28 20:41:03.835 7f401283c180  0 set uid:gid to 167:167 (ceph:ceph)
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28 20:41:03.835 7f401283c180  0 ceph version 14.2.2 (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable), process ceph-mon, pid 368
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28 20:41:03.835 7f401283c180 -1 stat(/var/lib/ceph/mon/ceph-sun-gcs02-osd01) (13) Permission denied
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28 20:41:03.835 7f401283c180 -1 error accessing monitor data directory at '/var/lib/ceph/mon/ceph-sun-gcs02-osd01': (13) Permission denied
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: managing teardown after SIGCHLD
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: Waiting PID 368 to terminate
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: Process 368 is terminated
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: Bye Bye, container will die with return code -1
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: if you don't want me to die and have access to a shell to debug this situation, next time run me with '-e DEBUG=stayalive'
Aug 28 20:41:04 sun-gcs02-osd01 systemd[1]: ceph-mon@sun-gcs02-osd01.service: Main process exited, code=exited, status=255/n/a
Aug 28 20:41:04 sun-gcs02-osd01 systemd[1]: ceph-mon@sun-gcs02-osd01.service: Failed with result 'exit-code'.

The directories for the monitor are owned by 167:167, which matches the UID:GID that the container reports.

root@sun-gcs02-osd01:~# ls -lhd /var/lib/ceph/
drwxr-x--- 14 ceph ceph 4.0K Jul 30 22:15 /var/lib/ceph/
root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/
total 56K
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-mds
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-mgr
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-osd
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-rbd
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-rbd-mirror
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-rgw
drwxr-xr-x   3 167 167 4.0K Jul 30 22:15 mds
drwxr-xr-x   3 167 167 4.0K Jul 30 22:15 mgr
drwxr-xr-x   3 167 167 4.0K Jul 30 22:15 mon
drwxr-xr-x  14 167 167 4.0K Jul 30 22:28 osd
drwxr-xr-x   4 167 167 4.0K Aug  1 23:36 radosgw
drwxr-xr-x 254 167 167  12K Aug 28 20:44 tmp
root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/mon/
total 4.0K
drwxr-xr-x 3 167 167 4.0K Jul 30 22:16 ceph-sun-gcs02-osd01
root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/mon/ceph-sun-gcs02-osd01/
total 16K
-rw------- 1 167 167   77 Jul 30 22:15 keyring
-rw-r--r-- 1 167 167    8 Jul 30 22:15 kv_backend
-rw-r--r-- 1 167 167    3 Jul 30 22:16 min_mon_release
drwxr-xr-x 2 167 167 4.0K Aug 28 19:16 store.db
root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/mon/ceph-sun-gcs02-osd01/store.db/
total 149M
-rw-r--r-- 1 167 167 1.7M Aug 28 19:16 050225.log
-rw-r--r-- 1 167 167  65M Aug 28 19:16 050227.sst
-rw-r--r-- 1 167 167  45M Aug 28 19:16 050228.sst
-rw-r--r-- 1 167 167   16 Aug 16 07:40 CURRENT
-rw-r--r-- 1 167 167   37 Jul 30 22:15 IDENTITY
-rw-r--r-- 1 167 167    0 Jul 30 22:15 LOCK
-rw-r--r-- 1 167 167 1.3M Aug 28 19:16 MANIFEST-027846
-rw-r--r-- 1 167 167 4.7K Aug  1 23:38 OPTIONS-002825
-rw-r--r-- 1 167 167 4.7K Aug 16 07:40 OPTIONS-027849
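
One thing the listings above do show is that /var/lib/ceph/ itself is mode drwxr-x--- and owned by `ceph:ceph` by name, while everything under it is owned by the numeric 167:167. If the host's `ceph` user maps to a UID other than 167 (possibly because the ceph-common package created it; that is an assumption, not something verified here), the container's UID 167 would fall into "other" on that directory and could not traverse it. Below is a minimal Python sketch for checking this, meant to be run as root on the host. The helper names `can_traverse` and `blocked_components` are made up for illustration, and the check only looks at the classic owner/group/other mode bits (no ACLs, supplementary groups, or capabilities).

```python
import os
import pwd
import stat


def can_traverse(st_mode, st_uid, st_gid, uid, gid):
    """Classic owner/group/other check for search (x) permission on a directory."""
    if uid == st_uid:
        return bool(st_mode & stat.S_IXUSR)
    if gid == st_gid:
        return bool(st_mode & stat.S_IXGRP)
    return bool(st_mode & stat.S_IXOTH)


def blocked_components(path, uid, gid):
    """Return every directory from / down to `path` that uid:gid cannot search."""
    # Collect the path and all of its ancestors, then walk them top-down.
    parts = []
    current = os.path.abspath(path)
    while True:
        parts.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent

    blocked = []
    for component in reversed(parts):
        st = os.stat(component)
        if stat.S_ISDIR(st.st_mode) and not can_traverse(
            st.st_mode, st.st_uid, st.st_gid, uid, gid
        ):
            blocked.append(component)
    return blocked


if __name__ == "__main__":
    # Path and UID:GID taken from the journal output above.
    target = "/var/lib/ceph/mon/ceph-sun-gcs02-osd01"
    try:
        print("host 'ceph' user has UID", pwd.getpwnam("ceph").pw_uid)
    except KeyError:
        print("no 'ceph' user on this host")
    if os.path.exists(target):
        for component in blocked_components(target, 167, 167):
            st = os.stat(component)
            print(f"{component}: owner {st.st_uid}:{st.st_gid}, "
                  f"mode {stat.filemode(st.st_mode)}")
    else:
        print(f"{target} does not exist on this host")
```

Any directory this prints is one that UID 167 cannot pass through, which would be consistent with the "(13) Permission denied" on stat() even though the monitor directory itself is owned by 167:167.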

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
