Hi,

On 02.05.19 13:40, Alfredo Deza wrote:
> Can you give a bit more details on the environment? How dense is the
> server? If the unit retries is fine and I was hoping at some point it
> would see things ready and start activating (it does retry
> indefinitely at the moment).

It is a machine with 13 BlueStore OSDs on LVM, with SSDs as block.db
devices. The SSDs have also been set up with LVM. All of this was done
with "ceph-volume lvm batch", roughly as sketched below.

The issue started with the latest Ubuntu updates (no Ceph updates were
involved) and the subsequent reboot. The customer let the boot process
run for over 30 minutes, but the ceph-volume activation services (as
well as wpa_supplicant and logind) were never able to start.
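For context, the deployment command looked something like this (the
device names are placeholders, not the real ones from this machine;
when rotational and solid-state devices are mixed, "ceph-volume lvm
batch" places the block.db LVs on the solid-state devices
automatically):

    # Hypothetical devices: the real host has 13 data disks plus SSDs.
    # batch creates one OSD per data disk and carves block.db LVs
    # out of the solid-state devices.
    ceph-volume lvm batch --bluestore \
        /dev/sda /dev/sdb /dev/sdc \
        /dev/nvme0n1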
> Would also help to see what problems is it encountering as it can't
> get to activate. There are two logs for this, one for the systemd unit
> at /var/log/ceph/ceph-volume-systemd.log and the other one at
> /var/log/ceph/ceph-volume.log that might help.

Like these entries?

[2019-05-02 10:04:32,211][ceph_volume.process][INFO ] stderr Job for ceph-osd@21.service canceled.
[2019-05-02 10:04:32,211][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/main.py", line 148, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 182, in dispatch
    instance.main()
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/main.py", line 40, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 182, in dispatch
    instance.main()
  File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/trigger.py", line 70, in main
    Activate(['--auto-detect-objectstore', osd_id, osd_uuid]).main()
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 339, in main
    self.activate(args)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 261, in activate
    return activate_bluestore(lvs, no_systemd=args.no_systemd)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/activate.py", line 196, in activate_bluestore
    systemctl.start_osd(osd_id)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/systemd/systemctl.py", line 39, in start_osd
    return start(osd_unit % id_)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/systemd/systemctl.py", line 8, in start
    process.run(['systemctl', 'start', unit])
  File "/usr/lib/python2.7/dist-packages/ceph_volume/process.py", line 153, in run
    raise RuntimeError(msg)
RuntimeError: command returned non-zero exit status: 1
[2019-05-02 10:04:32,222][ceph_volume.process][INFO ] stdout Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-21
--> Absolute path not found for executable: restorecon
--> Ensure $PATH environment variable contains common executable locations
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-21
Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-block-393ba2fc-e970-4d48-8dcb-c6261dfdfe08/osd-block-931e2d94-63f6-4df8-baed-6873eb0123e2 --path /var/lib/ceph/osd/ceph-21 --no-mon-config
Running command: /bin/ln -snf /dev/ceph-block-393ba2fc-e970-4d48-8dcb-c6261dfdfe08/osd-block-931e2d94-63f6-4df8-baed-6873eb0123e2 /var/lib/ceph/osd/ceph-21/block
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-21/block
Running command: /bin/chown -R ceph:ceph /dev/dm-12
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-21
Running command: /bin/ln -snf /dev/ceph-block-dbs-75eda181-946f-4a40-b4e0-8ecd60721398/osd-block-db-45ee9a1f-3ee2-4db9-a057-fd06fa1452e8 /var/lib/ceph/osd/ceph-21/block.db
Running command: /bin/chown -h ceph:ceph /dev/ceph-block-dbs-75eda181-946f-4a40-b4e0-8ecd60721398/osd-block-db-45ee9a1f-3ee2-4db9-a057-fd06fa1452e8
Running command: /bin/chown -R ceph:ceph /dev/dm-21
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-21/block.db
Running command: /bin/chown -R ceph:ceph /dev/dm-21
Running command: /bin/systemctl enable ceph-volume@lvm-21-e6f688e0-3e71-4ee6-90f3-b3c07a99059f
Running command: /bin/systemctl enable --runtime ceph-osd@21
 stderr: Created symlink /run/systemd/system/ceph-osd.target.wants/ceph-osd@21.service → /lib/systemd/system/ceph-osd@.service.
Running command: /bin/systemctl start ceph-osd@21
 stderr: Job for ceph-osd@21.service canceled.

There is nothing in the global journal because journald had not been
started at that time.
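In case it helps with debugging: as I understand it, the traceback
above is exactly the code path the ceph-volume@lvm-* systemd unit
executes, so the same activation can be re-triggered by hand once the
system is up, along these lines (the id/fsid pair is taken from the
unit name in the log; the commands are my reconstruction, not copied
from the log):

    # Re-run the activation that the systemd unit performs for OSD 21
    ceph-volume lvm trigger 21-e6f688e0-3e71-4ee6-90f3-b3c07a99059f

    # Equivalent direct invocation, matching the traceback
    ceph-volume lvm activate --auto-detect-objectstore \
        21 e6f688e0-3e71-4ee6-90f3-b3c07a99059f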
> The "After=" directive is just adding some wait time to start
> activating here, so I wonder how is it that your OSDs didn't
> eventually came up.

Yes, we added that After= because ceph-osd@.service contains this line
(see the drop-in sketch below). At least it does no harm. ;)
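For completeness, the change was made as a systemd drop-in along these
lines (the After= target shown is only a placeholder; the real value
mirrors the corresponding line in ceph-osd@.service):

    # /etc/systemd/system/ceph-volume@.service.d/override.conf
    # Illustrative drop-in; replace local-fs.target with the target
    # actually listed in ceph-osd@.service.
    [Unit]
    After=local-fs.target

followed by "systemctl daemon-reload" to make systemd pick it up.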
Regards
-- 
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 93818 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin