Re: OSD container won't boot up

Update on this: I've figured out what happened.

I had Ceph packages installed on the node, since this cluster was converted from ceph-deploy to cephadm back when we tested Octopus. When I upgraded Ubuntu, it updated those packages, which triggered an already-documented permission issue. The whole mess doesn't stop there, though. Now I'm getting the following error:

Dec 01 20:18:32 ceph-osdstore3 bash[10389]: debug 2022-12-01T20:18:32.269+0000 7f852991ef00 -1 bluefs _check_new_allocations invalid extent 1: 0x40440000~10000: wasn't given but allocated for ino 1
Dec 01 20:18:32 ceph-osdstore3 bash[10389]: debug 2022-12-01T20:18:32.269+0000 7f852991ef00 -1 bluefs mount failed to replay log: (14) Bad address
Dec 01 20:18:32 ceph-osdstore3 bash[10389]: debug 2022-12-01T20:18:32.269+0000 7f852991ef00 -1 bluestore(/var/lib/ceph/osd/ceph-6) _open_bluefs failed bluefs mount: (14) Bad address
Dec 01 20:18:32 ceph-osdstore3 bash[10389]: debug 2022-12-01T20:18:32.545+0000 7f852991ef00 -1 osd.6 0 OSD:init: unable to mount object store
Dec 01 20:18:32 ceph-osdstore3 bash[10389]: debug 2022-12-01T20:18:32.545+0000 7f852991ef00 -1  ** ERROR: osd init failed: (14) Bad address
Dec 01 20:18:33 ceph-osdstore3 systemd[1]: ceph-04c5d4a4-8815-45fb-b97f-027252d1aea5@osd.6.service: Main process exited, code=exited, status=1/FAILURE

This is similar to what the Rook team found here: https://tracker.ceph.com/issues/48036. From what I understand, the BlueFS filesystem is corrupted and a ceph-bluestore-tool repair can't fix it. Is there a way to fix this, or is it completely unrepairable?
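For anyone who runs into the same sequence: my understanding is that the documented permission issue is fixed by re-chowning the cluster's data directory to the uid/gid the containers run as (167:167 is my understanding of the container's ceph user; substitute your own fsid):

# give the cephadm containers back ownership of the cluster data directory
chown -R 167:167 /var/lib/ceph/04c5d4a4-8815-45fb-b97f-027252d1aea5

And the repair attempt I'm referring to above is roughly the following, run from inside the OSD's own container so the tool version matches the on-disk format (osd.6 and the path are from the log above; the OSD needs to be down, which it already is):

cephadm shell --name osd.6
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-6 --deep 1
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-6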

On 11/29/22 13:57, J-P Methot wrote:
Hi,

I've been testing the cephadm upgrade process in my staging environment and I'm running into an issue where the Docker container just doesn't boot up anymore. This is an Octopus to Pacific 16.2.10 upgrade, and I expect to upgrade to Quincy afterwards. This is also running on Ubuntu 22.04. When cephadm tries to start the OSD container, I get the following error:

cephadm exited with an error code: 1, stderr:Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:5e650154a636a3655892c436203b3535433d014ca0774224a42fdf166887cd4d -e NODE_NAME=ceph-osdstore1 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=None -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/04c5d4a4-8815-45fb-b97f-027252d1aea5:/var/run/ceph:z -v /var/log/ceph/04c5d4a4-8815-45fb-b97f-027252d1aea5:/var/log/ceph:z -v /var/lib/ceph/04c5d4a4-8815-45fb-b97f-027252d1aea5/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmpfqec2qeo:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpaqun0_rl:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:5e650154a636a3655892c436203b3535433d014ca0774224a42fdf166887cd4d lvm batch --no-auto /dev/sdb --yes --no-systemd

I tried running the above command myself to see what error Docker would throw at me, and this is what I get:

exception caught by decorator
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 114, in main
    description=self.help(),
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 47, in help
    ceph_path=self.stat_ceph_conf(),
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 83, in stat_ceph_conf
    configuration.load(conf.path)
  File "/usr/lib/python3.6/site-packages/ceph_volume/configuration.py", line 56, in load
    ceph_file = open(abspath)
IsADirectoryError: [Errno 21] Is a directory: '/etc/ceph/ceph.conf'

I'm not entirely sure how to troubleshoot this further. I'd be quite surprised if this were an issue with the container image.
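One detail that may matter when re-running the command by hand: the ceph.conf passed into the container is, as far as I can tell, a temporary file that cephadm generates (the /tmp/ceph-tmpfqec2qeo path above) and cleans up afterwards, and Docker silently creates an empty directory at a bind-mount source that doesn't exist. A manual run after cleanup would therefore see a directory at /etc/ceph/ceph.conf, which is exactly what the traceback shows. A quick check on the host:

# a directory here (rather than a regular file) means Docker created it for the bind mount
ls -ld /tmp/ceph-tmpfqec2qeo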

--
Jean-Philippe Méthot
Senior Openstack system administrator
Administrateur système Openstack sénior
PlanetHoster inc.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



