Re: osd won't restart

Hi Frédéric,

Thanks a lot for your help... I'm getting a bit desperate.

I keep on having the same issue about permissions (see below) but I've checked. All folders in /var/lib/ceph are owned by ceph:ceph.
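For what it's worth, this is roughly how I checked the ownership (the helper name is mine, and the path/user are just what applies on my host):

```shell
# Print anything under a directory tree that is NOT owned by the given
# user/group; an empty result means ownership is uniform.
not_owned_by() {  # usage: not_owned_by DIR USER GROUP
  find "$1" \( ! -user "$2" -o ! -group "$3" \) -print 2>/dev/null
}

# On the host I ran it against the OSD data dirs, e.g.:
# not_owned_by /var/lib/ceph ceph ceph
```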

# ceph orch daemon add osd hvs005:/dev/sde
...
/usr/bin/docker: stderr Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-18/keyring
/usr/bin/docker: stderr Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-18/
/usr/bin/docker: stderr Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 18 --monmap /var/lib/ceph/osd/ceph-18/activate.monmap --keyfile - --osdspec-affinity None --osd-data /var/lib/ceph/osd/ceph-18/ --osd-uuid a9a453d1-af41-4889-8bd7-360db63031af --setuser ceph --setgroup ceph
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:07.503+0000 799d28690640 -1 bluestore(/var/lib/ceph/osd/ceph-18//block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-18//block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:07.503+0000 799d28690640 -1 bluestore(/var/lib/ceph/osd/ceph-18//block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-18//block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:07.504+0000 799d28690640 -1 bluestore(/var/lib/ceph/osd/ceph-18//block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-18//block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:07.504+0000 799d28690640 -1 bluestore(/var/lib/ceph/osd/ceph-18/) _read_fsid unparsable uuid
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:07.578+0000 799d28690640 -1 bluestore(/var/lib/ceph/osd/ceph-18//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-18//block: (13) Permission denied
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:07.578+0000 799d28690640 -1 bluestore(/var/lib/ceph/osd/ceph-18//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-18//block: (13) Permission denied
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:07.578+0000 799d28690640 -1 bluestore(/var/lib/ceph/osd/ceph-18//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-18//block: (13) Permission denied
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:07.578+0000 799d28690640 -1 bluestore(/var/lib/ceph/osd/ceph-18//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-18//block: (13) Permission denied
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:07.578+0000 799d28690640 -1 bluestore(/var/lib/ceph/osd/ceph-18//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-18//block: (13) Permission denied
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:07.578+0000 799d28690640 -1 bluestore(/var/lib/ceph/osd/ceph-18//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-18//block: (13) Permission denied
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:07.578+0000 799d28690640 -1 bluestore(/var/lib/ceph/osd/ceph-18//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-18//block: (13) Permission denied
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:08.016+0000 799d28690640 -1 bluestore(/var/lib/ceph/osd/ceph-18//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-18//block: (13) Permission denied
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:08.016+0000 799d28690640 -1 bdev(0x5702dfb36a80 /var/lib/ceph/osd/ceph-18//block) open open got: (13) Permission denied
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:08.016+0000 799d28690640 -1 OSD::mkfs: ObjectStore::mkfs failed with error (13) Permission denied
/usr/bin/docker: stderr  stderr: 2025-01-22T17:50:08.016+0000 799d28690640 -1  ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-18/: (13) Permission denied
/usr/bin/docker: stderr --> Was unable to complete a new OSD, will rollback changes
...
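I'm not sure the recursive chown in the log above ever reaches the device node behind the 'block' symlink, so I also looked at the owner and mode of the actual device. A small helper I used (the path in the comment is just an example from this host):

```shell
# Resolve an OSD 'block' symlink and print the owner, group and mode of
# whatever it finally points at.
blockdev_owner() {  # usage: blockdev_owner /var/lib/ceph/osd/ceph-18/block
  target=$(readlink -f "$1") || return 1
  stat -c '%U:%G %a %n' "$target"
}

# e.g.:  blockdev_owner /var/lib/ceph/osd/ceph-18/block
# On my working OSDs I'd expect ceph:ceph here (167:167 seen from the container).
```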

Also, I still have traces of the 'old' OSDs in the system:
# ceph orch ps hvs005
NAME                  HOST    PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION    IMAGE ID      CONTAINER ID
crash.hvs005          hvs005          running (14m)     5m ago  14m    7224k        -  19.2.0     37996728e013  7312995d44a8
node-exporter.hvs005  hvs005  *:9100  running (14m)     5m ago  14m    8095k        -  1.5.0      0da6a335fe13  8ffad82e01ac
osd.18                hvs005          stopped           5m ago   9d        -    4096M  <unknown>  <unknown>     <unknown>
osd.19                hvs005          error             5m ago   9d        -    4096M  <unknown>  <unknown>     <unknown>
osd.20                hvs005          error             5m ago   9d        -    4096M  <unknown>  <unknown>     <unknown>
osd.21                hvs005          error             5m ago   9d        -    4096M  <unknown>  <unknown>     <unknown>
osd.22                hvs005          error             5m ago   9d        -    4096M  <unknown>  <unknown>     <unknown>
osd.23                hvs005          error             5m ago   9d        -    4096M  <unknown>  <unknown>     <unknown>
osd.24                hvs005          running (2h)      5m ago   9d    1929M    16.3G  19.2.0     37996728e013  3dd73aa4aa26
osd.31                hvs005          running (2h)      5m ago   9d    1821M    16.3G  19.2.0     37996728e013  e1060d4906b3

These 'old' OSDs had been removed and zapped...
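After zapping, I tried to convince myself a device was really clean by looking for the bluestore magic at the start of it. As far as I know the label begins with the literal string 'bluestore block device' at offset 0, but treat that as an assumption; the helper is mine:

```shell
# Return success if the device/file still starts with what looks like a
# bluestore label ('bluestore block device' is, as far as I know, the
# magic string at offset 0).
has_bluestore_label() {  # usage: has_bluestore_label /dev/sdX
  head -c 22 "$1" 2>/dev/null | grep -q 'bluestore block device'
}

# e.g.:  has_bluestore_label /dev/sde && echo "label still present"
```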

I also still have the systemd unit entries for these disks (except for osd.18, which I deleted manually while trying to clean up osd.18):
# ls -lh /etc/systemd/system/ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585.target.wants/
total 36K
lrwxrwxrwx 1 root root 70 Jan 22 17:45 ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@crash.hvs005.service -> /etc/systemd/system/ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@.service
lrwxrwxrwx 1 root root 70 Jan 22 17:45 ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@node-exporter.hvs005.service -> /etc/systemd/system/ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@.service
lrwxrwxrwx 1 root root 70 Jan 13 16:41 ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@osd.19.service -> /etc/systemd/system/ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@.service
lrwxrwxrwx 1 root root 70 Jan 13 16:42 ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@osd.20.service -> /etc/systemd/system/ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@.service
lrwxrwxrwx 1 root root 70 Jan 13 16:42 ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@osd.21.service -> /etc/systemd/system/ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@.service
lrwxrwxrwx 1 root root 70 Jan 13 16:42 ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@osd.22.service -> /etc/systemd/system/ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@.service
lrwxrwxrwx 1 root root 70 Jan 13 16:42 ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@osd.23.service -> /etc/systemd/system/ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@.service
lrwxrwxrwx 1 root root 70 Jan 13 16:47 ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@osd.24.service -> /etc/systemd/system/ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@.service
lrwxrwxrwx 1 root root 70 Jan 13 16:48 ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@osd.31.service -> /etc/systemd/system/ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@.service
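In case it helps anyone following along, this is the kind of helper I used to list (and, with RM=1, remove) the leftover per-OSD unit links before a 'systemctl daemon-reload'. The wants-dir path and OSD ids in the comment are just the ones from this host:

```shell
# List per-OSD service links in a target's .wants directory for the given
# OSD ids; set RM=1 in the environment to delete them as well.
stale_osd_links() {  # usage: stale_osd_links WANTS_DIR ID [ID...]
  dir="$1"; shift
  for id in "$@"; do
    for link in "$dir"/*"@osd.$id.service"; do
      [ -e "$link" ] || [ -L "$link" ] || continue
      echo "$link"
      if [ "${RM:-0}" = 1 ]; then rm -f "$link"; fi
    done
  done
}

# stale_osd_links /etc/systemd/system/ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585.target.wants 19 20 21 22 23
# ...then: systemctl daemon-reload
```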

This host is getting very messy...

> -----Original message-----
> From: Frédéric Nass <frederic.nass@xxxxxxxxxxxxxxxx>
> Sent: Wednesday, 22 January 2025 18:20
> To: Dominique Ramaekers <dominique.ramaekers@xxxxxxxxxx>
> CC: ceph-users <ceph-users@xxxxxxx>
> Subject: Re:  Re: osd won't restart
>
>
> ----- On 22 Jan 25, at 15:31, Dominique Ramaekers
> dominique.ramaekers@xxxxxxxxxx wrote:
>
> > I didn't get any reply on this issue, so I tried some steps:
> > - I removed Apparmor (Ubuntu right...)
> > - I restarted the server
> > - Set the OSDs unmanaged: # ceph orch set-unmanaged osd.all-available-devices
> >   (because this service was getting in the way when I wanted to create LVs)
> > - Created an LV on a disk
> >
> > Then, when creating the OSD, I get this:
> > # ceph orch daemon add osd hvs005:/dev/vgsdc/hvs005_sdc
> > Created no osd(s) on host hvs005; already created?
> >
>
> cephadm takes care of creating LVM volumes on devices, so the "Created
> an LV on a disk" step you did beforehand probably explains the "Created no
> osd(s) on host hvs005; already created?" message and why no OSD was created.
>
> Frédéric.
>
> > I removed the config keys of the first available OSD number, and also
> > the auth entry... No luck...
> >
> > Can someone give me some pointers on how to continue creating OSDs?
> >
> > Note: My setup is a simple setup deployed with cephadm and docker...
> >
> >> -----Original message-----
> >> From: Dominique Ramaekers <dominique.ramaekers@xxxxxxxxxx>
> >> Sent: Monday, 20 January 2025 11:59
> >> To: ceph-users@xxxxxxx
> >> Subject:  osd won't restart
> >>
> >> Hi,
> >>
> >> A strange thing just happened (Ceph v19.2.0). I added two disks to a host.
> >> The kernel recognized the two disks nicely and they appeared as available
> >> devices in Ceph.
> >>
> >> After 15 minutes the OSDs had still not been created, so I looked at the logs:
> >> /usr/bin/docker: stderr --> Creating keyring file for osd.36
> >> /usr/bin/docker: stderr Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-36/keyring
> >> /usr/bin/docker: stderr Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-36/
> >> /usr/bin/docker: stderr Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 36 --monmap /var/lib/ceph/osd/ceph-36/activate.monmap --keyfile - --osdspec-affinity all-available-devices --osd-data /var/lib/ceph/osd/ceph-36/ --osd-uuid 41675779-943d-4dca-baa3-3a4f6ace004a --setuser ceph --setgroup ceph
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:01.979+0000 79e9fea34640 -1 bluestore(/var/lib/ceph/osd/ceph-36//block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-36//block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:01.979+0000 79e9fea34640 -1 bluestore(/var/lib/ceph/osd/ceph-36//block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-36//block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:01.980+0000 79e9fea34640 -1 bluestore(/var/lib/ceph/osd/ceph-36//block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-36//block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:01.980+0000 79e9fea34640 -1 bluestore(/var/lib/ceph/osd/ceph-36/) _read_fsid unparsable uuid
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:02.075+0000 79e9fea34640 -1 bluestore(/var/lib/ceph/osd/ceph-36//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-36//block: (13) Permission denied
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:02.076+0000 79e9fea34640 -1 bluestore(/var/lib/ceph/osd/ceph-36//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-36//block: (13) Permission denied
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:02.076+0000 79e9fea34640 -1 bluestore(/var/lib/ceph/osd/ceph-36//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-36//block: (13) Permission denied
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:02.076+0000 79e9fea34640 -1 bluestore(/var/lib/ceph/osd/ceph-36//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-36//block: (13) Permission denied
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:02.076+0000 79e9fea34640 -1 bluestore(/var/lib/ceph/osd/ceph-36//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-36//block: (13) Permission denied
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:02.076+0000 79e9fea34640 -1 bluestore(/var/lib/ceph/osd/ceph-36//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-36//block: (13) Permission denied
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:02.076+0000 79e9fea34640 -1 bluestore(/var/lib/ceph/osd/ceph-36//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-36//block: (13) Permission denied
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:02.506+0000 79e9fea34640 -1 bluestore(/var/lib/ceph/osd/ceph-36//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-36//block: (13) Permission denied
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:02.506+0000 79e9fea34640 -1 bdev(0x566a21f14a80 /var/lib/ceph/osd/ceph-36//block) open open got: (13) Permission denied
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:02.506+0000 79e9fea34640 -1 OSD::mkfs: ObjectStore::mkfs failed with error (13) Permission denied
> >> /usr/bin/docker: stderr  stderr: 2025-01-20T08:21:02.506+0000 79e9fea34640 -1  ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-36/: (13) Permission denied
> >> /usr/bin/docker: stderr --> Was unable to complete a new OSD, will rollback changes
> >>
> >> As it said "Permission denied" and I already have OSDs running, I
> >> thought the issue might be that Docker had been updated but not
> >> restarted. So I did 'systemctl restart docker.service'. Now none of
> >> the managed OSDs are coming back online!!!
> >> 'systemctl start ceph-dd4b0610-b4d2-11ec-bb58-d1b32ae31585@osd.18.service'
> >> fails with not much explanation...
> >>
> >> Only the unmanaged OSDs have no issue...
> >>
> >> I didn't pay much attention to the log entry '_read_fsid unparsable uuid'...
> >> So I think there is more going on. Permission denied would be logical
> >> if the path were wrong... see the double slash in '_read_bdev_label
> >> failed to open /var/lib/ceph/osd/ceph-36//block'.
> >> Is this a bug like https://github.com/rook/rook/issues/10219 ?
> >>
> >> Can I get around this without recreating these osd as unmanaged?
> >>
> >> Thanks in advance.
> >>
> >> Dominique.
> >>
> >>
> >> _______________________________________________
> >> ceph-users mailing list -- ceph-users@xxxxxxx
> >> To unsubscribe send an email to ceph-users-leave@xxxxxxx



