Re: cephadm Failed to apply 1 service(s)

Sure, you can save the drivegroup spec to a file, edit it to match your requirements (I'm not sure that having explicit device paths in there makes sense, though), and apply it:

ceph orch apply -i new-drivegroup.yml
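
For the record, the full round trip would be something like this (untested
sketch, using the service name from the spec you pasted below):

ceph orch ls osd.ceph03_combined_osd --export > new-drivegroup.yml
# edit new-drivegroup.yml: remove the "- /dev/sde" entry under
# data_devices -> paths
ceph orch apply -i new-drivegroup.yml
# verify the failed drive is gone from the applied spec
ceph orch ls osd.ceph03_combined_osd --export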

Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:

Many thanks for your response, Eugen!

I tried failing the mgr twice; unfortunately, that had no effect on the issue.
Neither `cephadm ceph-volume inventory` nor `ceph device ls-by-host ceph03`
lists the failed drive.

Your assumption is correct, though: the spec explicitly includes the failed
drive:

---
service_type: osd
service_id: ceph03_combined_osd
service_name: osd.ceph03_combined_osd
placement:
  hosts:
  - ceph03
spec:
  data_devices:
    paths:
...
    - /dev/sde
...
  db_devices:
    paths:
    - /dev/nvme0n1
    - /dev/nvme1n1
  filter_logic: AND
  objectstore: bluestore
---

Do you know the best way to remove the device from the spec?

/Z

On Fri, 16 Feb 2024 at 14:10, Eugen Block <eblock@xxxxxx> wrote:

Hi,

sometimes the easiest fix is to fail over the mgr; have you tried that?
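
Failing the mgr over should be a one-liner, something like:

ceph mgr fail

(On Pacific that should work without an argument and fail whichever mgr is
currently active; otherwise pass the active mgr's name, e.g. the one shown
by 'ceph mgr stat'.)
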
If that doesn't help, can you share the drivegroup spec?

ceph orch ls <your_osd_spec> --export

Does it contain specific device paths or something? Does 'cephadm ls'
on that node show any traces of the previous OSD?
I'd probably try to check some things like

cephadm ceph-volume inventory
ceph device ls-by-host <host>

Regards,
Eugen

Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:

> Hi,
>
> We had a physical drive malfunction in one of our Ceph OSD hosts managed
> by cephadm (Ceph 16.2.14). I have removed the drive from the system, and the
> kernel no longer sees it:
>
> ceph03 ~]# ls -al /dev/sde
> ls: cannot access '/dev/sde': No such file or directory
>
> I have removed the corresponding OSD from cephadm, crush map, etc. For all
> intents and purposes that OSD and its block device no longer exist:
>
> root@ceph01:/# ceph orch ps | grep osd.26
> root@ceph01:/# ceph osd tree | grep 26
> root@ceph01:/# ceph orch device ls | grep -E "ceph03.*sde"
>
> None of the above commands return anything. Cephadm correctly sees 8
> remaining OSDs on the host:
>
> root@ceph01:/# ceph orch ls | grep ceph03_c
> osd.ceph03_combined_osd                     8  33s ago    2y   ceph03
>
> Unfortunately, cephadm appears to be trying to apply a spec to host ceph03
> including the disk that is now missing:
>
> RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host
> --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume
> --privileged --group-add=disk --init -e CONTAINER_IMAGE=
>
> quay.io/ceph/ceph@sha256:843f112990e6489362c625229c3ea3d90b8734bd5e14e0aeaf89942fbb980a8b
> -e NODE_NAME=ceph03 -e CEPH_USE_RANDOM_NONCE=1 -e
> CEPH_VOLUME_OSDSPEC_AFFINITY=ceph03_combined_osd -e
> CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v
> /var/run/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86:/var/run/ceph:z -v
> /var/log/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86:/var/log/ceph:z -v
>
> /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86/crash:/var/lib/ceph/crash:z
> -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v
> /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v
> /tmp/ceph-tmpc7b33pf0:/etc/ceph/ceph.conf:z -v
> /tmp/ceph-tmpq45nkmd6:/var/lib/ceph/bootstrap-osd/ceph.keyring:z
>
> quay.io/ceph/ceph@sha256:843f112990e6489362c625229c3ea3d90b8734bd5e14e0aeaf89942fbb980a8b
> lvm batch --no-auto /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
> /dev/sdg /dev/sdh /dev/sdi --db-devices /dev/nvme0n1 /dev/nvme1n1 --yes
> --no-systemd
>
> Note that `lvm batch` includes the missing drive, /dev/sde. This fails
> because the drive no longer exists. Other than this cephadm ceph-volume
> issue, the cluster is healthy. How can I tell cephadm to stop trying to
> use /dev/sde, which no longer exists, without affecting the other OSDs on
> the host?
>
> I would very much appreciate any advice or pointers.
>
> Best regards,
> Zakhar



_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


