Re: Stuck OSD service specification - can't remove

Hi,

I'm not attempting to remove the OSDs, but instead the
service/placement specification. I want the OSDs/data to persist.
--force did not work on the service, as noted in the original email.
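
To make the mismatch between the applied specification and the --export
output concrete, here is a minimal sketch, not part of the original
exchange. The two dictionaries are transcribed from the YAML quoted
below; the nesting is an assumption about how that YAML was indented,
and diff_specs is a hypothetical helper, not a Ceph API.

```python
# Values transcribed from the thread; the nesting is an assumption.
applied = {
    "service_type": "osd",
    "service_id": "osd_spec",
    "placement": {"label": "osd"},
    "data_devices": {"rotational": 1},
    "db_devices": {"rotational": 0},
    "db_slots": 12,
}

exported = {
    "service_type": "osd",
    "service_id": "osd_spec",
    "service_name": "osd.osd_spec",
    "placement": {},
    "unmanaged": True,
    "spec": {"filter_logic": "AND", "objectstore": "bluestore"},
}


def diff_specs(a, b):
    """Hypothetical helper: map each differing top-level key to (a_value, b_value).

    A key absent from one side is reported with None on that side.
    """
    result = {}
    for key in sorted(set(a) | set(b)):
        left, right = a.get(key), b.get(key)
        if left != right:
            result[key] = (left, right)
    return result


diff = diff_specs(applied, exported)
for key, (ours, theirs) in diff.items():
    print(f"{key}: applied={ours!r} exported={theirs!r}")
```

Only service_type and service_id agree; every other top-level key
differs, including the placement, which the export reports as empty.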

Thank you,
David

On Fri, May 7, 2021 at 1:36 AM mabi <mabi@xxxxxxxxxxxxx> wrote:
>
> Hi David,
>
> I had a similar issue yesterday when I wanted to remove one OSD from a node that had 2 OSDs. I used the "ceph orch osd rm" command, which completed successfully. After rebooting that node, however, it was still trying to start the systemd service for that OSD, and one CPU core was 100% busy running "crun delete", which I suppose was trying to delete an image or container. I killed that process and also had to run the following command:
>
> ceph orch daemon rm osd.3 --force
>
> After that everything was fine again. This is a Ceph 15.2.11 cluster on Ubuntu 20.04 with podman.
>
> Hope that helps.
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Friday, May 7, 2021 1:24 AM, David Orman <ormandj@xxxxxxxxxxxx> wrote:
>
> > Has anybody run into a 'stuck' OSD service specification? I've tried
> > to delete it, but it's stuck in 'deleting' state, and has been for
> > quite some time (even prior to upgrade, on 15.2.x). This is on 16.2.3:
> >
> > NAME          PORTS  RUNNING  REFRESHED   AGE  PLACEMENT
> > osd.osd_spec         504/525  <deleting>  12m  label:osd
> > root@ceph01:/# ceph orch rm osd.osd_spec
> > Removed service osd.osd_spec
> >
> > From active monitor:
> >
> > debug 2021-05-06T23:14:48.909+0000 7f17d310b700 0
> > log_channel(cephadm) log [INF] : Remove service osd.osd_spec
> >
> > Yet in ls, it's still there, same as above. --export on it:
> >
> > root@ceph01:/# ceph orch ls osd.osd_spec --export
> > service_type: osd
> > service_id: osd_spec
> > service_name: osd.osd_spec
> > placement: {}
> > unmanaged: true
> > spec:
> >   filter_logic: AND
> >   objectstore: bluestore
> >
> > We've tried --force, as well, with no luck.
> >
> > To be clear, the --export even prior to delete looks nothing like the
> > actual service specification we're using, even after I re-apply it, so
> > something seems 'bugged'. Here's the OSD specification we're applying:
> >
> > service_type: osd
> > service_id: osd_spec
> > placement:
> >   label: "osd"
> > data_devices:
> >   rotational: 1
> > db_devices:
> >   rotational: 0
> > db_slots: 12
> >
> > I would appreciate any insight into how to clear this up (without
> > removing the actual OSDs, we're just wanting to apply the updated
> > service specification - we used to use host placement rules and are
> > switching to label-based).
> >
> > Thanks,
> > David
> >
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>



