Re: Ceph orchestrator not refreshing device list

You're right about deleting the service, of course. I wasn't very clear in my statement; what I actually meant was that it won't be removed entirely until all OSDs report a different spec in their unit.meta file. I forgot to add that info in my last response; that's actually how I've done it several times after adopting a cluster. Thanks for clearing that up! :-)
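
In case it helps, this is roughly how I check which specs the OSDs on a host still reference (a quick sketch; it assumes unit.meta is the JSON file cephadm writes per daemon, with a service_name key, which is what we've been editing in this thread):

$ grep -h '"service_name"' /var/lib/ceph/$(ceph fsid)/osd.*/unit.meta | sort | uniq -c   <--- counts OSDs per service_name on this host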

Quoting Frédéric Nass <frederic.nass@xxxxxxxxxxxxxxxx>:

----- On 25 Oct 24, at 18:21, Frédéric Nass frederic.nass@xxxxxxxxxxxxxxxx wrote:

----- On 25 Oct 24, at 16:31, Bob Gibson rjg@xxxxxxxxxx wrote:

Hi Frédéric,

I think this message shows up because this very specific post-adoption 'osd' service has already been marked as 'deleted', maybe when you ran the command for the first time.
The only reason it still shows up in 'ceph orch ls' output is that 95 OSDs are still referencing this service in their configuration.

Once you have edited all OSDs' /var/lib/ceph/$(ceph fsid)/osd.xxx/unit.meta files (changed their service_name) and restarted all OSDs (or recreated these 95 OSDs, encrypted, under another service_name), the 'osd' service will disappear by itself and won't show up anymore in 'ceph orch ls' output. At least this is what I've observed in the past.
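
For what it's worth, a loop like this should retarget all OSDs on a host in one pass (an untested sketch; 'osd.standard' is just an example target service_name, and the exact JSON spacing in unit.meta may differ, so check with grep first):

$ for meta in /var/lib/ceph/$(ceph fsid)/osd.*/unit.meta; do
    sed -i 's/"service_name": "osd"/"service_name": "osd.standard"/' "$meta"   <--- retarget every OSD on this host
  done
$ ceph orch ps --daemon-type osd   <--- then restart each daemon with 'ceph orch daemon restart osd.<id>'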

Yes, as Eugen pointed out, it doesn’t make sense to try to delete an unmanaged
service using the orchestrator.

Well, actually you **can** delete a service whatever its status (managed or unmanaged).

To explain a bit more, see below:

$ ceph orch ls --export osd osd.delete
service_type: osd
service_id: delete
service_name: osd.delete
placement:
  hosts:
  - test-mom02h01
unmanaged: true
spec:
  data_devices:
    size: ':11G'
  db_devices:
    size: '12G:'
  db_slots: 2
  filter_logic: AND
  objectstore: bluestore

This is what you should expect:

$ ceph orch rm osd.delete
Error EINVAL: If osd.delete is removed then the following OSDs will remain, --force to proceed anyway
	host test-mom02h01: osd.11

$ ceph orch rm osd.delete --force        <--- ok, let's force it
Removed service osd.delete

$ ceph orch ls | grep osd
osd.delete                                 1  95s ago    -    <unmanaged>   <--- still here because used by 1 OSD
osd.standard                              12  9m ago     8w   label:osds

$ ceph orch rm osd.delete
Invalid service 'osd.delete'. Use 'ceph orch ls' to list available services. <--- but not for the orchestrator

$ sed -i 's/osd.delete/osd.standard/g' /var/lib/ceph/$(ceph fsid)/osd.11/unit.meta <--- replace this service with osd.standard in osd.11's configuration

$ ceph orch daemon restart osd.11
Scheduled to restart osd.11 on host 'test-mom02h01'
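
(To double check that the daemon now references the new spec before expecting the old service to vanish, a simple grep will do:)

$ grep service_name /var/lib/ceph/$(ceph fsid)/osd.11/unit.meta   <--- should now show osd.standard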

$ ceph orch ls | grep osd
osd.standard                              13  8m ago     8w   label:osds   <--- osd.delete service finally gone

The osd.delete service is finally gone right after the last OSD stopped referencing it.

With the very specific post-adoption 'osd' service, when you try to delete it, it doesn't complain about existing OSDs referencing it (when it should...) and doesn't require the --force argument. It just deletes the service (which will eventually be removed once no more OSDs reference it).

The fact that the 'ceph orch rm' output is not consistent between deleting the post-adoption 'osd' service and deleting any other osd service that you create looks like a bug to me.

But anyway, that was just to say you can delete an osd service whatever its status (managed or unmanaged).

Cheers,
Frédéric.

It works just fine with any osd service other than this specific post-adoption
'osd' service. Don't know why.

Frédéric.


My hunch is that some persistent state is corrupted, or there’s something else preventing the orchestrator from successfully refreshing its device status, but
I don’t know how to troubleshoot this. Any ideas?

I don't think this is related to the 'osd' service. As suggested by Tobi,
enabling cephadm debug will tell you more.
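
For reference, these are the standard cephadm debug commands (remember to revert the log level once done); forcing a device rescan while watching the log might also be worth a try:

$ ceph config set mgr mgr/cephadm/log_to_cluster_level debug   <--- raise cephadm log level
$ ceph -W cephadm --watch-debug                                <--- follow cephadm events live
$ ceph log last 200 debug cephadm                              <--- or dump recent log entries
$ ceph orch device ls --refresh                                <--- force a device rescan
$ ceph config rm mgr mgr/cephadm/log_to_cluster_level          <--- back to defaults when done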

Agreed. I’ll dig through the logs some more today to see if I can spot any
problems.

Cheers,
/rjg


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



