Hi Bob,

have you tried restarting the active mgr? Sometimes the mgr gets stuck and
prevents the orchestrator from working correctly.

Regarding the orchestrator device scan: have a look at the ceph-volume.log on
the corresponding host. You will find it under
/var/log/ceph/CLUSTER-ID/ceph-volume.log; this log is generated by the
orchestrator's device scan.

It may also help to have a look at the cephadm debug logs - see
https://docs.ceph.com/en/latest/cephadm/operations/#watching-cephadm-log-messages
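As a rough sketch of what I mean (the log-level commands are from that docs
page; on recent releases `ceph mgr fail` without an argument fails over the
active mgr, on older ones pass its name):

# ceph mgr fail
# ceph config set mgr mgr/cephadm/log_to_cluster_level debug
# ceph -W cephadm --watch-debug

Then trigger a scan from a second shell with `ceph orch device ls --refresh`
and watch what the orchestrator logs. Once you are done debugging, reset the
log level:

# ceph config rm mgr mgr/cephadm/log_to_cluster_level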
Cheers,
tobi

On Wed, Oct 23, 2024 at 20:15 Bob Gibson <rjg@xxxxxxxxxx> wrote:

> Sorry to resurrect this thread, but while I was able to get the cluster
> healthy again by manually creating the osd, I'm still unable to manage
> osds using the orchestrator.
>
> The orchestrator is generally working, but it appears to be unable to scan
> devices. Immediately after failing over the mgr, `ceph orch device ls`
> will display device status from more than 4 weeks ago, which was when we
> converted the cluster to be managed by cephadm. Eventually the
> orchestrator will attempt to refresh its device status. At this point
> `ceph orch device ls` stops displaying any output at all. I can reproduce
> this state almost immediately by running `ceph orch device ls --refresh`
> to force an immediate refresh. The mgr log shows events like the following
> just before `ceph orch device ls` stops reporting output (one event for
> every osd node in the cluster):
>
> "Detected new or changed devices on ceph-osd31"
>
> Here are the osd services in play:
>
> # ceph orch ls osd
> NAME            PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
> osd                        95   8m ago     -    <unmanaged>
> osd.ceph-osd31              4   8m ago     6d   ceph-osd31
>
> # ceph orch ls osd --export
> service_type: osd
> service_name: osd
> unmanaged: true
> spec:
>   filter_logic: AND
>   objectstore: bluestore
> ---
> service_type: osd
> service_id: ceph-osd31
> service_name: osd.ceph-osd31
> placement:
>   hosts:
>   - ceph-osd31
> spec:
>   data_devices:
>     rotational: 0
>     size: '3TB:'
>   encrypted: true
>   filter_logic: AND
>   objectstore: bluestore
>
> I tried deleting the default "osd" service in case it was somehow
> conflicting with my per-node spec, but it looks like that's not allowed,
> so I assume any custom osd service specs override the unmanaged default.
>
> # ceph orch rm osd
> Invalid service 'osd'. Use 'ceph orch ls' to list available services.
>
> My hunch is that some persistent state is corrupted, or that something
> else is preventing the orchestrator from successfully refreshing its
> device status, but I don't know how to troubleshoot this. Any ideas?
>
> Cheers,
> /rjg
>
> P.S. @Eugen: When I first started this thread you said it was unnecessary
> to destroy an osd to convert it from unmanaged to managed. Can you explain
> how this is done? Although we want to recreate the osds to enable
> encryption, it would save time and spare the SSDs unnecessary wear while
> troubleshooting.
>
> On Oct 16, 2024, at 2:45 PM, Eugen Block <eblock@xxxxxx> wrote:
>
> Glad to hear it worked out for you!
>
> Quoting Bob Gibson <rjg@xxxxxxxxxx>:
>
> I've been away on vacation and just got back to this. I'm happy to
> report that manually recreating the OSD with ceph-volume and then
> adopting it with cephadm fixed the problem.
>
> Thanks again for your help, Eugen!
>
> Cheers,
> /rjg
>
> On Sep 29, 2024, at 10:40 AM, Eugen Block <eblock@xxxxxx> wrote:
>
> Okay, apparently this is not what I was facing. I see two other
> options right now. The first would be to purge osd.88 from the crush
> tree entirely. The second would be to create an osd manually with plain
> ceph-volume (not cephadm ceph-volume), giving you a legacy osd (you'd
> get warnings about a stray daemon). If that works, adopt the osd with
> cephadm.
> I don't have a better idea right now.

--
Best Regards,
Tobias Fischer
Head of Ceph

Clyso GmbH
p: +49 89 2152527 41
a: Hohenzollernstraße 27 | 80801 München | Germany
w: https://clyso.com | e: tobias.fischer@xxxxxxxxx

We are hiring: https://www.clyso.com/jobs/

---
Managing Director: Dipl. Inf. (FH) Joachim Kraftmayer
Registered office: Utting am Ammersee
Commercial register (Amtsgericht Augsburg): HRB 25866
VAT ID: DE275430677

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx