Tim,

Actually, the links that Eugen shared earlier were sufficient. I ended up with

service_type: osd
service_name: osd
placement:
  host_pattern: 'ceph01'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0

This worked exactly right as far as creating the OSD - it found and reused the same OSD number that was previously destroyed, and also recreated the WAL/DB LV using the 'blank spot' on the NVMe drive.

However, I'm a bit concerned that the output of 'ceph orch ls osd' has changed in a way that might not be quite right:

NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
osd              32   3m ago     52m  ceph01

Before all of this started, this line used to contain the word 'unmanaged' somewhere. Eugen and I were having a side discussion about how to make all of my OSDs managed without destroying them, so that I could do things like 'ceph orch restart osd' to restart all of the OSDs and ensure that they pick up changes to attributes like osd_memory_target and osd_memory_target_autotune.

So, in applying this spec, did I make all my OSDs managed, or just the ones on ceph01, or just the one that got created when I applied the spec? When I add my next host, should I change the placement to that host name or to '*'?

More generally, is there a higher-level document that talks about Ceph spec files and the orchestrator - something that deals with the general concepts?

Thanks.

-Dave

--
Dave Hall
Binghamton University
kdhall@xxxxxxxxxxxxxx


On Fri, Nov 1, 2024 at 1:40 PM Tim Holloway <timh@xxxxxxxxxxxxx> wrote:

> I can't offer a spec off the cuff, but if the LV still exists and you don't need to change its size, then I'd zap it to remove residual Ceph info, because otherwise the operation will complain and fail.
>
> Having done that, the requirements should be the same as a first-time construction of an OSD on that LV. Eugen can likely give you the spec info. I'd have to RTFM.
>
> Tim
>
>
> On 11/1/24 11:22, Dave Hall wrote:
> > Tim, Eugen,
> >
> > So what would a spec file look like for a single OSD that uses a specific HDD (/dev/sdi), with WAL/DB on an LV that's 25% of a specific NVMe drive? Regarding the NVMe, there are 3 other OSDs already using 25% each of this NVMe for WAL/DB, but I have removed the LV that was used by the failed OSD. Do I need to pre-create the LV, or will 'ceph orch' do that for me?
> >
> > Thanks.
> >
> > -Dave
> >
> > --
> > Dave Hall
> > Binghamton University
> > kdhall@xxxxxxxxxxxxxx
> >
> > On Thu, Oct 31, 2024 at 3:52 PM Tim Holloway <timh@xxxxxxxxxxxxx> wrote:
> >
> >> I migrated from gluster when I found out it's going unsupported shortly. I'm really not big enough for Ceph proper, but there were only so many supported distributed filesystems with triple redundancy.
> >>
> >> Where I got into trouble was that I started off with Octopus, and Octopus had some teething pains - like stalling scheduled operations until the system was clean, but the only way to get a clean system was to run the stalled operations. Pacific cured that for me.
> >>
> >> But the docs were and remain somewhat fractured between legacy and managed services, and I managed to get into a real mess there, especially since I was wildly trying anything to get those stalled fixes to take.
> >>
> >> Since then, I've pretty much redefined all my OSDs with fewer but larger datastores and made them all managed. Now if I could just persuade the auto-tuner to fix the PG sizes...
> >>
> >> I'm in the process of opening a ticket account right now.
> >> The fun part of this is that realistically, older docs need a re-write just as much as the docs for the current release.
> >>
> >> Tim
> >>
> >> On 10/31/24 15:39, Eugen Block wrote:
> >>> I completely understand your point of view. Our own main cluster is also a bit "wild" in its OSD layout, that's why its OSDs are "unmanaged" as well. When we adopted it via cephadm, I started to create suitable osd specs for all those hosts and OSDs and I gave up. :-D But since we sometimes also tend to experiment a bit, I'd rather have full control over it. That's why we also have osd_crush_initial_weight = 0, to check the OSD creation before letting Ceph remap any PGs.
> >>>
> >>> It definitely couldn't hurt to clarify the docs; you can always report on tracker.ceph.com if you have any improvement ideas.
> >>>
> >>> Quoting Tim Holloway <timh@xxxxxxxxxxxxx>:
> >>>
> >>>> I have been slowly migrating towards spec files, as I prefer declarative management as a rule.
> >>>>
> >>>> However, I think that we may have a dichotomy in the user base.
> >>>>
> >>>> On the one hand, users with dozens/hundreds of servers/drives of basically identical character.
> >>>>
> >>>> On the other, I'm one who's running fewer servers, and for historical reasons they tend to be wildly individualistic and often have blocks of future-use space reserved for non-ceph storage.
> >>>>
> >>>> Ceph, left to its own devices (no pun intended), can be quite enthusiastic about adopting any storage it can find. And that's great for users in the first category, which is what the spec information in the supplied links is emphasizing. But for us lesser creatures who feel the need to manually control where each OSD goes and how it's configured, it's not so simple. I'm fairly certain that there's documentation on the spec file setup for that sort of stuff in the online docs, but it's located somewhere else and I cannot recall where.
> >>>>
> >>>> At any rate, I would consider it very important that the documentation for the different ways to set up an OSD explicitly indicate which type of OSD will be generated.
> >>>>
> >>>> Tim
> >>>>
> >>>>
> >>>> On 10/31/24 14:28, Eugen Block wrote:
> >>>>> Hi,
> >>>>>
> >>>>> the preferred method to deploy OSDs in cephadm-managed clusters is spec files, see this part of the docs [0] for more information. I would just not use the '--all-available-devices' flag, except in test clusters, or if you're really sure that this is what you want.
> >>>>>
> >>>>> If you use 'ceph orch daemon add osd ...', you'll end up with one (or more) OSD(s), but they will be unmanaged, as you already noted in your own cluster. There are a couple of examples with advanced specs (e.g. DB/WAL on dedicated devices) in the docs as well [1]. So my recommendation would be to have a suitable spec file for your disk layout.
> >>>>> You can always check with the '--dry-run' flag before actually applying it:
> >>>>>
> >>>>> ceph orch apply -i osd-spec.yaml --dry-run
> >>>>>
> >>>>> Regards,
> >>>>> Eugen
> >>>>>
> >>>>> [0] https://docs.ceph.com/en/latest/cephadm/services/osd/#deploy-osds
> >>>>> [1] https://docs.ceph.com/en/latest/cephadm/services/osd/#advanced-osd-service-specifications
> >>>>>
> >>>>> Quoting Tim Holloway <timh@xxxxxxxxxxxxx>:
> >>>>>
> >>>>>> As I understand it, the manual OSD setup is only for legacy (non-container) OSDs. Directory locations are wrong for managed (containerized) OSDs, for one.
> >>>>>>
> >>>>>> Actually, the whole manual setup docs ought to be moved out of the mainline documentation. In their present arrangement, they make legacy setup sound like the preferred method. And have you noticed that there is no corresponding well-marked section titled "Automated (cephadm) setup"?
> >>>>>>
> >>>>>> This is how we end up with both a legacy AND an administered setup for the same OSD, since at last count there are no interlocks within Ceph to prevent such a mess.
> >>>>>>
> >>>>>> Tim
> >>>>>>
> >>>>>> On 10/31/24 13:39, Dave Hall wrote:
> >>>>>>> Hello.
> >>>>>>>
> >>>>>>> Sorry if it appears that I am reposting the same issue under a different topic. However, I feel that the problem has moved and I now have different questions.
> >>>>>>>
> >>>>>>> At this point I have, I believe, removed all traces of OSD.12 from my cluster - based on steps in the Reef docs at https://docs.ceph.com/en/reef/rados/operations/add-or-rm-osds/#. I have further located and removed the WAL/DB LV on an associated NVMe drive (shared with 3 other OSDs).
> >>>>>>>
> >>>>>>> I don't believe the instructions for replacing an OSD (ceph-volume lvm prepare) still apply, so I have been trying to work with the instructions under ADDING AN OSD (MANUAL).
> >>>>>>>
> >>>>>>> However, since my installation is containerized (Podman), it is unclear which steps should be issued on the host and which within 'cephadm shell'.
> >>>>>>>
> >>>>>>> There is also another ambiguity: in step 3 the instruction is to 'mkfs -t {fstype}' and then to 'mount -o user_xattr'. However, which fs type?
> >>>>>>>
> >>>>>>> After this, in step 4, the 'ceph-osd -i {osd-id} --mkfs --mkkey' command throws errors about the keyring file.
> >>>>>>>
> >>>>>>> So, are these the right instructions to be using in a containerized installation? Are there, in general, alternate documents for containerized installations?
> >>>>>>>
> >>>>>>> Lastly, the above-cited instructions don't say anything about the separate WAL/DB LV.
> >>>>>>>
> >>>>>>> Please advise.
> >>>>>>>
> >>>>>>> Thanks.
> >>>>>>>
> >>>>>>> -Dave
> >>>>>>>
> >>>>>>> --
> >>>>>>> Dave Hall
> >>>>>>> Binghamton University
> >>>>>>> kdhall@xxxxxxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
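
For reference, a minimal sketch of what such a spec might look like once it is meant to cover more than one host, assuming the same rotational/non-rotational device layout described above; the service_id value, the '*' pattern, and the 'unmanaged' line are illustrative assumptions, not settings confirmed anywhere in this thread:

service_type: osd
service_id: hdd_with_nvme_db   # illustrative name, not from the thread
placement:
  host_pattern: '*'            # or list specific hosts under 'hosts:' instead of a pattern
unmanaged: false               # true would stop the orchestrator from creating OSDs from this spec
spec:
  data_devices:
    rotational: 1              # rotational (HDD) devices become OSD data devices
  db_devices:
    rotational: 0              # non-rotational (NVMe/SSD) devices hold WAL/DB

As Eugen suggests above, 'ceph orch apply -i osd-spec.yaml --dry-run' previews what the orchestrator would do with a spec like this before anything is created.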