Re: Recreate Destroyed OSD

Tim, Eugen,

So what would a spec file look like for a single OSD that uses a specific
HDD (/dev/sdi), with WAL/DB on an LV that's 25% of a specific NVMe drive?
Regarding the NVMe, three other OSDs are already using 25% each of it for
WAL/DB, but I have removed the LV that was used by the failed OSD.  Do I
need to pre-create the LV, or will 'ceph orch' do that for me?
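
Something like this is what I have in mind, though I'm only guessing at
the syntax; the hostname, the NVMe device name, and the size below are
just placeholders:

service_type: osd
service_id: osd_12_replacement
placement:
  hosts:
    - <osd-host>            # the host that holds /dev/sdi
spec:
  data_devices:
    paths:
      - /dev/sdi
  db_devices:
    paths:
      - /dev/nvme0n1        # guessing at the NVMe device name
  block_db_size: <size>     # roughly 25% of the NVMe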

Thanks.

-Dave

--
Dave Hall
Binghamton University
kdhall@xxxxxxxxxxxxxx

On Thu, Oct 31, 2024 at 3:52 PM Tim Holloway <timh@xxxxxxxxxxxxx> wrote:

> I migrated from Gluster when I found out it would shortly be going
> unsupported. I'm really not big enough for Ceph proper, but there were
> only so many supported distributed filesystems with triple redundancy.
>
> Where I got into trouble was that I started off with Octopus, and Octopus
> had some teething pains, like stalling scheduled operations until the
> system was clean, but the only way to get a clean system was to run the
> stalled operations. Pacific cured that for me.
>
> But the docs were, and remain, somewhat fractured between legacy and
> managed services, and I managed to get into a real mess there, especially
> since I was wildly trying anything to get those stalled fixes to take.
>
> Since then, I've pretty much redefined all my OSDs with fewer but larger
> datastores and made them all managed. Now if I could just persuade the
> auto-tuner to fix the PG sizes...
>
> I'm in the process of opening a tracker account right now. The fun part
> of this is that, realistically, the older docs need a rewrite just as
> much as the docs for the current release.
>
>     Tim
>
> On 10/31/24 15:39, Eugen Block wrote:
> > I completely understand your point of view. Our own main cluster is
> > also a bit "wild" in its OSD layout; that's why its OSDs are
> > "unmanaged" as well. When we adopted it via cephadm, I started to
> > create suitable OSD specs for all those hosts and OSDs, and I gave up.
> > :-D But since we sometimes also tend to experiment a bit, I'd rather
> > have full control over it. That's why we also set
> > osd_crush_initial_weight = 0, to check the OSD creation before letting
> > Ceph remap any PGs.
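> >
> > (For reference, that is just the usual config option plus a manual
> > reweight afterwards, roughly:
> >
> > ceph config set osd osd_crush_initial_weight 0
> > ceph osd crush reweight osd.<id> <weight>
> >
> > where the second command is run once the new OSD looks healthy, with
> > the weight normally being the device's capacity in TiB.)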
> >
> > It definitely couldn't hurt to clarify the docs; you can always file a
> > report on tracker.ceph.com if you have any improvement ideas.
> >
> > Quoting Tim Holloway <timh@xxxxxxxxxxxxx>:
> >
> >> I have been slowly migrating towards spec files as I prefer
> >> declarative management as a rule.
> >>
> >> However, I think that we may have a dichotomy in the user base.
> >>
> >> On the one hand, there are users with dozens or hundreds of
> >> servers/drives of basically identical character.
> >>
> >> On the other, I'm one who's running fewer servers, and for historical
> >> reasons they tend to be wildly individualistic and often have blocks
> >> of future-use space reserved for non-Ceph storage.
> >>
> >> Ceph, left to its own devices (no pun intended), can be quite
> >> enthusiastic about adopting any storage it can find. And that's great
> >> for users in the first category, which is what the spec information
> >> in the supplied links emphasizes. But for us lesser creatures who
> >> feel the need to manually control where each OSD goes and how it's
> >> configured, it's not so simple. I'm fairly certain that there's
> >> documentation on the spec file setup for that sort of thing in the
> >> online docs, but it's located somewhere else and I cannot recall where.
> >>
> >> At any rate, I would consider it very important that the documentation
> >> for the different ways to set up an OSD explicitly indicate which type
> >> of OSD will be generated.
> >>
> >>    Tim
> >>
> >>
> >> On 10/31/24 14:28, Eugen Block wrote:
> >>> Hi,
> >>>
> >>> the preferred method to deploy OSDs in cephadm-managed clusters is
> >>> spec files; see this part of the docs [0] for more information. I
> >>> would just not use the '--all-available-devices' flag, except in
> >>> test clusters, or if you're really sure that this is what you want.
> >>>
> >>> If you use 'ceph orch daemon add osd ...', you'll end up with one
> >>> (or more) OSD(s), but they will be unmanaged, as you already noted
> >>> in your own cluster. There are a couple of examples with advanced
> >>> specs (e.g. DB/WAL on dedicated devices) in the docs as well [1].
> >>> So my recommendation would be to have a suitable spec file for your
> >>> disk layout. You can always check with the '--dry-run' flag before
> >>> actually applying it:
> >>>
> >>> ceph orch apply -i osd-spec.yaml --dry-run
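> >>>
> >>> For instance, a minimal spec with DB/WAL on dedicated devices could
> >>> look roughly like this (the service_id and the filters are just
> >>> placeholders, adapt them to your layout):
> >>>
> >>> service_type: osd
> >>> service_id: osd_hdd_with_flash_db
> >>> placement:
> >>>   host_pattern: '*'
> >>> spec:
> >>>   data_devices:
> >>>     rotational: 1
> >>>   db_devices:
> >>>     rotational: 0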
> >>>
> >>> Regards,
> >>> Eugen
> >>>
> >>> [0] https://docs.ceph.com/en/latest/cephadm/services/osd/#deploy-osds
> >>> [1]
> >>>
> https://docs.ceph.com/en/latest/cephadm/services/osd/#advanced-osd-service-specifications
> >>>
> >>> Quoting Tim Holloway <timh@xxxxxxxxxxxxx>:
> >>>
> >>>> As I understand it, the manual OSD setup is only for legacy
> >>>> (non-container) OSDs. Directory locations are wrong for managed
> >>>> (containerized) OSDs, for one.
> >>>>
> >>>> Actually, the whole manual setup docs ought to be moved out of the
> >>>> mainline documentation. In their present arrangement, they make
> >>>> legacy setup sound like the preferred method. And have you noticed
> >>>> that there is no corresponding well-marked section titled
> >>>> "Automated (cephadm) setup"?
> >>>>
> >>>> This is how we end up with OSDs that are simultaneously legacy AND
> >>>> managed, since at last count there are no interlocks within Ceph to
> >>>> prevent such a mess.
> >>>>
> >>>>    Tim
> >>>>
> >>>> On 10/31/24 13:39, Dave Hall wrote:
> >>>>> Hello.
> >>>>>
> >>>>> Sorry if it appears that I am reposting the same issue under a
> >>>>> different
> >>>>> topic.  However, I feel that the problem has moved and I now have
> >>>>> different
> >>>>> questions.
> >>>>>
> >>>>> At this point I have, I believe, removed all traces of OSD.12 from my
> >>>>> cluster - based on steps in the Reef docs at
> >>>>> https://docs.ceph.com/en/reef/rados/operations/add-or-rm-osds/#.
> >>>>> I have
> >>>>> further located and removed the WAL/DB LV on an associated NVMe drive
> >>>>> (shared with 3 other OSDs).
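> >>>>>
> >>>>> (For reference, the removal sequence I followed was roughly the one
> >>>>> from that page; reconstructed from memory, it was approximately
> >>>>>
> >>>>>    ceph osd out 12
> >>>>>    ceph orch daemon stop osd.12
> >>>>>    ceph osd purge 12 --yes-i-really-mean-it
> >>>>>    lvremove <vg>/<db-lv-for-osd-12>
> >>>>>
> >>>>> so if something in that sequence looks wrong, that may be part of my
> >>>>> problem.)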
> >>>>>
> >>>>> I don't believe the instructions for replacing an OSD (ceph-volume
> >>>>> lvm
> >>>>> prepare) still apply, so I have been trying to work with the
> >>>>> instructions
> >>>>> under ADDING AN OSD (MANUAL).
> >>>>>
> >>>>> However, since my installation is containerized (Podman), it is
> >>>>> unclear
> >>>>> which steps should be issued on the host and which within 'cephadm
> >>>>> shell'.
> >>>>>
> >>>>> There is another ambiguity:  In step 3 the instruction is to 'mkfs -t
> >>>>> {fstype}' and then to 'mount -o user_xattr'.  However, which fs type?
> >>>>>
> >>>>> After this, in step 4, the 'ceph-osd -i {osd-id} --mkfs --mkkey'
> >>>>> command throws errors about the keyring file.
> >>>>>
> >>>>> So, are these the right instructions to be using in a containerized
> >>>>> installation?  Are there, in general, alternate documents for
> >>>>> containerized
> >>>>> installations?
> >>>>>
> >>>>> Lastly, the above-cited instructions don't say anything about the
> >>>>> separate WAL/DB LV.
> >>>>>
> >>>>> Please advise.
> >>>>>
> >>>>> Thanks.
> >>>>>
> >>>>> -Dave
> >>>>>
> >>>>> --
> >>>>> Dave Hall
> >>>>> Binghamton University
> >>>>> kdhall@xxxxxxxxxxxxxx
> >>>
> >>>
> >
> >
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



