Re: Ceph-deploy for FreeBSD

On Fri, Apr 21, 2017 at 01:45:05PM +0000, Sage Weil wrote:
> On Fri, 21 Apr 2017, Fabian Grünbichler wrote:
> > On Fri, Apr 21, 2017 at 01:16:23PM +0000, Sage Weil wrote:
> > > On Fri, 21 Apr 2017, Fabian Grünbichler wrote:
> > > > On Thu, Apr 20, 2017 at 08:11:38PM +0200, Nathan Cutler wrote:
> > > > > Hi Willem:
> > > > > 
> > > > > It sounds like you are trying to use the sysvinit scripts? These have been
> > > > > unmaintained (presumably with lots of weeds growing up) since infernalis.
> > > > > Until now I have been assuming that all init systems other than systemd
> > > > > (sysvinit, upstart, etc.) are deprecated in Ceph.
> > > > 
> > > > Slightly OT, but AFAICT http://tracker.ceph.com/issues/18305 still
> > > > applies to the official ceph.com Kraken Debian packages, i.e., ceph-base
> > > > installs and activates the init.d script, which then races against the
> > > > (udev-activated) ceph-osd systemd units. If anything except systemd is
> > > > indeed deprecated, I wonder why the Debian packages (still) ship AND
> > > > activate both systemd units and Sys V init scripts?
> > > > 
> > > > (Note that the proposed fix probably does not apply as is anymore,
> > > > because ceph-disk and the systemd units have been changed in the
> > > > meantime).
> > > 
> > > I think the issue with Debian (generally) is that it "supports" multiple 
> > > init systems (sysvinit and systemd both), even though systemd is the one
> > > installed by default.  Which means we ship both the sysvinit script and
> > > the systemd unit files.
> > > 
> > > (There may very well be a bug in how we "activate" them, though!)
> > > 
> > 
> > that's why I reported the original issue - in general you never want to
> > have both a multi-daemon init script (like the "old" ceph one) and the
> > split-up systemd units that replace it active at the same time.
> > 
> > since systemd will generate a unit for every init script for which a
> > unit of the same name does not already exist, you either need to mask
> > the auto-generated unit (i.e., symlink it to /dev/null) or write a
> > replacement unit with the identical name (so the "ceph" init script
> > becomes the "ceph.service" unit). if you don't do that and your units
> > are named differently from your init script, both will be active (this
> > is not a Debianism, it is how the LSB generator in systemd is supposed
> > to work to ease the transition from Sys V init to systemd...).
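> > 
> > (to make that concrete - a minimal sketch, assuming the script is
> > installed as /etc/init.d/ceph; the first option is just:
> > 
> >     # mask the unit systemd generates from /etc/init.d/ceph, leaving
> >     # only the packaged ceph-osd@.service etc. active
> >     systemctl mask ceph.service
> > 
> > the second option means shipping a native ceph.service whose presence
> > keeps the LSB generator from creating one in the first place.)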
> > 
> > what exacerbates the issue in this case is that the systemd units + udev
> > actually don't completely replace the old init scripts, because some of
> > the udev events might have been processed before the system was fully
> > booted, and osds might not be properly activated on boot as a result.
> > 
> > hence my proposal to add a ceph.service that simply calls "ceph-disk
> > activate-all", which is AFAICT the only part of the init script that is
> > not covered by the current systemd units / udev rules.
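> > 
> > (a minimal sketch of what I have in mind - the ordering dependencies
> > are my assumption and would need checking against the real packaging:
> > 
> >     # /lib/systemd/system/ceph.service (hypothetical)
> >     # shadows the "ceph" init script and re-activates any OSDs whose
> >     # udev add events fired before the system was fully booted
> >     [Unit]
> >     Description=Ceph OSD activation fallback
> >     After=local-fs.target network-online.target
> > 
> >     [Service]
> >     Type=oneshot
> >     ExecStart=/usr/sbin/ceph-disk activate-all
> >     RemainAfterExit=yes
> > 
> >     [Install]
> >     WantedBy=multi-user.target
> > )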
> 
> This sounds reasonable to me.  It could also do nothing... IIRC the 
> ceph-disk activate-all was a workaround for racy/buggy udev interactions 
> that prevented all osds from starting on large boxes with lots of disks 
> (see below).  Given that we don't have such a workaround on systemd 
> anymore I'm not sure if it's still necessary or not.  (I guess it can't 
> hurt, though!)

I am sorry if that was not clear enough - the incompleteness I mentioned
in my previous mail is not theoretical: if I "systemctl mask
ceph.service" (which disables the init script) on a Debian Jessie /
Ceph Jewel based system, not all OSDs are activated on boot (in
fact, most of the time none are). The ceph-osd services are only
started if a udev add event for an OSD partition happens late enough in
the boot process (e.g., if I hot-unplug and replug the OSD disk, they
are correctly started). I last tested this around 10.2.5 (and have had
the "ceph-disk activate-all" ceph.service in place since), but it was
very reproducible.
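
(For reference, roughly the sequence I used to reproduce this - the
unit glob is just for inspection, and paths may differ per version:

    systemctl mask ceph.service        # disable the sysvinit fallback
    reboot
    systemctl list-units 'ceph-osd@*'  # few or no OSDs running
    ceph-disk activate-all             # start the missing ones by hand
)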

> 
> > the whole ceph service startup is pretty messy in general IMHO,
> > especially for OSDs where (IIRC?) udev rules are calling python scripts
> > which are starting systemd units which are in turn calling python
> > scripts with different parameters that end up starting systemd units
> > which actually start daemons (somewhere in there the mounting happens as
> > well...).
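> > 
> > (roughly, as far as I can trace it on Jewel - file and unit names may
> > differ between versions:
> > 
> >     udev "add" event for an OSD data partition
> >       -> udev rule runs "ceph-disk trigger" on the device
> >       -> which starts a ceph-disk@... instance unit for that device
> >       -> which runs "ceph-disk activate" and mounts the data dir
> >       -> which finally starts the matching ceph-osd@<id>.service
> > )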
> 
> Agreed.  The goal with using udev like this was to make it all hotplug, 
> but I'm not sure if any operators actually take advantage of this.  If 
> they don't, we could consider going back to something a bit less weird...
> 
> sage

I am undecided on this - the current state is far from elegant, but
OTOH, simply moving OSD disks between hosts and having them work OOTB is
a nice feature...




