Re: Jewel -> Luminous upgrade, package install stopped all daemons


Well OK now.

Before we go setting off the fire alarms all over town, let's work out what is
happening, and why. I spent some time reproducing this and it is indeed tied to
selinux being (at least) permissive. It does not happen when selinux is
disabled.

If we look at the journalctl output in the OP we see that yum reports ceph-base
installed successfully, and only after that do the ceph daemons start shutting
down. Then yum reports that the ceph-selinux package has been installed, so a
closer look at that package appears warranted.

# rpm -q --scripts ceph-selinux|head -28
postinstall scriptlet (using /bin/sh):
# backup file_contexts before update
. /etc/selinux/config
FILE_CONTEXT=/etc/selinux/${SELINUXTYPE}/contexts/files/file_contexts
cp ${FILE_CONTEXT} ${FILE_CONTEXT}.pre

# Install the policy
/usr/sbin/semodule -i /usr/share/selinux/packages/ceph.pp

# Load the policy if SELinux is enabled
if ! /usr/sbin/selinuxenabled; then
    # Do not relabel if selinux is not enabled
    exit 0
fi

if diff ${FILE_CONTEXT} ${FILE_CONTEXT}.pre > /dev/null 2>&1; then
   # Do not relabel if file contexts did not change
   exit 0
fi

# Check whether the daemons are running
/usr/bin/systemctl status ceph.target > /dev/null 2>&1
STATUS=$?

# Stop the daemons if they were running
if test $STATUS -eq 0; then
    /usr/bin/systemctl stop ceph.target > /dev/null 2>&1
fi

Note that if selinux is disabled, or the file contexts did not change, we do
nothing. But if selinux is enabled, the file contexts changed, and the ceph
daemons are running, we stop them. That's this section here:

https://github.com/ceph/ceph/blob/28c8e8953c39893978137285a0577cf8c01ebc19/ceph.spec.in#L1671

Note the same thing will happen if you uninstall that package.

https://github.com/ceph/ceph/blob/28c8e8953c39893978137285a0577cf8c01ebc19/ceph.spec.in#L1740
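
For anyone who wants to know ahead of time whether a node is exposed, the
first and last checks of the scriptlet can be repeated by hand before running
yum. What follows is only a rough sketch: the function name and its two
optional arguments are made up for illustration (the arguments exist purely so
the checks can be stubbed out), and it cannot predict whether file_contexts
will change, so "maybe" is the worst case rather than a certainty.

```shell
#!/bin/sh
# Hypothetical pre-flight helper, not part of any ceph package.
# Mirrors two checks from the ceph-selinux postinstall scriptlet:
#   1. is selinux enabled?  2. is ceph.target currently active?
# It cannot know in advance whether file_contexts will differ.
will_scriptlet_stop_daemons() {
    selinux_cmd=${1:-/usr/sbin/selinuxenabled}   # overridable for testing
    systemctl_cmd=${2:-/usr/bin/systemctl}       # overridable for testing

    if ! "$selinux_cmd"; then
        echo "no: selinux is disabled"
        return 1
    fi

    if ! "$systemctl_cmd" status ceph.target > /dev/null 2>&1; then
        echo "no: ceph.target is not active"
        return 1
    fi

    echo "maybe: daemons stop if file_contexts changes"
    return 0
}
```

Calling will_scriptlet_stop_daemons with no arguments on the node about to be
upgraded uses the real selinuxenabled and systemctl binaries.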

Now, given this code has been there for a considerable amount of time more or
less unaltered, I'd say it hasn't been *extensively tested* in the wild. It's
likely the solution here is something similar to the
CEPH_AUTO_RESTART_ON_UPGRADE solution, but I'll leave it to those who understand
the selinux implications better than I do to critique the solution. If
everyone's happy that this is the actual issue we are seeing and that we need a
bug opened for it, I'll open a tracker for it tomorrow and we can start moving
towards a solution.
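
To make the CEPH_AUTO_RESTART_ON_UPGRADE idea a bit more concrete, here is a
minimal sketch of what such a guard could look like. This is not a proposed
patch: the function name is invented for illustration, and the default path
/etc/sysconfig/ceph is my assumption about where the variable would be set on
an RPM-based system. Whether skipping the stop is safe with respect to
relabelling is exactly the selinux question I'm leaving to others.

```shell
#!/bin/sh
# Hypothetical guard, for discussion only.
# Returns 0 (true) only if the admin has opted in to automatic
# stop/restart by setting CEPH_AUTO_RESTART_ON_UPGRADE=yes.
should_touch_daemons() {
    sysconf=${1:-/etc/sysconfig/ceph}   # assumed location of the setting

    # Default to the conservative answer: leave the daemons alone.
    CEPH_AUTO_RESTART_ON_UPGRADE=no

    # Pick up the admin's choice if the config file is present and readable.
    if [ -f "$sysconf" ] && [ -r "$sysconf" ]; then
        . "$sysconf"
    fi

    [ "$CEPH_AUTO_RESTART_ON_UPGRADE" = "yes" ]
}
```

A scriptlet using this would wrap its "systemctl stop ceph.target" in
"if should_touch_daemons; then ... fi", so nothing is stopped unless the admin
asked for it.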

On Sun, Sep 17, 2017 at 9:05 AM, Matthias Ferdinand <mf+ml.ceph@xxxxxxxxx> wrote:
>> On Fri, Sep 15, 2017 at 3:49 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>> > On Fri, Sep 15, 2017 at 3:34 PM David Turner <drakonstein@xxxxxxxxx> wrote:
>> >>
>> >> I don't understand a single use case where I want updating my packages
>> >> using yum, apt, etc to restart a ceph daemon.  ESPECIALLY when there are so
>> >> many clusters out there with multiple types of daemons running on the same
>> >> server.
>> >>
>> >> My home setup is 3 nodes each running 3 OSDs, a MON, and an MDS server.
>> >> If upgrading the packages restarts all of those daemons at once, then I'm
>> >> mixing MON versions, OSD versions and MDS versions every time I upgrade my
>> >> cluster.  It removes my ability to methodically upgrade my MONs, OSDs, and
>> >> then clients.
>> I think the choice one makes with small cluster is the upgrade is
>> going to be disruptive, but for the large redundant cluster
>> it is better that the upgrade do the *full* job for better user
>
> Hi, if upgrades on small clusters are _supposed_ to be disruptive, that
> should be documented very prominently, including the minimum
> requirements to be met for an update to _not_ be disruptive. Smooth
> upgrade experience is probably more important for small clusters.
> Larger installations will have less of a tendency to colocate different
> daemon types and will have deployment/management tools with all the
> necessary bells and whistles. If a ceph cluster has only a few machines
> that does not always mean it can afford downtime.
>
> On Ubuntu/Debian systems, you could create a script at
> /usr/sbin/policy-rc.d with return code 101 to suppress all
> start/stop/restart actions at install time:
>     http://blog.zugschlus.de/archives/974-Debians-Policy-rc.d-infrastructure-explained.html
> Remember to remove it afterwards :-)
>
> Don't know about RPM-based systems.
>
> Regards
> Matthias
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad


