Re: Jewel -> Luminous upgrade, package install stopped all daemons

On Fri, Sep 15, 2017 at 3:49 PM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> On Fri, Sep 15, 2017 at 3:34 PM David Turner <drakonstein@xxxxxxxxx> wrote:
>>
>> I can't think of a single use case where I would want updating my packages
>> with yum, apt, etc. to restart a Ceph daemon.  ESPECIALLY when there are so
>> many clusters out there with multiple types of daemons running on the same
>> server.
>>
>> My home setup is 3 nodes each running 3 OSDs, a MON, and an MDS server.
>> If upgrading the packages restarts all of those daemons at once, then I'm
>> mixing MON versions, OSD versions and MDS versions every time I upgrade my
>> cluster.  It removes my ability to methodically upgrade my MONs, OSDs, and
>> then clients.
I think the choice one makes with a small cluster is that the upgrade is
going to be disruptive, but for a large redundant cluster it is better
that the upgrade do the *full* job for a better user experience: once the
decision to upgrade is made, there is no point in running the old version
of a daemon. I am also not sure how the format changes in a major version
would actually affect files already upgraded on a system still running the
old daemon; maybe we just haven't hit the corner cases yet?
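
If it helps, you can at least see how mixed the daemons actually are while
an upgrade is in flight. A rough sketch from memory; the exact commands
and output vary a bit by release:

  # Report the version each OSD binary is actually running:
  ceph tell osd.* version

  # Ask the local mon over its admin socket:
  ceph daemon mon.$(hostname -s) version

  # Luminous and later also add a cluster-wide version summary:
  ceph versions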

>>
>> Now let's take the Luminous upgrade which REQUIRES you to upgrade all of
>> your MONs before anything else... I'm screwed.  I literally can't perform
>> the upgrade if it's going to restart all of my daemons because it is
>> impossible for me to achieve a paxos quorum of MONs running the Luminous
>> binaries BEFORE I upgrade any other daemon in the cluster.  The only way to
>> achieve that is to stop the entire cluster and every daemon,

Again, for a small physical-node cluster with colocated daemons, that's
the compromise; one could use VMs inside the cluster to separate out the
upgrade process, with some tradeoffs.

>> upgrade all of
>> the packages, then start the mons, then start the rest of the cluster
>> again... There is no way that is a desired behavior.
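
For what it's worth, once installs and restarts are decoupled, the
mon-first ordering is straightforward to do by hand. A minimal sketch,
assuming a systemd install with the stock ceph-mon@ / ceph-osd.target
units (unit names vary by distro):

  # After packages are updated everywhere, restart mons one at a time:
  systemctl restart ceph-mon@$(hostname -s)
  ceph mon stat                  # wait for it to rejoin quorum first

  # Only when *all* mons run Luminous, restart OSDs, one node at a time:
  ceph osd set noout             # avoid rebalancing while OSDs bounce
  systemctl restart ceph-osd.target
  ceph -s                        # wait for PGs to go active+clean
  ceph osd unset noout           # once the whole cluster is done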
>>
>> All of this is ignoring large clusters using something like Puppet to
>> manage their package versions.  I want to just be able to update the ceph
>> version and push that out to the cluster.  It will install the new packages
>> to the entire cluster and then my automated scripts can perform a rolling
>> restart of the cluster upgrading all of the daemons while ensuring that the
>> cluster is healthy every step of the way.  I don't want to add the time
>> of installing the packages on every node DURING the upgrade.  I want that
>> done before I initiate my script, so the cluster is in a mixed-version
>> state for as little time as possible.
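
That health-gated rolling restart is simple enough to express. A rough
sketch with placeholder hostnames; a real script would also want timeouts
and per-PG state checks:

  for host in node1 node2 node3; do    # placeholder hostnames
      ssh "$host" systemctl restart ceph-osd.target
      # block until the cluster is healthy before touching the next node
      until ceph health | grep -q HEALTH_OK; do
          sleep 10
      done
  done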
>>
>> Claiming that anything other than an explicitly issued restart command
>> restarting a Ceph daemon is anything but a bug sounds crazy to me.  I
>> don't ever want anything restarting my Ceph daemons unless it is
>> explicitly told to do so.  That just sounds like it's begging to put my
>> entire cluster into a world of hurt by accidentally restarting too many
>> daemons at the same time and making the data in my cluster inaccessible.
>>
>> I'm used to the Ubuntu side of things.  I've never seen upgrading the Ceph
>> packages to ever affect a daemon before.  If that's actually a thing that is
>> done on purpose in RHEL and CentOS... good riddance! That's ridiculous!
>
>
> I don't know what the settings are right now, or what the latest argument
> was to get them there.
>
> But we *have* had distributions require us to make changes to come into
> compliance with their packaging policies.
> Some users *do* want their daemons to automatically reboot on upgrade,
> because if you have segregated nodes that you're managing by hand, it's a
> lot easier to issue one command than two.
> And on and on and on.
>
> Personally, I tend closer to your position. But this is a thing that some
> people get very vocal about; we don't have a lot of upstream people
> interested in maintaining packaging or fighting with other interest groups
> who say we're doing it wrong; and it's just not a lot of fun to deal with.
>
> Looking through the git logs, I think CEPH_AUTO_RESTART_ON_UPGRADE was
> probably added so distros could easily make that distinction. And it would
> not surprise me if the use of selinux required restarts — upgrading packages
> tends to change what the daemon's selinux policy allows it to do, and if
> they have different behavior I presume selinux is going to complain
> wildly...
> -Greg
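
For reference, the knob Greg mentions is consulted from sysconfig by the
package scriptlets; roughly this shape (a sketch from memory, not the
literal spec file):

  # /etc/sysconfig/ceph (Debian builds read /etc/default/ceph):
  #   CEPH_AUTO_RESTART_ON_UPGRADE=no
  [ -f /etc/sysconfig/ceph ] && . /etc/sysconfig/ceph
  if [ "$CEPH_AUTO_RESTART_ON_UPGRADE" = "yes" ]; then
      systemctl try-restart ceph.target
  fi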
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



