In your baby-step upgrade you should skip the two non-LTS releases, Infernalis and Kraken, and go from Hammer to Jewel to Luminous.
The general rule for moving your OSDs to the ceph user was to not change the ownership as part of the upgrade itself. There is a config option [1] that tells Ceph which user each daemon should run as, so you can separate the two operations and keep each maintenance task simple: it sets the user to whoever owns that daemon's data directory.
[1]
setuser match path = /var/lib/ceph/$type/$cluster-$id
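A minimal sketch of how that might look in ceph.conf (the option is the one referenced above; placing it under [global] is an assumption, adjust to your layout):

```ini
[global]
# Run each daemon as whichever user owns its data directory. OSD
# directories still owned by root keep running as root, so the chown
# can happen later in its own maintenance window.
setuser match path = /var/lib/ceph/$type/$cluster-$id
```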
On Fri, Aug 17, 2018 at 10:53 AM Kees Meijs <kees@xxxxxxxx> wrote:
Hi Cephers,
For the last months (well... years actually) we were quite happy using Hammer. So far, there was no immediate cause implying an upgrade.
However, having seen Luminous providing support for BlueStore, it seemed like a good idea to perform some upgrade steps.
Doing baby steps, I wanted to upgrade from Hammer to Infernalis first, since all ownerships need to be changed because the daemons switch to an unprivileged user (good stuff!) instead of root.
So far, I've upgraded all monitors from Hammer (0.94.10) to Infernalis (9.2.1). All seemed well resulting in HEALTH_OK.
Then, I tried upgrading one OSD server using the following procedure:
- Alter APT sources to utilise Infernalis instead of Hammer.
- Update and upgrade the packages.
- Since I didn't want any rebalancing going on, I ran "ceph osd set noout" as well.
- Stop an OSD, run chown -R ceph:ceph /var/lib/ceph/osd/ceph-X, start the OSD again, and repeat for the next one.
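The per-OSD steps above could be sketched roughly like this (osd.0 used as an example; the init commands vary by distro and release, so the stop/start lines are assumptions to adapt to your setup):

```shell
# Avoid rebalancing while OSDs are briefly down.
ceph osd set noout

# Per OSD: stop it, fix ownership, start it again.
stop ceph-osd id=0                           # or: systemctl stop ceph-osd@0
chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
start ceph-osd id=0                          # or: systemctl start ceph-osd@0

# Once every OSD on the host is done:
ceph osd unset noout
```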
Maybe I acted too quickly (ehrm... didn't wait long enough), but at some point it seemed not all ownership was changed during the process. Meanwhile we were still HEALTH_OK, so I didn't really worry and fixed the left-overs using: find /var/lib/ceph -not -user ceph -exec chown ceph:ceph '{}' ';'
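A quick check for remaining left-overs before restarting an OSD could look like this (shown against a scratch directory and the current user; on a real node you would point it at /var/lib/ceph and use "-not -user ceph"):

```shell
# Count files not owned by the expected user; 0 means the chown is complete.
# A scratch directory stands in for /var/lib/ceph/osd/ceph-X here.
scratch=$(mktemp -d)
touch "$scratch/current" "$scratch/keyring"
leftovers=$(find "$scratch" -not -user "$(id -un)" | wc -l | tr -d ' ')
echo "files with wrong owner: $leftovers"
rm -rf "$scratch"
```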
It seemed to work well and two days passed without any issues.
But then... Deep scrubbing happened:
health HEALTH_ERR
1 pgs inconsistent
2 scrub errors

So far, I figured out the two scrubbing errors apply to the same OSD, being osd.0.
The log at the OSD shows:
2018-08-17 15:25:36.810866 7fa3c9e09700 0 log_channel(cluster) log [INF] : 3.72 deep-scrub starts
2018-08-17 15:25:37.221562 7fa3c7604700 -1 log_channel(cluster) log [ERR] : 3.72 soid -5/00000072/temp_3.72_0_16187756_3476/head: failed to pick suitable auth object
2018-08-17 15:25:37.221566 7fa3c7604700 -1 log_channel(cluster) log [ERR] : 3.72 soid -5/00000072/temp_3.72_0_16195026_251/head: failed to pick suitable auth object
2018-08-17 15:46:36.257994 7fa3c7604700 -1 log_channel(cluster) log [ERR] : 3.72 deep-scrub 2 errors
The situation seems similar to http://tracker.ceph.com/issues/13862 but so far I'm unable to repair the placement group.
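For reference, the standard first attempt at repairing a single inconsistent PG (which in this case has not resolved the errors) is:

```shell
# Ask the primary OSD to repair the inconsistent placement group.
ceph pg repair 3.72
```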
Meanwhile I'm forcing deep scrubbing for all placement groups applicable to osd.0, hopefully resulting in just PG 3.72 having errors.
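Forcing deep scrubs for every PG on osd.0 could look roughly like this (a hedged sketch: the availability and output format of "ceph pg ls-by-osd" depend on the release, so verify the awk field against your version's output first):

```shell
# Trigger a deep scrub on every PG that maps to osd.0.
# PG ids look like "3.72", so match lines starting with "<pool>." digits.
for pg in $(ceph pg ls-by-osd 0 | awk '/^[0-9]+\./ {print $1}'); do
    ceph pg deep-scrub "$pg"
done
```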
Awaiting deep scrubbing to finish, it seemed like a good idea to ask you guys for help.
What's the best approach at this point?
ceph health detail
HEALTH_ERR 1 pgs inconsistent; 2 scrub errors
pg 3.72 is active+clean+inconsistent, acting [0,33,39]
2 scrub errors
OSDs 33 and 39 are untouched (still running 0.94.10) and seem fine without errors.
Thanks in advance for any comments or thoughts.
Regards and enjoy your weekend!
Kees
--
https://nefos.nl/contact
Nefos IT bv
Ambachtsweg 25 (industrienummer 4217)
5627 BZ Eindhoven
Nederland
KvK 66494931
Available on Monday, Tuesday, Wednesday and Friday
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com