Re: Ceph 16.2.3 issues during upgrade from 15.2.10 with cephadm/lvm list

Hi Sage,

We've got conmon 2.0.27 installed. I restarted all the manager pods, just in
case, and we're seeing the same behavior afterwards.
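
For reference, this is roughly how we verified the conmon version and bounced
the managers (fsid and daemon names elided; the systemd unit name below is the
standard cephadm one, so adjust to your deployment):

  # on each host
  conmon --version

  # restart each mgr daemon via its cephadm systemd unit
  systemctl restart ceph-<fsid>@mgr.<daemon-id>.service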

David

On Mon, May 10, 2021 at 6:53 PM Sage Weil <sage@xxxxxxxxxxxx> wrote:
>
> The root cause is a bug in conmon.  If you can upgrade to >= 2.0.26
> this will also fix the problem.  What version are you using?  The
> kubic repos currently have 2.0.27.  See
> https://build.opensuse.org/project/show/devel:kubic:libcontainers:stable
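>
> For example, on a host with the kubic repo from the link above already
> enabled, picking up the fix should just be a matter of updating the conmon
> package (dnf shown here as an assumption; use your distro's package
> manager):
>
>     dnf update conmon
>     conmon --version    # should now report >= 2.0.26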
>
> We'll make sure the next release has the verbosity workaround!
>
> sage
>
> On Mon, May 10, 2021 at 5:47 PM David Orman <ormandj@xxxxxxxxxxxx> wrote:
> >
> > I think I may have found the issue:
> >
> > https://tracker.ceph.com/issues/50526
> > It seems it may be fixed in: https://github.com/ceph/ceph/pull/41045
> >
> > I hope this can be prioritized as an urgent fix, as it has broken
> > upgrades on clusters of a relatively normal size (14 nodes, 24 OSDs per
> > node, 2x NVMe for DB/WAL with 12 OSDs per NVMe), even when no new OSDs
> > are being deployed, since the mgr still tries to apply the OSD
> > specification.
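> >
> > In case anyone else hits this mid-upgrade: as a stopgap on our side (not
> > something suggested in the tracker), we're checking where the upgrade is
> > stuck and pausing it while the managers are in this state:
> >
> >     ceph orch upgrade status
> >     ceph orch upgrade pause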
> >
> > On Mon, May 10, 2021 at 4:03 PM David Orman <ormandj@xxxxxxxxxxxx> wrote:
> > >
> > > Hi,
> > >
> > > We are seeing the mgr attempt to apply our OSD spec on the various
> > > hosts and then block. When we investigate, we find the mgr has executed
> > > cephadm calls like the following, which are blocking:
> > >
> > > root     1522444  0.0  0.0 102740 23216 ?        S    17:32   0:00
> > >      \_ /usr/bin/python3
> > > /var/lib/ceph/XXXXX/cephadm.30cb78bdbbafb384af862e1c2292b944f15942b586128e91262b43e91e11ae90
> > > --image docker.io/ceph/ceph@sha256:694ba9cdcbe6cb7d25ab14b34113c42c2d1af18d4c79c7ba4d1f62cf43d145fe
> > > ceph-volume --fsid XXXXX -- lvm list --format json
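> > >
> > > (For anyone wanting to check their own hosts, we spotted these blocked
> > > calls with something along the lines of:
> > >
> > >     ps auxww | grep 'ceph-volume'
> > >
> > > and they never complete.)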
> > >
> > > This occurs on all hosts in the cluster after starting, restarting, or
> > > failing over a manager. On one cluster, it is currently blocking an
> > > in-progress upgrade that has already completed the manager updates.
> > >
> > > Looking at the cephadm logs on the host(s) in question, we see the
> > > last entry appears to be truncated, like:
> > >
> > > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > > "ceph.db_uuid": "1n2f5v-EEgO-1Kn6-hQd2-v5QF-AN9o-XPkL6b",
> > > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > > "ceph.encrypted": "0",
> > > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > > "ceph.osd_fsid": "XXXX",
> > > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > > "ceph.osd_id": "205",
> > > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > > "ceph.osdspec_affinity": "osd_spec",
> > > 2021-05-10 17:32:06,471 INFO /usr/bin/podman:
> > > "ceph.type": "block",
> > >
> > > The previous entry looks like this:
> > >
> > > 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> > > "ceph.db_uuid": "TMTPD5-MLqp-06O2-raqp-S8o5-TfRG-hbFmpu",
> > > 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> > > "ceph.encrypted": "0",
> > > 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> > > "ceph.osd_fsid": "XXXX",
> > > 2021-05-10 17:32:06,469 INFO /usr/bin/podman:
> > > "ceph.osd_id": "195",
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman:
> > > "ceph.osdspec_affinity": "osd_spec",
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman:
> > > "ceph.type": "block",
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman:                 "ceph.vdo": "0"
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman:             },
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman:             "type": "block",
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman:             "vg_name":
> > > "ceph-ffd1a4a7-316c-4c85-acde-06459e26f2c4"
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman:         }
> > > 2021-05-10 17:32:06,470 INFO /usr/bin/podman:     ],
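> > >
> > > (For reference, these entries are from the per-host cephadm log, in the
> > > default location on our hosts; we were reading it with something like:
> > >
> > >     tail -n 100 /var/log/ceph/cephadm.log
> > >
> > > and that is where the truncated entry above shows up.)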
> > >
> > > We'd like to get to the bottom of this; please let us know what other
> > > information we can provide.
> > >
> > > Thank you,
> > > David
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


