Upgrade 16.2.9 to 16.2.11 stopped due to #57627

Hi,

Just to note this:

ceph-volume activate takes time to complete
https://tracker.ceph.com/issues/57627

...is a showstopper bug for me in 16.2.11 when trying to upgrade from 16.2.9, which I wanted in particular to get this fix:

Pacific: Significant write amplification as compared to Nautilus
https://tracker.ceph.com/issues/58530

The upgrade to 16.2.11 stopped with:

$ ceph orch upgrade status
{
    "target_image": "quay.io/ceph/ceph@sha256:748387ea347157fb9df9bb2620d873ac633ff80d0308bcc82a74a821df0d0cfa",
    "in_progress": true,
    "which": "Upgrading all daemon types on all hosts",
    "services_complete": [
        "mon",
        "mgr"
    ],
    "progress": "10/90 daemons upgraded",
    "message": "Error: UPGRADE_REDEPLOY_DAEMON: Upgrading daemon osd.24 on host b2 failed.",
    "is_paused": true
}
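
For anyone who wants to watch for this condition from a script, here is a minimal Python sketch that interprets the JSON shown above. It assumes the field names "in_progress", "is_paused", "progress" and "message" exactly as they appear in my output; other releases may print different fields. The function name summarize_upgrade is made up for illustration and is not part of Ceph.

```python
import json


def summarize_upgrade(status_json: str) -> str:
    """Return a one-line summary of `ceph orch upgrade status` JSON output.

    Assumption: the field names match the output pasted in this post.
    """
    status = json.loads(status_json)
    # "message" may be missing or null, so fall back to an empty string.
    message = status.get("message") or ""
    progress = status.get("progress") or "unknown progress"

    if status.get("is_paused") and message.startswith("Error"):
        return f"PAUSED on error at {progress}: {message}"
    if status.get("in_progress"):
        return f"in progress: {progress}"
    return "no upgrade in progress"


if __name__ == "__main__":
    # Trimmed copy of the status output from this post.
    sample = """
    {
        "in_progress": true,
        "progress": "10/90 daemons upgraded",
        "message": "Error: UPGRADE_REDEPLOY_DAEMON: Upgrading daemon osd.24 on host b2 failed.",
        "is_paused": true
    }
    """
    print(summarize_upgrade(sample))
```

In practice the input would be the captured stdout of "ceph orch upgrade status" rather than a hard-coded string.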

Most likely this is because host "b2" is getting bitten VERY badly by the "ceph-volume activate takes time to complete" problem, due to the large number of block devices on that system:

b2$ lsblk -P -p -o 'NAME' | wc -l
924

Attempting to start the affected OSD via systemd failed because it timed out.
I tried manually starting the OSD per its unit.run, but the "ceph-volume
activate" step ran for over an hour before I gave up.

I've been able to manually revert this particular OSD (the first one to be
updated on this particular box) back to 16.2.9 by updating its unit.run file
and restarting the OSD, so my cluster is healthy.

I see the fix has been backported:

https://tracker.ceph.com/issues/58790

I'm guessing it shouldn't be too much of a problem running mixed versions for a while until 16.2.12 comes out?

$ ceph versions
{
    "mon": {
        "ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)": 5
    },
    "mgr": {
        "ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)": 3
    },
    "osd": {
        "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific (stable)": 2,
        "ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)": 79
    },
    "mds": {},
    "overall": {
        "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific (stable)": 2,
        "ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)": 87
    }
}
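
To keep an eye on which daemon types are running mixed versions while waiting for the next release, something like the following Python sketch could summarize the output above. It assumes the JSON layout shown in this post (daemon type, then full version string, then count) and skips the "overall" roll-up. The helper names versions_by_daemon and mixed_daemon_types are hypothetical, not Ceph APIs.

```python
import json
import re


def versions_by_daemon(versions_json: str) -> dict:
    """Map each daemon type to {short_version: count} from `ceph versions` JSON."""
    data = json.loads(versions_json)
    summary = {}
    for daemon_type, versions in data.items():
        if daemon_type == "overall":
            continue  # roll-up of all daemon types, not a daemon type itself
        counts = {}
        for full_name, count in versions.items():
            # Keys look like "ceph version 16.2.9 (<sha>) pacific (stable)".
            match = re.search(r"ceph version (\S+)", full_name)
            short = match.group(1) if match else full_name
            counts[short] = counts.get(short, 0) + count
        summary[daemon_type] = counts
    return summary


def mixed_daemon_types(versions_json: str) -> list:
    """Return the sorted daemon types that report more than one version."""
    summary = versions_by_daemon(versions_json)
    return sorted(t for t, counts in summary.items() if len(counts) > 1)


if __name__ == "__main__":
    # Trimmed copy of the `ceph versions` output from this post.
    sample = """
    {
        "mon": {"ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)": 5},
        "osd": {
            "ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific (stable)": 2,
            "ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)": 79
        },
        "mds": {}
    }
    """
    print(versions_by_daemon(sample))
    print(mixed_daemon_types(sample))
```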


Cheers,

Chris


