That's a good question, and I'm not sure how to answer it. The journal is on its own volume and is not a symlink. Also, how does one flush the journal? That seems like an important step when bringing down a cluster safely.
-Chris
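[For reference: with FileStore OSDs, the journal is normally flushed with ceph-osd --flush-journal after stopping the daemon. A minimal sketch, assuming systemd-managed, pre-BlueStore (FileStore) OSDs; the OSD id is a placeholder:]

```shell
#!/bin/sh
# Sketch: flush a FileStore OSD's journal before replacing or re-pointing
# the journal device. Assumes systemd service naming (ceph-osd@<id>) and
# FileStore OSDs; the id argument is a placeholder.
flush_osd_journal() {
  id="$1"
  # The OSD daemon must be stopped before its journal can be flushed.
  systemctl stop "ceph-osd@${id}"
  # Replay any pending journal entries into the object store, then exit.
  ceph-osd -i "${id}" --flush-journal
}
```

[Once the flush succeeds, the old journal no longer holds unwritten data, so pointing the OSD at a new journal device is safe.]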
On 02/28/17 18:55, Heller, Chris wrote:
Quick update. So I'm trying out the procedure as documented here.
So far I've:
1. Stopped ceph-mds
2. Set the noout, norecover, norebalance, and nobackfill flags
3. Stopped all ceph-osd
4. Stopped ceph-mon
5. Installed new OS
6. Started ceph-mon
7. Started all ceph-osd
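[Step 2 above, and its eventual reversal once the OSDs are back and the cluster is healthy, can be sketched as a pair of helpers; the flag names are the standard cluster-wide OSD flags:]

```shell
#!/bin/sh
# Set/clear the maintenance flags from step 2. These are standard
# cluster-wide OSD flags; remember to unset them again once all OSDs
# are back up and the cluster has recovered.
MAINT_FLAGS="noout norecover norebalance nobackfill"

set_maintenance_flags() {
  for flag in $MAINT_FLAGS; do
    ceph osd set "$flag"
  done
}

clear_maintenance_flags() {
  for flag in $MAINT_FLAGS; do
    ceph osd unset "$flag"
  done
}
```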
This is where I've stopped. All but one OSD came back online. One has this backtrace:
2017-02-28 17:44:54.884235 7fb2ba3187c0 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
Are the journals inline or separate? If they're separate, the message above means the journal symlink/config is missing, so the OSD may create a new journal, which would be bad if you didn't flush the old journal first.
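[One quick way to tell inline from separate journals is to look at each OSD's journal path. A sketch, assuming the default /var/lib/ceph/osd layout; the base directory is a parameter so it can be pointed elsewhere:]

```shell
#!/bin/sh
# Report, for each OSD data dir, whether its journal is a symlink to a
# separate device or an inline file. Assumes the conventional layout
# <base>/ceph-<id>/journal; base defaults to /var/lib/ceph/osd.
check_journals() {
  base="${1:-/var/lib/ceph/osd}"
  for j in "$base"/ceph-*/journal; do
    if [ -L "$j" ]; then
      echo "$j -> $(readlink "$j") (separate journal)"
    elif [ -f "$j" ]; then
      echo "$j (inline file journal)"
    else
      echo "$j missing"
    fi
  done
}
```

[A "missing" result on the broken OSD would be consistent with the log message above.]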
Also, a single OSD is easy enough to replace (though I wouldn't do that until the cluster has settled down and recovered). So it's unfortunate that it's broken, but it's still recoverable if that's the only issue.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com