Re: Safely Upgrading OS on a live Ceph Cluster

I see. My journal is specified in ceph.conf. I'm not removing it from the OSD, so it sounds like flushing isn't needed in my case.
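(For reference, an osd journal entry in ceph.conf looks roughly like this; the OSD id and path below are just placeholders:)

    [osd.12]
    osd journal = /dev/disk/by-partuuid/<journal-partition-uuid>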

-Chris
On Mar 1, 2017, at 9:31 AM, Peter Maloney <peter.maloney@xxxxxxxxxxxxxxxxxxxx> wrote:

On 03/01/17 14:41, Heller, Chris wrote:
That is a good question, and I'm not sure how to answer. The journal is on its own volume and is not a symlink. Also, how does one flush the journal? That seems like an important step when bringing down a cluster safely.

You only need to flush the journal if you are removing it from the OSD and replacing it with a different journal.
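If you did have to flush it, a minimal sketch would be something like this, with the OSD stopped first (the id is a placeholder):

    ceph-osd -i 12 --flush-journal    # write out the old journal before pointing the OSD at a new one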

Since your journal is on its own device, you need either a symlink named "journal" in the OSD data directory that points to the device (ideally not /dev/sdX but a /dev/disk/by-.../ path), or an entry in ceph.conf.
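For the symlink route, roughly this (OSD id and partition UUID are placeholders, adjust to your layout):

    ln -s /dev/disk/by-partuuid/<journal-partition-uuid> /var/lib/ceph/osd/ceph-12/journal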

And since the log says you have a non-block journal now, there is probably a plain journal file sitting in the data directory... you should remove that (rename it to journal.junk until you're sure it's not an important file, and delete it later).
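Something like this, assuming the default data directory and a placeholder OSD id:

    cd /var/lib/ceph/osd/ceph-12
    mv journal journal.junk    # keep it until you're sure nothing needs it, then delete it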

-Chris

On Mar 1, 2017, at 8:37 AM, Peter Maloney <peter.maloney@xxxxxxxxxxxxxxxxxxxx> wrote:

On 02/28/17 18:55, Heller, Chris wrote:
Quick update. So I'm trying out the procedure as documented here.

So far I've done the following (rough command equivalents sketched after the list):

1. Stopped ceph-mds
2. set noout, norecover, norebalance, nobackfill
3. Stopped all ceph-osd
4. Stopped ceph-mon
5. Installed new OS
6. Started ceph-mon
7. Started all ceph-osd
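In terms of commands, that corresponds roughly to the following (assuming systemd units; service names vary by release and init system):

    systemctl stop ceph-mds@$(hostname)          # 1. stop the MDS
    ceph osd set noout
    ceph osd set norecover
    ceph osd set norebalance
    ceph osd set nobackfill                      # 2. set the cluster flags
    systemctl stop ceph-osd.target               # 3. stop all OSDs on the node
    systemctl stop ceph-mon@$(hostname)          # 4. stop the MON
    # 5. install the new OS, then bring things back up
    systemctl start ceph-mon@$(hostname)         # 6. start the MON
    systemctl start ceph-osd.target              # 7. start the OSDs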

This is where I've stopped. All but one OSD came back online. One has this backtrace:

2017-02-28 17:44:54.884235 7fb2ba3187c0 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
Are the journals inline or separate? If they're separate, the message above means the journal symlink/config is missing, so the OSD may create a new journal, which would be bad if you didn't flush the old one first.

Also, just one OSD is easy enough to replace (which I wouldn't do until the cluster has settled down and recovered). So it's lame for it to be broken, but it's still recoverable if that's the only issue.
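If it does come to replacing it, the usual removal sequence is roughly this (OSD id is a placeholder, and only after the cluster has recovered):

    ceph osd out 12
    systemctl stop ceph-osd@12
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12
    # then re-create the OSD, e.g. with ceph-disk prepare / activate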





_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
