On 03/01/17 14:41, Heller, Chris wrote:
That is a good question, and I'm not sure how to answer. The
journal is on its own volume, and is not a symlink. Also how does
one flush the journal? That seems like an important step when
bringing down a cluster safely.
You only need to flush the journal if you are removing it from the
OSD and replacing it with a different journal.
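For reference, a minimal sketch of that flush-and-replace sequence as a dry run. It only builds and prints the command lines. It assumes the OSD daemon is already stopped and that you repoint the journal between the two commands; the helper name journal_swap_commands and the OSD id 3 are made up for illustration.

```python
import shlex


def journal_swap_commands(osd_id):
    """Return the ceph-osd invocations used when swapping an OSD's journal.

    Assumptions (not checked here): the OSD daemon is stopped, and the
    "journal" symlink or ceph.conf entry is repointed to the new device
    after the flush and before the mkjournal.
    """
    osd = str(int(osd_id))  # raises ValueError on a non-numeric id
    return [
        # Write any pending journal entries into the object store.
        ["ceph-osd", "-i", osd, "--flush-journal"],
        # Initialise the journal at the (new) configured location.
        ["ceph-osd", "-i", osd, "--mkjournal"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands, execute nothing.
    for argv in journal_swap_commands(3):
        print(shlex.join(argv))
```

Printing instead of executing is deliberate, so you can read the commands over before running them by hand on the OSD host.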
Since your journal is on its own volume, you need either a symlink
in the OSD directory named "journal" that points to the device
(ideally via /dev/disk/by-.../ rather than /dev/sdx, since sdx names
can change between boots), or a journal path set in ceph.conf.
And since the log says you have a non-block journal now, there is
probably a regular file where the symlink should be. You should
remove it (rename it to journal.junk until you're sure it's not an
important file, then delete it later).
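If you want to check what is actually sitting at that path before renaming anything, here is a rough sketch. The function name classify_journal is hypothetical, and the example path assumes the default OSD data directory layout (/var/lib/ceph/osd/ceph-<id>); adjust it to your setup. It only reads file metadata and changes nothing.

```python
import os
import stat


def classify_journal(osd_dir):
    """Describe what <osd_dir>/journal currently is, without modifying it."""
    path = os.path.join(osd_dir, "journal")
    if not os.path.lexists(path):
        return "missing"
    is_link = os.path.islink(path)
    try:
        mode = os.stat(path).st_mode  # follows the symlink, if any
    except OSError:
        # A symlink whose target does not exist ends up here.
        return "dangling symlink" if is_link else "unreadable"
    if stat.S_ISBLK(mode):
        return "symlink to block device" if is_link else "block device"
    if stat.S_ISREG(mode):
        return "symlink to regular file" if is_link else "regular file"
    return "other"


if __name__ == "__main__":
    # Assumed default data path for OSD id 3; change as needed.
    print(classify_journal("/var/lib/ceph/osd/ceph-3"))
```

A result of "regular file" would match the "non-block journal" message, while "symlink to block device" is what you want for a journal on its own volume.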
-Chris
On 02/28/17 18:55, Heller, Chris wrote:
Quick update. So I'm trying out the procedure as documented here.
So far I've:
1. Stopped ceph-mds
2. Set noout, norecover, norebalance, nobackfill
3. Stopped all ceph-osd
4. Stopped ceph-mon
5. Installed new OS
6. Started ceph-mon
7. Started all ceph-osd
This is where I've stopped. All but one OSD came back online. One
has this backtrace:
2017-02-28 17:44:54.884235 7fb2ba3187c0 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
Are the journals inline or separate? If they're separate, the above
means the journal symlink/config is missing, so the OSD may create a
new journal, which would be bad if you didn't flush the old journal
first.
Also, a single OSD is easy enough to replace (which I wouldn't do
until the cluster has settled down and recovered). So it's lame for
it to be broken, but it's still recoverable if that's the only issue.
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com