Re: mons die with mon/OSDMonitor.cc: 125: FAILED assert(version >= osdmap.epoch)...

On 4/12/16 12:01 AM, Gregory Farnum wrote:
> On Mon, Apr 11, 2016 at 3:45 PM, Eric Hall <eric.hall@xxxxxxxxxxxxxx> wrote:
>> A power failure in the data center has left 3 mons unable to start with
>> mon/OSDMonitor.cc: 125: FAILED assert(version >= osdmap.epoch)
>>
>> Have found a similar problem discussed at
>> http://irclogs.ceph.widodh.nl/index.php?date=2015-05-29, but am unsure how
>> to proceed.
>>
>> If I read
>> ceph-kvstore-tool /var/lib/ceph/mon/ceph-cephsecurestore1/store.db list
>> correctly, the mons believe the osdmap epoch is 1, but they also have
>> osdmap:full_38456 and osdmap:38630 in the store.

> Exactly what values are you reading that are giving you those numbers?
> The "real" OSDMap epoch is going to be at least 38630...if you're very
> lucky it will be exactly 38630. But since it reset itself to 1 in the
> monitor's store, I doubt you'll be lucky.

I'm getting this from ceph-kvstore-tool list.
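
For reference, this is roughly what I'm running, with the mon stopped
(the grep is just my way of pulling out the osdmap-related keys):

  # list every key in the mon store, keep the osdmap-related ones
  ceph-kvstore-tool /var/lib/ceph/mon/ceph-cephsecurestore1/store.db list \
      | grep osdmap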

> So in order to get your cluster back up, you need to find the largest
> osdmap version in your cluster. You can do that, very tediously, by
> looking at the OSDMap stores. Or you may have debug logs indicating it
> more easily on the monitors.

I don't see info like this in any logs.  How/where do I inspect this?
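
The only way I can think of to check the OSD side is something like the
following on each OSD host. This is an untested sketch and assumes
filestore OSDs, where (as far as I understand) the stored full maps show
up under current/meta as files named roughly osdmap.<epoch>__0_<hash>__none:

  # rough guess at the highest osdmap epoch stored on this host's OSDs
  find /var/lib/ceph/osd/ceph-*/current/meta -name 'osdmap.*' 2>/dev/null \
      | sed -n 's/.*osdmap\.\([0-9][0-9]*\)__.*/\1/p' \
      | sort -n | tail -1

If the assumption about the on-disk names holds, running that on every
OSD host and taking the largest number should give the epoch you describe.
Is that the right idea, or is there a better way?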

Thank you,
--
Eric

