On Tue, Apr 12, 2016 at 4:41 AM, Eric Hall <eric.hall@xxxxxxxxxxxxxx> wrote:
> On 4/12/16 12:01 AM, Gregory Farnum wrote:
>>
>> On Mon, Apr 11, 2016 at 3:45 PM, Eric Hall <eric.hall@xxxxxxxxxxxxxx>
>> wrote:
>>>
>>> Power failure in data center has left 3 mons unable to start with
>>> mon/OSDMonitor.cc: 125: FAILED assert(version >= osdmap.epoch)
>>>
>>> Have found similar problem discussed at
>>> http://irclogs.ceph.widodh.nl/index.php?date=2015-05-29, but am unsure
>>> how to proceed.
>>>
>>> If I read
>>> ceph-kvstore-tool /var/lib/ceph/mon/ceph-cephsecurestore1/store.db list
>>> correctly, they believe osdmap is 1, but they also have osdmap:full_38456
>>> and osdmap:38630 in the store.
>>
>>
>> Exactly what values are you reading that's giving you those values?
>> The "real" OSDMap epoch is going to be at least 38630...if you're very
>> lucky it will be exactly 38630. But since it reset itself to 1 in the
>> monitor's store, I doubt you'll be lucky.
>
>
> I'm getting this from ceph-kvstore-tool list.

I meant the keys that it was outputting...I forgot we actually had one
called "osdmap".

>
>> So in order to get your cluster back up, you need to find the largest
>> osdmap version in your cluster. You can do that, very tediously, by
>> looking at the OSDMap stores. Or you may have debug logs indicating it
>> more easily on the monitors.
>
>
> I don't see info like this in any logs. How/where do I inspect this?

If you had debugging logs up high enough, it would tell you things
like each map commit. And every time the monitor subsystems (like the
OSD Monitor) print out any debugging info they include what
epoch/version they are on, so it's in the log output prefix.
-Greg

>
> Thank you,
> --
> Eric
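
As a starting point for the "find the largest osdmap version" step, here is a minimal Python sketch that scans a saved `ceph-kvstore-tool ... list` listing for the highest epoch among the osdmap keys. It assumes the keys appear in the text exactly as quoted in this thread (`osdmap:38630`, `osdmap:full_38456`); the function name `highest_osdmap_epoch` and the sample listing are hypothetical, and the real listing may be laid out differently. As noted above, the value found in a monitor's store is only a lower bound on the real epoch, so this does not replace checking the OSDs' own stores.

```python
import re

# Assumption: key names look like the two quoted in this thread,
# "osdmap:38630" and "osdmap:full_38456". Other layouts are not handled.
KEY_RE = re.compile(r"\bosdmap:(?:full_)?(\d+)\b")


def highest_osdmap_epoch(listing):
    """Return the largest epoch among osdmap keys in the text, or None."""
    epochs = [int(m.group(1)) for m in KEY_RE.finditer(listing)]
    return max(epochs) if epochs else None


if __name__ == "__main__":
    # Illustrative listing: the two osdmap keys come from the thread,
    # the third line is a made-up non-numeric key that should be ignored.
    sample = "osdmap:full_38456\nosdmap:38630\nexample_prefix:some_key\n"
    print(highest_osdmap_epoch(sample))
```

To use it on a real monitor, the output of the `list` command could be redirected to a file and that file's contents passed to the function.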