On Thu, Jan 5, 2012 at 5:24 AM, Karoly Horvath <rhswdev@xxxxxxxxx> wrote:
> Hi,
>
> back from holiday.
>
> I did a successful power unplug test now, but the FS was unavailable
> for 16 minutes which is clearly wrong...
>
> I have the log files but the MDS log is 1.2 gigabytes, if you let me
> know which lines to filter / filter out I will upload it somewhere...
>
> --
> Karoly Horvath

Assuming it's the same error as last time, the log will have a line
that contains "waiting for osdmap n (which blacklists prior
instance)", where n is an epoch number. Then at some later point there
will be a line that looks something like the following:

"2011-12-21 13:45:17.594746 7f4885307700 -- xxx.xxx.xxx.31:6800/4438 <== mon.2 xxx.xxx.xxx.35:6789/0 9 ==== osd_map(y..z src has 1..495) v2 ==== 748+0+0 (656995691 0 0) 0x1637400 con 0x163c000"

where y..z is an interval that contains n. (In the previous log,
and probably here too, y=z=n.)

I'm going to be interested in those two lines and the stuff following
when the osdmap arrives. Probably I will only care about "objecter"
lines, but it might be all of them...try trimming off the minute
following that osdmap line; it'll probably contain more than I care
about. :)
-Greg

> On Fri, Dec 23, 2011 at 12:00 AM, Gregory Farnum
> <gregory.farnum@xxxxxxxxxxxxx> wrote:
>> On Wed, Dec 21, 2011 at 8:43 AM, Karoly Horvath <rhswdev@xxxxxxxxx> wrote:
>>> On Wed, Dec 21, 2011 at 4:13 PM, Gregory Farnum
>>>>> By client I assume you mean the kernel driver.. the FS is frozen, so
>>>>> I cannot unmount (cannot even `shutdown`).. how can I force the client
>>>>> to reconnect?
>>>>
>>>> Try a lazy force unmount:
>>>> umount -lf ceph_mnt_point/
>>>> And then mount again.
>>>
>>> wow, never heard about this, thanks. :)
>>> will report with the next mail
>>>
>>> In the meantime I did one test, killing mds+osd+mon on beta,
>>> it's jammed in '{0=alpha=up:replay}', after 45 minutes I shut it down...
>>> I attached the logs.
>>
>> Oh, this is very odd!
>> The MDS goes to sleep while it waits for an
>> up-to-date OSDMap, but it never seems to get woken up even though I
>> see the message sending in the OSDMap.
>>
>> So let's try this one more time, but this time also add in "debug
>> objecter = 20" to the MDS config...Those logs will include everything
>> I need, or nothing will, promise! :)
>> -Greg
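
For a 1.2 GB log, the trimming described in the reply above (keep the
"waiting for osdmap n" line, then the incoming osd_map(y..z ...) line
whose interval contains n, then roughly the minute that follows) could
be scripted. What follows is only a rough sketch, not a tested tool.
It assumes every log line starts with a timestamp in the format shown
in the sample line, that the first "waiting for osdmap" line is the
relevant one, and that incoming messages are marked with "<==" as in
the sample. The function name, the regular expressions, and the file
names in the usage comment are made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Patterns are guesses based on the two lines quoted in the mail.
WAIT_RE = re.compile(r"waiting for osdmap (\d+) \(which blacklists prior instance\)")
OSDMAP_RE = re.compile(r"<== .*osd_map\((\d+)\.\.(\d+) src has")
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)")


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    m = TS_RE.match(line)
    if not m:
        return None
    return datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")


def extract_window(lines, window=timedelta(minutes=1)):
    """Yield the 'waiting for osdmap n' line, the first incoming osd_map
    line whose y..z interval contains n, and everything logged within
    `window` after that osd_map line."""
    epoch = None      # n from the "waiting for osdmap n" line
    deadline = None   # timestamp after which we stop copying lines
    for line in lines:
        if deadline is not None:
            ts = parse_ts(line)
            if ts is not None and ts > deadline:
                break
            yield line  # lines without a timestamp are kept as well
            continue
        if epoch is None:
            m = WAIT_RE.search(line)
            if m:
                epoch = int(m.group(1))
                yield line
            continue
        m = OSDMAP_RE.search(line)
        if m and int(m.group(1)) <= epoch <= int(m.group(2)):
            ts = parse_ts(line)
            if ts is None:
                continue  # cannot compute the window without a timestamp
            deadline = ts + window
            yield line


# Hypothetical usage, streaming so the whole log is never held in memory:
#   with open("mds.log") as src, open("mds-trimmed.log", "w") as dst:
#       dst.writelines(extract_window(src))
```

All lines in the window are kept rather than only the "objecter" ones,
since the reply says it might end up needing all of them.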