Re: how can I achieve HA with ceph?

On Thu, Jan 5, 2012 at 5:24 AM, Karoly Horvath <rhswdev@xxxxxxxxx> wrote:
> Hi,
>
> back from holiday.
>
> I did a successful power unplug test now, but the FS was unavailable
> for 16 minutes, which is clearly wrong...
>
> I have the log files, but the MDS log is 1.2 gigabytes. If you let me
> know which lines to keep or filter out, I will upload it somewhere...
>
> --
> Karoly Horvath

Assuming it's the same error as last time, the log will have a line
that contains "waiting for osdmap n (which blacklists prior
instance)", where n is an epoch number.

Then at some later point there will be a line that looks something
like the following:
"2011-12-21 13:45:17.594746 7f4885307700 -- xxx.xxx.xxx.31:6800/4438
<== mon.2 xxx.xxx.xxx.35:6789/0 9 ==== osd_map(y..z src has 1..495) v2
==== 748+0+0 (656995691 0 0) 0x1637400 con 0x163c000"
where y..z is an interval that contains n. (In the previous log, and
probably here too, y=z=n.) I'm interested in those two lines and in
whatever follows once the osdmap arrives. Probably I will only care
about the "objecter" lines, but it might be all of them, so try cutting
out the minute of log that follows that osdmap line; it'll probably
contain more than I care about. :)
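
If it helps with a file that size, here is a rough Python sketch of
that filtering. It is only an illustration, not a Ceph tool: the
function names are made up, and it assumes every log entry sits on a
single line that starts with a timestamp in the format shown above
(the sample line is only wrapped by the mail client).

```python
import re
import sys
from datetime import datetime, timedelta

# Patterns taken from the two log lines described above.
WAIT_RE = re.compile(r"waiting for osdmap (\d+) \(which blacklists prior")
MAP_RE = re.compile(r"osd_map\((\d+)\.\.(\d+) src has")
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"  # assumed timestamp layout


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if absent."""
    try:
        return datetime.strptime(line[:26], TS_FORMAT)
    except ValueError:
        return None


def extract_window(lines, window=timedelta(minutes=1)):
    """Keep the 'waiting for osdmap n' line, the first osd_map(y..z)
    line with y <= n <= z, and everything logged within `window`
    after that osd_map line."""
    epoch = None      # n from the "waiting for osdmap n" line
    deadline = None   # timestamp after which we stop collecting
    kept = []
    for line in lines:
        if epoch is None:
            match = WAIT_RE.search(line)
            if match:
                epoch = int(match.group(1))
                kept.append(line)
            continue
        if deadline is None:
            match = MAP_RE.search(line)
            if match and int(match.group(1)) <= epoch <= int(match.group(2)):
                ts = parse_ts(line)
                if ts is not None:
                    deadline = ts + window
                    kept.append(line)
            continue
        ts = parse_ts(line)
        if ts is not None and ts > deadline:
            break
        kept.append(line)
    return kept


if __name__ == "__main__":
    # Stream the file so the whole log never has to fit in memory.
    with open(sys.argv[1], errors="replace") as log:
        sys.stdout.writelines(extract_window(log))
```

Saved under a name of your choice (say trim_mds_log.py), it would be
run with the MDS log path as its only argument and the output
redirected to a new file. It keeps all lines in the window rather than
only the "objecter" ones, since I may need the rest.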
-Greg


> On Fri, Dec 23, 2011 at 12:00 AM, Gregory Farnum
> <gregory.farnum@xxxxxxxxxxxxx> wrote:
>> On Wed, Dec 21, 2011 at 8:43 AM, Karoly Horvath <rhswdev@xxxxxxxxx> wrote:
>>> On Wed, Dec 21, 2011 at 4:13 PM, Gregory Farnum
>>>>> By client I assume you mean the kernel driver. The FS is frozen, so
>>>>> I cannot unmount (cannot even `shutdown`). How can I force the client
>>>>> to reconnect?
>>>>
>>>> Try a lazy force unmount:
>>>> umount -lf ceph_mnt_point/
>>>> And then mount again.
>>>
>>> wow, never heard about this, thanks.:)
>>> will report with the next mail
>>>
>>> In the meantime I did one test, killing mds+osd+mon on beta,
>>> it's jammed in '{0=alpha=up:replay}', after 45 minutes I shut it down...
>>> I attached the logs.
>>
>> Oh, this is very odd! The MDS goes to sleep while it waits for an
>> up-to-date OSDMap, but it never seems to get woken up even though I
>> see the message sending in the OSDMap.
>>
>> So let's try this one more time, but this time also add in "debug
>> objecter = 20" to the MDS config...Those logs will include everything
>> I need, or nothing will, promise! :)
>> -Greg