Re: How to remove lost objects.

2012/1/19 Gregory Farnum <gregory.farnum@xxxxxxxxxxxxx>:
> On Thu, Jan 19, 2012 at 12:53 AM, Andrey Stepachev <octo47@xxxxxxxxx> wrote:
>> 2012/1/19 Gregory Farnum <gregory.farnum@xxxxxxxxxxxxx>:
>>> On Wednesday, January 18, 2012, Andrey Stepachev <octo47@xxxxxxxxx> wrote:
>>>> 2012/1/19 Gregory Farnum <gregory.farnum@xxxxxxxxxxxxx>:
>>>>> On Wed, Jan 18, 2012 at 12:48 PM, Andrey Stepachev <octo47@xxxxxxxxx>
>>>>> wrote:
>>>>>> But I still don't know what happens with ceph that makes it stop
>>>>>> responding and hang. This is not good behavior, because
>>>>>> it leaves the cluster unresponsive after a
>>>>>> temporary network failure.
>>>>>
>>>>> I'm a little concerned about this — I would expect to see hangs of up
>>>>> to ~30 seconds (the timeout period), but for operations to then
>>>>> continue. Are you putting the MDS down? If so, do you have any
>>>>> standbys specified?
>>>>
>>>> Yes, the MDS goes down (I restart it at some point while changing something
>>>> in the config).
>>>> Yes, I have 2 standbys.
>>>> Clients hang for more than 10 minutes.
>>>
>>> Okay, so it's probably an issue with the MDS not entering recovery when it
>>> should. Are you also taking down one of the monitor nodes? There's a known
>>> bug which can cause a standby MDS to wait up to 15 minutes if its monitor
>>> goes down, which is fixed in the latest master (and maybe .40; I'd have to
>>> check).
>>
>> Yes. I have mon, mds, and osd collocated on some nodes,
>> and I restart all daemons at once. I use 0.40 (built from my github fork).
>
> Hrm. I checked and the fix is in 0.40. Can you reproduce this with
> client logging enabled (--debug_ms 1 --debug_client 10) and post the
> logs somewhere for me to check out? That should be able to isolate the
> problem area at least.

The client writes "renew caps" and nothing more.
I tried to reproduce the problem with more logging, but still no luck.
Maybe debug logging serializes a race somewhere and prevents
this bug from occurring.

> -Greg



-- 
Andrey.