Re: MDS spinning wild after restart on all nodes

Amon,
I've been going through my backlog of flagged emails and came across
this one. Did you ever manage to collect the information you were
going to gather for this bug?
-Greg

On Fri, Jun 15, 2012 at 9:44 AM, Sage Weil <sage@xxxxxxxxxxx> wrote:
> On Fri, 15 Jun 2012, Amon Ott wrote:
>> Hello all,
>>
>> I have seen this for a long time, but never investigated further. After stable
>> test runs for several days, this is our last known show stopper before using
>> Ceph in production. We are running 0.47.2 on 32-bit.
>>
>> If we restart the MDS (or all Ceph daemons) on all nodes, one after another or
>> all together, they first recover, and then the active one starts to spin at
>> full CPU and no longer responds. After a while, the next one takes over, starts
>> to spin, and so on, until the whole cluster is unusable. This is completely
>> reproducible and happens even without any active client.
>>
>> As expected, ceph -w shows lots of lines like
>> "2012-06-15 11:35:28.588775   mds e959: 1/1/1 up {0=3=up:active(laggy or crashed)}"
>>
>> Stopping all services on all nodes for minutes or longer and then restarting
>> them does not help: the MDS starts spinning again. But if we reboot the whole
>> cluster, everything works again.
>>
>> Today's MDS log is available at
>> https://download.m-privacy.de/homeuser-mds.0.log.gz
>>
>> Is this a known problem? It has been with us for a looong time now, but since
>> rebooting used to help, we never tracked it down.
>
> I haven't seen this before.  Can you attach to the spinning process with
> gdb and send us a dump of what the threads are doing ('thread apply all
> bt')?  I opened #2596:
>
>         http://tracker.newdream.net/issues/2596
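
For reference, the thread dump requested above can also be collected
non-interactively. This is only a minimal sketch, assuming gdb is installed
and you have permission to attach to the process; the function name
dump_threads is illustrative and not part of Ceph or gdb:

```shell
# Sketch: print backtraces of every thread in a running process.
# dump_threads is a hypothetical helper name chosen for this example.
dump_threads() {
    pid="$1"
    if [ -z "$pid" ]; then
        echo "usage: dump_threads PID" >&2
        return 1
    fi
    # -p attaches to the PID, -batch exits after running the commands,
    # -ex runs the gdb command Sage asked for.
    gdb -p "$pid" -batch -ex 'thread apply all bt'
}
```

Something like dump_threads "$(pidof ceph-mds)" > mds-threads.txt 2>&1 would
then write the output to a file for the tracker issue. That assumes the daemon
runs under the name ceph-mds and that only one instance runs on the host.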
>
> Thanks!
> sage
>
>>
>> Amon Ott
>> --
>> Dr. Amon Ott
>> m-privacy GmbH           Tel: +49 30 24342334
>> Am Köllnischen Park 1    Fax: +49 30 24342336
>> 10179 Berlin             http://www.m-privacy.de
>>
>> Amtsgericht Charlottenburg, HRB 84946
>>
>> Geschäftsführer:
>>  Dipl.-Kfm. Holger Maczkowsky,
>>  Roman Maczkowsky
>>
>> GnuPG-Key-ID: 0x2DD3A649
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>