Re: Monitor Restart triggers half of our OSDs marked down

Dan van der Ster <dan@xxxxxxxxxxxxxx> · Thu, 5 Feb 2015 09:44:19 +0100

Hi,
We have also seen this once, after upgrading to 0.80.8 (from dumpling).
Last week we had a network outage which marked out around 1/3rd of our
OSDs. The outage lasted less than a minute -- all the OSDs were
brought up once the network was restored.

Then, 30 minutes later, I restarted one monitor to roll out a small
config change (changing the leveldb log path). Surprisingly, that resulted
in many OSDs (but seemingly fewer than before) being marked out again
and then quickly marked in again.

I only have the lowest level logs from this incident -- but I think
it's easily reproducible.
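
For anyone trying to reproduce this, here is a minimal sketch (Python) for
counting how many distinct OSDs the monitors reported as failed. The regex is
modelled only on the single "failed (N reports from M peers ...)" line quoted
further down in this thread, so treat the pattern as an assumption about the
log format. The helper names parse_failures and summarize are made up for
illustration.

```python
import re
from collections import Counter

# Modelled on the one failure report quoted in this thread, e.g.:
#   osd.669 10.76.28.58:6935/149172 failed (20 reports from 20 peers
#   after 22.005858 >= grace 20.000000)
# (a single line in the real log; it is only wrapped here by the mail client).
FAILED_RE = re.compile(
    r"(?P<osd>osd\.\d+) (?P<addr>\S+) failed "
    r"\((?P<reports>\d+) reports from (?P<peers>\d+) peers "
    r"after (?P<elapsed>[\d.]+) >= grace (?P<grace>[\d.]+)\)"
)


def parse_failures(lines):
    """Yield one dict per failure report found in an iterable of log lines."""
    for line in lines:
        m = FAILED_RE.search(line)
        if m:
            yield {
                "osd": m["osd"],
                "addr": m["addr"],
                "reports": int(m["reports"]),
                "peers": int(m["peers"]),
                "elapsed": float(m["elapsed"]),
                "grace": float(m["grace"]),
            }


def summarize(lines):
    """Return (count of distinct failed OSDs, Counter of reports per host IP)."""
    osds = set()
    per_host = Counter()
    for failure in parse_failures(lines):
        osds.add(failure["osd"])
        # The address looks like ip:port/nonce; keep only the IP part.
        per_host[failure["addr"].split(":")[0]] += 1
    return len(osds), per_host
```

Feeding it the cluster log from a monitor host (for example
"with open(path) as fh: print(summarize(fh))", where path is wherever your
cluster log lives) should show whether the OSDs marked failed cluster on the
hosts that were hit by the earlier outage.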

Cheers, Dan

On Wed, Feb 4, 2015 at 12:06 PM, Christian Eichelmann
<christian.eichelmann@xxxxxxxx> wrote:
> Hi Greg,
>
> the behaviour is indeed strange. Today I was trying to reproduce the
> problem, but no matter which monitor I restarted, no matter how many
> times, the behaviour was as expected: a new monitor election was called
> and everything continued normally.
>
> Then I continued my failover tests and simulated the failure of two
> racks with iptables (for us: 2 MONs and 6 OSD servers with 360 OSDs in total).
>
> Afterwards I tried again to restart one monitor and again about 240 OSDs
> got marked as down.
>
> There was no load on our monitor servers in that period. On one of the
> OSDs which got marked down I found lots of these messages:
>
> 2015-02-04 11:55:22.788245 7fc48fa48700  0 -- 10.76.70.4:6997/17094790
>>> 10.76.70.8:6806/3303244 pipe(0x7a1b600 sd=198 :59766 s=2 pgs=1353
> cs=1 l=0 c=0x4e562c0).fault with nothing to send, going to standby
> 2015-02-04 11:55:22.788371 7fc48be0c700  0 -- 10.76.70.4:6997/17094790
>>> 10.76.70.8:6842/12012876 pipe(0x895e840 sd=188 :49283 s=2 pgs=36873
> cs=1 l=0 c=0x13226f20).fault with nothing to send, going to standby
> 2015-02-04 11:55:22.788458 7fc494e9c700  0 -- 10.76.70.4:6997/17094790
>>> 10.76.70.13:6870/13021609 pipe(0x13ace2c0 sd=117 :64130 s=2 pgs=38011
> cs=1 l=0 c=0x52b4840).fault with nothing to send, going to standby
> 2015-02-04 11:55:22.797107 7fc46459d700  0 -- 10.76.70.4:0/94790 >>
> 10.76.70.11:6980/37144571 pipe(0xba0c580 sd=30 :0 s=1 pgs=0 cs=0 l=1
> c=0x4e51600).fault
> 2015-02-04 11:55:22.799350 7fc482d7d700  0 -- 10.76.70.4:6997/17094790
>>> 10.76.70.10:6887/30410592 pipe(0x6a0cb00 sd=271 :53090 s=2 pgs=15372
> cs=1 l=0 c=0xf3a6f20).fault with nothing to send, going to standby
> 2015-02-04 11:55:22.800018 7fc46429a700  0 -- 10.76.70.4:0/94790 >>
> 10.76.28.41:7076/37144571 pipe(0xba0c840 sd=59 :0 s=1 pgs=0 cs=0 l=1
> c=0xf339760).fault
> 2015-02-04 11:55:22.803086 7fc482272700  0 -- 10.76.70.4:6997/17094790
>>> 10.76.70.5:6867/17011547 pipe(0x12f998c0 sd=294 :6997 s=2 pgs=46095
> cs=1 l=0 c=0x8382000).fault with nothing to send, going to standby
> 2015-02-04 11:55:22.804736 7fc4892e1700  0 -- 10.76.70.4:6997/17094790
>>> 10.76.70.13:6852/9142109 pipe(0x12fa5b80 sd=163 :57056 s=2 pgs=45269
> cs=1 l=0 c=0x189d1600).fault with nothing to send, going to standby
>
> The IPs mentioned there are all OSD servers.
>
> To me it feels like the monitors still have some "memory" of the
> failed OSDs and something happens when one of them goes down. If I
> can provide any more information to clarify the issue, just tell me
> what you need.
>
> Regards,
> Christian
>
> On 03.02.2015 18:10, Gregory Farnum wrote:
>> On Tue, Feb 3, 2015 at 3:38 AM, Christian Eichelmann
>> <christian.eichelmann@xxxxxxxx> wrote:
>>> Hi all,
>>>
>>> during some failover tests and some configuration tests, we recently
>>> discovered a strange phenomenon:
>>>
>>> Restarting one of our monitors (5 in total) triggers about 300 of the
>>> following events:
>>>
>>> osd.669 10.76.28.58:6935/149172 failed (20 reports from 20 peers after
>>> 22.005858 >= grace 20.000000)
>>>
>>> The OSDs come back up shortly after they have been marked down. What I
>>> don't understand is: how can restarting one monitor prevent the OSDs
>>> from talking to each other and cause them to be marked down?
>>>
>>> FYI:
>>> We are currently using the following settings:
>>> mon osd adjust heartbeat grace = false
>>> mon osd min down reporters = 20
>>> mon osd adjust down out interval = false
>>
>> That's really strange. I think maybe you're seeing some kind of
>> secondary effect; what kind of CPU usage are you seeing on the
>> monitors during this time? Have you checked the log on any OSDs which
>> have been marked down?
>>
>> I have a suspicion that maybe the OSDs are detecting their failed
>> monitor connection and not being able to reconnect to another monitor
>> quickly enough, but I'm not certain what the overlaps are there.
>> -Greg
>>
>
>
> --
> Christian Eichelmann
> System Administrator
>
> 1&1 Internet AG - IT Operations Mail & Media Advertising & Targeting
> Brauerstraße 48 · DE-76135 Karlsruhe
> Phone: +49 721 91374-8026
> christian.eichelmann@xxxxxxxx
>
> Amtsgericht Montabaur / HRB 6484
> Management Board: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Robert
> Hoffmann, Markus Huhn, Hans-Henning Kettler, Dr. Oliver Mauss, Jan Oetjen
> Chairman of the Supervisory Board: Michael Scheeren
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com