YMMV, same story as with SSD selection. Intels have their own problems :-)

Jan

> On 08 Sep 2015, at 12:09, Mariusz Gronczewski <mariusz.gronczewski@xxxxxxxxxxxx> wrote:
>
> For those interested:
>
> The bug that caused Ceph to go haywire was an Emulex NIC driver dropping
> packets when pushing more than a few hundred megabits (the drop rate
> basically scaled linearly with load), which caused OSDs to flap constantly
> once something went wrong (high traffic, an OSD goes down, Ceph starts
> reallocating data, which causes more traffic, more OSDs flap, etc.)
>
> Upgrading the kernel to 4.1.6 fixed that (the bug was present at least in
> 4.0.1 and in the C6 "distro" kernel) and the cluster started to rebuild
> correctly.
>
> Lessons learned: buy Intel NICs...
>
> On Mon, 7 Sep 2015 20:51:57 +0800, 池信泽 <xmdxcxz@xxxxxxxxx> wrote:
>
>> Yes, there is a bug which can use huge amounts of memory. It is triggered
>> when an OSD goes down or is added into the cluster and does
>> recovery/backfilling.
>>
>> The patches https://github.com/ceph/ceph/pull/5656 and
>> https://github.com/ceph/ceph/pull/5451 merged into master fix it, and
>> they will be backported.
>>
>> I think Ceph v0.93 or newer versions may hit this bug.
>>
>> 2015-09-07 20:42 GMT+08:00 Shinobu Kinjo <skinjo@xxxxxxxxxx>:
>>
>>> How heavy was the network traffic?
>>>
>>> Have you tried capturing the traffic between the cluster and public
>>> networks to see where such a large amount of traffic came from?
>>>
>>> Shinobu
>>>
>>> ----- Original Message -----
>>> From: "Jan Schermer" <jan@xxxxxxxxxxx>
>>> To: "Mariusz Gronczewski" <mariusz.gronczewski@xxxxxxxxxxxx>
>>> Cc: ceph-users@xxxxxxxxxxxxxx
>>> Sent: Monday, September 7, 2015 9:17:04 PM
>>> Subject: Re: Huge memory usage spike in OSD on hammer/giant
>>>
>>> Hmm, even network traffic went up.
>>> Nothing in the logs on the mons around when this started, 9/4 ~6 AM?
>>>
>>> Jan
>>>
>>>> On 07 Sep 2015, at 14:11, Mariusz Gronczewski <mariusz.gronczewski@xxxxxxxxxxxx> wrote:
>>>>
>>>> On Mon, 7 Sep 2015 13:44:55 +0200, Jan Schermer <jan@xxxxxxxxxxx> wrote:
>>>>
>>>>> Maybe some configuration change occurred that now takes effect when you start the OSD?
>>>>> Not sure what could affect memory usage though - some ulimit values maybe (stack size),
>>>>> the number of OSD threads (compare the number from this OSD to the rest of the OSDs),
>>>>> fd cache size. Look in /proc and compare everything.
>>>>> Also look in "ceph osd tree" - didn't someone touch it while you were gone?
>>>>>
>>>>> Jan
>>>>
>>>>> the number of OSD threads (compare the number from this OSD to the rest of the OSDs),
>>>>
>>>> it occurred on all OSDs, and it looked like this:
>>>> http://imgur.com/IIMIyRG
>>>>
>>>> sadly I was on vacation so I didn't manage to catch it earlier ;/ but I'm
>>>> sure there was no config change
>>>>
>>>>>> On 07 Sep 2015, at 13:40, Mariusz Gronczewski <mariusz.gronczewski@xxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> On Mon, 7 Sep 2015 13:02:38 +0200, Jan Schermer <jan@xxxxxxxxxxx> wrote:
>>>>>>
>>>>>>> Apart from a bug causing this, it could be caused by a failure of other
>>>>>>> OSDs (even a temporary one) that starts backfills:
>>>>>>>
>>>>>>> 1) something fails
>>>>>>> 2) some PGs move to this OSD
>>>>>>> 3) this OSD has to allocate memory for all the PGs
>>>>>>> 4) whatever failed comes back up
>>>>>>> 5) the memory is never released
>>>>>>>
>>>>>>> A similar scenario is possible if, for example, someone confuses
>>>>>>> "ceph osd crush reweight" with "ceph osd reweight" (yes, this happened to me :-)).
>>>>>>>
>>>>>>> Did you try just restarting the OSD before you upgraded it?
>>>>>>
>>>>>> stopped, upgraded, started. it helped a bit (<3GB per OSD) but besides
>>>>>> that nothing changed. I've tried waiting till it stops eating CPU and then
>>>>>> restarting it, but it still eats >2GB of memory, which means I can't start
>>>>>> all 4 OSDs at the same time ;/
>>>>>>
>>>>>> I've also set the noin, nobackfill and norecover flags but that didn't help
>>>>>>
>>>>>> it is surprising to me because before, all 4 OSDs together ate less than
>>>>>> 2GB of memory, so I thought I had enough headroom, and we did restart
>>>>>> machines and remove/add OSDs to test that recovery/rebalance goes fine
>>>>>>
>>>>>> it also does not have any external traffic at the moment
>>>>>>
>>>>>>>> On 07 Sep 2015, at 12:58, Mariusz Gronczewski <mariusz.gronczewski@xxxxxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> over the weekend (I was on vacation so I don't know exactly what happened)
>>>>>>>> our OSDs started eating in excess of 6GB of RAM (well, RSS), which was a
>>>>>>>> problem considering that we had only 8GB of RAM for 4 OSDs (about 700
>>>>>>>> PGs per OSD and about 70GB of space used). So a flood of coredumps and
>>>>>>>> OOMs ground the OSDs down to unusability.
>>>>>>>>
>>>>>>>> I then upgraded one of the OSDs to hammer, which made it a bit better
>>>>>>>> (~2GB per OSD), but memory usage is still much higher than before.
>>>>>>>>
>>>>>>>> any ideas what could be the reason for that? the logs are mostly full of
>>>>>>>> OSDs trying to recover and timed-out heartbeats
>
> --
> Mariusz Gronczewski, Administrator
>
> Efigence S. A.
> ul. Wołoska 9a, 02-583 Warszawa
> T: [+48] 22 380 13 13
> F: [+48] 22 380 13 14
> E: mariusz.gronczewski@xxxxxxxxxxxx

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
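
For reference, a minimal sketch of the commands discussed in this thread. The flag
names and the reweight distinction come from the messages above; osd.3 and the
weight values are placeholder examples, not values from the cluster in question:

  # Pause recovery/backfill and keep restarted OSDs from being marked "in"
  ceph osd set noin
  ceph osd set nobackfill
  ceph osd set norecover

  # ...and re-enable once memory usage is back under control
  ceph osd unset noin
  ceph osd unset nobackfill
  ceph osd unset norecover

  # "ceph osd reweight" changes the temporary override weight (0.0-1.0),
  # while "ceph osd crush reweight" changes the CRUSH weight itself;
  # confusing the two triggers data movement, visible in "ceph osd tree"
  ceph osd reweight 3 0.8
  ceph osd crush reweight osd.3 1.0
  ceph osd tree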