Re: Huge memory usage spike in OSD on hammer/giant

Shinobu Kinjo <skinjo@xxxxxxxxxx> · Tue, 8 Sep 2015 08:21:14 -0400 (EDT)

> eat between 2 and 6 GB RAM

That is quite a big difference, I think.

----- Original Message -----
From: "Mariusz Gronczewski" <mariusz.gronczewski@xxxxxxxxxxxx>
To: "Jan Schermer" <jan@xxxxxxxxxxx>
Cc: ceph-users@xxxxxxxxxxxxxx
Sent: Tuesday, September 8, 2015 8:17:43 PM
Subject: Re:  Huge memory usage spike in OSD on hammer/giant

The worst thing is that the cluster had been running (under light load, though)
for about 6 months, and I had already flashed firmware to those cards, which made
the problem "disappear" for small loads, so I wasn't even expecting a problem
in that place. Sadly, the OSDs still eat between 2 and 6 GB RAM each, but I
hope that will stop once recovery finishes.
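
To put those numbers next to the host size mentioned earlier in the thread (8 GB of RAM for 4 OSDs), here is a rough back-of-the-envelope sketch. The function names are made up for illustration, and it ignores whatever the kernel and page cache need on top.

```python
# Rough memory budget for one OSD host.
# Figures are the ones quoted in this thread: 8 GB RAM, 4 OSDs,
# 2 to 6 GB RSS per OSD during recovery. Function names are illustrative.

def total_rss_gb(osd_count, rss_per_osd_gb):
    """Total resident memory if every OSD uses rss_per_osd_gb."""
    return osd_count * rss_per_osd_gb


def shortfall_gb(host_ram_gb, osd_count, rss_per_osd_gb):
    """How many GB are missing on the host (0 if everything fits)."""
    return max(0, total_rss_gb(osd_count, rss_per_osd_gb) - host_ram_gb)


HOST_RAM_GB = 8
OSD_COUNT = 4

for rss in (2, 6):
    total = total_rss_gb(OSD_COUNT, rss)
    missing = shortfall_gb(HOST_RAM_GB, OSD_COUNT, rss)
    print(f"{rss} GB per OSD: {total} GB total, {missing} GB short")
```

Even at the low end, four OSDs at 2 GB each already use the whole 8 GB, which is consistent with not being able to start all 4 OSDs at the same time.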

On Tue, 8 Sep 2015 12:31:03 +0200, Jan Schermer
<jan@xxxxxxxxxxx> wrote:

> YMMV, same story as with SSD selection.
> Intels have their own problems :-)
> 
> Jan
> 
> > On 08 Sep 2015, at 12:09, Mariusz Gronczewski <mariusz.gronczewski@xxxxxxxxxxxx> wrote:
> > 
> > For those interested:
> > 
> > The bug that caused Ceph to go haywire was an Emulex NIC driver dropping
> > packets once traffic exceeded a few hundred megabits (the drops grew
> > basically linearly with load), which caused OSDs to flap constantly once
> > something went wrong (high traffic, OSDs go down, Ceph starts
> > reallocating data, which causes more traffic, more OSDs flap, etc.).
> > 
> > Upgrading the kernel to 4.1.6 fixed that (the bug was present at least in
> > 4.0.1 and in the c6 "distro" kernel), and the cluster started to rebuild correctly.
> > 
> > Lesson learned: buy Intel NICs...
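
For anyone who wants to watch for this kind of packet loss on their own nodes, a minimal sketch that parses the per-interface drop counters from /proc/net/dev on Linux. The sample text and its numbers are made up for illustration, and the thread does not say whether the Emulex driver's drops were visible in these particular counters (driver-specific statistics may live elsewhere), so treat this as a first check only.

```python
# Parse drop counters from /proc/net/dev (Linux) and compare two snapshots.
# SAMPLE is illustrative data, not output from the cluster in this thread.

SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 5000000   40000    0  120    0     0          0         0  3000000   30000    0    5    0     0       0          0
"""


def parse_net_dev(text):
    """Return {interface: {rx_packets, rx_dropped, tx_packets, tx_dropped}}."""
    stats = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:  # 8 receive columns + 8 transmit columns
            continue
        stats[name.strip()] = {
            "rx_packets": int(fields[1]),
            "rx_dropped": int(fields[3]),
            "tx_packets": int(fields[9]),
            "tx_dropped": int(fields[11]),
        }
    return stats


def drop_increase(before, after):
    """Per-interface growth of rx+tx drops between two snapshots."""
    result = {}
    for name, new in after.items():
        if name in before:
            old = before[name]
            result[name] = (new["rx_dropped"] + new["tx_dropped"]) - (
                old["rx_dropped"] + old["tx_dropped"]
            )
    return result


if __name__ == "__main__":
    # On a real node, read the live counters instead of SAMPLE.
    with open("/proc/net/dev") as fh:
        for iface, counters in parse_net_dev(fh.read()).items():
            print(iface, counters)
```

Taking one snapshot under light load and another under heavy recovery traffic, then calling drop_increase(), shows whether drops grow with load.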
> > 
> > On Mon, 7 Sep 2015 20:51:57 +0800, 池信泽 <xmdxcxz@xxxxxxxxx> wrote:
> > 
> >> Yes, there is a bug that causes huge memory usage. It is triggered when an OSD
> >> goes down or is added to the cluster and recovery/backfilling starts.
> >> 
> >> The patches https://github.com/ceph/ceph/pull/5656 and
> >> https://github.com/ceph/ceph/pull/5451, merged into master, should fix it, and
> >> they will be backported.
> >> 
> >> I think ceph v0.93 or newer versions may hit this bug.
> >> 
> >> 2015-09-07 20:42 GMT+08:00 Shinobu Kinjo <skinjo@xxxxxxxxxx>:
> >> 
> >>> How heavy was the network traffic?
> >>> 
> >>> Have you tried capturing the traffic between the cluster and public networks
> >>> to see where so much traffic came from?
> >>> 
> >>> Shinobu
> >>> 
> >>> ----- Original Message -----
> >>> From: "Jan Schermer" <jan@xxxxxxxxxxx>
> >>> To: "Mariusz Gronczewski" <mariusz.gronczewski@xxxxxxxxxxxx>
> >>> Cc: ceph-users@xxxxxxxxxxxxxx
> >>> Sent: Monday, September 7, 2015 9:17:04 PM
> >>> Subject: Re:  Huge memory usage spike in OSD on hammer/giant
> >>> 
> >>> Hmm, even network traffic went up.
> >>> Nothing in logs on the mons which started 9/4 ~6 AM?
> >>> 
> >>> Jan
> >>> 
> >>>> On 07 Sep 2015, at 14:11, Mariusz Gronczewski <mariusz.gronczewski@xxxxxxxxxxxx> wrote:
> >>>> 
> >>>> On Mon, 7 Sep 2015 13:44:55 +0200, Jan Schermer <jan@xxxxxxxxxxx> wrote:
> >>>> 
> >>>>> Maybe some configuration change occurred that now takes effect when you
> >>>>> start the OSD?
> >>>>> Not sure what could affect memory usage though - some ulimit values
> >>>>> maybe (stack size), number of OSD threads (compare the number from this OSD
> >>>>> to the rest of OSDs), fd cache size. Look in /proc and compare everything.
> >>>>> Also look in "ceph osd tree" - didn't someone touch it while you were
> >>>>> gone?
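
A minimal sketch of the "look in /proc and compare" suggestion: it reads the Threads and VmRSS fields from /proc/<pid>/status for every process named ceph-osd, so the OSDs on a host can be compared side by side. The process name "ceph-osd" and the sample values are assumptions for illustration; the fields themselves are standard Linux /proc fields.

```python
# Compare thread count and resident memory of OSD processes via /proc.
# Assumes Linux and that the OSD daemons show up with Name "ceph-osd".
import os

# Illustrative excerpt of a /proc/<pid>/status file (values are made up).
SAMPLE_STATUS = "Name:\tceph-osd\nState:\tS (sleeping)\nVmRSS:\t 2097152 kB\nThreads:\t412\n"


def parse_status(text):
    """Extract process name, thread count and RSS (kB) from a status file."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()
    return {
        "name": fields.get("Name", ""),
        "threads": int(fields.get("Threads", "0")),
        # Kernel threads have no VmRSS line, so default to 0 kB.
        "rss_kb": int(fields.get("VmRSS", "0 kB").split()[0]),
    }


def osd_processes(proc_root="/proc", name="ceph-osd"):
    """Return {pid: parsed status} for all processes with the given name."""
    found = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                info = parse_status(fh.read())
        except OSError:
            continue  # the process exited while we were scanning
        if info["name"] == name:
            found[int(entry)] = info
    return found


if __name__ == "__main__":
    for pid, info in sorted(osd_processes().items()):
        print(pid, info["threads"], "threads,", info["rss_kb"] // 1024, "MB RSS")
```

An OSD whose thread count or RSS stands out from the others on the same host is the one worth looking at first.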
> >>>>> 
> >>>>> Jan
> >>>>> 
> >>>> 
> >>>>> number of OSD threads (compare the number from this OSD to the rest of
> >>>>> OSDs),
> >>>> 
> >>>> It occurred on all OSDs, and it looked like this:
> >>>> http://imgur.com/IIMIyRG
> >>>> 
> >>>> Sadly, I was on vacation, so I didn't manage to catch it earlier ;/ but I'm
> >>>> sure there was no config change.
> >>>> 
> >>>> 
> >>>>>> On 07 Sep 2015, at 13:40, Mariusz Gronczewski <mariusz.gronczewski@xxxxxxxxxxxx> wrote:
> >>>>>> 
> >>>>>> On Mon, 7 Sep 2015 13:02:38 +0200, Jan Schermer <jan@xxxxxxxxxxx> wrote:
> >>>>>> 
> >>>>>>> Apart from a bug causing this, it could be caused by the failure of other
> >>>>>>> OSDs (even a temporary one) that starts backfills.
> >>>>>>> 
> >>>>>>> 1) something fails
> >>>>>>> 2) some PGs move to this OSD
> >>>>>>> 3) this OSD has to allocate memory for all the PGs
> >>>>>>> 4) whatever fails gets back up
> >>>>>>> 5) the memory is never released.
> >>>>>>> 
> >>>>>>> A similar scenario is possible if, for example, someone confuses "ceph
> >>>>>>> osd crush reweight" with "ceph osd reweight" (yes, this happened to me :-)).
> >>>>>>> 
> >>>>>>> Did you try just restarting the OSD before you upgraded it?
> >>>>>> 
> >>>>>> Stopped, upgraded, started. It helped a bit (<3GB per OSD), but besides
> >>>>>> that nothing changed. I've tried waiting until it stops eating CPU and then
> >>>>>> restarting it, but it still eats >2GB of memory, which means I can't start
> >>>>>> all 4 OSDs at the same time ;/
> >>>>>> 
> >>>>>> I've also added the noin,nobackfill,norecover flags, but that didn't help.
> >>>>>> 
> >>>>>> It is surprising to me because before, all 4 OSDs together ate less than
> >>>>>> 2GB of memory, so I thought I had enough headroom, and we did restart
> >>>>>> machines and remove/add OSDs to test whether recovery/rebalance goes fine.
> >>>>>> 
> >>>>>> It also does not have any external traffic at the moment.
> >>>>>> 
> >>>>>> 
> >>>>>>>> On 07 Sep 2015, at 12:58, Mariusz Gronczewski <mariusz.gronczewski@xxxxxxxxxxxx> wrote:
> >>>>>>>> 
> >>>>>>>> Hi,
> >>>>>>>> 
> >>>>>>>> Over the weekend (I was on vacation, so I don't know exactly what happened)
> >>>>>>>> our OSDs started eating in excess of 6GB of RAM (well, RSS), which was a
> >>>>>>>> problem considering that we had only 8GB of RAM for 4 OSDs (about 700
> >>>>>>>> PGs per OSD and about 70GB of space used). The resulting spam of coredumps
> >>>>>>>> and OOMs kept the OSDs down to the point of unusability.
> >>>>>>>> 
> >>>>>>>> I then upgraded one of the OSDs to hammer, which made it a bit better (~2GB
> >>>>>>>> per OSD), but usage is still much higher than before.
> >>>>>>>> 
> >>>>>>>> Any ideas what could be the reason for that? The logs are mostly full of
> >>>>>>>> OSDs trying to recover and timed-out heartbeats.
> >>>>>>>> 
> >>>>>>>> --
> >>>>>>>> Mariusz Gronczewski, Administrator
> >>>>>>>> 
> >>>>>>>> Efigence S. A.
> >>>>>>>> ul. Wołoska 9a, 02-583 Warszawa
> >>>>>>>> T: [+48] 22 380 13 13
> >>>>>>>> F: [+48] 22 380 13 14
> >>>>>>>> E: mariusz.gronczewski@xxxxxxxxxxxx
> >>>>>>>> <mailto:mariusz.gronczewski@xxxxxxxxxxxx>
> >>>>>>>> _______________________________________________
> >>>>>>>> ceph-users mailing list
> >>>>>>>> ceph-users@xxxxxxxxxxxxxx
> >>>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>>>>>> 
> >>>>>> 
> >>>>>> 
> >>>>>> 
> >>>>>> --
> >>>>>> Mariusz Gronczewski, Administrator
> >>>>>> 
> >>>>>> Efigence S. A.
> >>>>>> ul. Wołoska 9a, 02-583 Warszawa
> >>>>>> T: [+48] 22 380 13 13
> >>>>>> F: [+48] 22 380 13 14
> >>>>>> E: mariusz.gronczewski@xxxxxxxxxxxx
> >>>>>> <mailto:mariusz.gronczewski@xxxxxxxxxxxx>
> >>>>> 
> >>>> 
> >>>> 
> >>>> 
> >>>> --
> >>>> Mariusz Gronczewski, Administrator
> >>>> 
> >>>> Efigence S. A.
> >>>> ul. Wołoska 9a, 02-583 Warszawa
> >>>> T: [+48] 22 380 13 13
> >>>> F: [+48] 22 380 13 14
> >>>> E: mariusz.gronczewski@xxxxxxxxxxxx
> >>>> <mailto:mariusz.gronczewski@xxxxxxxxxxxx>
> >>> 
> >>> _______________________________________________
> >>> ceph-users mailing list
> >>> ceph-users@xxxxxxxxxxxxxx
> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>> 
> >> 
> >> 
> >> 
> > 
> > 
> > 
> > -- 
> > Mariusz Gronczewski, Administrator
> > 
> > Efigence S. A.
> > ul. Wołoska 9a, 02-583 Warszawa
> > T: [+48] 22 380 13 13
> > F: [+48] 22 380 13 14
> > E: mariusz.gronczewski@xxxxxxxxxxxx
> > <mailto:mariusz.gronczewski@xxxxxxxxxxxx>
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@xxxxxxxxxxxxxx
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

-- 
Mariusz Gronczewski, Administrator

Efigence S. A.
ul. Wołoska 9a, 02-583 Warszawa
T: [+48] 22 380 13 13
F: [+48] 22 380 13 14
E: mariusz.gronczewski@xxxxxxxxxxxx
<mailto:mariusz.gronczewski@xxxxxxxxxxxx>

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com