The worst thing is that the cluster had been running (under light load, though) for about 6 months now, and I had already flashed new firmware to those cards, which made the problem "disappear" under small loads, so I wasn't even expecting the problem to be there. Sadly the OSDs still eat between 2 and 6 GB of RAM each, but I hope that will stop once recovery finishes.
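(For keeping an eye on that, something as simple as the sketch below does the job. It is only a rough sketch: it assumes the OSD processes are named ceph-osd and that it runs as root so /proc/<pid>/fd is readable, and it just prints the /proc numbers Jan suggests comparing further down the thread.)

#!/usr/bin/env python
# Rough sketch: print RSS, thread count and open-fd count for every ceph-osd
# process on the box, so the numbers can be compared between OSDs and hosts.
# Needs to run as root (or the ceph user) so /proc/<pid>/fd is readable.

import os

def osd_pids():
    """Yield pids whose command name is ceph-osd."""
    for pid in os.listdir('/proc'):
        if not pid.isdigit():
            continue
        try:
            with open('/proc/%s/comm' % pid) as f:
                if f.read().strip() == 'ceph-osd':
                    yield int(pid)
        except IOError:
            continue  # process went away while we were looking

def status_fields(pid):
    """Return the key/value pairs from /proc/<pid>/status as a dict."""
    fields = {}
    with open('/proc/%d/status' % pid) as f:
        for line in f:
            key, _, value = line.partition(':')
            fields[key] = value.strip()
    return fields

if __name__ == '__main__':
    for pid in osd_pids():
        st = status_fields(pid)
        nfds = len(os.listdir('/proc/%d/fd' % pid))
        print('pid %-6d  VmRSS %-12s  threads %-4s  fds %d' % (
            pid, st.get('VmRSS', '?'), st.get('Threads', '?'), nfds))
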
On Tue, 8 Sep 2015 12:31:03 +0200, Jan Schermer <jan@xxxxxxxxxxx> wrote:

> YMMV, same story as with SSD selection.
> Intels have their own problems :-)
>
> Jan
>
> > On 08 Sep 2015, at 12:09, Mariusz Gronczewski <mariusz.gronczewski@xxxxxxxxxxxx> wrote:
> >
> > For those interested:
> >
> > The bug that caused Ceph to go haywire was an Emulex NIC driver dropping
> > packets when pushing more than a few hundred megabits (basically scaling
> > linearly with load), which caused OSDs to flap constantly once something
> > went wrong (high traffic, an OSD goes down, Ceph starts reallocating data,
> > which causes more traffic, more OSDs flap, and so on).
> >
> > Upgrading the kernel to 4.1.6 fixed it (the bug was present at least in
> > 4.0.1 and in the C6 "distro" kernel) and the cluster started to rebuild
> > correctly.
> >
> > Lessons learned: buy Intel NICs...
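(Side note for anyone hitting something similar: drops like this usually show up in the standard kernel counters under /sys/class/net/<iface>/statistics, and ethtool -S <iface> exposes the per-driver counters too, so they are cheap to watch for. Below is a minimal sketch of that kind of check; the interface names are placeholders for the actual cluster/public NICs.)

#!/usr/bin/env python
# Minimal sketch: sample the rx/tx drop and error counters the kernel exposes
# under /sys/class/net/<iface>/statistics and report any that grew.
# Interface names are placeholders - adjust for the actual cluster/public NICs.

import time

IFACES = ['eth0', 'eth1']          # assumption: the cluster and public NICs
COUNTERS = ['rx_dropped', 'tx_dropped', 'rx_errors', 'tx_errors']

def read_counters(iface):
    vals = {}
    for name in COUNTERS:
        with open('/sys/class/net/%s/statistics/%s' % (iface, name)) as f:
            vals[name] = int(f.read())
    return vals

if __name__ == '__main__':
    before = dict((iface, read_counters(iface)) for iface in IFACES)
    time.sleep(10)                 # sample over a 10 second window
    for iface in IFACES:
        after = read_counters(iface)
        for name in COUNTERS:
            delta = after[name] - before[iface][name]
            if delta:
                print('%s: %s grew by %d in 10s' % (iface, name, delta))
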
> > On Mon, 7 Sep 2015 20:51:57 +0800, 池信泽 <xmdxcxz@xxxxxxxxx> wrote:
> >
> >> Yes, there is a bug which can use huge amounts of memory. It is triggered
> >> when an OSD goes down or is added to the cluster and recovery/backfilling
> >> starts.
> >>
> >> The patches https://github.com/ceph/ceph/pull/5656 and
> >> https://github.com/ceph/ceph/pull/5451, merged into master, fix it, and
> >> they will be backported.
> >>
> >> I think Ceph v0.93 or newer may hit this bug.
> >>
> >> 2015-09-07 20:42 GMT+08:00 Shinobu Kinjo <skinjo@xxxxxxxxxx>:
> >>
> >>> How heavy was the network traffic?
> >>>
> >>> Have you tried to capture the traffic between the cluster and public
> >>> networks to see where such a bunch of traffic came from?
> >>>
> >>> Shinobu
> >>>
> >>> ----- Original Message -----
> >>> From: "Jan Schermer" <jan@xxxxxxxxxxx>
> >>> To: "Mariusz Gronczewski" <mariusz.gronczewski@xxxxxxxxxxxx>
> >>> Cc: ceph-users@xxxxxxxxxxxxxx
> >>> Sent: Monday, September 7, 2015 9:17:04 PM
> >>> Subject: Re: Huge memory usage spike in OSD on hammer/giant
> >>>
> >>> Hmm, even network traffic went up.
> >>> Nothing in the logs on the mons from when it started, 9/4 ~6 AM?
> >>>
> >>> Jan
> >>>
> >>>> On 07 Sep 2015, at 14:11, Mariusz Gronczewski <mariusz.gronczewski@xxxxxxxxxxxx> wrote:
> >>>>
> >>>> On Mon, 7 Sep 2015 13:44:55 +0200, Jan Schermer <jan@xxxxxxxxxxx> wrote:
> >>>>
> >>>>> Maybe some configuration change occurred that now takes effect when you
> >>>>> start the OSD?
> >>>>> Not sure what could affect memory usage though - some ulimit values
> >>>>> maybe (stack size), the number of OSD threads (compare the number on this
> >>>>> OSD to the rest of the OSDs), fd cache size. Look in /proc and compare
> >>>>> everything.
> >>>>> Also look at "ceph osd tree" - didn't someone touch it while you were
> >>>>> gone?
> >>>>>
> >>>>> Jan
> >>>>>
> >>>>> number of OSD threads (compare the number from this OSD to the rest of
> >>>>> OSDs),
> >>>>
> >>>> it occurred on all OSDs, and it looked like this:
> >>>> http://imgur.com/IIMIyRG
> >>>>
> >>>> sadly I was on vacation so I didn't manage to catch it earlier ;/ but I'm
> >>>> sure there was no config change
> >>>>
> >>>>>> On 07 Sep 2015, at 13:40, Mariusz Gronczewski <mariusz.gronczewski@xxxxxxxxxxxx> wrote:
> >>>>>>
> >>>>>> On Mon, 7 Sep 2015 13:02:38 +0200, Jan Schermer <jan@xxxxxxxxxxx> wrote:
> >>>>>>
> >>>>>>> Apart from a bug causing this, it could be caused by the failure of other
> >>>>>>> OSDs (even a temporary one) that starts backfills:
> >>>>>>>
> >>>>>>> 1) something fails
> >>>>>>> 2) some PGs move to this OSD
> >>>>>>> 3) this OSD has to allocate memory for all the PGs
> >>>>>>> 4) whatever failed comes back up
> >>>>>>> 5) the memory is never released.
> >>>>>>>
> >>>>>>> A similar scenario is possible if, for example, someone confuses "ceph
> >>>>>>> osd crush reweight" with "ceph osd reweight" (yes, this happened to me :-)).
> >>>>>>>
> >>>>>>> Did you try just restarting the OSD before you upgraded it?
> >>>>>>
> >>>>>> stopped, upgraded, started. It helped a bit (<3 GB per OSD) but besides
> >>>>>> that nothing changed. I've tried waiting until it stops eating CPU and then
> >>>>>> restarting it, but it still eats >2 GB of memory, which means I can't start
> >>>>>> all 4 OSDs at the same time ;/
> >>>>>>
> >>>>>> I've also set the noin, nobackfill and norecover flags but that didn't help
> >>>>>>
> >>>>>> it is surprising to me because before this, all 4 OSDs together ate less
> >>>>>> than 2 GB of memory, so I thought I had enough headroom, and we did restart
> >>>>>> machines and remove/add OSDs to test that recovery/rebalance goes fine
> >>>>>>
> >>>>>> it also does not have any external traffic at the moment
> >>>>>>
> >>>>>>> On 07 Sep 2015, at 12:58, Mariusz Gronczewski <mariusz.gronczewski@xxxxxxxxxxxx> wrote:
> >>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> over the weekend (I was on vacation so I didn't see exactly what happened)
> >>>>>>>> our OSDs started eating in excess of 6 GB of RAM (well, RSS), which was a
> >>>>>>>> problem considering we had only 8 GB of RAM for 4 OSDs (about 700 PGs per
> >>>>>>>> OSD and about 70 GB of space used). So a flood of coredumps and OOMs
> >>>>>>>> ground the OSDs down to unusability.
> >>>>>>>>
> >>>>>>>> I then upgraded one of the OSDs to hammer, which made it a bit better
> >>>>>>>> (~2 GB per OSD) but still much higher usage than before.
> >>>>>>>>
> >>>>>>>> any ideas what could be the reason for that? The logs are mostly full of
> >>>>>>>> OSDs trying to recover and timed-out heartbeats

--
Mariusz Gronczewski, Administrator

Efigence S. A.
ul. Wołoska 9a, 02-583 Warszawa
T: [+48] 22 380 13 13
F: [+48] 22 380 13 14
E: mariusz.gronczewski@xxxxxxxxxxxx
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com