Re: OSD memory leaks?

Hi Sam,

Thanks for your answer and sorry for the late reply.

Unfortunately I can't get anything useful out of the profiler. I do
get output, but I suspect it doesn't show what it's supposed to
show... I will keep trying. Anyway, yesterday I started thinking the
problem might be due to overuse of some OSDs. My idea was that the
distribution of primary OSDs might be uneven, which would have
explained why the memory leaks are worse on some servers. In the end
the distribution looks even, but while going through the pg dump I
found something interesting in the scrub column: the timestamps of
the last scrubbing operations match the times shown on the graph.
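
A sketch of one way to pull those timestamps out of the pg dump (the
header detection, the scrub_stamp column name and the tab-separated
layout are assumptions about the plain output format, so adjust for
your version):

ceph pg dump | awk -F'\t' '
  $1 == "pg_stat" { for (i = 1; i <= NF; i++) if ($i ~ /scrub_stamp/) c = i; next }
  c { print $1, $c }'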

After this, I did some arithmetic: I compared the total number of
scrubbing operations per node against the time range in which the
memory leaks occurred. First, here is my setup:

root@c2-ceph-01 ~ # ceph osd tree
dumped osdmap tree epoch 859
# id weight type name up/down reweight
-1 12 pool default
-3 12 rack lc2_rack33
-2 3 host c2-ceph-01
0 1 osd.0 up 1
1 1 osd.1 up 1
2 1 osd.2 up 1
-4 3 host c2-ceph-04
10 1 osd.10 up 1
11 1 osd.11 up 1
9 1 osd.9 up 1
-5 3 host c2-ceph-02
3 1 osd.3 up 1
4 1 osd.4 up 1
5 1 osd.5 up 1
-6 3 host c2-ceph-03
6 1 osd.6 up 1
7 1 osd.7 up 1
8 1 osd.8 up 1


And here are the results (a sketch of one way to count these follows
the list):

* Ceph node 1, which has the largest memory leak, performed 1608
scrubbing operations in total, 1059 of them during the time range
where the memory leaks occurred
* Ceph node 2: 1168 in total, 776 during the time range where the
memory leaks occurred
* Ceph node 3: 940 in total, 94 during the time range where the
memory leaks occurred
* Ceph node 4: 899 in total, 191 during the time range where the
memory leaks occurred
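
Assuming completed scrubs are reported as "scrub ok" lines in the
monitor's central log (/var/log/ceph/ceph.log) tagged with the
primary OSD, something like this gives a per-OSD breakdown:

for i in $(seq 0 11); do
  printf 'osd.%s: ' "$i"
  grep -c "osd\.$i .*scrub ok" /var/log/ceph/ceph.log
done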

I'm still not entirely sure that the scrub operation causes the leak,
but it's the only relevant correlation I've found...

Could it be that the scrubbing process doesn't release memory? By the
way, I was wondering: how does Ceph decide at what time it should run
the scrubbing operation? I know that it runs once a day and is
controlled by the following options:

OPTION(osd_scrub_min_interval, OPT_FLOAT, 300)
OPTION(osd_scrub_max_interval, OPT_FLOAT, 60*60*24)

But how does Ceph determine the time at which the operation starts?
Is it set during cluster creation?
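
As a side note, these can apparently be changed at runtime with
injectargs, without restarting the daemons; e.g. (the values here are
just the defaults restated):

ceph osd tell \* injectargs '--osd-scrub-min-interval 300 --osd-scrub-max-interval 86400'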

I just checked the options that control OSD scrubbing and found that by default:

OPTION(osd_max_scrubs, OPT_INT, 1)

So that might explain why only one OSD uses a lot of memory.
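
For reference, my understanding is that these options map to
ceph.conf as follows (values shown are just the defaults restated):

[osd]
    osd scrub min interval = 300
    osd scrub max interval = 86400
    osd max scrubs = 1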

My dirty workaround at the moment is to check the memory usage of
every OSD and restart any that uses more than 25% of the total
memory. Also note that on ceph nodes 1, 3 and 4 it's always a single
OSD that uses a lot of memory; on ceph node 2 the memory usage is
high as well, but roughly the same across all the OSD processes.
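
A minimal sketch of such a watchdog, assuming the OSD id can be
parsed from the "-i N" argument of each ceph-osd process and that the
stock sysvinit script is used for restarts:

ps -eo pid,pmem,args | grep '[c]eph-osd' | while read pid mem args; do
  # pull the OSD id out of the daemon's "-i N" argument
  id=$(echo "$args" | sed -n 's/.*-i *\([0-9]\+\).*/\1/p')
  # restart any OSD holding more than 25% of total memory
  if [ "$(echo "$mem > 25" | bc)" = 1 ]; then
    service ceph restart "osd.$id"
  fi
done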

Thank you in advance.

--
Regards,
Sébastien Han.


On Wed, Dec 19, 2012 at 10:43 PM, Samuel Just <sam.just@xxxxxxxxxxx> wrote:
>
> Sorry, it's been very busy.  The next step would be to try to get a
> heap dump.  You can start a heap profile on osd N by:
>
> ceph osd tell N heap start_profiler
>
> and you can get it to dump the collected profile using:
>
> ceph osd tell N heap dump
>
> The dumps should show up in the osd log directory.
>
> Assuming the heap profiler is working correctly, you can look at the
> dump using pprof in google-perftools.
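>
> For example, something along these lines (the binary and log paths
> are just typical defaults):
>
> pprof --text /usr/bin/ceph-osd /var/log/ceph/osd.0.profile.0001.heap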
>
> On Wed, Dec 19, 2012 at 8:37 AM, Sébastien Han <han.sebastien@xxxxxxxxx> wrote:
> > No more suggestions? :(
> > --
> > Regards,
> > Sébastien Han.
> >
> >
> > On Tue, Dec 18, 2012 at 6:21 PM, Sébastien Han <han.sebastien@xxxxxxxxx> wrote:
> >> Nothing terrific...
> >>
> >> Kernel logs from my clients are full of "libceph: osd4
> >> 172.20.11.32:6801 socket closed"
> >>
> >> I saw this somewhere on the tracker.
> >>
> >> Is this harmful?
> >>
> >> Thanks.
> >>
> >> --
> >> Regards,
> >> Sébastien Han.
> >>
> >>
> >>
> >> On Mon, Dec 17, 2012 at 11:55 PM, Samuel Just <sam.just@xxxxxxxxxxx> wrote:
> >>>
> >>> What is the workload like?
> >>> -Sam
> >>>
> >>> On Mon, Dec 17, 2012 at 2:41 PM, Sébastien Han <han.sebastien@xxxxxxxxx> wrote:
> >>> > Hi,
> >>> >
> >>> > No, I don't see nothing abnormal in the network stats. I don't see
> >>> > anything in the logs... :(
> >>> > No, I don't see anything abnormal in the network stats, and I
> >>> > don't see anything in the logs... :(
> >>> >
> >>> > --
> >>> > Regards,
> >>> > Sébastien Han.
> >>> >
> >>> >
> >>> > On Mon, Dec 17, 2012 at 11:31 PM, Sébastien Han <han.sebastien@xxxxxxxxx> wrote:
> >>> >>
> >>> >> Hi,
> >>> >>
> >>> >> No, I don't see anything abnormal in the network stats, and I don't see anything in the logs... :(
> >>> >> The weird thing is that one node out of the 4 seems to take way more memory than the others...
> >>> >>
> >>> >> --
> >>> >> Regards,
> >>> >> Sébastien Han.
> >>> >>
> >>> >>
> >>> >>
> >>> >> On Mon, Dec 17, 2012 at 7:12 PM, Samuel Just <sam.just@xxxxxxxxxxx> wrote:
> >>> >>>
> >>> >>> Are you having network hiccups?  There was a bug noticed recently that
> >>> >>> could cause a memory leak if nodes are being marked up and down.
> >>> >>> -Sam
> >>> >>>
> >>> >>> On Mon, Dec 17, 2012 at 12:28 AM, Sébastien Han <han.sebastien@xxxxxxxxx> wrote:
> >>> >>> > Hi guys,
> >>> >>> >
> >>> >>> > Today, looking at my graphs, I noticed that one of my 4 ceph
> >>> >>> > nodes uses a lot of memory, and it keeps growing and growing.
> >>> >>> > See the graph attached to this mail.
> >>> >>> > I run 0.48.2 on Ubuntu 12.04.
> >>> >>> >
> >>> >>> > The other nodes' usage also grows, but more slowly than on the first one.
> >>> >>> >
> >>> >>> > I'm not quite sure what information I need to provide, so
> >>> >>> > let me know. The only thing I can say is that the load hasn't
> >>> >>> > increased that much this week. The daemons seem to consume
> >>> >>> > memory without giving it back.
> >>> >>> >
> >>> >>> > Thank you in advance.
> >>> >>> >
> >>> >>> > --
> >>> >>> > Regards,
> >>> >>> > Sébastien Han.
> >>> >>
> >>> >>

Attachment: ceph-leak-scrub.png
Description: PNG image

