On Wed, Dec 18, 2013 at 4:54 PM, Alex Rousskov
<rousskov@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
> On 12/16/2013 10:24 PM, Nathan Hoad wrote:
>
>> While running under this configuration, I've confirmed that memory
>> usage does go up when active, and stays at that level when inactive,
>> allowing some time for timeouts and whatnot. I'm currently switching
>> between the two instances every fifteen minutes.
>>
>> Here is a link to the memory graph for the entire running time of the
>> second process, at 1 minute intervals:
>> http://getoffmalawn.com/static/mem-graph.png. The graph shows memory
>> use steadily increasing during activity, but remaining reasonably
>> stable during inactivity.
>
> I agree that this looks like a memory leak, but (in general) it could
> also be some kind of memory pooling or cache entry accumulation.
>
>
>> Where shall we go from here?
>
>
> I recommend the following next steps:
>
> 1. Set "memory_pools off".
>
> 2. Disable all caching with "cache deny all".
>
> Do you see a similar memory growth pattern after the above two steps?

I do see a similar pattern, although the growth is slower. That makes
sense given the directives I've added, so it would appear that the
growth is not related to pooling or caching. Memory usage still reaches
a point where I have to kill everything to prevent the system being
OOM'd. I'm happy to go in the other direction and raise the size of the
memory pools, if that would be useful.

>
> * If yes: Time for valgrind or ALL,9 debugging. I can help you make that
> choice if needed. You can actually do those things now, without doing
> steps 1 and 2 first, but valgrind and log analysis take time so if we
> can avoid it by eliminating false positives and/or simplifying the
> setup, we should do that first...

I have an ALL,9 log, but I am hesitant to unleash it on anyone, as it is
a 20 GB file covering the whole run from start to stop. If there is
interest, I can still upload it; it compresses down to 1.7 GB.
During periods of inactivity I did reduce the logging to ALL,8, to try
to lessen the burden a little. Alternatively, I'm happy to bump the
debugging at specific times, e.g. at shutdown, or to raise or lower
specific levels.

Running valgrind produces repeated, spurious errors: it claims that the
debugs() macro has a mismatched free() / delete / delete [] on each
call, which naturally gets a little noisy. If this is a known issue and
not too much of a problem, I'm happy to keep running it, though. What
parameters should I run it with? I am not too experienced with Valgrind.

Thanks,

Nathan.

>
> * If no: Try re-enabling caching, but using smaller memory [and disk?]
> cache sizes so that a cache gets and stays _full_ way before you run out
> of RAM. This will eliminate cache index growth as a suspect. If your
> disk cache is full already or uses Rock store, then this applies to
> memory cache only. Do you see a similar memory growth pattern after
> re-enabling caching? Are your caches full?
>
>
> HTH,
>
> Alex.
>