[PATCH] makedumpfile: change the wrong code to calculate bufsize_cyclic for elf dump

vgoyal@xxxxxxxxxx (Vivek Goyal) · Mon, 21 Apr 2014 11:19:14 -0400

On Fri, Apr 18, 2014 at 09:41:33PM +0200, Petr Tesarik wrote:
> On Fri, 18 Apr 2014 22:29:12 +0800
> "bhe at redhat.com" <bhe at redhat.com> wrote:
> 
> > 
> > > >> It definitely will cause OOM. My test machine has 100G of memory, so
> > > >> with the old code its needed_size is 3200K*2 == 6.4M. If only 15M of
> > > >> memory is currently free, free_size will be 15M*0.4, which is 6M, so
> > > >> info->bufsize_cyclic is set to 6M and only 3M is left for other use,
> > > >> e.g. page cache and dynamic allocation. OOM will happen.
> > > >>
> > > >
> > > >BTW, in our case, there's about 30M free memory when we started saving
> > > >dump. It should be caused by my coarse estimation above.
> > > 
> > > Thanks for your description, I understand that situation and
> > > the nature of the problem.
> > > 
> > > That is, the assumption that 20% of free memory is enough for
> > > makedumpfile can be broken if free memory is too small.
> > > > If your machine has 200GB memory, OOM will happen even after fixing
> > > > the over-allocation bug.
> > 
> > Well, we have done some experiments to try to get the statistical memory
> > range which kdump really needs. Then a final reservation will be
> > calculated automatically as (base_value + linear growth of total memory).
> > If a machine has 200GB memory, its reservation will grow too, since
> > apart from the bitmap cost, the other memory costs are almost fixed.
> > 
> > With this scheme things should go well. If memory always runs to the
> > edge of OOM, base_value needs to be adjusted. So a constant value as
> > you said may not be needed.
> > 
> > Instead, I am wondering where the 80% comes from, and why 20% of free
> > memory must be safe.
> 
> I believe these 80% come from the default value of vm.dirty_ratio,

Actually, I had suggested this 80% number when the --cyclic feature was
implemented, and I did not base it on dirty_ratio. It was just a random
suggestion.
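
To make the arithmetic in the quoted 100G example concrete, here is a rough
Python sketch. It is only a model, not makedumpfile code: the function names
are made up for illustration, 4K pages are assumed, and the "buffer is
allocated twice" step is an assumption inferred from the 3M figure in the
quoted example.

```python
# Illustrative model only. All names are hypothetical, not makedumpfile's.
PAGE_SIZE = 4096       # assumed 4 KiB pages
BITS_PER_BYTE = 8
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def bitmap_bytes(total_ram_bytes, page_size=PAGE_SIZE):
    """Bytes needed for one bitmap holding one bit per page frame."""
    pages = -(-total_ram_bytes // page_size)   # ceiling division
    return -(-pages // BITS_PER_BYTE)


def old_bufsize_cyclic(total_ram_bytes, free_bytes, free_percent=40):
    """Buffer size as in the quoted example: min(needed_size, 40% of free)."""
    needed_size = 2 * bitmap_bytes(total_ram_bytes)
    free_size = free_bytes * free_percent // 100   # integer math, no floats
    return min(needed_size, free_size)


def left_after_two_buffers(free_bytes, bufsize):
    """Free memory left if a buffer of `bufsize` is allocated twice (assumed)."""
    return free_bytes - 2 * bufsize
```

With 100G of RAM and 15M free, one bitmap is 3200K, the model returns a 6M
buffer, and two such buffers leave 3M, which matches the numbers quoted above.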

> which is 20%. In other words, the kernel won't block further writes
> until 20% of available RAM is used up by dirty cache. But if you
> fill up all free memory with dirty pages and then touch another (though
> allocated) page, the kernel will go into direct reclaim, and if nothing
> can be written out ATM, it will invoke the OOM Killer.

We can start playing with reducing dirty_ratio too and see how it goes.

Thanks
Vivek