Re: [PATCH] mm: add MM_SWAPENTS and page table when calculate tasksize in lowmem_scan()

2016-02-17 8:35 GMT+08:00 David Rientjes <rientjes@xxxxxxxxxx>:
On Tue, 16 Feb 2016, Greg Kroah-Hartman wrote:

> On Tue, Feb 16, 2016 at 05:37:05PM +0800, Xishi Qiu wrote:
> > Currently tasksize in lowmem_scan() only calculate rss, and not include swap.
> > But usually smart phones enable zram, so swap space actually use ram.
>
> Yes, but does that matter for this type of calculation?  I need an ack
> from the android team before I could ever take such a core change to
> this code...
>

The calculation proposed in this patch is the same as the generic oom
killer, it's an estimate of the amount of memory that will be freed if it
is killed and can exit.  This is better than simply get_mm_rss().
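To make the difference concrete, here is a minimal userspace sketch of the two estimates. The struct and function names are hypothetical stand-ins, not kernel API; in the kernel the values would come from the task's mm counters (get_mm_rss(), the MM_SWAPENTS counter, and the page table counters).

```c
#include <assert.h>

/* Hypothetical stand-in for the per-mm counters; not a kernel type. */
struct mm_counters {
	long rss_pages;      /* resident pages, what get_mm_rss() reports */
	long swap_entries;   /* swapped-out pages (MM_SWAPENTS) */
	long pgtable_pages;  /* pages holding the page tables themselves */
};

/* Existing lowmem_scan() estimate: resident pages only. */
static long tasksize_rss_only(const struct mm_counters *c)
{
	return c->rss_pages;
}

/* Proposed estimate: everything expected to be released when the task exits. */
static long tasksize_full(const struct mm_counters *c)
{
	assert(c->rss_pages >= 0 && c->swap_entries >= 0 &&
	       c->pgtable_pages >= 0);
	return c->rss_pages + c->swap_entries + c->pgtable_pages;
}
```

A task that has been mostly swapped out looks small under the first estimate even though, with zram, its swap entries still consume RAM. Both values remain estimates: a zram-backed swap entry is stored compressed, so it frees less than a full page.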

However, I think we seriously need to re-consider the implementation of
the lowmem killer entirely.  It currently abuses the use of TIF_MEMDIE,
which should ideally only be set for one thread on the system since it
allows unbounded access to global memory reserves.


I don't understand why it needs to wait 1 second:

	if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
	    time_before_eq(jiffies, lowmem_deathpending_timeout)) {
		task_unlock(p);
		rcu_read_unlock();
		return 0;	/* <= why return rather than continue? */
	}

It will also retry and burn a lot of CPU time while one task is holding
TIF_MEMDIE:

	shrink_slab_node()
	    while ()
	        shrinker->scan_objects()
	            lowmem_scan()
	                if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
	                    time_before_eq(jiffies, lowmem_deathpending_timeout))
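To illustrate the question, here is a small userspace model of the scan loop. All names are hypothetical and the selection rule is simplified to "largest tasksize"; it is only meant to show how "return" and "continue" behave differently, not whether skipping a dying task is actually safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a task; the fields are illustrative only. */
struct scan_task {
	bool memdie;    /* models TIF_MEMDIE being set on the task */
	long tasksize;  /* pages expected to be freed if it is killed */
};

/*
 * Returns the index of the chosen victim, or -1 if the scan gives up.
 * deadline_pending models
 *   time_before_eq(jiffies, lowmem_deathpending_timeout).
 * skip_dying == false models the existing "return 0";
 * skip_dying == true models "continue".
 */
static int model_scan(const struct scan_task *t, size_t n,
		      bool deadline_pending, bool skip_dying)
{
	int selected = -1;

	assert(t != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (t[i].memdie && deadline_pending) {
			if (!skip_dying)
				return -1;  /* the whole scan is abandoned */
			continue;           /* only this task is ignored */
		}
		if (selected < 0 || t[i].tasksize > t[selected].tasksize)
			selected = (int)i;
	}
	return selected;
}
```

With one dying task in the list and the deadline still pending, the "return" variant selects nothing, so every call made from the shrinker loop during that window repeats the walk and bails out again. The "continue" variant would pick another victim, which is presumably what the early return is meant to prevent.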

 

It also abuses the user-visible /proc/self/oom_score_adj tunable: this
tunable is used by the generic oom killer to bias or discount a proportion
of memory from a process's usage.  This is the only supported semantic of
the tunable.  The lowmem killer uses it as a strict prioritization, so any
process with oom_score_adj higher than another process is preferred for
kill, REGARDLESS of memory usage.  This leads to priority inversion: the
user is unable to always define the same process to be killed by the
generic oom killer and the lowmem killer.  This is what happens when a
tunable with a very clear and defined purpose is used for other reasons.
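The two interpretations of oom_score_adj can be compared with a simplified userspace model. The names below are hypothetical, and the scoring is a sketch that assumes the generic oom killer shifts a task's usage by adj/1000 of total memory; it leaves out special cases such as unkillable tasks and the lowmem killer's minimum-adj threshold.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a task; not a kernel structure. */
struct adj_task {
	int oom_score_adj;  /* -1000..1000, as written to oom_score_adj */
	long tasksize;      /* estimated pages in use */
};

/* Generic oom killer style: adj biases usage by adj/1000 of total pages. */
static long biased_score(const struct adj_task *t, long totalpages)
{
	return t->tasksize + (long)t->oom_score_adj * (totalpages / 1000);
}

/* Pick the task with the highest biased score. */
static int pick_biased(const struct adj_task *t, size_t n, long totalpages)
{
	assert(n > 0);
	int best = 0;

	for (size_t i = 1; i < n; i++)
		if (biased_score(&t[i], totalpages) >
		    biased_score(&t[best], totalpages))
			best = (int)i;
	return best;
}

/* lowmem killer style: highest adj wins, tasksize only breaks ties. */
static int pick_strict(const struct adj_task *t, size_t n)
{
	assert(n > 0);
	int best = 0;

	for (size_t i = 1; i < n; i++)
		if (t[i].oom_score_adj > t[best].oom_score_adj ||
		    (t[i].oom_score_adj == t[best].oom_score_adj &&
		     t[i].tasksize > t[best].tasksize))
			best = (int)i;
	return best;
}
```

On a system with 1000000 pages, take a task with adj 0 using 500000 pages and a task with adj 100 using 1000 pages. The biased policy picks the large task, while the strict policy picks the small one, so the same tunable values select different victims depending on which killer runs.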

I'd seriously consider not accepting any additional hacks on top of this
code until the implementation is rewritten.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .