Re: What is oom_killer_disable() for?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[CCing Rafael]

On Sun 10-01-16 02:02:32, Tetsuo Handa wrote:
> I wonder what oom_killer_disable() wants to do.
> 
> (1) We need to save a consistent memory snapshot image when suspending,
>     is this correct?

Yes

> (2) To obtain a consistent memory snapshot image, we need to freeze all
>     but current thread in order to avoid modifying on-memory data while
>     saving to disk, is this correct?

Yes, the system has to enter a quiescent state where no further user
space activity can interfere with the PM suspend.
 
> (3) Then, what is the purpose of disabling the OOM killer? Why do we
>     need to disable the OOM killer? Is it because the OOM killer thaws
>     already frozen threads?

Yes. We have to be able to thaw frozen tasks if they are chosen as an
OOM victim otherwise you could hide processes into the fridge and lockup
the system. oom_killer_disable is then needed for pm freezer because the
OOM killer might be invoked even after all the userspace is considered
in the quiescent state. We cannot wake any task anymore during the later
pm freezer processing AFAIU. oom_killer_disable then acts as a hard
barrier after which we know that even OOM killer won't wake any user
space tasks.

> (4) Then, why do we wait for TIF_MEMDIE threads to terminate? We can
>     freeze thawed threads again without waiting for TIF_MEMDIE threads,
>     can't we? Is it because we need free memory for saving to disk?

It is preferable to finish the OOM killer handling before entering the
quiescent state. As only TIF_MEMDIE tasks are thawed we do not wait for
all killed task. If this alone doesn't help to resolve the OOM condition
then we will likely fail later in the process when a memory allocation
is required and ENOMEM terminate the whole process.

> (5) Then, why waiting for only TIF_MEMDIE threads is sufficient? There
>     is no TIF_MEMDIE threads does not guarantee that we have free memory,
>     for there might be !TIF_MEMDIE threads which are still sharing memory
>     used by TIF_MEMDIE threads.

see above. We are trying to be as good as possible but not perfect. The
only hard requirement is that an unexpected OOM victim won't interfere
with a code which doesn't expect userspace to run.

> (6) Since oom_killer_disable() already disabled the OOM killer,
>     !TIF_MEMDIE threads which are sharing memory used by TIF_MEMDIE
>     threads cannot get TIF_MEMDIE by calling out_of_memory().

They shouldn't be running at all because the whole userspace has been
frozen. Only TIF_MEMDIE tasks are running at the time we are disabling
the oom killer and we are waiting for them. Please go and read through
c32b3cbe0d06 ("oom, PM: make OOM detection in the freezer path raceless")

>     Also, since out_of_memory() returns false after oom_killer_disable()
>     disabled the OOM killer, allocation requests by these !TIF_MEMDIE
>     threads start failing. Why do we need to give up with accepting
>     undesirable errors (e.g. failure of syscalls which modify an object's
>     attribute)?

Userspace shouldn't see unexpected errors due to OOM being disabled
because it is not runable at that time.

>     Why don't we abort suspend operation by marking that
>     re-enabling of the OOM killer might caused modification of on-memory
>     data (like patch shown below)? We can make final decision after memory
>     image snapshot is saved to disk, can't we?

I am not sure I am following you here but how do you detect that the
userspace has corrupted your image or accesses an already (half)
suspended device or something similar?

I am not saying disabling the OOM killer is the greatest solution. It
took quite some time to make it work reliably. But I fail to see how can
we guarantee no userspace interference when an OOM victim might be woken
by any allocation deep inside the PM code path. I believe we simply
need a point of no userspace activity to proceed with further PM steps
reasonably.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]