Re: [PATCH 1/1] mm/oom_kill: trigger the oom killer if oom occurs without __GFP_FS

On 5/8/23 05:07, Phillip Lougher wrote:

On 03/05/2023 20:10, Phillip Lougher wrote:

On 03/05/2023 12:49, Hui Wang wrote:

On 4/29/23 03:53, Michal Hocko wrote:
On Thu 27-04-23 11:47:10, Hui Wang wrote:
[...]
So Michal,

I don't know whether you read "[PATCH 0/1] mm/oom_kill: system enters a state something like hang when running stress-ng". Do you know why out_of_memory() returns immediately if __GFP_FS is not set? Could we drop these lines
directly:

     /*
      * The OOM killer does not compensate for IO-less reclaim.
      * pagefault_out_of_memory lost its gfp context so we have to
      * make sure exclude 0 mask - all other users should have at least
      * ___GFP_DIRECT_RECLAIM to get here. But mem_cgroup_oom() has to
      * invoke the OOM killer even if it is a GFP_NOFS allocation.
      */
     if (oc->gfp_mask && !(oc->gfp_mask & __GFP_FS) && !is_memcg_oom(oc))
         return true;

The comment is rather hard to grasp without an intimate knowledge of the
memory reclaim. The primary reason is that the allocation context
without __GFP_FS (and also __GFP_IO) cannot perform a full memory
reclaim because fs or the storage subsystem might be holding locks
required for the memory reclaim. This means that a large amount of
reclaimable memory is out of sight of the specific direct reclaim
context. If we allowed oom killer to trigger we could invoke the oom
killer while there is a lot of otherwise reclaimable memory. As you can
imagine not something many users would appreciate as the oom kill is a
very disruptive operation. In this case we rely on kswapd or other
GFP_KERNEL-like allocation contexts to make forward progress instead. If there
is really nothing reclaimable then the oom killer would eventually hit from
elsewhere.

HTH
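
For readers following the thread, here is a minimal user-space sketch of the early-return check discussed above. It is not kernel code: the `SKETCH_*` flag values, `struct sketch_oom_control`, and `sketch_oom_skipped()` are hypothetical stand-ins invented for illustration, and the masks are simplified compared with the real GFP definitions. It only models which allocation contexts would bail out before the OOM killer runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified flag values; the real GFP bits differ. */
#define SKETCH_GFP_IO             0x1u
#define SKETCH_GFP_FS             0x2u
#define SKETCH_GFP_DIRECT_RECLAIM 0x4u

/* Simplified analogues of GFP_KERNEL and GFP_NOFS (assumption). */
#define SKETCH_GFP_KERNEL \
	(SKETCH_GFP_DIRECT_RECLAIM | SKETCH_GFP_IO | SKETCH_GFP_FS)
#define SKETCH_GFP_NOFS \
	(SKETCH_GFP_DIRECT_RECLAIM | SKETCH_GFP_IO)

/* Hypothetical stand-in for the fields of struct oom_control used here. */
struct sketch_oom_control {
	unsigned int gfp_mask; /* 0 models the page-fault path with no gfp context */
	bool memcg_oom;        /* models is_memcg_oom(oc) */
};

/*
 * Returns true when the modelled out_of_memory() would return early
 * without killing anything: a non-zero mask that lacks the FS bit,
 * outside of a memcg OOM.
 */
bool sketch_oom_skipped(const struct sketch_oom_control *oc)
{
	return oc->gfp_mask &&
	       !(oc->gfp_mask & SKETCH_GFP_FS) &&
	       !oc->memcg_oom;
}
```

In this model a `SKETCH_GFP_NOFS` request skips the OOM killer, while a `SKETCH_GFP_KERNEL` request, a zero mask, or a memcg OOM does not, which matches the reasoning that a GFP_KERNEL-like context has to make the final decision.
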
Hi Michal,

Understood. Thanks for the explanation. So we can't remove those 2 lines of code.

In my patch, a kthread allocates a page with GFP_KERNEL. That could trigger reclaim and, if nothing is reclaimable, trigger the oom killer. Do you think this is a safe workaround for the issue we are currently facing?


And Hi Phillip,

What is your opinion on it? Do you have a direction for solving this issue from the filesystem side?


The following patch creates the concept of "squashfs contexts", which moves all memory dynamically allocated (in a readahead/read_page path) into a single structure which can be allocated and deleted once.  It then creates a pool of these at filesystem mount time.  Threads entering readahead/read_page will take a context from the pool, and will then perform no dynamic memory allocation.

The final patch-series will make this a non-default build option for systems that need this.

Phillip
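
The patch itself is not reproduced in this message, so the following is only a user-space sketch of the general idea described above: allocate every context once at "mount" time, then let readers borrow and return contexts without allocating. All names (`struct sketch_ctx`, `struct sketch_pool`, the `sketch_pool_*` functions) are hypothetical and are not taken from the squashfs patch. Returning NULL from an empty pool is also an assumption made to keep the sketch short; a real implementation would more likely make the caller wait.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bundle of everything a read path would otherwise allocate. */
struct sketch_ctx {
	void *scratch;           /* e.g. a decompression buffer */
	struct sketch_ctx *next; /* free-list link while idle in the pool */
};

struct sketch_pool {
	atomic_flag lock;        /* tiny spinlock guarding the free list */
	struct sketch_ctx *free; /* idle contexts */
};

static void pool_lock(struct sketch_pool *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		; /* spin until the holder clears the flag */
}

static void pool_unlock(struct sketch_pool *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* "Unmount": release every idle context. Callers must have returned theirs. */
void sketch_pool_destroy(struct sketch_pool *p)
{
	while (p->free) {
		struct sketch_ctx *c = p->free;
		p->free = c->next;
		free(c->scratch);
		free(c);
	}
}

/* "Mount": the only place that allocates. Returns 0 on success, -1 on failure. */
int sketch_pool_init(struct sketch_pool *p, size_t count, size_t scratch_size)
{
	atomic_flag_clear(&p->lock);
	p->free = NULL;
	for (size_t i = 0; i < count; i++) {
		struct sketch_ctx *c = malloc(sizeof(*c));
		if (!c)
			goto fail;
		c->scratch = malloc(scratch_size);
		if (!c->scratch) {
			free(c);
			goto fail;
		}
		c->next = p->free;
		p->free = c;
	}
	return 0;
fail:
	sketch_pool_destroy(p);
	return -1;
}

/* Borrow a context without allocating; NULL means the pool is empty. */
struct sketch_ctx *sketch_pool_get(struct sketch_pool *p)
{
	pool_lock(p);
	struct sketch_ctx *c = p->free;
	if (c)
		p->free = c->next;
	pool_unlock(p);
	return c;
}

/* Return a borrowed context to the pool. */
void sketch_pool_put(struct sketch_pool *p, struct sketch_ctx *c)
{
	assert(c != NULL);
	pool_lock(p);
	c->next = p->free;
	p->free = c;
	pool_unlock(p);
}
```

A reader would call `sketch_pool_get()` on entry, use `scratch` for its work, and call `sketch_pool_put()` on exit, so no allocation (and therefore no reclaim) happens between the two calls.
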



An updated version of the patch.

Hi Phillip,

The patch fixes the issue.

I built and tested linux-next 20230505 on my arm64 board. Without your patch, the system hangs when I run "$ stress-ng --bigheap 2 --sequential 0 --timeout 30s --skip-silent --verbose" or "stress-ng --sequential 0 --class os --timeout 30 --skip-silent --verbose". After applying your patch, I repeated the bigheap test case 60 times and the hang did not occur. I also ran "--class os", "--class vm" and "--class scheduler", and all passed.

Thanks,

Hui.






