Re: [PATCH mm-unstable v2] mm/page_alloc: keep track of free highatomic

On Sun, Oct 27, 2024 at 2:36 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
>
> On 10/27/24 21:17, Yu Zhao wrote:
> > On Sun, Oct 27, 2024 at 1:53 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
> >>
> >> On 10/26/24 05:36, Yu Zhao wrote:
> >> > OOM kills due to vastly overestimated free highatomic reserves were
> >> > observed:
> >> >
> >> >   ... invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0 ...
> >> >   Node 0 Normal free:1482936kB boost:0kB min:410416kB low:739404kB high:1068392kB reserved_highatomic:1073152KB ...
> >> >   Node 0 Normal: 1292*4kB (ME) 1920*8kB (E) 383*16kB (UE) 220*32kB (ME) 340*64kB (E) 2155*128kB (UE) 3243*256kB (UE) 615*512kB (U) 1*1024kB (M) 0*2048kB 0*4096kB = 1477408kB
> >> >
> >> > The second line above shows that the OOM kill was due to the following
> >> > condition:
> >> >
> >> >   free (1482936kB) - reserved_highatomic (1073152kB) = 409784kB < min (410416kB)
> >> >
> >> > And the third line shows there were no free pages in any
> >> > MIGRATE_HIGHATOMIC pageblocks, which otherwise would show up as type
> >> > 'H'. Therefore __zone_watermark_unusable_free() underestimated the
> >> > usable free memory by over 1GB, which resulted in the unnecessary OOM
> >> > kill above.
> >> >
> >> > The comment in __zone_watermark_unusable_free() warns about the
> >> > potential risk, i.e.,
> >> >
> >> >   If the caller does not have rights to reserves below the min
> >> >   watermark then subtract the high-atomic reserves. This will
> >> >   over-estimate the size of the atomic reserve but it avoids a search.
> >> >
> >> > However, it is possible to keep track of free pages in reserved
> >> > highatomic pageblocks with a new per-zone counter nr_free_highatomic
> >> > protected by the zone lock, to avoid a search when calculating the
> >>
> >> It's only possible to track this reliably since the "mm: page_alloc:
> >> freelist migratetype hygiene" patchset was merged, which explains why
> >> nr_reserved_highatomic was used until now, even if it's imprecise.
> >
> > I just refreshed my memory by quickly going through the discussion
> > around that series and didn't find anything that helps me understand
> > the above. More pointers please?
>
> For example:
>
> - a page is on pcplist in MIGRATE_MOVABLE list
> - we reserve its pageblock as highatomic, which does nothing to the page on
> the pcplist
> - page above is flushed from pcplist to zone freelist, but it remembers it
> was MIGRATE_MOVABLE, merges with another buddy/buddies from the
> now-highatomic list, the resulting order-X page ends up on the movable
> freelist despite being in highatomic pageblock. The counter of free
> highatomic is now wrong wrt the freelist reality

This is the part I don't follow: how is it wrong w.r.t. the freelist
reality? The new nr_free_highatomic should reflect exactly how many
pages are on free_list[MIGRATE_HIGHATOMIC], because it's updated
whenever that list changes.

(My current understanding is that, in this case, the reservation
itself is messed up, i.e., under-reserved.)
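
To make the two readings concrete, here is a toy userspace model of the
scenario quoted above. It is only a sketch, not kernel code: struct
toy_zone, toy_list_add(), toy_list_del() and run_scenario() are all
hypothetical names, and the model assumes the counter is adjusted
whenever pages enter or leave the highatomic free list. It tracks one
already-free buddy page plus the page that is flushed from the pcplist
with a stale MOVABLE type.

```c
#include <assert.h>

enum toy_migratetype { MT_MOVABLE, MT_HIGHATOMIC, MT_TYPES };

/* Hypothetical, heavily simplified stand-in for struct zone. */
struct toy_zone {
	long on_list[MT_TYPES];        /* free pages on each free list */
	long nr_free_highatomic;       /* counter modelled after the patch */
	long free_in_highatomic_block; /* free pages physically inside the
					  reserved pageblock */
};

/* Assumption: the counter follows the free list, not the pageblock. */
static void toy_list_add(struct toy_zone *z, enum toy_migratetype mt, long n)
{
	z->on_list[mt] += n;
	if (mt == MT_HIGHATOMIC)
		z->nr_free_highatomic += n;
}

static void toy_list_del(struct toy_zone *z, enum toy_migratetype mt, long n)
{
	assert(z->on_list[mt] >= n);
	z->on_list[mt] -= n;
	if (mt == MT_HIGHATOMIC)
		z->nr_free_highatomic -= n;
}

struct toy_zone run_scenario(void)
{
	struct toy_zone z = { { 0, 0 }, 0, 0 };
	/* Page A sits on a pcplist and remembers it was MOVABLE. */
	enum toy_migratetype cached_mt_of_a = MT_MOVABLE;

	/* Its buddy B is already free on the movable free list. */
	toy_list_add(&z, MT_MOVABLE, 1);

	/* The pageblock is reserved: B moves to the highatomic list,
	 * A on the pcplist is not touched. */
	toy_list_del(&z, MT_MOVABLE, 1);
	toy_list_add(&z, MT_HIGHATOMIC, 1);
	z.free_in_highatomic_block = 1;

	/* A is flushed with its stale type and merges with B: B leaves
	 * the highatomic list, the merged order-1 page (2 pages) lands
	 * on the movable list. */
	toy_list_del(&z, MT_HIGHATOMIC, 1);
	toy_list_add(&z, cached_mt_of_a, 2);
	z.free_in_highatomic_block = 2;

	return z;
}
```

In this model the counter ends at 0 and so does the highatomic free
list, i.e., the two agree; what no longer agrees is the list versus the
two free pages sitting inside the reserved pageblock.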

> The series has addressed various scenarios like that, where page can end up
> on the wrong freelist.
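
For reference, the arithmetic from the OOM report quoted at the top can
be checked with a few lines of C. This is a sketch only: toy_usable_kb()
and toy_below_min() are hypothetical helpers that mimic the
"free - unusable < min" comparison from the commit message, not the
kernel's watermark code, and the numbers are copied from the report.

```c
#include <stdbool.h>

/* Usable free memory after subtracting the part treated as unusable. */
static long toy_usable_kb(long free_kb, long unusable_kb)
{
	return free_kb - unusable_kb;
}

/* True if the zone would fail the min watermark check. */
static bool toy_below_min(long free_kb, long unusable_kb, long min_kb)
{
	return toy_usable_kb(free_kb, unusable_kb) < min_kb;
}

/* Values from the "Node 0 Normal" line of the report, in kB. */
static const long report_free_kb = 1482936;
static const long report_reserved_highatomic_kb = 1073152;
static const long report_min_kb = 410416;

/* The report lists no type 'H' free pages, so a counter of free
 * highatomic pages would have read 0 here (an inference from the
 * report, not a logged value). */
static const long report_free_highatomic_kb = 0;
```

Subtracting the whole reservation leaves 409784kB, which is below min;
subtracting only the free highatomic pages leaves all 1482936kB, which
is well above it.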




