On Wed 26-02-25 23:57:48, Baoquan He wrote: > On 02/26/25 at 01:01pm, Michal Hocko wrote: > > On Wed 26-02-25 19:51:12, Baoquan He wrote: > > > On 02/26/25 at 12:00pm, Michal Hocko wrote: > > > > On Wed 26-02-25 11:52:41, Michal Hocko wrote: > > > > > On Wed 26-02-25 18:00:26, Baoquan He wrote: > > > > > > On 02/26/25 at 07:54am, Michal Hocko wrote: > > > > > [...] > > > > > > > In any case 96a5c186efff seems incorrect because it assumes that the > > > > > > > protection has anything to do with how higher zone is populated while > > > > > > > the protection fundamentaly protects lower zone from higher zones > > > > > > > allocation. Those allocations are independent on the actual memory in > > > > > > > that zone. > > > > > > > > > > > > The protection value was introduced in non-NUMA time, and later adapted > > > > > > to NUMA system. While it still only reflects each zone with other zones > > > > > > within one specific node. We may need take this opportunity to > > > > > > reconsider it, e.g in the FALLBACK zonelists case it needs take crossing > > > > > > nodes into account. > > > > > > > > > > Are you suggesting zone fallback list to interleave nodes? I.e. > > > > > numa_zonelist_order we used to have in the past and that has been > > > > > removed by c9bff3eebc09 ("mm, page_alloc: rip out ZONELIST_ORDER_ZONE"). > > > > > > Hmm, if Gabriel can provide detailed node/zone information of the > > > system, we can check if there's anything we can do to adjust > > > zone->lowmem_reserve[] to reflect its real usage and semantics. > > > > Why do you think anything needs to be adjusted? > > No, I don't think like that. But I am wondering what makes you get > the conclusion. > > > > > > I haven't thought of the whole zone fallback list to interleave nodes > > > which invovles a lot of change. > > > > > > > > > > > Btw. has 96a5c186efff tried to fix any actual runtime problem? The > > > > changelog doesn't say much about that. > > > > > > No, no actual problem was observed on tht. > > > > OK > > > > > I was just trying to make > > > clear the semantics because I was confused by its obscure value printing > > > of zone->lowmem_reserve[] in /proc/zoneinfo. > > > > > > I think we can merge this reverting firstly, then to investigate how to > > > better clarify it. > > > > What do you think needs to be clarify? How exactly is the original > > output confusing? > > When I did the change, I wrote the reason in commit log. I don't think > you care to read it from your talking. Let me explain again, in > setup_per_zone_lowmem_reserve(), each zone's protection value is > calculated based on its own node's zones. E.g below on node 0, its > Movable zone and Device zone are empty but still show influence on > Normal/DMA32/DMA zone, this is unreasonable from the protection value > calculating code and its showing. You said that in the commit message without explanation why. Also I claim this is just wrong because the zone's protection is independent on the size of the zone that it is protected from. I have explained why I believe but let me reiterate. ZONE_DMA32 should be protected from GFP_MOVABLE even if the zone movable is empty (same as if it had a single or many pages). Why? Because, the lowmem reserve protects low memory allocation requests. See my point? Is that reasoning clear? P.S. I think we can have a more productive discussion without accusations. -- Michal Hocko SUSE Labs