On Wed, Jan 3, 2024 at 2:30 PM Jaroslav Pulchart
<jaroslav.pulchart@xxxxxxxxxxxx> wrote:
>
> >
> > >
> > > Hi Yu,
> > >
> > > On 12/2/2023 5:22 AM, Yu Zhao wrote:
> > > > Charan, does the fix previously attached seem acceptable to you? Any
> > > > additional feedback? Thanks.
> > >
> > > First, thanks for taking this patch to upstream.
> > >
> > > A comment on the code snippet: checking just the 'high wmark' pages
> > > might succeed here but can fail in the immediate kswapd sleep, see
> > > prepare_kswapd_sleep(). This can show up as an increased
> > > KSWAPD_HIGH_WMARK_HIT_QUICKLY, and thus unnecessary kswapd run time.
> > > @Jaroslav: Have you observed something like the above?
> >
> > I do not see any unnecessary kswapd run time; on the contrary, it is
> > fixing the kswapd continuous run issue.
> >
> > >
> > > So, in downstream, we have something like this for zone_watermark_ok():
> > > unsigned long size = wmark_pages(zone, mark) + MIN_LRU_BATCH << 2;
> > >
> > > It is hard to justify this empirical 'MIN_LRU_BATCH << 2' value; maybe
> > > we should at least use 'MIN_LRU_BATCH' with the mentioned reasoning.
> > > That is all I can say for this patch.
> > >
> > > +	mark = sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING ?
> > > +	       WMARK_PROMO : WMARK_HIGH;
> > > +	for (i = 0; i <= sc->reclaim_idx; i++) {
> > > +		struct zone *zone = lruvec_pgdat(lruvec)->node_zones + i;
> > > +		unsigned long size = wmark_pages(zone, mark);
> > > +
> > > +		if (managed_zone(zone) &&
> > > +		    !zone_watermark_ok(zone, sc->order, size, sc->reclaim_idx, 0))
> > > +			return false;
> > > +	}
> > >
> > >
> > > Thanks,
> > > Charan
> >
> >
> >
> > --
> > Jaroslav Pulchart
> > Sr. Principal SW Engineer
> > GoodData
>
>
> Hello,
>
> today we tried to update servers to 6.6.9, which contains the mglru fixes
> (from 6.6.8), and the server behaves much, much worse.
>
> Multiple kswapd* threads went to ~100% CPU immediately:
>     555 root      20   0       0      0      0 R  99.7   0.0   4:32.86 kswapd1
>     554 root      20   0       0      0      0 R  99.3   0.0   3:57.76 kswapd0
>     556 root      20   0       0      0      0 R  97.7   0.0   3:42.27 kswapd2
> Are the changes in upstream different compared to the initial patch
> which I tested?
>
> Best regards,
> Jaroslav Pulchart

Hi Jaroslav,

My apologies for all the trouble!

Yes, there is a slight difference between the fix you verified and
what went into 6.6.9. The fix in 6.6.9 is disabled under a special
condition which I thought wouldn't affect you.

Could you try the attached fix again on top of 6.6.9? It removed that
special condition.

Thanks!

Attachment:
mglru-fix-6.6.9.patch
Description: Binary data
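
For readers following the thread, the per-zone check in the quoted hunk can be modeled in user space. The following is only a sketch under stated assumptions: `toy_zone`, `toy_watermark_ok()`, `toy_zones_balanced()` and `TOY_MAX_ZONES` are hypothetical stand-ins, not kernel APIs, and the watermark test is reduced to a plain comparison that ignores allocation order, lowmem reserves and alloc flags. The `margin` parameter is an illustrative way to express the extra headroom Charan describes; it does not appear in the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's struct zone or zone_watermark_ok(). */
#define TOY_MAX_ZONES 4

struct toy_zone {
	bool managed;             /* models managed_zone(zone) */
	unsigned long free_pages; /* free pages currently in the zone */
	unsigned long wmark_high; /* models wmark_pages(zone, WMARK_HIGH) */
};

/* Simplified watermark test: ignores order, lowmem reserves and alloc flags. */
static bool toy_watermark_ok(const struct toy_zone *z, unsigned long mark)
{
	return z->free_pages > mark;
}

/*
 * Mirrors the shape of the quoted loop: walk zones 0..reclaim_idx and
 * return false as soon as one managed zone is not above mark + margin.
 * Unmanaged zones are skipped. margin == 0 models the plain high-watermark
 * check; a non-zero margin models adding headroom on top of it.
 */
static bool toy_zones_balanced(const struct toy_zone *zones, size_t reclaim_idx,
			       unsigned long margin)
{
	assert(reclaim_idx < TOY_MAX_ZONES);

	for (size_t i = 0; i <= reclaim_idx; i++) {
		const struct toy_zone *z = &zones[i];

		if (z->managed && !toy_watermark_ok(z, z->wmark_high + margin))
			return false;
	}
	return true;
}
```

With zones sitting only slightly above their high watermark, the check passes when `margin` is 0 and fails once a margin is added, which is the gap between the two checks that Charan's comment is about. Separately, note that in C `wmark_pages(zone, mark) + MIN_LRU_BATCH << 2` parses as `(wmark_pages(zone, mark) + MIN_LRU_BATCH) << 2`, because `+` binds tighter than `<<`; the sketch takes the margin as an explicit parameter and makes no assumption about which grouping the downstream code intended.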