On 9/3/19 5:28 PM, Michal Hocko wrote:
> On Tue 03-09-19 16:54:16, Honglei Wang wrote:
>> lruvec_lru_size() currently uses lruvec_page_state_local() to get the
>> lru_size. That is based on lruvec_stat_local.count[] of
>> mem_cgroup_per_node, a counter which is updated in batches: no update
>> is propagated until the pending pages reach MEMCG_CHARGE_BATCH, which
>> is currently defined as 32.
>>
>> Because of this, small amounts of memory may not be handled as
>> expected in some scenarios. For example, if a memcg has only 32 pages
>> of madvise-freed memory, these pages may not be freed as expected when
>> the group comes under memory pressure.
> Could you be more specific please?
Okay, I will add a more detailed description in the next version.
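To make that description more concrete, here is a minimal userspace
sketch of the batching behavior. The struct and function names are made
up for illustration; in the kernel the threshold is MEMCG_CHARGE_BATCH
and the pending delta is kept in per-CPU state:

/*
 * Illustrative model of a batched stat counter: readers of 'shared'
 * (the analogue of lruvec_page_state_local()) do not see updates
 * until a full batch has accumulated.
 */
#include <stdio.h>
#include <stdlib.h>

#define CHARGE_BATCH	32	/* mirrors MEMCG_CHARGE_BATCH */

struct batched_counter {
	long shared;	/* the value stat readers see */
	long pending;	/* accumulated delta, invisible to readers */
};

static void mod_counter(struct batched_counter *c, long delta)
{
	c->pending += delta;
	/* Flush only once the pending delta exceeds a full batch. */
	if (labs(c->pending) > CHARGE_BATCH) {
		c->shared += c->pending;
		c->pending = 0;
	}
}

int main(void)
{
	struct batched_counter lru = { 0, 0 };
	int i;

	/* Put 32 pages on the LRU: exactly one batch, not yet flushed. */
	for (i = 0; i < 32; i++)
		mod_counter(&lru, 1);
	printf("after 32 pages, readers see %ld\n", lru.shared);	/* 0 */

	mod_counter(&lru, 1);	/* page 33 finally flushes the batch */
	printf("after 33 pages, readers see %ld\n", lru.shared);	/* 33 */
	return 0;
}

With the flush condition modeled this way, the 32 madvise-freed pages
from the example above stay invisible to lruvec_lru_size() until more
activity arrives.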
>> Getting lru_size based on lru_zone_size of mem_cgroup_per_node, which
>> is not updated in batches, makes this a bit more accurate.
> This is effectively reverting 1a61ab8038e72. There were no numbers
> backing that commit, and there are none for this one either. The only
> hot path I can see is workingset_refault; all the others seem to be in
> the reclaim path.
Yep, I saw that the lruvec size is not increased in a batched way in
the workingset_refault path, so I thought changing lruvec_lru_size()
to the non-batched counters might better match the actual behavior.

It seems to me that a deviation of less than 32 pages is not a big deal
for the seq_show stuff (maybe it's not a big deal in lruvec_lru_size()
either...), so I chose to just modify lruvec_lru_size() rather than
revert 1a61ab8038e72.
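For the reclaim side, the consequence can be sketched like this. It
assumes the scan target is roughly size >> priority as in
get_scan_count(); the real logic has more cases, and the numbers are
only illustrative:

/*
 * With the batched counter still reporting 0 for the 32 hidden
 * pages, the scan target stays 0 at every priority, while the
 * exact per-zone sum lets the pages be scanned once the reclaim
 * priority drops far enough.
 */
#include <stdio.h>

#define DEF_PRIORITY	12	/* mirrors the kernel's DEF_PRIORITY */

int main(void)
{
	long batched_size = 0;	/* one unflushed batch: readers see 0 */
	long exact_size = 32;	/* sum of the per-zone lru sizes */
	int priority;

	for (priority = DEF_PRIORITY; priority >= 0; priority--) {
		long scan_exact = exact_size >> priority;
		long scan_batched = batched_size >> priority;

		if (scan_exact && !scan_batched)
			printf("priority %2d: exact size scans %ld page(s), batched size scans 0\n",
			       priority, scan_exact);
	}
	return 0;
}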
>> Signed-off-by: Honglei Wang <honglei.wang@xxxxxxxxxx>
> That being said, I am not against this patch, but the changelog should
> be more specific about the particular problem and how serious it is.
Thanks for the suggestions. I don't think this is a really serious
problem, and I'll try to give a more detailed changelog about what the
patch is trying to fix.
Honglei
>> ---
>>  mm/vmscan.c | 9 +++++----
>>  1 file changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index c77d1e3761a7..c28672460868 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -354,12 +354,13 @@ unsigned long zone_reclaimable_pages(struct zone *zone)
>>   */
>>  unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru, int zone_idx)
>>  {
>> -	unsigned long lru_size;
>> +	unsigned long lru_size = 0;
>>  	int zid;
>>
>> -	if (!mem_cgroup_disabled())
>> -		lru_size = lruvec_page_state_local(lruvec, NR_LRU_BASE + lru);
>> -	else
>> +	if (!mem_cgroup_disabled()) {
>> +		for (zid = 0; zid < MAX_NR_ZONES; zid++)
>> +			lru_size += mem_cgroup_get_zone_lru_size(lruvec, lru, zid);
>> +	} else
>>  		lru_size = node_page_state(lruvec_pgdat(lruvec), NR_LRU_BASE + lru);
>>
>>  	for (zid = zone_idx + 1; zid < MAX_NR_ZONES; zid++) {
>> --
>> 2.17.0