On 5/3/2023 7:35 AM, Raghavendra K T wrote:
With the recent NUMA scan enhancements, only tasks that have
previously accessed a VMA are allowed to scan it.
While this has significantly reduced system time overhead, there are
corner cases that genuinely need some relaxation, e.g. the concern
raised by PeterZ that unfairness amongst threads belonging to
disjoint sets of VMAs can potentially amplify the side effects of VMA
regions belonging to some of the tasks being left unscanned.
To address this, allow scanning for the first few times with a per-VMA
counter.
Signed-off-by: Raghavendra K T <raghavendra.kt@xxxxxxx>
---
Some clarification:
The base was linux-next-20230411, because I have some issues with
linux-next-20230425 onwards and with the Linux master branch, which I
am still digging into.
include/linux/mm_types.h | 1 +
kernel/sched/fair.c | 30 +++++++++++++++++++++++++++---
2 files changed, 28 insertions(+), 3 deletions(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3fc9e680f174..f66e6b4e0620 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -479,6 +479,7 @@ struct vma_numab_state {
unsigned long next_scan;
unsigned long next_pid_reset;
unsigned long access_pids[2];
+ unsigned int scan_counter;
};
/*
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a29ca11bead2..3c50dc3893eb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2928,19 +2928,38 @@ static void reset_ptenuma_scan(struct task_struct *p)
p->mm->numa_scan_offset = 0;
}
+/* Scan 1GB or 4 * scan_size */
+#define VMA_DISJOINT_SET_ACCESS_THRESH 4U
+
static bool vma_is_accessed(struct vm_area_struct *vma)
{
unsigned long pids;
+ unsigned int windows;
I missed initializing windows = 0 while splitting the patch;
this will be corrected in the next posting.
/me remembered only after the kernel test robot noticed
[...]
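
For readers following along, here is a rough userspace sketch of the idea in the commit message: let any task scan a VMA for the first few attempts, tracked by a per-VMA counter, and only afterwards fall back to the "has this task accessed the VMA" check. This is not the code from the elided hunk. The struct vma_state_sketch, the helper vma_scan_allowed_sketch(), and the single-bitmask PID tracking are hypothetical simplifications; only the VMA_DISJOINT_SET_ACCESS_THRESH name and value are taken from the patch.

```c
#include <stdbool.h>

/* Threshold name and value as they appear in the patch. */
#define VMA_DISJOINT_SET_ACCESS_THRESH 4U

/*
 * Hypothetical, simplified stand-in for struct vma_numab_state.
 * The real structure keeps two access_pids words plus timestamps.
 */
struct vma_state_sketch {
	unsigned long access_pids;  /* bitmask of hashed PIDs that touched the VMA */
	unsigned int scan_counter;  /* unconditional scans granted so far */
};

/*
 * Assumed gating logic: grant the first few scans unconditionally so
 * that VMAs used by a disjoint set of threads are not left unscanned,
 * then require that the scanning task has accessed the VMA before.
 */
static bool vma_scan_allowed_sketch(struct vma_state_sketch *st,
				    unsigned long pid_bit)
{
	if (st->scan_counter < VMA_DISJOINT_SET_ACCESS_THRESH) {
		st->scan_counter++;
		return true;
	}

	return (st->access_pids & pid_bit) != 0;
}
```

With a zero-initialized counter, a task that never touched the VMA is allowed through for the first four calls and refused afterwards, while a task whose bit is set in access_pids keeps being allowed.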