On Mon, 24 Jul 2023 at 17:22, Muhammad Usama Anjum <usama.anjum@xxxxxxxxxxxxx> wrote: > > On 7/24/23 7:38 PM, Michał Mirosław wrote: > > On Mon, 24 Jul 2023 at 16:04, Muhammad Usama Anjum > > <usama.anjum@xxxxxxxxxxxxx> wrote: > >> > >> Fixed found bugs. Testing it further. > >> > >> - Split and backoff in case buffer full case as well > >> - Fix the wrong breaking of loop if page isn't interesting, skip intead > >> - Untag the address and save them into struct > >> - Round off the end address to next page > >> > >> Signed-off-by: Muhammad Usama Anjum <usama.anjum@xxxxxxxxxxxxx> > >> --- > >> fs/proc/task_mmu.c | 54 ++++++++++++++++++++++++++-------------------- > >> 1 file changed, 31 insertions(+), 23 deletions(-) > >> > >> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c > >> index add21fdf3c9a..64b326d0ec6d 100644 > >> --- a/fs/proc/task_mmu.c > >> +++ b/fs/proc/task_mmu.c > >> @@ -2044,7 +2050,7 @@ static int pagemap_scan_thp_entry(pmd_t *pmd, > >> unsigned long start, > >> * Break huge page into small pages if the WP operation > >> * need to be performed is on a portion of the huge page. > >> */ > >> - if (end != start + HPAGE_SIZE) { > >> + if (end != start + HPAGE_SIZE || ret == -ENOSPC) { > > > > Why is it needed? If `end == start + HPAGE_SIZE` then we're handling a > > full hugepage anyway. > If we weren't able to add the complete thp in the output buffer and we need > to perform WP on the entire page, we should split and rollback. Otherwise > we'll WP the entire thp and we'll lose the state on the remaining THP which > wasn't added to output. > > Lets say max=100 > only 100 pages would be added to output > we need to split and rollback otherwise other 412 pages would get WP In this case *end will be truncated by output() to match the number of pages that fit. > >> @@ -2066,8 +2072,8 @@ static int pagemap_scan_pmd_entry(pmd_t *pmd, > >> unsigned long start, > >> { > >> struct pagemap_scan_private *p = walk->private; > >> struct vm_area_struct *vma = walk->vma; > >> + unsigned long addr, categories, next; > >> pte_t *pte, *start_pte; > >> - unsigned long addr; > >> bool flush = false; > >> spinlock_t *ptl; > >> int ret; > >> @@ -2088,12 +2094,14 @@ static int pagemap_scan_pmd_entry(pmd_t *pmd, > >> unsigned long start, > >> } > >> > >> for (addr = start; addr != end; pte++, addr += PAGE_SIZE) { > >> - unsigned long categories = p->cur_vma_category | > >> - pagemap_page_category(vma, addr, ptep_get(pte)); > >> - unsigned long next = addr + PAGE_SIZE; > >> + categories = p->cur_vma_category | > >> + pagemap_page_category(vma, addr, ptep_get(pte)); > >> + next = addr + PAGE_SIZE; > > > > Why moving the variable declarations out of the loop? > Saving spaces inside loop. What are pros of declation of variable in loop? Informing the reader that the variables have scope limited to the loop body. [...] > >> @@ -2219,22 +2225,24 @@ static int pagemap_scan_get_args(struct pm_scan_arg > >> *arg, > >> arg->category_anyof_mask | arg->return_mask) & ~PM_SCAN_CATEGORIES) > >> return -EINVAL; > >> > >> - start = untagged_addr((unsigned long)arg->start); > >> - end = untagged_addr((unsigned long)arg->end); > >> - vec = untagged_addr((unsigned long)arg->vec); > >> + arg->start = untagged_addr((unsigned long)arg->start); > >> + arg->end = untagged_addr((unsigned long)arg->end); > >> + arg->vec = untagged_addr((unsigned long)arg->vec); > > > > BTW, We should we keep the tag in args writeback(). > Sorry what? > After this function, the start, end and vec would be used. We need to make > sure that the address are untagged before that. We do write back the address the walk ended at to arg->start in userspace. This pointer I think needs the tag reconstructed so that retrying the ioctl() will be possible. Best Regards Michał Mirosław