> On Feb 20, 2023, at 5:24 AM, Muhammad Usama Anjum <usama.anjum@xxxxxxxxxxxxx> wrote: > >>> +static inline int pagemap_scan_pmd_entry(pmd_t *pmd, unsigned long start, >>> + unsigned long end, struct mm_walk *walk) >>> +{ >>> + struct pagemap_scan_private *p = walk->private; >>> + struct vm_area_struct *vma = walk->vma; >>> + unsigned long addr = end; >>> + spinlock_t *ptl; >>> + int ret = 0; >>> + pte_t *pte; >>> + >>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE >>> + ptl = pmd_trans_huge_lock(pmd, vma); >>> + if (ptl) { >>> + bool pmd_wt; >>> + >>> + pmd_wt = !is_pmd_uffd_wp(*pmd); >>> + /* >>> + * Break huge page into small pages if operation needs to be >>> performed is >>> + * on a portion of the huge page. >>> + */ >>> + if (pmd_wt && IS_WP_ENGAGE_OP(p) && (end - start < HPAGE_SIZE)) { >>> + spin_unlock(ptl); >>> + split_huge_pmd(vma, pmd, start); >>> + goto process_smaller_pages; >> I think that such goto's are really confusing and should be avoided. And >> using 'else' (could have easily prevented the need for goto). It is not the >> best solution though, since I think it would have been better to invert the >> conditions. > Yeah, else can be used here. But then we'll have to add a tab to all the > code after adding else. We have already so many tabs and very less space to > right code. Not sure which is better. goto’s are usually not the right solution. You can extract things into a different function if you have to. I’m not sure why IS_GET_OP(p) might be false and what’s the meaning of taking the lock and dropping it in such a case. I think that the code can be simplified and additional condition nesting can be avoided. >>> --- a/include/uapi/linux/fs.h >>> +++ b/include/uapi/linux/fs.h >>> @@ -305,4 +305,54 @@ typedef int __bitwise __kernel_rwf_t; >>> #define RWF_SUPPORTED (RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\ >>> RWF_APPEND) >>> +/* Pagemap ioctl */ >>> +#define PAGEMAP_SCAN _IOWR('f', 16, struct pagemap_scan_arg) >>> + >>> +/* Bits are set in the bitmap of the page_region and masks in >>> pagemap_scan_args */ >>> +#define PAGE_IS_WRITTEN (1 << 0) >>> +#define PAGE_IS_FILE (1 << 1) >>> +#define PAGE_IS_PRESENT (1 << 2) >>> +#define PAGE_IS_SWAPPED (1 << 3) >> >> These names are way too generic and are likely to be misused for the wrong >> purpose. The "_IS_" part seems confusing as well. So I think the naming >> needs to be fixed and some new type (using typedef) or enum should be >> introduced to hold these flags. I understand it is part of uapi and it is >> less common there, but it is not unheard of and does make things clearer. > Do you think PM_SCAN_PAGE_IS_* work here? Can we lose the IS somehow? > >> >> >>> + >>> +/* >>> + * struct page_region - Page region with bitmap flags >>> + * @start: Start of the region >>> + * @len: Length of the region >>> + * bitmap: Bits sets for the region >>> + */ >>> +struct page_region { >>> + __u64 start; >>> + __u64 len; >> >> I presume in bytes. Would be useful to mention. > Length of region in pages. Very unintuitive to me I must say. If the start is an address, I would expect the len to be in bytes.