On Tuesday 27 May 2014 03:51 PM, Kirill A. Shutemov wrote:
> Madhavan Srinivasan wrote:
>> On Tuesday 20 May 2014 03:57 PM, Kirill A. Shutemov wrote:
>>> Rusty Russell wrote:
>>>> "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> writes:
>>>>> Andrew Morton wrote:
>>>>>> On Mon, 19 May 2014 16:23:07 -0700 (PDT) Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>>>>>>
>>>>>>> Shouldn't FAULT_AROUND_ORDER and fault_around_order be changed to be
>>>>>>> the order of the fault-around size in bytes, and fault_around_pages()
>>>>>>> use 1UL << (fault_around_order - PAGE_SHIFT)
>>>>>>
>>>>>> Yes. And shame on me for missing it (this time!) at review.
>>>>>>
>>>>>> There's still time to fix this. Patches, please.
>>>>>
>>>>> Here it is. Made at 3.30 AM, build tested only.
>>>>
>>>> Prefer on top of Maddy's patch which makes it always a variable, rather
>>>> than CONFIG_DEBUG_FS. It's got enough hair as it is.
>>>
>>> Something like this?
>>>
>>> From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
>>> Date: Tue, 20 May 2014 13:02:03 +0300
>>> Subject: [PATCH] mm: nominate faultaround area in bytes rather then page order
>>>
>>> There are evidences that faultaround feature is less relevant on
>>> architectures with page size bigger then 4k. Which makes sense since
>>> page fault overhead per byte of mapped area should be less there.
>>>
>>> Let's rework the feature to specify faultaround area in bytes instead of
>>> page order. It's 64 kilobytes for now.
>>>
>>> The patch effectively disables faultaround on architectures with
>>> page size >= 64k (like ppc64).
>>>
>>> It's possible that some other size of faultaround area is relevant for a
>>> platform. We can expose `fault_around_bytes' variable to arch-specific
>>> code once such platforms will be found.
>>>
>>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
>>> ---
>>>  mm/memory.c | 62 +++++++++++++++++++++++--------------------------------------
>>>  1 file changed, 23 insertions(+), 39 deletions(-)
>>>
>>> diff --git a/mm/memory.c b/mm/memory.c
>>> index 037b812a9531..252b319e8cdf 100644
>>> --- a/mm/memory.c
>>> +++ b/mm/memory.c
>>> @@ -3402,63 +3402,47 @@ void do_set_pte(struct vm_area_struct *vma, unsigned long address,
>>>  	update_mmu_cache(vma, address, pte);
>>>  }
>>>
>>> -#define FAULT_AROUND_ORDER 4
>>> +static unsigned long fault_around_bytes = 65536;
>>> +
>>> +static inline unsigned long fault_around_pages(void)
>>> +{
>>> +	return rounddown_pow_of_two(fault_around_bytes) / PAGE_SIZE;
>>> +}
>>> +
>>> +static inline unsigned long fault_around_mask(void)
>>> +{
>>> +	return ~(rounddown_pow_of_two(fault_around_bytes) - 1) & PAGE_MASK;
>>> +}
>>>
>>> -#ifdef CONFIG_DEBUG_FS
>>> -static unsigned int fault_around_order = FAULT_AROUND_ORDER;
>>>
>>> -static int fault_around_order_get(void *data, u64 *val)
>>> +#ifdef CONFIG_DEBUG_FS
>>> +static int fault_around_bytes_get(void *data, u64 *val)
>>>  {
>>> -	*val = fault_around_order;
>>> +	*val = fault_around_bytes;
>>>  	return 0;
>>>  }
>>>
>>> -static int fault_around_order_set(void *data, u64 val)
>>> +static int fault_around_bytes_set(void *data, u64 val)
>>>  {
>>
>> Kindly ignore the question if not relevant. Even though we need root
>> access to alter the value, will we be fine with
>> negative value?.
>
> val is u64. or I miss something?
>

My bad. What I wanted to check was an all-0xf input, and I guess we are
fine. Sorry about that.

Regards
Maddy
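
For anyone who wants to check the arithmetic outside the kernel, here is a
rough user-space sketch of the two helpers from the quoted patch. It is not
the kernel code: rounddown_pow_of_two() below is a hand-written stand-in for
the kernel helper, the page size is passed in as a parameter (an assumption
made so that 4k and 64k pages can be compared), and uint64_t stands in for
unsigned long on a 64-bit machine.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel helper: largest power of two <= n (n must be != 0). */
uint64_t rounddown_pow_of_two(uint64_t n)
{
	uint64_t p = 1;

	assert(n != 0);		/* the result is undefined for 0 */
	while (p <= n / 2)	/* comparing against n / 2 avoids overflow of p */
		p *= 2;
	return p;
}

/* Number of pages covered by the faultaround window.
 * page_size is a hypothetical parameter; the patch uses PAGE_SIZE. */
uint64_t fault_around_pages(uint64_t fault_around_bytes, uint64_t page_size)
{
	return rounddown_pow_of_two(fault_around_bytes) / page_size;
}

/* Mask that aligns an address down to the start of the window.
 * ~(page_size - 1) plays the role of PAGE_MASK. */
uint64_t fault_around_mask(uint64_t fault_around_bytes, uint64_t page_size)
{
	return ~(rounddown_pow_of_two(fault_around_bytes) - 1) & ~(page_size - 1);
}
```

With the default of 65536 bytes this gives 16 pages for a 4k page size and
1 page for a 64k page size, which matches the "effectively disables
faultaround" remark in the changelog. An all-ones u64 rounds down to 1 << 63,
so the helpers themselves do not overflow in this sketch; what the setter
does with such a value is not visible in the quoted hunk.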