Re: [RFC PATCH 4/5] powerpc/mm: Remove custom stack expansion checking

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Christophe Leroy <christophe.leroy@xxxxxxxxxx> writes:
> Le 03/07/2020 à 16:13, Michael Ellerman a écrit :
>> We have powerpc specific logic in our page fault handling to decide if
>> an access to an unmapped address below the stack pointer should expand
>> the stack VMA.
>> 
>> The logic aims to prevent userspace from doing bad accesses below the
>> stack pointer. However as long as the stack is < 1MB in size, we allow
>> all accesses without further checks. Adding some debug I see that I
>> can do a full kernel build and LTP run, and not a single process has
>> used more than 1MB of stack. So for the majority of processes the
>> logic never even fires.
>> 
>> We also recently found a nasty bug in this code which could cause
>> userspace programs to be killed during signal delivery. It went
>> unnoticed presumably because most processes use < 1MB of stack.
>> 
>> The generic mm code has also grown support for stack guard pages since
>> this code was originally written, so the most heinous case of the
>> stack expanding into other mappings is now handled for us.
>> 
>> Finally although some other arches have special logic in this path,
>> from what I can tell none of x86, arm64, arm and s390 impose any extra
>> checks other than those in expand_stack().
>> 
>> So drop our complicated logic and like other architectures just let
>> the stack expand as long as its within the rlimit.
>
> I agree that's probably not worth a so complicated logic that is nowhere 
> documented.
>
> This patch looks good to me, minor comments below.

Thanks.

>> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
>> index ed01329dd12b..925a7231abb3 100644
>> --- a/arch/powerpc/mm/fault.c
>> +++ b/arch/powerpc/mm/fault.c
>> @@ -42,39 +42,7 @@
>>   #include <asm/kup.h>
>>   #include <asm/inst.h>
>>   
>> -/*
>> - * Check whether the instruction inst is a store using
>> - * an update addressing form which will update r1.
>> - */
>> -static bool store_updates_sp(struct ppc_inst inst)
>> -{
>> -	/* check for 1 in the rA field */
>> -	if (((ppc_inst_val(inst) >> 16) & 0x1f) != 1)
>> -		return false;
>> -	/* check major opcode */
>> -	switch (ppc_inst_primary_opcode(inst)) {
>> -	case OP_STWU:
>> -	case OP_STBU:
>> -	case OP_STHU:
>> -	case OP_STFSU:
>> -	case OP_STFDU:
>> -		return true;
>> -	case OP_STD:	/* std or stdu */
>> -		return (ppc_inst_val(inst) & 3) == 1;
>> -	case OP_31:
>> -		/* check minor opcode */
>> -		switch ((ppc_inst_val(inst) >> 1) & 0x3ff) {
>> -		case OP_31_XOP_STDUX:
>> -		case OP_31_XOP_STWUX:
>> -		case OP_31_XOP_STBUX:
>> -		case OP_31_XOP_STHUX:
>> -		case OP_31_XOP_STFSUX:
>> -		case OP_31_XOP_STFDUX:
>> -			return true;
>> -		}
>> -	}
>> -	return false;
>> -}
>> +
>
> Do we need this additional blank line ?

I usually leave two blank lines between the end of the includes and the
start of the code, which is what I did here I guess.

>>   /*
>>    * do_page_fault error handling helpers
>>    */
>> @@ -267,54 +235,6 @@ static bool bad_kernel_fault(struct pt_regs *regs, unsigned long error_code,
>>   	return false;
>>   }
>>   
>> -static bool bad_stack_expansion(struct pt_regs *regs, unsigned long address,
>> -				struct vm_area_struct *vma, unsigned int flags,
>> -				bool *must_retry)
>> -{
>> -	/*
>> -	 * N.B. The POWER/Open ABI allows programs to access up to
>> -	 * 288 bytes below the stack pointer.
>> -	 * The kernel signal delivery code writes up to 4KB
>> -	 * below the stack pointer (r1) before decrementing it.
>> -	 * The exec code can write slightly over 640kB to the stack
>> -	 * before setting the user r1.  Thus we allow the stack to
>> -	 * expand to 1MB without further checks.
>> -	 */
>> -	if (address + 0x100000 < vma->vm_end) {
>> -		struct ppc_inst __user *nip = (struct ppc_inst __user *)regs->nip;
>> -		/* get user regs even if this fault is in kernel mode */
>> -		struct pt_regs *uregs = current->thread.regs;
>> -		if (uregs == NULL)
>> -			return true;
>> -
>> -		/*
>> -		 * A user-mode access to an address a long way below
>> -		 * the stack pointer is only valid if the instruction
>> -		 * is one which would update the stack pointer to the
>> -		 * address accessed if the instruction completed,
>> -		 * i.e. either stwu rs,n(r1) or stwux rs,r1,rb
>> -		 * (or the byte, halfword, float or double forms).
>> -		 *
>> -		 * If we don't check this then any write to the area
>> -		 * between the last mapped region and the stack will
>> -		 * expand the stack rather than segfaulting.
>> -		 */
>> -		if (address + 4096 >= uregs->gpr[1])
>> -			return false;
>> -
>> -		if ((flags & FAULT_FLAG_WRITE) && (flags & FAULT_FLAG_USER) &&
>> -		    access_ok(nip, sizeof(*nip))) {
>> -			struct ppc_inst inst;
>> -
>> -			if (!probe_user_read_inst(&inst, nip))
>> -				return !store_updates_sp(inst);
>> -			*must_retry = true;
>> -		}
>> -		return true;
>> -	}
>> -	return false;
>> -}
>> -
>>   #ifdef CONFIG_PPC_MEM_KEYS
>>   static bool access_pkey_error(bool is_write, bool is_exec, bool is_pkey,
>>   			      struct vm_area_struct *vma)
>> @@ -480,7 +400,6 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>>   	int is_user = user_mode(regs);
>>   	int is_write = page_fault_is_write(error_code);
>>   	vm_fault_t fault, major = 0;
>> -	bool must_retry = false;
>>   	bool kprobe_fault = kprobe_page_fault(regs, 11);
>>   
>>   	if (unlikely(debugger_fault_handler(regs) || kprobe_fault))
>> @@ -569,30 +488,15 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>>   	vma = find_vma(mm, address);
>>   	if (unlikely(!vma))
>>   		return bad_area(regs, address);
>> -	if (likely(vma->vm_start <= address))
>> -		goto good_area;
>> -	if (unlikely(!(vma->vm_flags & VM_GROWSDOWN)))
>> -		return bad_area(regs, address);
>>   
>> -	/* The stack is being expanded, check if it's valid */
>> -	if (unlikely(bad_stack_expansion(regs, address, vma, flags,
>> -					 &must_retry))) {
>> -		if (!must_retry)
>> +	if (unlikely(vma->vm_start > address)) {
>> +		if (unlikely(!(vma->vm_flags & VM_GROWSDOWN)))
>
> We are already in an unlikely() branch, I don't think it is worth having 
> a second level of unlikely(), better let gcc decide what's most efficient.

Yeah I did wonder if we need so many unlikelys in here, but I thought
that should be done in a separate patch with some actual analysis of the
generated code.

cheers




[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux