On Sun, 2015-12-20 at 16:49 +0100, Helge Deller wrote: > On 20.12.2015 15:09, Mathieu Desnoyers wrote: > > ----- On Dec 20, 2015, at 8:59 AM, Mathieu Desnoyers > > mathieu.desnoyers@xxxxxxxxxxxx wrote: > > > > > ----- On Dec 18, 2015, at 6:30 PM, Helge Deller deller@xxxxxx wro > > > te: > > > > > > > On parisc syscalls which are interrupted by signals sometimes > > > > fail to restart > > > > and instead return -ENOSYS which then in the worst case lead to > > > > userspace > > > > crashes. > > > > A similiar problem existed on MIPS and was fixed by commit > > > > e967ef02 > > > > ("MIPS: Fix restart of indirect syscalls"). > > > > > > > > On parisc the current syscall restart code assumes hat the > > > > syscall number is > > > > always loaded in the delay branch of the ble instruction as > > > > defined in the > > > > unistd.h header file and as such never restored %r20 before > > > > returning to > > > > userspace: > > > > ble 0x100(%sr2, %r0) > > > > ldi #syscall_nr, %r20 > > > > > > > > This assumption is at least not true for code which uses the > > > > syscall() glibc > > > > function, which instead uses this syntax: > > > > ble 0x100(%sr2, %r0) > > > > copy regX, %r20 > > > > where regX depend on how the compiler optimizes the code and > > > > register usage. > > > > > > > > This patch fixes this problem by adding code to analyze how the > > > > syscall number > > > > is loaded in the delay branch and - if needed - copy the > > > > syscall number to regX > > > > prior returning to userspace for the syscall restart. > > > > > > > > Signed-off-by: Helge Deller <deller@xxxxxx> > > > > Cc: stable@xxxxxxxxxxxxxxx > > > > Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> > > > > > > > > diff --git a/arch/parisc/kernel/signal.c > > > > b/arch/parisc/kernel/signal.c > > > > index dc1ea79..b0414ad 100644 > > > > --- a/arch/parisc/kernel/signal.c > > > > +++ b/arch/parisc/kernel/signal.c > > > > @@ -435,6 +435,48 @@ handle_signal(struct ksignal *ksig, struct > > > > pt_regs *regs, > > > > int in_syscall) > > > > regs->gr[28]); > > > > } > > > > > > > > +/* > > > > + * Check the delay branch in userspace how the syscall number > > > > gets loaded into > > > > + * %r20 and adjust as needed. > > > > > > I'm pretty sure "Check the delay branch in userspace how the > > > syscall..." > > > is not an English construct. ;-) Suggested rewording: > > > > > > "Check how the syscall number gets loaded into %r20 within > > > the delay branch in userspace and adjust as needed." > > Thanks! > I'll change that. > > > > > > + */ > > > > + > > > > +static void check_syscallno_in_delay_branch(struct pt_regs > > > > *regs) > > > > +{ > > > > + unsigned int opcode, source_reg; > > > > > > Why "unsigned int" above rather than u32 ? Since we're using > > > opcode as target variable for a get_user, it would be clearer > > > if the type of the __user * match the type of the target kernel > > > variable. (understood that those happen to have the same bitness > > > and type size on all Linux architectures, but it would be clearer > > > nevertheless). > > Yes, seems OK. > I'll change that. > > > > > + u32 __user *uaddr; > > > > + > > > > + /* Usually we don't have to restore %r20 (the system > > > > call number) > > > > + * because it gets loaded in the delay slot of the > > > > branch external > > > > + * instruction via the ldi instruction. > > > > + * In some cases a register-to-register copy > > > > instruction might have > > > > + * been used instead, in which case we need to copy > > > > the syscall > > > > + * number into the source register before returning to > > > > userspace. > > > > + */ > > > > + > > > > + /* A syscall is just a branch, so all > > > > + * we have to do is fiddle the return pointer. > > > > + */ > > > > + regs->gr[31] -= 8; /* delayed branching */ > > > > + > > > > + /* Get assembler opcode of code in delay branch */ > > > > + uaddr = (unsigned int *) (regs->gr[31] + 1); > > > > + get_user(opcode, uaddr); > > > > > > get_user() can fail due to EFAULT. This error should be > > > handled here, otherwise this could lead to the following > > > code using an uninitialized opcode variable, which could > > > indirectly leak a few bits of kernel stack information > > > to userspace (security concern). One attack vector I have > > > in mind for this is ptrace(), which might be able to tweak > > > those register values. > > Yes, generally get_user() can fail. > But this would be rather strange in that case, because > the syscall was started by userspace from this address. > So, without the code at that address in userspace, we would > never have reached this get_user(). Actually, that's not necessarily a safe assumption. Any memory allocation in a syscall path (except GFP_ATOMIC) can trigger reclaim and since this is a signal restart path, that's entirely possible. Reclaim could pull the backing page out from under the syscall, so in a low memory situation it is possible get_user() could fail with EFAULT unless get_user_page() has been called somewhere to pin the page. James -- To unsubscribe from this list: send the line "unsubscribe linux-parisc" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html