This is not a subtle regression at all. It is in fact a very very serious one. On July 20, 2014 2:33:50 PM PDT, Sven Wegener <sven.wegener@xxxxxxxxxxx> wrote: >Commit 554086d ("x86_32, entry: Do syscall exit work on badsys >(CVE-2014-4508)") introduced a subtle regression in the x86_32 syscall >entry code, resulting in syscall() not returning proper errors for >non-existing syscalls on CPUs not supporting the sysenter feature. > >The following code: > >> int result = syscall(666); >> printf("result=%d errno=%d error=%s\n", result, errno, >strerror(errno)); > >results in: > >> result=666 errno=0 error=Success > >Obviously, the syscall return value is the called syscall number, but >it >should have been an ENOSYS error. When run under ptrace it behaves >correctly, which makes it hard to debug in the wild: > >> result=-1 errno=38 error=Function not implemented > >The %eax register is the return value register. For debugging via >ptrace >the syscall entry code stores the complete register context on the >stack. The badsys handlers only store the ENOSYS error code in the >ptrace register set and do not set %eax like a regular syscall handler >would. The old resume_userspace call chain contains code that clobbers >%eax and it restores %eax from the ptrace registers afterwards. The >same >goes for the ptrace-enabled call chain. When ptrace is not used, the >syscall return value is the passed-in syscall number from the >%eax register. > >Use %eax as the return value register in syscall_badsys and >sysenter_badsys, like a real syscall handler does, and have the caller >push the value onto the stack for ptrace access. > >Signed-off-by: Sven Wegener <sven.wegener@xxxxxxxxxxx> >--- > arch/x86/kernel/entry_32.S | 9 +++++---- > 1 file changed, 5 insertions(+), 4 deletions(-) > >diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S >index dbaa23e..0d0c9d4 100644 >--- a/arch/x86/kernel/entry_32.S >+++ b/arch/x86/kernel/entry_32.S >@@ -425,8 +425,8 @@ sysenter_do_call: > cmpl $(NR_syscalls), %eax > jae sysenter_badsys > call *sys_call_table(,%eax,4) >- movl %eax,PT_EAX(%esp) > sysenter_after_call: >+ movl %eax,PT_EAX(%esp) > LOCKDEP_SYS_EXIT > DISABLE_INTERRUPTS(CLBR_ANY) > TRACE_IRQS_OFF >@@ -502,6 +502,7 @@ ENTRY(system_call) > jae syscall_badsys > syscall_call: > call *sys_call_table(,%eax,4) >+syscall_after_call: > movl %eax,PT_EAX(%esp) # store the return value > syscall_exit: > LOCKDEP_SYS_EXIT >@@ -675,12 +676,12 @@ syscall_fault: > END(syscall_fault) > > syscall_badsys: >- movl $-ENOSYS,PT_EAX(%esp) >- jmp syscall_exit >+ movl $-ENOSYS,%eax >+ jmp syscall_after_call > END(syscall_badsys) > > sysenter_badsys: >- movl $-ENOSYS,PT_EAX(%esp) >+ movl $-ENOSYS,%eax > jmp sysenter_after_call > END(syscall_badsys) > CFI_ENDPROC -- Sent from my mobile phone. Please pardon brevity and lack of formatting. -- To unsubscribe from this list: send the line "unsubscribe stable" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html