> On Oct 13, 2018, at 2:34 AM, Catalin Marinas <catalin.marinas@xxxxxxx> wrote:
>
>> On Sat, Oct 13, 2018 at 04:14:16AM +0200, Eugene Syromiatnikov wrote:
>>> On Wed, Oct 10, 2018 at 04:36:56PM +0100, Catalin Marinas wrote:
>>>> On Wed, Oct 10, 2018 at 04:10:21PM +0200, Eugene Syromiatnikov wrote:
>>>> I have some questions regarding AArch64 ILP32 implementation for which I
>>>> failed to find an answer myself:
>>>> * How ptrace() tracer is supposed to distinguish between ILP32 and LP64
>>>> tracees? For MIPS N32 and x32 this is possible based on syscall
>>>> number, but for AArch64 ILP32 I do not see such a sign. There's also
>>>> ARM_ip is employed for signalling entering/exiting, I wonder whether
>>>> it's possible to employ it also for signalling tracee's personality.
>>>
>>> With the current implementation, I don't think you can distinguish. From
>>> the kernel perspective, the register set is the same. What is the
>>> use-case for this?
>>
>> Err, a ptrace()-based tracer trying to trace a process, for example?
>
> I first thought it wouldn't matter for ptrace-based tracers since the
> syscall numbers are (mostly) the same. But the arguments layout in
> register is indeed different, so I see your point now about having to
> distinguish.
>
>>> We could add a new regset to expose the ILP32 state (NT_ARM_..., I can't
>>> think of a name now but probably not PER* as this implies PER_LINUX_...
>>> which is independent from TIF_32BIT_*).
>>
>> So that would require an additional ptrace() call on each syscall stop,
>> is that correct?
>
> The ILP32 state does not change at run-time, so it could only do a
> ptrace() call once and save the information. No need to re-read it on
> each syscall stop.
>

Please solve this in an arch independent way. This situation is
basically unusably broken on x86 right now.
Please solve it for real, by, for example, adding a new ptrace
operation that returns something like this:

enum ptrace_syscall_state {
  NO_SYSCALL,
  SYSCALL_ENTRY,
  SYSCALL_EXIT,
  /* other values may be defined in the future. */
};

struct ptrace_syscall_info {
  enum ptrace_syscall_state state;
  unsigned long arch;
  union {
    struct {
      unsigned long nr;
      unsigned long args[6];
    } entry;
    struct {
      unsigned long ret;
    } exit;
  };
};

where arch is an AUDIT_ARCH_XYZ constant. On x86, it's currently
essentially impossible for tools like strace to correctly decode
syscalls.

> We could set a high bit in the syscall number reported to the ptrace
> caller (though not changing the syscall ABI) but I haven't thought of
> other consequences. For example, can the ptrace caller change the
> syscall number?

Yes it can.

>
>>>> * What's the reasoning behind capping syscall arguments to 32 bit? x32
>>>> and MIPS N32 do not have such a restriction (and do not need special
>>>> wrappers for syscalls that pass 64-bit values as a result, except
>>>> when they do, as it is the case for preadv2 on x32); moreover, that
>>>> would lead to insurmountable difficulties for AArch64 ILP32 tracers
>>>> that try to trace LP64 tracees, as it would be impossible to pass
>>>> 64-bit addresses to process_vm_{read,write} or ptrace PEEK/POKE.
>>>
>>> We've attempted in earlier versions to allow a mix of 32 and 64-bit
>>> register values from ILP32 but it got pretty complicated. The entry code
>>> would need to know which registers need zeroing of the top 32-bit
>>
>> If kernel specifies 64-bit wide registers for syscalls, then it's the
>> caller's (libc's) responsibility to properly sign-extend arguments when
>> needed, and glibc, for example, already has proper type definitions that
>> aimed to handle this.
>
> We tried, see my other reply.
>
> --
> Catalin
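
To make the ptrace_syscall_info proposal above a bit more concrete, here
is a rough sketch of how a tracer might consume such a struct once it has
been filled in. The struct and enum are transcribed from the proposal and
are hypothetical; arch_is_64bit(), arch_name() and print_stop() are
made-up helper names for illustration. Only the AUDIT_ARCH_* constants
and __AUDIT_ARCH_64BIT come from <linux/audit.h>.

```c
#include <linux/audit.h> /* AUDIT_ARCH_* constants and __AUDIT_ARCH_64BIT */
#include <stdio.h>

/* Hypothetical types, transcribed from the proposal in this mail. */
enum ptrace_syscall_state { NO_SYSCALL, SYSCALL_ENTRY, SYSCALL_EXIT };

struct ptrace_syscall_info {
	enum ptrace_syscall_state state;
	unsigned long arch; /* an AUDIT_ARCH_* value */
	union {
		struct { unsigned long nr; unsigned long args[6]; } entry;
		struct { unsigned long ret; } exit;
	};
};

/* Illustrative helper: true if the AUDIT_ARCH_* value has the 64-bit flag. */
static int arch_is_64bit(unsigned long arch)
{
	return (arch & __AUDIT_ARCH_64BIT) != 0;
}

/* Illustrative helper: map a few AUDIT_ARCH_* values to printable names. */
static const char *arch_name(unsigned long arch)
{
	switch (arch) {
	case AUDIT_ARCH_X86_64:  return "x86_64";
	case AUDIT_ARCH_I386:    return "i386";
	case AUDIT_ARCH_AARCH64: return "aarch64";
	case AUDIT_ARCH_ARM:     return "arm";
	default:                 return "unknown";
	}
}

/* Illustrative helper: print one syscall stop, picking the union member by state. */
static void print_stop(const struct ptrace_syscall_info *info)
{
	const char *name = arch_name(info->arch);
	int bits = arch_is_64bit(info->arch) ? 64 : 32;

	switch (info->state) {
	case SYSCALL_ENTRY:
		printf("[%s, %d-bit] enter nr=%lu arg0=%#lx\n",
		       name, bits, info->entry.nr, info->entry.args[0]);
		break;
	case SYSCALL_EXIT:
		printf("[%s, %d-bit] exit ret=%ld\n",
		       name, bits, (long)info->exit.ret);
		break;
	default:
		printf("[%s] not in a syscall\n", name);
		break;
	}
}
```

With something along these lines the tracer would not need to guess the
ABI from register contents at all. Note that x32 reports
AUDIT_ARCH_X86_64, so a real tracer would still have to look at the
syscall number to tell x32 apart.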