Denys Vlasenko wrote:
> On Wednesday 25 January 2012 20:36, Oleg Nesterov wrote:
> > On 01/18, Linus Torvalds wrote:
> > >
> > > Using the high bits of 'eflags' might work.
> >
> > I thought about changing eflags too, this looks very natural to me.
> >
> > But I do not understand the result of this discussion, are you going
> > to apply this change?
> >
> > If not...
> >
> > Not sure this is really better, but there is another idea. Currently we
> > have PTRACE_O_TRACESYSGOOD to avoid the confusion with the real SIGTRAP.
> > Perhaps we can add PTRACE_O_TRACESYS_VERY_GOOD (or we can look at
> > PT_SEIZED instead) and report TS_COMPAT via ptrace_report_syscall ?
> >
> > IOW. Currently ptrace_report_syscall() does
> >
> >     ptrace_notify(SIGTRAP | ((ptrace & PT_TRACESYSGOOD) ? 0x80 : 0));
> >
> > We can add the new events,
> >
> >     PTRACE_EVENT_SYSCALL_ENTRY
> >     PTRACE_EVENT_SYSCALL_COMPAT_ENTRY
> >     PTRACE_EVENT_SYSCALL_EXIT
> >     PTRACE_EVENT_SYSCALL_COMPAT_EXIT
>
> We can get away with just the first one.
> (1) It's unlikely people would want to get native sysentry events but not compat ones,
> thus first two options can be combined into one;

Tracers mainly want to know if it's a 32-bit or 64-bit syscall, not
whether it's compat as such. I'm thinking it might be a little kinder
like this:

    #define PTRACE_EVENT_SYSCALL_ENTRY_ABI32 (...)
    #define PTRACE_EVENT_SYSCALL_ENTRY_ABI64 (...)

    #ifdef CONFIG_64BIT
    # define PTRACE_EVENT_SYSCALL_ENTRY        PTRACE_EVENT_SYSCALL_ENTRY_ABI64
    # define PTRACE_EVENT_SYSCALL_ENTRY_COMPAT PTRACE_EVENT_SYSCALL_ENTRY_ABI32
    #else
    # define PTRACE_EVENT_SYSCALL_ENTRY        PTRACE_EVENT_SYSCALL_ENTRY_ABI32
    #endif

So the ABI is represented directly, with plain _ENTRY referring to the
tracer's own ABI. (Other ABI numbers can exist, e.g. OABI and EABI for
ARM, see below.)

This has two specific advantages:

   1. It can match on specific ABI or regular/compat, as suits the
      tracer's code.

   2. When a 32-bit *tracer* is running a 64-bit *tracee*, at least
      it knows ;-)

With your idea, what happens in situation 2? I'm not sure a 32-bit
tracer can do anything useful, because it can't get the 64-bit
registers, but at least it can see when it's got the wrong registers :-)

> (2) syscall exit compat-ness is known from entry type - no need to indicate it; and
> (3) if we would flag syscall entry with an event value in wait status, then syscall
> exit will be already distinguishable.
>
> Thus, minimally we need one new option, PTRACE_O_TRACE_SYSENTRY -
> "on syscall entry ptrace stop, set a nonzero event value in wait status"
> , and two event values: PTRACE_EVENT_SYSCALL_ENTRY (for native entry),
> PTRACE_EVENT_SYSCALL_ENTRY1 for compat one.

PTRACE_EVENT_SYSCALL_EXIT would cleanly indicate that the new option
is actually working, without the tracer needing to do a fork+test, if
PTRACE_ATTACH is used and for some reason the tracer sees a syscall
exit first. I'm not sure if this can happen, but I've heard rumour of
it on some archs or kernel versions.

> To future-proof this scheme we may reserve a few more event values
> PTRACE_EVENT_SYSCALL_ENTRY2, PTRACE_EVENT_SYSCALL_ENTRY3, etc,
> if we'll ever have arches with more than one non-native syscall
> entry.
> I'm no expert, but looking at strace code, ARM may already have
> more than one additional convention how to pass syscall args.

I was just looking at ARM and see exactly the same thing.

The difference between EABI and OABI calls is significant on ARM, even
though syscall numbers are the same; and the ABI is selected by the
syscall instruction used, not process personality. The __NR_name
values differ for each ABI, but (if I read arm/kernel/entry-common.S
properly) strace sees the same __NR_name values for both ABIs.

MIPS also has two different 32-bit ABIs, as well as 64-bit, but on
MIPS the syscall numbers are distinct, and should be seen by ptrace.
(Again if I read mips/kernel/ correctly.)
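
Going back to the wait-status events proposed above, here is a rough
sketch of what the tracer's side might look like. The HYP_ names and
their numbers are invented purely for illustration (no such events have
been assigned), and I'm assuming the new events would be encoded the way
existing PTRACE_EVENT_* stops are: SIGTRAP as the stop signal, with the
event number in the bits above it (status >> 16).

```c
#include <signal.h>
#include <sys/wait.h>

/* HYPOTHETICAL event numbers, for illustration only: nothing like
 * this is assigned; the names just follow the proposal in this thread. */
#define HYP_PTRACE_EVENT_SYSCALL_ENTRY_ABI32 0x20
#define HYP_PTRACE_EVENT_SYSCALL_ENTRY_ABI64 0x21
#define HYP_PTRACE_EVENT_SYSCALL_EXIT        0x22

enum stop_kind {
	STOP_OTHER,           /* not a syscall stop, or not a stop at all */
	STOP_SYSCALL_UNKNOWN, /* old-style SIGTRAP|0x80 stop: ABI not reported */
	STOP_ENTRY_ABI32,
	STOP_ENTRY_ABI64,
	STOP_EXIT
};

/* Classify a status value returned by waitpid() for a tracee. */
static enum stop_kind classify_stop(int status)
{
	if (!WIFSTOPPED(status))
		return STOP_OTHER;

	/* Assumed encoding: SIGTRAP plus an event number in status >> 16,
	 * as for the existing PTRACE_EVENT_* stops. */
	if (WSTOPSIG(status) == SIGTRAP) {
		switch ((status >> 16) & 0xffff) {
		case HYP_PTRACE_EVENT_SYSCALL_ENTRY_ABI32:
			return STOP_ENTRY_ABI32;
		case HYP_PTRACE_EVENT_SYSCALL_ENTRY_ABI64:
			return STOP_ENTRY_ABI64;
		case HYP_PTRACE_EVENT_SYSCALL_EXIT:
			return STOP_EXIT;
		default:
			return STOP_OTHER;
		}
	}

	/* Existing PTRACE_O_TRACESYSGOOD behaviour: a syscall stop, but
	 * with no indication of entry vs. exit or of which ABI. */
	if (WSTOPSIG(status) == (SIGTRAP | 0x80))
		return STOP_SYSCALL_UNKNOWN;

	return STOP_OTHER;
}
```

A 32-bit tracer that gets STOP_ENTRY_ABI64 back would then at least
know not to trust the registers it can read, which is the point of
reporting the ABI rather than native/compat.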

PA-RISC also has two different ABIs, the Linux one and the HPUX one.
The syscall numbers are different but overlap. I don't know if they
are distinct to ptrace; if they are not, the HPUX entry point might be
used to subvert a ptracer unless the ABI number is exposed.

-- Jamie