On Tue, Jan 17, 2012 at 4:56 PM, Indan Zupancic <indan@xxxxxx> wrote:
> Wait: If a tasks is set to 64 bit mode, but calls into the kernel via
> int 0x80 it's changed to 32 bit mode for that system call and back to
> 64 bit mode when the system call is finished!?

Well, saying it like that suggests that there is more of a "mode
change" than really exists.  It's simply that any task can use int
$0x80 and this always means using the 32-bit syscall table with
TS_COMPAT set.

> Our ptrace jailer is checking cs to figure out if a task is a compat task
> or not, if the kernel can change that behind our back it means our jailer
> isn't secure for x86_64 with compat enabled. Or is cs changed before the
> ptrace stuff and ptrace sees the "right" cs value? If not, we have to add
> an expensive PTRACE_PEEKTEXT to check if it's an int 0x80 or not. Or is
> there another way?

I don't think there's another way.  hpa and I once discussed adding a
field to the extractable "register state" that would say which method
the syscall in progress had taken to enter the kernel.  That would tell
you which flavor of syscall instruction was used (or none, i.e. a
trap/interrupt).  But nobody ever had a real need for it, and we didn't
pursue it further.  (We originally talked about it in the context of
distinguishing whether a 32-bit task had used sysenter or syscall or
int $0x80, I think.)

> I think this behaviour is so unexpected that it can only cause security
> problems in the long run. Is anyone counting on this? Where is this
> behaviour documented?

It's documented the same place the entire Linux machine-level ABI is
documented, which is nowhere.  Someone somewhere may once have been
counting on it.  (The story I heard was about an implementation of
valgrind for 32-bit code that ran in 64-bit tasks, but I don't know for
sure that it was really done.)  The general rule is that if it ever
worked before in a coherent way, we don't break binary compatibility.
In the implementation, it would require a special check to make it
barf.  It's really just something that falls out of how the hardware
and the kernel implementation works.  I suppose you could add such a
check under a new kconfig option that's marked as being potentially
incompatible with some old applications.  Good luck with that.
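
For reference, the PTRACE_PEEKTEXT check mentioned above could look
roughly like the C sketch below.  It assumes the tracee is stopped at a
syscall stop, that `ip` is the instruction pointer the tracer read from
the tracee's registers, and that the two bytes just before it are the
entry instruction.  The names classify_entry, peek_entry and enum
entry_insn are made up for illustration; only
ptrace(PTRACE_PEEKTEXT, ...) is a real interface here.  Treat it as a
heuristic rather than proof: the bytes at ip - 2 need not be the
instruction that actually entered the kernel, and another thread
sharing the address space could rewrite them after the check.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Hypothetical classification of the instruction preceding the saved ip. */
enum entry_insn {
    ENTRY_UNKNOWN,
    ENTRY_INT80,    /* cd 80: int $0x80 */
    ENTRY_SYSCALL,  /* 0f 05: syscall   */
    ENTRY_SYSENTER  /* 0f 34: sysenter  */
};

/* Match the two opcode bytes against the known x86 syscall instructions. */
enum entry_insn classify_entry(const uint8_t insn[2])
{
    if (insn[0] == 0xcd && insn[1] == 0x80)
        return ENTRY_INT80;
    if (insn[0] == 0x0f && insn[1] == 0x05)
        return ENTRY_SYSCALL;
    if (insn[0] == 0x0f && insn[1] == 0x34)
        return ENTRY_SYSENTER;
    return ENTRY_UNKNOWN;
}

/*
 * Read the word at ip - 2 from a stopped tracee and classify it.
 * Assumption: all three instructions are two bytes long, so the
 * opcode starts two bytes before the reported instruction pointer.
 * Returns 0 on success, -1 if the peek failed (errno is set).
 */
int peek_entry(pid_t pid, unsigned long ip, enum entry_insn *out)
{
    long word;
    uint8_t bytes[2];

    /* PEEKTEXT returns the data itself, so -1 is only an error if errno is set. */
    errno = 0;
    word = ptrace(PTRACE_PEEKTEXT, pid, (void *)(ip - 2), NULL);
    if (word == -1 && errno != 0)
        return -1;

    /* x86 is little-endian: the lowest byte of the word is the first byte in memory. */
    bytes[0] = (uint8_t)(word & 0xff);
    bytes[1] = (uint8_t)((word >> 8) & 0xff);
    *out = classify_entry(bytes);
    return 0;
}
```

A jailer along these lines would presumably call peek_entry() at each
syscall stop and apply the 32-bit syscall table whenever the result is
ENTRY_INT80 or ENTRY_SYSENTER, whatever cs says, and deny the call on
ENTRY_UNKNOWN.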