On Thu, Jan 12, 2012 at 11:02 AM, Andrew Lutomirski <luto@xxxxxxx> wrote:
> On Wed, Jan 11, 2012 at 9:25 AM, Will Drewry <wad@xxxxxxxxxxxx> wrote:
>> This patch adds support for seccomp mode 2. This mode enables dynamic
>> enforcement of system call filtering policy in the kernel as specified
>> by a userland task. The policy is expressed in terms of a BPF program,
>> as is used for userland-exposed socket filtering. Instead of network
>> data, the BPF program is evaluated over struct user_regs_struct at the
>> time of the system call (as retrieved using regviews).
>>
>
> There's some seccomp-related code in the vsyscall emulation path in
> arch/x86/kernel/vsyscall_64.c. How should time(), getcpu(), and
> gettimeofday() be handled?

Nice catch:
  lxr.linux.no/linux+v3.2.1/arch/x86/kernel/vsyscall_64.c#L180
I'd missed it.

> If you want filtering to work, there
> aren't any real syscall registers to inspect, but they could be
> synthesized.

Hrm, I wonder if making sure orig_eax is populated with the
vsyscall_nr would be enough. Unless I'm misreading, args 0 and 1 are
correct, so there may be other noise, but performing a call to
__secure_computing() (either in the case or with a pre-validate
syscall nr: 0-2) should send the do_exit.

Does that sound reasonable? I'll try to do the right thing in my next
patch set.

> Preventing a malicious task from figuring out approximately what time
> it is is basically impossible because of the way that vvars work. I
> don't know how to change that efficiently.

There are other ways to guess the time too, so I don't think it's that
bad. For those that are really worried, they could disable or
otherwise attempt to limit vsyscall access from their sandbox.

thanks!
will
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
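
For illustration only, here is a rough user-space sketch of the "pre-validate the vsyscall nr (0-2), then run the seccomp check" idea discussed above. It is not kernel code and not the patch itself. struct fake_regs, vsyscall_to_syscall_nr(), emulate_vsyscall_checked() and the filter callback are hypothetical names invented for this sketch. It assumes the slot order 0 = gettimeofday, 1 = time, 2 = getcpu and hard-codes the x86-64 syscall numbers. It also maps each slot to the regular syscall number before filtering, which is only one possible reading of "populate orig_eax", and it returns false where the real __secure_computing() would kill the task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* x86-64 syscall numbers for the three vsyscalls, hard-coded so the
 * sketch builds on any host (assumption: values from the x86-64 ABI). */
enum { NR_GETTIMEOFDAY = 96, NR_TIME = 201, NR_GETCPU = 309 };

/* Hypothetical stand-in for the synthesized register state that a
 * filter would inspect on the vsyscall emulation path. */
struct fake_regs {
    long orig_ax;   /* syscall number presented to the filter */
    long di;        /* arg 0 */
    long si;        /* arg 1 */
};

/* Pre-validate the vsyscall slot: only 0-2 are meaningful.
 * Returns the matching syscall number, or -1 for anything else. */
static long vsyscall_to_syscall_nr(int vsyscall_nr)
{
    switch (vsyscall_nr) {
    case 0:  return NR_GETTIMEOFDAY;
    case 1:  return NR_TIME;
    case 2:  return NR_GETCPU;
    default: return -1;
    }
}

/* Hypothetical policy callback standing in for the seccomp check:
 * returns true if the call is allowed. */
typedef bool (*filter_fn)(const struct fake_regs *regs);

/* Synthesize registers for an emulated vsyscall and run the filter.
 * Returns true if the emulation may proceed, false if it is denied. */
static bool emulate_vsyscall_checked(int vsyscall_nr, long arg0, long arg1,
                                     filter_fn filter)
{
    assert(filter != NULL);

    long nr = vsyscall_to_syscall_nr(vsyscall_nr);
    if (nr < 0)
        return false;   /* not a valid vsyscall slot */

    struct fake_regs regs = { .orig_ax = nr, .di = arg0, .si = arg1 };
    return filter(&regs);
}

/* Example policy: a sandbox that permits time() and nothing else. */
static bool allow_only_time(const struct fake_regs *regs)
{
    return regs->orig_ax == NR_TIME;
}
```

With the example policy, emulate_vsyscall_checked(1, 0, 0, allow_only_time) is allowed, while slot 0 (gettimeofday) and any out-of-range slot are denied.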