On 06/03, Tycho Andersen wrote:
>
> On Tue, Jun 02, 2015 at 08:28:29PM +0200, Oleg Nesterov wrote:
> > On 06/01, Tycho Andersen wrote:
> > >
> > > --- a/include/linux/seccomp.h
> > > +++ b/include/linux/seccomp.h
> > > @@ -25,6 +25,9 @@ struct seccomp_filter;
> > >  struct seccomp {
> > >  	int mode;
> > >  	struct seccomp_filter *filter;
> > > +#ifdef CONFIG_CHECKPOINT_RESTORE
> > > +	bool suspended;
> > > +#endif
> >
> > Then afaics you need to change copy_seccomp() to clear ->suspended.
> > At least if the child is not traced.
>
> Yes, thank you.

And if we really need to play with TIF_NOTSC, then copy_seccomp() should
set it too if SUSPEND has cleared in parent's flags.

> > But why do we bother to play with TIF_NOTSC, could you explain?
>
> The procedure for restoring is to call seccomp suspend, restore the
> seccomp filters (and potentially other stuff), and then resume them at
> the end. If the other stuff happens to use RDTSC, the process gets
> killed because TIF_NOTSC has been set.

This is clear, just I thought that CRIU doesn't use rdtsc on behalf of
the traced task...

> We can work around this in criu by doing the seccomp restore as the
> very last thing before the final sigreturn,

Not sure I understand... You need to suspend at "dump" time too afaics,
otherwise, say, syscall_seized() can fail because this syscall is nacked
by seccomp?

> but that seems like the
> seccomp suspend API is incomplete, IMO. However, since both you and
> Andy complained, perhaps I should remove it :)

Well, this is up to you ;) But.

Note that a process can also disable TSC via PR_SET_TSC. So if dump or
restore can't work without enabling TSC you probably want to handle this
case too. And this makes me think that this needs a separate interface.
I dunno.

> > And I am not sure I understand why do we need the additional security
> > check, but I leave this to you and Andy.
>
> Yes, it is required to prevent the case Pavel mentions (although there
> are other ways to get around seccomp with ptrace, the goal here is to
> not depend on that behavior so that when it is eventually fixed this
> doesn't break).

I still do not think it makes any sense. Again, if you can trace this
process then you can disable the filtering anyway.

Let's assume that seccomp_run_filters() acks, say, sys_getpid(). Or
fork() in the case Pavel mentioned, this doesn't matter. Now you can
force the tracee to call this syscall, then change syscall_nr.

But as I said I won't argue, please forget.

> Ok, this has changed slightly with the "always resume on
> detach/unlink" change Pavel suggested,

To remind, it is not easy to restore TIF_NOTSC if the tracer dies.
PTRACE_DETACH can do this because the tracee can't be woken up. But
personally I'd prefer the explicit RESUME request rather than "rely on
PTRACE_DETACH".

If we avoid the TSC games, then, again, please consider
PTRACE_O_SECCOMP_DISABLE. This will solve the problems with
fork/detach/tracer-death automatically.

Oleg.
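
For reference, the PR_SET_TSC point raised earlier in this message can be
illustrated from userspace. The following is only a minimal sketch of the
prctl() side, for the calling thread, on x86: it saves the current TSC
mode, enables TSC around a callback, and puts the old mode back. The
helper names (tsc_get_mode, tsc_set_mode, tsc_run_enabled, record_mode)
are hypothetical and are not part of CRIU or of the patch under
discussion; how a checkpoint tool would apply this to a traced task is
not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/prctl.h>

/* Query the calling thread's TSC mode.
 * Returns PR_TSC_ENABLE or PR_TSC_SIGSEGV, or -1 if the query fails
 * (for example on an architecture without PR_GET_TSC support). */
static int tsc_get_mode(void)
{
	int mode = 0;

	if (prctl(PR_GET_TSC, &mode, 0, 0, 0) != 0)
		return -1;
	return mode;
}

/* Set the calling thread's TSC mode; returns 0 on success, -1 on error. */
static int tsc_set_mode(int mode)
{
	return prctl(PR_SET_TSC, mode, 0, 0, 0) == 0 ? 0 : -1;
}

/* Run fn(arg) with TSC access enabled, then restore the saved mode.
 * Assumption: nothing else changes the mode while fn() runs. */
static int tsc_run_enabled(void (*fn)(void *), void *arg)
{
	int saved;

	assert(fn != NULL);

	saved = tsc_get_mode();
	if (saved < 0)
		return -1;
	if (saved != PR_TSC_ENABLE && tsc_set_mode(PR_TSC_ENABLE) != 0)
		return -1;

	fn(arg);

	if (saved != PR_TSC_ENABLE && tsc_set_mode(saved) != 0)
		return -1;
	return 0;
}

/* Example callback: record the TSC mode that is in effect while it runs. */
static void record_mode(void *arg)
{
	*(int *)arg = tsc_get_mode();
}
```

A caller that had earlier done prctl(PR_SET_TSC, PR_TSC_SIGSEGV) could
invoke tsc_run_enabled(record_mode, &seen) and would still find
PR_TSC_SIGSEGV in effect afterwards. The point is only that the task's
own setting has to be saved and put back; clearing TIF_NOTSC
unconditionally would lose it.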