Re: [PATCH] seccomp: passthrough uretprobe systemcall without filtering

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jan 18, 2025 at 07:39:25PM -0800, Eyal Birger wrote:
> Hi,
> 
> Thank you for the detailed response.
> 
> On Sat, Jan 18, 2025 at 6:25 PM Kees Cook <kees@xxxxxxxxxx> wrote:
> 
> >
> > On January 18, 2025 12:45:47 PM PST, Eyal Birger <eyal.birger@xxxxxxxxx> wrote:
> > >I think the difference is that this syscall is not part of the process's
> > >code - it is inserted there by another process tracing it.
> >
> > Well that's nothing like syscall_restart, and now I'm convinced seccomp must never ignore uretprobe -- a process might want to block uretprobe!
> >
> 
> I think I understand your point. But do you think this is intentional?
> i.e. seccomp couldn't have been used to block uretprobes before this
> syscall implementation afaict.
> 
> > So, no, sorry, this needs to be handled by the seccomp policy that is applied to the process.
> >
> 
> The problem we're facing is that existing workloads are breaking, and
> as mentioned I'm not sure how practical it is to demand replacing a
> working docker environment because of a new syscall that was added for
> performance reasons.
> 
> > >So this is different than desiring to deploy a new version of a binary
> > >that uses a new libc or a new syscall.
> >
> > Uh, no, the case I used as an example was no changes to anything except the kernel. Libc noticed the available syscall, uses it, and is instantly killed by the Docker seccomp policy which didn't know about that syscall.
> >
> 
> That's an interesting situation and quite unexpected :) I'm glad I didn't
> have to face that one in production.
> 
> > > Here the case is that there are
> > >three players - the tracer running out of docker, the tracee running in docker,
> > >and docker itself. All three were running fine in a specific kernel version,
> > >but upgrading the kernel now crashes the traced process.
> >
> > If uretprobe used to work without a syscall, then that seems to be the problem.
> 
> I agree.
> 
> > But I think easiest is just fixing the Docker policy. (Which is a text file configuration change; no new binaries, no rebuilds!).
> 
> As far as I can tell libseccomp needs to provide support for this new
> syscall and a new docker version would need to be deployed, so It's not
> just a configuration change. Also the default policy which comes packed in
> docker would probably need to be changed to avoid having to explicitly
> provide a seccomp configuration for each deployment.
> 
> >
> > >I think this syscall is different in that respect for the reasons described.
> >
> > I don't agree, sorry. Seccomp has a really singular and specific purpose, which is explicitly *externalizing* policy. I do not want to have policy within seccomp itself.
> >
> 
> Understood.
> 
> > >I don't know if seccomp is behaving correctly when it blocks a kernel
> > >implementation detail that isn't user created.
> >
> > But it is user created? Something added a uretprobe to a process who's seccomp policy is not expecting it. This seems precisely by design.
> 
> I think I wasn't accurate in my wording.
> The uretprobe syscall is added to the tracee by the kernel.
> The tracer itself is merely requesting to attach a uretprobe bpf
> function. In previous versions, this was implemented by the kernel
> installing an int3 instruction, and in the new implementation the kernel
> is installing a uretprobe syscall.
> The "user" in this case - the tracer program - didn't deliberately install
> the syscall, but anyway this is semantics.

that's correct, uretprobe syscall is installed by kernel to special user
memory map and it can be executed only from there and if process calls it
from another place it receives sigill

so at the end the process executes the uretprobe syscall, but it's up to
kernel to decide that and set it up..  but I don't know if that's strong
enough reason for seccomp to ignore the syscall

> 
> I think I understand your point that it is regarded as "policy", only that
> it creates a problem in actual deployments, where in order to be able to
> run the tracer software which has been working on newer kernels a new docker
> has to be deployed.
> 
> I'm trying to find a pragmatic solution to this problem, and I understand
> the motivation to avoid policy in seccomp.

I could think of sysctl for that.. you complained earlier about weird
semantics for that [1], but I think it's better than to remove it

jirka

> 
> Alternatively, maybe this syscall implementation should be reverted?
> 
> Thanks again,
> Eyal.

[1] https://lore.kernel.org/bpf/CAHsH6Gs03iJt-ziWt5Bye_DuqCbk3TpMmgPbkYh64XBvpGaDtw@xxxxxxxxxxxxxx/




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux