On 6/8/21 10:19 AM, Dave Martin wrote:
On Tue, Jun 08, 2021 at 12:33:18PM +0100, Mark Brown via Libc-alpha wrote:
On Mon, Jun 07, 2021 at 07:12:13PM +0100, Catalin Marinas wrote:
I don't think we can document all the filters that can be added on top
of various syscalls, so I'd leave it undocumented (or part of the systemd
documentation). It was a user space program (systemd) breaking another
user space program (well, anything with a new enough glibc). The kernel
ABI was still valid when /sbin/init started ;).
Indeed. I think from a kernel point of view the main thing is to look
at why userspace feels the need to do things like this and see if
there's anything we can improve or do better with in future APIs. Part
of the original discussion here was figuring out that there aren't really
any other reasonable options for userspace to implement this check at
the minute.
Ack, that would be my policy -- just wanted to make it explicit.
It would be good if there were better dialogue between the systemd
and kernel folks on this kind of thing.
SECCOMP makes it rather easy to (attempt to) paper over kernel/user API
design problems, which probably reduces the chance of the API ever being
fixed properly, if we're not careful...
Well, IMHO the problem is larger than just BTI here. What systemd is
trying to do by fixing the exec state of a service is admirable, but it's
a 90% solution without the entire linker/loader being in a more
privileged context. While BTI makes finding a generic gadget that can
call mprotect harder, it still seems like it might just be a little too
easy. The seccomp filter is providing a nice bonus by removing the
ability to disable BTI via mprotect without also disabling X. So without
moving more of the linker into the kernel, it's hard to see how one can
really lock down X-only pages.
Anyway, I'm testing this on rawhide now.
Thanks!