On Mon, 2016-01-18 at 12:51 +0100, Florian Weimer wrote:
> glibc malloc can basically call *anything*.  We don't know what the
> future will bring.  Currently, we use (off the top of my head, I
> haven't checked):
>
> * sbrk
> * mmap
> * mprotect
> * munmap
> * mremap
> * madvise
> * futex
> * open
> * read
> * close
>
> (In some cases, there is some sort of fallback, or errors are ignored
> and the optimization does not happen.)
>
> Future versions might reasonably use:
>
> * sched_getcpu
> * clone
> * clock_gettime
> * more open/read/close
> * readlink
> * whatever system calls are used for memory protection keys
> * whatever system calls are used for restartable sequences
>
> I appreciate what you are trying to do, but those seccomp filters
> totally break encapsulation.  I have no idea how to support this
> properly, in a sustainable way.  It appears very difficult to do this
> for independently evolving libraries.

No matter what, any seccomp whitelist is doomed to break in the future
if your program uses shared libraries (including glibc). I think
seccomp filters can reasonably be used with a blacklist of syscalls,
but not with a whitelist.

An anecdote: in WebKit (which has a seccomp filter sandbox not compiled
by default, because it is unfinished and very fragile), the web process
receives SIGSYS from seccomp when it calls open() or a related
function, which it does not have permission to use; it then passes the
filename of the file it wants to open via IPC to a broker process,
which evaluates our filesystem policy, opens the file (if permissible),
and sends the fd to the web process via a UNIX socket.

This all goes awry if, in the web process's signal handler, malloc
decides to call open(), triggering an infinite loop of SIGSYS handlers.
So we have to open all files used by malloc (currently
/proc/sys/vm/overcommit_memory and /sys/devices/system/cpu/online) and
cache the fds in the web process before initializing seccomp filters.
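
The fd-caching workaround described above could be sketched roughly as
follows. This is only an illustration, not WebKit's actual code: the
names preopen_file(), preopen_lookup() and PREOPEN_MAX are hypothetical,
and the fixed-size table is an assumption made so that a lookup never
allocates memory.

```c
#define _POSIX_C_SOURCE 200809L /* for O_CLOEXEC under strict -std modes */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size cache. It uses no heap memory, so a lookup
 * cannot re-enter malloc from inside the SIGSYS handler. */
#define PREOPEN_MAX 8

struct preopened {
    const char *path; /* caller keeps this string alive */
    int fd;
};

static struct preopened preopen_cache[PREOPEN_MAX];
static size_t preopen_count;

/* Open `path` read-only and remember its fd. Meant to be called before
 * the seccomp filter is installed. Returns the fd, or -1 on failure. */
int preopen_file(const char *path)
{
    assert(path != NULL);
    if (preopen_count == PREOPEN_MAX)
        return -1;

    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    preopen_cache[preopen_count].path = path;
    preopen_cache[preopen_count].fd = fd;
    preopen_count++;
    return fd;
}

/* Return the cached fd for `path`, or -1 if it was never preopened.
 * Only compares strings; makes no system calls and no allocations. */
int preopen_lookup(const char *path)
{
    for (size_t i = 0; i < preopen_count; i++) {
        if (strcmp(preopen_cache[i].path, path) == 0)
            return preopen_cache[i].fd;
    }
    return -1;
}
```

Under these assumptions, the web process would call preopen_file() for
/proc/sys/vm/overcommit_memory and /sys/devices/system/cpu/online
before enabling the filter, and its SIGSYS handler would try
preopen_lookup() first, contacting the broker only on a miss. A real
handler would probably also have to hand out a duplicate of the cached
fd and rewind it, since the caller is free to read from it and close
it.
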
libseccomp could not help with this malloc problem, since there are so
many different ways to use seccomp; it doesn't know anything about our
broker process.

The lesson: seccomp is also unsuitable for restricting access to the
filesystem, and it's always going to be difficult to use in application
programs. The best way to use it is probably via a library that only
blacklists a few syscalls, and which can handle complicated problems
like malloc for applications.

Michael