Re: Chromium sandbox on LoongArch and statx -- seccomp deep argument inspection again?

Icenowy Zheng <uwu@xxxxxxxxxx> · Mon, 26 Feb 2024 14:03:48 +0800

On Sun, 2024-02-25 at 15:32 +0800, Xi Ruoyao wrote:
> On Sun, 2024-02-25 at 14:51 +0800, Icenowy Zheng wrote:
> > > From my point of view, I prefer to "restore fstat", because we
> > > need to use the Chrome sandbox every day (even though it hasn't
> > > been upstreamed yet). But I also hope "seccomp deep argument
> > > inspection" can be solved in the future.
> > 
> > My view is that this problem needs syscalls to be designed with
> > deep argument inspection in mind; syscalls designed before that
> > should be considered a historical error and fixed by restoring the
> > old syscalls.
> 
> I'd not consider fstat an error as using statx for fstat has a
> performance impact (severe for some workflows), and Linus has
> concluded

Sorry, to clarify: I mean statx is the error in ABI design, not fstat.

> "if the user wants fstat, give them fstat" for the performance issue:
> 
> https://sourceware.org/pipermail/libc-alpha/2023-September/151365.html
> 
> However we only want fstat (actually "newfstat" in fs/stat.c), and it
> seems we don't want to resurrect newstat, newlstat, newfstatat, etc.
> (or am I missing any benefit - performance or "just pleasing seccomp" -
> of them compared to statx?) so we don't want to just define
> __ARCH_WANT_NEW_STAT.  So it seems we need to add some new #if to
> fs/stat.c and include/uapi/asm-generic/unistd.h.
> 
> And no, it's not a design issue of all other syscalls.  It's just the
> design issue of seccomp.  There's no way to design a syscall allowing
> seccomp to inspect a 100-character path in its argument unless
> refactoring seccomp entirely because we cannot fit a 100-character
> path into 8 registers.

Well, what I mean is that syscalls should be designed to be simple, to
prevent this kind of situation.
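
To make the difference concrete, here is a minimal sketch. It is only an
illustration, not code from glibc, the kernel or Chromium; the helper
names fstat_via_statx() and filter_view_of_statx() are made up, and I am
assuming a glibc new enough to provide the statx() wrapper. The first
function is fstat expressed through statx; the second fills in roughly
what a seccomp filter is handed for that call (struct seccomp_data from
<linux/seccomp.h>).

```c
#define _GNU_SOURCE
#include <fcntl.h>          /* AT_EMPTY_PATH, open() */
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>       /* statx(), struct statx, STATX_BASIC_STATS */
#include <sys/syscall.h>    /* SYS_statx */
#include <linux/seccomp.h>  /* struct seccomp_data */

/* fstat() expressed through statx(): an empty path plus AT_EMPTY_PATH
 * means "operate on fd itself". */
int fstat_via_statx(int fd, struct statx *stx)
{
    return statx(fd, "", AT_EMPTY_PATH, STATX_BASIC_STATS, stx);
}

/* Hypothetical helper: builds an approximation of the data a seccomp
 * filter would see for statx(fd, path, flags, ...).  The filter gets
 * raw register values only; how an int is extended to 64 bits is
 * architecture dependent and not modelled here. */
struct seccomp_data filter_view_of_statx(int fd, const char *path, int flags)
{
    struct seccomp_data d;

    memset(&d, 0, sizeof(d));
    d.nr = SYS_statx;
    d.args[0] = (uint64_t)fd;
    d.args[1] = (uint64_t)(uintptr_t)path;  /* the address, never the bytes */
    d.args[2] = (uint64_t)flags;
    return d;
}
```

Whether path is "" or "/etc/passwd", args[1] is just an address, so a
filter cannot tell "fstat on this fd" apart from "stat some path" and
has to allow or deny statx as a whole. A plain fstat(fd, buf) has no
path argument to inspect in the first place.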

> 
> As of now people do use PTRACE_PEEKDATA for "deep inspection"
> (actually "debugging" the target process), but it obviously has a
> very severe performance impact.
> 
> <rant>
> 
> Today the entire software industry is saying "do things in a
> declarative way" but seccomp is completely the opposite.  It's
> auditing *how* the sandboxed application is doing things instead of
> *what* will be done.
> 
> I've raised my objections to seccomp and/or syscall allowlisting
> several times after seeing so many breakages like:
> 
> - https://github.com/NetworkConfiguration/dhcpcd/issues/120
> - https://gitlab.gnome.org/GNOME/tracker-miners/-/issues/252
> - https://blog.pintia.cn/2018/06/27/glibc-segmentation-fault/
> - http://web.archive.org/web/20210126121421/http://acm.xidian.edu.cn/discuss/thread.php?tid=148&cid=#
>   (comment 3)
> 
> but people just keep telling me "you are wrong, you don't understand
> security".  Some of them even complain "seccomp is broken" as well
> but still keep using it.
> 
> </rant>
>