在 2024-02-24星期六的 19:51 +0800,Huacai Chen写道: > Hi, Xuerui, > > On Wed, Feb 21, 2024 at 2:10 PM WANG Xuerui <kernel@xxxxxxxxxx> > wrote: > > > > Hi, > > > > Recently, we -- community LoongArch porters -- have noticed a > > problem > > where the Chromium sandbox apparently wants to deny statx [^1] so > > it > > could properly inspect arguments after the sandboxed process later > > falls > > back to fstat. The reasoning behind the change was not clear in the > > patch; but we found out it's basically because there's currently > > not a > > "fd-only" version of statx, so that the sandbox has no way to > > ensure the > > path argument is empty without being able to peek into the > > sandboxed > > process's memory. For architectures able to do newfstatat though, > > the > > glibc falls back to newfstatat after getting -ENOSYS for statx, > > then the > > respective SIGSYS handler [^2] takes care of inspecting the path > > argument, transforming allowed newfstatat's into fstat instead > > which is > > allowed and has the same type of return value. > > > > But, as loongarch is the first architecture to not have fstat nor > > newfstatat, the LoongArch glibc does not attempt falling back at > > all > > when it gets -ENOSYS for statx -- and you see the problem there! > > > > Actually, back when the loongarch port was under review, people > > were > > aware of the same problem with sandboxing clone3 [^3], so clone was > > eventually kept. Unfortunately it seemed at that time no one had > > noticed > > statx, so besides restoring fstat/newfstatat to loongarch uapi (and > > postponing the problem further), it seems inevitable that we would > > need > > to tackle seccomp deep argument inspection; this is obviously a > > decision > > that shouldn't be taken lightly, so I'm posting this to restart the > > conversation to figure out a way forward together. We basically > > could do > > one of below: > > > > - just restore fstat and be done with it; > > - add a flag to statx so we can do the equivalent of just fstat(fd, > > &out) with statx, and ensuring an error happens if path is not > > empty in > > that case; > > - tackle the long-standing problem of seccomp deep argument > > inspection (!). > From my point of view, I prefer to "restore fstat", because we need > to > use the Chrome sandbox everyday (even though it hasn't been upstream > by now). But I also hope "seccomp deep argument inspection" can be > solved in the future. My idea is this problem needs syscalls to be designed with deep argument inspection in mind; syscalls before this should be considered as historical error and get fixed by resotring old syscalls. > > > Huacai > > > > > Obviously, the simplest solution would be to "just restore fstat". > > But > > again, in my opinion this is not quite a solution but a workaround > > -- we > > have good reasons to keep just statx (mainly because its feature > > set is > > a strict superset of those of fstat/newfstatat), and we're not > > quite in > > a hurry to resolve this. Because one of the prerequisites for a new > > Chromium port is "inclusion in Debian stable" -- as the loong64 > > port > > [^4] is progressing and the next Debian release would not happen > > until > > 2025, we still have time for a more "elegant" solution. > > > > Alternatively, we could also introduce a new flag for statx, maybe > > named > > AT_STATX_NO_PATH or something like that, so that statx would ignore > > the > > path altogether or error on non-empty paths if called with flags > > containing AT_EMPTY_PATH | AT_STATX_NO_PATH. This way a seccomp > > policy > > could allow statx calls only with such flags (that are passed via > > register and accessible) and maintain the same level of safety as > > with > > fstat. But there is also disadvantage to this approach: the flag > > would > > be useful only for sandboxes, because in other cases there's no > > need to > > avoid reading from &path. This is also more of a workaround to > > avoid the > > deep argument inspection problem. > > > > Lastly, should we decide to go the hardest way, according to a > > previous > > mail [^5] (about clone3) and the LPC 2019 discussion [^6] [^7], we > > probably would try the metadata-annotation-based and piece-by-piece > > approach, as it's expected to provide the most benefit and involve > > less > > code changes. The implementation, as I surmise, will involve > > modifying > > the generic syscall entrypoint for early copying of user data, and > > corresponding changes to seccomp plumbing so this information is > > properly exposed. I don't have a roadmap for non-generic-entry > > arches > > right now, and I also haven't started designing the new seccomp ABI > > for > > that. As a matter of fact, members of the LoongArch community > > (myself > > included) are still fresh to this area of expertise, so a bit more > > of > > your feedback will be appreciated. > > > > Thanks to Miao Wang from AOSC for doing much of the investigation. > > > > [^1]: > > https://chromium-review.googlesource.com/c/chromium/src/+/2823150 > > [^2]: > > https://chromium.googlesource.com/chromium/src/sandbox/+/c085b51940bd/linux/seccomp-bpf-helpers/sigsys_handlers.cc#355 > > [^3]: > > https://lore.kernel.org/linux-arch/20220511211231.GG7074@xxxxxxxxxxxxxxxxxxxxx/ > > [^4]: https://wiki.debian.org/Ports/loong64 > > [^5]: > > https://lwn.net/ml/linux-kernel/201905301122.88FD40B3@keescook/ > > [^6]: https://lwn.net/Articles/799557/ > > [^7]: > > https://lpc.events/event/4/contributions/560/attachments/397/640/deep-arg-inspection.pdf > > > > -- > > WANG "xen0n" Xuerui > > > > Linux/LoongArch mailing list:https://lore.kernel.org/loongarch/ > >