Hi, Xuerui, On Wed, Feb 21, 2024 at 2:10 PM WANG Xuerui <kernel@xxxxxxxxxx> wrote: > > Hi, > > Recently, we -- community LoongArch porters -- have noticed a problem > where the Chromium sandbox apparently wants to deny statx [^1] so it > could properly inspect arguments after the sandboxed process later falls > back to fstat. The reasoning behind the change was not clear in the > patch; but we found out it's basically because there's currently not a > "fd-only" version of statx, so that the sandbox has no way to ensure the > path argument is empty without being able to peek into the sandboxed > process's memory. For architectures able to do newfstatat though, the > glibc falls back to newfstatat after getting -ENOSYS for statx, then the > respective SIGSYS handler [^2] takes care of inspecting the path > argument, transforming allowed newfstatat's into fstat instead which is > allowed and has the same type of return value. > > But, as loongarch is the first architecture to not have fstat nor > newfstatat, the LoongArch glibc does not attempt falling back at all > when it gets -ENOSYS for statx -- and you see the problem there! > > Actually, back when the loongarch port was under review, people were > aware of the same problem with sandboxing clone3 [^3], so clone was > eventually kept. Unfortunately it seemed at that time no one had noticed > statx, so besides restoring fstat/newfstatat to loongarch uapi (and > postponing the problem further), it seems inevitable that we would need > to tackle seccomp deep argument inspection; this is obviously a decision > that shouldn't be taken lightly, so I'm posting this to restart the > conversation to figure out a way forward together. We basically could do > one of below: > > - just restore fstat and be done with it; > - add a flag to statx so we can do the equivalent of just fstat(fd, > &out) with statx, and ensuring an error happens if path is not empty in > that case; > - tackle the long-standing problem of seccomp deep argument inspection (!). >From my point of view, I prefer to "restore fstat", because we need to use the Chrome sandbox everyday (even though it hasn't been upstream by now). But I also hope "seccomp deep argument inspection" can be solved in the future. Huacai > > Obviously, the simplest solution would be to "just restore fstat". But > again, in my opinion this is not quite a solution but a workaround -- we > have good reasons to keep just statx (mainly because its feature set is > a strict superset of those of fstat/newfstatat), and we're not quite in > a hurry to resolve this. Because one of the prerequisites for a new > Chromium port is "inclusion in Debian stable" -- as the loong64 port > [^4] is progressing and the next Debian release would not happen until > 2025, we still have time for a more "elegant" solution. > > Alternatively, we could also introduce a new flag for statx, maybe named > AT_STATX_NO_PATH or something like that, so that statx would ignore the > path altogether or error on non-empty paths if called with flags > containing AT_EMPTY_PATH | AT_STATX_NO_PATH. This way a seccomp policy > could allow statx calls only with such flags (that are passed via > register and accessible) and maintain the same level of safety as with > fstat. But there is also disadvantage to this approach: the flag would > be useful only for sandboxes, because in other cases there's no need to > avoid reading from &path. This is also more of a workaround to avoid the > deep argument inspection problem. > > Lastly, should we decide to go the hardest way, according to a previous > mail [^5] (about clone3) and the LPC 2019 discussion [^6] [^7], we > probably would try the metadata-annotation-based and piece-by-piece > approach, as it's expected to provide the most benefit and involve less > code changes. The implementation, as I surmise, will involve modifying > the generic syscall entrypoint for early copying of user data, and > corresponding changes to seccomp plumbing so this information is > properly exposed. I don't have a roadmap for non-generic-entry arches > right now, and I also haven't started designing the new seccomp ABI for > that. As a matter of fact, members of the LoongArch community (myself > included) are still fresh to this area of expertise, so a bit more of > your feedback will be appreciated. > > Thanks to Miao Wang from AOSC for doing much of the investigation. > > [^1]: https://chromium-review.googlesource.com/c/chromium/src/+/2823150 > [^2]: > https://chromium.googlesource.com/chromium/src/sandbox/+/c085b51940bd/linux/seccomp-bpf-helpers/sigsys_handlers.cc#355 > [^3]: > https://lore.kernel.org/linux-arch/20220511211231.GG7074@xxxxxxxxxxxxxxxxxxxxx/ > [^4]: https://wiki.debian.org/Ports/loong64 > [^5]: https://lwn.net/ml/linux-kernel/201905301122.88FD40B3@keescook/ > [^6]: https://lwn.net/Articles/799557/ > [^7]: > https://lpc.events/event/4/contributions/560/attachments/397/640/deep-arg-inspection.pdf > > -- > WANG "xen0n" Xuerui > > Linux/LoongArch mailing list:https://lore.kernel.org/loongarch/ >