On Tue, Dec 10, 2024 at 11:13:16AM +0100, Christian Brauner wrote: > On Mon, Dec 09, 2024 at 12:20:10PM -0500, Chuck Lever wrote: > > On 12/9/24 12:15 PM, Jeff Layton wrote: > > > On Mon, 2024-12-09 at 11:35 -0500, Chuck Lever wrote: > > > > On 12/9/24 11:30 AM, Amir Goldstein wrote: > > > > > On Mon, Dec 9, 2024 at 2:46 PM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote: > > > > > > > > > > > > On Mon, Dec 09, 2024 at 09:58:58AM +0100, Amir Goldstein wrote: > > > > > > > To be clear, exporting pidfs or internal shmem via an anonymous fd is > > > > > > > probably not possible with existing userspace tools, but with all the new > > > > > > > mount_fd and magic link apis, I can never be sure what can be made possible > > > > > > > to achieve when the user holds an anonymous fd. > > > > > > > > > > > > > > The thinking behind adding the EXPORT_OP_LOCAL_FILE_HANDLE flag > > > > > > > was that when kernfs/cgroups was added exportfs support with commit > > > > > > > aa8188253474 ("kernfs: add exportfs operations"), there was no intention > > > > > > > to export cgroupfs over nfs, only local to uses, but that was never enforced, > > > > > > > so we thought it would be good to add this restriction and backport it to > > > > > > > stable kernels. > > > > > > > > > > > > Can you please explain what the problem with exporting these file > > > > > > systems over NFS is? Yes, it's not going to be very useful. But what > > > > > > is actually problematic about it? Any why is it not problematic with > > > > > > a userland nfs server? We really need to settle that argumet before > > > > > > deciding a flag name or polarity. > > > > > > > > > > > > > > > > I agree that it is not the end of the world and users do have to explicitly > > > > > use fsid= argument to be able to export cgroupfs via nfsd. > > > > > > > > > > The idea for this patch started from the claim that Jeff wrote that cgroups > > > > > is not allowed for nfsd export, but I couldn't find where it is not allowed. > > > > > > > > > > > I think that must have been a wrong assumption on my part. I don't see > > > anything that specifically prevents that either. If cgroupfs is mounted > > > and you tell mountd to export it, I don't see what would prevent that. > > > > > > To be clear, I don't see how you would trick bog-standard mountd into > > > exporting a filesystem that isn't mounted into its namespace, however. > > > Writing a replacement for mountd is always a possibilty. > > > > > > > > I have no issue personally with leaving cgroupfs exportable via nfsd > > > > > and changing restricting only SB_NOUSER and SB_KERNMOUNT fs. > > > > > > > > > > Jeff, Chuck, what is your opinion w.r.t exportability of cgroupfs via nfsd? > > > > > > > > We all seem to be hard-pressed to find a usage scenario where exporting > > > > pseudo-filesystems via NFS is valuable. But maybe someone has done it > > > > and has a good reason for it. > > > > > > > > The issue is whether such export should be consistently and actively > > > > prevented. > > > > > > > > I'm not aware of any specific security issues with it. > > > > > > > > > > > > > > I'm not either, but we are in new territory here. nfsd is a network > > > service, so it does present more of an attack surface vs. local access. > > > > > > In general, you do have to take active steps to export a filesystem, > > > but if someone exports / with "crossmnt", everything mounted is > > > potentially accessible. That's obviously a dumb thing to do, but people > > > make mistakes, and it's possible that doing this could be part of a > > > wider exploit. > > > > > > I tend to think it safest to make exporting via nfsd an opt-in thing on > > > a per-fs basis (along the lines of this patchset). If someone wants to > > > allow access to more "exotic" filesystems, let them argue their use- > > > case on the list first. > > > > If we were starting from scratch, 100% agree. > > > > The current situation is that these file systems appear to be exportable > > (and not only via NFS). The proposal is that this facility is to be > > taken away. This can easily turn into a behavior regression for someone > > if we're not careful. > > So I'm happy to drop the exportfs preliminary we have now preventing > kernfs from being exported but then Christoph and you should figure out > what the security implications of allowing kernfs instances to be > exported areare because I'm not an NFS export expert. > > Filesystems that fall under kernfs that are exportable by NFS as I > currently understand it are at least: > > (1) sysfs > (2) cgroupfs > > Has anyone ever actually tried to export the two and tested what > happens? Because I wouldn't be surprised if this ended in tears but > maybe I'm overly pessimistic. > > Both (1) and (2) are rather special and don't have standard filesystem > semantics in a few places. > > - cgroupfs isn't actually namespace aware. Whereas most filesystems like > tmpfs and ramfs that are mountable inside unprivileged containers are > multi-instance filesystems, aka allocate a new superblock per > container cgroupfs is single-instance with a nasty implementation to > virtualize the per-container view via cgroup namespaces. I wouldn't be > surprised if that ends up being problematic. > > - Cgroupfs has write-time permission checks as the process that is moved > into a cgroup isn't known at open time. That has been exploitable > before this was fixed. > > - Even though it's legacy cgroup has a v1 and v2 mode where v1 is even > more messed up than v2 including the release-agent logic which ends up > issuing a usermode helper to call a binary when a cgroup is released. > > - sysfs potentially exposes all kinds of extremly low-level information > to a remote machine. > > None of this gives me the warm and fuzzy. But that's just me. > > Otherwise, I don't understand what it means that a userspace NFS server > can export kernfs instances. I don't know what that means and what the > contrast to in-kernel NFS server export is and whether that has the same > security implications. If so it's even scary that some random userspace > NFS server can just expose guts like kernfs. > > But if both of you feel that this is safe to do and there aren't any > security issues lurking that have gone unnoticed simply because no one > has really ever exported sysfs or cgroupfs then by all means continue > allowing that. I'm rather skeptical. Amir pointed that sysfs can't be exported as it opts out of kernfs export_operations being set.