On Sun, Sep 10, 2023 at 09:29:14PM -0400, Kent Overstreet wrote:
> On Mon, Sep 11, 2023 at 11:05:09AM +1000, Dave Chinner wrote:
> > On Sat, Sep 09, 2023 at 06:42:30PM -0400, Kent Overstreet wrote:
> > > On Sat, Sep 09, 2023 at 08:50:39AM -0400, James Bottomley wrote:
> > > > So why can't we figure out that easier way? What's wrong with trying to
> > > > figure out if we can do some sort of helper or library set that assists
> > > > supporting and porting older filesystems. If we can do that it will not
> > > > only make the job of an old fs maintainer a lot easier, but it might
> > > > just provide the stepping stones we need to encourage more people climb
> > > > up into the modern VFS world.
> > >
> > > What if we could run our existing filesystem code in userspace?
> >
> > You mean like lklfuse already enables?
>
> I'm not seeing that it does?
>
> I just had a look at the code, and I don't see anything there related to
> the VFS - AFAIK, a VFS -> fuse layer doesn't exist yet.

Just to repeat what I said on #xfs here...

It doesn't try to cut in halfway through the VFS -> filesystem
path. It just redirects the fuse operations to "lkl syscalls" and so
runs the entire kernel VFS -> filesystem path.

https://github.com/lkl/linux/blob/master/tools/lkl/lklfuse.c

> And that looks a lot heavier than what we'd ideally want, i.e. a _lot_
> more kernel code would be getting pulled in. The entire block layer,
> probably the scheduler as well.

Yes, but arguing that "performance sucks" misses the entire point of
this discussion: that for the untrusted user mounts of untrusted
filesystem images we already have a viable method for moving the
dangerous processing out into userspace that requires almost *zero
additional work* from anyone.

As long as the performance of the lklfuse implementation doesn't
totally suck, nobody will really care that much that it isn't quite
as fast as a native implementation. Pluggable drives (e.g. via USB)
are already going to be much slower than a host-installed drive, so
I don't think performance is even really a consideration for these
sorts of use cases....

> What I've got in bcachefs-tools is a much thinner mapping from e.g.
> kthreads -> pthreads, block layer -> aio, etc.

Right, and we've got that in userspace for XFS, too. If we really
cared that much about XFS-FUSE, I'd be converting userspace to use
ublk w/ io_uring on top of a port of the kernel XFS buffer cache as
the basis for a performant fuse implementation. However, there's a
massive amount of userspace work needed to get a native XFS FUSE
implementation up and running (even ignoring performance), so it's
just not a viable short-term - or even medium-term - solution to the
current problems.

Indeed, if you do a fuse->fs ops wrapper, I'd argue that lklfuse is
the place to do it so that there is a single code base that supports
all kernel filesystems without requiring anyone to support a
separate userspace code base. Requiring every filesystem to do its
own FUSE port and then support it doesn't reduce the overall
maintenance burden on filesystem developers....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx