On Wed, May 17, 2023 at 5:50 AM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
>
> On 2023/5/2 17:07, Daniel Rosenberg wrote:
> > On Mon, Apr 24, 2023 at 8:32 AM Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> >>
> >> The security model needs to be thought about and documented. Think
> >> about this: the fuse server now delegates operations it would itself
> >> perform to the passthrough code in fuse. The permissions that would
> >> have been checked in the context of the fuse server are now checked in
> >> the context of the task performing the operation. The server may be
> >> able to bypass seccomp restrictions. Files that are open on the
> >> backing filesystem are now hidden (e.g. lsof won't find these), which
> >> allows the server to obfuscate accesses to backing files. Etc.
> >>
> >> These are not particularly worrying if the server is privileged, but
> >> fuse comes with the history of supporting unprivileged servers, so we
> >> should look at supporting passthrough with unprivileged servers as
> >> well.
> >>
> >
> > This is on my todo list. My current plan is to grab the creds that the
> > daemon uses to respond to FUSE_INIT. That should keep behavior fairly
> > similar. I'm not sure if there are cases where the fuse server is
> > operating under multiple contexts.
> > I don't currently have a plan for exposing open files via lsof. Every
> > such file should relate to one that will show up though. I haven't dug
> > into how that's set up, but I'm open to suggestions.
> >
> >> My other generic comment is that you should add justification for
> >> doing this in the first place. I guess it's mainly performance. So
> >> how performance can be won in real life cases? It would also be good
> >> to measure the contribution of individual ops to that win. Is there
> >> another reason for this besides performance?
> >>
> >> Thanks,
> >> Miklos
> >
> > Our main concern with it is performance.
> > We have some preliminary
> > numbers looking at the pure passthrough case. We've been testing using
> > a ramdrive on a somewhat slow machine, as that should highlight
> > differences more. We ran fio for sequential reads, and random
> > read/write. For sequential reads, we were seeing libfuse's
> > passthrough_hp take about a 50% hit, with fuse-bpf not being
> > detectably slower. For random read/write, we were seeing a roughly 90%
> > drop in performance from passthrough_hp, while fuse-bpf has about a 7%
> > drop in read and write speed. When we use a bpf that traces every
> > opcode, that performance hit increases to a roughly 1% drop in
> > sequential read performance, and a 20% drop in both read and write
> > performance for random read/write. We plan to make more complex bpf
> > examples, with fuse daemon equivalents to compare against.
> >
> > We have not looked closely at the impact of individual opcodes yet.
> >
> > There's also a potential ease of use for fuse-bpf. If you're
> > implementing a fuse daemon that is largely mirroring a backing
> > filesystem, you only need to write code for the differences in
> > behavior. For instance, say you want to remove image metadata like
> > location. You could give bpf information on what range of data is
> > metadata, and zero out that section without having to handle any other
> > operations.
>
> A bit out of topic (although I'm not quite look into FUSE BPF internals)
> After roughly listening to this topic in FS track last week, I'm not
> quite sure (at least in the long term) if it might be better if
> ebpf-related filter/redirect stuffs could be landed in vfs or in a
> somewhat stackable fs so that we could redirect/filter any sub-fstree
> in principle? It's just an open question and I have no real tendency
> of this but do we really need a BPF-filter functionality for each
> individual fs?

I think that is a valid question, but the answer is that even if it makes
sense, doing something like this in vfs would be a much bigger project
with larger consequences on performance and security and whatnot, so even
if (and a very big if) this ever happens, using FUSE-BPF as a playground
for this sort of stuff would be a good idea.

This reminds me of union mounts - it made sense to have union mount
functionality in vfs, but after a long winding road, a stacked fs
(overlayfs) turned out to be a much more practical solution.

>
> It sounds much like
> https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/about-file-system-filter-drivers
>

Nice reference.
I must admit that I found it hard to understand what Windows filter
drivers can do compared to FUSE-BPF design.
It'd be nice to get some comparison from what is planned for FUSE-BPF.

Interesting to note that there is a "legacy" Windows filter driver API,
so Windows didn't get everything right for the first API - that is
especially interesting to look at as repeating other people's mistakes
would be a shame.

Thanks,
Amir.
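
For illustration only: the metadata example Daniel gives in the quoted
text above (tell the filter which byte range is metadata, zero it on
read, pass everything else through) comes down to a range-overlap
computation. The sketch below is plain userspace C and is not fuse-bpf
code. struct meta_range and scrub_read() are hypothetical names made up
for this example, and it assumes the filter sees the read buffer together
with the file offset and length of the read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of the metadata to hide, as a half-open
 * range of file offsets [start, end). Not a fuse-bpf structure. */
struct meta_range {
	uint64_t start;	/* first metadata byte */
	uint64_t end;	/* one past the last metadata byte */
};

/*
 * Zero the part of the metadata range that falls inside a read which
 * returned `len` bytes starting at file offset `off`. Bytes outside
 * the range are left untouched. Returns the number of bytes zeroed.
 * Assumes off + len does not overflow.
 */
static size_t scrub_read(unsigned char *buf, uint64_t off, size_t len,
			 const struct meta_range *m)
{
	assert(m->start <= m->end);

	uint64_t read_end = off + len;
	uint64_t lo = off > m->start ? off : m->start;
	uint64_t hi = read_end < m->end ? read_end : m->end;

	if (lo >= hi)
		return 0;	/* read does not touch the metadata */

	memset(buf + (lo - off), 0, (size_t)(hi - lo));
	return (size_t)(hi - lo);
}
```

A daemon that mirrors a backing filesystem would need this logic plus
handlers for every other operation; the ease-of-use argument in the
quoted text is that with fuse-bpf only the read-side hook would have to
be written.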