On Tue, Jun 22, 2021 at 3:07 AM Enrico Weigelt, metux IT consult
<lkml@xxxxxxxxx> wrote:
>
> On 17.06.21 15:23, Peng Tao wrote:
>
> >> Just keeping fd's open while a server restarts ?
> >> If that's what you want, I see much wider use far outside of fuse,
> >> and that might call for some more generic approach - something like
> >> Plan9's /srv filesystem.
> >>
> > 1. keeping FDs across userspace restart
>
> if application needs to be rewritten for that anyways, there're other
> ways to achieve this, w/o touching the kernel at all - exec() doesn't
> automatically close fd's (unless they're opened w/ O_CLOEXEC)
Or application recovery after panic ;)

>
> > 2. help save FD in the FUSE fd passthrough use case as implemented by
> > Alessio Balsini
>
> you mean this one ?
>
> https://lore.kernel.org/lkml/20210125153057.3623715-1-balsini@xxxxxxxxxxx
>
> I fail to see why an extra fd store within the fuse device is necessary
> for that - I'd just let the fuse daemon(s) reply the open request with
> the fd it already holds.
Alessio already has a similar implementation in his patchset. The RFC
patch tries to make it generic and thus usable for other use cases
like fuse daemon upgrade and panic recovery.

>
> I'd hate to run into situations where even killing all processes holding
> some file open leads to a situation where it remains open inside the
> kernel, thus blocking e.g. unmounting. I already see operators getting
> very angry ... :o
This is really a different design approach. The idea is to keep an FD
active beyond the lifetime of a running process so that we can do
panic recovery. Alessio's patchset has a similar side effect in some
corner cases, and this RFC patch makes it a semantic promise. Whether
ops like it would really depend on what they want.

>
> by the way: alessio's approach is limited to simple read/write
> operations anyways - other operations like ioctl() don't seem to work
> easily that way.
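
For reference, the exec() point raised earlier in the thread can be checked from userspace. The following is only a minimal sketch, assuming Linux and Python 3; the helper name write_via_exec is made up for illustration and is not part of any patch discussed here. Note that Python creates descriptors with close-on-exec set by default, so the sketch clears the flag explicitly with os.set_inheritable().

```python
import os
import sys


def write_via_exec(inheritable: bool) -> bytes:
    """Fork, exec a fresh interpreter, and let it write to a pipe fd by number.

    Hypothetical helper for illustration only.
    """
    read_fd, write_fd = os.pipe()  # both ends start out close-on-exec in Python
    # True clears FD_CLOEXEC on the write end, False leaves it set.
    os.set_inheritable(write_fd, inheritable)

    pid = os.fork()
    if pid == 0:
        # Child: replace the process image. The new program only learns the
        # fd number from argv; it never opened the pipe itself.
        code = (
            "import os, sys\n"
            "fd = int(sys.argv[1])\n"
            "try:\n"
            "    os.write(fd, b'still open')\n"
            "except OSError:\n"
            "    pass\n"
        )
        try:
            os.execv(sys.executable, [sys.executable, "-c", code, str(write_fd)])
        finally:
            os._exit(127)  # only reached if execv itself failed

    # Parent: drop our copy of the write end so read() can see EOF.
    os.close(write_fd)
    os.waitpid(pid, 0)
    data = os.read(read_fd, 64)
    os.close(read_fd)
    return data


if __name__ == "__main__":
    print(write_via_exec(True))
    print(write_via_exec(False))
```

With inheritable=True the exec'd program should be able to write through the descriptor it inherited; with inheritable=False the descriptor is closed at exec time and the parent should only see end-of-file. This covers a planned restart via exec() only, not recovery after a crash, which is the case the RFC is aimed at.
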
>
> and for the creds switching: I tend to believe that cases where a fs or
> device looks at the calling process' creds in operations on an already
> open fd, it's most likely a bad implementation.
>
I agree, but I understand the rationale as well. A normal FUSE
read/write uses the FUSE daemon's creds, so the semantics are the same.
Otherwise, as you outline below, we'd have to go through all the
read/write callbacks to make sure none of them is checking process
creds.

> yes, some legacy drivers actually do check for CAP_SYS_ADMIN e.g. for
> low level hardware configuration (e.g. IO and IRQ on ISA bus), but I
> wonder whether these are used at all in our use cases and should
> ever be allowed to non-root.
>
> do you have any case where you really need to use the opener's creds ?
> (after the fd is already open)
>
> >> Does FUSE actually manipulate the process' fd table directly, while
> >> in the open() callback ?
> >
> > hmm, you are right. The open() callback cannot install FD from there.
> > So in order for your use case to work, the VFS layer needs to be
> > changed to transparently replace an empty file struct with another
> > file struct that is prepared by the file system somewhere else. It is
> > really beyond the current RFC patch's scope IMHO.
>
> Exactly. That's where I'm struggling right now. Yet have to find out
> whether I could just copy from one struct file into another (probably
> some refcnt'ing required). And that still has some drawback: fd state
> like file position won't be shared.
>
> I've been thinking about changing the vfs_open() chain so that it
> doesn't pass in an existing/prepared struct file, but instead returns
> one, which is allocated further down the chain, right before the fs'
> open operation is called. Then we could add another variant that
> returns struct file.
> If the new one is present, it will be called,
> otherwise a new struct file is allocated, the old variant is called
> on the newly allocated one, and finally return this one.
>
> this is a bigger job to do ...
>
Agreed.

Cheers,
Tao
-- 
Into Sth. Rich & Strange
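
The dispatch described in the quoted vfs_open() proposal can be modelled in userspace to make the control flow concrete. This is only a toy sketch in Python, not kernel code: File, FileOps, open_file and do_open are hypothetical names chosen for illustration, and the real file_operations open callback takes more arguments than shown.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class File:
    """Stand-in for struct file; only what the illustration needs."""
    owner: str
    pos: int = 0


@dataclass
class FileOps:
    """Hypothetical ops table with the existing and the proposed open variant."""
    open: Optional[Callable[[File], None]] = None   # old style: fill in a prepared File
    open_file: Optional[Callable[[], File]] = None  # proposed: return a File


def do_open(ops: FileOps) -> File:
    # If the new variant is present, call it and use whatever File it returns,
    # which may be one that already exists elsewhere.
    if ops.open_file is not None:
        return ops.open_file()
    # Otherwise allocate a new File, run the old variant on it, and return it.
    f = File(owner="vfs")
    if ops.open is not None:
        ops.open(f)
    return f


if __name__ == "__main__":
    held_by_daemon = File(owner="daemon", pos=42)
    new_style = FileOps(open_file=lambda: held_by_daemon)
    old_style = FileOps(open=lambda f: setattr(f, "owner", "legacyfs"))

    print(do_open(new_style) is held_by_daemon)
    print(do_open(old_style))
```

In this model the new variant hands back the very same object the daemon already holds, so state such as the file position would be shared rather than copied, which is the drawback the copy approach was said to have. Whether this maps cleanly onto the real VFS refcounting is left open, as in the thread.
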