On Wed, Jun 2, 2021 at 11:52 PM Alessio Balsini <balsini@xxxxxxxxxxx> wrote:
>
> On Tue, Jun 01, 2021 at 04:58:26PM +0800, Peng Tao wrote:
> > Add a generic file store that userspace can save/restore any open file
> > descriptor. These file descriptors can be managed by different
> > applications not just the same user space application.
> >
> > A possible use case is fuse fd passthrough being developed
> > by Alessio Balsini [1] where underlying file system fd can be saved in
> > this file store.
> >
> > Another possible use case is user space application live upgrade and
> > failover (upon panic etc.). Currently during userspace live upgrade and
> > failover, open file descriptors usually have to be saved separately in
> > a different management process with AF_UNIX sendmsg.
> >
> > But it causes chicken and egg problem and such management process needs
> > to support live upgrade and failover as well. With a generic file store
> > in the kernel, application live upgrade and failover no longer require such
> > management process to hold reference for their open file descriptors.
> >
> > This is an RFC to see if the approach makes sense to upstream and it can be
> > tested with the following C program.
> >
> > Why FUSE?
> > - Because we are trying to solve FUSE fd passthrough and FUSE daemon
> >   live upgrade.
> >
> > Why global IDR rather than per fuse connection one?
> > - Because for live upgrade new process, we don't have a valid fuse connection
> >   in the first place.
> >
> > Missing cleanup method in case user space messes up?
> > - We can limit the number of saved FDs and hey it is RFC ;).
> >
> > [1] https://lore.kernel.org/lkml/20210125153057.3623715-1-balsini@xxxxxxxxxxx/
> > --------
> >
> > [...]
> >
>
> Hi Peng,
Hi Alessio,

>
> This is a cool feature indeed.
>
> I guess we also want to ensure that restoring an FD can only be
> performed by a trusted FUSE daemon, and not any other process attached
> to /dev/fuse. Maybe adding some permission checks?
>
The idea is to allow any daemon capable of opening /dev/fuse to restore
an FD. I don't quite get which permissions you would like to check.
SYS_ADMIN?

> I also see that multiple restores can be done on the same FD, is that
> intended? Shouldn't the IDR entry be removed once restored?
>
In a crash recovery scenario, if the kernel destroys the IDR entry once
an FD is restored, and the recovering daemon panics again, the FD is lost
forever. So I would prefer to keep it in the kernel until explicit FD
removal.

> As far as I understand, the main use case is to be able to replace a
> FUSE daemon with another, preserving the opened lower file system files.
> How would user space handle the unmounting of the old FUSE file system
> and mounting of the new one?
It can call FUSE_DEV_IOC_REMOVE_FD before or after unmounting the old
FUSE file system. Either way, the last closer of the FUSE connection FD
would actually close the FUSE connection.

> I wonder if something can be done with a pair of ioctls similar to
> FUSE_DEV_IOC_CLONE to transfer the FUSE connection from the old to the
> new FUSE daemon. Maybe either the IDR or some other container to store
> the files that are intended to be preserved can be put in fuse_conn
> instead of keeping it global.
>
> Does it make sense?
>
It makes sense at first glance, since it obviously helps with IDR
cleanup: we can do it on a per-fuse_conn basis. But on second thought,
how do we preserve the FUSE connection fd representing the same fuse_conn
itself? We need to do it because we want to handle FUSE daemon crash
recovery cases. Maybe we can have something like:
1. use a tag to uniquely identify a fuse conn (as being done for virtio-fs)
2. each of these SAVE/GET/REMOVE ioctls takes a tag argument, so FDs are
   kept local to the specified fuse_conn
3. add a FUSE_DEV_IOC_TRANSFER ioctl to transfer ownership of a saved FD
   to a new userspace daemon.
   It can be seen as a combination of GET and REMOVE ioctls.

Cheers,
Tao
--
Into Sth. Rich & Strange
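
For illustration only, the tagged SAVE/GET/REMOVE/TRANSFER semantics
proposed above could be modelled in plain userspace C roughly as follows.
This is a sketch under stated assumptions, not the RFC's kernel code or
its ioctl interface: every name here (conn_store, store_save, store_get,
store_remove, store_transfer) and every limit is hypothetical, a plain int
stands in for the kernel's reference to an open file, and locking,
permission checks and cleanup of empty stores are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative limits only; none of these values come from the RFC. */
#define MAX_CONNS 4
#define MAX_SAVED 8
#define TAG_LEN   32

/* One store per connection tag; an int stands in for a struct file. */
struct conn_store {
    char tag[TAG_LEN];          /* empty string marks an unused store */
    int used[MAX_SAVED];
    int file[MAX_SAVED];
};

static struct conn_store stores[MAX_CONNS];

/* Look up the store for a tag, optionally claiming an unused one. */
static struct conn_store *find_store(const char *tag, int create)
{
    struct conn_store *unused = NULL;
    size_t len = strlen(tag);

    if (len == 0 || len >= TAG_LEN)
        return NULL;
    for (size_t i = 0; i < MAX_CONNS; i++) {
        if (stores[i].tag[0] == '\0') {
            if (unused == NULL)
                unused = &stores[i];
        } else if (strcmp(stores[i].tag, tag) == 0) {
            return &stores[i];
        }
    }
    if (!create || unused == NULL)
        return NULL;
    memcpy(unused->tag, tag, len + 1);
    return unused;
}

/* SAVE: keep a file under the tag; returns an id, or -1 on failure. */
int store_save(const char *tag, int file)
{
    struct conn_store *s = find_store(tag, 1);

    for (int id = 0; s != NULL && id < MAX_SAVED; id++) {
        if (!s->used[id]) {
            s->used[id] = 1;
            s->file[id] = file;
            return id;
        }
    }
    return -1;
}

/* GET: read back a saved file; the entry stays, so GET is repeatable. */
int store_get(const char *tag, int id, int *file)
{
    struct conn_store *s = find_store(tag, 0);

    if (s == NULL || id < 0 || id >= MAX_SAVED || !s->used[id])
        return -1;
    *file = s->file[id];
    return 0;
}

/* REMOVE: explicitly drop a saved entry. */
int store_remove(const char *tag, int id)
{
    struct conn_store *s = find_store(tag, 0);

    if (s == NULL || id < 0 || id >= MAX_SAVED || !s->used[id])
        return -1;
    s->used[id] = 0;
    return 0;
}

/* TRANSFER: GET followed by REMOVE, handing the file to the caller. */
int store_transfer(const char *tag, int id, int *file)
{
    int rc;

    if (store_get(tag, id, file) != 0)
        return -1;
    rc = store_remove(tag, id);
    assert(rc == 0);            /* GET just succeeded, so the entry exists */
    (void)rc;
    return 0;
}
```

In this model a recovering daemon can call store_get() on the same id as
many times as it crashes, while a replacement daemon would call
store_transfer() once, after which the id is no longer valid under that
tag. A lookup with a different tag never sees the entry.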