On Tue, 14 Dec 2021 at 13:58, Robert Vasek <rvasek01@xxxxxxxxx> wrote:
>
> Hello fuse-devel,
>
> I'd like to ask about the feasibility of having a reconnect feature added into the FUSE kernel module.
>
> The idea is that when a FUSE driver disconnects (process exited due to a bug, signal, etc.), all pending and future ops for that session would wait for that driver to appear again, and then continue as normal. Waiting would be on a timer, with ENOTCONN returned in case it times out. Obviously, "continue as normal" isn't possible for all FUSE drivers, as it depends on what they do and how they implement things -- they would have to opt-in for this feature.
>
> Use-cases span across basically anything where the lifecycle of a FUSE driver is managed by some external component (e.g. systemd, container orchestrators). This is especially true in containerized environments: volume mounts provided by FUSE drivers running in containers may get killed / rescheduled by the Orchestrator, or they may crash due to bugs, memory pressure, ..., leading to very possible data corruption and severed mounts. Having the ability to recover from such situations would greatly improve reliability of these systems.
>
> I haven't looked at how this would be implemented yet though. I'm just wondering if this makes sense at all and if you folks would be interested in such a feature?

A kernel patch[1] as well as example userspace code[2] has already been proposed.

[1] https://lore.kernel.org/linux-fsdevel/CAPm50a+j8UL9g3UwpRsye5e+a=M0Hy7Tf1FdfwOrUUBWMyosNg@xxxxxxxxxxxxxx/

[2] https://lore.kernel.org/linux-fsdevel/CAPm50aLuK8Smy4NzdytUPmGM1vpzokKJdRuwxawUDA4jnJg=Fg@xxxxxxxxxxxxxx/

The example recovery is not very practical, but I can see how it would be possible to extend to a read-only fs.

Is this what you had in mind?

Thanks,
Miklos