On 6/6/23 16:37, Miklos Szeredi wrote:
> On Sun, 14 May 2023 at 00:04, Askar Safin <safinaskar@xxxxxxxxx> wrote:
>> Will this patch fix a long-standing fuse vs suspend bug? (
>> https://bugzilla.kernel.org/show_bug.cgi?id=34932 )
>
> No.
>
> The solution to the fuse issue is to freeze processes that initiate
> fuse requests *before* freezing processes that serve fuse requests.
>
> The problem is finding out which is which. This can be complicated by
> the fact that a process could be both serving requests *and*
> initiating them (even without knowing).
>
> The best idea so far is to let fuse servers set a process flag
> (PF_FREEZE_LATE) that is inherited across fork/clone. For example the
> sshfs server would do the following before starting request processing
> or starting ssh:
>
>   echo 1 > /proc/self/freeze_late
>
> This would make the sshfs and ssh processes be frozen after processes
> that call into the sshfs mount.
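
For illustration, the server side of that step could look roughly like
the sketch below. It assumes the /proc/self/freeze_late file from the
proposal, which does not exist in mainline kernels, so the call is
treated as best effort. The helper names set_flag_file() and
request_freeze_late() are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumption: path taken from the proposal; not present in mainline kernels. */
#define FREEZE_LATE_PATH "/proc/self/freeze_late"

/* Write "1\n" to a flag file. Returns 0 on success, -1 on failure. */
static int set_flag_file(const char *path)
{
    static const char val[] = "1\n";
    ssize_t n;
    int fd, saved;

    assert(path != NULL);

    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    n = write(fd, val, sizeof(val) - 1);
    saved = errno;          /* keep the write error across close() */
    close(fd);
    errno = saved;

    return n == (ssize_t)(sizeof(val) - 1) ? 0 : -1;
}

/* Best effort: ask to be frozen late, carry on if the kernel lacks support. */
static int request_freeze_late(void)
{
    if (set_flag_file(FREEZE_LATE_PATH) < 0) {
        fprintf(stderr, "freeze_late not available: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

A server would call request_freeze_late() once, before forking ssh or
entering its request loop, so that children inherit the flag as
described above.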

Hmm, why would this need to be done manually on the server (daemon)
side? It could be automated on the fuse kernel side, for example in
process_init_reply(), using the current task context?

A slightly better version would assign scores: the later the
daemon/server is created, the higher its freezing score. That would
help a bit with stacked fuse file systems, although not perfectly. For
that, struct task_struct would need to be extended, though.
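
A rough userspace sketch of the score idea, only to show the ordering.
Nothing here is kernel code: struct sim_task, mark_fuse_server() and
sort_freeze_order() are made-up names, and the ordering rule is my
reading of what stacking needs (ordinary tasks first, then servers from
highest score to lowest, so a server stacked on another fuse mount is
frozen before the server it depends on).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-task state; not the kernel's task_struct. */
struct sim_task {
    const char *name;
    unsigned int score;   /* 0 = ordinary task, >0 = fuse server */
};

static unsigned int next_server_score = 1;

/* Assumed hook: called when a task finishes FUSE_INIT as a server. */
static void mark_fuse_server(struct sim_task *t)
{
    assert(t != NULL);
    t->score = next_server_score++;   /* later servers get higher scores */
}

static int cmp_freeze_order(const void *a, const void *b)
{
    const struct sim_task *x = a, *y = b;

    /* Ordinary tasks (score 0) sort before any server. */
    if ((x->score == 0) != (y->score == 0))
        return x->score == 0 ? -1 : 1;

    /* Among servers: higher score (created later) is frozen first. */
    return (x->score < y->score) - (x->score > y->score);
}

/* Reorder tasks into the order in which they would be frozen. */
static void sort_freeze_order(struct sim_task *tasks, size_t n)
{
    qsort(tasks, n, sizeof(*tasks), cmp_freeze_order);
}
```

With an sshfs server marked first and a second fuse server stacked on
top of it marked afterwards, the resulting order is: ordinary tasks,
then the stacked server, then sshfs. This still does not handle a
server that was started early but begins using a later mount, which is
the "not perfectly" part.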

> After normal (non-server) processes are frozen, server processes
> should not be getting new requests and can be frozen.
>
> Issues remaining:
>
>  - if requests are stuck (e.g. network is down) then the requester
>    process can't be frozen and suspend will still fail.
>
>  - if a server process is generating filesystem activity (new fuse
>    requests) spontaneously, then there's nothing to differentiate
>    between server processes and we are back to the original problem.
>
> The solution to both of these is probably outside the kernel: impacted
> servers need to receive a notification from systemd when suspend is
> starting and act accordingly.
>
> Attaching work-in-progress patch. This needs to be improved to freeze
> server processes in a separate phase from kernel threads, but it
> should be able to demonstrate the idea.

Thanks,
Bernd