On Tue, Aug 6, 2024 at 10:11 AM Bernd Schubert <bernd.schubert@xxxxxxxxxxx> wrote: > > On 8/6/24 18:23, Joanne Koong wrote: > > >> > >> This is very interesting. These logs (and the ones above with the > >> lxcfs server running concurrently) are showing that the read request > >> was freed but not through the do_fuse_request_end path. It's weird > >> that fuse_simple_request reached fuse_put_request without > >> do_fuse_request_end having been called (which is the only place where > >> FR_FINISHED gets set and wakes up the wait events in > >> request_wait_answer). > >> > >> I'll take a deeper look tomorrow and try to make more sense of it. > > > > Finally realized what's happening! > > When we kill the cat program, if the request hasn't been sent out to > > userspace yet when the fatal signal interrupts the > > wait_event_interruptible and wait_event_killable in > > request_wait_answer(), this will clean up the request manually (not > > through the fuse_request_end() path), which doesn't delete the timer. > > > > I'll fix this for v3. > > > > Thank you for surfacing this and it would be much appreciated if you > > could test out v3 when it's submitted to make sure. > > It is still just a suggestion, but if the timer would have its own ref, > any oversight of another fuse_put_request wouldn't be fatal. > Thanks for the suggestion. My main concerns are whether it's worth the extra (minimal?) performance penalty for something that's not strictly needed and whether it ends up adding more of a burden to keep track of the timer ref (eg in error handling like the case above where the fatal signal is for a request that hasn't been sent to userspace yet, having to account for the extra timer ref if the timer callback didn't execute). I don't think adding a timer ref would prevent fatal crashes on fuse_put_request oversights (unless we also mess up not releasing a corresponding timer ref :)) I don't feel that strongly about this though so if you do, I can add this in for v3. Thanks, Joanne > > Thanks, > Bernd