On Tue, Aug 6, 2024 at 11:26 AM Joanne Koong <joannelkoong@xxxxxxxxx> wrote: > > On Tue, Aug 6, 2024 at 10:11 AM Bernd Schubert > <bernd.schubert@xxxxxxxxxxx> wrote: > > > > On 8/6/24 18:23, Joanne Koong wrote: > > > > >> > > >> This is very interesting. These logs (and the ones above with the > > >> lxcfs server running concurrently) are showing that the read request > > >> was freed but not through the do_fuse_request_end path. It's weird > > >> that fuse_simple_request reached fuse_put_request without > > >> do_fuse_request_end having been called (which is the only place where > > >> FR_FINISHED gets set and wakes up the wait events in > > >> request_wait_answer). > > >> > > >> I'll take a deeper look tomorrow and try to make more sense of it. > > > > > > Finally realized what's happening! > > > When we kill the cat program, if the request hasn't been sent out to > > > userspace yet when the fatal signal interrupts the > > > wait_event_interruptible and wait_event_killable in > > > request_wait_answer(), this will clean up the request manually (not > > > through the fuse_request_end() path), which doesn't delete the timer. > > > > > > I'll fix this for v3. > > > > > > Thank you for surfacing this and it would be much appreciated if you > > > could test out v3 when it's submitted to make sure. > > > > It is still just a suggestion, but if the timer would have its own ref, > > any oversight of another fuse_put_request wouldn't be fatal. > > > > Thanks for the suggestion. My main concerns are whether it's worth the > extra (minimal?) performance penalty for something that's not strictly > needed and whether it ends up adding more of a burden to keep track of > the timer ref (eg in error handling like the case above where the > fatal signal is for a request that hasn't been sent to userspace yet, > having to account for the extra timer ref if the timer callback didn't > execute). I don't think adding a timer ref would prevent fatal crashes > on fuse_put_request oversights (unless we also mess up not releasing a > corresponding timer ref :)) I amend this last sentence - I just realized your point about the fatal crashes is that if we accidentally miss a fuse_put_request altogether, we'd also miss releasing the timer ref in that path, which means the timer callback would be the one releasing the last ref. > > I don't feel that strongly about this though so if you do, I can add > this in for v3. > > Thanks, > Joanne > > > > > Thanks, > > Bernd